Implementation of Redis sharded cluster

1. The core concept of Redis Cluster

Hash Slots

Redis Cluster uses hash slots to implement data sharding. The entire data space is divided into 16384 hash slots, which are evenly distributed to different master nodes.

Hash calculation: For each key, Redis computes a CRC16 checksum and takes it modulo 16384 to determine which hash slot the key belongs to.
Slot distribution: Each master node is responsible for a subset of the hash slots. For example, with three master nodes, each one is responsible for roughly 5461 slots (16384 / 3).
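
The key-to-slot mapping can be reproduced outside Redis. The following is a minimal Python sketch, assuming the CRC16 variant described in the Redis Cluster specification (CRC-16/XMODEM, polynomial 0x1021, initial value 0) and its hash-tag rule (if a key contains a non-empty {...} section, only that part is hashed). The names crc16_xmodem, key_slot and split_slots are illustrative, and split_slots only approximates the even split that redis-cli produces when it creates a cluster.

```python
TOTAL_SLOTS = 16384


def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC-16/XMODEM: polynomial 0x1021, initial value 0x0000."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) for a key, honouring hash tags."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty {...} section replaces the key as hash input.
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % TOTAL_SLOTS


def split_slots(masters: int) -> list:
    """Split the slot space into contiguous, near-equal (first, last) ranges."""
    per_node = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        last = round(cursor + per_node - 1)
        # The last master always takes whatever remains.
        if i == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


if __name__ == "__main__":
    print(key_slot("key1"))   # 9189
    print(split_slots(3))     # [(0, 5460), (5461, 10922), (10923, 16383)]
```

With three masters, slot 9189 falls into the second range, so the key key1 is served by the second master.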

Master-slave replication and failover

To ensure high availability, each master node has one or more slave nodes. When a master node fails, the cluster will automatically fail over, promoting one of the slave nodes to the new master node.

Master-slave synchronization: The slave node will continuously synchronize data from the master node to maintain data consistency. The synchronization process includes full synchronization and incremental synchronization.
Fault detection and recovery: The nodes in the cluster regularly exchange heartbeat messages to monitor each other's status. If a master node fails, the cluster promotes one of its slave nodes to master, and the promoted node takes over the failed master's hash slots.

2. Detailed configuration steps

Prepare node

Suppose we have 6 Redis instances (3 masters and 3 slaves), each running on a different port.

Node configuration example

The configuration files of each node are slightly different, mainly due to the different port numbers and roles. Here is an example configuration file for each node:

Master Node 1: (Port: 7000)

# Port number
port 7000

# Enable cluster mode
cluster-enabled yes

# Cluster configuration file path, used to store cluster state information
cluster-config-file 

# Node timeout (milliseconds); a node that does not respond within this time is considered failed
cluster-node-timeout 5000

# Turn on AOF persistence
appendonly yes

# Optional configuration items
# Set password protection
# requirepass yourpassword

# Set the maximum memory usage
# maxmemory 2gb

# Set the AOF fsync policy
# appendfsync everysec

Slave Node 1: (Port: 7001)

# Port number
port 7001

# Enable cluster mode
cluster-enabled yes

# Cluster configuration file path, used to store cluster state information
cluster-config-file 

# Node timeout (milliseconds); a node that does not respond within this time is considered failed
cluster-node-timeout 5000

# Turn on AOF persistence
appendonly yes

# Master of this slave node: the slaveof (replicaof) directive is not allowed in
# cluster mode, and the server refuses to start if it is present. The master
# (intended here: 127.0.0.1 7000) is assigned when the cluster is created with
# --cluster-replicas, or later with the CLUSTER REPLICATE command.

# Optional configuration items
# Set password protection
# requirepass yourpassword

# Set the maximum memory usage
# maxmemory 2gb

# Set the AOF fsync policy
# appendfsync everysec

Master Node 2: (Port: 7002)

port 7002
cluster-enabled yes
cluster-config-file 
cluster-node-timeout 5000
appendonly yes

Slave Node 2: (Port: 7003)

port 7003
cluster-enabled yes
cluster-config-file 
cluster-node-timeout 5000
appendonly yes
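# slaveof is not allowed in cluster mode; the master of this node (intended: 127.0.0.1 7002) is assigned when the cluster is created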

Master Node 3: (Port: 7004)

port 7004
cluster-enabled yes
cluster-config-file 
cluster-node-timeout 5000
appendonly yes

Slave Node 3: (Port: 7005)

port 7005
cluster-enabled yes
cluster-config-file 
cluster-node-timeout 5000
appendonly yes
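# slaveof is not allowed in cluster mode; the master of this node (intended: 127.0.0.1 7004) is assigned when the cluster is created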

Start the node

Start all nodes in turn:

redis-server /path/to/
redis-server /path/to/
redis-server /path/to/
redis-server /path/to/
redis-server /path/to/
redis-server /path/to/

Create a cluster

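Use the redis-cli tool to create the cluster:

redis-cli --cluster create 127.0.0.1:7000 127.0.0.1:7002 127.0.0.1:7004 127.0.0.1:7001 127.0.0.1:7003 127.0.0.1:7005 --cluster-replicas 1

Here, --cluster-replicas 1 means that each master node gets one slave node. When all nodes run on the same host, redis-cli makes the first three addresses masters and the remaining ones slaves, which is why the intended masters (7000, 7002, 7004) are listed first. Which slave is attached to which master is decided by redis-cli; verify it afterwards with the cluster nodes command.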

3. Practical operation examples

Write data

You can write data as usual, and Redis Cluster will automatically assign data to the appropriate nodes.

redis-cli -c -p 7000
127.0.0.1:7000> SET key1 value1
-> Redirected to slot [9189] located at 127.0.0.1:7002
OK

Note that the -c option enables cluster mode, so that the client automatically follows redirections.

Read data

When reading data, you can also access it through any node:

redis-cli -c -p 7000
127.0.0.1:7000> GET key1
-> Redirected to slot [9189] located at 127.0.0.1:7002
"value1"

Fault simulation and recovery

You can manually close a master node to simulate a failure and observe how the cluster performs automatic failover.

Close the master node

redis-cli -p 7002 shutdown

Check cluster status

redis-cli -p 7000 cluster nodes

Once the node timeout has elapsed and the failover has completed, you should see that the original slave node has been promoted to master and has taken over the hash slots of the original master node, which is now flagged as failed.

4. Redis Cluster's internal mechanism

Data sharding and redirection

Data sharding: Each key is mapped to a hash slot, and each slot is served by exactly one master node. A request must therefore reach the node that owns the key's slot.
Redirect: If a client sends a command for a hash slot that the receiving node does not serve, the node replies with a MOVED error that names the slot and the address of the node that owns it. A cluster-aware client (such as redis-cli with the -c option) then resends the command to that node.
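
How a client might act on a MOVED reply can be sketched as follows. This is a simplified Python model, not the code of any real client library: it assumes the error text has the form "MOVED <slot> <host>:<port>" as described in the Redis Cluster specification, and the names Moved, parse_moved and SlotCache are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Moved(NamedTuple):
    slot: int
    host: str
    port: int


def parse_moved(error: str) -> Optional[Moved]:
    """Parse 'MOVED <slot> <host>:<port>'; return None for any other error."""
    parts = error.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return Moved(int(parts[1]), host, int(port))


class SlotCache:
    """Slot -> (host, port) map that a cluster-aware client keeps up to date."""

    def __init__(self) -> None:
        self._owners: Dict[int, Tuple[str, int]] = {}

    def owner(self, slot: int, fallback: Tuple[str, int]) -> Tuple[str, int]:
        # Unknown slots are sent to the fallback node, which may redirect us.
        return self._owners.get(slot, fallback)

    def learn(self, moved: Moved) -> None:
        # Remember the owner so that later commands go there directly.
        self._owners[moved.slot] = (moved.host, moved.port)


if __name__ == "__main__":
    cache = SlotCache()
    seed = ("127.0.0.1", 7000)
    print(cache.owner(9189, seed))      # ('127.0.0.1', 7000)
    moved = parse_moved("MOVED 9189 127.0.0.1:7002")
    if moved is not None:
        cache.learn(moved)
    print(cache.owner(9189, seed))      # ('127.0.0.1', 7002)
```

Real client libraries usually go further and refresh their whole slot map after a MOVED reply instead of updating a single slot.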

# Connect to any node in the cluster
redis-cli -c -p 7000

# Try setting a key
127.0.0.1:7000> SET key1 value1
-> Redirected to slot [9189] located at 127.0.0.1:7002
OK

# Get the same key again from the same client; the session now points at 7002, so no redirect is needed
127.0.0.1:7002> GET key1
"value1"

Master-slave replication and failover

Master-slave synchronization: The slave node continuously synchronizes data from the master node to maintain data consistency. The synchronization process includes full synchronization and incremental synchronization.
- Full synchronization: When synchronizing for the first time, the slave node gets the complete data set from the master node.
  - Process: The slave node sends a synchronization request to the master node, and the master node generates an RDB file and transmits it to the slave node. The slave node loads the RDB file and then starts receiving subsequent incremental updates.
- Incremental synchronization: After that, the slave node receives the stream of write commands executed on the master node.
  - Process: The master node keeps the most recent part of this stream in the replication backlog buffer. A slave node that reconnects after a short interruption requests only the part it missed, based on its replication offset.
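
The choice between the two modes can be modelled in a few lines. The following Python sketch is a simplification under stated assumptions: the backlog is treated as a fixed-size window over the byte stream of write commands, replica_offset means the number of stream bytes the slave has already applied, and the replication ID check that real Redis also performs is left out. The class name ReplicationBacklog and its methods are hypothetical.

```python
from typing import Optional, Tuple


class ReplicationBacklog:
    """Fixed-size window over the most recent bytes of the write stream."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.buffer = bytearray()
        self.master_offset = 0  # total bytes ever written to the stream

    def feed(self, data: bytes) -> None:
        """Append propagated write commands, dropping the oldest bytes."""
        self.buffer += data
        self.master_offset += len(data)
        overflow = len(self.buffer) - self.size
        if overflow > 0:
            del self.buffer[:overflow]

    @property
    def first_offset(self) -> int:
        """Stream offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buffer)

    def sync(self, replica_offset: int) -> Tuple[str, Optional[bytes]]:
        """Decide how a reconnecting slave is brought up to date."""
        if self.first_offset <= replica_offset <= self.master_offset:
            start = replica_offset - self.first_offset
            return "partial", bytes(self.buffer[start:])
        # The missing part is no longer in the backlog: send a full RDB.
        return "full", None


if __name__ == "__main__":
    backlog = ReplicationBacklog(size=10)
    backlog.feed(b"abcdef")
    print(backlog.sync(4))   # ('partial', b'ef')
    backlog.feed(b"ghijklmn")
    print(backlog.sync(2))   # ('full', None)
```

A larger backlog lets slaves survive longer disconnections without falling back to a full synchronization.
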
Fault detection: The nodes in the cluster regularly exchange heartbeat (ping/pong) messages to monitor each other. A node that stays unreachable for longer than cluster-node-timeout is first marked as possibly failed (PFAIL), and it is marked as failed (FAIL) only once enough other nodes agree.
- Heartbeat check: Each node keeps sending heartbeat messages to other nodes. It pings a few randomly chosen nodes every second, and it also pings any node it has not heard from for half of cluster-node-timeout.
- Mark PFAIL: If a node has not replied for longer than cluster-node-timeout, the node that observed this marks it as PFAIL. This is a local, unconfirmed view.
- Mark FAIL: If a majority of the master nodes report the node as PFAIL or FAIL within a limited time window, the node is marked as FAIL and this state is broadcast to the whole cluster.
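
The two-stage decision can be sketched as follows. This Python model is an illustration under assumptions, not Redis source code: failure_reports maps the name of each reporting master (including the observer itself, if it is a master) to the time of its latest report, and the validity window of twice the node timeout is taken from the Redis Cluster specification. All function and parameter names are hypothetical.

```python
from typing import Dict


def is_pfail(now_ms: int, last_pong_ms: int, node_timeout_ms: int) -> bool:
    """Local view: no reply for longer than the node timeout."""
    return now_ms - last_pong_ms > node_timeout_ms


def needed_quorum(master_count: int) -> int:
    """Strict majority of the masters in the cluster."""
    return master_count // 2 + 1


def is_fail(failure_reports: Dict[str, int], master_count: int,
            now_ms: int, node_timeout_ms: int, validity_mult: int = 2) -> bool:
    """Cluster view: enough masters reported the node recently."""
    window_ms = node_timeout_ms * validity_mult
    # Old reports expire, so stale suspicions cannot add up to a FAIL.
    fresh = [m for m, ts in failure_reports.items() if now_ms - ts <= window_ms]
    return len(fresh) >= needed_quorum(master_count)


if __name__ == "__main__":
    now = 10_000
    print(is_pfail(now, last_pong_ms=4_000, node_timeout_ms=5_000))   # True
    reports = {"master-7000": 9_000, "master-7004": 9_500}
    print(is_fail(reports, master_count=3, now_ms=now,
                  node_timeout_ms=5_000))                             # True
```
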
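Failover: Once the master node is confirmed to be offline (FAIL), one of its slave nodes is promoted to master and takes over its hash slots. The specific steps are as follows:
Election: The slaves of the failed master ask the cluster for votes to become the new master. The mechanism resembles the leader election of the Raft consensus algorithm (epochs play the role of Raft terms), but it is Redis Cluster's own protocol. It ensures that only one slave can win in a given epoch.
- Voting process: The votes are cast by the master nodes, not by the slaves. Each master grants at most one vote per epoch, and a slave that collects votes from a majority of the masters becomes the new master. Slaves with a more up-to-date replication offset start their election earlier, so they are more likely to win.
Data synchronization: The promoted slave continues with the data it already holds; it does not fetch data from other nodes. The remaining slaves of the failed master, and the old master itself if it comes back, reconfigure themselves to replicate from the new master.
- Partial synchronization: If such a node already holds most of the data and its offset is still covered by the new master's backlog, only the missing part is transferred.
- Full synchronization: If the node has no usable data, or its offset is no longer covered, a full synchronization is required.
Reassign hash slots: The new master node takes over the hash slots of the original master node, and the cluster resumes normal service.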
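
The election timing and the majority rule can be illustrated with a small Python sketch. It assumes the delay formula given in the Redis Cluster specification (a fixed 500 ms, plus a random 0 to 500 ms, plus 1000 ms per rank, where the slave with the highest replication offset has rank 0). The function names and the slave names in the example are hypothetical.

```python
import random
from typing import Dict


def replica_ranks(offsets: Dict[str, int]) -> Dict[str, int]:
    """Rank of a slave = number of sibling slaves with a greater offset."""
    return {
        name: sum(1 for other in offsets.values() if other > offset)
        for name, offset in offsets.items()
    }


def election_delay_ms(rank: int, rng=random) -> int:
    """Time a slave waits before asking the masters for votes."""
    return 500 + rng.randint(0, 500) + rank * 1000


def wins_election(votes: int, master_count: int) -> bool:
    """A slave is promoted once a majority of the masters voted for it."""
    return votes >= master_count // 2 + 1


if __name__ == "__main__":
    ranks = replica_ranks({"slave-a": 100, "slave-b": 250})
    print(ranks)                                   # {'slave-a': 1, 'slave-b': 0}
    for name, rank in ranks.items():
        print(name, election_delay_ms(rank), "ms")
    print(wins_election(votes=2, master_count=3))  # True
```

Because each rank adds a full second, the most up-to-date slave normally requests votes first, which is how the cluster tends to promote the slave that has lost the least data.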

5. Summary

Redis Cluster distributes data across nodes through hash slots and provides high availability through replication and automatic failover. This makes it well suited to workloads with large data sets and high request concurrency. Modern Redis client libraries also support automatic redirection and load balancing in cluster mode.
