# Redis cluster principle

2022-04-23 15:04:00 Mrpre

The Redis Cluster design document is available at:
https://redis.io/topics/cluster-spec#configuration-handling-propagation-and-failovers

Related reading:
https://chanjarster.github.io/post/redis-cluster-config-propagation/

In Redis 6.0, a cluster can be set up with the `redis-cli --cluster create` command, as shown below (this assumes the Redis instances on ports 7001, 7002, … are already running with `cluster-enabled` turned on):

``````
redis-cli --cluster create 11.158.133.251:7001 11.158.133.251:7002 11.158.133.251:7003 11.158.133.251:7004 11.158.133.251:7005 11.158.133.251:7006 --cluster-replicas 1
``````

Behind the scenes, `--cluster create` drives the Redis nodes through the steps needed to form a cluster. Those steps are what this article analyzes.

## How nodes discover each other

First, suppose you start three Redis instances A, B, and C and want them to form a cluster. The prerequisite is that all three know of each other's existence. How can each node quickly learn about the other two? The obvious approach is to tell A about B and C, tell B about A and C, and tell C about A and B. That is logically sound, but it is not what Redis does. Redis works like this:

`redis-cli` sends `CLUSTER MEET A` to B and to C (shorthand here for `CLUSTER MEET <ip> <port>` with A's address). After B receives `CLUSTER MEET A`, it contacts A, so A and B learn of each other's existence (visible with the `CLUSTER NODES` command). We write "A. cluster: A+B" for the cluster membership known to A; B's view is then naturally "B. cluster: A+B".

Next, C receives `CLUSTER MEET A`, and C and A exchange information. Because A's information already includes B, C learns about both A and B, and A in turn learns about C. At this point the states of A and C are "A. cluster: A+B+C" and "C. cluster: A+B+C". That leaves B: A keeps sending its own view to B, so after a while B notices that A's cluster information now includes C and updates its own view to "B. cluster: A+B+C". The cluster has then reached a stable state.
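
The three steps above can be replayed with a toy model. This is only a sketch under simplifying assumptions: each node's view is a bitmask, and one message exchange leaves both sides with the union of their views. The names `reset_views`, `exchange`, `cluster_meet`, `view_of`, and `print_view` are made up for illustration and are not Redis functions. Real Redis gossips only a sample of its known nodes per message and starts a handshake with each newly learned node rather than adding it directly.

```c
#include <assert.h>
#include <stdio.h>

enum { NODE_A, NODE_B, NODE_C, NODE_COUNT };

/* known[i] is a bitmask of the nodes that node i currently knows about. */
static unsigned known[NODE_COUNT];

/* Every node starts out knowing only itself. */
void reset_views(void) {
    for (int i = 0; i < NODE_COUNT; i++) known[i] = 1u << i;
}

/* Model one message round trip (PING/MEET plus PONG): both sides end up
 * with the union of what either of them knew before. */
void exchange(int x, int y) {
    assert(x >= 0 && x < NODE_COUNT && y >= 0 && y < NODE_COUNT);
    unsigned merged = known[x] | known[y];
    known[x] = merged;
    known[y] = merged;
}

/* `receiver` got "CLUSTER MEET target": it contacts the target and the
 * two exchange their views. */
void cluster_meet(int receiver, int target) {
    exchange(receiver, target);
}

/* Return the membership bitmask as seen by `node`. */
unsigned view_of(int node) {
    assert(node >= 0 && node < NODE_COUNT);
    return known[node];
}

/* Print a view in the article's notation, e.g. "A. cluster: A+B". */
void print_view(int node) {
    int first = 1;
    printf("%c. cluster: ", 'A' + node);
    for (int j = 0; j < NODE_COUNT; j++) {
        if (known[node] & (1u << j)) {
            printf("%s%c", first ? "" : "+", 'A' + j);
            first = 0;
        }
    }
    printf("\n");
}
```

Calling `reset_views()`, then `cluster_meet(NODE_B, NODE_A)`, `cluster_meet(NODE_C, NODE_A)`, and finally `exchange(NODE_A, NODE_B)` replays the sequence. After the second call B still knows only A and B. Only the last exchange with A brings B's view up to date.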

How B handles `CLUSTER MEET A`:

``````
    ....
    } else if (!strcasecmp(c->argv[1]->ptr,"meet") && (c->argc == 4 || c->argc == 5)) {
        /* CLUSTER MEET <ip> <port> [cport] */
        long long port, cport;

        if (getLongLongFromObject(c->argv[3], &port) != C_OK) {
            addReplyErrorFormat(c,"Invalid TCP base port specified: %s",
                                (char*)c->argv[3]->ptr);
            return;
        }

        if (c->argc == 5) {
            if (getLongLongFromObject(c->argv[4], &cport) != C_OK) {
                addReplyErrorFormat(c,"Invalid TCP bus port specified: %s",
                                    (char*)c->argv[4]->ptr);
                return;
            }
        } else {
            cport = port + CLUSTER_PORT_INCR;
        }

        // This call is the core: it adds the node to be met to the
        // global server.cluster->nodes table.
        if (clusterStartHandshake(c->argv[2]->ptr,port,cport) == 0 &&
            errno == EINVAL)
        {
            addReplyErrorFormat(c,"Invalid node address specified: %s:%s",
                            (char*)c->argv[2]->ptr, (char*)c->argv[3]->ptr);
        } else {
            addReply(c,shared.ok);
        }
    }
``````
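
The port handling in the excerpt above can be isolated into a small standalone sketch. `parse_port` and `resolve_meet_ports` are hypothetical helper names, not Redis functions. The value 10000 for `CLUSTER_PORT_INCR` matches the Redis source. Folding the 1..65535 range check into these helpers is a simplification made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define CLUSTER_PORT_INCR 10000 /* offset from the client port to the cluster bus port */

/* Parse a decimal TCP port. Returns 0 on success, -1 on malformed or
 * out-of-range input. */
int parse_port(const char *s, long long *out) {
    char *end;
    assert(s != NULL && out != NULL);
    errno = 0;
    long long v = strtoll(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0') return -1;
    if (v <= 0 || v > 65535) return -1;
    *out = v;
    return 0;
}

/* Resolve both ports of "CLUSTER MEET <ip> <port> [cport]".
 * cport_arg may be NULL, which stands for the 4-argument form; the bus
 * port then defaults to port + CLUSTER_PORT_INCR. */
int resolve_meet_ports(const char *port_arg, const char *cport_arg,
                       long long *port, long long *cport) {
    if (parse_port(port_arg, port) != 0) return -1;
    if (cport_arg != NULL) return parse_port(cport_arg, cport);
    *cport = *port + CLUSTER_PORT_INCR;
    if (*cport > 65535) return -1; /* default bus port must still be a valid port */
    return 0;
}
```

For the nodes in this article, `resolve_meet_ports("7001", NULL, &port, &cport)` yields a bus port of 17001. A client port above 55535 is rejected when no explicit bus port is given, because the default would fall outside the valid range.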

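The interaction between B and A happens asynchronously in Redis's periodic background task `clusterCron` (a timer-driven function called from `serverCron`, not a separate OS thread):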

``````
// Excerpt from clusterCron()

di = dictGetSafeIterator(server.cluster->nodes);
server.cluster->stats_pfail_nodes = 0;
// Iterate over the cluster nodes known to this Redis instance
while((de = dictNext(di)) != NULL) {
    clusterNode *node = dictGetVal(de);

    // Establish a connection to every known node that has no link yet
    if (node->link == NULL) {
        clusterLink *link = createClusterLink(node);
        link->conn = server.tls_cluster ? connCreateTLS() : connCreateSocket();
        connSetPrivateData(link->conn, link);
        // Key point: clusterLinkConnectHandler is the callback invoked
        // once the connection has been established
        if (connConnect(link->conn, node->ip, node->cport, NET_FIRST_BIND_ADDR,
                    clusterLinkConnectHandler) == -1) {
            /* We got a synchronous error from connect before
             * clusterSendPing() had a chance to be called.
             * If node->ping_sent is zero, failure detection can't work,
             * so we claim we actually sent a ping now (that will
             * be really sent as soon as the link is obtained). */
            if (node->ping_sent == 0) node->ping_sent = mstime();
            serverLog(LL_DEBUG, "Unable to connect to "
                "Cluster Node [%s]:%d -> %s", node->ip,
                node->cport, server.neterr);

            freeClusterLink(link);
            continue;
        }
        node->link = link;
    }
}
``````

``````
void clusterLinkConnectHandler(connection *conn) {
    clusterLink *link = connGetPrivateData(conn);
    clusterNode *node = link->node;

    if (connGetState(conn) != CONN_STATE_CONNECTED) {
        serverLog(LL_VERBOSE, "Connection with Node %.40s at %s:%d failed: %s",
                node->name, node->ip, node->cport,
                connGetLastError(conn));
        freeClusterLink(link);
        return;
    }

    mstime_t old_ping_sent = node->ping_sent;
    // Send a PING whose type is MEET. A PING message carries information
    // about the other nodes known to this node.
    clusterSendPing(link, node->flags & CLUSTER_NODE_MEET ?
            CLUSTERMSG_TYPE_MEET : CLUSTERMSG_TYPE_PING);
    if (old_ping_sent) {
        /* If there was an active ping before the link was
         * disconnected, we want to restore the ping time, otherwise
         * replaced by the clusterSendPing() call. */
        node->ping_sent = old_ping_sent;
    }
    /* We can clear the flag after the first packet is sent.
     * If we'll never receive a PONG, we'll never send new packets
     * to this node. Instead after the PONG is received and we
     * are no longer in meet/handshake status, we want to send
     * normal PING packets. */
    node->flags &= ~CLUSTER_NODE_MEET;

    serverLog(LL_DEBUG,"Connecting with Node %.40s at %s:%d",
            node->name, node->ip, node->cport);
}
``````

To sum up: when B receives `CLUSTER MEET A`, it records A's address. Later, in the periodic background task, it connects to A and sends A a PING message whose type is MEET. Next, let's look at how A handles this message from B.
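
B's side of this flow boils down to "send MEET once, then plain PINGs". Here is a minimal sketch of that rule. The struct, the flag values, and the function `first_message_on_connect` are hypothetical stand-ins for illustration, not the real `clusterNode` layout or Redis constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, standing in for CLUSTER_NODE_MEET and
 * CLUSTER_NODE_HANDSHAKE in the Redis source. */
#define NODE_FLAG_MEET      (1 << 0)
#define NODE_FLAG_HANDSHAKE (1 << 1)

enum msg_type { MSG_PING, MSG_PONG, MSG_MEET };

struct peer {
    int flags; /* combination of NODE_FLAG_* bits */
};

/* Called once the connection to `p` is established. Returns the type of
 * the first message to send and clears the MEET flag, so that every
 * later message on this link is an ordinary PING. */
enum msg_type first_message_on_connect(struct peer *p) {
    assert(p != NULL);
    enum msg_type type = (p->flags & NODE_FLAG_MEET) ? MSG_MEET : MSG_PING;
    p->flags &= ~NODE_FLAG_MEET;
    return type;
}
```

A peer created by `CLUSTER MEET` starts with both flags set, so the first call returns `MSG_MEET`. If the link is later re-established, the same call returns `MSG_PING`, because only the MEET bit was cleared and the handshake bit is left for the PONG handling to resolve.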

``````
        /* Add this node if it is new for us and the msg type is MEET.
         * In this stage we don't try to add the node with the right
         * flags, slaveof pointer, and so forth, as this details will be
         * resolved when we'll receive PONGs from the node. */

        // Add the sender to the local node table. This makes sense:
        // whoever sends a MEET is a node of the cluster.
        if (!sender && type == CLUSTERMSG_TYPE_MEET) {
            clusterNode *node;

            node = createClusterNode(NULL,CLUSTER_NODE_HANDSHAKE);
            nodeIp2String(node->ip,link,hdr->myip);
            node->port = ntohs(hdr->port);
            node->cport = ntohs(hdr->cport);
            clusterAddNode(node);
            clusterDoBeforeSleep(CLUSTER_TODO_SAVE_CONFIG);
        }

        // The MEET message carries nodes already known to the sender;
        // these also have to be merged into the local node table.
        if (!sender && type == CLUSTERMSG_TYPE_MEET)
            clusterProcessGossipSection(hdr,link);

        /* Anyway reply with a PONG */
        // Reply with a PONG. Note that the PONG also carries information
        // about the other nodes known to this node.