当前位置:网站首页>Source code analysis of how to use jump table in redis
Source code analysis of how to use jump table in redis
2022-04-23 05:17:00 【Magic circle】
Preface
Alibaba cloud's spring school recruitment test this year , Interviewer asked Redis How to use the jump table in ? It's a headache for many students to arrive . Let's talk about it today .
Sorted Set Structure
redis An ordered collection of data types (sorted set) Very widely used , It has the function of collection , At the same time, it can support the collection with weight , And sort by weight . It can go through ZRANGEBYSCORE Returns a range of elements according to their weight , Or by ZSCORE Returns the weight value of an element . It can return the weight of the element with constant complexity , I believe many children's shoes can think of using hash table index . How to support range query ? Then we have to talk about today's key jump table . And think again , How do they combine ?
Let's have a look at sorted set Structure
typedef struct zset {
dict *dict;
zskiplist *zsl;
} zset;
From the structure , It can be seen that the purpose of using jump table is to efficiently support range query ( such as ZRANGBYSCORE operation ), The efficient single point query uses hash structure ( such as ZCORE) operation . So let's see Redis How to implement the skip table of .
Design and implementation of skip table
Jump watch (skiplist) In fact, it is a multi-layer ordered linked list . The jump table is sorted from low to high , Lowest lower floor level0, Up there are level1,level2 etc. . Is in O(log(n)) Add... In time 、 Delete 、 The data structure of the search operation . It can be said to be an alternative structure of balanced tree , But unlike red and black trees , Its implementation of balanced tree is based on a randomized algorithm .
So let's see zskiplist The implementation of the
typedef struct zskiplistNode {
sds ele;
// The weight
double score;
// Back pointer
struct zskiplistNode *backward;
level Array
struct zskiplistLevel {
// Forward pointer
struct zskiplistNode *forward;
// span , Records span level0 Several nodes on
unsigned long span;
} level[];
} zskiplistNode;
because sorted set That is to save the data , Also save the weight value . So in zkiplistNode See... In the definition sds Variable of type ele, as well as double Variable of type score.ele Saved element ,score Saved weights . Besides , In order to facilitate searching from the tail node of the jump table , Each hop table node also saves pointers backward, It points to the previous node .
Besides , Just mentioned that the jump list itself is a multi-layer ordered linked list , Therefore, each layer is connected by multiple nodes through pointers . Therefore, a... Is also included in the structure zskiplistLevel Structure type level Array .level Each element in the array defines a forward pointer (forward) Connect with subsequent nodes . Span is used to record on a certain layer foward The pointer between the node and level0 Several nodes on . for instance , So let's take this picture
In this example, the node 30 Of level The array has three elements , They correspond to three layers level The pointer , Besides ,level2,level1,level0 The corresponding spans are 3、2、1. and , Because the nodes in the jump table are arranged in order , Therefore, it can be based on each node in forward The span on the pointer is accumulated . This cumulative value can be used to calculate the order of the node in the whole hop table .
Next , Understand the definition of the next hop table , The head node and tail node of the jump table are defined in the jump table structure , The length of the skip table and the maximum number of layers of the skip table . The definition is as follows :
typedef struct zskiplist {
// The head and tail nodes of the jump table
struct zskiplistNode *header, *tail;
// Maximum length
unsigned long length;
// Maximum number of layers
int level;
} zskiplist;
Because each node in the hop table is connected by a pointer , Therefore, we only need to obtain the head node and tail node of the linked list, and then we can query each node of the jump table through the node pointer .
Jump table query
that , Now the jump table has been defined , Is it still like querying an ordinary linked list from the beginning one by one ? Of course not. , You can use... In the node level Array accelerated query .
When querying a node , The jump table will find the next node from the top of the node , Because the elements and weights are saved in the node , Therefore, when comparing nodes, the two should be compared .
1. When the weight of the node to be searched is smaller than that of the node to be searched , Then it will continue to access the next node of the layer .
2. When the weight of the node to be searched is equal to the weight of the node to be searched , Then continue to compare their values , If the search node data is smaller than the node data to be searched , The skip table will still access the next node of the layer .
When neither of these conditions is met , You will visit level The next pointer to the array , Then continue to find along the next pointer . It's equivalent to going to the next level and continuing to find . The specific code is as follows :
x = zsl->header;
for (i = zsl->level-1; i >= 0; i--) {
...
while (x->level[i].forward &&
(x->level[i].forward->score < score ||
(x->level[i].forward->score == score &&
sdscmp(x->level[i].forward->ele,ele) < 0)))
{
...
x = x->level[i].forward;
}
The number of nodes in each layer of the hop table is about half that of the next layer , The advantage of this is that the search process is similar to binary search , Time complexity comes from O(n) Down to O(logN), But this method also has negative problems , That is, once there is a node to insert and delete . Then the nodes to be inserted and deleted and the layers of the nodes behind them need to be adjusted, resulting in additional overhead .
Here's an example
Suppose the current skip table has 4 Nodes ,1、7、13、21.
Then, if you insert multiple nodes , As shown in the figure below
You can see that when multiple nodes are inserted , If it's looking for 45 This node needs to be in level 0 Search the nodes in order , The complexity is O(N) 了 . Now, in order to maintain the relationship between the number of adjacent nodes, it is modified as follows
Find it like this 45 The time complexity is reduced , But it also brings additional costs . To avoid the above problems , When creating a node, the jump table , The method of randomly generating the number of node layers is adopted . At this time, the adjacent layers are not strictly in accordance with 2:1 The way , In this case , When inserting a new node , Just modify the pointer of the front and back nodes , The number of layers of other nodes does not need to be changed , This reduces the complexity of the insertion operation .
In the source code , The number of jump table node layers is determined by zslRandomLevel Function to determine . The code is as follows
int zslRandomLevel(void) {
static const int threshold = ZSKIPLIST_P*RAND_MAX;
int level = 1;
while (random() < threshold)
level += 1;
return (level<ZSKIPLIST_MAXLEVEL) ? level : ZSKIPLIST_MAXLEVEL;
}
You can see zslRandomLevel The function initializes the number of layers to 1, This is also the minimum number of layers of nodes . And then generate random numbers , If the random number is less than ZSKIPLIST_P.ZSKIPLIST_P The definition is as follows
#define ZSKIPLIST_P 0.25
It represents the probability of increasing the number of layers by skipping the table , Then the number of layers increases 1 layer . Because the value of random number is [0,0.25) The probability within the range will not exceed 25%, So it also shows that the probability of adding one layer will not exceed 25%.
# summary
This article introduces Sorted Set The underlying implementation of data types .Sorted Set In order to support the query according to the weight range at the same time , And single point query for element weight , And single point query for element weight , In the bottom structure, the combination method of using jump table and hash table is designed .
A jump list is a multi-layer ordered linked list , When querying the jump table , It starts from the highest level , The higher the number of layers, the fewer nodes , If the high-level directly finds the node to be checked, it will directly return . If the first node larger than the element to be queried is found, the query will continue to the next level , Then continue to query in more nodes on the next layer . This design method of jump table saves query overhead , meanwhile , The skip table uses a random method to determine the number of layers of each node , This can avoid adding nodes , The chain reaction that causes other nodes to increase the number of layers .
版权声明
本文为[Magic circle]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/04/202204230510044916.html
边栏推荐
- #define 定义常量和宏,指针和结构体
- MFC实现资源单独Dll实现
- Simple application of parallel search set (red alarm)
- 项目经理值得一试的思维方式:项目成功方程式
- 4 most common automated test challenges and Countermeasures
- JS Array常见方法
- C. Tree infection (simulation + greed)
- Detailed explanation of concurrent topics
- 数据安全问题已成隐患,看vivo如何让“用户数据”重新披甲
- 好的测试数据管理,到底要怎么做?
猜你喜欢
2021 年 25 大 DevOps 工具(下)
MySQL 慢查询
Data security has become a hidden danger. Let's see how vivo can make "user data" armor again
The WebService interface writes and publishes calls to the WebService interface (2)
4 most common automated test challenges and Countermeasures
[2022 ICLR] Pyramid: low complexity pyramid attention for long range spatiotemporal sequence modeling and prediction
5 minutes to understand MySQL row column conversion
MySQL slow query
Redis data type usage scenario
Cross border e-commerce | Facebook and instagram: which social media is more suitable for you?
随机推荐
Summary of R & D technology
What are the reasons for the failure of digital transformation?
PIP free export with path (@ file: / / /) notes
Use the built-in function of win to transfer files between two computers in the same LAN (the speed is the same as that between local disks)
Qingdao agile tour, coming!
Deep learning notes - data expansion
Live delivery form template - automatically display pictures - automatically associate series products
Mairadb数据库基本操作之数据管理
看板快速启动指南
云计算与云原生 — OpenShift 的架构设计
Kubectl command automatic replenishment
When is it appropriate for automated testing? (bottom)
Docker installation and mysql5 7 installation
Deep learning notes - object detection and dataset + anchor box
API slow interface analysis
calendar. Pit point of getactualmaximum (calendar. Day_of_month)
[untitled] kimpei kdboxpro's cool running lantern coexists with beauty and strength
JSP -- Introduction to JSP
Data management of basic operation of mairadb database
Day.js 常用方法