当前位置:网站首页>Detailed description of MySQL index [B + tree index, hash index, full-text index, overlay index]
Detailed description of MySQL index [B + tree index, hash index, full-text index, overlay index]
2022-04-23 03:18:00 【Xiaodaoxian 97】
Some time ago, every time the interview mentioned the index , I'll just say a bunch of , Then it's time to talk about what you understand B+tree Indexes I'm so confused .
Direct said B+tree
It may not be well understood , Let's start with the simplest binary search tree and gradually .
One 、B+Tree Indexes
1、 Binary search tree
When I first learned about trees , We must have learned a binary tree with such a structure The root node is larger than its left node , Smaller than its right node .
If we want to query in the above binary tree 6
, Just three steps
- Find the root node
10
, Judge 6 Than 10 Small , Left seeking node - Find the node
5
, Judge 6 Than 5 Big , Find the right node - Find the node
6
, Judge 6 Meet the search needs
2、 Balanced binary trees (AVL Trees )
Those familiar with binary trees know , In special circumstances , The above binary tree may form the following structure
If you query nodes on this binary tree , That is to compare them one by one , Quite inefficient , At this time, we will introduce Balanced binary trees The concept of .
Balanced binary tree has the following requirements :
- The left node of each root node is smaller than it , The right node is larger than it
- The height difference between the left and right subtrees of each node cannot exceed
1
Every time you add elements , The code will judge whether the current structure is still balanced , If not, adjust . The cost of adjustment is also very high , Therefore, balanced binary tree is generally used in multi query function .
3、B-Tree
All our data is ultimately stored on disk . There is a problem with balanced binary trees , That is, if the amount of data is too large , Then the tree will be very long , This can lead to frequent disk IO, In this way, the efficiency is reduced a lot .
In the database, data is not stored one by one , Instead, store by page , One page is 16kb size .
B-Tree Can solve frequent IO problem , It stores data according to pages .
If we have such a table , There are two fields id、name,id It's the primary key
Then its storage structure is as follows :
Pages also point in both directions
From the above structure, we can see , each page It stores indexes and corresponding data , And there are multiple pieces of data , Not a single one .
Usually our root page ( page 1) It's stored in memory , Then we judge whether to read pages into memory one by one , Using this structure, you can query fewer pages , Then we can query the data we want .
4、B+Tree
There is also a problem with the above structure , Because the size of each page is fixed (16KB), If you want to store both indexes , And store data , Then we 16KB It doesn't store much index data ( Especially in big tables ), This will still happen frequently IO Handle .
If we only store the index in the node , Store data in the leaf node , In this way, we can store more indexes on each page .
4-1、 Accurate search
If we want to check 1
, Find the page first 1, Find another page 2, Finally, find the corresponding data from the leaf node .
4-2、 Range queries
If we want to find WHERE id > 2 AND id < 5
As above, we will find out exactly 2
, Leaf nodes have bidirectional pointers up, down, left and right , Go down one by one , When looking for a fight 5 Found that id < 5
It's not enough , Just finish searching .
5、 summary
- B+Tree Store all the data on the leaf node , Non leaf nodes only store indexes , This ensures a minimum number of IO Improve the performance of index query .
- The storage time is not a node by node , Instead, it is stored as a page , The size of each page is 16KB.
Two 、Hash Indexes
Hash An index is based on a given field , Create Hash value .Hash The index can quickly query a single matching degree , But you can't do range query .
If you create a composite index (A、B), It is based on AB Two fields for Hash Of , So when you use it alone A When performing conditional screening , You can't use indexes .
3、 ... and 、 Full-text index
Full text index is a special index , Generally, it is rarely used . It looks for keywords in the text , Instead of comparing the values in the index . Full text indexing is more similar to what search engines do .
Four 、 Clustered index and non clustered index
Cluster index is not a separate index type , It's a way of storing data . stay InnoDB The cluster index of is actually stored in the same structure B+Tree Index and data rows . When a table has a clustered index , Its data rows are actually stored in the leaf pages of the index .
Because you can't put rows in two different places at the same time , So a table can only have one clustered index .( however , Overlay index can simulate the situation of multiple clustered indexes , The following instructions )
stay InnoDB The primary key will be selected as the cluster index , If there is no primary key defined ,InnoDB Will choose a unique non empty index instead of . If there is no such index ,InnoDB A primary key is implicitly defined as a clustered index .
Cluster index : Its non leaf nodes store , primary key ( Most of them are ), Leaf nodes are stored row data .
Nonclustered index : Its non leaf nodes store , Index value , The leaf node stores the primary key index corresponding to this index .
So when we use non clustered index to query data , It will be queried twice, and the cluster index will be queried first , Then find the relevant data through the cluster index , This process is called Back to the table .
5、 ... and 、 Other
5-1、 Why is the primary key index faster than other indexes
Because the non primary key index corresponds to the non clustered index , Therefore, you need to perform the operation of returning to the table when querying , First find the corresponding primary key index in the query , Then query the real data through the primary key index .
But not all primary key indexes are faster than other indexes , For example, we will talk about the overlay index .
5-2、 Overlay index
Suppose we have such a table
CREATE TABLE `t_user` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT ' Primary key id',
`name` varchar(50) DEFAULT NULL COMMENT ' full name ',
`age` int(3) DEFAULT NULL COMMENT ' Age ',
`sex` tinyint(1) DEFAULT NULL COMMENT ' Gender ',
PRIMARY KEY (`id`),
KEY `name_age` (`name`,`age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
We need to check name and age,sql as follows
SELECT name, age FROM t_user
Because we established name,age The index of , And the data we query is these two , So we don't need to go back to the table .
Overlay index : If an index contains ( Or coverage ) The value of all the fields that need to be queried , We call it “ Overlay index ”.
notes : The overlay index must store the value of the index column , And hash index 、 Spatial index and full-text index do not store the value of index column , therefore MySQL Only use B+Tree Index to do overlay index . in addition , Different storage engines implement overlay indexes in different ways , And not all engines support overlay indexes .
版权声明
本文为[Xiaodaoxian 97]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/04/202204230315096084.html
边栏推荐
- Test experience data
- C introduction of variable parameter params
- socket編程 send()與 recv()函數詳解
- Seminar playback video: how to improve Jenkins' ability to become a real Devops platform
- Tencent video VIP member, weekly card special price of 9 yuan! Tencent official direct charging, members take effect immediately!
- Data mining series (3)_ Data mining plug-in for Excel_ Estimation analysis
- JSON related
- Fight leetcode again (290. Word law)
- Peut recevoir plusieurs paramètres de type de données - paramètres variables
- Queue storage and circular queue
猜你喜欢
Data mining series (3)_ Data mining plug-in for Excel_ Estimation analysis
可以接收多种数据类型参数——可变参数
类似Jira的十大项目管理软件
The most understandable life cycle of dependency injection
The most easy to understand dependency injection and control inversion
一套组合拳,打造一款 IDEA 护眼方案
After the mobile phone is connected to the computer, how can QT's QDIR read the mobile phone file path
Fiddler use
Use of slice grammar sugar in C #
12.<tag-链表和常考点综合>-lt.234-回文链表
随机推荐
Experiment 6 input / output stream
PID debugging of coding motor (speed loop | position loop | follow)
The most easy to understand service container and scope of dependency injection
Comprehensive calculation of employee information
12. < tag linked list and common test site synthesis > - lt.234 palindrome linked list
Chapter 9 of C language programming (fifth edition of Tan Haoqiang) analysis and answer of exercises for users to establish their own data types
OLED multi-level menu record
2022T电梯修理考试模拟100题及在线模拟考试
为什么BI对企业这么重要?
Experiment 5 components and event handling
Use split to solve the "most common words" problem
[vs Code] solve the problem that the jupyter file displays exceptions in vs code
be based on. NETCORE development blog project starblog - (1) why do you need to write your own blog?
类似Jira的十大项目管理软件
[Mysql] LEFT函數 | RIGHT函數
TCP three handshakes and four waves
软件测试相关知识~
Fundamentals of software testing and development
一文了解全面静态代码分析
Use of ADB command [1]