[Architecture] Why you need master-slave replication and sharding
2022-04-22 19:43:00 【Ch.yang】
Preface
I read a book today. Its author urges readers to discuss consistency and fault tolerance rigorously, and shows why architecture design calls for humility: there is no perfect solution, only a relatively better one.
- This article covers master-slave replication
In large companies, developers often cannot choose the database architecture. Master-slave replication is the mainstream choice, so handling business consistency under master-slave replication is something developers need to pay attention to.
- Sharding
Most content online calls this "sharding", although "partitioning" is the more accurate term. Different software vendors simply give partitions different names:
- ES: shard
- HBase: region
- Bigtable: tablet
1. Problems solved by master-slave replication
1.1. Master-slave replication, multi-master replication, and leaderless replication
Among database replication techniques, master-slave replication is the easiest to implement. One master database serves as the only write entry point; the slave databases synchronize the master's data and load-balance client reads. Because there is a single write path, write conflicts between multiple databases cannot occur. Multi-master and leaderless replication, by contrast, inevitably face write conflicts between databases. The best way to resolve a conflict is to avoid it, which is somewhat like ThreadLocal in Java. It is worth stressing, though, that master-slave replication does not prevent write conflicts within a single database: on any single node, the database still relies on isolation levels and locking to ensure consistency.
MySQL, PostgreSQL, Oracle, and other databases have built-in master-slave replication, so it can be set up without writing any client code.
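To make the single-write-path idea concrete, here is a toy in-memory model. It is only a sketch: the class `ReplicatedStore` and its methods are hypothetical names for illustration, and `replicate()` stands in for the replication stream that a real database runs internally.

```python
import itertools


class ReplicatedStore:
    """Toy master-slave setup: one write entry point, several read copies."""

    def __init__(self, replica_count):
        self.master = {}
        self.replicas = [{} for _ in range(replica_count)]
        # Round-robin iterator used to balance reads across the slaves.
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key, value):
        # Every write goes through the single master, so two nodes can
        # never accept conflicting writes for the same key.
        self.master[key] = value

    def replicate(self):
        # Stand-in for the database's built-in replication stream.
        for replica in self.replicas:
            replica.update(self.master)

    def read(self, key):
        # Reads are spread over the slaves, not the master.
        replica = self.replicas[next(self._next_replica)]
        return replica.get(key)


store = ReplicatedStore(replica_count=2)
store.write("post:1", "hello")
stale = store.read("post:1")   # slaves have not synchronized yet
store.replicate()
fresh = store.read("post:1")   # visible once replication has run
```

The first read misses the new value because the slaves lag behind the master, which is exactly the consistency question the rest of the article deals with.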
1.2. Advantages of replication technology
- Keep data geographically closer to users, which reduces access latency
For example, a game divides its servers into regions such as Southwest China, South China, and North China.
- Keep the system working when some components fail, which improves availability
This is the advantage of running a cluster. Consider what happens if the master node fails: a new master node is elected (the solution is discussed later).
- Scale out to multiple machines that serve data access at the same time, which improves read throughput
Read concurrency goes up.
2. Use asynchronous replication
The book introduces synchronous and asynchronous replication. First, consider the business scenarios each one has to support.
- Synchronous replication
Business: every read in the system is consistent with the master
Technique: updates are applied in order under a global lock
- Asynchronous replication
Business: only the master's data is guaranteed to be up to date
Technique: no locking; the update log is broadcast, and background threads can perform the synchronization
If read capacity is the bottleneck of the system, use asynchronous replication. Asynchronous replication has its own problems, but database vendors may already have simplified or solved many of them. In particular, if the business side is not sensitive to data lag, there is no need to sacrifice concurrency for synchronous replication.
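The difference between the two modes can be sketched in a few lines. This is a simplified model, not how any particular database implements it: `Master`, `write_sync`, `write_async`, and `ship_log` are hypothetical names, and the "background thread" is replaced by an explicit call.

```python
from collections import deque


class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = deque()  # changes not yet applied by the slaves

    def write_sync(self, key, value):
        # Synchronous: the write returns only after every slave applied it.
        self.data[key] = value
        for replica in self.replicas:
            replica[key] = value

    def write_async(self, key, value):
        # Asynchronous: record the change and return immediately.
        self.data[key] = value
        self.pending.append((key, value))

    def ship_log(self):
        # In a real system a background thread would do this continuously.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica[key] = value


replica = {}
master = Master([replica])
master.write_sync("score", 1)
after_sync = replica.get("score")    # slave already has the value
master.write_async("score", 2)
lagging = replica.get("score")       # slave still holds the old value
master.ship_log()
caught_up = replica.get("score")     # slave converges after the log ships
```

The asynchronous write returns faster, at the price of a window in which slaves serve older data.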
2.1. The business side is not sensitive to lag
- Watching a live match
Reads are load-balanced across multiple nodes, so even with 500 ms of data lag, users do not notice any difference between one another: a viewer usually watches on a single client, and it is rare for two clients to play side by side in the same room. Even if the lag reaches about 1 s, users will not blame the software architecture, since delay caused by network fluctuation is widely accepted. The business side does not need to give clients a fully real-time guarantee, but the content two users see must be the same in the end. This is the eventual consistency guarantee.
2.2. Keep your information up to date
- See your own changes immediately
On CSDN, changes to your own posts should be visible right away, while some lag is acceptable when you read other users' posts.
Data you wrote yourself must be read from the master, so the business side applies the following rule:
  - If the logged-in user's id == the id of the user whose data is requested, read from the master
  - Otherwise, read from a slave
- Information you read repeatedly must not go backwards
Why would it go backwards? Suppose slave A is up to date and slave B is still catching up. Reading from A first and then from B makes the information appear to roll back.
How to solve it? Route all of a given user's reads to a single slave. To avoid hot spots, distribute the user-to-slave mapping evenly, as a hash table would.
- Combine client and server
The client records the version of the data it requested, then requests again.
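The routing rules above can be sketched as follows. The function names, the node labels, and the use of CRC32 as the hash are assumptions for illustration; the version check in `can_serve` is one possible reading of the client-side version idea, not something the original spells out.

```python
import zlib


def choose_node(viewer_id, owner_id, replicas, master="master"):
    """Pick the node that should serve a read of owner_id's data."""
    if viewer_id == owner_id:
        # Reading your own data: go to the master so edits show up at once.
        return master
    # Otherwise pin each viewer to one slave with a stable hash, so
    # repeated reads never jump from a fresher slave to a staler one.
    index = zlib.crc32(str(viewer_id).encode("utf-8")) % len(replicas)
    return replicas[index]


def can_serve(replica_version, last_seen_version):
    # The client remembers the newest version it has seen; a slave that
    # is older than that should not answer, or the user would see a rollback.
    return replica_version >= last_seen_version


slaves = ["slave-a", "slave-b", "slave-c"]
own_read = choose_node(viewer_id=7, owner_id=7, replicas=slaves)
other_read_1 = choose_node(viewer_id=7, owner_id=8, replicas=slaves)
other_read_2 = choose_node(viewer_id=7, owner_id=9, replicas=slaves)
```

Because the slave is chosen from the viewer's id alone, the same viewer always lands on the same slave, whichever user's data is being read.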
3. Eventual consistency and high availability of master-slave replication
3.1 Adding slaves to the cluster without downtime
- Slave B asks master A for a consistent snapshot
- Slave B builds its own data from the snapshot, while master A keeps logging data changes
- Slave B requests master A's change log since the snapshot and applies those changes
- Slave B has successfully joined the cluster
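The four steps above can be sketched as a snapshot followed by a log catch-up. This is a minimal in-memory model under stated assumptions: `Master`, `snapshot`, `changes_since`, and `bootstrap_replica` are hypothetical names, and the log position is simply the length of a Python list.

```python
import copy


class Master:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def snapshot(self):
        # Return a consistent copy plus the log position it corresponds to.
        return copy.deepcopy(self.data), len(self.log)

    def changes_since(self, position):
        return self.log[position:]


def bootstrap_replica(master, writes_during_copy=()):
    data, position = master.snapshot()                  # step 1: snapshot
    for key, value in writes_during_copy:               # step 2: master stays online
        master.write(key, value)
    for key, value in master.changes_since(position):   # step 3: catch up
        data[key] = value
    return data                                         # step 4: slave is in sync


master = Master()
master.write("a", 1)
replica = bootstrap_replica(master, writes_during_copy=[("a", 2), ("b", 3)])
```

Recording the log position together with the snapshot is what lets the master keep accepting writes while the new slave is being built.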
3.2 Totally ordered log
- The master logs each update and broadcasts the log to all slaves
- Each slave receives the log and applies it in order, so its write order matches the master's
Suppose there is network jitter: given log record 1 and log record 2, what should a slave do if it receives record 2 first?
Borrow the idea of TCP's sliding window: block the slave's updates, wait for record 1 to arrive, and apply both together.
If record 1 is lost, the slave can request record 1 again.
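A slave that buffers early records and applies them only once the gap is filled might look like this. It is a sketch, assuming every log record carries a sequence number starting at 1; `OrderedApplier` and its methods are hypothetical names.

```python
class OrderedApplier:
    """Applies log records strictly in sequence-number order."""

    def __init__(self):
        self.next_seq = 1   # sequence number we are waiting for
        self.buffer = {}    # records that arrived early
        self.applied = []   # records applied so far, in order

    def receive(self, seq, record):
        if seq < self.next_seq:
            return  # duplicate of something already applied
        self.buffer[seq] = record
        # Drain every record that is now contiguous with what was applied.
        while self.next_seq in self.buffer:
            self.applied.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1

    def missing(self):
        # Sequence numbers the slave could ask the master to resend.
        if not self.buffer:
            return []
        return [s for s in range(self.next_seq, max(self.buffer))
                if s not in self.buffer]


applier = OrderedApplier()
applier.receive(2, "set x=2")
held_back = list(applier.applied)   # nothing applied while record 1 is missing
gap = applier.missing()             # the record to request again
applier.receive(1, "set x=1")       # both records are now applied in order
```

The `missing()` list covers the lost-record case: if record 1 never arrives, the slave knows exactly which sequence numbers to ask for.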
3.3 Dynamic problems
Dynamic problems are often hard to solve. The main ones are:
- A slave node goes down, and the original routing rules must be changed
The old routes have to be discarded; otherwise some users' requests will fail.
- The master node goes down, and a new master node must be chosen
If the election is not synchronized, two master nodes can appear. This is the "split brain" problem, and it brings back write conflicts.
The author believes both problems are best delegated to existing software such as ZooKeeper, with the application relying on ZooKeeper's dynamic directory. How ZooKeeper works is another big topic.
4. Problems solved by sharding
- With massive data or very high query pressure, the response speed of any single node becomes the bottleneck
Reducing the amount of data on a single node shortens full table scans, which also improves the worst-case response time.
- Better scalability
Data can be spread across different disks, and the query load is spread across more processors.
- Customized queries
For range queries on the key, the application layer routes to the specific partition, so queries are fast.
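As an illustration of range-based routing in the application layer, the sketch below keeps a sorted list of partition boundaries and uses binary search to find the partitions a query has to touch. The boundaries and node names are made up for the example.

```python
import bisect

# Hypothetical layout: each boundary is the first key of the next partition.
BOUNDARIES = ["h", "p"]  # partitions: keys < "h", "h" <= keys < "p", keys >= "p"
PARTITIONS = ["node-0", "node-1", "node-2"]


def partition_for(key):
    """Node that owns a single key."""
    return PARTITIONS[bisect.bisect_right(BOUNDARIES, key)]


def partitions_for_range(start, end):
    """Nodes that must be queried for keys in [start, end]."""
    lo = bisect.bisect_right(BOUNDARIES, start)
    hi = bisect.bisect_right(BOUNDARIES, end)
    return PARTITIONS[lo:hi + 1]


single = partition_for("horse")
narrow = partitions_for_range("a", "c")
wide = partitions_for_range("g", "i")
```

A narrow range stays on one node, which is why range queries on the partition key are fast; a range that crosses a boundary has to fan out to every partition it overlaps.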
4.1 The relationship between sharding and replication
The idea of sharding is very simple: spread the data out.

4.2. How to keep master-slave replication after sharding
- Build a logical view in the application layer that routes each request to a node; the route is unique. You can therefore think of the master as being partitioned, with each shard having its own replicas.

4.3. Combining replication and sharding
Attached is a figure from the book. Recall that ES supports this combination.

4.4 The added value of sharding and replication
When deploying ES, you can specify the number of shards and the number of replicas. In this business implementation, ES serves as a search engine, and data inconsistency within a certain window is tolerated. Specifically, real-time records are stored in MySQL and ES is fully refreshed at regular intervals; during a full refresh, external access still goes through the interface backed by the old version. The ES partitions do not need to be concerned with dynamic capacity expansion, which gives this setup its added value:
- Less data per node, which improves read performance
- The primary shards on each node can serve load-balanced queries
- Each node can hold different shards
- If a node crashes, copies of its shards can be fetched from other nodes for recovery
Copyright notice
This article was written by [Ch.yang]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204221935289580.html