
[Architecture] Why you need master-slave replication and sharding

2022-04-22 19:43:00 【Ch.yang】

Preface

I read a book today in which the author reminds readers to be rigorous when discussing consistency and fault tolerance. It also shows that architecture design calls for humility: there is no perfect solution, only a relatively better one.

This article covers two topics.
Master-slave replication
In large companies, developers often cannot choose the architecture on the database side. Master-slave replication is the mainstream choice, so how to handle business consistency under master-slave replication is something to pay attention to during development.
Sharding
Most content online calls it "sharding", but "partitioning" is the more accurate term. Different software vendors simply give partitions different names:
- ES: shard
- HBase: region
- Bigtable: tablet

1. Problems solved by master-slave replication

1.1. Master-slave replication, multi-master replication, and leaderless replication

Among database replication techniques, master-slave replication is the easiest to implement. The idea is to use one master as the only write entry point. The slaves synchronize the master's data and load-balance client reads, and write conflicts between multiple databases are avoided along the way. With multi-master or leaderless replication, write conflicts between databases are bound to occur. The best way to resolve a conflict is to avoid it, somewhat like ThreadLocal in Java. Because master-slave replication has a single write channel, conflicts between databases never arise. I still want to stress, though, that master-slave replication does not prevent write conflicts inside a single database: on any single machine the database still needs isolation levels and locking to guarantee consistency.

MySQL, PostgreSQL, Oracle, and other databases have master-slave replication built in, which means it can be set up without writing any client code.
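
A minimal sketch of the single-write-entry idea, using plain dictionaries in place of real databases. The class and method names (ReplicatedStore, replicate, and so on) are made up for illustration, and replicate() only stands in for the replication a real database performs on its own.

```python
import itertools


class ReplicatedStore:
    """Toy in-memory model: one master takes writes, slaves serve reads."""

    def __init__(self, slave_count):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]
        self._next_slave = itertools.cycle(range(slave_count))

    def write(self, key, value):
        # Single write entry point: every write goes to the master only,
        # so two databases can never accept conflicting writes.
        self.master[key] = value

    def replicate(self):
        # Stand-in for the database's built-in replication.
        for slave in self.slaves:
            slave.update(self.master)

    def read(self, key):
        # Round-robin load balancing of reads across the slaves.
        slave = self.slaves[next(self._next_slave)]
        return slave.get(key)


store = ReplicatedStore(slave_count=2)
store.write("title", "hello")
before = store.read("title")   # slaves have not caught up yet
store.replicate()
after = store.read("title")
```

Here before is None and after is "hello": a read that lands on a slave before replication runs sees stale data, which is exactly the consistency question the rest of the article deals with.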

1.2. Advantages of replication technology

- Place data geographically closer to users, which reduces access latency
  For example, a game is split into regions such as Southwest, South China, and North China
- Keep the system working when some components fail, which improves availability
  Gain the advantages of a cluster
  Just imagine: what if the master node fails? A new master node is elected (the solution is discussed later)
- Scale out to multiple machines that serve data access at the same time, which improves read throughput
  Improve read concurrency

2. Use asynchronous replication

The book introduces synchronous and asynchronous replication. Start by looking at the business scenario each technique has to support.

Synchronous replication
Business: every read in the system is guaranteed to be consistent with the master
Technique: a global lock, with updates applied in order
Asynchronous replication
Business: only the master is guaranteed to hold the latest data
Technique: no locking; the update log is broadcast, and background threads can carry out the synchronization

If the system's bottleneck is read capacity, go straight to asynchronous replication. Asynchronous replication has its problems, but database vendors may already have simplified or solved them. In particular, if the business side is not sensitive to data lag, there is no need to sacrifice concurrency for synchronous replication.
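
The difference between the two modes can be sketched with a toy master that either updates every slave before returning or just queues the update. This is an illustration under simplifying assumptions, not how any particular database does it: the names are invented, the slaves are dictionaries, and drain() plays the role of the background thread.

```python
from collections import deque


class Master:
    def __init__(self, slave_count):
        self.data = {}
        self.slaves = [{} for _ in range(slave_count)]
        self.pending = deque()  # update log not yet applied by the slaves

    def write_sync(self, key, value):
        # Synchronous: return only after every slave has the update,
        # so any later read sees it, at the cost of a slower write.
        self.data[key] = value
        for slave in self.slaves:
            slave[key] = value

    def write_async(self, key, value):
        # Asynchronous: return as soon as the master is updated.
        self.data[key] = value
        self.pending.append((key, value))

    def drain(self):
        # What a background thread would do: apply the log in order.
        while self.pending:
            key, value = self.pending.popleft()
            for slave in self.slaves:
                slave[key] = value


master = Master(slave_count=2)
master.write_async("score", "1:0")
lagging = master.slaves[0].get("score")   # not replicated yet
master.drain()
caught_up = master.slaves[0].get("score")
```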

2.1. The business side is not sensitive to lag

Watching a match
Because reads are load-balanced across multiple nodes, users do not notice a data lag of 500 ms: a viewer usually watches on a single client, and two clients are rarely playing side by side in the same room. Even if the lag reaches about 1 s, users will not blame the software architecture, because delay caused by network fluctuation is already widely accepted. The business side does not need to give clients a full real-time guarantee, but what two users see must be the same in the end. This is the so-called eventual consistency guarantee.

2.2. Keep your information up to date

Seeing your own changes immediately
On CSDN, a change to your own post should show up immediately, while some lag is acceptable when you read other users' posts.
Data you wrote yourself must be read from the master, so the business side applies the following rule:
1. Logged-in user's id == id of the user being accessed -> read from the master
2. Otherwise, read from a slave
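
The rule above fits in a few lines. The function name and the "master" / "slave" return values are placeholders for whatever connection objects the application really uses.

```python
def choose_node(logged_in_user_id, owner_id):
    """Decide which kind of node should serve a read of owner_id's data."""
    if logged_in_user_id == owner_id:
        # Reading your own data: go to the master so your latest
        # write is always visible to you.
        return "master"
    # Reading someone else's data: a slightly stale slave is fine.
    return "slave"


own_profile = choose_node(logged_in_user_id=42, owner_id=42)
other_profile = choose_node(logged_in_user_id=42, owner_id=7)
```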
No going backward across repeated reads
Why would data go backward? Suppose slave A is up to date while slave B is still catching up. Reading from A first and then from B makes the information appear to roll back.
How to solve it? Route all of a given user's reads to a single slave. To avoid hot spots, spread the user-to-slave mapping evenly, using the same idea as a hash table.
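
A sketch of that mapping, assuming the slaves are identified by simple names. It uses hashlib rather than Python's built-in hash(), because hash() of a string changes between processes and the route has to stay stable.

```python
import hashlib


def slave_for_user(user_id, slaves):
    """Map a user to one fixed slave so repeated reads never move backward."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(slaves)
    return slaves[index]


slaves = ["slave-a", "slave-b", "slave-c"]
first = slave_for_user(1001, slaves)
second = slave_for_user(1001, slaves)
```

first and second are always the same slave. Note that the mapping changes when the list of slaves changes, so a slave going down still needs separate handling.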
Combining client and server
The client records the version of the data it last read and sends that version along with the next request.
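
One possible reading of this idea, sketched under loud assumptions: each node exposes a single increasing version number, and the server skips any slave that is older than the version the client reports. None of these names come from the book or from a real database API.

```python
class Node:
    def __init__(self, version, data):
        self.version = version  # assumed: grows with every applied update
        self.data = data


def read_with_min_version(key, min_version, slaves, master):
    # Server side: use the first slave that is at least as new as the
    # client's last seen version, otherwise fall back to the master.
    for slave in slaves:
        if slave.version >= min_version:
            return slave.data.get(key), slave.version
    return master.data.get(key), master.version


class Client:
    def __init__(self):
        self.last_seen_version = 0

    def read(self, key, slaves, master):
        value, version = read_with_min_version(
            key, self.last_seen_version, slaves, master
        )
        # Remember the newest version seen, to send with the next request.
        self.last_seen_version = max(self.last_seen_version, version)
        return value


master = Node(3, {"post": "edited"})
slave_a = Node(3, {"post": "edited"})
slave_b = Node(1, {"post": "draft"})

client = Client()
r1 = client.read("post", [slave_a, slave_b], master)
r2 = client.read("post", [slave_b, slave_a], master)  # lagging slave_b is skipped
```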

3. Eventual consistency and high availability of master-slave replication

3.1 Adding slaves to the cluster without downtime

1. Slave B asks master A for a consistent snapshot
2. Slave B builds its own copy from the snapshot, while master A keeps logging data changes
3. Slave B asks master A for the change log since the snapshot and applies those changes
4. Slave B has successfully joined the cluster
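
The four steps can be sketched with an in-memory master whose log position is simply the length of a list. LoggingMaster, add_slave, and the writes_during_copy parameter are hypothetical names; the parameter simulates writes that arrive while the snapshot is being copied.

```python
class LoggingMaster:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered change log of (key, value) pairs

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def snapshot(self):
        # A consistent snapshot plus the log position it corresponds to.
        return dict(self.data), len(self.log)

    def changes_since(self, position):
        return self.log[position:]


def add_slave(master, writes_during_copy=()):
    data, position = master.snapshot()            # step 1
    for key, value in writes_during_copy:         # step 2: master keeps taking writes
        master.write(key, value)
    for key, value in master.changes_since(position):  # step 3: catch up
        data[key] = value
    return data                                   # step 4: ready to join


master = LoggingMaster()
master.write("a", 1)
slave_b = add_slave(master, writes_during_copy=[("b", 2), ("a", 3)])
```

After the catch-up, slave_b holds the same data as the master even though the master never stopped accepting writes.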

3.2 Totally ordered log

1. The master logs each update and broadcasts it to all slaves
2. Each slave receives the records and writes them in order, so that its write order matches the master's
2.1 With network jitter, given log record 1 and log record 2, what if a slave receives record 2 first?
Borrowing from TCP's sliding window, the slave holds back the update, waits for record 1 to arrive, and then applies both together
If record 1 is lost, the slave can request it again
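
A small sketch of the slave side, assuming every log record carries a sequence number starting at 1. Records that arrive early wait in a buffer, and missing() lists the gaps the slave would ask the master to resend. The names are illustrative only.

```python
class Slave:
    def __init__(self):
        self.data = {}
        self.next_seq = 1   # sequence number expected next
        self.buffer = {}    # early records waiting for a gap to be filled

    def receive(self, seq, key, value):
        if seq < self.next_seq:
            return  # duplicate of a record that was already applied
        self.buffer[seq] = (key, value)
        # Apply as many consecutive records as are now available.
        while self.next_seq in self.buffer:
            k, v = self.buffer.pop(self.next_seq)
            self.data[k] = v
            self.next_seq += 1

    def missing(self):
        # Sequence numbers to re-request from the master.
        if not self.buffer:
            return []
        return [s for s in range(self.next_seq, max(self.buffer))
                if s not in self.buffer]


slave = Slave()
slave.receive(2, "x", "second")   # arrives early, must wait
gaps = slave.missing()
held_back = dict(slave.data)
slave.receive(1, "x", "first")    # gap filled, both applied in order
```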

3.3 Dynamic problems

Dynamic problems are often hard to solve. The main ones are:

- A slave goes down, so the original routing rules must change
  The old routes have to be dropped, otherwise requests from some users will fail.
- The master goes down, so a new master must be elected.
  If the election is not synchronized, two masters can appear. This is the "split brain" problem, and it reintroduces write conflicts.

I think the best approach to both problems is to hand them off to an existing piece of software such as ZooKeeper, with the application relying on ZooKeeper's dynamic directory. How ZooKeeper works is another big topic.

4. Problems solved by sharding

- With massive data or very high query pressure, the response speed of any single node becomes the bottleneck
  Reducing the amount of data on each node shortens full table scans, so the worst-case response time improves as well
- Better scalability
  Data can be spread across different disks, and the query load is spread across more processors
- Customized queries
  For key-range queries, the application layer routes directly to the relevant partition, so queries are fast.
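
Key-range routing in the application layer can be sketched with a sorted list of split points and a binary search. The boundaries and partition names below are invented for the example.

```python
import bisect

# Hypothetical layout: p0 holds keys below "g", p1 holds "g" up to "n",
# p2 holds "n" up to "t", and p3 holds everything from "t" onward.
BOUNDARIES = ["g", "n", "t"]
PARTITIONS = ["p0", "p1", "p2", "p3"]


def partition_for(key):
    """Partition that owns a single key."""
    return PARTITIONS[bisect.bisect_right(BOUNDARIES, key)]


def partitions_for_range(start, end):
    """Partitions that may hold keys in the inclusive range [start, end]."""
    first = bisect.bisect_right(BOUNDARIES, start)
    last = bisect.bisect_right(BOUNDARIES, end)
    return PARTITIONS[first:last + 1]


single = partition_for("apple")
narrow = partitions_for_range("h", "m")
wide = partitions_for_range("a", "o")
```

A narrow range such as "h" to "m" touches only one partition, which is why range queries on the partition key are fast; a wide range has to visit several.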

4.1 The relationship between sharding and replication

The idea of sharding is very simple: spread the data out.

4.2. How to keep master-slave replication after sharding

Build a logical view in the application layer that routes each request to a node, with exactly one route per key. You can then think of the master as having been partitioned, with each shard keeping its own replicas.
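
A sketch of such a logical view, assuming hash-based routing and dictionaries in place of databases. Each ShardGroup is its own small master-slave setup; the names and the inline replication loop are illustrative only.

```python
import hashlib


class ShardGroup:
    """One shard: a master plus its own slaves."""

    def __init__(self, name, slave_count):
        self.name = name
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]


def shard_for(key, groups):
    # Exactly one route per key: a stable hash picks the owning shard.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return groups[int.from_bytes(digest[:8], "big") % len(groups)]


def write(key, value, groups):
    group = shard_for(key, groups)
    group.master[key] = value
    for slave in group.slaves:  # stand-in for the shard's own replication
        slave[key] = value
    return group.name


groups = [ShardGroup("shard-0", 2), ShardGroup("shard-1", 2)]
written_to = write("user:1", "alice", groups)
owner = shard_for("user:1", groups)
```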

4.3. Combining replication and sharding

The figure below is taken from the book. Recall that ES supports this combined mode.

[Figure from the book: replication combined with sharding]

4.4. The added value of sharding and replication

When deploying ES, you can specify the number of shards and the number of replicas. In our business implementation, ES serves as a search engine, and a bounded amount of data inconsistency is allowed. Specifically, real-time records are stored in MySQL and ES is fully refreshed on a regular schedule; while a full refresh is running, external access is still served by the old version. With ES, there is no need to worry about dynamically scaling the partitions, and that is where the added value comes from:

- Less data per node, which improves read performance
- The primary shards on each node can serve load-balanced queries
- Each node can hold different shards
- If a node crashes, copies of its shards can be fetched from other nodes for recovery
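
As an illustration of specifying shards and replicas, the snippet below builds the settings body used when creating an ES index. number_of_shards and number_of_replicas are real ES index settings; the helper functions and the chosen values (3 shards, 1 replica) are made up for the example.

```python
import json


def index_settings(number_of_shards, number_of_replicas):
    """Request body for creating an index with the given layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        }
    }


def total_shard_copies(number_of_shards, number_of_replicas):
    # Every primary shard has number_of_replicas extra copies.
    return number_of_shards * (1 + number_of_replicas)


body = index_settings(number_of_shards=3, number_of_replicas=1)
print(json.dumps(body, indent=2))
copies = total_shard_copies(3, 1)
```

With 3 primary shards and 1 replica each, the cluster holds 6 shard copies in total, so losing one node still leaves a copy of every shard on the remaining nodes, provided there are at least two nodes.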

Copyright notice
This article was written by [Ch.yang]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204221935289580.html