online schema change and create index
2022-08-09 02:36:00 【KD_】
online schema change
PolarDB-X Online Schema Change
PolarDB-X: Making "Online DDL" More Online
VLDB13 | Online, Asynchronous Schema Change in F1
online schema change paper
The rough idea of online schema change is to allow two versions of the metadata (schema) to exist in the database at the same time, and to guarantee that transactions running concurrently against those two versions cannot make the data inconsistent.
Why must the database allow two versions of the metadata to exist at the same time?
In a distributed database with multiple stateless compute nodes (CNs), every CN caches a copy of the metadata and serves SQL based on that copy. When a DDL statement modifies the metadata, ideally all CNs would switch to the new version at the same instant, but that is impossible.
So it is unavoidable that at least two versions of the metadata exist in the database at some point. What goes wrong when two versions coexist?
Take CREATE INDEX as an example:
The cluster consists of CNs (compute nodes) and DNs (storage nodes), and each CN caches a copy of the schema. Because CN0 and CN1 load the schema asynchronously, there can be a moment while the index is being added when CN0 believes the index exists and CN1 believes it does not. Two anomalies are then possible:
- Extra data in the index (Orphan Data Anomaly): CN0 executes an INSERT and writes the row to both the table and the index, then CN1 executes a DELETE. Because CN1 believes there is no index, it deletes the row from the table only.
- Missing data in the index (Integrity Anomaly): CN1 executes an INSERT. Because CN1 believes there is no index, it writes the row to the table only and records no incremental log for it, so once index creation finishes, the index lacks the data of that INSERT.
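The two anomalies can be reproduced with a toy model. This is only a sketch: the `ComputeNode` class, the dictionaries standing in for the table and the index, and the helper names are all hypothetical, not part of PolarDB-X or F1.

```python
# Toy model: a table (pk -> value) and an index (value -> pk),
# written by two compute nodes whose cached schemas disagree.
class ComputeNode:
    def __init__(self, name, sees_index):
        self.name = name
        self.sees_index = sees_index  # what this CN's cached schema says

    def insert(self, table, index, pk, value):
        table[pk] = value
        if self.sees_index:
            index[value] = pk

    def delete(self, table, index, pk):
        value = table.pop(pk)
        if self.sees_index:
            index.pop(value, None)


def orphan_entries(table, index):
    """Index entries that point at rows no longer in the table."""
    return {v: pk for v, pk in index.items() if table.get(pk) != v}


def missing_entries(table, index):
    """Table rows that have no matching index entry."""
    return {pk: v for pk, v in table.items() if index.get(v) != pk}


cn0 = ComputeNode("CN0", sees_index=True)   # already loaded the new schema
cn1 = ComputeNode("CN1", sees_index=False)  # still on the old schema

# Orphan Data Anomaly: CN0 inserts, CN1 deletes from the table only.
table, index = {}, {}
cn0.insert(table, index, 1, "a")
cn1.delete(table, index, 1)
orphans = orphan_entries(table, index)

# Integrity Anomaly: CN1 inserts into the table only.
table2, index2 = {}, {}
cn1.insert(table2, index2, 2, "b")
missing = missing_entries(table2, index2)
```

In the first scenario the index keeps an entry for a row that is gone; in the second the table holds a row the index never saw.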
How can this be solved?
- Pessimistic MDL locking: when a DDL statement runs, take an MDL lock on the metadata of every CN so that they stop serving requests. Once every CN has updated its metadata, release the locks and resume service. Two versions of the metadata still exist for a while, but neither serves requests, so no problem arises.
- Allow the database to hold two versions of the metadata, both serving requests, while guaranteeing that no data consistency problem occurs.
The second approach is clearly better.
How can two versions of the metadata serve requests at the same time without causing consistency problems?
The paper "Online, Asynchronous Schema Change in F1" defines several states for metadata.
A schema change is in fact the addition, deletion, or modification of a schema element, so below the state of the element is referred to directly as the state of the schema.
A metadata version is now defined as metadata + state. The same metadata in two different states counts as two versions of the metadata.
Switching from the old metadata to the new metadata then becomes:
old_schema_stateN -> new_schema_state1 -> new_schema_state2 -> new_schema_state3
The states are absent, delete_only, write_only, and public. Note that a state describes the element being modified in the schema.
So the job of online schema change is to design the intermediate states of the switch from the old metadata to the new, such that any two adjacent versions of the metadata cannot cause data consistency problems.
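The "version = metadata + state" definition can be sketched as a small data type. Everything here is illustrative: `SchemaVersion`, `ddl_path`, and `may_publish` are hypothetical names, treating the old schema as `public` is an assumption, and the rule in `may_publish` is just one simple way to keep at most two adjacent versions in service, not necessarily what F1 or PolarDB-X does.

```python
from dataclasses import dataclass

STATES = ("absent", "delete_only", "write_only", "public")


@dataclass(frozen=True)
class SchemaVersion:
    metadata: str  # e.g. "S1" or "S2"
    state: str     # state of the element being changed


def ddl_path(old_metadata, new_metadata):
    """Versions a DDL steps through: the old schema, then the new
    schema in each intermediate state up to public."""
    new_versions = [SchemaVersion(new_metadata, s) for s in STATES[1:]]
    return [SchemaVersion(old_metadata, "public")] + new_versions


def may_publish(step, node_steps):
    """Assumed rule: step `step` of the path may be published only when
    no node is more than one step behind it."""
    return all(step - n <= 1 for n in node_steps)


path = ddl_path("S1", "S2")
```

For example, `path[1]` and `path[2]` share the metadata `S2` but are different versions, and `may_publish(2, [0, 1])` is false because one node is still two steps behind.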
CREATE INDEX
The following uses CREATE INDEX as an example of how online schema change is implemented.
Its state transitions are as follows.
Let S1 be the metadata before the index is created and S2 the metadata after the index is added:
S1 -> S2_delete_only -> S2_write_only -> S2_public
Note that three problems can arise while the index is being added:
- The index table holds more data than the main table, that is, redundant entries (cause: a row was inserted into both the index table and the main table, but later deleted from the main table only).
- The index table holds less data than the main table, yet it already serves queries (cause: queries were served before the index table had been brought in line with the main table).
- The index table holds as much data as the main table, but the contents differ (cause: an update did not modify the index table together with the main table).
As long as these three problems are avoided, the index is guaranteed to be consistent with the main table.
- S1 -> S2_delete_only: S2_delete_only has the index, but does not allow insert or select on it. An update may still touch the index, but the index table holds no data at this point, so problems 1, 2, and 3 cannot occur.
- S2_delete_only -> S2_write_only: both S2_delete_only and S2_write_only allow delete and update on the index table, and neither allows select on the index, so problems 1, 2, and 3 cannot occur.
- S2_write_only -> S2_public: both S2_write_only and S2_public allow delete and update on the index table. S2_public also allows select on the index, but by then the index is consistent with the main table, so problems 1, 2, and 3 cannot occur.
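The argument above can be checked by brute force on a simplified model. The sketch below is an assumption-laden illustration, not the paper's proof: rows are reduced to their primary keys, an update is treated as delete plus insert, the `APPLIES` table is my reading of what each state does to the index, and the index is assumed to be fully backfilled before any node may select from it.

```python
from itertools import product

# Index operations a node applies in each state (assumed simplification).
APPLIES = {
    "absent": set(),
    "delete_only": {"delete"},
    "write_only": {"delete", "insert"},
    "public": {"delete", "insert", "select"},
}
PATH = ["absent", "delete_only", "write_only", "public"]


def find_anomaly(state_a, state_b, max_len=3):
    """Return the first operation sequence, issued by nodes in the two
    given states, that leaves an orphan index entry or a readable gap.
    Return None if no sequence up to max_len does."""
    can_select = "select" in APPLIES[state_a] | APPLIES[state_b]
    steps = list(product((state_a, state_b), ("insert", "delete"), (0, 1)))
    for length in range(1, max_len + 1):
        for seq in product(steps, repeat=length):
            table = {0}
            index = {0} if can_select else set()  # backfilled before public
            for state, action, pk in seq:
                if action == "insert":
                    if pk in table:
                        continue  # duplicate key, statement rejected
                    table.add(pk)
                    if "insert" in APPLIES[state]:
                        index.add(pk)
                else:
                    table.discard(pk)
                    if "delete" in APPLIES[state]:
                        index.discard(pk)
                orphan = bool(index - table)
                readable_gap = can_select and bool(table - index)
                if orphan or readable_gap:
                    return seq
    return None


adjacent = {(a, b): find_anomaly(a, b) for a, b in zip(PATH, PATH[1:])}
skipped = find_anomaly("absent", "public")
```

Every adjacent pair in `adjacent` maps to `None`, while `skipped` holds a counterexample for the direct jump from absent to public, which is why the intermediate states are needed.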
How to make the index table consistent with the main table
When the S2_write_only stage is reached, the database writes to both the main table and the index table, and starts the backfill at the same time.
Data written in the S2_delete_only stage and earlier is called stock data; data written in the S2_write_only stage is called incremental data.
So to guarantee that the index is consistent with the main table, both the stock data and the incremental data must have been written to the index table before the state changes from S2_write_only to S2_public.
- Incremental data is, without question, written to the main table and the index table together.
- Stock data is backfilled into the index table: select from the main table, then insert into the index table.
Note that during backfill:
- The select on the main table must take a read lock. While the backfill is running, another thread may execute an update that modifies the main table and the index table together. If the corresponding row has not yet been backfilled into the index table, the update writes the new data to the index table (or leaves the index table untouched), and the later backfill then overwrites the new data with the old data (or simply inserts the old data, because without a read lock the backfill is a consistent snapshot read). Either way the index and the main table end up inconsistent.
- A select on the main table with a read lock also picks up part of the incremental data. When the backfill then inserts into the index table, it may hit a primary key conflict or a unique index conflict. In that case, check whether the conflicting row is incremental data, that is, whether the primary keys match. If they do, skip the row.
When the backfill completes, the index table agrees with the main table, and the index can be set to public and start serving queries.
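The backfill rules can be sketched as follows. This is a single-process illustration under loud assumptions: `IndexedTable` is a hypothetical class, a `threading.Lock` stands in for the database's read locks, the index table is keyed by the main table's primary key, and the `between_batches` hook exists only to interleave an incremental write at a known point.

```python
import threading


class IndexedTable:
    def __init__(self, rows):
        self.main = dict(rows)        # main table: pk -> indexed value
        self.index = {}               # index table, also keyed by pk
        self.lock = threading.Lock()  # stand-in for row/read locks

    def update(self, pk, value):
        """Incremental write in write_only: main and index together."""
        with self.lock:
            self.main[pk] = value
            self.index[pk] = value

    def backfill(self, batch_size=2, between_batches=None):
        """Copy stock data into the index table, batch by batch.
        Returns the primary keys skipped because of a conflict."""
        pks = sorted(self.main)
        skipped = []
        for start in range(0, len(pks), batch_size):
            with self.lock:  # locking read: writers wait for this batch
                for pk in pks[start:start + batch_size]:
                    if pk not in self.main:
                        continue  # row was deleted in the meantime
                    if pk in self.index:
                        # Same primary key already present: this row was
                        # written as incremental data, so skip it.
                        skipped.append(pk)
                        continue
                    self.index[pk] = self.main[pk]
            if between_batches is not None:
                between_batches(start)
        return skipped


t = IndexedTable({1: "a", 2: "b", 3: "c", 4: "d"})


def writer(start):
    # An incremental update that lands after the first batch.
    if start == 0:
        t.update(3, "C")


skipped = t.backfill(between_batches=writer)
```

Row 3 is updated before the backfill reaches it, so the backfill finds its primary key already in the index table and skips it instead of overwriting the new value with stale data.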