Comparison and summary of applicable scenarios for ClickHouse and MySQL databases
2022-04-23 13:24:00 【JavaGaga】
At the moment, there is plenty of material online claiming that ClickHouse's query response is tens to hundreds of times faster than MySQL's. In practice, ClickHouse and MySQL have different application scenarios and limitations. I recently ran into some pitfalls while applying ClickHouse to queries on a table with a large amount of data, so I am summarizing them here as a memo for future data storage and processing, and as a reference for anyone who wants to replace MySQL with ClickHouse for storing some of their data.
Use the VersionedCollapsingMergeTree engine for data that will be modified, but not frequently
ClickHouse itself is not suited to frequent modification and deletion of data; deletes and updates consume a lot of performance, especially frequent single-row modifications. That is why you often see advice to write data in batches as much as possible, whether a batch is 1,000 rows, 10,000 rows, or some other size, in order to improve ClickHouse's performance. A minimal illustration follows.
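As a hedged illustration of what batched writes look like in practice (the table name and values below are made up, not from the original article):

```sql
-- One INSERT carrying many rows (in practice 1,000-10,000 or more) creates a
-- single data part, whereas one INSERT per row creates one part per row and
-- leaves far more background merge work for ClickHouse.
INSERT INTO events (event_id, user_id, event_time) VALUES
    (1, 100, '2022-04-01 10:00:00'),
    (2, 101, '2022-04-01 10:00:01'),
    (3, 102, '2022-04-01 10:00:02');
```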
In addition, writes, updates, and deletes in ClickHouse are asynchronous. If data that has just been written, updated, or deleted needs to be queried immediately, ClickHouse is a poor fit for storage. ClickHouse also does not support transactions, so it should not be used in scenarios with strict data consistency requirements.
In some cases, the data usually does not need to be modified, but certain scenarios require modifying it once or a few times. If you still want to move to ClickHouse for fast responses, you can use the VersionedCollapsingMergeTree engine for data storage. I will only briefly introduce this storage engine here; for details, refer to the ClickHouse documentation.
The general principle of this storage engine is that each row carries a sign column and a version column: when a row is written, sign takes the value 1 or -1, and version stores the row's version number. The rules for the two scenarios are as follows (a minimal sketch follows the list):

- For a deletion, re-insert the same row with version unchanged and sign set to -1. The table then temporarily contains two rows that are duplicates except for sign. ClickHouse merges and collapses parts periodically, matching such pairs and discarding rows whose sign values add up to 0. Note that this merge and cleanup is scheduled, not immediate, so at query time you still need to group by the key and then use having sum(sign) > 0 to manually exclude deleted rows.
- For a modification, perform the same operation as above and also insert a new row with sign set to 1 and version incremented by 1 from the previous version. When querying, you likewise need to manually exclude rows belonging to older versions.
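Below is a minimal sketch of this pattern; the table name, columns, and values are hypothetical and only illustrate the sign/version mechanics described above.

```sql
-- Hypothetical table using VersionedCollapsingMergeTree
-- (table and column names are illustrative, not from the original article)
CREATE TABLE user_profile
(
    user_id  UInt64,
    nickname String,
    sign     Int8,    -- 1 = state row, -1 = cancels a previously written row
    version  UInt32   -- incremented on every modification
)
ENGINE = VersionedCollapsingMergeTree(sign, version)
ORDER BY user_id;

-- Initial writes
INSERT INTO user_profile VALUES (1, 'alice', 1, 1), (2, 'bob', 1, 1);

-- "Delete" user 1: re-insert the same row with sign = -1 and version unchanged
INSERT INTO user_profile VALUES (1, 'alice', -1, 1);

-- "Modify" user 2: cancel the old version, then insert the new one with version + 1
INSERT INTO user_profile VALUES (2, 'bob', -1, 1), (2, 'bob_new', 1, 2);

-- Collapsing only happens during background merges, so cancelled rows must be
-- excluded manually at query time
SELECT user_id, argMax(nickname, version) AS nickname
FROM user_profile
GROUP BY user_id
HAVING sum(sign) > 0;
```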
Sparse indexes are not suitable for precise queries
Before explaining why the sparse index is a poor fit for precise queries, let me first describe what I mean by a precise query:
- Querying a single row by an indexed condition such as id or user_id;
- Fetching a paginated list of data by an indexed condition such as id or user_id.
ClickHouse uses a sparse index, which is different from MySQL's B+ tree index. (I will not introduce sparse indexes and B+ tree indexes here; explaining both clearly would take a while, so if you are not familiar with them, please look them up on your own.) As a result, when you run an exact-match query, ClickHouse has to scan a large amount of data, and the actual response time falls short of the ideal.
Because of its sparse index, ClickHouse is especially well suited to group by queries: after one or more group by steps, a huge number of rows collapses into a small result set. So the most suitable scenario is statistical queries, where, on large data volumes, the response can be dozens of times faster than MySQL, and even hundreds of times faster. A sketch of the two query shapes follows.
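To make the two query shapes concrete, here is a hedged sketch; the orders table and its columns are hypothetical and only illustrate the point.

```sql
-- Point query: the sparse primary index only locates a granule
-- (8192 rows by default), so ClickHouse still reads and filters the whole
-- granule to return one row, which keeps single-row lookups relatively slow.
SELECT *
FROM orders
WHERE user_id = 123456
LIMIT 1;

-- Statistical query: only the columns involved are read, and the aggregation
-- collapses a huge number of rows into a small result set, which is the
-- scenario where ClickHouse is observed to be much faster than MySQL.
SELECT user_id,
       count()     AS order_cnt,
       sum(amount) AS total_amount
FROM orders
GROUP BY user_id
HAVING total_amount > 1000;
```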
A columnar database is not suitable for querying a large number of columns at once
The other consideration is ClickHouse's columnar storage. Building on the point above, in scenarios that require precise queries and need to return a large number of columns at once, the response time is not ideal. Even with several group by conditions, because many columns have to be scanned, ClickHouse is usually not faster than a properly indexed MySQL table. An illustration follows.
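As a hedged illustration of the columnar trade-off (using the same hypothetical orders table as above): every column referenced is a separate column file to read and decompress, so the wider the SELECT, the more I/O per returned row.

```sql
-- Wide query: nine separate column files must be read for every matching row,
-- so even a well-targeted filter gets slower as the column list grows.
SELECT order_id, user_id, status, amount, currency,
       created_at, updated_at, shipping_address, remark
FROM orders
WHERE user_id = 123456;

-- Narrow query over the same rows touches only two column files.
SELECT order_id, amount
FROM orders
WHERE user_id = 123456;
```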
ClickHouse's query performance is more stable than MySQL's
I will not go into the specific tests here, only the scenario. With both tables correctly indexed and about 2 billion rows of data, MySQL's paginated list queries were noticeably slow for the first few pages, roughly 500 ms to 800 ms, but subsequent page queries settled at roughly 50 ms to 80 ms, presumably because MySQL had warmed its caches. ClickHouse, by contrast, stayed stable at roughly 230 ms to 300 ms.
Summary
Based on current tests and observations: if you need statistical queries and the data is not modified frequently, use ClickHouse to store the data and serve the queries. If you need frequent modifications or large list queries, the better solution is to query with MySQL and split the data across multiple tables; the resulting response performance will be much better than ClickHouse's.
Copyright notice
This article was created by [JavaGaga]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/html/tUMyfu.html