当前位置:网站首页>Using ltree to process hierarchical data in PostgreSQL
Using ltree to process hierarchical data in PostgreSQL
2022-04-22 06:36:00 【PostgreSQLChina】
author : alexei · Vasiliev (Alexey Vasiliev)
translator : Lei Yanliang , Working in Hangao basic software Co., Ltd ,PostgreSQL Database technology enthusiasts ,PostgreSQL ACE、PGCM、10g &11g OCM,OGG Certified Professional .
Original address :https://leopard.in.ua/2013/09/02/postgresql-ltree#.YEhtc2gzaUk
In this paper , We will learn how to use PostgreSQL Of ltree modular , The module allows data to be stored in a hierarchical tree structure .
What is? ltree?
Ltree yes PostgreSQL modular . It implements a data type ltree, Labels used to represent data stored in a hierarchical tree structure . Provides a wide range of tools for searching tag trees .
Why choose ltree?
- ltree Realized a materialized path , about INSERT / UPDATE / DELETE It's very fast for me , And for SELECT The operation is faster
- Usually , It's better than using recursion that often needs to recalculate branches CTE Or recursive functions
- For example, the built-in query syntax and special operators for query and navigation tree
- Indexes !!!
Initial data
First , You should enable extensions in the database . You can do this with the following command :
CREATE EXTENSION ltree;
Let's create a table and add some data to it :
CREATE TABLE comments (user_id integer, description text, path ltree);
INSERT INTO comments (user_id, description, path) VALUES ( 1, md5(random()::text), '0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0001.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0001.0001.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 1, md5(random()::text), '0001.0001.0001.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 5, md5(random()::text), '0001.0001.0001.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 6, md5(random()::text), '0001.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 8, md5(random()::text), '0001.0003.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 11, md5(random()::text), '0001.0003.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 2, md5(random()::text), '0001.0003.0002.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 5, md5(random()::text), '0001.0003.0002.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 7, md5(random()::text), '0001.0003.0002.0002.0001');
INSERT INTO comments (user_id, description, path) VALUES ( 20, md5(random()::text), '0001.0003.0002.0002.0002');
INSERT INTO comments (user_id, description, path) VALUES ( 31, md5(random()::text), '0001.0003.0002.0002.0003');
INSERT INTO comments (user_id, description, path) VALUES ( 22, md5(random()::text), '0001.0003.0002.0002.0004');
INSERT INTO comments (user_id, description, path) VALUES ( 34, md5(random()::text), '0001.0003.0002.0002.0005');
INSERT INTO comments (user_id, description, path) VALUES ( 22, md5(random()::text), '0001.0003.0002.0002.0006');
in addition , We should add some indexes :
CREATE INDEX path_gist_comments_idx ON comments USING GIST(path);
CREATE INDEX path_comments_idx ON comments USING btree(path);
As you can see , I built comments Table with path Field , This field contains the tree All the way . As you can see , For the tree separator , I use 4 Two numbers and dots .
Let us in commenets Find... In the table path With ‘0001.0003’ The record of :
$ SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
user_id | path
---------+--------------------------
6 | 0001.0003
8 | 0001.0003.0001
9 | 0001.0003.0002
11 | 0001.0003.0002.0001
2 | 0001.0003.0002.0002
5 | 0001.0003.0002.0003
7 | 0001.0003.0002.0002.0001
20 | 0001.0003.0002.0002.0002
31 | 0001.0003.0002.0002.0003
22 | 0001.0003.0002.0002.0004
34 | 0001.0003.0002.0002.0005
22 | 0001.0003.0002.0002.0006
(12 rows)
Let's go through EXPLAIN Order to check this SQL:
$ EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Seq Scan on comments (cost=0.00..1.24 rows=2 width=38) (actual time=0.013..0.017 rows=12 loops=1)
Filter: (path <@ '0001.0003'::ltree)
Rows Removed by Filter: 7
Total runtime: 0.038 ms
(4 rows)
Let's disable seq scan To test :
$ SET enable_seqscan=false;
SET
$ EXPLAIN ANALYZE SELECT user_id, path FROM comments WHERE path <@ '0001.0003';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Index Scan using path_gist_comments_idx on comments (cost=0.00..8.29 rows=2 width=38) (actual time=0.023..0.034 rows=12 loops=1)
Index Cond: (path <@ '0001.0003'::ltree)
Total runtime: 0.076 ms
(3 rows)
Now? SQL slow , But I can see SQL How to use index Of .
first SQL The sentence uses sequence scan, Because there is not much data in the table .
We can select “path <@ ‘0001.0003’” Another way to do it :
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*';
user_id | path
---------+--------------------------
6 | 0001.0003
8 | 0001.0003.0001
9 | 0001.0003.0002
11 | 0001.0003.0002.0001
2 | 0001.0003.0002.0002
5 | 0001.0003.0002.0003
7 | 0001.0003.0002.0002.0001
20 | 0001.0003.0002.0002.0002
31 | 0001.0003.0002.0002.0003
22 | 0001.0003.0002.0002.0004
34 | 0001.0003.0002.0002.0005
22 | 0001.0003.0002.0002.0006
(12 rows)
You shouldn't forget the order of the data , Here's an example :
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0001');
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0002');
$ INSERT INTO comments (user_id, description, path) VALUES ( 9, md5(random()::text), '0001.0003.0001.0003');
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*';
user_id | path
---------+--------------------------
6 | 0001.0003
8 | 0001.0003.0001
9 | 0001.0003.0002
11 | 0001.0003.0002.0001
2 | 0001.0003.0002.0002
5 | 0001.0003.0002.0003
7 | 0001.0003.0002.0002.0001
20 | 0001.0003.0002.0002.0002
31 | 0001.0003.0002.0002.0003
22 | 0001.0003.0002.0002.0004
34 | 0001.0003.0002.0002.0005
22 | 0001.0003.0002.0002.0006
9 | 0001.0003.0001.0001
9 | 0001.0003.0001.0002
9 | 0001.0003.0001.0003
(15 rows)
Now sort :
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path;
user_id | path
---------+--------------------------
6 | 0001.0003
8 | 0001.0003.0001
9 | 0001.0003.0001.0001
9 | 0001.0003.0001.0002
9 | 0001.0003.0001.0003
9 | 0001.0003.0002
11 | 0001.0003.0002.0001
2 | 0001.0003.0002.0002
7 | 0001.0003.0002.0002.0001
20 | 0001.0003.0002.0002.0002
31 | 0001.0003.0002.0002.0003
22 | 0001.0003.0002.0002.0004
34 | 0001.0003.0002.0002.0005
22 | 0001.0003.0002.0002.0006
5 | 0001.0003.0002.0003
(15 rows)
Can be in lquery Add a few modifiers to the end of the non asterisk label of , To make it match better than a perfect match :
“ @”- Case-insensitive matching , for example a @ matching A
“ *”- Match any tag with that prefix , for example foo * matching foobar
“%”- Match the words that start with the underscores
$ SELECT user_id, path FROM comments WHERE path ~ '0001.*{1,2}.0001|0002.*' ORDER by path;
user_id | path
---------+--------------------------
2 | 0001.0001.0001
2 | 0001.0001.0001.0001
1 | 0001.0001.0001.0002
5 | 0001.0001.0001.0003
6 | 0001.0002.0001
8 | 0001.0003.0001
9 | 0001.0003.0001.0001
9 | 0001.0003.0001.0002
9 | 0001.0003.0001.0003
9 | 0001.0003.0002
11 | 0001.0003.0002.0001
2 | 0001.0003.0002.0002
7 | 0001.0003.0002.0002.0001
20 | 0001.0003.0002.0002.0002
31 | 0001.0003.0002.0002.0003
22 | 0001.0003.0002.0002.0004
34 | 0001.0003.0002.0002.0005
22 | 0001.0003.0002.0002.0006
5 | 0001.0003.0002.0003
(19 rows)
We have to parent ‘0001.0003’ Find all the direct childrens, See below :
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*{1}' ORDER by path;
user_id | path
---------+----------------
8 | 0001.0003.0001
9 | 0001.0003.0002
(2 rows)
by parent ‘0001.0003’ Find all childrens, See below :
$ SELECT user_id, path FROM comments WHERE path ~ '0001.0003.*' ORDER by path;
user_id | path
---------+--------------------------
6 | 0001.0003
8 | 0001.0003.0001
9 | 0001.0003.0001.0001
9 | 0001.0003.0001.0002
9 | 0001.0003.0001.0003
9 | 0001.0003.0002
11 | 0001.0003.0002.0001
2 | 0001.0003.0002.0002
7 | 0001.0003.0002.0002.0001
20 | 0001.0003.0002.0002.0002
31 | 0001.0003.0002.0002.0003
22 | 0001.0003.0002.0002.0004
34 | 0001.0003.0002.0002.0005
22 | 0001.0003.0002.0002.0006
5 | 0001.0003.0002.0003
(15 rows)
by children ‘0001.0003.0002.0002.0005’ find parent:
$ SELECT user_id, path FROM comments WHERE path = subpath('0001.0003.0002.0002.0005', 0, -1) ORDER by path;
user_id | path
---------+---------------------
2 | 0001.0003.0002.0002
(1 row)
If your path is not unique , You'll get multiple records .
summary
It can be seen that , Use ltree The path of materialization is very simple . In this paper , I didn't list ltree All possible uses of . It's not considered a full-text search issue ltxtquery. But you can do it at PostgreSQL Official documents (http://www.postgresql.org/docs/current/static/ltree.html) Found it in .
Learn more about PostgreSQL Hot news 、 news information 、 Wonderful activities , Please visit China PostgreSQL Official website :www.postgresqlchina.com
Solve more PostgreSQL Related knowledge 、 technology 、 Work problems , Please visit China PostgreSQL Official Q & a community :www.pgfans.cn
Download more PostgreSQL Related information 、 Tools 、 Plug in problems , Please visit China PostgreSQL Official download site :www.postgreshub.cn
版权声明
本文为[PostgreSQLChina]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/04/202204220551004766.html
边栏推荐
- Hit the pain point directly, Kyushu cloud 5g private network helps FAW Fusheng intelligent logistics construction
- webService接口编写并发布与webService接口的调用(一)
- SSM源碼專題
- logrus设置日志格式与输出函数名
- 出海美利坚 不可忽视的未成年人法律红线
- Evaluation of Molly x4-1u blade computing server
- PostgreSQL使用clickhousedb_fdw访问ClickHouse
- Shumei technology was honored as the top 20 of the new enterprise service track of "real 100 innovators" in the interface news
- PostgreSQL中Oid和Relfilenode的映射
- 数美科技与澎湃新闻联合发布《网络信息内容安全洞察报告》
猜你喜欢

Outstanding | Kyushu cloud was selected as an excellent case and industry panorama of the first hybrid cloud conference

一个三目表达式,引起的空指针

webService接口编写并发布与webService接口的调用(二)

Understanding distributed transactions through code: XA pattern

Measurement of power consumption parameters of Jinbei Lb1

MySQL basics 2

Jasmine x4-q and jasmine x4-1u, detailed comparison

Solve the problem of error in installing PostgreSQL under windows2012 R2

Pgbouncer最佳实践:系列四

uglifyjs压缩JS
随机推荐
数美科技CTO梁堃建议:技术+运营组合防控 “杀猪盘”
小程序调用扫描二维码功能并跳转到二维码指定的路径
数美科技未成年人保护解决方案重磅上线,开启未成年人网络护航新时代
自定义限流框架
Flink理论基础
风靡IT圈的史诗级漏洞log4j2的产生原理及复现
报错:In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column
Golang内存逃逸
区间列表的交集
webService接口编写并发布与webService接口的调用(一)
SSM source topic
OpenSSL self signed CA certificate and issuing server / client certificate
《通用数据保护条例》(GDPR)系列解读一:如何判断出海企业是否受GDPR管辖?
喜报|九州云获评“浙江省高新技术企业研究开发中心”
Official announcement | skyline officially joined the official openstack module
出海美利坚 不可忽视的未成年人法律红线
calendar.getActualMaximum(calendar.DAY_OF_MONTH)的坑点
How can enterprise risk control build four systems to achieve overall prevention and control?
Automatically add partitions for PostgreSQL tables
Shumei technology was selected into the top 20 of Chaoyang high tech high growth in 2021