当前位置：网站首页>Shardingsphere introduction and sub table usage

Shardingsphere introduction and sub table usage

2022-04-23 01:48:00 【Charge8】

One 、ShardingSphere brief introduction

1、 brief introduction

ShardingSphere Has been in 2020 year 4 month 16 Day to day Apache Top projects at the software foundation .

ShardingSphere Is a set of open source distributed database middleware solutions .

ShardingSphere Product positioning Database Plus, It aims to build the upper standard and ecosystem of heterogeneous database .

It focuses on how to make full and rational use of the computing and storage capacity of the database , Instead of implementing a new database .

ShardingSphere From the perspective of the upper layer of the database , Focus on collaboration between them more than the database itself .

It consists of ShardingSphere-JDBC、ShardingSphere-Proxy and ShardingSphere-Sidecar this 3 The products are independent of each other .

ShardingSphere-JDBC： Position as lightweight Java frame , stay Java Of JDBC Additional services provided by layer .
ShardingSphere-Proxy： Positioning as a transparent database agent , Provide the server version that encapsulates the binary protocol of the database , Used to support heterogeneous languages .
ShardingSphere-Sidecar： It is positioned as Kubernetes Cloud native database agent , With Sidecar Form agent for all access to the database .

2、 The core concept

2.1 Table concept

1） True table

Physical tables that actually exist in the database . for example ：t_order_0、t_order_1

2） Logic table

After slicing , Name of the same type of table structure （ Assembly ）. for example b_order.

Horizontally split database （ surface ） The general term of the same logic and data structure table of . for example ： The order table is divided into... According to the mantissa of the primary key 10 Zhang zhenbiao （t_order_0 To t_order_9）, Their logical table is called t_order.

3） Data nodes

After slicing , It consists of data source and data table . for example ds0.b_order1

The smallest unit of data fragmentation . Consists of data source name and data table , for example ：ds_0.t_order_0.

4） Binding table
It refers to the relationship table with consistent partition rules （ Main table 、 Sub table ）.

for example ：t_order Table and t_order_item surface , All in accordance with order_id Fragmentation , The two tables are bound to each other . There will be no Cartesian product association in multi table Association queries between bound tables , The efficiency of association query will be greatly improved .

Illustrate with examples , If SQL by ：

SELECT i.* FROM t_order o JOIN t_order_item i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);

When the binding table relationship is not configured , Let's say the partition bond order_id Numerical value 10 Route to 0 slice , Numerical value 11 Route to 1 slice , So after routing SQL Should be 4 strip , They appear as Cartesian products ：

SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);

SELECT i.* FROM t_order_0 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);

SELECT i.* FROM t_order_1 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);

SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);

After configuring the binding table relationship , The routing SQL Should be 2 strip ：

SELECT i.* FROM t_order_0 o JOIN t_order_item_0 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);

SELECT i.* FROM t_order_1 o JOIN t_order_item_1 i ON o.order_id=i.order_id WHERE o.order_id in (10, 11);

among t_order stay FROM The far left side of ,ShardingSphere It will be used as the main table of the whole binding table . All routing calculations will use only the primary table policy , that t_order_item The slice calculation of the table will use t_order Conditions . So the partition keys between binding tables should be exactly the same .

5） Broadcast table

Broadcast table refers to the table that exists in all fragment data sources , The table structure and the data in the table are exactly the same in each database .

It is suitable for scenarios where the amount of data is not large and needs to be associated with massive data tables , There's no need to slice these watches .

for example ： For example, dictionary table 、 Province information, etc , Because they don't have much data , Moreover, this kind of table may need to be associated with tables with massive data .

2.2 The concept of fragmentation

1） Patch key
Database fields for sharding , It's the database （ surface ） Key fields for horizontal split .

for example ： Take the mantissa of the order primary key in the order table and split it into modules , Then the order primary key is a fragment field . SQL If there is no fragment field , Full routing will be performed , Poor performance . In addition to support for single fragment fields ,ShardingSphere It also supports fragmentation according to multiple fields .

2） Sharding algorithm
By slicing algorithm （ShardingAlgorithm） Shard the data , Supported by =、>=、<=、>、<、BETWEEN and IN Fragmentation .
Partition algorithm needs to be implemented by application developers themselves , The flexibility that can be achieved is very high .

At present 4 A kind of slicing algorithm . Because fragmentation algorithm is closely related to business implementation , Therefore, there is no built-in slicing algorithm , But through the segmentation strategy to extract all kinds of scenes , Provide a higher level of abstraction , And provide interface for application developers to implement sharding algorithm by themselves .

Accurate segmentation algorithm ,
Corresponding PreciseShardingAlgorithm, Used to handle the use of a single key as a partition key = And IN The scene of slicing . Need to cooperate with StandardShardingStrategy Use .
Range fragmentation algorithm
Corresponding RangeShardingAlgorithm, Used to handle the use of a single key as a partition key BETWEEN AND、>、<、>=、<= The scene of slicing . Need to cooperate with StandardShardingStrategy Use .
Composite slicing algorithm
Corresponding ComplexKeysShardingAlgorithm, Used to process scenes that use multiple keys as partitioning keys to slice , The logic that contains multiple partition keys is more complex , The complexity needs to be handled by application developers themselves . Need to cooperate with ComplexShardingStrategy Use .
Hint Sharding algorithm
Corresponding HintShardingAlgorithm, For handling use Hint The scene of line segmentation . Need to cooperate with HintShardingStrategy Use .

3） Fragmentation strategy
Including the partition key and the partition algorithm , Because of the independence of the algorithm , Separate it from . What is really available for slicing is the partitioning key + Sharding algorithm , That is, fragmentation strategy .

At present 5 Seed fragmentation strategy ：

Standard fragmentation strategy , Corresponding StandardShardingStrategy.
Composite fragmentation strategy , Corresponding ComplexShardingStrategy.
Line expression fragmentation strategy , Corresponding InlineShardingStrategy.
Hint Fragmentation strategy , Corresponding HintShardingStrategy.
No fragmentation strategy , Corresponding NoneShardingStrategy.

4） Fragment strategy configuration
For fragmentation strategy, there are two dimensions: data source fragmentation strategy and table fragmentation strategy , Of two strategies API Exactly the same .

Data source fragmentation strategy

Used to configure the target data source to which data is assigned .

Table fragmentation strategy

The target table used to configure the data to be allocated , Because the table exists in the data source , Therefore, the table fragmentation strategy depends on the results of the data source fragmentation strategy .

Two 、 Simple introduction

ShardingJDBC As a whole ShardingSphere The earliest and core functional module , Its main function is data fragmentation and read-write separation .

The following is an example of a single data source sub table ShardingSphere Use .

Create database sharding_db1
Create two tables in the database course_1 and course_2
Agreed fragmentation rules ： If you add a course id It's even , Then add the data course_1 surface , If it is an odd number, add it to course_2 surface .

CREATE TABLE course_1 (
	id BIGINT(20) PRIMARY KEY,
	gmt_create timestamp NULL DEFAULT NULL,
	name VARCHAR(50) NOT NULL,
	user_id BIGINT(20) NOT NULL,
	status varchar(10) NOT NULL
);

CREATE TABLE course_2 (
	id BIGINT(20) PRIMARY KEY,
	gmt_create timestamp NULL DEFAULT NULL,
	name VARCHAR(50) NOT NULL,	
	user_id BIGINT(20) NOT NULL,
	status varchar(10) NOT NULL
);

1、 Introduce dependencies

Create a springboot project .

        <!-- shardingJDBC Core dependence  -->
        <dependency>
            <groupId>org.apache.shardingsphere</groupId>
            <artifactId>sharding-jdbc-spring-boot-starter</artifactId>
            <version>4.1.1</version>
        </dependency>
        <!--druid Connection pool -->
        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>druid</artifactId>
            <version>1.2.8</version>
        </dependency>

        <!-- mysql Connection driver  -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.23</version>
        </dependency>
        <!--mybatis-plus rely on -->
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-boot-starter</artifactId>
            <version>3.4.1</version>
        </dependency>

2、 Write code

Create entity class ,mapper. Use here MybatisPlus Relatively simple ,CRUD Basically meet our example .
We focus on the data drop of sub database and sub table .

3、application.properties The configuration file

In the project application.properties This is done in the configuration file ShardingSphere To configure .

#  Configure real data sources 
spring.shardingsphere.datasource.names=db1
#  Configure the 1 Data sources 
spring.shardingsphere.datasource.db1.type=com.alibaba.druid.pool.DruidDataSource
spring.shardingsphere.datasource.db1.driver-class-name=com.mysql.cj.jdbc.Driver
spring.shardingsphere.datasource.db1.url=jdbc:mysql://localhost:3306/sharding_db1?useUnicode=true&characterEncoding=utf8&useSSL=true&serverTimezone=GMT
spring.shardingsphere.datasource.db1.username=root
spring.shardingsphere.datasource.db1.password=123456

#  Specifies the distribution of tables 
spring.shardingsphere.sharding.tables.course.actual-data-nodes=db1.course_$->{
   1..2}
#  Specifies the primary key generation policy for the table , Snowflake algorithm 
spring.shardingsphere.sharding.tables.course.key-generator.column=id
spring.shardingsphere.sharding.tables.course.key-generator.type=SNOWFLAKE
spring.shardingsphere.sharding.tables.course.key-generator.props.worker.id=1

## Specify the fragmentation policy , Appointment id Values of even numbers are added to  course_1 surface , Add odd numbers to  course_2 surface .
# Table strategy 
spring.shardingsphere.sharding.tables.course.table-strategy.inline.sharding-column=id
spring.shardingsphere.sharding.tables.course.table-strategy.inline.algorithm-expression=course_$->{id%2+1}

#  open shardingsphere Of sql Log output .
spring.shardingsphere.props.sql.show=true

spring.main.allow-bean-definition-overriding=true

4、 Test code

1） preservation

    @Test
    public void testSave() throws InterruptedException {
    
        for (int i = 0; i < 20; i++) {
    
            Course course = new Course();
            course.setGmtCreate(new Date());
            course.setName("java");
            course.setUserId(1001L + i);
            course.setStatus("1");

            TimeUnit.MILLISECONDS.sleep(50);
            int count = courseMapper.insert(course);
            System.out.println("id ->" + course.getId());
        }

    }

The data is stored according to our table splitting rules ok,

2） Inquire about

    @Test
    public void testGetById() {
    
        QueryWrapper<Course> wrapper = new QueryWrapper<Course>();
// wrapper.eq("id",1513890338544734210L);
        wrapper.in("id",1513890338544734210L, 1513890342613209090L);
        List<Course> courses = courseMapper.selectList(wrapper);
        System.out.println(courses.size());
        courses.forEach(course -> System.out.println(course));
    }