Chapter 2: In-Memory Architecture (IM-2.2)
2022-04-23 09:58:00 【Grainger】
Continued from: Chapter 2, Oracle Database In-Memory Architecture (IM-2.1)
This article is the second (middle) part of the IM column store architecture series.
In-Memory Storage Units
The IM column store manages both data and metadata in optimized storage units, not in traditional Oracle data blocks.
Oracle Database maintains the storage units in the In-Memory Area. The following figure shows the In-Memory Area and the database processes that interact with it. The remaining sections describe the various memory components.
Figure 2-5 IM Column Store: Memory and Process Architecture
This section contains the following topics:
- In-Memory Compression Units (IMCUs): An In-Memory Compression Unit (IMCU) is a compressed, read-only storage unit that contains data for one or more columns.
- Snapshot Metadata Units (SMUs): A Snapshot Metadata Unit (SMU) contains metadata and transactional information for its associated IMCU.
- In-Memory Expression Units (IMEUs): An In-Memory Expression Unit (IMEU) is a storage container for materialized In-Memory Expressions (IM expressions) and user-defined virtual columns.
In-Memory Compression Units (IMCUs)
An In-Memory Compression Unit (IMCU) is a compressed, read-only storage unit that contains data for one or more columns.
An IMCU is analogous to a tablespace extent. An IMCU has two parts: a set of Column Compression Units (CUs), and a header that contains metadata such as the IM storage index.
This section contains the following topics:
- IMCUs and Schema Objects: The IM column store stores the data for a single object (table, partition, or materialized view) in a set of IMCUs. An IMCU stores columnar data for one and only one object.
- Column Compression Units (CUs): A Column Compression Unit (CU) is contiguous storage for a single column in an IMCU. Every IMCU has one or more CUs.
- In-Memory Storage Indexes: Every IMCU header automatically creates and manages In-Memory storage indexes (IM storage indexes) for its CUs. An IM storage index stores the minimum and maximum values for all columns within the IMCU.
IMCUs and Schema Objects
The IM column store stores the data for a single object (table, partition, or materialized view) in a set of IMCUs. An IMCU stores columnar data for one and only one object.
For an object specified as INMEMORY, every column listed in the INMEMORY clause is included in every IMCU. For example, the sh.sales table has 7 columns, as shown in Figure 2-6. The following DDL statement specifies the table as INMEMORY, which means that every IMCU for sales includes column data for all 7 columns:
ALTER TABLE sh.sales INMEMORY MEMCOMPRESS FOR QUERY LOW;
To apply the INMEMORY attribute to a subset of columns in a segment, you must specify all columns as INMEMORY in one DDL statement, and then issue a second DDL statement that specifies the NO INMEMORY attribute on the excluded columns. For example, the following statement specifies 3 columns in sh.sales as NO INMEMORY, which means that the remaining 4 columns in the table retain their INMEMORY attribute:
ALTER TABLE sh.sales INMEMORY MEMCOMPRESS FOR QUERY LOW
NO INMEMORY (promo_id, quantity_sold, amount_sold);
The following figure shows three tables from the sh schema populated in the IM column store: customers, products, and sales. In this example, each table has a different number of columns specified as INMEMORY. The IMCUs for each table include only the data for the specified columns.
Figure 2-6 Columns and IMCUs
This section contains the following topics:
- In-Memory Compression: The IM column store uses special compression formats optimized for access speed rather than storage reduction. The columnar format enables queries to execute directly against the compressed columns.
- IMCUs and Rows: Every IMCU contains all column values (including nulls) for a subset of rows in a table segment. A subset of rows is called a granule.
In-Memory Compression
The IM column store uses special compression formats optimized for access speed rather than storage reduction. The columnar format enables queries to execute directly against the compressed columns.
Compression lets scan and filter operations process a much smaller amount of data, which optimizes query performance. Oracle Database decompresses data only when it is required for the result set.
The compression applied in the IM column store is closely related to Hybrid Columnar Compression. Both technologies process column vectors. The primary difference is that the column vectors for the IM column store are optimized for SIMD vector processing, whereas the column vectors for Hybrid Columnar Compression are optimized for disk storage.
When you enable an object for population into the IM column store, specify the type of compression in the INMEMORY clause: FOR DML, FOR QUERY (LOW or HIGH), FOR CAPACITY (LOW or HIGH), or NONE.
IMCUs and Rows
Every IMCU contains all column values (including nulls) for a subset of rows in a table segment. A subset of rows is called a granule.
All IMCUs for a given segment contain approximately the same number of rows. Oracle Database determines the size of a granule automatically, depending on data type, data format, and compression type. A higher compression level results in more rows in the IMCU.
A one-to-many mapping exists between an IMCU and a set of database blocks. As shown in Example 2-2, each IMCU stores the column values for a different set of blocks.
The columns in an IMCU are not sorted. Oracle Database populates them in the order in which they are read from disk.
The number of rows in an IMCU determines the amount of space that the IMCU consumes. If the target number of rows causes an IMCU to grow beyond the amount of contiguous 1 MB extents available in the 1 MB pool, then the IMCU creates additional extents (pieces) to hold the remaining column CUs. An IMCU always allocates space in 1 MB increments.
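As a small worked example of the 1 MB allocation rule, the following Python sketch rounds an IMCU's size up to whole 1 MB extents. The function names and the sample size are assumptions made for illustration; the real allocator is internal to the database.

```python
import math

MB = 1024 * 1024  # space is handed out in 1 MB extents


def extents_needed(imcu_bytes: int) -> int:
    """Number of 1 MB extents needed to hold an IMCU of the given size."""
    # Assumption: even a tiny IMCU occupies at least one extent.
    return max(1, math.ceil(imcu_bytes / MB))


def allocated_bytes(imcu_bytes: int) -> int:
    """Space actually reserved, rounded up to the next 1 MB boundary."""
    return extents_needed(imcu_bytes) * MB


print(extents_needed(2_500_000))   # 3
print(allocated_bytes(2_500_000))  # 3145728
```

An IMCU of about 2.4 MB therefore reserves three extents, and the unused tail of the last extent is the price of allocating in fixed increments.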
Example 2-2 IMCUs and Row Subsets
In this simplified example, only the following 4 columns of the customers table have the INMEMORY attribute: cust_id, cust_first_name, cust_last_name, and cust_gender. Only 5 rows exist in the table, stored in 2 data blocks. Conceptually, the first data block stores its rows as follows:
82,Madeline,Li,F;37004,Abel,Embrey,M;1714,Hardy,Gentle,M
The second data block stores its rows as follows:
100439,Uma,Campbell,F;3047,Lucia,Downey,F
Assume that IMCU 1 stores the data for the first data block. In this case, the cust_id column values for the 3 rows in this data block are stored "vertically" within a CU as follows:
82
37004
1714
IMCU 2 stores the data for the second data block. The cust_id column values for these 2 rows are stored within a CU as follows:
100439
3047
Because the cust_id value is the first value of each row in the data block, the cust_id column is in the first position within the IMCU. Columns always occupy the same position, so Oracle Database can reconstruct the rows by reading the IMCUs for a segment.
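Example 2-2 can be mimicked with a short Python sketch. It is only a conceptual model: plain lists stand in for CUs, nothing is compressed, and the helper names build_imcu and rebuild_rows are invented for this illustration.

```python
COLUMNS = ["cust_id", "cust_first_name", "cust_last_name", "cust_gender"]

# Rows as they sit in the two data blocks of Example 2-2.
block_1 = [(82, "Madeline", "Li", "F"),
           (37004, "Abel", "Embrey", "M"),
           (1714, "Hardy", "Gentle", "M")]
block_2 = [(100439, "Uma", "Campbell", "F"),
           (3047, "Lucia", "Downey", "F")]


def build_imcu(rows):
    """Pivot row tuples into one value list (a stand-in for a CU) per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def rebuild_rows(imcu):
    """Reassemble rows by reading the same position from every CU."""
    return list(zip(*(imcu[name] for name in COLUMNS)))


imcu_1 = build_imcu(block_1)
imcu_2 = build_imcu(block_2)

print(imcu_1["cust_id"])                # [82, 37004, 1714]
print(imcu_2["cust_id"])                # [100439, 3047]
print(rebuild_rows(imcu_1) == block_1)  # True
```

Because every column keeps its rows in the same order, reading position i from each list is enough to recover row i.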
Column Compression Units (CUs)
A Column Compression Unit (CU) is contiguous storage for a single column in an IMCU. Every IMCU has one or more CUs.
This section contains the following topics:
- CU Structure: A CU is divided into a body and a header.
- Local Dictionary: In a CU, the local dictionary holds a list of distinct values and their corresponding dictionary codes.
CU Structure
A CU is divided into a body and a header.
The body of every CU stores the column values for the range of rows included in the IMCU. The header contains metadata about the values stored in the CU body, for example, the minimum and maximum values within the CU. It may also contain a local dictionary, which is a sorted list of the distinct values in that column and their corresponding dictionary codes.
The following figure shows an IMCU with 4 CUs for the sales table: prod_id, cust_id, time_id, and channel_id. Each CU stores the column values for the range of rows included in the IMCU.
Figure 2-7 CUs in an IMCU
The CUs store values in rowid order. For this reason, the database can answer queries by "stitching" the rows back together. For example, an application issues the following query:
SELECT cust_id, time_id, channel_id
FROM sales
WHERE prod_id = 5;
The database begins by scanning the prod_id column for entries with the value 5. Assume that the database finds 5 in position 2 of the prod_id column. The database must now find the corresponding cust_id, time_id, and channel_id for this row. Because the CUs store data in rowid order, the database can find the corresponding cust_id, time_id, and channel_id values in position 2 of those columns. Thus, to answer the query, the database extracts the values from position 2 of the cust_id, time_id, and channel_id columns, and then stitches the row back together to return it to the end user.
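The stitching step can be sketched in Python as a scan over one list followed by positional lookups in the others. The column values below are invented, and select_where_equals is a hypothetical helper rather than anything the database exposes. Note that Python positions are zero-based, so "position 2" in the text corresponds to index 1 here.

```python
# One IMCU for sales, modelled as four parallel lists (one per CU).
# The values are made up for illustration.
imcu = {
    "prod_id":    [12, 5, 40, 5],
    "cust_id":    [901, 902, 903, 904],
    "time_id":    ["2022-01-03", "2022-01-04", "2022-01-04", "2022-01-05"],
    "channel_id": [3, 2, 4, 9],
}


def select_where_equals(imcu, filter_col, value, projected_cols):
    """Scan one CU for matches, then read the same positions from the other CUs."""
    positions = [i for i, v in enumerate(imcu[filter_col]) if v == value]
    return [tuple(imcu[col][pos] for col in projected_cols) for pos in positions]


rows = select_where_equals(imcu, "prod_id", 5,
                           ["cust_id", "time_id", "channel_id"])
print(rows)
# [(902, '2022-01-04', 2), (904, '2022-01-05', 9)]
```

Only the prod_id list is scanned in full; the other three lists are touched only at the matching positions.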
Local Dictionary
In a CU, the local dictionary holds a list of distinct values and their corresponding dictionary codes.
The local dictionary stores the symbols contained in the column. The following figure illustrates how a CU stores the name column of a vehicles table.
Figure 2-8 Local Dictionary
In the preceding figure, the CU contains only 7 rows. Each distinct value in this CU, such as Cadillac or Audi, is assigned a different dictionary code, such as 2 for Cadillac and 0 for Audi. The CU stores the dictionary codes rather than the original values.
Note:
When the database uses a common dictionary for a join group, the local dictionary contains references to the common dictionary rather than the symbols. For example, rather than storing the values Audi, BMW, and Cadillac for the vehicles.name column, the local dictionary stores dictionary codes such as 101, 220, and 66.
The CU header contains the minimum and maximum values for the column. In this example, the minimum value is Audi and the maximum value is Cadillac. The local dictionary stores the list of distinct values: Audi, BMW, and Cadillac. Their corresponding dictionary codes (0, 1, and 2) are implicit. The local dictionary for a CU in each IMCU is independent of the local dictionaries in other IMCUs.
If a query filters on Audi automobiles, then the database scans this IMCU only for codes equal to 0.
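A minimal Python sketch of dictionary encoding follows. It assumes a non-empty column and uses the position of each symbol in the sorted distinct list as its implicit code, as the text describes. The order of the 7 sample rows is invented because the figure is not reproduced here, and build_cu and scan_equals are hypothetical names.

```python
import bisect


def build_cu(values):
    """Return (dictionary, codes, minimum, maximum) for one column's values."""
    dictionary = sorted(set(values))  # sorted distinct symbols
    # The code of a symbol is simply its index in the sorted dictionary.
    codes = [bisect.bisect_left(dictionary, v) for v in values]
    return dictionary, codes, dictionary[0], dictionary[-1]


def scan_equals(dictionary, codes, symbol):
    """Positions whose code matches the symbol; the scan compares integers only."""
    idx = bisect.bisect_left(dictionary, symbol)
    if idx == len(dictionary) or dictionary[idx] != symbol:
        return []  # symbol is absent from this CU
    return [pos for pos, code in enumerate(codes) if code == idx]


names = ["Cadillac", "BMW", "Audi", "Audi", "BMW", "Cadillac", "Audi"]
dictionary, codes, lo, hi = build_cu(names)

print(dictionary)                              # ['Audi', 'BMW', 'Cadillac']
print(codes)                                   # [2, 1, 0, 0, 1, 2, 0]
print(lo, hi)                                  # Audi Cadillac
print(scan_equals(dictionary, codes, "Audi"))  # [2, 3, 6]
```

The filter value is translated to a code once, after which the scan never touches a string.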
In-Memory Storage Indexes
Every IMCU header automatically creates and manages In-Memory storage indexes (IM storage indexes) for its CUs. An IM storage index stores the minimum and maximum values for all columns within the IMCU.
For example, sales is populated in the IM column store. Every IMCU for this table has all columns. The sales.prod_id column is stored in a separate CU within every IMCU. The IMCU header has the minimum and maximum values of each prod_id CU (and of every other CU).
To eliminate unnecessary scans, the database can perform IMCU pruning based on SQL filter predicates. The database scans only the IMCUs that can satisfy the query predicate, as shown by the WHERE prod_id > 14 AND prod_id < 29 example in the following figure.
Figure 2-9 Storage Index for Columnar Data
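The pruning decision for the prod_id > 14 AND prod_id < 29 predicate can be sketched as a range-overlap test on the stored minimum and maximum. The four value ranges below are invented for illustration, and may_contain and imcus_to_scan are hypothetical helper names.

```python
# Each IMCU is reduced to the (min, max) pair its header keeps for prod_id.
# The ranges are invented for illustration.
storage_index = {
    "IMCU 1": (1, 10),
    "IMCU 2": (11, 20),
    "IMCU 3": (21, 30),
    "IMCU 4": (31, 40),
}


def may_contain(min_val, max_val, low, high):
    """True if some value v with low < v < high could lie in [min_val, max_val]."""
    return max_val > low and min_val < high


def imcus_to_scan(index, low, high):
    """Names of the IMCUs that survive pruning for low < prod_id < high."""
    return [name for name, (mn, mx) in index.items()
            if may_contain(mn, mx, low, high)]


print(imcus_to_scan(storage_index, 14, 29))  # ['IMCU 2', 'IMCU 3']
```

A surviving IMCU is not guaranteed to hold a matching row; the index only proves that the pruned IMCUs cannot.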
Snapshot Metadata Units (SMUs)
A Snapshot Metadata Unit (SMU) contains metadata and transactional information for its associated IMCU.
This section contains the following topics:
- IMCUs and SMUs: The columnar pool of the In-Memory Area stores the actual data: IMCUs and IMEUs. The metadata pool of the In-Memory Area stores the SMUs.
- Transaction Journal: Every SMU contains a transaction journal. The database uses the transaction journal to keep the IMCU transactionally consistent.
IMCUs and SMUs
The columnar pool of the In-Memory Area stores the actual data: IMCUs and IMEUs. The metadata pool of the In-Memory Area stores the SMUs.
Figure 2-10 IMCUs and SMUs
This figure shows the IMCUs in the columnar data pool and the SMUs in the metadata pool.
Every IMCU maps to a separate SMU. Thus, if the columnar data pool contains 100 IMCUs, then the metadata pool contains 100 SMUs. The SMU stores several types of metadata for its associated IMCU, including the following:
- Object numbers
- Column numbers
- Mapping information for rows
Transaction Journal
Every SMU contains a transaction journal. The database uses the transaction journal to keep the IMCU transactionally consistent.
The database uses the buffer cache to process DML, just as it does when the IM column store is not enabled. For example, an UPDATE statement might modify a row in an IMCU. In this case, the database adds the rowid of the modified row to the transaction journal and marks it stale as of the SCN of the DML statement. If a query needs to access the new version of the row, then the database obtains the row from the database buffer cache.
Figure 2-11 Transaction Journal
The database achieves read consistency by merging the contents of the column, the transaction journal, and the buffer cache. When the IMCU is refreshed during repopulation, queries can access the up-to-date row directly from the IMCU.
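The read path described above can be sketched in Python under strong simplifying assumptions: the buffer cache is a plain dictionary holding only the latest row version, undo is not modelled, and the journal keeps one stale-as-of SCN per rowid. The Imcu class and the update and read functions are invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Imcu:
    rows: dict                                   # rowid -> value as of population
    journal: dict = field(default_factory=dict)  # rowid -> SCN at which it went stale


def update(imcu, buffer_cache, rowid, new_value, scn):
    """DML goes through the buffer cache; the IMCU only records staleness."""
    buffer_cache[rowid] = new_value
    imcu.journal.setdefault(rowid, scn)  # keep the first SCN that made the row stale


def read(imcu, buffer_cache, rowid, query_scn):
    """Use the IMCU unless the journal marks the row stale at the query's SCN."""
    stale_scn = imcu.journal.get(rowid)
    if stale_scn is not None and stale_scn <= query_scn:
        return buffer_cache[rowid]
    return imcu.rows[rowid]


imcu = Imcu(rows={"r1": 100, "r2": 200})
cache = {}
update(imcu, cache, "r2", 250, scn=1005)

print(read(imcu, cache, "r1", query_scn=1010))  # 100, row never modified
print(read(imcu, cache, "r2", query_scn=1010))  # 250, stale in the IMCU
print(read(imcu, cache, "r2", query_scn=1000))  # 200, query predates the update
```

The IMCU itself is never modified by the update, which is consistent with its description as a read-only unit.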
In-Memory Expression Units (IMEUs)
An In-Memory Expression Unit (IMEU) is a storage container for materialized In-Memory Expressions (IM expressions) and user-defined virtual columns.
The database treats materialized expressions just like other columns in the IMCU. Conceptually, an IMEU is a logical extension of its parent IMCU. Just as an IMCU can contain multiple columns, an IMEU can contain multiple virtual columns.
Every IMEU maps to exactly one IMCU and covers the same row set. The IMEU contains the expression results for the data contained in its associated IMCU. When the IMCU is populated, the associated IMEU is also populated.
A typical IM expression involves one or more columns, possibly with constants, and has a one-to-one mapping with the rows in the table. For example, an IMCU for the employees table contains rows 1-1000 of the weekly_salary column. For the rows stored in this IMCU, the IMEU calculates the automatically detected IM expression weekly_salary*52 and the user-defined virtual column quarterly_salary, defined as weekly_salary*12. The third row in the IMCU maps to the third row in the IMEU.
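The row-for-row alignment between an IMCU and its IMEU can be sketched in Python as follows. The salary values are invented, SYS_IME_EXAMPLE is a made-up name that only imitates the SYS_IME prefix, and build_imeu is a hypothetical helper.

```python
# CU for weekly_salary inside one IMCU (values invented for illustration).
imcu = {"weekly_salary": [1000, 1200, 950]}

# Expressions to materialize: name -> function of the base column value.
expressions = {
    "SYS_IME_EXAMPLE": lambda w: w * 52,   # hypothetical name for the detected IM expression
    "quarterly_salary": lambda w: w * 12,  # user-defined virtual column from the text
}


def build_imeu(imcu, expressions):
    """Evaluate each expression once per row, keeping the IMCU's row order."""
    base = imcu["weekly_salary"]
    return {name: [fn(w) for w in base] for name, fn in expressions.items()}


imeu = build_imeu(imcu, expressions)
print(imeu["SYS_IME_EXAMPLE"][2])   # 49400
print(imeu["quarterly_salary"][2])  # 11400
```

Index 2 in both structures refers to the same third row, so a query can read a precomputed result instead of evaluating the expression again.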
An IMEU is a logical extension of the IMCU of a particular segment. By default, the IMEU inherits the INMEMORY clause properties of the base segment, including Oracle Real Application Clusters (Oracle RAC) properties such as DISTRIBUTE and DUPLICATE. You can selectively enable or disable virtual columns for storage in the IMEU. You can also specify compression levels for different columns.
Expression Statistics Store (ESS)
The Expression Statistics Store (ESS) is a repository maintained by the optimizer that stores statistics about expression evaluation. The ESS resides in the SGA and also persists on disk.
When the IM column store is enabled, the database uses the ESS for its In-Memory Expressions (IM expressions) feature. However, the ESS is independent of the IM column store. The ESS is a permanent component of the database and cannot be disabled.
The database uses the ESS to determine whether an expression is "hot" (frequently accessed) and therefore a candidate for an IM expression. During a hard parse of a query, the ESS looks for active expressions in the SELECT list, WHERE clause, GROUP BY clause, and so on.
For each segment, the ESS maintains expression statistics such as:
- Frequency of execution
- Cost of evaluation
- Timestamp of evaluation
The optimizer assigns each expression a weighted score based on its cost and the number of times it was evaluated. These values are approximate rather than exact. More active expressions have higher scores. The ESS maintains an internal list of the most frequently accessed expressions.
Use the DBMS_INMEMORY_ADMIN package to control the behavior of IM expressions. For example, the IME_CAPTURE_EXPRESSIONS procedure prompts the database to identify and gradually populate the hottest expressions in the database. The IME_POPULATE_EXPRESSIONS procedure forces the database to populate the expressions immediately.
The ESS information is stored in the data dictionary and exposed in the DBA_EXPRESSION_STATISTICS view. This view shows the metadata that the optimizer has sent to the ESS. IM expressions appear in the DBA_IM_EXPRESSIONS view as system-generated virtual columns prefixed with the string SYS_IME.
In-Memory Process Architecture
In response to queries and DML, server processes scan columnar data and update SMU metadata. Background processes populate row data from disk into the IM column store.
This section contains the following topics:
- In-Memory Coordinator Process (IMCO): The In-Memory Coordinator Process (IMCO) manages many tasks for the IM column store. Its primary task is to initiate background population and repopulation of columnar data.
- Space Management Worker Processes (Wnnn): Space Management Worker Processes (Wnnn) populate or repopulate data on behalf of IMCO.
In-Memory Coordinator Process (IMCO)
The In-Memory Coordinator Process (IMCO) manages many tasks for the IM column store. Its primary task is to initiate background population and repopulation of columnar data.
Population is a streaming mechanism that converts row data into columnar format and then compresses it. IMCO automatically initiates population of INMEMORY objects that have any priority other than NONE. When objects with priority NONE are accessed, IMCO populates them using Space Management Worker Processes (Wnnn).
The IMCO background process also initiates threshold-based repopulation of IM column store objects when they meet the staleness threshold. IMCO can also initiate trickle repopulation for any IMCU in the IM column store that has stale entries but does not meet the staleness threshold.
Trickle repopulation occurs automatically in the background. The steps are as follows:
- IMCO wakes up.
- IMCO determines whether population tasks need to be performed, including whether stale entries exist in an IMCU.
- If IMCO finds stale entries, then it triggers a Space Management Worker Process to repopulate these entries in the IMCU.
- IMCO sleeps for two minutes, and then returns to the first step.
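The wake, check, repopulate, sleep cycle above can be sketched as a small Python loop. This is a toy model only: IMCUs are reduced to sets of stale rowids, the repopulate callback stands in for a Wnnn process, and the sleep function is injectable so the sketch can run without waiting two minutes. All names are assumptions made for illustration.

```python
import time


def trickle_cycle(imcus, repopulate):
    """One IMCO wake-up: find IMCUs with stale entries and hand them to a worker."""
    refreshed = []
    for name, stale_rowids in imcus.items():
        if stale_rowids:                    # stale entries exist
            repopulate(name, stale_rowids)  # stands in for a Wnnn process
            refreshed.append(name)
    return refreshed


def imco_loop(imcus, repopulate, cycles, sleep=time.sleep, interval_s=120):
    """Repeat wake -> check -> repopulate -> sleep for a fixed number of cycles."""
    for _ in range(cycles):
        trickle_cycle(imcus, repopulate)
        sleep(interval_s)                   # two minutes between wake-ups


def clear_stale(name, stale_rowids):
    """Toy worker: pretend the stale rows were refreshed."""
    stale_rowids.clear()


imcus = {"IMCU 1": set(), "IMCU 2": {"r7", "r9"}}
print(trickle_cycle(imcus, clear_stale))  # ['IMCU 2']
print(trickle_cycle(imcus, clear_stale))  # []
```

The second cycle finds nothing to do, which is the common case the two-minute sleep is designed around.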
Space Management Worker Processes (Wnnn)
Space Management Worker Processes (Wnnn) populate or repopulate data on behalf of IMCO.
During population, Wnnn processes are responsible for creating IMCUs, SMUs, and IMEUs. When creating IMEUs, the worker processes perform the following tasks:
- Identify the virtual columns for population
- Create the virtual column values
- Evaluate the values for each row, convert the data into columnar format, and compress it
- Register the objects with the space layer
- Associate the IMEUs with their corresponding IMCUs
Note:
During IMEU creation, the parent IMCUs remain available for queries.
During repopulation, the Wnnn processes create new versions of the IMCUs based on the existing IMCUs and transaction journals, while temporarily retaining the old versions. This mechanism is called double buffering.
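Double buffering can be sketched in Python as building the replacement off to the side and switching a single reference at the end. The ImcuSlot class is invented for this illustration, the journal is simplified to a set of stale rowids, and the buffer cache is a plain dictionary.

```python
class ImcuSlot:
    """Holds the IMCU version that queries currently read."""

    def __init__(self, rows):
        self.current = dict(rows)  # rowid -> value
        self.journal = set()       # rowids marked stale

    def read(self, rowid):
        return self.current[rowid]

    def repopulate(self, buffer_cache):
        # Build the new version separately; self.current stays readable meanwhile.
        new_version = dict(self.current)
        for rowid in self.journal:
            new_version[rowid] = buffer_cache[rowid]
        old_version = self.current
        self.current = new_version  # switch readers to the new version
        self.journal = set()
        return old_version          # caller drops it once no query needs it


slot = ImcuSlot({"r1": 100, "r2": 200})
slot.journal.add("r2")
old = slot.repopulate({"r2": 250})
print(slot.read("r2"), old["r2"])  # 250 200
```

Queries that started against the old version can finish on it, while new queries see the refreshed data.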
The database can quickly move IM expressions in and out of the IM column store. For example, if an IMCU was created without an IMEU, then the database can add an IMEU later without forcing the IMCU to undergo the full repopulation mechanism.
The INMEMORY_MAX_POPULATE_SERVERS initialization parameter controls the maximum number of worker processes that can be started for population. The INMEMORY_TRICKLE_REPOPULATE_SERVERS_PERCENT initialization parameter controls the maximum percentage of time that worker processes can spend on trickle repopulation.
(This chapter continues in the next article of the IM series: Chapter 2, IM Column Store Architecture (IM-2.3).)
Shandong Oracle User Group (SDOUG) is a young, energetic non-profit organization that aims to provide an exchange platform for technology enthusiasts in Jinan and the surrounding area. SDOUG organizes offline technology-sharing events from time to time to promote local IT development and to help enthusiasts improve their skills. Share technology, share happiness: SDOUG is on the way.
Copyright notice
This article was written by [Grainger]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/04/202204230952210937.html