
ES introductory learning notes

2022-04-23 13:44:00 【0oIronhide】

Introduction:

ES (Elasticsearch) is a distributed, document-oriented non-relational database (a document is similar to a single record in a relational database). By default every field of a document is indexed, so the data in every field can be searched. ES scales horizontally to hundreds of servers and can store and process petabytes of data. It exposes a RESTful API, and clients interact with ES through that API.

Use cases:

ES can serve as the database of a blog system: articles and other content are stored in ES and can be retrieved quickly by article content.
When a relational database holds so much data that queries become slow, the bulk of the data can be imported into ES and queried there.
Large-scale logging: Logstash collects the logs, ES stores, searches and analyzes the massive volume of events, and Kibana visualizes the results. This is the ELK stack: Elasticsearch + Logstash + Kibana.

ES official guide

Quick start project

On Windows, download ES and Kibana. Kibana is an open-source analytics and visualization platform, and its Dev Tools console can be used to perform CRUD operations on ES data. Kibana releases track ES releases, so it is best to use the same version of both. Download links: ES download, Kibana download

After downloading, unzip each archive and, from cmd, run the .bat file in its bin directory. ES listens on port 9200 by default and Kibana on port 5601; open localhost:5601 to reach Kibana.

Concept

Index: an index is similar to a database in a traditional relational database. It is where related documents are stored.

Type: similar to a table in a traditional relational database. In older ES versions an index could contain multiple types. Within one index, fields with the same name in different mapping types are backed by the same Lucene field, and when the types in an index share few or no fields the data becomes sparse, which hurts ES query efficiency. For this reason, since ES 6.0.0 an index can contain only one type, in 7.0.0 types are deprecated, and in 8.0.0 they are removed completely. The default type in 7.x is _doc.

Document: equivalent to a single record (row) in a relational database.

Field: equivalent to a column in a relational database.

Inverted index:

A relational database speeds up data retrieval by adding an index, for example a B-tree index, to specific columns. Elasticsearch and Lucene use a structure called an inverted index for the same purpose. In ES, every field of a document gets an inverted index by default, and a field without an inverted index cannot be searched.

Why are ES lookups fast?

Steps for building an inverted index:

First, extract all the keywords contained in the documents.
Then, store the mapping from each keyword to the documents that contain it.
Finally, index and sort the keywords themselves.

When a user searches for a keyword, ES first looks the keyword up in the keyword index and then follows the keyword-to-document mapping to find the documents.

Take the following three documents as an example:

id	age	sex	name
1	18	female	Peking University
2	20	male	Hebei University Youth
3	18	male	Patriotic youth

Documents containing each keyword:

No.	keyword	Document ids
1	Peking University	1,2
2	hebei	2
3	university	2
4	youth	2,3
5	patriotic	3
6	18	1,3
7	20	2
8	male	2,3
9	female	1

Searching by keyword then retrieves the matching documents directly.
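The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Lucene is implemented: `tokenize` is a naive whitespace analyzer, and all fields are merged into one index as in the table above, whereas ES keeps a separate inverted index per field. Because of the naive tokenizer, the keywords it produces differ slightly from the table.

```python
from collections import defaultdict

# The three example documents from the table above.
documents = {
    1: {"age": 18, "sex": "female", "name": "Peking University"},
    2: {"age": 20, "sex": "male", "name": "Hebei University Youth"},
    3: {"age": 18, "sex": "male", "name": "Patriotic youth"},
}


def tokenize(value):
    """Naive analyzer (an assumption): lowercase and split on whitespace."""
    return str(value).lower().split()


def build_inverted_index(docs):
    """Map every keyword to the sorted ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, fields in docs.items():
        for value in fields.values():          # step 1: extract keywords
            for term in tokenize(value):
                postings[term].add(doc_id)     # step 2: keyword -> documents
    # step 3: keep the keywords themselves in sorted order
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def search(index, keyword):
    """Look the keyword up and return the ids of the matching documents."""
    return index.get(keyword.lower(), [])


inverted_index = build_inverted_index(documents)
print(search(inverted_index, "youth"))  # [2, 3]
print(search(inverted_index, "18"))     # [1, 3]
```

A lookup touches only the entry for the keyword, so its cost does not depend on scanning every document.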

REST API

Method	URL	Description
PUT	localhost:9200/<index name>/<type name>/<document id>	Create a document (with a specified document id)
POST	localhost:9200/<index name>/<type name>	Create a document (with a randomly generated document id)
POST	localhost:9200/<index name>/<type name>/<document id>/_update	Update a document
DELETE	localhost:9200/<index name>/<type name>/<document id>	Delete a document
GET	localhost:9200/<index name>/<type name>/<document id>	Get a document by its document id
POST	localhost:9200/<index name>/<type name>/_search	Query all documents
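As a rough illustration of how the endpoints in the table map onto HTTP calls, the sketch below builds the requests with Python's standard library without sending them. The index name `blog`, the document fields and the helper `build_request` are hypothetical; `_doc` is the default type mentioned earlier.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # default ES port from the notes above


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for one REST endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(
        f"{BASE_URL}/{path}", data=data, headers=headers, method=method
    )


# Hypothetical index "blog" with the default type "_doc".
create = build_request("PUT", "blog/_doc/1", {"title": "hello"})
update = build_request("POST", "blog/_doc/1/_update", {"doc": {"title": "hello es"}})
get_one = build_request("GET", "blog/_doc/1")
delete = build_request("DELETE", "blog/_doc/1")
search_all = build_request("POST", "blog/_doc/_search", {"query": {"match_all": {}}})

for req in (create, update, get_one, delete, search_all):
    print(req.get_method(), req.full_url)

# To send a request to a running node, pass it to urlopen:
# with urllib.request.urlopen(create) as resp:
#     print(json.load(resp))
```

The URLs follow the typed form used in the table. In 7.x, where types are deprecated, the typeless form of the update endpoint is POST /<index name>/_update/<document id>.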

Main field data types

String types: text, keyword
Numeric types: long, integer, short, byte, double, float, half_float, scaled_float
Date: date
Date with nanosecond precision: date_nanos
Boolean: boolean
Binary: binary
Range: integer_range, float_range, long_range, double_range, date_range

Shards

Shards are divided into primary shards and replica shards.

Sharding is similar to splitting databases and tables in MySQL. When creating an index you can set how many primary shards and replica shards it has. By default an index was assigned 5 primary shards in versions before 7.0 (since 7.0 the default is 1), and each primary shard has one replica shard. Each document is stored in exactly one primary shard; when a document is inserted, its _id (unique id) determines which primary shard it is stored in.

A replica shard is just a copy of a primary shard. It protects against data loss caused by hardware failure and can also serve read requests, such as searches or document retrievals. Because both primary and replica shards can handle read requests, the more copies of the data there are, the more search throughput the cluster can handle.
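The routing rule described above, where the document's _id decides the primary shard, can be sketched as a hash of the id taken modulo the number of primary shards. This is a minimal sketch: `route_to_shard` is a hypothetical helper, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch uses internally, so the shard numbers printed here will not match a real cluster.

```python
import zlib


def route_to_shard(doc_id: str, number_of_primary_shards: int) -> int:
    """Pick a primary shard as hash(_id) % number_of_primary_shards.

    zlib.crc32 is a stand-in hash (an assumption), not the one ES uses.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


# Two auto-generated ids from the examples in these notes, plus three short ids.
ids = ["vhqYlX4BFS2tMwa2XXMK", "wRqYlX4BFS2tMwa2dXO9", "1", "2", "3"]
for doc_id in ids:
    print(doc_id, "-> shard", route_to_shard(doc_id, 5))
```

The same id always maps to the same shard, which is how a later lookup by id finds the document. It also shows why the number of primary shards is fixed when the index is created: changing the divisor would send existing ids to different shards.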

ES sharding in depth

Examples

A more comprehensive tutorial on querying ES data with Kibana

Aggregation queries

// Create a goods index without specifying a type; the default type is used.
// "mappings" defines the field names and data types of the index, similar to a table schema in MySQL.
// goodsName is mapped as text for full-text search, with a keyword sub-field (goodsName.keyword) for exact keyword search.
// keyword values are not analyzed: each value is indexed as a single term, which suits exact matching, sorting and aggregations.
// ignore_above: values longer than 256 characters are not indexed in the keyword sub-field.
PUT goods
{
  "mappings": {
    "properties": {
      "goodsId": {
        "type": "integer"
      },
      "goodsName": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "createTime": {
        "type": "date"
      }
    }
  }
}
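// Insert a product document without specifying an id (ES generates one).
// The query results below also contain a second document (goodsId 2), inserted the same way.
POST goods/_doc
{
  "goodsId": 1,
  "goodsName": " goods 1",
  "createTime": "1643178793939"
}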

// Query all documents
GET goods/_search
// Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2, // two documents in total
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "goods",
        "_type" : "_doc",
        "_id" : "vhqYlX4BFS2tMwa2XXMK",
        "_score" : 1.0,
        "_source" : {
          "goodsId" : 1,
          "goodsName" : " goods 1",
          "createTime" : "1643178793939"
        }
      },
      {
        "_index" : "goods",
        "_type" : "_doc",
        "_id" : "wRqYlX4BFS2tMwa2dXO9",
        "_score" : 1.0,
        "_source" : {
          "goodsId" : 2,
          "goodsName" : " goods 2",
          "createTime" : "1643178793939"
        }
      }
    ]
  }
}

// Query by goodsId
GET goods/_search
{
  "query": {
    "match": {
      "goodsId": 1
    }
  }
}
// Result:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "goods",
        "_type" : "_doc",
        "_id" : "vhqYlX4BFS2tMwa2XXMK",
        "_score" : 1.0,
        "_source" : {
          "goodsId" : 1,
          "goodsName" : " goods 1",
          "createTime" : "1643178793939"
        }
      }
    ]
  }
}
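
The goods mapping earlier in this example gives goodsName both a text field and a keyword sub-field. The difference between the two can be sketched roughly as follows. Both functions are hypothetical stand-ins, not ES code: `index_as_text` approximates an analyzer with a simple regular expression, and the sample value "Goods 1 Red Shirt" is made up for illustration.

```python
import re


def index_as_text(value):
    """Rough stand-in for analyzing a text field: lowercase word tokens."""
    return re.findall(r"\w+", value.lower())


def index_as_keyword(value, ignore_above=256):
    """A keyword field keeps the whole value as one term, unless it is too long."""
    return [value] if len(value) <= ignore_above else []


name = "Goods 1 Red Shirt"  # hypothetical product name
text_terms = index_as_text(name)
keyword_terms = index_as_keyword(name)

print(text_terms)     # ['goods', '1', 'red', 'shirt']
print(keyword_terms)  # ['Goods 1 Red Shirt']
print("red" in text_terms, "red" in keyword_terms)  # True False
```

A full-text query on goodsName can therefore match a single word, while an exact-match query on goodsName.keyword only matches the complete original string.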