Elasticsearch Series, Part 1 (What Is Elasticsearch)

    xiaoxiao2021-03-26  49

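    Contents

    Installing and starting ES

    Index operations

    Quickly listing the indices in a cluster

    Basic index operations

    Creating an index:

    Deleting an index:

    CRUD operations in ES

    (1) Adding a product: create a document and index it

    (2) Querying a product: retrieve a document

    (3) Modifying a product: replace a document

    Modifying a product: update a document

    Deleting a product: delete a document

    Search methods

    1. query string search

    2. query DSL

    3. query filter

    4. full-text search

    5. phrase search

    6. highlight search (highlighting search results)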


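    The relationship between Lucene and Elasticsearch

    Lucene is among the most advanced and powerful search libraries, but developing directly on top of it is very complex: the API is complicated (even simple features take a lot of Java code), and you need a deep understanding of its internals (the various index structures).

    Elasticsearch is built on Lucene. It hides that complexity and offers an easy-to-use RESTful API and a Java API (plus APIs for other languages). It is:
    (1) a distributed document store
    (2) a distributed search and analytics engine
    (3) distributed, supporting PB-scale data

    It works out of the box with good default settings, needs no extra configuration to get started, and is fully open source.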

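    Core concepts of Elasticsearch

    (1) Near Realtime (NRT): this has two meanings. There is a small delay (about 1 second) between writing data and that data becoming searchable, and searches and analytics on ES return within seconds.

    (2) Cluster: a cluster contains multiple nodes. Which cluster a node belongs to is determined by a configuration setting (the cluster name, "elasticsearch" by default). For small and medium applications, starting with a single-node cluster is perfectly normal.

    (3) Node: one node in the cluster. A node also has a name (randomly assigned by default), and the name matters when you carry out operations and maintenance tasks. By default a node joins the cluster named "elasticsearch", so if you simply start several nodes they automatically form one elasticsearch cluster. A single node can also form a cluster on its own.

    (4) Document & field: a document is the smallest unit of data in ES. A document can be a customer record, a product category record, or an order record, and is usually represented as JSON. Each type within an index can store many documents. A document has multiple fields, and each field is one data field.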

    product document

    {
      "product_id": "1",
      "product_name": "Colgate toothpaste",
      "product_desc": "effective whitening",
      "category_id": "2",
      "category_name": "personal care"
    }

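    (5) Index: an index holds a collection of documents with a similar structure, for example a customer index, a product category index, or an order index. An index has a name. One index contains many documents and represents one class of similar or identical documents. A product index, for example, might hold all product data, that is, all product documents.

    (6) Type: each index can have one or more types. A type is a logical category of data within an index, and the documents under one type all have the same fields. A blog system with a single index, for example, could define a user type, a blog post type, and a comment type. Note that mapping types were deprecated in Elasticsearch 6.x and removed in later versions; the examples in this article use a version that still supports them.

    The product index holds all product data, that is, the product documents.

    But products come in many kinds, and the fields of each kind's documents may differ. Appliances may carry a special field such as the after-sales service period; fresh food may carry a special field such as the shelf life.

    Types: personal care type, appliance type, fresh food type

    Personal care type: product_id, product_name, product_desc, category_id, category_name
    Appliance type: product_id, product_name, product_desc, category_id, category_name, service_period
    Fresh food type: product_id, product_name, product_desc, category_id, category_name, eat_period

    Each type contains a collection of documents.

    {
      "product_id": "2",
      "product_name": "Changhong TV",
      "product_desc": "4K HD",
      "category_id": "3",
      "category_name": "appliances",
      "service_period": "1 year"
    }

    {
      "product_id": "3",
      "product_name": "jiwei shrimp",
      "product_desc": "all natural, from Iceland",
      "category_id": "4",
      "category_name": "fresh food",
      "eat_period": "7 days"
    }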

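    (7) Shard: a single machine cannot store a large amount of data, so ES can split the data of an index into multiple shards stored across multiple servers. Shards make horizontal scaling possible: you can store more data, and search and analytics operations are spread across several servers, which improves throughput and performance. Each shard is a Lucene index.

    (8) Replica: any server can fail or go down at any time, and a shard on it could then be lost, so you can create multiple replicas of each shard. A replica serves as a standby when a shard fails, so no data is lost, and multiple replicas also improve search throughput and performance. The number of primary shards is set once when the index is created and cannot be changed afterwards (default 5). The number of replicas can be changed at any time (default 1 per primary). By default, then, each index has 10 shards: 5 primary shards and 5 replica shards. The smallest highly available setup is 2 servers. These defaults are those of the Elasticsearch version used here; since 7.0 the default is 1 primary shard.

    The figure below illustrates shards and replicas.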
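The shard arithmetic above, and the reason the primary shard count is fixed at creation, can be sketched in a few lines of Python. This is a simplified illustration, not Elasticsearch's implementation: it assumes a document is routed to `hash(id) % number_of_primary_shards`, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch really uses. The function names `total_shards` and `pick_shard` are hypothetical.

```python
import zlib


def total_shards(primaries: int, replicas_per_primary: int) -> int:
    """Primary shards plus all of their replica copies."""
    return primaries * (1 + replicas_per_primary)


def pick_shard(doc_id: str, primaries: int) -> int:
    """Simplified routing: hash the document id, then take it modulo the
    number of primary shards. crc32 is only a stand-in hash (assumption)."""
    return zlib.crc32(doc_id.encode("utf-8")) % primaries


if __name__ == "__main__":
    print(total_shards(5, 1))  # the default layout described above
    for doc_id in ("1", "2", "3"):
        # The same id can land on a different shard once the count changes.
        print(doc_id, pick_shard(doc_id, 5), pick_shard(doc_id, 6))
```

Because the shard number depends on the primary count, changing that count would send lookups for existing documents to the wrong shard, which is why it has to stay fixed while the replica count can vary freely.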

    -------------------------------------------------------------------------------------------------------------------------

    Elasticsearch core concepts vs. database core concepts

    Elasticsearch    Database
    Document         Row
    Type             Table
    Index            Database

     

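    Installing and starting ES

    On Mac, start Elasticsearch with: sh ./bin/elasticsearch

    Once it is up, this URL returns the node's basic information:

    http://localhost:9200/?pretty

    The Elasticsearch and Kibana versions must match; otherwise you will get errors.

    Start Kibana on Mac, then open:

    http://localhost:5601

    In Dev Tools you can send commands to ES from a console.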

    Index operations

    Quickly listing the indices in a cluster

    GET /_cat/indices?v

    health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   .kibana rUm9n9wMRQCCrRDEhqneBg   1   1          1            0      3.1kb          3.1kb
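If you want to work with this listing in a script, the table is easy to turn into records. The sketch below parses the text returned with `?v` (a header line followed by one line per index). It assumes every column is populated on every row, as in the output shown; the helper name `parse_cat_indices` is hypothetical.

```python
def parse_cat_indices(text: str) -> list[dict[str, str]]:
    """Turn the whitespace-aligned table into one dict per index."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


SAMPLE = """\
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana rUm9n9wMRQCCrRDEhqneBg   1   1          1            0      3.1kb          3.1kb
"""

rows = parse_cat_indices(SAMPLE)
for row in rows:
    print(row["index"], row["health"], row["pri"], row["rep"])
```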

    Basic index operations

    Creating an index:

    PUT /test_index?pretty

    health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   test_index XmS9DTAtSkSZSwWhhGEKkQ   5   1          0            0       650b           650b
    yellow open   .kibana    rUm9n9wMRQCCrRDEhqneBg   1   1          1            0      3.1kb          3.1kb

    Deleting an index:

    DELETE /test_index?pretty

    health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    yellow open   .kibana rUm9n9wMRQCCrRDEhqneBg   1   1          1            0      3.1kb          3.1kb

    ----------------------------------------------------------------------------------------------------------------------------

    CRUD operations in ES

    (1) Adding a product: create a document and index it

    PUT /index/type/id
    {
        "JSON data"
    }

    PUT /ecommerce/product/1
    {
        "name" : "gaolujie yagao",
        "desc" : "gaoxiao meibai",
        "price" : 30,
        "producer" : "gaolujie producer",
        "tags": [ "meibai", "fangzhu" ]
    }

    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": "1",
      "_version": 1,
      "result": "created",
      "_shards": { "total": 2, "successful": 1, "failed": 0 },
      "created": true
    }

    PUT /ecommerce/product/2
    {
        "name" : "jiajieshi yagao",
        "desc" : "youxiao fangzhu",
        "price" : 25,
        "producer" : "jiajieshi producer",
        "tags": [ "fangzhu" ]
    }

    PUT /ecommerce/product/3
    {
        "name" : "zhonghua yagao",
        "desc" : "caoben zhiwu",
        "price" : 40,
        "producer" : "zhonghua producer",
        "tags": [ "qingxin" ]
    }

    ES creates the index and the type automatically, so they do not need to be created in advance. By default ES also builds an inverted index on every field of a document so that the field can be searched.
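The same PUT can be issued outside Kibana. The following is a minimal sketch using only the Python standard library; it assumes a node listening on `http://localhost:9200` and an Elasticsearch version that still accepts the `/index/type/id` path used in this article. The helpers `build_index_request` and `send` are hypothetical names, and the request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local single-node ES


def build_index_request(index: str, doc_type: str, doc_id: str,
                        doc: dict) -> urllib.request.Request:
    """Build (but do not send) the PUT /index/type/id request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/{doc_type}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


req = build_index_request("ecommerce", "product", "1", {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": ["meibai", "fangzhu"],
})
print(req.get_method(), req.full_url)
```

With a node running, `send(req)` would return the JSON response as a dictionary.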

    (2) Querying a product: retrieve a document

    GET /index/type/id
    GET /ecommerce/product/1

    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": "1",
      "_version": 1,
      "found": true,
      "_source": {
        "name": "gaolujie yagao",
        "desc": "gaoxiao meibai",
        "price": 30,
        "producer": "gaolujie producer",
        "tags": [ "meibai", "fangzhu" ]
      }
    }

    (3) Modifying a product: replace a document

    PUT /ecommerce/product/1
    {
        "name" : "jiaqiangban gaolujie yagao",
        "desc" : "gaoxiao meibai",
        "price" : 30,
        "producer" : "gaolujie producer",
        "tags": [ "meibai", "fangzhu" ]
    }

    Response to the original creation, for comparison:

    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": "1",
      "_version": 1,
      "result": "created",
      "_shards": { "total": 2, "successful": 1, "failed": 0 },
      "created": true
    }

    Response to the replacement:

    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": "1",
      "_version": 2,
      "result": "updated",
      "_shards": { "total": 2, "successful": 1, "failed": 0 },
      "created": false
    }

    The replacement approach has one drawback: you must send every field in order to change any of them.

    The following operation leaves the document with only the name field:

    PUT /ecommerce/product/1
    {
        "name" : "jiaqiangban gaolujie yagao"
    }

    Modifying a product: update a document

    POST /ecommerce/product/1/_update
    {
      "doc": {
        "name": "jiaqiangban gaolujie yagao"
      }
    }

    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": "1",
      "_version": 8,
      "result": "updated",
      "_shards": { "total": 2, "successful": 1, "failed": 0 }
    }
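The difference between replacing and updating can be modelled with plain dictionaries. This is a sketch of the semantics only, not of how Elasticsearch stores documents: the merge here is shallow, whereas a real partial update also merges nested objects. The function names are hypothetical.

```python
def replace_document(stored: dict, new_doc: dict) -> dict:
    """PUT semantics: the new body becomes the whole document."""
    return dict(new_doc)


def partial_update(stored: dict, partial_doc: dict) -> dict:
    """_update with "doc": merge the given fields into the stored document."""
    merged = dict(stored)
    merged.update(partial_doc)
    return merged


stored = {
    "name": "gaolujie yagao",
    "desc": "gaoxiao meibai",
    "price": 30,
    "producer": "gaolujie producer",
    "tags": ["meibai", "fangzhu"],
}
change = {"name": "jiaqiangban gaolujie yagao"}

replaced = replace_document(stored, change)
updated = partial_update(stored, change)
print(sorted(replaced))  # only the field that was sent survives
print(sorted(updated))   # all original fields are still present
```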

    Deleting a product: delete a document

    DELETE /ecommerce/product/1

    {
      "found": true,
      "_index": "ecommerce",
      "_type": "product",
      "_id": "1",
      "_version": 9,
      "result": "deleted",
      "_shards": { "total": 2, "successful": 1, "failed": 0 }
    }

    Retrieving the document afterwards returns:

    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": "1",
      "found": false
    }

     

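    Search methods

    1. query string search

    Search all products: GET /ecommerce/product/_search

    took: how many milliseconds the search took
    timed_out: whether the search timed out (here it did not)
    _shards: the data is split into 5 shards, so the search request goes to every primary shard (or to one of its replica shards instead)
    hits.total: the number of results, here 3 documents
    hits.max_score: a score measures how relevant a document is to the search; the more relevant the document, the higher its score
    hits.hits: the detailed data of the documents that matched the search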

    {
      "took": 2,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": {
        "total": 3,
        "max_score": 1,
        "hits": [
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "2",
            "_score": 1,
            "_source": {
              "name": "jiajieshi yagao",
              "desc": "youxiao fangzhu",
              "price": 25,
              "producer": "jiajieshi producer",
              "tags": [ "fangzhu" ]
            }
          },
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "1",
            "_score": 1,
            "_source": {
              "name": "gaolujie yagao",
              "desc": "gaoxiao meibai",
              "price": 30,
              "producer": "gaolujie producer",
              "tags": [ "meibai", "fangzhu" ]
            }
          },
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "3",
            "_score": 1,
            "_source": {
              "name": "zhonghua yagao",
              "desc": "caoben zhiwu",
              "price": 40,
              "producer": "zhonghua producer",
              "tags": [ "qingxin" ]
            }
          }
        ]
      }
    }
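To show how the fields described above are read in practice, here is a small Python sketch that pulls them out of a response. The response literal is a trimmed copy of the one above, and the code assumes `hits.total` is a plain number, as it is in this version's output. `summarize` is a hypothetical helper name.

```python
import json

RESPONSE = json.loads("""
{
  "took": 2,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {"_id": "2", "_score": 1, "_source": {"name": "jiajieshi yagao", "price": 25}},
      {"_id": "1", "_score": 1, "_source": {"name": "gaolujie yagao", "price": 30}},
      {"_id": "3", "_score": 1, "_source": {"name": "zhonghua yagao", "price": 40}}
    ]
  }
}
""")


def summarize(response: dict) -> dict:
    """Collect the fields a caller usually needs from a search response."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "shards_ok": response["_shards"]["failed"] == 0,
        "total": hits["total"],
        "ids": [hit["_id"] for hit in hits["hits"]],
        "names": [hit["_source"]["name"] for hit in hits["hits"]],
    }


summary = summarize(RESPONSE)
print(summary)
```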

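    The name query string search comes from the fact that the search parameters are carried in the query string of the HTTP request.

    Search for products whose name contains yagao, sorted by price in descending order: GET /ecommerce/product/_search?q=name:yagao&sort=price:desc

    This suits ad hoc use from the command line with a tool such as curl, to fire off a quick request and retrieve the information you want. Complex queries, however, are hard to build this way, and query string search is rarely used in production.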
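Building such a URL by hand gets error-prone once values contain spaces or special characters. The sketch below assembles the query string with Python's standard `urllib.parse.urlencode`; `search_path` is a hypothetical helper, and only the `q` and `sort` parameters from the example above are used.

```python
from urllib.parse import urlencode


def search_path(index: str, doc_type: str, **params: str) -> str:
    """Append search parameters to the _search path as an HTTP query string."""
    path = f"/{index}/{doc_type}/_search"
    if not params:
        return path
    # safe=":" keeps the field:value colons readable instead of %3A.
    return f"{path}?{urlencode(params, safe=':')}"


url = search_path("ecommerce", "product", q="name:yagao", sort="price:desc")
print(url)
```

Values with spaces are encoded automatically, which is where hand-written query strings usually go wrong.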

    ---------------------------------------------------------------------------------------------------------------------------------

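    2. query DSL

    DSL: Domain Specific Language. The query goes in the HTTP request body and is written as JSON, which makes it convenient to build all kinds of complex queries and far more powerful than query string search.

    Query all products

    GET /ecommerce/product/_search
    {
      "query": { "match_all": {} }
    }

    Query products whose name contains yagao, sorted by price in descending order

    GET /ecommerce/product/_search
    {
        "query" : {
            "match" : {
                "name" : "yagao"
            }
        },
        "sort": [
            { "price": "desc" }
        ]
    }

    Paging through products: there are 3 products in total. Assuming each page shows 1 product and we want page 2, the query returns the second product.

    GET /ecommerce/product/_search
    {
      "query": { "match_all": {} },
      "from": 1,
      "size": 1
    }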
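The relation between a page number and `from` is simple arithmetic: skip the hits of all earlier pages. A minimal Python sketch, assuming 1-based page numbers; `page_query` is a hypothetical helper that only builds the request body.

```python
def page_query(page: int, size: int) -> dict:
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be at least 1")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # number of hits to skip
        "size": size,
    }


body = page_query(page=2, size=1)
print(body["from"], body["size"])
```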

    Return only the name and price of each product

    GET /ecommerce/product/_search
    {
      "query": { "match_all": {} },
      "_source": ["name", "price"]
    }

    The query DSL is better suited to production use, because it can express complex queries.

    ---------------------------------------------------------------------------------------------------------------------------------

    3. query filter

    Search for products whose name contains yagao and whose price is greater than 25

    GET /ecommerce/product/_search
    {
        "query" : {
            "bool" : {
                "must" : {
                    "match" : {
                        "name" : "yagao"
                    }
                },
                "filter" : {
                    "range" : {
                        "price" : { "gt" : 25 }
                    }
                }
            }
        }
    }
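The query above combines two conditions: the `must` clause selects documents by name, and the `filter` clause then keeps only those in the price range without contributing to the relevance score. The Python sketch below imitates that selection on the three products indexed earlier. It is a simulation under simplifying assumptions (names are split on whitespace, no scoring), and `search` is a hypothetical function.

```python
PRODUCTS = {
    "1": {"name": "gaolujie yagao", "price": 30},
    "2": {"name": "jiajieshi yagao", "price": 25},
    "3": {"name": "zhonghua yagao", "price": 40},
}


def search(products: dict, name_term: str, min_price_exclusive: float) -> list[str]:
    """Return ids whose name contains the term (must) and whose price is
    strictly greater than the bound (filter with "gt")."""
    return [
        doc_id
        for doc_id, doc in products.items()
        if name_term in doc["name"].split() and doc["price"] > min_price_exclusive
    ]


matches = search(PRODUCTS, "yagao", 25)
print(matches)
```

Product 2 costs exactly 25, so `gt` excludes it; `gte` would have kept it.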

    ---------------------------------------------------------------------------------------------------------------------------------

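    4. full-text search

    GET /ecommerce/product/_search
    {
        "query" : {
            "match" : {
                "producer" : "yagao producer"
            }
        }
    }

    The producer field is split into terms when a document is indexed, and an inverted index is built from them: each term, followed by the ids of the documents that contain it. The index below includes a fourth product, with the producer "special yagao producer", which appears in the response that follows.

    special     4
    yagao       4
    producer    1,2,3,4
    gaolujie    1
    zhonghua    3
    jiajieshi   2

    The search string is split the same way: yagao producer ---> yagao and producer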

    {
      "took": 4,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": {
        "total": 4,
        "max_score": 0.70293105,
        "hits": [
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "4",
            "_score": 0.70293105,
            "_source": {
              "name": "special yagao",
              "desc": "special meibai",
              "price": 50,
              "producer": "special yagao producer",
              "tags": [ "meibai" ]
            }
          },
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "1",
            "_score": 0.25811607,
            "_source": {
              "name": "gaolujie yagao",
              "desc": "gaoxiao meibai",
              "price": 30,
              "producer": "gaolujie producer",
              "tags": [ "meibai", "fangzhu" ]
            }
          },
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "3",
            "_score": 0.25811607,
            "_source": {
              "name": "zhonghua yagao",
              "desc": "caoben zhiwu",
              "price": 40,
              "producer": "zhonghua producer",
              "tags": [ "qingxin" ]
            }
          },
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "2",
            "_score": 0.1805489,
            "_source": {
              "name": "jiajieshi yagao",
              "desc": "youxiao fangzhu",
              "price": 25,
              "producer": "jiajieshi producer",
              "tags": [ "fangzhu" ]
            }
          }
        ]
      }
    }
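The inverted index for the producer field can be reproduced in a few lines of Python, which also shows why all four documents match. This is a sketch under simplifying assumptions: terms are produced by splitting on whitespace (a stand-in for the real analyzer), and relevance scoring is not modelled. The function names are hypothetical.

```python
from collections import defaultdict

PRODUCERS = {
    1: "gaolujie producer",
    2: "jiajieshi producer",
    3: "zhonghua producer",
    4: "special yagao producer",
}


def build_inverted_index(docs: dict[int, str]) -> dict[str, list[int]]:
    """Map each term to the sorted ids of the documents containing it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in dict.fromkeys(docs[doc_id].split()):  # unique terms
            index[term].append(doc_id)
    return dict(index)


def match_any(index: dict[str, list[int]], query: str) -> list[int]:
    """Full-text match: a document qualifies if it contains any query term."""
    found = set()
    for term in query.split():
        found.update(index.get(term, []))
    return sorted(found)


INDEX = build_inverted_index(PRODUCERS)
print(INDEX["producer"], INDEX["yagao"])
print(match_any(INDEX, "yagao producer"))
```

Document 4 contains both query terms while the others contain only one, which is consistent with it receiving the highest score in the response above.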

    ---------------------------------------------------------------------------------------------------------------------------------

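    5. phrase search

    Phrase search is the counterpart, and the opposite, of full-text search. Full-text search splits the search string into terms and looks each one up in the inverted index; a document is returned as soon as it matches any one of the terms. Phrase search requires the search string to appear in the text of the specified field exactly as given; only then does the document count as a match and get returned.

    GET /ecommerce/product/_search
    {
        "query" : {
            "match_phrase" : {
                "producer" : "yagao producer"
            }
        }
    }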

    {
      "took": 11,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "failed": 0 },
      "hits": {
        "total": 1,
        "max_score": 0.70293105,
        "hits": [
          {
            "_index": "ecommerce",
            "_type": "product",
            "_id": "4",
            "_score": 0.70293105,
            "_source": {
              "name": "special yagao",
              "desc": "special meibai",
              "price": 50,
              "producer": "special yagao producer",
              "tags": [ "meibai" ]
            }
          }
        ]
      }
    }
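The phrase rule can be stated precisely in code: the query terms must occur next to each other and in the same order. The Python sketch below checks that on the producer values of the four products; it assumes whitespace tokenization, and `contains_phrase` is a hypothetical helper, not an Elasticsearch API.

```python
PRODUCERS = {
    1: "gaolujie producer",
    2: "jiajieshi producer",
    3: "zhonghua producer",
    4: "special yagao producer",
}


def contains_phrase(text: str, phrase: str) -> bool:
    """True if the phrase's terms occur in the text consecutively, in order."""
    words, wanted = text.split(), phrase.split()
    return any(
        words[start:start + len(wanted)] == wanted
        for start in range(len(words) - len(wanted) + 1)
    )


phrase_matches = [
    doc_id for doc_id, producer in PRODUCERS.items()
    if contains_phrase(producer, "yagao producer")
]
print(phrase_matches)
```

Documents 1 to 3 contain "producer" but not "yagao" immediately before it, so only document 4 remains, matching the single hit in the response above.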

    ---------------------------------------------------------------------------------------------------------------------------------

    6. highlight search (highlighting search results)

    GET /ecommerce/product/_search
    {
        "query" : {
            "match" : {
                "producer" : "producer"
            }
        },
        "highlight": {
            "fields" : {
                "producer" : {}
            }
        }
    }

    Matching terms in the results are wrapped in <em> tags, for example "zhonghua <em>producer</em>"
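What the highlighter produces can be imitated with a short Python function: every word of the field that equals a query term is wrapped in the tags. This is only a sketch of the visible effect, assuming whitespace tokenization and exact term equality; `highlight` and its `pre` and `post` parameters are hypothetical names.

```python
def highlight(text: str, query: str, pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap every word of the text that matches a query term in the tags."""
    terms = set(query.split())
    return " ".join(
        f"{pre}{word}{post}" if word in terms else word
        for word in text.split()
    )


print(highlight("zhonghua producer", "producer"))
```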

     

    When reposting, please cite the original address: https://ju.6miu.com/read-350027.html
