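After spending two hours installing Elasticsearch, I figured I should spend another half hour on the basic syntax. I worked through the simple operations and am recording them here.
ES exposes a RESTful API, so every operation can be performed with an HTTP request.
Create a new index:
(Delete everything after a # before running a request; the comments are only for explanation.)
PUT /leyou #index name
{
"settings": {
"number_of_shards": 1, #number of shards
"number_of_replicas": 0 #number of replicas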
}
}
View the index information:
GET /leyou #index name
Delete the index:
DELETE /leyou
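After creating the index, the next step is to add mappings, much like defining tables in a database.
First, the commonly used basic types:
- String types, of which there are two:
  - text: analyzed (split into terms); cannot be used in aggregations
  - keyword: not analyzed; the value is matched as one complete term and can be used in aggregations
- Numerical: numeric types, in two groups:
  - Basic types: long, integer, short, byte, double, float, half_float
  - High-precision floating-point type: scaled_float
    - Requires a scaling factor such as 10 or 100. Elasticsearch multiplies the real value by this factor before storing it and restores it on retrieval.
- Date: date type
  Elasticsearch can take dates as formatted strings, but the recommendation is to store them as millisecond timestamps (long values) to save space.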
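To make the scaled_float item above concrete, here is a minimal Python sketch of the idea: multiply by the scaling factor, keep a whole number, and divide again on the way out. The function names and the price 2799.99 are made up for illustration, and the half-up rounding is my assumption about how the scaled value is rounded; this is not Elasticsearch's own code.

```python
import math

def encode_scaled(value: float, scaling_factor: int) -> int:
    """Multiply by the factor and round to the nearest whole number (sketch)."""
    return math.floor(value * scaling_factor + 0.5)

def decode_scaled(stored: int, scaling_factor: int) -> float:
    """Divide the stored whole number by the factor to recover the value."""
    return stored / scaling_factor

stored = encode_scaled(2799.99, 100)      # two decimal places survive
restored = decode_scaled(stored, 100)
print(stored, restored)

# Anything finer than 1/scaling_factor is lost in the round trip.
print(decode_scaled(encode_scaled(2799.994, 100), 100))
```

The takeaway is that the scaling factor decides how many decimal places are kept, so a factor of 100 is a natural fit for prices.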
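PUT /index_name/_mapping #(types were removed as of 7.0, so no type name is needed; everything is _doc)
{
"properties": {
"field_name": {
"type": "type", #the field type: text, long, short, date, integer, object, etc.
"index": true, #whether to index the field; defaults to true
"store": true, #whether to store the field separately; defaults to false
"analyzer": "analyzer" #the analyzer, e.g. `ik_max_word` to use the IK analyzer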
}
}
}
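- index: defaults to true, so without any configuration every field is indexed. Some fields should not be indexed, such as a product's image URL; for those, set index to false explicitly.
- store: whether to store the field value separately, in addition to _source. Defaults to false.
Example:
PUT /leyou/_mapping
{
"properties":{
"title":{
"type": "text",
"analyzer": "ik_max_word"
},
"images":{
"type": "keyword",
"index": false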
},
"price":{
"type":"float"
}
}
}
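View the index's mapping: GET /index_name/_mapping
Example: GET /leyou/_mapping
With the mapping in place, we can add data.
There are two ways to insert:
Insert with a specified document ID: you supply the ID yourself
Insert with an auto-generated document ID: not an auto-increment like MySQL's, but an automatically generated ID similar to MongoDB's
POST /index_name/_doc(/manually specified id)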
{
"key":"value"
}
Example:
POST /leyou/_doc/
{
"title":"华为手机",
"images":"http://image.leyou.com/10086.jpg",
"price":2799.00
}
POST /leyou/_doc/1
{
"title":"小米手机",
"images":"http://image.leyou.com/10086.jpg",
"price":2699.00
}
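Since these are plain HTTP requests, the same two inserts can be prepared from any language. Below is a rough Python sketch using only the standard library. The address http://localhost:9200 and the helper name build_index_request are my assumptions, and the requests are only built, not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of a local node

def build_index_request(index, doc, doc_id=None):
    """Build (but do not send) the request that indexes one document."""
    url = f"{ES_URL}/{index}/_doc"
    if doc_id is not None:
        url += f"/{doc_id}"  # explicit ID; leave it out to let ES generate one
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

auto_id = build_index_request("leyou", {"title": "华为手机", "price": 2799.00})
fixed_id = build_index_request("leyou", {"title": "小米手机", "price": 2699.00}, doc_id=1)
print(auto_id.get_method(), auto_id.full_url)
print(fixed_id.get_method(), fixed_id.full_url)
```

To actually send one of them, pass it to urllib.request.urlopen while a node is running at that address.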
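Query the data just inserted: GET /leyou/_search
If a document contains fields that are not in the mapping, ES infers their types automatically.
The example below has two new fields; ES infers them as long and boolean, adds them to the mapping, and indexes the document.
Example: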
POST /leyou/_doc/3
{
"title":"OPPO手机",
"images":"http://image.leyou.com/123456.jpg",
"price":2899.00,
"stock": 200,
"saleable":true
}
To modify data, use PUT: if a document with the given id already exists it is updated; if not, a new one is added.
Example:
PUT /leyou/_doc/3
{
"title":"OPPO手机",
"images":"http://image.leyou.com/123456.jpg",
"price":2899.00,
"stock": 200,
"saleable":false
}
PUT /leyou/_doc/4 #no document has this id yet, so this request adds one
{
"title":"VIVO手机",
"images":"http://image.leyou.com/123456.jpg",
"price":2199.00,
"stock": 200,
"saleable":true
}
Delete: DELETE /index_name/_doc/id
Example: DELETE /leyou/_doc/4 #deletes the document just added
Queries:
Basic query:
GET /index_name/_search
{
"query":{
"query_type":{
"query_condition":"condition_value"
}
}
}
Query types: match_all (match every document), match (full-text match), term (term query), range (range query)
GET /leyou/_search
{
"query": {
"match": {
"title": "手机" #find documents whose title field matches 手机
}
}
}
GET /leyou/_search
{
"query":{
"match_all": {} #match everything; can be shortened to GET /leyou/_search
}
}
Let's go a little deeper into the match query. First, add a 华为Mac document:
POST /leyou/_doc/5
{
"title":"华为Mac",
"images":"http://image.leyou.com/10086.jpg",
"price":3799.00
}
Now run a match query for 华为Mac:
GET /leyou/_search
{
"query": {
"match": {
"title": "华为Mac"
}
}
}
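The results contain both the phone and the Mac, because by default match combines the analyzed terms with or. But I only wanted the Mac, so why am I being shown 华为手机? The operator property refines the query:
GET /leyou/_search
{
"query": {
"match": {
"title": {
"query": "华为Mac",
"operator": "and" #with the and operator, every term in the query must match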
}
}
}
}
The scenario changes again. Add a 华为电视 document:
POST /leyou/_doc/6
{
"title":"华为电视",
"images":"http://image.leyou.com/10086.jpg",
"price":6799.00
}
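Now we want to search for 华为Mac and check TV prices at the same time (华为Mac电视). The and operator above is no longer useful (it returns nothing).
This is where minimum_should_match comes in: the required degree of matching, given either as a number of terms or as a percentage (usually a percentage).
GET /leyou/_search
{
"query": {
"match": {
"title": {
"query": "华为Mac电视",
"minimum_should_match": "75%" #at least 75% of the query terms must match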
}
}
}
}
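How many terms does 75% actually demand? As far as I understand, the count computed from a positive percentage is rounded down. The Python sketch below works that out for this query. The three-term tokenization and the per-title token lists are my assumptions; the real tokens depend on the analyzer.

```python
import math

def required_matches(term_count: int, percent: int) -> int:
    """Terms that must match for a positive percentage (rounded down)."""
    return math.floor(term_count * percent / 100)

# Assumed tokenization of "华为Mac电视"; the real output depends on the analyzer.
terms = ["华为", "mac", "电视"]
needed = required_matches(len(terms), 75)   # 3 * 0.75 = 2.25, rounded down
print(needed)

# Assumed tokens for a few titles, to see which ones reach the threshold.
titles = {
    "华为Mac": ["华为", "mac"],
    "华为电视": ["华为", "电视"],
    "华为手机": ["华为", "手机"],
    "小米手机": ["小米", "手机"],
}
hits = [t for t, toks in titles.items() if len(set(toks) & set(terms)) >= needed]
print(hits)
```

Under these assumptions two of the three terms are enough, which is why both 华为Mac and 华为电视 come back while 华为手机 does not.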
Multi-field query: multi_match
First add a document:
POST /leyou/_doc/7
{
"title":"华为手机",
"subTitle":"吊打小米",
"images":"http://image.leyou.com/10086.jpg",
"price":6799.00
}
GET /leyou/_search
{
"query": {
"multi_match": {
"query": "华为吊打小米", #the search text
"fields": ["title","subTitle"], #the fields to search: title and subTitle
"minimum_should_match": "75%" #degree of matching
}
}
}
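The document we just inserted is found.
Term query: term. The query value is not analyzed, so it must be a single, indivisible term.
Find every document whose title contains the term 华为: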
GET /leyou/_search
{
"query": {
"term": {
"title": {
"value": "华为"
}
}
}
}
Exact matching on several terms: terms
Find documents whose title contains 小米 or 电视 (exact term match):
GET /leyou/_search
{
"query": {
"terms": {
"title": [
"小米",
"电视"
]
}
}
}
A supplement to basic queries: filtering the returned fields
Find documents whose title matches 手机 and show only the title and price fields:
GET /leyou/_search
{
"_source": ["title","price"],
"query": {
"match": {
"title": "手机"
}
}
}
includes: return only the listed fields
GET /leyou/_search
{
"_source": {
"includes": "price"
},
"query": {
"match": {
"title": "手机"
}
}
}
excludes: leave out the listed fields
GET /leyou/_search
{
"_source": {
"excludes": "subTitle"
},
"query": {
"match": {
"title": "手机"
}
}
}
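The effect of includes and excludes on a single hit can be pictured with a small Python sketch. filter_source is a made-up helper, and it is a simplification: it only handles exact field names, whereas the real _source option also accepts wildcard patterns.

```python
def filter_source(source: dict, includes=None, excludes=None) -> dict:
    """Keep the included fields (all if none given), then drop the excluded ones."""
    kept = {k: v for k, v in source.items() if includes is None or k in includes}
    return {k: v for k, v in kept.items() if not excludes or k not in excludes}

doc = {"title": "华为手机", "subTitle": "吊打小米", "price": 6799.00}
print(filter_source(doc, includes=["price"]))
print(filter_source(doc, excludes=["subTitle"]))
```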
A bool query combines other queries through must (and), should (or), and must_not (not):
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
},{
"term": {
"price": "3799"
}
}
]
}
}
}
Range query: range
Find documents priced from 5000 to 9999 that also match 华为:
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
},{
"range": {
"price": {
"gte": 5000,
"lte": 9999
}
}
}
]
}
}
}
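Fuzzy query: fuzzy
fuzziness is the allowed edit distance. With 1, a term that differs from value by one character is still found.
The value given to fuzzy must be a single term (it is not analyzed).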
GET /leyou/_search
{
"query": {
"fuzzy": {
"title": {
"value": "手视",
"fuzziness": 1
}
}
}
}
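To see why 手视 can still find things, here is a Python sketch of edit distance. It is the plain Levenshtein distance (insert, delete, substitute), which I am using as a stand-in for whatever Elasticsearch does internally, and the list of indexed terms is assumed rather than read from a real index.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or keep if equal
        prev = curr
    return prev[-1]

# Assumed terms in the title field's index.
indexed_terms = ["手机", "电视", "华为", "小米"]
matches = [t for t in indexed_terms if edit_distance("手视", t) <= 1]
print(matches)
```

Both 手机 and 电视 are one substitution away from 手视, so with fuzziness 1 documents containing either term would be candidates.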
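Filtering: filter
Usually used together with a bool query to filter the results. This kind of filtering does not affect _score (the ranking).
Filters typically back the options outside the search box.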
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
}
],
"filter": {
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
}
}
}
Putting range inside must returns the same documents, but it does affect _score (the ranking):
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
},
{
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
]
}
}
}
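The two request bodies above differ only in where the range clause sits. When such queries are built in application code, that choice can be a single switch, as in this Python sketch. bool_query and its score_range flag are names I made up for illustration.

```python
import json

def bool_query(keyword, min_price, max_price, score_range=False):
    """Build a bool query; the range goes in filter unless score_range is set."""
    range_clause = {"range": {"price": {"gte": min_price, "lte": max_price}}}
    query = {"bool": {"must": [{"match": {"title": keyword}}]}}
    if score_range:
        query["bool"]["must"].append(range_clause)  # takes part in scoring
    else:
        query["bool"]["filter"] = range_clause      # only includes or excludes
    return {"query": query}

filtered = bool_query("华为", 1000, 8000)
scored = bool_query("华为", 1000, 8000, score_range=True)
print(json.dumps(filtered, ensure_ascii=False, indent=2))
```

Defaulting to filter seems the safer choice for things like price sliders, since the user's keyword then decides the ranking on its own.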
Sorting
Sorting is usually applied to the results of a query.
desc: descending; asc: ascending
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
}
],
"filter": {
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
}
},
"sort": [
{
"price": {
"order": "asc"
}
}
]
}
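Sort criteria can be combined: when the first field ties, the second field decides the order.
Price ascending; for equal prices, _id descending: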
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
}
],
"filter": {
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
}
},
"sort": [
{
"price": {
"order": "asc"
}
},{
"_id":{
"order": "desc"
}
}
]
}
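The tie-breaking order can be checked offline with a few lines of Python. The three documents mirror the ones inserted earlier with ids 5, 6 and 7; comparing _id as a string is my assumption about how the values are ordered.

```python
docs = [
    {"_id": "5", "title": "华为Mac", "price": 3799.00},
    {"_id": "6", "title": "华为电视", "price": 6799.00},
    {"_id": "7", "title": "华为手机", "price": 6799.00},
]

# Python's sort is stable, so sort by the secondary key first, then the primary.
by_id_desc = sorted(docs, key=lambda d: d["_id"], reverse=True)
ordered = sorted(by_id_desc, key=lambda d: d["price"])
print([d["_id"] for d in ordered])
```

The cheapest document comes first, and the two documents priced 6799 are ordered 7 before 6 because of the descending _id.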
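Aggregations: aggregations
Buckets:
A bucket only groups documents and does no calculation, so another aggregation is usually nested inside it.
Elasticsearch offers many ways to divide documents into buckets:
- Date Histogram Aggregation: groups by date interval; for example, with an interval of one week, each week becomes one bucket
- Histogram Aggregation: groups by numeric interval, similar to the date version
- Terms Aggregation: groups by term; documents with exactly the same term go into the same bucket
- Range Aggregation: groups by numeric or date ranges; you specify each range's start and end, and documents are split accordingly
Create a new index: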
PUT /cars
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"color": {
"type": "keyword"
},
"make": {
"type": "keyword"
}
}
}
}
POST /cars/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
GET /cars/_search
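Aggregating into buckets:
aggs holds one or more named aggregations; each one consists of an aggregation type and its parameters.
"size": 0
means the ordinary hits are not returned, only the aggregation results (size is the number of hits to return).
terms: bucket by term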
GET /cars/_search
{
"size": 0,
"aggs": {
"popular_color": {
"terms": {
"field": "color"
}
}
}
}
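histogram: bucket by numeric interval
"interval":
the width of each interval
"min_doc_count":
the minimum number of documents a bucket needs to be shown (with 1, empty buckets are omitted)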
GET /cars/_search
{
"size": 0,
"aggs": {
"price_histogram": {
"histogram": {
"field": "price",
"interval": 10000,
"min_doc_count": 1
}
}
}
}
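What the histogram does with the eight car prices can be reproduced in a few lines of Python. This is only a sketch of the bucketing rule (each value goes to the bucket keyed by its interval floor, with no offset), not the engine's implementation.

```python
import math
from collections import Counter

prices = [10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000]
interval = 10000
min_doc_count = 1

# Each value falls into the bucket whose key is the interval floor of the value.
counts = Counter(math.floor(p / interval) * interval for p in prices)

# Walk every key between the smallest and largest bucket so that empty buckets
# are considered too, then apply min_doc_count.
all_keys = range(min(counts), max(counts) + interval, interval)
buckets = [(k, counts[k]) for k in all_keys if counts[k] >= min_doc_count]
print(buckets)
```

With min_doc_count set to 0 instead, the empty buckets from 40000 to 70000 would be listed as well, which is exactly what the setting of 1 suppresses.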
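Metrics:
Once documents are grouped, we usually run a calculation over each group, such as average, maximum, minimum, or sum. ES calls these metrics.
Some commonly used metric aggregations:
- Avg Aggregation: average
- Max Aggregation: maximum
- Min Aggregation: minimum
- Percentiles Aggregation: percentiles
- Stats Aggregation: returns avg, max, min, sum, count, etc. in one go
- Sum Aggregation: sum
- Top hits Aggregation: the top few documents
- Value Count Aggregation: the number of values
Metrics are nested inside a bucket aggregation and computed over each bucket's documents (they go under the bucket's name, and one bucket can have several metrics).
Below, popular_color is the bucket name; price_avg and makes are the metric names.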
GET /cars/_search
{
"size": 0,
"aggs": {
"popular_color": {
"terms": {
"field": "color"
},
"aggs": {
"price_avg": {
"avg": {
"field": "price"
}
},
"makes":{
"value_count": {
"field": "make"
}
}
}
}
}
}
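The bucket-then-metric idea maps neatly onto a group-by in ordinary code. The Python sketch below redoes the request above by hand on the same eight cars (the sold field is left out because it is not used here). It only imitates the shape of the result and is not how Elasticsearch computes it.

```python
from collections import defaultdict

cars = [
    {"price": 10000, "color": "red", "make": "honda"},
    {"price": 20000, "color": "red", "make": "honda"},
    {"price": 30000, "color": "green", "make": "ford"},
    {"price": 15000, "color": "blue", "make": "toyota"},
    {"price": 12000, "color": "green", "make": "toyota"},
    {"price": 20000, "color": "red", "make": "honda"},
    {"price": 80000, "color": "red", "make": "bmw"},
    {"price": 25000, "color": "blue", "make": "ford"},
]

# Bucket step: group documents by color, like the terms aggregation.
buckets = defaultdict(list)
for car in cars:
    buckets[car["color"]].append(car)

# Metric step: avg(price) and value_count(make) inside each bucket.
result = {
    color: {
        "doc_count": len(group),
        "price_avg": sum(c["price"] for c in group) / len(group),
        "makes": sum(1 for c in group if "make" in c),
    }
    for color, group in buckets.items()
}
print(result)
```

Red ends up with four cars averaging 32500, which is the kind of number the price_avg metric reports per bucket.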
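Buckets can also be nested inside buckets: wherever a bucketing keyword (such as terms) appears inside aggs, it creates a nested bucket.
Below, makes_factory is the nested bucket.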
GET /cars/_search
{
"size": 0,
"aggs": {
"popular_color": {
"terms": {
"field": "color"
},
"aggs": {
"price_avg": {
"avg": {
"field": "price"
}
},
"makes_factory":{
"terms": {
"field": "make"
}
}
}
}
}
}
The complete JSON is attached below:
PUT /leyou
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
}
}
GET /leyou
PUT /leyou/_mapping
{
"properties":{
"title":{
"type": "text",
"analyzer": "ik_max_word"
},
"images":{
"type": "keyword",
"index": false
},
"price":{
"type":"float"
}
}
}
GET /leyou/_mapping
POST /leyou/_doc/
{
"title":"华为手机",
"images":"http://image.leyou.com/10086.jpg",
"price":2799.00
}
GET /leyou/_search
POST /leyou/_doc/3
{
"title":"OPPO手机",
"images":"http://image.leyou.com/123456.jpg",
"price":2899.00,
"stock": 200,
"saleable":true
}
PUT /leyou/_doc/3
{
"title":"OPPO手机",
"images":"http://image.leyou.com/123456.jpg",
"price":2899.00,
"stock": 200,
"saleable":false
}
PUT /leyou/_doc/4
{
"title":"VIVO手机",
"images":"http://image.leyou.com/123456.jpg",
"price":2199.00,
"stock": 200,
"saleable":true
}
DELETE /leyou/_doc/4
GET /leyou/_search
{
"query": {
"match": {
"title": "手机"
}
}
}
GET /leyou/_search
{
"query":{
"match_all": {}
}
}
POST /leyou/_doc/5
{
"title":"华为Mac",
"images":"http://image.leyou.com/10086.jpg",
"price":3799.00
}
GET /leyou/_search
{
"query": {
"match": {
"title": "华为Mac"
}
}
}
GET /leyou/_search
{
"query": {
"match": {
"title": {
"query": "华为Mac",
"operator": "and"
}
}
}
}
POST /leyou/_doc/6
{
"title":"华为电视",
"images":"http://image.leyou.com/10086.jpg",
"price":6799.00
}
GET /leyou/_search
{
"query": {
"match": {
"title": {
"query": "华为Mac电视",
"minimum_should_match": "75%"
}
}
}
}
POST /leyou/_doc/7
{
"title":"华为手机",
"subTitle":"吊打小米",
"images":"http://image.leyou.com/10086.jpg",
"price":6799.00
}
GET /leyou/_search
{
"query": {
"multi_match": {
"query": "华为吊打小米",
"fields": ["title","subTitle"],
"minimum_should_match": "75%"
}
}
}
GET /leyou/_search
{
"query": {
"term": {
"title": {
"value": "华为"
}
}
}
}
GET /leyou/_search
{
"query": {
"terms": {
"title": [
"小米",
"电视"
]
}
}
}
GET /leyou/_search
{
"_source": {
"includes": "price"
},
"query": {
"match": {
"title": "手机"
}
}
}
GET /leyou/_search
{
"_source": {
"excludes": "subTitle"
},
"query": {
"match": {
"title": "手机"
}
}
}
#a bool query combines other queries through must (and), should (or), and must_not (not)
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
},{
"term": {
"price": "3799"
}
}
]
}
}
}
GET /leyou/_search
#range query: range
#finds documents within the price range that also match 华为
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
},{
"range": {
"price": {
"gte": 5000,
"lte": 9999
}
}
}
]
}
}
}
#fuzziness is the allowed edit distance, here 1 character; fuzzy takes a single term only
GET /leyou/_search
{
"query": {
"fuzzy": {
"title": {
"value": "手视",
"fuzziness": 1
}
}
}
}
#filter: usually used with a bool query to filter the results; it does not affect _score (the ranking). Filters typically back the options outside the search box
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
}
],
"filter": {
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
}
}
}
#with range placed in must the results are the same, but _score (the ranking) is affected
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
},
{
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
]
}
}
}
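#sorting: usually applied to the results of a query. desc: descending, asc: ascending. Sort criteria can be combined: when the first field ties, the second decides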
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
}
],
"filter": {
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
}
},
"sort": [
{
"price": {
"order": "asc"
}
}
]
}
#price ascending; for equal prices, _id descending
GET /leyou/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
}
],
"filter": {
"range": {
"price": {
"gte": 1000,
"lte": 8000
}
}
}
}
},
"sort": [
{
"price": {
"order": "asc"
}
},{
"_id":{
"order": "desc"
}
}
]
}
PUT /cars
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"color": {
"type": "keyword"
},
"make": {
"type": "keyword"
}
}
}
}
POST /cars/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
GET /cars/_search
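#aggregating into buckets: aggs holds one or more named aggregations, each made up of an aggregation type and its parameters
#"size": 0 means the ordinary hits are not returned, only the aggregation results
#terms: bucket by term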
GET /cars/_search
{
"size": 0,
"aggs": {
"popular_color": {
"terms": {
"field": "color"
}
}
}
}
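#metrics are nested inside a bucket aggregation and computed over each bucket (they go under the bucket's name; one bucket can have several metrics)
#popular_color is the bucket name; price_avg and makes are the metric names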
GET /cars/_search
{
"size": 0,
"aggs": {
"popular_color": {
"terms": {
"field": "color"
},
"aggs": {
"price_avg": {
"avg": {
"field": "price"
}
},
"makes":{
"value_count": {
"field": "make"
}
}
}
}
}
}
#buckets can also be nested inside buckets: a bucketing keyword inside aggs creates a nested bucket
GET /cars/_search
{
"size": 0,
"aggs": {
"popular_color": {
"terms": {
"field": "color"
},
"aggs": {
"price_avg": {
"avg": {
"field": "price"
}
},
"makes_factory":{
"terms": {
"field": "make"
}
}
}
}
}
}
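#histogram: bucket by numeric interval; "interval" is the width of each interval
#"min_doc_count": the minimum number of documents a bucket needs to be shown (with 1, empty buckets are omitted)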
GET /cars/_search
{
"size": 0,
"aggs": {
"price_histogram": {
"histogram": {
"field": "price",
"interval": 10000,
"min_doc_count": 1
}
}
}
}