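I query Elasticsearch (ES) almost every day at work, and each query means building a DSL statement that fits the need. This article records the DSL statements I use most often, together with fixes for the problems I ran into. DSL is flexible and its clauses can be combined in many ways. I hope you find it useful.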
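# List all indices
GET _cat/indices
# Same list as a table with column headers (recommended)
GET _cat/indices?v
# Index statistics
GET index_name/_stats
# Field mappings
GET index_name/_mapping
# Index settings
GET index_name/_settings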
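# List all aliases
GET _cat/aliases
# Number of documents in the index
GET index_name/_count
# Storage size of the index
GET index_name/_stats/store
# Total hits without returning any documents
GET index_name/_search?size=0
# Shard layout of the index
GET _cat/shards/index_name
# Health status of the index
GET _cluster/health/index_name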
GET _cat/thread_pool/index_name
GET _cat/thread_pool/search
GET _cat/thread_pool/index
GET _cat/thread_pool/delete
GET _cat/thread_pool/refresh
GET _cat/thread_pool/merge
GET _cat/thread_pool/get
GET _cat/thread_pool/update
This returns every field of each document; only 10 hits are shown by default.
GET index_name/_search
{
"query": {
"match_all": {
}
}
}
# Full-text match: the query string is analyzed before matching
GET index_name/_search
{
"query": {
"match": {
"user": "YiShuoIT"
}
}
}
# Exact term match: the value is not analyzed
GET index_name/_search
{
"query": {
"term": {
"user": "YiShuoIT"
}
}
}
# Match any one of several exact values
GET index_name/_search
{
"query": {
"terms": {
"user": ["YiShuoIT", "YiShuo"]
}
}
}
# Wildcard match (* stands for any sequence of characters)
GET index_name/_search
{
"query": {
"wildcard": {
"user": "*YiShuo*"
}
}
}
# Wildcard match on several fields, combined with OR through should
GET index_name/_search
{
"query": {
"bool": {
"should": [
{
"wildcard": {
"command": "curl*password*"
}
},
{
"wildcard": {
"user": "YiShuo*"
}
}
]
}
}
}
# Prefix match
GET index_name/_search
{
"query":{
"prefix": {"user": "Yi"}
}
}
# Prefix match on several fields, combined with OR through should
GET index_name/_search
{
"query": {
"bool": {
"should": [
{
"prefix": {
"user": "Yi"
}
},
{
"prefix": {
"command": "curl"
}
}
]
}
}
}
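The fuzziness parameter sets how much difference a fuzzy match tolerates. It takes either a number (the maximum edit distance) or a string (a fuzziness mode). For example, "2" allows an edit distance of 2, and "AUTO" derives the allowed distance from the length of the term.
GET index_name/_search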
{
"query": {
"fuzzy": {
"user": {
"value": "yi",
"fuzziness": "2"
}
}
}
}
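To make the fuzziness values more concrete, here is a minimal Python sketch. The helper names (`auto_fuzziness`, `levenshtein`, `fuzzy_query`) are made up for illustration and are not part of any Elasticsearch client. It assumes the default AUTO thresholds of 3 and 6, and it uses plain Levenshtein distance, whereas Elasticsearch by default also counts swapping two adjacent characters as a single edit.

```python
def auto_fuzziness(term: str) -> int:
    """Edits allowed by fuzziness AUTO with its default thresholds (3 and 6)."""
    length = len(term)
    if length < 3:
        return 0  # very short terms must match exactly
    if length < 6:
        return 1
    return 2


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


def fuzzy_query(field: str, value: str, fuzziness="AUTO") -> dict:
    """Build a fuzzy query body shaped like the console request above."""
    return {"query": {"fuzzy": {field: {"value": value, "fuzziness": fuzziness}}}}
```

Under these assumptions a two-character term such as "yi" gets no tolerance from AUTO, which is one reason to set the number explicitly for short values.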
# Regular expression match
GET index_name/_search
{
"query": {
"regexp": {
"user": ".*YiShuo.*"
}
}
}
# Time range: from the start of today until now
GET index_name/_search
{
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "now/d",
"lte": "now"
}
}
}
}
}
}
# Time range with explicit bounds, date format, and time zone
GET index_name/_search
{
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "2023-12-01 00:00:00",
"lt": "2023-12-02 23:59:59",
"time_zone": "+08:00",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
}
}
}
}
# Time range plus conditions that must all hold
GET index_name/_search
{
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "2023-12-01 00:00:00",
"lt": "2023-12-02 23:59:59",
"time_zone": "+08:00",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
},
"must": [
{
"wildcard": {
"command": "*"
}
},
{
"wildcard": {
"ip": "192.*"
}
}
]
}
},
"size": 1
}
# Time range plus conditions that must not hold
GET index_name/_search
{
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "2023-12-01 00:00:00",
"lt": "2023-12-02 23:59:59",
"time_zone": "+08:00",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
},
"must_not": [
{
"wildcard": {
"ip": "127.*"
}
},
{
"wildcard": {
"ip": "localhost*"
}
}
]
}
},
"size": 1
}
# Require that a field exists
GET index_name/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"user": "Yi*"
}
},
{
"exists": {
"field": "address"
}
}
]
}
},
"size": 1
}
# Require that a field does not exist
GET index_name/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"user": "Yi*"
}
}
],
"must_not": [
{
"exists": {
"field": "address"
}
}
]
}
},
"size": 1
}
# Sort by timestamp, newest first
GET index_name/_search
{
"query": {
"wildcard": {
"ip": "192.168.*"
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
]
}
# Sort by timestamp, oldest first
GET index_name/_search
{
"query": {
"wildcard": {
"ip": "192.168.*"
}
},
"sort": [
{
"timestamp": {
"order": "asc"
}
}
]
}
# Return only selected fields through _source
GET index_name/_search
{
"query": {
"wildcard": {
"ip": "192.168.*"
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"_source": [
"user",
"ip",
"phone",
"address"
]
}
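A single request returns at most 10,000 hits by default (the index.max_result_window setting). To read more than that, page through the results; from + size is bound by the same limit, so going past it needs search_after or the scroll API.
# Fetch the maximum of 10,000 hits in one request
GET index_name/_search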
{
"query": {
"wildcard": {
"ip": "192.168.*"
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"_source": [
"user",
"ip",
"phone",
"address"
],
"size": 10000
}
# Page with from and size; from is a zero-based offset into the sorted hits
GET index_name/_search
{
"query": {
"wildcard": {
"ip": "192.168.*"
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"_source": [
"user",
"ip",
"phone",
"address"
],
"from": 0,
"size": 100
}
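The paging arithmetic can be sketched in Python as follows. This is only an illustration: `page_body` and `next_body` are hypothetical helper names, the 10,000 limit is the default value of `index.max_result_window`, and the `search_after` continuation assumes each hit carries the `sort` array that Elasticsearch returns when the request specifies a sort.

```python
import copy

MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window


def page_body(base_body: dict, page: int, page_size: int) -> dict:
    """Return a copy of base_body for a zero-based page number."""
    offset = page * page_size
    if offset + page_size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window; use search_after")
    body = copy.deepcopy(base_body)
    body["from"] = offset
    body["size"] = page_size
    return body


def next_body(base_body: dict, last_hit: dict, page_size: int) -> dict:
    """Continue after the last hit of the previous page using its sort values."""
    body = copy.deepcopy(base_body)
    body["size"] = page_size
    body["search_after"] = last_hit["sort"]
    return body


# Same query and sort as the console request above.
base_body = {
    "query": {"wildcard": {"ip": "192.168.*"}},
    "sort": [{"timestamp": {"order": "desc"}}],
}
first_page = page_body(base_body, page=0, page_size=100)
```

With `search_after`, sorting on timestamp alone can skip or repeat documents that share the same timestamp, so a unique tiebreaker field in the sort is usually worth adding.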
# Search several indices at once (comma-separated names)
GET index_name_one,index_name_two/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"user": "Yi*"
}
}
],
"must_not": [
{
"wildcard": {
"ip": "127.*"
}
},
{
"wildcard": {
"ip": "localhost*"
}
}
]
}
},
"size": 1
}
# Search every index whose name matches a wildcard pattern
GET index_name_*/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"user": "Yi*"
}
}
],
"must_not": [
{
"wildcard": {
"ip": "127.*"
}
},
{
"wildcard": {
"ip": "localhost*"
}
}
]
}
},
"size": 1
}
The bool query is the usual way to build complex query logic. It combines several clauses to express logical operations: must, must_not, should, and filter.
GET index_name/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"user": "Yi*"
}
}
],
"must_not": [
{
"wildcard": {
"ip": "127.*"
}
},
{
"wildcard": {
"ip": "localhost*"
}
}
]
}
},
"size": 1
}
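When such bodies are generated by a program rather than typed by hand, a small builder keeps the nesting readable. The sketch below is a hypothetical helper, not an Elasticsearch client API; it only assembles plain dictionaries that serialize to the same JSON as the request above.

```python
def wildcard(field: str, pattern: str) -> dict:
    """One wildcard clause."""
    return {"wildcard": {field: pattern}}


def bool_query(must=None, must_not=None, should=None, filters=None) -> dict:
    """Assemble a bool query, leaving out clause types that were not given."""
    clauses = {
        "must": must,
        "must_not": must_not,
        "should": should,
        "filter": filters,  # named filters here to avoid shadowing the builtin
    }
    return {"bool": {name: value for name, value in clauses.items() if value}}


# Rebuilds the request body shown above.
body = {
    "query": bool_query(
        must=[wildcard("user", "Yi*")],
        must_not=[wildcard("ip", "127.*"), wildcard("ip", "localhost*")],
    ),
    "size": 1,
}
```

Passing `body` through `json.dumps` gives text that can be pasted into the console or sent with any HTTP client.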
# AND of two OR groups: each nested bool with should is one condition inside must
GET index_name/_search
{
"track_total_hits": true,
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"wildcard": {
"host": "www.baidu.com"
}
},
{
"wildcard": {
"host": "www.qq.com"
}
}
]
}
},
{
"bool": {
"should": [
{
"wildcard": {
"user": "*龙*"
}
},
{
"wildcard": {
"user": "*虎*"
}
}
]
}
}
]
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"size": 1
}
# Count the documents in a time range; size 0 returns only the total
GET index_name/_search
{
"track_total_hits": true,
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "2023-11-27 00:00:00",
"lt": "2023-12-03 23:59:59",
"time_zone": "+08:00",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
}
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"size": 0
}
# Time range, two OR groups, and an exclusion, counted with size 0
GET index_name/_search
{
"track_total_hits": true,
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "2023-11-27 00:00:00",
"lt": "2023-12-03 23:59:59",
"time_zone": "+08:00",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
},
"must": [
{
"bool": {
"should": [
{
"wildcard": {
"host": "www.baidu.com"
}
},
{
"wildcard": {
"host": "www.qq.com"
}
}
]
}
},
{
"bool": {
"should": [
{
"wildcard": {
"user": "*龙*"
}
},
{
"wildcard": {
"user": "*虎*"
}
}
]
}
}
],
"must_not": [
{
"wildcard": {
"address": {
"value": "*广东*"
}
}
}
]
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"size": 0
}
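match is a full-text query: the query string is analyzed into terms, and documents are matched and scored by relevance.
term is an exact-match query: the value is compared directly with the indexed terms and is not analyzed.
terms is a multi-value query: a document matches if it contains any one of the given values.
should matches when any one of its clauses is satisfied, which gives OR logic.
must_not matches only when its clauses are not satisfied, which excludes specific conditions.
bool is a compound query that builds complex logic by combining must, should, must_not, and filter clauses.
filter narrows the documents without computing a relevance score; it only decides whether each document matches, which improves query performance.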
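The difference between match and term is easiest to see with a toy model of analysis. The Python sketch below is a deliberate simplification: `analyze` only lowercases and splits on non-word characters, which is a rough stand-in for the standard analyzer (the real one, for example, splits Chinese text character by character), and all function names are invented for this example.

```python
import re


def analyze(text: str) -> list:
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def term_matches(indexed_tokens: list, value: str) -> bool:
    """term compares the raw value with the indexed tokens, with no analysis."""
    return value in indexed_tokens


def match_matches(indexed_tokens: list, query_text: str) -> bool:
    """match analyzes the query text first; by default any token may match (OR)."""
    return any(token in indexed_tokens for token in analyze(query_text))


# What a text field would hold in the index for this hypothetical value.
indexed = analyze("YiShuoIT shares ES tips")
```

In this model `term_matches(indexed, "YiShuoIT")` fails because the index holds the lowercased token, while `match_matches(indexed, "YiShuoIT")` succeeds. This is the usual reason a term query on a text field returns nothing; a keyword field stores the value unchanged, so term works there.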
# Group by a field and count the documents per value (terms aggregation)
GET index_name/_search
{
"track_total_hits": true,
"query": {
"match_all": {}
},
"aggs": {
"NAME": {
"terms": {
"field": "user",
"size": 10
}
}
}
}
# Count the distinct values of a field (cardinality aggregation)
GET index_name/_search
{
"track_total_hits": true,
"query": {
"match_all": {}
},
"aggs": {
"NAME": {
"cardinality": {
"field": "user"
}
}
}
}
# Count how many values a field has (value_count aggregation)
GET index_name/_search
{
"track_total_hits": true,
"query": {
"match_all": {}
},
"aggs": {
"NAME": {
"value_count": {
"field": "user"
}
}
}
}
# Group by a combination of two fields using a script
GET index_name/_search
{
"track_total_hits": true,
"query": {
"match_all": {}
},
"aggs": {
"NAME": {
"terms": {
"script": "doc['user'] +'####'+ doc['ip']",
"size": 1000
}
}
}
}
# Nested aggregation: group by user, then by ip and address within each user
GET index_name/_search
{
"track_total_hits": true,
"query": {
"match_all": {}
},
"aggs": {
"NAME": {
"terms": {
"field": "user",
"size": 10
},
"aggs": {
"NAME": {
"terms": {
"script": "doc['ip'] +'####'+ doc['address']",
"size": 1000
}
}
}
}
}
}
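These aggregations can be combined and nested as needed to build more complex statistics and analysis. They extract summary information from query results for data analysis, visualization, and business insight.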
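terms aggregation: groups documents by field value and counts the documents in each group.
date_histogram aggregation: groups a date field into time intervals and counts the documents in each interval.
range aggregation: splits a field's values into ranges and counts the documents in each range.
histogram aggregation: splits a numeric field's values into fixed-width buckets and counts the documents in each bucket.
avg aggregation: computes the average of a numeric field.
sum aggregation: computes the sum of a numeric field.
min aggregation: finds the minimum of a numeric field.
max aggregation: finds the maximum of a numeric field.
cardinality aggregation: computes the cardinality of a field (the number of distinct values).
top_hits aggregation: returns the top documents in each group.
extended_stats aggregation: computes statistics for a numeric field, including average, standard deviation, minimum, and maximum.
percentiles aggregation: computes percentiles of a numeric field.
geo_distance aggregation: groups a geo-point field by distance and counts the documents in each distance range.
filter aggregation: aggregates only the documents that match a given filter.
nested aggregation: aggregates over nested documents.
value_count aggregation: counts the values of a field.
stats aggregation: computes statistics for a numeric field, including average, sum, minimum, maximum, and count.
scripted_metric aggregation: computes the result with a custom script.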
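To show what a few of these aggregations compute, the following Python sketch reproduces them over a small in-memory list. The sample documents and function names are hypothetical, every field is assumed to hold a single value, and the results are exact, whereas the real cardinality aggregation returns an approximate count on large data sets.

```python
from collections import Counter

# Hypothetical sample documents; the last one has no user or bytes field.
docs = [
    {"user": "YiShuo", "bytes": 10},
    {"user": "YiShuo", "bytes": 30},
    {"user": "YiShuoIT", "bytes": 20},
    {"ip": "192.168.0.1"},
]


def terms_agg(docs: list, field: str, size: int = 10) -> list:
    """Group by field value and count documents, largest groups first."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return counts.most_common(size)


def cardinality_agg(docs: list, field: str) -> int:
    """Number of distinct values of the field."""
    return len({doc[field] for doc in docs if field in doc})


def value_count_agg(docs: list, field: str) -> int:
    """Number of values of the field (one per document that has it)."""
    return sum(1 for doc in docs if field in doc)


def stats_agg(docs: list, field: str) -> dict:
    """count, min, max, avg, and sum of a numeric field."""
    values = [doc[field] for doc in docs if field in doc]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
    }
```

Documents that lack the field are skipped in every function, which mirrors how these aggregations ignore missing values unless a `missing` option is set.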
If the reported total stops at 10,000, add "track_total_hits": true to get the exact number of hits.
GET index_name/_search
{
"track_total_hits": true,
"query": {
"match_all": {
}
}
}
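In Elasticsearch 7 and later the response reports the total as an object with a `value` and a `relation`, and the relation tells whether the value is exact. A minimal Python sketch of reading it, using hypothetical response fragments and a made-up helper name:

```python
def describe_total(response: dict) -> str:
    """Summarize hits.total, distinguishing exact counts from lower bounds."""
    total = response["hits"]["total"]
    if total["relation"] == "eq":
        return f"exactly {total['value']} hits"
    # "gte" means counting stopped early and the value is only a lower bound.
    return f"at least {total['value']} hits"


# Hypothetical fragments: default tracking versus "track_total_hits": true.
capped = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}
tracked = {"hits": {"total": {"value": 48213, "relation": "eq"}, "hits": []}}
```

Exact counting costs extra work on large indices, so it may be worth enabling only on the requests that need the true total.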
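When should sits next to must in the same bool query, its clauses become optional and no longer restrict the results. Wrapping the whole should group in a nested bool and placing it as one condition inside must solves this.
GET index_name/_search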
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"wildcard": {
"user": "*Yi*"
}
},
{
"wildcard": {
"user": "*龙*"
}
},
{
"wildcard": {
"user": "*虎*"
}
}
]
}
}
]
}
},
"size": 10000
}
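The reason behind this behavior is the default for `minimum_should_match`: it is 1 when a bool query has should clauses but no must or filter, and 0 otherwise. The Python sketch below is a toy evaluator written only to illustrate that rule. The `matches` function is hypothetical, supports only term and bool clauses written as lists, and accepts only integer values for `minimum_should_match`.

```python
def matches(doc: dict, query: dict) -> bool:
    """Evaluate a query made of term and bool clauses against one document."""
    kind, spec = next(iter(query.items()))
    if kind == "term":
        field, value = next(iter(spec.items()))
        return doc.get(field) == value
    if kind != "bool":
        raise ValueError(f"unsupported query type: {kind}")
    required = spec.get("must", []) + spec.get("filter", [])
    should = spec.get("should", [])
    # Default rule: should is mandatory only when there is no must or filter.
    default_minimum = 0 if required or not should else 1
    minimum = spec.get("minimum_should_match", default_minimum)
    return (
        all(matches(doc, clause) for clause in required)
        and not any(matches(doc, clause) for clause in spec.get("must_not", []))
        and sum(matches(doc, clause) for clause in should) >= minimum
    )


host = {"term": {"host": "www.qq.com"}}
users = [{"term": {"user": "Alice"}}, {"term": {"user": "Carol"}}]

# should next to must: the user clauses are optional.
flat = {"bool": {"must": [host], "should": users}}
# should wrapped in its own bool inside must: one user clause is required.
wrapped = {"bool": {"must": [host, {"bool": {"should": users}}]}}

doc = {"host": "www.qq.com", "user": "Bob"}
```

Here `matches(doc, flat)` is true even though the user is neither Alice nor Carol, while `matches(doc, wrapped)` is false. Setting `"minimum_should_match": 1` on the flat query is another way to get the same effect.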
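Whether you are a data analyst, a developer, or in any other role that works with Elasticsearch, knowing the ES DSL well matters: these query tools make your work faster and more productive. For more original technical articles, search for 艺说IT among WeChat official accounts.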