Elasticsearch Rollup Data
-
Rollup data
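A rollup job is a periodic task that summarizes the data in the indices matched by an index pattern and writes the summaries to a new index. In the example below, we create sensor indices whose documents carry different timestamps. We then create a rollup job that rolls up the data in these indices on a cron schedule.

PUT /sensor/_doc/1
{
  "timestamp": 1516729294000,
  "temperature": 200,
  "voltage": 5.2,
  "node": "a"
}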
Running the code above returns the following result:

{
  "_index" : "sensor",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
Now add a second document, and so on for any further readings.

PUT /sensor-2020-01-01/_doc/2
{
  "timestamp": 1413729294000,
  "temperature": 201,
  "voltage": 5.9,
  "node": "a"
}
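Adding documents one request at a time gets tedious once there are many readings. The following is a minimal sketch, not part of the original tutorial, of how a payload for the `_bulk` API could be assembled in Python. The helper name `bulk_body` and the `sensor-YYYY-MM-DD` naming scheme derived from each timestamp are assumptions made for illustration; the sketch only builds the request body and does not contact a cluster.

```python
import json
from datetime import datetime, timezone


def bulk_body(readings):
    """Return an NDJSON payload for the _bulk API (hypothetical helper)."""
    lines = []
    for doc_id, reading in enumerate(readings, start=1):
        # Timestamps are epoch milliseconds; derive a UTC date for the index name.
        day = datetime.fromtimestamp(reading["timestamp"] / 1000, tz=timezone.utc)
        index_name = f"sensor-{day:%Y-%m-%d}"  # assumed naming scheme
        # Each document is an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        lines.append(json.dumps(reading))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


readings = [
    {"timestamp": 1516729294000, "temperature": 200, "voltage": 5.2, "node": "a"},
    {"timestamp": 1413729294000, "temperature": 201, "voltage": 5.9, "node": "a"},
]
body = bulk_body(readings)
print(body)
```

The resulting text can be sent as the body of a `POST /_bulk` request with the `Content-Type: application/x-ndjson` header.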
-
Creating a rollup job
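PUT _rollup/job/sensor
{
  "index_pattern": "sensor-*",
  "rollup_index": "sensor_rollup",
  "cron": "*/30 * * * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "field": "timestamp",
      "interval": "60m"
    },
    "terms": {
      "fields": ["node"]
    }
  },
  "metrics": [
    {
      "field": "temperature",
      "metrics": ["min", "max", "sum"]
    },
    {
      "field": "voltage",
      "metrics": ["avg"]
    }
  ]
}

Note that the pattern sensor-* matches only indices whose names begin with "sensor-", such as sensor-2020-01-01. The first document above went into the plain sensor index, so this job does not cover it. Newer Elasticsearch releases also deprecate "interval" in the date_histogram group in favor of "fixed_interval" or "calendar_interval".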
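To make the job configuration more concrete, here is a small pure-Python sketch of the grouping it describes: documents are bucketed by hour (the 60m date histogram) and by node (the terms group), and the configured metrics are computed per bucket. This is an illustration of the idea only, not Elasticsearch's implementation, and the layout of real rollup documents differs. The function name `roll_up` and the second reading are hypothetical.

```python
from collections import defaultdict

HOUR_MS = 60 * 60 * 1000  # the job's 60m interval, in milliseconds


def roll_up(docs):
    """Group docs by (hour bucket, node) and compute the job's metrics."""
    buckets = defaultdict(list)
    for doc in docs:
        # Floor the epoch-millisecond timestamp to the start of its hour.
        bucket_start = doc["timestamp"] - doc["timestamp"] % HOUR_MS
        buckets[(bucket_start, doc["node"])].append(doc)

    summaries = []
    for (bucket_start, node), group in sorted(buckets.items()):
        temps = [d["temperature"] for d in group]
        volts = [d["voltage"] for d in group]
        summaries.append({
            "timestamp": bucket_start,
            "node": node,
            "doc_count": len(group),
            "temperature": {"min": min(temps), "max": max(temps), "sum": sum(temps)},
            "voltage": {"avg": sum(volts) / len(volts)},
        })
    return summaries


docs = [
    {"timestamp": 1516729294000, "temperature": 200, "voltage": 5.2, "node": "a"},
    # Hypothetical extra reading ten minutes later, in the same hour bucket.
    {"timestamp": 1516729894000, "temperature": 210, "voltage": 5.4, "node": "a"},
    {"timestamp": 1413729294000, "temperature": 201, "voltage": 5.9, "node": "a"},
]
result = roll_up(docs)
for summary in result:
    print(summary)
```

The three raw documents collapse into two summaries, one per hour and node combination. This is why a rollup index is usually much smaller than the indices it summarizes.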
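The cron parameter controls when and how often the job activates; the expression */30 * * * * ? fires every 30 seconds. Each time the cron schedule triggers, the job resumes rolling up from where it left off after the previous activation. A newly created job is in the stopped state, so start it with POST _rollup/job/sensor/_start before expecting any rolled-up data.

After the job has run and processed some data, we can search the rollup index with the query DSL.

GET /sensor_rollup/_rollup_search
{
  "size": 0,
  "aggregations": {
    "max_temperature": {
      "max": {
        "field": "temperature"
      }
    }
  }
}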
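For readers who call Elasticsearch from a script instead of the console, the following sketch shows one way the same rollup search could be prepared with Python's standard library. The helper `build_rollup_search` and the address `http://localhost:9200` are assumptions for illustration. The request is built but deliberately not sent, so the sketch runs without a cluster. It uses POST because the request carries a body; it is meant as the equivalent of the console's GET form.

```python
import json
import urllib.request


def build_rollup_search(base_url, rollup_index, field, metric="max"):
    """Build (but do not send) a _rollup_search request for one metric."""
    body = {
        "size": 0,  # only aggregations are requested, no individual hits
        "aggregations": {f"{metric}_{field}": {metric: {"field": field}}},
    }
    return urllib.request.Request(
        url=f"{base_url}/{rollup_index}/_rollup_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local cluster address; adjust for a real deployment.
request = build_rollup_search("http://localhost:9200", "sensor_rollup", "temperature")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would send it. The aggregation result should then appear under `aggregations.max_temperature` in the JSON response.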