Is there a way to make Elasticsearch return only one hit per matching field value?

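Note: updated to include Node.js client details. See the edit below.

I'm trying to avoid having to query Elasticsearch repeatedly to get the information I need.

Suppose I have a dataset made up of events in cities. Documents in the dataset might look like this: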

    { city: 'Berlin', event: 'Dance party', date: '2017-04-15' },
    { city: 'Seattle', event: 'Wine tasting', date: '2017-04-18' },
    { city: 'Berlin', event: 'Dance party', date: '2017-04-21' },
    { city: 'Hong Kong', event: 'Theater', date: '2017-04-25' }...

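Now suppose the full list of tracked cities is known, and I need the most recent event for each city. So I need to be able to pass a list of city names into the query, e.g. ['Berlin', 'Hong Kong', 'Seattle'], and get back only the bottom three events above.

My current query can only do this by being run repeatedly with a size of 1 and an exact match on the city name, like this: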

    {
      size: 1,
      body: {
        sort: [ {'date': {'order': 'desc'}} ],
        query: { 'match_phrase': {'city': 'Berlin'} }
      }
    }

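Is there a way to script this so that I can turn the whole list of cities into a single query and predictably get the latest entry for each city?

Edit

My new query looks like this: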

    {
      'query': { 'match_all': {} },
      '_source': ['city', 'event', 'date'],
      'aggs': {
        'cities': {
          'terms': { 'field': 'city', 'size': 100 },
          'aggs': {
            'top_cities': {
              'top_hits': { 'size': 1, '_source': 'event', 'sort': { 'date': 'desc' } }
            }
          }
        }
      }
    }

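This really looks like it should work. But I'm still missing a lot of cities that I know are there, and some cities show up multiple times.

I'm running this in Node using the elasticsearch-js package. The client is invoked like this: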

    let client = new elasticSearch.Client({
      "host": [ "host1:9200", "host2:9200", "host3:9200" ]
    });

    client.search(SEARCH_PARAMS)
      .then(function (resp) {
        console.log(JSON.stringify(resp));
      });

Here is a (sanitized) version of the resulting JSON:

 { "took": 77, "timed_out": false, "_shards": { "total": 42, "successful": 42, "failed": 0 }, "hits": { "total": 5685608, "max_score": 1, "hits": [{ "_index": "sanitized", "_type": "sanitized", "_id": "AVu489lVgqYk_9QxQb-U", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-15", "city": "Berlin" } }, { "_index": "sanitized", "_type": "sanitized", "_id": "AVu489lVgqYk_9QxQb-X", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-15", "city": "Berlin" } }, { "_index": "sanitized", "_type": "sanitized_variant_1", "_id": "AVu489lVgqYk_9QxQb-a", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "Berlin" } }, { "_index": "sanitized", "_type": "sanitized_variant_2", "_id": "AVu489lVgqYk_9QxQb-b", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "Berlin" } }, { "_index": "sanitized", "_type": "sanitized_variant_2", "_id": "AVu489lVgqYk_9QxQb-d", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "Hong Kong" } }, { "_index": "sanitized", "_type": "sanitized_variant_2", "_id": "AVu489lVgqYk_9QxQb-f", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "Hong Kong" } }, { "_index": "sanitized", "_type": "sanitized_variant_2", "_id": "AVu49AkKCe9swQD44WnN", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "Seattle" } }, { "_index": "sanitized", "_type": "sanitized_variant_2", "_id": "AVu49AkKCe9swQD44WnP", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "New York" } }, { "_index": "sanitized", "_type": "sanitized_variant_1", "_id": "AVu49AkKCe9swQD44WnY", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "Berlin" } }, { "_index": "sanitized", "_type": "sanitized_variant_2", "_id": "AVu49AkKCe9swQD44Wnb", "_score": 1, "_source": { "event": "Dance party", "date": "2017-04-29", "city": "Berlin" } }] } } 

On closer inspection, for some reason the aggregations are not being added to the resp object.
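One possible explanation, offered as a guess rather than a confirmed diagnosis: the first query above nests its search under a `body` key, while the new one has no such wrapper. The legacy elasticsearch-js client sends the `body` property as the request body, so if `SEARCH_PARAMS` is the bare query object, the aggregation may never reach the cluster. The ten unfiltered hits with full `_source` in the response would be consistent with that. A minimal sketch of the wrapping, where `toSearchParams` and the index name `events` are hypothetical:

```javascript
// Hypothetical helper: put the query DSL under `body`, as the first query above does.
function toSearchParams(index, queryDsl) {
  return { index: index, body: queryDsl };
}

// The aggregation query from the edit, with `size: 0` added
// to suppress the top-level hits.
const aggregationQuery = {
  size: 0,
  query: { match_all: {} },
  _source: ['city', 'event', 'date'],
  aggs: {
    cities: {
      terms: { field: 'city', size: 100 },
      aggs: {
        top_cities: {
          top_hits: { size: 1, _source: 'event', sort: { date: 'desc' } }
        }
      }
    }
  }
};

// 'events' is an assumed index name; substitute your own.
const SEARCH_PARAMS = toSearchParams('events', aggregationQuery);

// With a configured client, the buckets would then be read from resp.aggregations:
// client.search(SEARCH_PARAMS).then(function (resp) {
//   console.log(JSON.stringify(resp.aggregations.cities.buckets));
// });
```

If the aggregations still do not appear after this change, the cause lies elsewhere and this sketch does not apply.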

Apart from filtering on cities in the query, I'd suggest using a terms aggregation on the city field, with a top_hits sub-aggregation to retrieve the latest event for each city:

    {
      "size": 0,
      "query": { "match_all": {} },
      "aggs": {
        "cities": {
          "terms": { "field": "city", "size": 100 },
          "aggs": {
            "top_events": {
              "top_hits": { "size": 1, "_source": "event", "sort": { "date": "desc" } }
            }
          }
        }
      }
    }
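To connect this to the known city list from the question, here is a rough sketch that restricts the aggregation to those cities with a terms query and then flattens the buckets into one entry per city. The function names `buildLatestPerCityBody` and `latestEventByCity` are hypothetical, and the sketch assumes `city` is indexed as a non-analyzed (keyword) field; on an analyzed text field, both the terms query and the terms aggregation would work on individual tokens instead of whole city names.

```javascript
// Build a search body that asks for the single latest event per city.
// Assumption: `city` is a non-analyzed field, so values match exactly.
function buildLatestPerCityBody(cities) {
  return {
    size: 0, // no top-level hits; only the aggregation is needed
    query: { terms: { city: cities } },
    aggs: {
      cities: {
        terms: { field: 'city', size: cities.length },
        aggs: {
          top_events: {
            top_hits: {
              size: 1,
              _source: ['city', 'event', 'date'],
              sort: [{ date: { order: 'desc' } }]
            }
          }
        }
      }
    }
  };
}

// Flatten the aggregation part of a response into { cityName: latestEventSource }.
function latestEventByCity(resp) {
  const buckets =
    (resp.aggregations && resp.aggregations.cities && resp.aggregations.cities.buckets) || [];
  const result = {};
  for (const bucket of buckets) {
    const hit = bucket.top_events.hits.hits[0];
    if (hit) {
      result[bucket.key] = hit._source;
    }
  }
  return result;
}

// Hand-written sample in the shape of an aggregation response (not real output).
const sampleResponse = {
  aggregations: {
    cities: {
      buckets: [
        { key: 'Berlin', doc_count: 2, top_events: { hits: { hits: [
          { _source: { city: 'Berlin', event: 'Dance party', date: '2017-04-21' } }
        ] } } },
        { key: 'Seattle', doc_count: 1, top_events: { hits: { hits: [
          { _source: { city: 'Seattle', event: 'Wine tasting', date: '2017-04-18' } }
        ] } } }
      ]
    }
  }
};

const latestPerCityBody = buildLatestPerCityBody(['Berlin', 'Hong Kong', 'Seattle']);
const latestPerCity = latestEventByCity(sampleResponse);
```

With the legacy elasticsearch-js client, `latestPerCityBody` would be passed under the `body` key of the search parameters. A city with no documents produces no bucket, so it is simply absent from `latestPerCity`.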

You can use a terms query and pass in all of those cities, like this:

    {
      "query": { "terms": { "city": [ "BERLIN", "RIO DE JANEIRO" ] } },
      "size": 3,
      "_source": "city",
      "sort": [ { "date": { "order": "desc" } } ]
    }