Elasticsearch: aggregate on part of a string, not the whole string

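Basically, what I'm trying to do here is get the categories down to the second level from a hierarchically stored string. The problem is that the depth of the hierarchy varies: one product category may have six levels and another only four. Otherwise I would have just implemented predefined levels.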

I have some products with categories like this:

    [
      {
        title: 'product one',
        categories: [ 'clothing/mens/shoes/boots/steel-toe' ]
      },
      {
        title: 'product two',
        categories: [ 'clothing/womens/tops/sweaters/open-neck' ]
      },
      {
        title: 'product three',
        categories: [ 'clothing/kids/shoes/sneakers/light-up' ]
      },
      {
        title: 'product etc.',
        categories: [ 'clothing/baby/bibs/super-hero' ]
      },
      ... more products
    ]

I am trying to get aggregation buckets like this:

    buckets: [
      { key: 'clothing/mens', ... },
      { key: 'clothing/womens', ... },
      { key: 'clothing/kids', ... },
      { key: 'clothing/baby', ... },
    ]

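I tried looking at prefix filters and at the include and exclude options for terms, but I couldn't find anything that works. Could someone please point me in the right direction?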

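Your category field should be analyzed with a custom analyzer. Maybe you have other plans for that field, so I'm just adding a sub-field that is used only for aggregations: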

    {
      "settings": {
        "analysis": {
          "filter": {
            "category_trimming": {
              "type": "pattern_capture",
              "preserve_original": false,
              "patterns": [
                "(^\\w+\/\\w+)"
              ]
            }
          },
          "analyzer": {
            "my_analyzer": {
              "tokenizer": "keyword",
              "filter": [
                "category_trimming",
                "lowercase"
              ]
            }
          }
        }
      },
      "mappings": {
        "test": {
          "properties": {
            "category": {
              "type": "string",
              "fields": {
                "just_for_aggregations": {
                  "type": "string",
                  "analyzer": "my_analyzer"
                }
              }
            }
          }
        }
      }
    }
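To make the effect of `my_analyzer` easier to see, here is a minimal Python sketch of the token it should produce for one field value. It is an approximation, not Elasticsearch code. It assumes the `keyword` tokenizer hands the whole string over as a single token, that `pattern_capture` emits the captured group, and that a token matching no pattern is kept unchanged. The function name `aggregation_token` is made up for this example, and Python's `re` module only stands in for the Java regex engine that Elasticsearch uses.

```python
import re

# Same pattern as in the "category_trimming" filter: two word-character
# segments separated by a slash, anchored at the start of the string.
# re.ASCII keeps \w limited to [A-Za-z0-9_], like Java's default \w.
FIRST_TWO_LEVELS = re.compile(r"(^\w+/\w+)", re.ASCII)


def aggregation_token(category: str) -> str:
    """Approximate the single token stored in the aggregation sub-field."""
    # keyword tokenizer: the whole value is one token.
    token = category
    # pattern_capture: keep only the captured group when the pattern matches;
    # assumption: a non-matching token passes through unchanged.
    match = FIRST_TWO_LEVELS.search(token)
    if match:
        token = match.group(1)
    # lowercase filter.
    return token.lower()


if __name__ == "__main__":
    for value in (
        "clothing/mens/shoes/boots/steel-toe",
        "clothing/baby/bibs/super-hero",
    ):
        print(value, "->", aggregation_token(value))
```

Note that `\w` does not match a hyphen, so a value whose first or second segment contains one (for example `home-goods/kitchen`) would not be trimmed to exactly two levels by this pattern. A character class such as `[^/]+` in place of `\w+` would be one way to handle that if it applies to your data.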

Test data:

    POST /index/test/_bulk
    {"index":{}}
    {"category": "clothing/womens/tops/sweaters/open-neck"}
    {"index":{}}
    {"category": "clothing/mens/shoes/boots/steel-toe"}
    {"index":{}}
    {"category": "clothing/kids/shoes/sneakers/light-up"}
    {"index":{}}
    {"category": "clothing/baby/bibs/super-hero"}

The query itself:

    GET /index/test/_search?search_type=count
    {
      "aggs": {
        "by_category": {
          "terms": {
            "field": "category.just_for_aggregations",
            "size": 10
          }
        }
      }
    }
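As a sanity check on what this terms aggregation groups by, the following Python sketch counts the four test documents by their trimmed category. It is only a local simulation under stated assumptions: the helper names `bucket_key` and `terms_buckets` are hypothetical, the regex stands in for the `category_trimming` filter, and the ordering (highest `doc_count` first, ties broken by key) is assumed to mirror the default terms ordering.

```python
import re
from collections import Counter

# The four values indexed in the test data.
CATEGORIES = [
    "clothing/womens/tops/sweaters/open-neck",
    "clothing/mens/shoes/boots/steel-toe",
    "clothing/kids/shoes/sneakers/light-up",
    "clothing/baby/bibs/super-hero",
]

# Stand-in for the pattern_capture filter: first two slash-separated levels.
FIRST_TWO_LEVELS = re.compile(r"^\w+/\w+", re.ASCII)


def bucket_key(category: str) -> str:
    """Trim a category to its first two levels and lowercase it."""
    match = FIRST_TWO_LEVELS.match(category)
    # Assumption: values that do not match are kept whole.
    return (match.group(0) if match else category).lower()


def terms_buckets(categories, size=10):
    """Group values by bucket key and return at most `size` buckets."""
    counts = Counter(bucket_key(c) for c in categories)
    # Highest doc_count first, ties broken by key in ascending order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered[:size]]


if __name__ == "__main__":
    for bucket in terms_buckets(CATEGORIES):
        print(bucket)
```

Each test document falls into its own two-level bucket here, so every bucket has a `doc_count` of 1.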

Result:

    "aggregations": {
      "by_category": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          { "key": "clothing/baby", "doc_count": 1 },
          { "key": "clothing/kids", "doc_count": 1 },
          { "key": "clothing/mens", "doc_count": 1 },
          { "key": "clothing/womens", "doc_count": 1 }
        ]
      }
    }