Make Elasticsearch return only certain fields?

434

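I'm using Elasticsearch to index my documents.

Is it possible to instruct it to return only particular fields instead of the entire JSON document it has stored?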

elasticsearch

— user1199438

1

elastic.co/guide/en/elasticsearch/reference/current/... Note that you can also exclude only certain fields.

— 克里斯托夫•鲁西

619

Yes, use a source filter. If you are searching with JSON, it will look something like this:

{
    "_source": ["user", "message", ...],
    "query": ...,
    "size": ...
}

In ES 2.4 and earlier, you could also use the fields option of the search API:

{
    "fields": ["user", "message", ...],
    "query": ...,
    "size": ...
}

This is deprecated in ES 5+. Source filters are more powerful anyway!
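As a rough illustration of the `_source` request body above, here is a minimal Python sketch that builds such a search request with the standard library only. The host `http://localhost:9200`, the index name `my_index`, the `match_all` fallback query, and the helper name `build_search_request` are assumptions for illustration, not part of the original answer. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_search_request(base_url, index, fields, query=None, size=10):
    """Build (but do not send) a _search request that returns only `fields`."""
    body = {
        # Source filter: only these fields come back inside each hit's _source.
        "_source": list(fields),
        # Assumption: fall back to match_all when the caller gives no query.
        "query": query if query is not None else {"match_all": {}},
        "size": size,
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and index; "user" and "message" come from the example above.
request = build_search_request("http://localhost:9200", "my_index", ["user", "message"])
print(request.full_url)
print(request.data.decode("utf-8"))
```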

— 凯文斯纳

12

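Make sure to define them as "stored": true in the mapping. Otherwise ES will still load the _source document and read the fields from there. This may hurt performance if the returned data is small relative to the size of the whole document.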

— Zaar Hai

6

You mean "store": true

— sscarduzio

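Are these set in the conf file, or where exactly?

— vbNewbie, 2014

@vbNewbie: Wherever you define the mapping. If you are not defining the mapping explicitly and are relying on ES to generate it, you will have to define a mapping for the fields you want ES to store. You can define the mapping only for the fields that need special behavior (e.g. "store": true, "index": "not_analyzed") or for all fields. See the mapping docs for more details.

— Sangharsh, 2014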
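To make the comment above concrete, here is a hedged sketch of an index-creation body that marks individual fields as stored. It assumes the typeless mapping layout of Elasticsearch 7 and later (older versions nest `properties` under a type name, and `not_analyzed` belongs to pre-5.x string fields). The field names `title`, `created`, and `content` and the helper `build_index_body` are hypothetical.

```python
import json


def build_index_body(stored_fields, other_fields):
    """Return an index-creation body in which only `stored_fields` set "store": true."""
    properties = {}
    for name, field_type in stored_fields.items():
        # Stored separately from _source, so it can be fetched via stored_fields.
        properties[name] = {"type": field_type, "store": True}
    for name, field_type in other_fields.items():
        # Default behaviour: the value lives only inside _source.
        properties[name] = {"type": field_type}
    return {"mappings": {"properties": properties}}


# Hypothetical fields: the small ones are stored, the large one is not.
index_body = build_index_body(
    stored_fields={"title": "text", "created": "date"},
    other_fields={"content": "text"},
)
print(json.dumps(index_body, indent=2))
```

The resulting JSON would be sent as the body of the request that creates the index.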

3

fields is no longer supported in newer versions. Use stored_fields instead :)

— Sachin Sharma

88

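I found the docs for the get API to be helpful, especially the two sections Source filtering and Fields: https://www.elastic.co/guide/en/elasticsearch/reference/7.3/docs-get.html#get-source-filtering

They state about source filtering:

If you only need one or two fields from the complete _source, you can use the _source_include & _source_exclude parameters to include or filter out the parts you need. This can be especially helpful with large documents where partial retrieval can save on network overhead.

This fitted my use case perfectly. I ended up simply filtering the source like so (using the shorthand):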

{
    "_source": ["field_x", ..., "field_y"],
    "query": {      
        ...
    }
}

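FYI, they state in the docs about the fields parameter:

The get operation allows specifying a set of stored fields that will be returned by passing the fields parameter.

It seems to cater for fields that have been specifically stored, and it places each field in an array. If the specified fields haven't been stored, it fetches each one from the _source, which could result in "slower" retrievals. I also had trouble getting it to return fields of type object.

So in summary, you have two options: source filtering or [stored] fields.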

— 马库斯·库切

Did the trick for me. I had a problem returning a geo_point using "fields", but "_source" works just fine, thanks!

— Yonnaled '16

23

For ES versions 5.X and above you can use an ES query something like this:

    GET /.../...
    {
      "_source": {
        "includes": [ "FIELD1", "FIELD2", "FIELD3", ... ]
      },
      .
      .
      .
      .
    }

— 佩内什·夏尔马

12

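In Elasticsearch 5.x the approach mentioned above is deprecated. You can use the _source approach, but in certain situations it can make sense to store a field. For instance, if you have a document with a title, a date, and a very large content field, you may want to retrieve just the title and the date without having to extract them from a large _source field.

In this case, you'd use: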

{  
   "size": $INT_NUM_OF_DOCS_TO_RETURN,
   "stored_fields":[  
      "doc.headline",
      "doc.text",
      "doc.timestamp_utc"
   ],
   "query":{  
      "bool":{  
         "must":{  
            "term":{  
               "doc.topic":"news_on_things"
            }
         },
         "filter":{  
            "range":{  
               "doc.timestamp_utc":{  
                  "gte":1451606400000,
                  "lt":1483228800000,
                  "format":"epoch_millis"
               }
            }
         }
      }
   },
   "aggs":{  

   }
}

See the documentation on how to index stored fields. Always happy for an upvote!
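When `stored_fields` is used, each hit by default carries a `fields` object rather than `_source`, and every value arrives wrapped in an array. The sketch below unwraps single-element arrays on the client side. The sample hit is made up for illustration, and the helper name `unwrap_stored_fields` is an assumption.

```python
def unwrap_stored_fields(hit):
    """Turn {"fields": {"a": [1], "b": [2, 3]}} into {"a": 1, "b": [2, 3]}."""
    unwrapped = {}
    for name, values in hit.get("fields", {}).items():
        # Single values are unwrapped; genuinely multi-valued fields stay lists.
        unwrapped[name] = values[0] if len(values) == 1 else values
    return unwrapped


# Made-up hit shaped like one entry of a stored_fields response.
sample_hit = {
    "_index": "news",
    "_id": "1",
    "fields": {
        "doc.headline": ["Example headline"],
        "doc.timestamp_utc": [1451606400000],
    },
}
print(unwrap_stored_fields(sample_hit))
```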

— 狼

7

Here you can specify which fields you want in the output and also which ones you don't:

  POST index_name/_search
    {
        "_source": {
            "includes": [ "field_name", "field_name" ],
            "excludes": [ "field_name" ]
        },
        "query" : {
            "match" : { "field_name" : "value" }
        }
    }

— 高拉夫

7

response_filtering

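All REST APIs accept a filter_path parameter that can be used to reduce the response returned by Elasticsearch. This parameter takes a comma-separated list of filters expressed in dot notation.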

https://stackoverflow.com/a/35647027/844700
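As a small sketch of how `filter_path` might be attached to a search URL, the snippet below joins dot-notation filters with commas and appends them as a query parameter. The host, the index name `my_index`, and the helper `search_url_with_filter_path` are assumptions for illustration.

```python
from urllib.parse import urlencode


def search_url_with_filter_path(base_url, index, filters):
    """Build a _search URL whose response is trimmed by filter_path."""
    # filter_path takes a comma-separated list of dot-notation filters.
    params = {"filter_path": ",".join(filters)}
    # Keep commas, dots and wildcards readable instead of percent-encoding them.
    return f"{base_url}/{index}/_search?{urlencode(params, safe=',.*')}"


# Hypothetical host and index; keep only each hit's _id and _source.
url = search_url_with_filter_path(
    "http://localhost:9200", "my_index", ["hits.hits._id", "hits.hits._source"]
)
print(url)
```

Note that `filter_path` trims the response envelope as a whole, so it can be combined with `_source` filtering, which trims the document inside each hit.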

— 德姆兹

6

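Here is another solution, this time using a match expression.

Source filtering
Allows control over how the _source field is returned with every hit.

Tested with Elasticsearch 5.5.

The keyword "includes" defines the specific fields to return.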

GET /my_indice/my_indice_type/_search
{
    "_source": {
        "includes": ["my_especific_field"]
    },
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "_id": "%my_id_here_without_percent%"
                    }
                }
            ]
        }
    }
}

— 织物

5

A REST API GET request can be made with the "_source" parameter.

Example request

http://localhost:9200/opt_pr/_search?q=SYMBOL:ITC AND OPTION_TYPE=CE AND TRADE_DATE=2017-02-10 AND EXPIRY_DATE=2017-02-23&_source=STRIKE_PRICE

Response

{
"took": 59,
"timed_out": false,
"_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
},
"hits": {
    "total": 104,
    "max_score": 7.3908954,
    "hits": [
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLc",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 160
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLh",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 185
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLi",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 190
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLm",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 210
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLp",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 225
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLr",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 235
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLw",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 260
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uL5",
            "_score": 7.3908954,
            "_source": {
                "STRIKE_PRICE": 305
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLd",
            "_score": 7.381078,
            "_source": {
                "STRIKE_PRICE": 165
            }
        },
        {
            "_index": "opt_pr",
            "_type": "opt_pr_r",
            "_id": "AV3K4QTgNHl15Mv30uLy",
            "_score": 7.381078,
            "_source": {
                "STRIKE_PRICE": 270
            }
        }
    ]
}

}
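The request URL above contains raw spaces, which many HTTP clients will not accept unencoded. The sketch below shows one way to build an equivalent URI search with the standard library. It uses a shortened version of the query, written in `field:value` form, and the host and the helper name `uri_search_url` are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def uri_search_url(base_url, index, query_string, source_fields):
    """Build a URI search (q=...) URL that returns only `source_fields`."""
    params = {
        "q": query_string,                   # query string syntax
        "_source": ",".join(source_fields),  # comma-separated source filter
    }
    # urlencode escapes spaces and colons so the URL can be sent as-is.
    return f"{base_url}/{index}/_search?{urlencode(params)}"


# Index and field names come from the example above; the query is shortened.
url = uri_search_url(
    "http://localhost:9200", "opt_pr", "SYMBOL:ITC AND OPTION_TYPE:CE", ["STRIKE_PRICE"]
)
print(url)
# Decoding the query string again recovers the original parameters.
print(parse_qs(urlsplit(url).query))
```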

— 铁卢卡

This was really useful for me.

— 素提卡·印杜尼尔

4

Yes, you can accomplish this with a source filter. Here is the doc: source filtering

Example request

POST index_name/_search
{
  "_source": ["field1", "field2", ...]
}

The output will be

{
  "took": 57,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "index_name",
        "_type": "index1",
        "_id": "1",
        "_score": 1,
        "_source": {
          "field1": "a",
          "field2": "b"
        }
      },
      {
        ...
        "_source": {
          "field1": "c",
          "field2": "d"
        }
      },
      ....
    ]
  }
}

— RCP

2

In Java you can use setFetchSource like this:

client.prepareSearch(index).setTypes(type)
            .setFetchSource(new String[] { "field1", "field2" }, null)

— 用户名

2

For example, say you have a doc with three fields:

PUT movie/_doc/1
{
  "name":"The Lion King",
  "language":"English",
  "score":"9.3"
}

If you want to return name and score, you can use the following command:

GET movie/_doc/1?_source_includes=name,score

If you want to get the fields that match a pattern:

GET movie/_doc/1?_source_includes=*re

Or exclude some fields:

GET movie/_doc/1?_source_excludes=score
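To see what these include and exclude patterns select, here is a small client-side sketch that mimics the filtering on a flat document with Python's `fnmatch`. It is only an approximation for illustration: Elasticsearch does the filtering on the server and also handles nested paths. The helper name `filter_source` is hypothetical.

```python
from fnmatch import fnmatchcase


def filter_source(doc, includes=None, excludes=None):
    """Approximate _source_includes/_source_excludes on a flat dict."""
    result = {}
    for name, value in doc.items():
        # With no include patterns, every field is a candidate.
        included = not includes or any(fnmatchcase(name, p) for p in includes)
        excluded = bool(excludes) and any(fnmatchcase(name, p) for p in excludes)
        if included and not excluded:
            result[name] = value
    return result


movie = {"name": "The Lion King", "language": "English", "score": "9.3"}

print(filter_source(movie, includes=["name", "score"]))  # name and score
print(filter_source(movie, includes=["*re"]))            # only "score" ends in "re"
print(filter_source(movie, excludes=["score"]))          # name and language
```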

— 潘瑶

0

Using the Java API, I use the following to get all records with a specific set of fields:

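import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;

public List<Map<String, Object>> getAllDocs(String indexName) throws IOException {
    int scrollSize = 1000;
    TimeValue keepAlive = new TimeValue(60000);  // keep the scroll context alive for one minute
    List<Map<String, Object>> data = new ArrayList<>();
    // First page: open a scroll so that later requests continue where this one stopped.
    SearchResponse response = client.prepareSearch(indexName)
        .setTypes("typeName")  // The document types to execute the search against. Defaults to be executed against all types.
        .setQuery(QueryBuilders.matchAllQuery())
        .setFetchSource(new String[]{"field1", "field2"}, null)
        .setScroll(keepAlive)
        .setSize(scrollSize)
        .execute()
        .actionGet();
    while (response.getHits().getHits().length != 0) {
        for (SearchHit hit : response.getHits()) {
            data.add(hit.getSourceAsMap());  // older clients call this getSource()
        }
        // Fetch the next page; without this the loop would repeat the first page forever.
        response = client.prepareSearchScroll(response.getScrollId())
            .setScroll(keepAlive)
            .execute()
            .actionGet();
    }
    return data;
}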

— 土井