Connection timeout with Elasticsearch

Question 1

from datetime import datetime
from elasticsearch import Elasticsearch
es = Elasticsearch()

doc = {
    'author': 'kimchy',
    'text': 'Elasticsearch: cool. bonsai cool.',
    'timestamp': datetime(2010, 10, 10, 10, 10, 10)
}
res = es.index(index="test-index", doc_type='tweet', id=1, body=doc)
print(res['created'])

This simple code returns the following error:

elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))

This is very strange, because the server is up and running (http://localhost:9200/ returns some JSON).

Question 2

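By default, the timeout is set to 10 seconds. To change the global timeout, pass timeout=your-time when creating the Elasticsearch object.

If you have already created the object without specifying a timeout, you can set the timeout for a single request by passing request_timeout=your-time in the query: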

es.search(index="my_index",
          doc_type="document",
          body=get_req_body(),
          request_timeout=30)

Question 3

The connection timeout problem can also occur if you are using Amazon Elasticsearch Service.

es = Elasticsearch([{'host': 'xxxxxx.us-east-1.es.amazonaws.com', 'port': 443,  'use_ssl': True}])

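In the Python code above, replacing the default port 9200 with 443 and setting use_ssl to True resolves the issue.

If no port is specified, the client tries to connect to port 9200 on the given host and fails after the timeout.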
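
To tell a wrong port apart from a slow server, it can help to probe the TCP port directly before involving the Elasticsearch client. The following is a minimal sketch using only the standard library; the helper name can_connect is hypothetical and not part of any Elasticsearch API.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and socket timeouts.
        return False
```

For example, can_connect('xxxxxx.us-east-1.es.amazonaws.com', 9200) returning False while port 443 returns True would point to the port problem described above rather than to a slow query.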

Question 4

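This has nothing to do with increasing the timeout to 30 seconds. Do people really think Elasticsearch should need up to 30 seconds to return a single hit?

I fixed this problem by going to config/elasticsearch.yml and uncommenting the following: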

http.port: 9200
network.host: 'localhost'

Setting network.host to 192.168.0.1 might also work, but I just changed it to 'localhost'.

Question 5

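Note that one common cause of timeouts when calling es.search (or es.index) is a large query. For example, on my fairly large ES index (> 3M documents), searching with a 30-word query took about 2 seconds, while searching with a 400-word query took 18 seconds. So for a sufficiently large query, even timeout=30 won't save you. A simple solution is to crop the query to a size that can be answered within the timeout.

If the cause is traffic, increasing the timeout or retrying on timeout will help; otherwise, query size may be your culprit.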
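
Cropping the query as suggested above could look roughly like this. It is only a sketch: the helper name crop_query and the 30-word default are assumptions chosen for illustration, and a suitable limit depends on your index and timeout.

```python
def crop_query(text, max_words=30):
    """Keep only the first `max_words` whitespace-separated words of a query string."""
    words = text.split()
    # Slicing past the end is safe, so short queries pass through unchanged.
    return " ".join(words[:max_words])
```

The cropped string would then be placed into the request body in place of the full query text.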

Question 6

elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10)) means the request did not finish within the specified time (by default, timeout=10).

This will work with a 30-second timeout:

res = es.index(index="test-index", doc_type='tweet', id=1, body=doc, timeout=30)

Question 7

Try setting the timeout when initializing Elasticsearch:

es = Elasticsearch([{'host': HOST_ADDRESS, 'port': THE_PORT}], timeout=30)

You can even set retry_on_timeout to True and give max_retries an optional number:

es = Elasticsearch([{'host': HOST_ADDRESS, 'port': THE_PORT}], timeout=30, max_retries=10, retry_on_timeout=True)
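
The retry_on_timeout and max_retries options above handle retries inside the client. If you also want to wait between attempts at the application level, a generic wrapper is one option. The sketch below is not part of the Elasticsearch client; call_with_retries and its parameters are hypothetical names, and it retries on the built-in TimeoutError by default so that it runs without any third-party package.

```python
import time


def call_with_retries(func, retry_on=(TimeoutError,), max_retries=3,
                      base_delay=0.5, sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry up to max_retries times."""
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= max_retries:
                raise  # out of retries: surface the last error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With the client in use, one would presumably pass retry_on=(elasticsearch.exceptions.ConnectionTimeout,), the exception class named in the error message, and wrap the call as call_with_retries(lambda: es.search(...)).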

Question 8

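My own problem was solved with (timeout = 10000), a limit that was practically never reached: the server held only 7,000 entries, but it had heavy traffic and its resources were being hogged, which is why the connection kept dropping.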

Question 9

There can be many reasons for a timeout, so it is worth checking the logs on the Elasticsearch side (logs/elasticsearch.log) to see the detailed error. In our case, the error on ES was:

primary shard is not active Timeout: [1m]

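As described in a related post, this was because our disk was full. We had resized the disk (and the partition) a day earlier to take care of that, but ES needs to be restarted once the high/low watermark has been hit (we are on 5.5.x), and we had not done that.

Simply restarting ES in production resolved the issue for us.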
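
To check whether a data disk is near those watermarks, a rough local test can be sketched as follows. It assumes the cluster uses the default thresholds for cluster.routing.allocation.disk.watermark.low (85%) and cluster.routing.allocation.disk.watermark.high (90%); if your cluster overrides them, adjust the constants. The function names are hypothetical, and the usage figure only approximates what Elasticsearch itself computes.

```python
import shutil

# Assumed defaults; a cluster may override these settings.
LOW_WATERMARK = 0.85
HIGH_WATERMARK = 0.90


def watermark_status(used_fraction, low=LOW_WATERMARK, high=HIGH_WATERMARK):
    """Classify a disk usage fraction (0.0 to 1.0) against the two watermarks."""
    if used_fraction >= high:
        return "above high watermark"
    if used_fraction >= low:
        return "above low watermark"
    return "ok"


def disk_used_fraction(path):
    """Approximate the used fraction of the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    # Based on free space, so reserved blocks count as used.
    return 1.0 - usage.free / usage.total
```

Running watermark_status(disk_used_fraction(path)) on the Elasticsearch data directory gives a quick hint whether disk space is a plausible cause before digging further into the logs.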

Question 10

Two options can help:

1: Increase the timeout

Setting a timeout solved this problem for me. Note that newer versions need a unit, e.g. timeout="60s":

es.index(index=index_name, doc_type="domains", id=domain.id, body=body, timeout="60s")

Without a unit, for example with timeout=60, you will get

elasticsearch.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'failed to parse setting [timeout] with value [60] as a time value: unit is missing or unrecognized')

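2: Reduce the text length

It also helps to reduce the text length, e.g. by cutting long texts, so that Elasticsearch can store the text faster, which also avoids timeouts: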

es.index(index=index_name, doc_type="domains", id=domain.id, body=text[:5000], timeout="60s")
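
For option 1, a small helper can guard against the missing-unit error by appending a unit to bare numbers before the value is passed as timeout. This is only a sketch; as_time_value is a hypothetical name, and it handles whole numbers only.

```python
def as_time_value(timeout, unit="s"):
    """Return a time value string such as "60s" for a whole-number timeout.

    Values that already carry a unit (e.g. "2m") are returned unchanged.
    """
    text = str(timeout).strip()
    # A bare number has no unit, which is what triggers the 400 error above.
    if text.isdigit():
        return text + unit
    return text
```

With it, the call from option 1 could pass timeout=as_time_value(60), which yields "60s".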