I twice attempted to import over 140 million documents into a local, single-node Elasticsearch 6.8 cluster using a command like the following. This was with esbulk 0.5.1; I will retry with the latest 0.6.0.
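A sketch of such an invocation (the index name, worker count, batch size, and input file here are hypothetical, not the originals):

```sh
# Hypothetical reconstruction: the index name, worker count, batch
# size, and input file are illustrative, not the original command.
esbulk -server http://localhost:9200 -index documents -w 4 -size 1000 -verbose documents.ldj
```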
The indexing almost completed, but after more than 100 million documents it failed with an error like:
```
2020/01/31 11:49:40 Post http://localhost:9200/_bulk: net/http: HTTP/1.x transport connection broken: write tcp [::1]:56970->[::1]:9200: write: connection reset by peer
Warning: unable to close filehandle properly: Broken pipe during global destruction
```
(the "Warning" part might be one of the other pipeline commands)
I suspect this is actually a problem on the Elasticsearch side, maybe something like a GC pause? I looked in the ES logs and saw garbage collections up until the time of failure, and none after, but no particularly large or noticeable GC right around the failure.
I would have expected esbulk's HTTP retries to resolve any such transient issues; I assume that in this case all of the retries failed. Perhaps longer, more numerous, or exponentially increasing back-offs would help (see the sketch below). Unfortunately, I suspect this failure may be difficult to reproduce reliably, as it has only occurred with these very large imports.
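As a sketch only, not esbulk's actual retry implementation (the attempt budget, base delay, and payload are assumptions), exponential back-off around the bulk POST might look like:

```go
// Not esbulk's actual retry code: a minimal sketch of exponential
// back-off around a _bulk POST, assuming a fixed attempt budget.
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"time"
)

// postWithBackoff retries a _bulk request, doubling the wait after
// each failure. maxRetries and the one-second base delay are
// illustrative values, not esbulk defaults.
func postWithBackoff(url string, payload []byte, maxRetries int) error {
	delay := time.Second
	for attempt := 1; attempt <= maxRetries; attempt++ {
		resp, err := http.Post(url, "application/x-ndjson", bytes.NewReader(payload))
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return nil // request accepted
			}
			err = fmt.Errorf("unexpected status: %s", resp.Status)
		}
		log.Printf("attempt %d/%d failed: %v; retrying in %v", attempt, maxRetries, err, delay)
		time.Sleep(delay)
		delay *= 2 // exponential back-off
	}
	return fmt.Errorf("giving up after %d attempts", maxRetries)
}

func main() {
	// Two-line bulk body: action metadata, then the document itself
	// (illustrative; real bulk bodies would carry many documents).
	payload := []byte("{\"index\":{\"_index\":\"documents\"}}\n{\"field\":\"value\"}\n")
	if err := postWithBackoff("http://localhost:9200/_bulk", payload, 5); err != nil {
		log.Fatal(err)
	}
}
```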
esbulk has been really useful; thank you for making it available and for any maintenance time you can spare!
As a follow-up on this issue: if I recall correctly, the root cause was individual batches that were too large in bytes (not in number of documents), which Elasticsearch refused. I worked around this by decreasing the batch size.
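For example (with hypothetical values), halving esbulk's `-size` roughly halves the byte size of each `_bulk` request:

```sh
# Hypothetical workaround: a smaller -size means fewer documents,
# and therefore fewer bytes, per _bulk request.
esbulk -server http://localhost:9200 -index documents -size 500 documents.ldj
```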