ElasticSearch: Large data initial sync failing - Response Timeout #540

Open
niranjans opened this issue Oct 3, 2016 · 4 comments

Comments

@niranjans

I have a fairly large dataset (about 300k records) in a collection using the ElasticSearch engine. The initial startup of the app triggers a sync of the entire dataset into the ES index, and this is failing for me. Some records get stored, but most of the time I get a response timeout (even after increasing the timeout to 120 seconds).

Is there any recommendation for this scenario? Is there a way to slow down the initial sync process?

Thanks

@matteodem
Owner

Hi niranjans, that shouldn't be a problem. How big are your documents, generally?

@niranjans
Author

niranjans commented Oct 4, 2016

Thanks for your response @matteodem.

The documents are very basic (generated with Mockaroo - sample below).
I am getting random responses / errors - most of the time the sync doesn't go through.

Elasticsearch is hosted on Compose.io with enough memory and disk space (it is not running on localhost just for testing).

With logging set to 'trace', here are some of the things that happen:

Regular POSTs being sent. Note that most of them show no response - only rarely does one return status 200, which is why most of them time out later:
[screenshot-1]

Responses with code 0. I am not sure what these mean:
[screenshot-2]

Socket hang up errors:
[screenshot-3]

And finally, the timeouts:
[screenshot-4]

I do get some successful writes and 200 responses occasionally, but it's very sporadic. Any idea what might be happening?

@niranjans
Author

An update on this issue (it might help someone else who runs into it):

My ElasticSearch cluster is on Compose.io, and it looks like the bottleneck is that the sync pushes documents much faster than the cluster can handle (even after increasing the size of the cluster).

When I add a Meteor._sleepForMs() call inside the client.defer function, it slows the whole thing down to a manageable rate, and after a long time the data did eventually get synced (see the sketch below).
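
Here is a minimal sketch of the throttling idea, for anyone hitting the same wall. It does not reproduce the actual structure of engine.js; `client`, `Players`, the index name, and the 50 ms delay are all illustrative placeholders, not the package's real code:

```js
// Sketch only: throttled re-indexing on the Meteor server, assuming `client`
// is an already-connected elasticsearch client and `Players` is the Mongo
// collection being indexed (both names are placeholders).
Players.find().forEach((doc) => {
  client.index({
    index: 'players',   // hypothetical index name
    type: 'default',
    id: doc._id,
    body: doc,
  }, (err) => {
    if (err) {
      console.error('Indexing failed for', doc._id, err);
    }
  });

  // Throttle: pause the fiber between documents so they aren't pushed
  // faster than the cluster can accept. 50 ms is an arbitrary example.
  Meteor._sleepForMs(50);
});
```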

However, my follow-up question is: does the entire index sync start from the beginning every time the app restarts? Is there a way to manage this? In other words, can we tell it not to sync everything again (since the full sync has already completed once, after several hours) and only observe changes as they happen?

@matteodem
Owner

That makes sense, indeed. Right now there isn't, but that logic could be added fairly easily. Since the engine itself defines behaviour like this, it's well encapsulated:

https://github.com/matteodem/meteor-easy-search/blob/master/packages/easysearch:elasticsearch/lib/engine.js
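
Purely as an illustration of what that could look like from the user's side (this option does not exist today; the engine in engine.js would need to be extended to support it), something along these lines:

```js
// Hypothetical sketch: a flag that skips the full re-index on app startup
// and relies only on the change observer. `skipInitialIndex` is NOT an
// existing easy-search option, and `Players` is a placeholder collection.
const playersIndex = new EasySearch.Index({
  collection: Players,
  fields: ['first_name', 'last_name'],
  engine: new EasySearch.ElasticSearch({
    skipInitialIndex: true, // would need to be implemented in the engine
  }),
});
```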
