ElasticSearch: Large data initial sync failing - Response Timeout #540

Open
niranjans opened this issue Oct 3, 2016 · 4 comments

Comments

@niranjans

I have a fairly large dataset (about 300k records) in a collection using the ElasticSearch engine. The initial startup of the app triggers a sync of the entire dataset into the ES index, and this is failing for me. Some records get stored, but most of the time I get a response timeout (even after increasing the timeout to 120 seconds).

Is there any recommendation for this scenario? Is there a way to slow down the initial sync process?

Thanks

@matteodem
Owner

Hi niranjans, that shouldn't be a problem. How big are your documents, generally?

@niranjans
Author

niranjans commented Oct 4, 2016

Thanks for your response @matteodem.

The documents are very basic (generated with Mockaroo - sample below).
I am getting random responses / errors - most of the time the sync doesn't go through.

Elasticsearch is hosted on Compose.io with enough memory and disk space (it is not running on localhost just for testing).

With logging set to 'trace', here are some of the things that happen:

Regular POSTs being sent. Note that most of them show no response - only rarely does one return status 200, which is why most of them time out later:
[screenshot-1]

Responses with code 0. I am not sure what these mean:
[screenshot-2]

Socket hang up errors:
[screenshot-3]

And finally, the timeouts:
[screenshot-4]

I do get some successful writes and 200 responses occasionally, but it's very sporadic. Any idea what might be happening?

@niranjans
Author

An update on this issue (it might help someone else who runs into it):

My ElasticSearch cluster is on Compose.io, and it looks like the bottleneck is that the sync pushes documents much faster than the cluster can handle (even after increasing the size of the cluster).

When I add a Meteor._sleepForMs() call inside the client.defer function, it slows the whole thing down to a manageable rate, and after a long time the data did eventually get synced (see the sketch below).
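
Here is a minimal sketch of the throttling idea, for anyone hitting the same wall. It does not reproduce the actual structure of engine.js; `client`, `Players`, the index name, and the 50 ms delay are all illustrative placeholders, not the package's real code:

```js
// Sketch only: throttled re-indexing on the Meteor server, assuming `client`
// is an already-connected elasticsearch client and `Players` is the Mongo
// collection being indexed (both names are placeholders).
Players.find().forEach((doc) => {
  client.index({
    index: 'players',   // hypothetical index name
    type: 'default',
    id: doc._id,
    body: doc,
  }, (err) => {
    if (err) {
      console.error('Indexing failed for', doc._id, err);
    }
  });

  // Throttle: pause the fiber between documents so they aren't pushed
  // faster than the cluster can accept. 50 ms is an arbitrary example.
  Meteor._sleepForMs(50);
});
```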

However, my follow-up question is: does the entire index sync start from the beginning every time the app restarts? Is there a way to manage this? In other words, can we tell it not to sync everything again (since the full sync has already completed once, after several hours) and only observe changes as they happen?

@matteodem
Owner

That makes sense, indeed. Right now there isn't, but that logic could be added fairly easily. Since the engine itself defines behaviour like this, it's well encapsulated:

https://github.com/matteodem/meteor-easy-search/blob/master/packages/easysearch:elasticsearch/lib/engine.js
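
Purely as an illustration of what that could look like from the user's side (this option does not exist today; the engine in engine.js would need to be extended to support it), something along these lines:

```js
// Hypothetical sketch: a flag that skips the full re-index on app startup
// and relies only on the change observer. `skipInitialIndex` is NOT an
// existing easy-search option, and `Players` is a placeholder collection.
const playersIndex = new EasySearch.Index({
  collection: Players,
  fields: ['first_name', 'last_name'],
  engine: new EasySearch.ElasticSearch({
    skipInitialIndex: true, // would need to be implemented in the engine
  }),
});
```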
