micro_synteny_search exceeding redis maxclients limit #610
Quick follow-up; this may be a total coincidence, but I happened to notice this startup info in the logs for the medicago_redis_1 container:
OK, yes, that was indeed it; after …
Just to be clear: at first you thought the limit in the error message was the number of concurrent Redis connections (as in gcv-docker-compose issue #14), but it now seems to be the maximum number of search results returned by a Redisearch query? If it is the Redisearch max number of results, then I think it's worth pursuing a solution that doesn't require tuning by site admins. One solution that comes straight to mind is paginating the Redisearch results. We know how many genes we're asking for, so it would be pretty straightforward to update the code to assemble our set of genes by querying all the necessary pages without exceeding the query result limit. I'm also wondering whether the limit is something we can fetch from the Redis server when the microservice starts up, so we don't have to hard-code it; that would help us avoid issues if Redisearch's default value for the limit ever changes.
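The pagination idea could be sketched roughly like this. `execute_search` is a hypothetical stand-in for the microservice's actual Redisearch call (in redis-py this would typically be a query built with `Query(...).paging(offset, num)`); the helper just accumulates pages until it has the genes it needs:

```python
# Sketch of paginating a Redisearch query so no single request exceeds
# the server's max search results limit. `execute_search(offset, num)` is
# a hypothetical callable standing in for the real Redisearch query; it
# returns the matching documents in the window [offset, offset + num).

def paginated_search(execute_search, total_needed, page_size=10_000):
    """Collect up to `total_needed` results in pages of `page_size`."""
    results = []
    offset = 0
    while len(results) < total_needed:
        remaining = total_needed - len(results)
        page = execute_search(offset, min(page_size, remaining))
        if not page:  # no more matches on the server
            break
        results.extend(page)
        offset += len(page)
    return results
```

With redis-py, each `execute_search` call would issue something like `r.ft(index).search(Query(query_string).paging(offset, num))`, keeping every individual request under the limit.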
Yes, it's the Redisearch max search results limit that seems to be the problem here.
It looks like that command is also available in the Redisearch Python library. I guess we'll have to verify whether this is actually a reliable way to go...
This is not a common problem, but it happened to come up in a region of interest for a collaborator's PhD student and a paper he's trying to write up. The full stack trace is at the end of this post, but my reading of the situation, based on the code around this part of the stack trace:
is that there is a gene in the region that belongs to a large gene family. Combined with the fact that the data source throwing the error (medicago) contains many highly fragmented genomes, this means that when the initial query is made for "chromosomes" with any matching gene families, the resource limit is triggered (I'm not sure I entirely understand why so many "clients" are being spawned, but I'm taking the error message at its word on that).
The most obvious thing to do is to try increasing the maxclients setting in Redis, but although I made an attempt at this, the error message I'm getting
still indicates that I'm exceeding the default limit of 10000, though this result
suggests that my attempt to hack the increased --maxclients setting into the compose.yml file did in fact change it as intended. I'm probably getting confused about something here; let me know if you have any thoughts about it when you're back @alancleary
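For reference, a compose override along these lines is what I mean; the service name and image are hypothetical and may differ from what gcv-docker-compose actually uses:

```yaml
# Hypothetical compose.yml fragment: passing --maxclients as a command
# argument overrides the redis.conf default (10000) at startup.
services:
  redis:
    image: redis:latest
    command: ["redis-server", "--maxclients", "20000"]
```

The change can be confirmed from inside the container with `redis-cli config get maxclients`. Note that if the error still reports 10000 after this, that points back at the Redisearch max search results limit rather than maxclients.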
If that server-side tweak ultimately works, it may be adequate as a solution (I'm not suggesting we change the default, but it could be made configurable via .env and documented for other service admins in the future). We could also consider whether the parameters we added to exclude contigs below a certain size from the macrosynteny search might also be applicable to the microsynteny search (so poor PhD students don't have to rely on crusty old service admins to get their work done...)
Full stack trace (from services running under gcv-docker-compose-2.6.0-c0):