Issue with Concurrent Query Processing and Document Upload #1848
Comments
They should all be independent unless you changed CONCURRENCY_COUNT to be 1. This is tested normally. The backend has no issues with this at all.
Once you have that working, I can explain how to make it even more efficient using the function_server.
This is the command for running h2ogpt with login.
I'd need to ask how you see things being blocked. For example, if you have pytest test code that shows things blocking each other (e.g. a long document add in one test while chat is blocked in another, run with -n 2), or if you just show a video of the UI and what you are doing, I can mimic it and check whether I see what you are seeing.
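A reproduction along those lines could look like the minimal sketch below. It is not h2oGPT test code: `slow_query` and `upload_document` are hypothetical stand-ins that only sleep, and they would need to be replaced with real client calls against your server. The idea is simply to start a long query, start an upload shortly after, and check that the upload finishes in its own time rather than waiting for the query.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_query(seconds=0.5):
    """Hypothetical stand-in for a long-running chat query."""
    time.sleep(seconds)
    return "answer"


def upload_document(seconds=0.1):
    """Hypothetical stand-in for a document upload."""
    time.sleep(seconds)
    return "uploaded"


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.monotonic()
    result = fn(*args)
    return result, time.monotonic() - start


def run_concurrently():
    """Start a query, then an upload, and return both durations."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        query_future = pool.submit(timed, slow_query, 0.5)
        time.sleep(0.05)  # make sure the query is already in flight
        upload_future = pool.submit(timed, upload_document, 0.1)
        _, query_time = query_future.result()
        _, upload_time = upload_future.result()
    return query_time, upload_time


def test_upload_not_blocked_by_query():
    query_time, upload_time = run_concurrently()
    # If the upload were blocked behind the query, it would take
    # at least as long as the query itself.
    assert upload_time < query_time
```

With real calls in place of the stand-ins, a failing assertion would be concrete evidence that the upload waits for the query.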
As for the function server, you can try it. Just add to CLI:
The function server has an issue when hitting through
It looks like the function server isn't even up. Perhaps you have something else running on that port. Check the startup logs.
When setting the concurrency count to 64, the following error is shown:
Correct. I recommend vLLM for handling concurrency well; transformers is not itself thread safe.
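To illustrate what "not thread safe" implies for a local model, here is a generic sketch of serializing calls with a lock. This is an assumption-laden illustration, not how h2oGPT or transformers handles it: `SerializedModel` and `OverlapDetector` are hypothetical names, and the fake generate function only uppercases its prompt. The cost of this approach is that requests queue up one at a time, which is why a server built for concurrent requests such as vLLM is the better fit for many users.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SerializedModel:
    """Wrap a non-thread-safe generate function so calls run one at a time."""

    def __init__(self, generate_fn):
        self._generate_fn = generate_fn
        self._lock = threading.Lock()

    def generate(self, prompt):
        # Only one thread may be inside the underlying model at once.
        with self._lock:
            return self._generate_fn(prompt)


class OverlapDetector:
    """Fake generate function that records how many calls overlap."""

    def __init__(self):
        self.active = 0
        self.max_active = 0
        self._guard = threading.Lock()

    def __call__(self, prompt):
        with self._guard:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
        time.sleep(0.01)  # simulate generation work
        with self._guard:
            self.active -= 1
        return prompt.upper()


def run_requests(model, prompts, workers=8):
    """Send all prompts to the model from a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(model.generate, prompts))
```

Calling `run_requests(SerializedModel(OverlapDetector()), prompts)` from many threads never lets two calls into the wrapped function at the same time.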
I have implemented a solution using vLLM on an A100 server to support multiple users. However, I have encountered an issue:
While one user's query is being processed, other users are unable to upload documents into the UserData or MyData collections. The document upload process gets stuck at the processing stage without any errors appearing in the terminal or UI. Additionally, the document is not uploaded successfully.
Can you suggest ways to decouple the query processing, document upload, and user interface programs so they can run independently of each other?
Alternatively, can we build, or use prebuilt, separate APIs to manage these programs in the backend?
Please provide suggestions or potential solutions.