Unexpected behaviour with concurrent requests in CloudRun #613

Open
nvn-nil opened this issue Nov 14, 2023 · 2 comments

nvn-nil commented Nov 14, 2023

Bug report

What is the current behavior?

Two issues appear when concurrent requests are allowed in the power loss and wake services.

  1. Some questions start failing with the following error.

Error in <Service('power-loss-service:0.17.22')>: local variable 'app' referenced before assignment
WARNING app_loading.py:60 Module 'app' was already removed from the system path prior to exiting the AppFrom context manager. Using the AppFrom context may yield unexpected results.

Question id: 58e25937-59c8-44cf-a187-9709b95e8da5 in WQ production

  2. The question's log section in WQ includes logs from other questions. Presumably, these come from the other questions running in the same container at the same time.

Question id: 1fcc12a5-547f-40d0-998a-31fb46421246 in WQ production
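
For anyone trying to reproduce the first error: the warning suggests the AppFrom context manager edits process-wide import state, which would not be safe when two requests overlap in one container. The sketch below is not octue's code. `hypothetical_app_from` and the module name `hypothetical_app` are invented stand-ins, and the overlap is simulated by entering and exiting two contexts by hand rather than with real threads. It only illustrates one way an "already removed" warning could arise; the thread does not confirm that this is the actual cause.

```python
import sys
import types
from contextlib import contextmanager

MODULE_NAME = "hypothetical_app"  # assumption: stands in for the real 'app' module
events = []


@contextmanager
def hypothetical_app_from():
    """Register a module in the process-wide sys.modules, then remove it on exit."""
    sys.modules[MODULE_NAME] = types.ModuleType(MODULE_NAME)
    try:
        yield
    finally:
        if MODULE_NAME in sys.modules:
            del sys.modules[MODULE_NAME]
            events.append("removed")
        else:
            # Another request's exit already cleaned up the shared entry.
            events.append("already removed")


# Simulate two requests whose lifetimes overlap in the same process.
first = hypothetical_app_from()
second = hypothetical_app_from()
first.__enter__()
second.__enter__()
first.__exit__(None, None, None)   # removes the entry both requests share
module_visible_to_second = MODULE_NAME in sys.modules
second.__exit__(None, None, None)  # finds nothing left to remove
```

After the first request exits, the second request can no longer see the module it registered, because both wrote to the same key in `sys.modules`.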

To me, it looks like the message/log handler is a singleton that is sending messages to all questions without a filter.
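
To make "a filter" concrete, here is a minimal sketch of one way a per-question filter could look using only the standard library. Everything named here (`current_question_id`, `QuestionFilter`, `ListHandler`, the logger name) is hypothetical and not part of octue; it assumes each question is answered in its own thread and that the question ID is stored in a `contextvars.ContextVar` while it runs.

```python
import contextvars
import logging
import threading

# Assumption: each request records its question ID here while it is being answered.
current_question_id = contextvars.ContextVar("current_question_id", default=None)

logger = logging.getLogger("hypothetical_service")
logger.setLevel(logging.INFO)
logger.propagate = False


class QuestionFilter(logging.Filter):
    """Pass only records emitted while the given question is the active one."""

    def __init__(self, question_id):
        super().__init__()
        self.question_id = question_id

    def filter(self, record):
        # Filters run in the thread that emitted the record, so the context
        # variable reflects the question that produced the log line.
        return current_question_id.get() == self.question_id


class ListHandler(logging.Handler):
    """Collect messages in a list instead of sending them anywhere."""

    def __init__(self, sink):
        super().__init__()
        self.sink = sink

    def emit(self, record):
        self.sink.append(record.getMessage())


def answer(question_id, sink, barrier):
    handler = ListHandler(sink)
    handler.addFilter(QuestionFilter(question_id))
    logger.addHandler(handler)
    current_question_id.set(question_id)
    barrier.wait()  # both handlers are attached before anyone logs
    logger.info("working on %s", question_id)
    barrier.wait()  # both questions have logged before any handler is removed
    logger.removeHandler(handler)


logs_a, logs_b = [], []
barrier = threading.Barrier(2)
threads = [
    threading.Thread(target=answer, args=("question-a", logs_a, barrier)),
    threading.Thread(target=answer, args=("question-b", logs_b, barrier)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

With the filter attached, each sink should end up holding only its own question's line even though both handlers sit on the same logger at the same time.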

I'm also concerned that if messages from other questions are posted to the wrong question, wrong results will also be posted. Could you verify that this is not happening, please?

Your environment

  • Library Version: branch support-automatic-question-retrying

cortadocodes moved this to Priority 2 (Medium) in Octue Board Nov 14, 2023

cortadocodes commented Nov 22, 2023

Issue 1 is definitely unexpected behaviour - I'll investigate soon.

I haven't seen issue 2 before. I've looked at the code and I think the leakage is confined to log messages only. One possibility is:

  • Fact: Each question has its own GooglePubSubHandler which only sends log messages to the question's parent (a new log handler instance is created each time the Service.answer method is called)
  • What I suspect is happening: Each log handler is independently picking up all logs generated by any questions being processed at that time in that environment and sending them to its question's parent
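
The suspected mechanism is easy to reproduce with the standard library alone. In this sketch, `ParentForwardingHandler` is a hypothetical stand-in for GooglePubSubHandler that records messages instead of publishing them, and the logger name is made up. It assumes both handlers are attached to the same shared logger while two questions are in flight.

```python
import logging


class ParentForwardingHandler(logging.Handler):
    """Hypothetical stand-in: records what would be forwarded to one question's parent."""

    def __init__(self, question_id):
        super().__init__()
        self.question_id = question_id
        self.forwarded = []

    def emit(self, record):
        self.forwarded.append(record.getMessage())


logger = logging.getLogger("hypothetical_shared_logger")
logger.setLevel(logging.INFO)
logger.propagate = False

# One handler per question, both attached while the questions overlap.
handler_a = ParentForwardingHandler("question-a")
handler_b = ParentForwardingHandler("question-b")
logger.addHandler(handler_a)
logger.addHandler(handler_b)

logger.info("log line from question-a")
logger.info("log line from question-b")

logger.removeHandler(handler_a)
logger.removeHandler(handler_b)
```

Both handlers receive both lines, because a logger passes every record to every attached handler regardless of which question produced it.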

I'm reasonably certain the same thing isn't happening for other message types, because they aren't exposed to anything like a global logging system. For non-log messages, only the messages for a given question are sent to its parent. If that weren't the case, we'd probably be seeing unacknowledged messages. For example, if a parent received the wrong result message, the correct result would never be received and acknowledged, because the parent stops waiting for and receiving messages as soon as it receives a result.
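
A toy model of that argument, for illustration only: `wait_for_result` and the message dictionaries below are hypothetical and do not reflect octue's message format or subscription handling. It assumes a parent that acknowledges messages in order and stops at the first result.

```python
def wait_for_result(messages):
    """Hypothetical parent loop: acknowledge messages until the first result arrives."""
    acknowledged = []
    for index, message in enumerate(messages):
        acknowledged.append(message)
        if message["kind"] == "result":
            # The parent stops here, so anything later is never acknowledged.
            return acknowledged, messages[index + 1:]
    return acknowledged, []


# Assumed scenario: a result for another question is misrouted to question-a's parent.
subscription = [
    {"kind": "log", "question": "question-a"},
    {"kind": "result", "question": "question-b"},  # misrouted result
    {"kind": "result", "question": "question-a"},  # the correct result
]

acknowledged, unacknowledged = wait_for_result(subscription)
```

In this scenario the correct result is left in `unacknowledged`, which is the symptom that would be expected if results were leaking between questions.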

cortadocodes commented

Also, we should break this up into two issues.

cortadocodes moved this from Priority 2 (Medium) to Priority 3 (High) in Octue Board Nov 22, 2023