-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cherry pick chanks/que#166 #59
base: master
Are you sure you want to change the base?
Conversation
This should help with using a secondary database connection pool inside Que jobs. Original issue is [que-rb/que#166][166] [166]: que-rb/que#166
On second look, it seems that the upstream version of Que only wraps the logic to retrieve the database connection in the executor. I don't think this is going to address the issue we're having here. I reckon we'd need to wrap the whole job in the executor (that is, the Basically, the underlying issue is that when we're checking out a connection from the replica connection pool, we're not checking it back in when the job finishes. This is because We could work around this issue by amending the - ::ActiveRecord::Base.clear_active_connections!
+ ActiveRecord::Base.connection_handlers.each_value do |handler|
+ handler.connection_pool.clear_active_connections!
+ end Ideally though, we would wrap the job in the executor (among other things, the executor checks back all connections into the pool for us, which means we wouldn't need the above). This is what ActiveJob does. (ActiveJob actually wraps jobs in the reloader, which calls the executor. We could probably wrap Que jobs in the reloader as well but we don't need to do that to fix the connection issue.) |
On third look, we do wrap the job in So, I actually believe this would fix our issue 😄 |
For future reference, here is the error we're hoping to get fixed with this PR:
And here is the original investigation (copy-pasted from Sentry): What is the problem? The root cause of the issue is that
How to reproduce the problem? In a Rails console: pry> require "database_replica"
pry> DatabaseReplica.with_connection { ActiveRecord::Base.connection.execute("select pg_sleep(10)"); } In another shell: $ pkill -9 postgres How to fix it? A quick and dirty fix could be to fallback on A better fix would be to remove our custom logic to handle connection errors and wrap jobs in the Rails executor like the upstream version of Que does. (Doing so will ensure that all connections are checked back into the pool when a job has finished running — whereas we're only checking in connections of the primary connection handler at the moment. Checking in connections helps in that context because when a connection is subsequently checked out, Rails checks whether it's "active" — and discards it if it's not.) |
@isaacseymour can we merge this? The exception happens quite a lot. |
@danielroseman would love to, just want to be cautious about deploying it in staging for a bit to make sure it doesn't do Terrible Things in the real world. If you have time to do that and monitor it, would be awesome :) |
@isaacseymour how this this going? Was going through MIXP Sentry's and came across this. |
Someone who has some spare time should give it a go in staging I think, I haven't had time (or remembered about this) recently. You're more than welcome to if you have time! |
Just want to set the right expectations on this PR for whoever wants to pick it up: it won't fix the underlying issue, which is about the database being unreachable and which we, unfortunately, can't do much about. What this fix will give us though is:
(@isaacseymour to confirm that I'm not saying utter nonsense!) |
This should help with using a secondary database connection pool inside
Que jobs.
Original issue is chanks/que#166