Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Qless::Queue#peek not returning correct results for recurring jobs #233

Open
tpickett66 opened this issue May 11, 2015 · 5 comments
Open

Comments

@tpickett66
Copy link
Contributor

As of 515cdc8 a queue can get in an unexpected state leading to recurring jobs not being returned properly by #peek when no count argument is supplied. At this point I've traced it down to the Lua scripts by working back through the ruby code, I'll post more details as I find them.

[14] pry(main)> queue.peek
=> nil
[15] pry(main)> queue.peek(1)
=> []
[16] pry(main)> queue.peek(2)
=> [<Qless::Job SyncAccount (f5aa9babcd1e420ea38fe537812db03a-9 / power_school_jobs / waiting)>]
[17] pry(main)> queue.peek(3)
=> [<Qless::Job SyncAccount (f5aa9babcd1e420ea38fe537812db03a-9 / power_school_jobs / waiting)>, <Qless::Job SyncAccount (ed6f4e5207454cf3972459990b8eaad3-15 / power_school_jobs / waiting)>]
[18] pry(main)>
@tpickett66
Copy link
Contributor Author

Additional data from invoking the Lua 'directly'

18] pry(main)> client.call('peek', 'power_school_jobs', 1)
=> "{}"
[19] pry(main)> client.call('peek', 'power_school_jobs')
Qless::LuaScriptError: user_script:1105: Peek(): Arg "count" missing or not a number: nil
from /home/deploy/pschool/.bundle/gems/ruby/2.1.0/bundler/gems/qless-515cdc862a82/lib/qless/lua_script.rb:30:in `rescue in call'
[20] pry(main)> client.call('peek', 'power_school_jobs', 20)
=> "[{\"jid\":\"f5aa9babcd1e420ea38fe537812db03a-9\",\"retries\":5,\"data\": ...snip..."

@tpickett66
Copy link
Contributor Author

Turns out there was a lingering member in the zset ql:q:<queue_name>-work, I'm not sure how it remained after the job was removed from the queue or completed. I'll be investigating the housecleaning functions to see where/how this might have happened.

@tpickett66
Copy link
Contributor Author

So far I'm not seeing anything that would point at a root cause, the only set that seems to have had this pollution is the work set.

@tpickett66
Copy link
Contributor Author

I just discovered that the completed set has invalid values too, sadly I haven't taken good enough notes to know if there is any overlap between the extras in the work set and ql:completed. I'm going to leave this issue open while I gather more data, this problem has occurred a couple times in the past.

@tpickett66
Copy link
Contributor Author

We had more issues with one of the sorted sets being in an inconsistent state this morning, this time it was the locks set on the same queue but the inconsistent state started occurring right after we received notification of an exception of the type Qless::Job::CantCompleteError.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant