Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing check for NullInt64.Valid in Generic.Count #374

Open
indradhanush opened this issue Dec 6, 2024 · 3 comments
Open

Missing check for NullInt64.Valid in Generic.Count #374

indradhanush opened this issue Dec 6, 2024 · 3 comments

Comments

@indradhanush
Copy link

We're running kine with mysql and see this error intermittently:

│ time="2024-12-04T18:12:50.283533302Z" level=error msg="error while range on /registry/statefulsets/ /registry/statefulsets/: sql: Scan error on column index 0, name \"prev_revision\": converting NULL to int64 is unsupported"

The error originates when LimitedServer.Range invokes the method list and ultimately invoking Generic.Count.

While the error goes away when we restart kine (thus our hunch being this related to a race condition) but it does show up again after a few days. We haven't yet narrowed down on what triggers this.

In the meantime we would be interested in contributing a fix for this. While the fix in itself is simple - check for NullInt64.Valid, but we're not sure what the error handling should do. Should it return a custom error? I'd love your thoughts before I work on a PR.

Note: The same handling should also be applied to Generic.CountCurrent and perhaps also to the other usage of NullInt64 in sqllog.scan.

@brandond
Copy link
Member

brandond commented Dec 6, 2024

What version of mysql are you using? Can you reproduce this on demand?

I'm curious what the actual query is returning in this case. I don't think it's in generic.Count because that only uses the max revision via SELECT MAX(rkv.id) AS id FROM kine AS rkv, and the error is about the prev_revision column which is only needed when reading an entry.

@indradhanush
Copy link
Author

indradhanush commented Dec 10, 2024

What version of mysql are you using? Can you reproduce this on demand?

We're running MySQL 8.0.25. Unfortunately we haven't been able to reliably reproduce this yet.

I don't think it's in generic.Count because that only uses the max revision via SELECT MAX(rkv.id) AS id FROM kine AS rkv, and the error is about the prev_revision column which is only needed when reading an entry.

I traced the usage of that query which is defined as compactRevSQL and see that:

I don't see another code path for this, so it does appear to be coming from generic.Count, unless I'm missing something?

Question is, do we expect this query to ever return null:

SELECT
  MAX(crkv.prev_revision) AS prev_revision
FROM
  kine AS crkv
WHERE
  crkv.name = 'compact_rev_key'

If yes, then under what conditions?

@brandond
Copy link
Member

Yeah, I am assuming there is an underlying SQL error here that is being obscured by the error from scan. There should never not be a compact_rev_key entry in the db.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants