Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(source): schema registry header of protobuf is not skipped properly #14773

Closed
xiangjinwu opened this issue Jan 24, 2024 · 1 comment · Fixed by #15444
Closed

bug(source): schema registry header of protobuf is not skipped properly #14773

xiangjinwu opened this issue Jan 24, 2024 · 1 comment · Fixed by #15444
Milestone

Comments

@xiangjinwu
Copy link
Contributor

xiangjinwu commented Jan 24, 2024

This is skipping i bytes rather than skipping i variable-length zig-zag encoded integers. For i within 1..=63, the encoding outputs a single byte so this happens to work.

prefixed by the length of the array (which is also variable length, Zigzag encoded)

i itself is also encoded so the decoded length shall be i / 2.

Originally posted by @xiangjinwu in #14057 (comment)

Got it, so the correct implementation is

        Some(i) => Ok(&remained[(1 + *i / 2) as usize..]),

Originally posted by @Rossil2012 in #14057 (comment)

Impact

In the common case where a message is defined at top level (i.e. not nested within another message's definition), and the message index is <= 63, the current implement happens to work correctly because i == 2 implies i == 1 + i/2.

Example: UserWithNewType in the PR is at top level with index 2, so its indexes array [2] would be encoded as 02 04 (02 for the length 1 and 04 for the element 2). Skipping 02 (first byte read as i) bytes happens to skip the encoded sequence exactly.

@github-actions github-actions bot added this to the release-1.7 milestone Jan 24, 2024
@Rossil2012
Copy link
Contributor

Note here:
As nesting depth of message and no. of messages can be larger than 2^6 - 1 (in varint encoding, only 7b out of 1B is for zig-zag encoding, so 63 is the max number that can be encoded in 1 byte), we need to check more bytes if the highest bit is 1.
The correct implementation may be:
if the first byte is not 0, decode one int as i with varint & zig-zag, and then decode i ints with varint & zig-zag. The remaining part is the actual payload.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants