Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement conversion from VarSeq and relax equality comparison #66

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

hvraven
Copy link
Contributor

@hvraven hvraven commented Dec 17, 2024

As I am sending sequence numbers around to reference it in other operations I extended VarSeq a bit. I implemented conversion into u8, u16 and u32.
This also adds a manual implementation of PartialEq to treat different variants with the same key value as equal. Previously VarSeq::Seq1(1) != VarSeq::Seq2(1), with this change they are treated as equal. For me this is the correct way to handle it, as the value is the important part, the variants are only to save bytes where possible. This might however be different to how the library expects it.

Hendrik v. Raven added 2 commits December 17, 2024 11:03
this adds a manual implementation of `PartialEq` to treat different
variants with the same key value as equal.

Previously `VarSeq::Seq1(1) != VarSeq::Seq2(1)`, with this change they
are treated as equal.
@jamesmunns
Copy link
Owner

jamesmunns commented Dec 17, 2024

Previously VarSeq::Seq1(1) != VarSeq::Seq2(1), with this change they are treated as equal

This was intentional, my intent is that the server should ALWAYS treat the sequence number as more or less opaque, and that servers should always return the same VarSeq as is provided by the client, as the server can't know if this is "significant" or not.

For example, with this PR, VarSeq::Seq1(0xAB) would equal VarSeq::Seq2(0x12AB), and I do not want that to be the case. (edit: see below) I'm open to discuss this, but the decision to make VarSeq::Seq1(1) != VarSeq::Seq2(1) was at least originally intended for this reason.


EDIT: I see this is not the case as you promote both to u32 to compare, which is the correct thing to do:

impl PartialEq for VarSeq {
    fn eq(&self, other: &Self) -> bool {
        Into::<u32>::into(*self) == Into::<u32>::into(*other)
    }
}

I still worry a bit that it could be "too easy" to accidentally truncate somewhere and cause unexpected behavior.


In general, my rules have been:

  • The server chooses the Key size, as it is able to tell whether shortening the key is acceptable or not, as it "knows" all keys and can detect collisions. This is why the client can make a request with an 8-byte key, and the server can respond with a 1-byte key.
  • The client chooses the SeqNo size, as it decides how much resolution it needs to avoid collisions (e.g. multiple requests in flight with the same seqno).

: the one exception to this is topics_out, where the server does the sending, in which case IT chooses the sequence number. So really it's more like the "originator" of the message chooses the sequence number.

I would be open to TryInto conversions from VarSeq -> u[8,16,32], failing if out of range, and Into conversions for u[8,16,32] -> VarSeq (with 1:1 mapping).

@jamesmunns
Copy link
Owner

jamesmunns commented Dec 17, 2024

One place where this collision could still occur is in the host client matching of incoming endpoint responses. Approximately here.

If the host sends two Seq2s of the same Key (in flight at the same time), 0x12AB and 0x00AB, and the client server truncates these both to Seq1(0xAB), then either reply could be matched to either request. This is generally unlikely, but I don't like it generally. today the first response will be matched to the first request (the WaitMap holds a queue of messages, new pendings are added to the end), but that's an impl detail that might change. Even in that case, it could still be incorrect, e.g. if the device replies to the second request first for some reason.

@hvraven
Copy link
Contributor Author

hvraven commented Dec 17, 2024

Thanks for your explanations. I also looked a bit more into the flow of the sequence numbers. I totally agree that the server should not do anything to the sequence number and pass it back as is, after a bit of rework this is how the messages are handled in my implementation now as well. With that assumption this change shouldn't create any problems (as it explicitly rules out your last example), however it's probably best done after rigorous testing.
I wasn't really sure about the relaxed conversion in the first place, that's why I mentioned it explicitly. Probably it is best to keep it as is for now. If your ok with the Into implementations it is relatively easy to implement a manual conversion via u32 if necessary.

A bit of background why I want this in the first place: I am building a framework where an initial call to an endpoint starts a longer running operation. While it is running the server sends data it generates as topics_out, finally signalling the end on a special topic. I link the data of these operations to the original request by (mis-/re-)using the sequence number of the original endpoint call for all topics sent from the server related to that operation.

While looking through the code I wondered whether it would make sense to choose the sequence length on the server automatically. Currently the HostClient requires a VarSeqKind and resizes everything to it. Instead one could add a shrink method to VarSeq, always reducing to the smallest possible size. If you think that makes sense I would try to implement it.

@jamesmunns
Copy link
Owner

I'm afk for the moment and will add specific links in a bit, but: "endpoint request with topics out as a multipart response" is something I do as well, check out the schema discovery code in the server and client. I'd be open to formalizing the semantics of this a bit, if you want to open an issue so we can hammer out the details.

There's two flavors to this I've done:

  • One where the endpoint returns immediately, for starting and stopping "infinite" streams, like sensor readings. This is usually separate start and stop endpoints.
  • One where the endpoint returns when the multipart stream is complete, like what schemas do

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants