Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pull dedupe out of queue and into personality mixin #333

Merged
merged 2 commits into from
Nov 27, 2024

Conversation

mhutchinson
Copy link
Contributor

This allows the deduplication behaviour to be more precisely controlled by the personality. Previously deduping was always performed within a single batch flushed from the queue, and this could not be disabled. Now, deduping can be disabled, or made to apply across batches if a large enough cache size is set.

The conformance binaries have been updated to use this new mixin, as this is the closest to a no-op change I can make. We may want to look at making the cache size configurable and see how this affects performance.

This fixes #331.

This allows the deduplication behaviour to be more precisely controlled by the personality. Previously deduping was always performed within a single batch flushed from the queue, and this could not be disabled. Now, deduping can be disabled, or made to apply across batches if a large enough cache size is set.

The conformance binaries have been updated to use this new mixin, as this is the closest to a no-op change I can make. We may want to look at making the cache size configurable and see how this affects performance.

This fixes transparency-dev#331.
@phbnf
Copy link
Contributor

phbnf commented Nov 26, 2024

There's already some deduplication code that was written for CT, it's Best Effort deduplication that eventually converges on always returning the smallest index possible for a submitted leaf. The plan is to take what we can out of the static-ct repo later on, and put it back in this tessera repo. This PR will change how to reason about this, both for in terms of how we align the existing and new deduplication APIs, and what happens if both deduplications are chained together (I think it will make convergence properties slightly different). Sounds good to me to move forward with it if you want this out before Alpha, and we can re-think how the personality mixins work with one another later?

@mhutchinson
Copy link
Contributor Author

Sounds good to me to move forward with it if you want this out before Alpha, and we can re-think how the personality mixins work with one another later?

Great I'll merge this if I can get an approval. Until we hit 1.0 all parts of the API are still open to change. Ideally the core parts will change less, and the mixins are more open to change. So sounds like we're aligned that this mixin works for now but can be changed if needed.

FWIW I really like the approach I've taken here of making the mixins that modify the Add call have the same signature so you can use them as a functional decorator pattern. But if it needs to change then it can. 🙃

Copy link
Contributor

@roger2hk roger2hk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting the dedupe mixin to be defined as one of the *options.StorageOptions. Using the decorator pattern seems to be a better idea.

cmd/conformance/aws/main.go Outdated Show resolved Hide resolved
@mhutchinson mhutchinson merged commit 2a7696b into transparency-dev:main Nov 27, 2024
15 checks passed
@mhutchinson mhutchinson deleted the b331-dedupe branch November 27, 2024 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Deduplication] look at extracting deduplication logic from sequencer queue into mixin
3 participants