In a batch request each message should be in its own batchspec #171
Comments
Whilst we could fix this, I think this is the wrong place to fix the problem. IMHO we should fix our Batch API, not make it more complicated to publish to our batch API endpoint. See https://ably.atlassian.net/browse/PMR-404. WDYT?
This is an odd framing. The batch publish API has clearly defined and well-documented semantics. There are possible changes we could make to the API in future versions, sure, but if the Kafka connector is using the current version wrongly -- which it is -- that's a bug in the Kafka connector. PMR-404 doesn't really make much in the way of concrete suggestions for how to 'fix' the current batch API (wrt atomicity behaviour); it just complains about the current behaviour, so I'm not quite clear on exactly what change you're imagining. But I've replied on that ticket. More practically, any major change to the semantics of the batch publish API (proposing it, agreeing on it, making it serverside, then changing ably-java to use it, which would also need a new ably-java major version) would take time, and in the meantime the Kafka connector would still be doing something clearly broken.
You are right. Apologies, I should have been far clearer. For the avoidance of doubt: I agree we should fix this issue and look at the batch API improvements separately. I recall explicitly having conversations with @subkanthi and @jaley about this requirement, and we wrote a spec ahead of the work being done, so I am honestly quite surprised the per-channel limits have not been implemented. For reference, here is what was shared, which, if implemented, would have meant this issue does not exist. I've checked the tests and, AFAICT, this was not implemented.
I think we need to be clear on the problem and the goals for the API before we propose a technical solution. I don't see a problem with that, but I appreciate your comments on PMR-404 and will review.
Hey folks! Testing my memory slightly, but a few things feel familiar:
We shipped without it in the end, as time was getting on and we felt that, with the ability to control the number of messages per batch, it was somewhat serviceable as it was, though it was a significant annoyance. The other major outstanding limitation, by the way, was the lack of idempotency, as I think this makes it very difficult, if not impossible, to avoid message duplication when redeploying the connector. I left a ticket here: #143. Hope you're all doing well over there!
The Kafka connector currently uses the batch publish API, but every message sent to a single channel is sent in one BatchSpec, meaning we treat them atomically and send them as a single ProtocolMessage. (Indeed, if there's only one channel, every message in the request ends up in a single BatchSpec, which makes it a pretty pointless use of the batch publish API.)
This causes a problem when the total ProtocolMessage size gets too big. The connector could stop accumulating when that size exceeds 64 KiB -- but unless it really needs atomicity (and given it's just accumulating until it reaches 100 messages, it clearly doesn't), there's a much simpler fix: just put each message into its own BatchSpec.
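To illustrate the difference, here is a minimal sketch of the two ways of shaping the request body, assuming the documented batch publish format (an array of BatchSpec objects, each with `channels` and `messages` fields). The function names and the `orders` channel are illustrative, not the connector's actual code:

```python
# Hypothetical sketch of the two payload shapes for a batch publish request.

def batched_atomically(channel, messages):
    """Current connector behaviour: all accumulated messages for a channel
    share one BatchSpec, so they are published atomically as a single
    ProtocolMessage (which can blow past the size limit)."""
    return [{"channels": [channel], "messages": messages}]

def batched_per_message(channel, messages):
    """Proposed fix: one BatchSpec per message, so no single
    ProtocolMessage grows just because many records accumulated."""
    return [{"channels": [channel], "messages": [m]} for m in messages]

msgs = [{"name": "event", "data": f"payload-{i}"} for i in range(3)]

print(batched_atomically("orders", msgs))   # 1 BatchSpec containing 3 messages
print(batched_per_message("orders", msgs))  # 3 BatchSpecs, 1 message each
```

With the per-message shape, the server is free to split the batch into as many ProtocolMessages as it needs, since no atomicity across messages is implied.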
┆Issue is synchronized with this Jira Bug by Unito