Performance enhancements for streaming #150
Open
I made a few changes to the way streaming parsing is done to improve performance and throughput.
Here are some ad-hoc benchmarks from my workstation, consuming the streaming API with a query that generates ~1000 tweets per second for 35 seconds.
Before:
After:
In addition, the new streaming parser adds support for sending the `delimited: length` parameter to the Twitter API streaming endpoint, which causes it to prefix each tweet object with its length in bytes. This enables more efficient parsing since you don't have to scan ahead for an EOL.

Using `{delimited: length}`:

I rarely use Javascript or NodeJS so I would strongly suggest a code review on this before merging.
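To make the length-prefixed framing described above easier to review, here is a minimal sketch of how such a parser could work. It is not the code in this pull request: `createDelimitedParser` and `feed` are hypothetical names, and the framing is an assumption (a decimal byte count, then `\r\n`, then exactly that many bytes of JSON, with blank `\r\n` lines as keep-alives).

```javascript
// Illustrative sketch only; names and framing details are assumptions.
// Assumed wire format: "<byte length>\r\n<payload of that many bytes>".
function createDelimitedParser(onMessage) {
  let buffer = Buffer.alloc(0);
  let expected = null; // payload size in bytes, or null while reading a length line

  // Call feed() with each raw Buffer chunk received from the stream.
  return function feed(chunk) {
    buffer = Buffer.concat([buffer, chunk]);
    for (;;) {
      if (expected === null) {
        // Only the short length line is scanned for an EOL.
        const eol = buffer.indexOf('\r\n');
        if (eol === -1) return; // length line not complete yet
        const line = buffer.toString('ascii', 0, eol).trim();
        buffer = buffer.subarray(eol + 2);
        if (line === '') continue; // blank keep-alive line
        expected = parseInt(line, 10);
        if (Number.isNaN(expected)) {
          throw new Error('Bad length prefix: ' + line);
        }
      }
      if (buffer.length < expected) return; // wait for the rest of the payload
      // Slice the payload by byte count instead of searching for its end.
      const payload = buffer.toString('utf8', 0, expected).trim();
      buffer = buffer.subarray(expected);
      expected = null;
      if (payload !== '') onMessage(JSON.parse(payload));
    }
  };
}
```

Under those assumptions, the returned function could be attached to a response stream with something like `response.on('data', feed)`. Working on raw Buffers rather than decoded strings matters here because the prefix counts bytes, not characters, so multi-byte UTF-8 text would otherwise be sliced at the wrong offset.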