Compress outgoing sync data in ZIP format #18

Open
aecreations opened this issue Dec 27, 2023 · 3 comments
Labels: enhancement (New feature or request), fixed

@aecreations
Owner

A forum poster suggested compressing the synced clippings data into ZIP format so that more data can be synced within the 1 MiB limit imposed by the native messaging API[1]. Forum post: https://groups.io/g/aecreations-help/message/48

This only needs to be done for sync data being sent from Sync Clippings Helper to the extension. Incoming data to the native app has a more generous limit of 4 GiB.

--
[1] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Native_messaging
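
For context, a minimal sketch of how the native app side could frame an outgoing message and check it against the 1 MiB limit mentioned above. It assumes the framing described in [1] (a 32-bit length in native byte order followed by UTF-8 JSON). The names `encode_native_message` and `MAX_OUTGOING_BYTES` are hypothetical and not taken from Sync Clippings Helper.

```python
import json
import struct

# Limit cited in this issue for messages sent from the native app to the extension.
MAX_OUTGOING_BYTES = 1024 * 1024


def encode_native_message(payload: dict) -> bytes:
    """Frame a message: 4-byte length (native byte order) + UTF-8 JSON body."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_OUTGOING_BYTES:
        # Hypothetical handling; the real app may report the error differently.
        raise ValueError(f"message is {len(body)} bytes, over the 1 MiB limit")
    return struct.pack("=I", len(body)) + body
```

A caller would write the returned bytes to standard output in binary mode, for example with `sys.stdout.buffer.write(...)`.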

@aecreations aecreations self-assigned this Dec 27, 2023
@aecreations aecreations added the enhancement New feature or request label Dec 27, 2023
@aecreations aecreations added this to the 2.0 milestone Dec 27, 2023
@aecreations
Owner Author

Users on an older version of Clippings, which can't handle sync data in ZIP format, may update to the new version of Sync Clippings Helper that compresses data.

For backward compatibility, versions of Clippings that support compression should send a new native message, get-compressed-synced-clippings, to Sync Clippings Helper. Older versions can continue to send the get-synced-clippings message, to which Sync Clippings Helper should respond with the sync data in the normal, uncompressed format.
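
A rough sketch of how the helper could choose between the two formats based on the message name. Only the two message names come from this issue. The function `select_sync_payload`, its return shape, and the error handling are assumptions made for illustration.

```python
import gzip

COMPRESSED_MSG = "get-compressed-synced-clippings"  # new message, sent by newer Clippings
LEGACY_MSG = "get-synced-clippings"                 # existing message, sent by older Clippings


def select_sync_payload(msg_id: str, sync_json: str) -> tuple[bool, bytes]:
    """Return (is_compressed, payload_bytes) for the requested message."""
    raw = sync_json.encode("utf-8")
    if msg_id == COMPRESSED_MSG:
        return True, gzip.compress(raw)
    if msg_id == LEGACY_MSG:
        # Older extensions get the data unchanged.
        return False, raw
    raise ValueError(f"unsupported message: {msg_id!r}")
```

Because the old message name keeps its old behaviour, an updated helper would stay usable with an extension that has not been updated.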

@aecreations
Owner Author

Compressing data into gzip format using Python:

import gzip
s = 'Hello world!'
b = s.encode('UTF-8')  # Convert Unicode string to bytes
z = gzip.compress(b)
print(z)
# Example output (the 4-byte timestamp after the 4-byte header varies per run):
# b'\x1f\x8b\x08\x00\xfc\xb3\x93e\x02\xff\xf3H\xcd\xc9\xc9W(\xcf/\xcaIQ\x04\x00\x95\x19\x85\x1b\x0c\x00\x00\x00'

Source: https://docs.python.org/3/library/gzip.html

@aecreations
Owner Author

aecreations commented Mar 30, 2024

The message data from the native app must be sent to the extension in JSON format, so the compressed data needs to be wrapped in a JSON object.

JSON has no binary type, so raw bytes can't be stored in it directly and the zipped data needs to be base64 encoded. Base64 produces 4 characters for every 3 bytes, which grows the data by about one third and reduces the effectiveness of data compression.
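
The wrapping described above could look roughly like this. The field names `format` and `data` and both function names are hypothetical. The actual message layout used by Sync Clippings Helper is not specified in this issue, and the real decoding would happen in the extension rather than in Python; `unwrap_compressed` is included only to show the round trip.

```python
import base64
import gzip
import json


def wrap_compressed(sync_json: str) -> str:
    """gzip the sync data, base64-encode it, and wrap it in a JSON object."""
    zipped = gzip.compress(sync_json.encode("utf-8"))
    encoded = base64.b64encode(zipped).decode("ascii")
    return json.dumps({"format": "gzip+base64", "data": encoded})


def unwrap_compressed(message: str) -> str:
    """Reverse of wrap_compressed: JSON -> base64 -> gzip -> original text."""
    obj = json.loads(message)
    return gzip.decompress(base64.b64decode(obj["data"])).decode("utf-8")
```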

Some stats using the attached test sync file as an example:

Size of the sync data, encoded in UTF-8 (bytes): 9538
Size of zipped data (bytes): 3427
Size of base64-encoded string containing the zipped data (chars): 4572

Base64 output is ASCII, so each character in the encoded string occupies 1 byte when UTF-8 encoded. Compressing and encoding the Sync Clippings data therefore reduces the sync data size by about 52% (from 9538 to 4572 bytes).
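
The figures above can be checked with a short calculation. It uses only the sizes reported in this comment and the standard base64 length rule (4 output characters per 3 input bytes, rounded up). The variable names are illustrative.

```python
import math

raw_size = 9538      # UTF-8 sync data, in bytes
zipped_size = 3427   # after compression, in bytes

# Padded base64 emits 4 characters for every group of up to 3 input bytes.
b64_size = 4 * math.ceil(zipped_size / 3)

gzip_only_reduction = 1 - zipped_size / raw_size
net_reduction = 1 - b64_size / raw_size

print(b64_size)                          # 4572
print(round(gzip_only_reduction * 100))  # 64
print(round(net_reduction * 100))        # 52
```

So base64 encoding gives back roughly 12 percentage points of the saving that compression alone would achieve on this test file.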

clippings-sync.json

aecreations added a commit that referenced this issue Mar 30, 2024