Large patch sizes #69
@mchaniotakis Thanks for providing such a detailed report. You are right, these excessively large patches for small changes are not very useful, to say the least. Tufup was created as a replacement for PyUpdater (because PyUpdater is no longer maintained). For this reason, the patch creation in tufup using Although I did add some tests for basic patch functionality, I must admit, I haven't paid very much attention to the resulting file sizes. The use of bsdiff4, in itself, does not seem to be a problem. Rather, the problem comes from the fact that we use it, naively, to create binary differences of It appears that binary diffs of either uncompressed There's probably a good explanation for this, so I'll have a closer look at it as soon as I have some free time. |
As a temporary workaround, patches can be disabled using On the command line:
or in a script:
...
repo = Repository.from_config()
repo.add_bundle(new_bundle_dir=..., new_version=..., skip_patch=True)
repo.publish_changes(private_key_dirs=...)
... |
Another problem may be the fact that pyinstaller builds are not reproducible by default, as explained in the docs:
but
in addition
I'll have to do some more tests... UPDATE: Hmm... Does not seem to make much of a difference in the tufup-example app. Setting both
|
more useful information: |
Although we can now work around most of the issues with reproducibility with The compressed output from gzip depends on the implementation, and there is no guarantee that identical input will lead to identical output between different implementations. (only equality of decompressed output is guaranteed) We assume that the tufup archives are created on the same OS that they are used on, and that the gzip implementation is sufficiently stable between versions of the same OS to guarantee byte-for-byte equality. However, this may lead to trouble in the future: If it would turn out that gzip output is unstable between different versions of the same OS, the There are a few options to prevent this:
|
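The point above, that only equality of the decompressed output is guaranteed, can be illustrated with a small sketch using Python's standard `gzip` module. Note that this does not reproduce differences between gzip implementations; it only shows one well-known source of variation (the timestamp stored in the gzip header), and the payload is made up for the example.

```python
import gzip
import hashlib

# Made-up payload standing in for the contents of an archive.
payload = b"example archive contents " * 1000

# Same input and same compressor, but a different timestamp in the gzip header.
blob_a = gzip.compress(payload, mtime=0)
blob_b = gzip.compress(payload, mtime=1)

# The compressed bytes differ, so any hash of the compressed file differs too.
assert blob_a != blob_b
assert hashlib.sha256(blob_a).hexdigest() != hashlib.sha256(blob_b).hexdigest()

# The decompressed content is nevertheless identical.
assert gzip.decompress(blob_a) == gzip.decompress(blob_b) == payload
```

Any check that compares hashes of compressed files is sensitive to this kind of variation, whereas a check on the decompressed bytes is not.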
After some more thought, here's another option: We stick with compressed archives ( This means the download verification process and the server configuration can remain unaltered. However:
The only problem remaining now is that our uncompressed
In addition, we should implement some kind of failsafe, so that failed patches will be ignored on the next run, in favor of a full installation. (done: #101)
Why go to the trouble of verifying the integrity of the reconstructed archive?
The integrity and authenticity of the patch and the current archive are already guaranteed by TUF. Knowing this, it seems highly unlikely that anything could go wrong when applying the patch. Nevertheless, if anything does go wrong, our self-updating application is likely to be broken. This would require a manual re-install. Moreover, it is quite possible that a mistake somewhere in the workflow would lead to a patch being applied to the wrong archive: To illustrate the point:
import bsdiff4
original = b'this represents the original file'
updated = b'this represents the updated file'
wrong = b'this is the wrong file'
patch = bsdiff4.diff(src_bytes=original, dst_bytes=updated)
reconstructed = bsdiff4.patch(src_bytes=original, patch_bytes=patch)
assert reconstructed == updated
broken = bsdiff4.patch(src_bytes=wrong, patch_bytes=patch)
assert broken != updated |
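The verification-plus-failsafe idea discussed above could look roughly like the following. This is only a sketch and not tufup's actual implementation: the function name `reconstruct_or_fallback`, the `toy_patch` stand-in, and the assumption that a trusted SHA-256 hash of the new archive is available are all hypothetical and used for illustration only.

```python
import hashlib
from typing import Callable, Optional


def reconstruct_or_fallback(
    apply_patch: Callable[[bytes, bytes], bytes],
    current_archive: bytes,
    patch: bytes,
    expected_sha256: str,
) -> Optional[bytes]:
    """Return the reconstructed archive, or None if a full download is needed."""
    try:
        candidate = apply_patch(current_archive, patch)
    except Exception:
        # The patch could not be applied at all.
        return None
    if hashlib.sha256(candidate).hexdigest() != expected_sha256:
        # The patch applied without error, but produced the wrong bytes.
        return None
    return candidate


def toy_patch(src: bytes, patch: bytes) -> bytes:
    # Hypothetical stand-in for a real patcher: simply appends the patch bytes.
    return src + patch


expected = hashlib.sha256(b"oldnew").hexdigest()
assert reconstruct_or_fallback(toy_patch, b"old", b"new", expected) == b"oldnew"
assert reconstruct_or_fallback(toy_patch, b"wrong", b"new", expected) is None
```

With a real patcher, a function such as `bsdiff4.patch` could be passed as `apply_patch`. A return value of `None` would then be the signal to skip the patch and fall back to a full installation.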
To follow up on this comment:
For completeness, it turns out a similar issue also arose with PyUpdater when using |
Describe the bug
I have generated version 1 of a Python application bundled with pyinstaller. The package contains images, libraries, the .exe, and my .py files, which have been converted to .pyd (binaries). One of those .pyd files states the version. If I change only the version in that .pyd file, without running pyinstaller again, and then generate the second version of the bundle with tufup, I get a file difference of 200 MB, which is crazy considering that the whole package is 340 MB. The last-modified dates of all files are the same, except for the .pyd file that states the version. Using the bsdiff4.file_diff() method between these two versions produces the same result. I can provide both of these files if needed.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
A patch size of less than 10 MB. On a previous run, I regenerated just the .exe (running pyinstaller, copying only the .exe, and deleting everything else) and then followed the steps mentioned above. The .exe file size is 17 MB, while the generated patch for that run was 35 MB.
System info (please complete the following information):