Support Zip64 when compressing iterables and strings #25

Open: wants to merge 18 commits into base: master
11 changes: 11 additions & 0 deletions .bumpversion.cfg
@@ -0,0 +1,11 @@
[bumpversion]
current_version = 1.1.8
commit = true
message = Release {new_version}
tag = true

[bumpversion:file:CHANGELOG.md]
search = Unreleased
replace = v{new_version} ({now:%Y-%m-%d})

[bumpversion:file:setup.py]
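
The `replace` template in the `CHANGELOG.md` section above is written in Python format-string syntax. As a rough sketch of how it expands, assuming it is rendered with `str.format`-style substitution and that `new_version` and `now` are the names made available to it (`render_heading` is a hypothetical helper, not part of bumpversion):

```python
from datetime import datetime

# Template copied from the [bumpversion:file:CHANGELOG.md] section above.
REPLACE_TEMPLATE = "v{new_version} ({now:%Y-%m-%d})"


def render_heading(new_version, now):
    """Expand the template with str.format (an assumption about the rendering step)."""
    # A datetime accepts strftime directives as its format spec.
    return REPLACE_TEMPLATE.format(new_version=new_version, now=now)


heading = render_heading("1.1.8", datetime(2020, 9, 14))
print(heading)  # v1.1.8 (2020-09-14)
```

Under these assumptions the `Unreleased` heading in the changelog would be rewritten to the dated release heading at bump time.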
15 changes: 15 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,15 @@
# ChangeLog

This file details the changes that were made after forking v1.1.4 from https://github.com/allanlei/python-zipstream.

## v1.1.8 (2020-09-14)
* New datetime parameter in write_iter (https://github.com/arjan-s/python-zipstream/pull/8)
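
The `date_time` parameter mentioned in the entry above has to end up as the six-field tuple that `zipfile.ZipInfo` stores. A minimal sketch of that normalization, using only the standard library; `normalize_date_time` is a hypothetical helper written for illustration and is not a zipstream function:

```python
import time
import zipfile


def normalize_date_time(date_time=None):
    """Return a (year, month, day, hour, minute, second) tuple."""
    if isinstance(date_time, time.struct_time):
        # struct_time carries nine fields; ZIP entries only need the first six.
        return tuple(date_time[0:6])
    if date_time is None:
        # No timestamp supplied: fall back to the current local time.
        return tuple(time.localtime()[0:6])
    return tuple(date_time)


parsed = time.strptime("2011-04-19 22:30:21", "%Y-%m-%d %H:%M:%S")
info = zipfile.ZipInfo("data_datetime", date_time=normalize_date_time(parsed))
print(info.date_time)  # (2011, 4, 19, 22, 30, 21)
```

Accepting both a `struct_time` and a plain tuple lets callers pass the result of `time.strptime` or `time.localtime` directly.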

## v1.1.7 (2019-10-22)
* Stream data in the order it was received (https://github.com/arjan-s/python-zipstream/pull/4)

## v1.1.6 (2019-06-06)
* Add partial flushing of ZipStreams (https://github.com/arjan-s/python-zipstream/pull/1)
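
The partial-flushing and ordering entries above both come down to draining a queue of pending entries from the front. The sketch below illustrates that pattern with a toy class; `QueuedStream`, its tuple "chunks", and the `"trailer"` marker are all hypothetical stand-ins and do not reproduce zipstream's real output:

```python
class QueuedStream:
    """Toy stand-in for a streaming archive: queue entries, emit them lazily."""

    def __init__(self):
        self.pending = []

    def write(self, name, data):
        # Nothing is emitted here; the entry is only queued.
        self.pending.append((name, data))

    def flush(self):
        # Pop from the front so entries leave in the order they arrived,
        # and so a flushed entry is never emitted a second time.
        while self.pending:
            name, data = self.pending.pop(0)
            yield ("entry", name, data)

    def __iter__(self):
        for chunk in self.flush():
            yield chunk
        # Stands in for the closing records written when the archive ends.
        yield ("trailer",)


stream = QueuedStream()
stream.write("a.txt", b"first")
early = list(stream.flush())   # emitted before the archive is finished
stream.write("b.txt", b"second")
rest = list(stream)            # remaining entry, then the trailer
print(early)
print(rest)
```

Because `flush` consumes the queue, iterating the stream afterwards only yields what was queued since the last flush, followed by the closing marker.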

## v1.1.5 (2019-03-18)
* Support Zip64 when compressing iterables and strings (https://github.com/allanlei/python-zipstream/pull/25)
36 changes: 28 additions & 8 deletions README.markdown → README.md
@@ -1,9 +1,6 @@

 # python-zipstream

-[![Build Status](https://travis-ci.org/allanlei/python-zipstream.png?branch=master)](https://travis-ci.org/allanlei/python-zipstream)
-[![Coverage Status](https://coveralls.io/repos/allanlei/python-zipstream/badge.png)](https://coveralls.io/r/allanlei/python-zipstream)
-
 zipstream.py is a zip archive generator based on python 3.3's zipfile.py. It was created to
 generate a zip file generator for streaming (ie web apps). This is beneficial for when you
 want to provide a downloadable archive of a large collection of regular files, which would be infeasible to
@@ -78,12 +75,12 @@ archives.
 ## Installation

 ```
-pip install zipstream
+pip install zipstream-new
 ```

 ## Requirements

-* Python 2.6, 2.7, 3.2, 3.3, pypy
+* Python 2.6+, 3.2+, pypy

 ## Examples

@@ -95,7 +92,7 @@ from flask import Response
 @app.route('/package.zip', methods=['GET'], endpoint='zipball')
 def zipball():
     def generator():
-        z = zipstream.ZipFile(mode='w', compression=ZIP_DEFLATED)
+        z = zipstream.ZipFile(mode='w', compression=zipstream.ZIP_DEFLATED)

         z.write('/path/to/file')

@@ -110,12 +107,33 @@ def zipball():

 @app.route('/package.zip', methods=['GET'], endpoint='zipball')
 def zipball():
-    z = zipstream.ZipFile(mode='w', compression=ZIP_DEFLATED)
+    z = zipstream.ZipFile(mode='w', compression=zipstream.ZIP_DEFLATED)
     z.write('/path/to/file')

     response = Response(z, mimetype='application/zip')
     response.headers['Content-Disposition'] = 'attachment; filename={}'.format('files.zip')
     return response
+
+# Partial flushing of the zip before closing
+
+@app.route('/package.zip', methods=['GET'], endpoint='zipball')
+def zipball():
+    def generate_zip_with_manifest():
+        z = zipstream.ZipFile(mode='w', compression=zipstream.ZIP_DEFLATED)
+
+        manifest = []
+        for filename in os.listdir('/path/to/files'):
+            z.write(os.path.join('/path/to/files', filename), arcname=filename)
+            yield from z.flush()
+            manifest.append(filename)
+
+        z.write_str('manifest.json', json.dumps(manifest).encode())
+
+        yield from z
+
+    response = Response(generate_zip_with_manifest(), mimetype='application/zip')
+    response.headers['Content-Disposition'] = 'attachment; filename={}'.format('files.zip')
+    return response
 ```

 ### django 1.5+
@@ -124,7 +142,7 @@ def zipball():
 from django.http import StreamingHttpResponse

 def zipball(request):
-    z = zipstream.ZipFile(mode='w', compression=ZIP_DEFLATED)
+    z = zipstream.ZipFile(mode='w', compression=zipstream.ZIP_DEFLATED)
     z.write('/path/to/file')

     response = StreamingHttpResponse(z, content_type='application/zip')
@@ -149,3 +167,5 @@ def GET(self):
 With python version > 2.6, just run the following command: `python -m unittest discover`

 Alternatively, you can use `nose`.
+
+If you want to run the tests on all supported Python versions, run `tox`.
23 changes: 17 additions & 6 deletions setup.py
@@ -2,15 +2,26 @@
 from setuptools import setup, find_packages


+with open("README.md", "r") as fh:
+    long_description = fh.read()
+
 setup(
-    name='zipstream',
-    version='1.1.4',
-    description='Zipfile generator',
-    author='Allan Lei',
-    author_email='[email protected]',
-    url='https://github.com/allanlei/python-zipstream',
+    name='zipstream-new',
+    version='1.1.8',
+    description='Zipfile generator that takes input files as well as streams',
+    long_description=long_description,
+    long_description_content_type="text/markdown",
+    author='arjan5',
+    author_email='[email protected]',
+    url='https://github.com/arjan-s/python-zipstream',
     packages=find_packages(exclude=['tests']),
     keywords='zip streaming',
     test_suite='nose.collector',
     tests_require=['nose'],
+    classifiers=[
+        "Programming Language :: Python",
+        "License :: OSI Approved :: GNU General Public License v3 (GPLv3)",
+        "Operating System :: OS Independent",
+        "Topic :: System :: Archiving :: Compression",
+    ],
 )
48 changes: 48 additions & 0 deletions tests/test_zipstream.py
@@ -3,6 +3,7 @@

 import os
 import tempfile
+import time
 import unittest
 import zipstream
 import zipfile
@@ -76,6 +77,28 @@ def string_generator():

         os.remove(f.name)

+    def test_write_iterable_with_date_time(self):
+        file_name_in_zip = "data_datetime"
+        file_date_time_in_zip = time.strptime("2011-04-19 22:30:21", "%Y-%m-%d %H:%M:%S")
+
+        z = zipstream.ZipFile(mode='w')
+        def string_generator():
+            for _ in range(10):
+                yield b'zipstream\x01\n'
+        z.write_iter(iterable=string_generator(), arcname=file_name_in_zip, date_time=file_date_time_in_zip)
+
+        f = tempfile.NamedTemporaryFile(suffix='zip', delete=False)
+        for chunk in z:
+            f.write(chunk)
+        f.close()
+
+        z2 = zipfile.ZipFile(f.name, 'r')
+        self.assertFalse(z2.testzip())
+
+        self.assertEqual(file_date_time_in_zip[0:5], z2.getinfo(file_name_in_zip).date_time[0:5])
+
+        os.remove(f.name)
+
     def test_writestr(self):
         z = zipstream.ZipFile(mode='w')

@@ -92,6 +115,31 @@ def test_writestr(self):

         os.remove(f.name)

+    def test_partial_writes(self):
+        z = zipstream.ZipFile(mode='w')
+        f = tempfile.NamedTemporaryFile(suffix='zip', delete=False)
+
+        with open(SAMPLE_FILE_RTF, 'rb') as fp:
+            z.writestr('sample1.rtf', fp.read())
+
+        for chunk in z.flush():
+            f.write(chunk)
+
+        with open(SAMPLE_FILE_RTF, 'rb') as fp:
+            z.writestr('sample2.rtf', fp.read())
+
+        for chunk in z.flush():
+            f.write(chunk)
+
+        for chunk in z:
+            f.write(chunk)
+
+        f.close()
+        z2 = zipfile.ZipFile(f.name, 'r')
+        self.assertFalse(z2.testzip())
+
+        os.remove(f.name)
+
     def test_write_iterable_no_archive(self):
         z = zipstream.ZipFile(mode='w')
         self.assertRaises(TypeError, z.write_iter, iterable=range(10))
2 changes: 1 addition & 1 deletion tox.ini
@@ -1,5 +1,5 @@
 [tox]
-envlist = py26, py27, py32, py33, py34, py35, pypy
+envlist = py26, py27, py32, py33, py34, py35, py36, py37, py38, pypy, pypy3

 [testenv]
 deps=nose
29 changes: 19 additions & 10 deletions zipstream/__init__.py
@@ -178,9 +178,8 @@ def __init__(self, fileobj=None, mode='w', compression=ZIP_STORED, allowZip64=Fa
         self.paths_to_write = []

     def __iter__(self):
-        for kwargs in self.paths_to_write:
-            for data in self.__write(**kwargs):
-                yield data
+        for data in self.flush():
+            yield data
         for data in self.__close():
             yield data

@@ -190,6 +189,12 @@ def __enter__(self):
     def __exit__(self, type, value, traceback):
         self.close()

+    def flush(self):
+        while self.paths_to_write:
+            kwargs = self.paths_to_write.pop(0)
+            for data in self.__write(**kwargs):
+                yield data
+
     @property
     def comment(self):
         """The comment text associated with the ZIP file."""
@@ -215,20 +220,20 @@ def write(self, filename, arcname=None, compress_type=None):
         kwargs = {'filename': filename, 'arcname': arcname, 'compress_type': compress_type}
         self.paths_to_write.append(kwargs)

-    def write_iter(self, arcname, iterable, compress_type=None):
+    def write_iter(self, arcname, iterable, compress_type=None, buffer_size=None, date_time=None):
         """Write the bytes iterable `iterable` to the archive under the name `arcname`."""
-        kwargs = {'arcname': arcname, 'iterable': iterable, 'compress_type': compress_type}
+        kwargs = {'arcname': arcname, 'iterable': iterable, 'compress_type': compress_type, 'buffer_size': buffer_size, 'date_time': date_time}
         self.paths_to_write.append(kwargs)

-    def writestr(self, arcname, data, compress_type=None):
+    def writestr(self, arcname, data, compress_type=None, buffer_size=None, date_time=None):
         """
         Writes a str into ZipFile by wrapping data as a generator
         """
         def _iterable():
             yield data
-        return self.write_iter(arcname, _iterable(), compress_type=compress_type)
+        return self.write_iter(arcname, _iterable(), compress_type=compress_type, buffer_size=buffer_size, date_time=date_time)

-    def __write(self, filename=None, iterable=None, arcname=None, compress_type=None):
+    def __write(self, filename=None, iterable=None, arcname=None, compress_type=None, buffer_size=None, date_time=None):
         """Put the bytes from filename into the archive under the name
         `arcname`."""
         if not self.fp:
@@ -243,7 +248,11 @@ def __write(self, filename=None, iterable=None, arcname=None, compress_type=None
             mtime = time.localtime(st.st_mtime)
             date_time = mtime[0:6]
         else:
-            st, isdir, date_time = None, False, time.localtime()[0:6]
+            st, isdir = None, False
+            if date_time is not None and isinstance(date_time, time.struct_time):
+                date_time = date_time[0:6]
+            if date_time is None:
+                date_time = time.localtime()[0:6]
         # Create ZipInfo instance to store file information
         if arcname is None:
             arcname = filename
@@ -265,7 +274,7 @@ def __write(self, filename=None, iterable=None, arcname=None, compress_type=None
         if st:
             zinfo.file_size = st[6]
         else:
-            zinfo.file_size = 0
+            zinfo.file_size = buffer_size or 0
         zinfo.flag_bits = 0x00
         zinfo.flag_bits |= 0x08  # ZIP flag bits, bit 3 indicates presence of data descriptor
         zinfo.header_offset = self.fp.tell()  # Start of header bytes