Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Providing the ability to specify zlib compressor #273

Open
hmaarrfk opened this issue Oct 17, 2024 · 1 comment
Open

Providing the ability to specify zlib compressor #273

hmaarrfk opened this issue Oct 17, 2024 · 1 comment

Comments

@hmaarrfk
Copy link

Going through the code, it seems that the compressor is pretty hard coded based on the compressiontag.

hasattr(imagecodecs, 'DEFLATE')

This means that imagecodecs.deflate is preferentially used. Recently it seems that the zlib-ng library has made great strides and it would be great to start to experiment with it. However, I understand the dangers of just "swapping it out". I personally am not brave enough to do that, but maybe you are.

It would effectively amount to adding (not really tested at time of writing)

if (
                    hasattr(imagecodecs, 'ZLIBNG') and
                    imagecodecs.ZLIBNG.available
                ):
                    if self._encode:
                        codec = imagecodecs.zlibng_encode
                    else:
                        codec = imagecodecs.zlibng_decode
                elif (
                    hasattr(imagecodecs, 'DEFLATE')
                    and imagecodecs.DEFLATE.available
                ):

It might also be interesting to just provide the ability to switch between the two at runtime for benchmarking purposes. The larger patch below would ensure that:

  • compressiontag is an int
  • compression is a string 'NONE' meaning no compression

This would eventually let us use compression as a key into the specific codec that the user wants. However, I understand this adds complexity and maintenance and maybe just forcing zlib-ng onto people is easier.

Let me know what you prefer or if you prefer an other option.

A larger patch for more flexibility
diff --git a/tifffile/tifffile.py b/tifffile/tifffile.py
index fa7d4c1..fa2efc7 100644
--- a/tifffile/tifffile.py
+++ b/tifffile/tifffile.py
@@ -1986,7 +1986,7 @@ class TiffWriter:
                 factor. Segment lengths and rowsperstrip must be a multiple
                 of 8 times the vertical factor.
                 The values are written to the YCbCrSubSampling tag.
-            jpegtables:
+           jpegtables:
                 JPEG quantization and/or Huffman tables.
                 Use for copying pre-compressed JPEG segments.
                 The value is written to the JPEGTables tag.
@@ -2397,9 +2397,6 @@ class TiffWriter:
                 int(enumarg(EXTRASAMPLE, x)) for x in sequence(extrasamples)
             )
 
-        if compressionargs is None:
-            compressionargs = {}
-
         if compression:
             if isinstance(compression, (tuple, list)):
                 # TODO: unreachable
@@ -2422,26 +2419,33 @@ class TiffWriter:
                 compression = compression[0]
             if isinstance(compression, str):
                 compression = compression.upper()
-                if compression == 'ZLIB':
-                    compression = 8  # ADOBE_DEFLATE
-            elif isinstance(compression, bool):
-                compression = 8  # ADOBE_DEFLATE
+            # Boolean or other truthy object should be cast back to string
+            elif compression:
+                compression = 'ADOBE_DEFLATE'
+            else:
+                compression = 'NONE'
             compressiontag = enumarg(COMPRESSION, compression).value
-            compression = True
         else:
             compressiontag = 1
-            compression = False
 
-        if compressiontag == 1:
+        if compressionargs is None:
+            compressionargs = {}
+
+        if compressiontag == 1 and len(compressionargs) > 0:
+            warnings.warn(
+                "Passed 'compressionargs' to uncompressed image. "
+                "This will raise an error in the future.",
+                DeprecationWarning,
+                stacklevel=2
+            )
             compressionargs = {}
         elif compressiontag in {33003, 33004, 33005, 34712}:
             # JPEG2000: use J2K instead of JP2
-            compressionargs['codecformat'] = 0  # OPJ_CODEC_J2K
-
-        assert compressionargs is not None
+            if 'codecformat' not in compressionargs:
+                compressionargs['codecformat'] = 0  # OPJ_CODEC_J2K
 
         if predictor:
-            if not compression:
+            if compression == 'NONE':
                 raise ValueError('cannot use predictor without compression')
             if compressiontag in TIFF.IMAGE_COMPRESSIONS:
                 # don't use predictor with JPEG, JPEG2000, WEBP, PNG, ...
@@ -2848,7 +2852,7 @@ class TiffWriter:
             and bitspersample <= datadtype.itemsize * 8
         ):
             raise ValueError(f'{bitspersample=} out of range of {datadtype=}')
-        elif compression:
+        elif compression != 'NONE':
             if bitspersample != datadtype.itemsize * 8:
                 raise ValueError(
                     f'{bitspersample=} cannot be used with compression'
@@ -3168,7 +3172,7 @@ class TiffWriter:
             addtag(tags, 296, 3, 1, resolutionunit)  # ResolutionUnit
 
         # can save data array contiguous
-        contiguous = not (compression or packints or bilevel)
+        contiguous = not ((compression != 'NONE') or packints or bilevel)
         if tile:
             # one chunk per tile per plane
             if len(tile) == 2:
@@ -3209,7 +3213,7 @@ class TiffWriter:
                 if dataarray is not None:
                     dataiter = iter_tiles(dataarray, tile, tiles)
                 elif dataiter is None and not (
-                    compression or packints or bilevel
+                    (compression != 'NONE') or packints or bilevel
                 ):
 
                     def dataiter_(
@@ -3258,7 +3262,7 @@ class TiffWriter:
             if rowsperstrip is None:
                 # compress ~256 KB chunks by default
                 # TIFF-EP requires <= 64 KB
-                if compression:
+                if compression != 'NONE':
                     rowsperstrip = 262144 // rowsize
                 else:
                     rowsperstrip = storedshape.length
@@ -3372,7 +3376,7 @@ class TiffWriter:
             else:
                 raise NotImplementedError('cannot compress bilevel image')
 
-        elif compression:
+        elif compression != 'NONE':
             compressor = TIFF.COMPRESSORS[compressiontag]
 
             if compressiontag == 32773:
@@ -16772,6 +16776,14 @@ class CompressionCodec(
                     codec = imagecodecs.jpeg_decode
             elif key in {8, 32946, 50013}:
                 if (
+                    hasattr(imagecodecs, 'ZLIBNG') and
+                    imagecodecs.ZLIBNG.available
+                ):
+                    if self._encode:
+                        codec = imagecodecs.zlibng_encode
+                    else:
+                        codec = imagecodecs.zlibng_decode
+                elif (
                     hasattr(imagecodecs, 'DEFLATE')
                     and imagecodecs.DEFLATE.available
                 ):
@@ -17074,6 +17086,11 @@ class COMPRESSION(enum.IntEnum):
     JPEG = 7
     """New style JPEG."""
     ADOBE_DEFLATE = 8
+    # Aliases for ADOBE_DEFLATE
+    # These aliases allow the user to specify the details
+    # of which compressor implementation to use
+    ZLIB = 8
+    ZLIB_NG = 8
     """Deflate, aka ZLIB."""
     JBIG_BW = 9  # VC5
     JBIG_COLOR = 10
@hmaarrfk
Copy link
Author

cgholke's other response

There are already three "zlib" codecs in imagecodecs. I think libdeflate is a good default for TIFF. It is also used by libtiff if enabled at build time. I rather add an option to pass an encoder callback function with tifffile's compressionargs argument such that any compatible codec (for example, zlib-ng or isa-l) can be used for experimentation.

For encoding this seems fine.

What would you suggest for decoding? The user passing in a custom dictionary for their preferred decoders based on the compressiontag?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant