Providing the ability to specify zlib compressor #273

hmaarrfk · 2024-10-17T13:52:33Z

Going through the code, it seems that the compressor is pretty hard coded based on the compressiontag.

Line 16775 in 8a25a0d

hasattr(imagecodecs, 'DEFLATE')

This means that imagecodecs.deflate is preferentially used. Recently it seems that the zlib-ng library has made great strides and it would be great to start to experiment with it. However, I understand the dangers of just "swapping it out". I personally am not brave enough to do that, but maybe you are.

It would effectively amount to adding (not really tested at time of writing)

if (
                    hasattr(imagecodecs, 'ZLIBNG') and
                    imagecodecs.ZLIBNG.available
                ):
                    if self._encode:
                        codec = imagecodecs.zlibng_encode
                    else:
                        codec = imagecodecs.zlibng_decode
                elif (
                    hasattr(imagecodecs, 'DEFLATE')
                    and imagecodecs.DEFLATE.available
                ):

It might also be interesting to just provide the ability to switch between the two at runtime for benchmarking purposes. The larger patch below would ensure that:

compressiontag is an int
compression is a string 'NONE' meaning no compression

This would eventually let us use compression as a key into the specific codec that the user wants. However, I understand this adds complexity and maintenance and maybe just forcing zlib-ng onto people is easier.

Let me know what you prefer or if you prefer an other option.

A larger patch for more flexibility

diff --git a/tifffile/tifffile.py b/tifffile/tifffile.py
index fa7d4c1..fa2efc7 100644
--- a/tifffile/tifffile.py
+++ b/tifffile/tifffile.py
@@ -1986,7 +1986,7 @@ class TiffWriter:
                 factor. Segment lengths and rowsperstrip must be a multiple
                 of 8 times the vertical factor.
                 The values are written to the YCbCrSubSampling tag.
-            jpegtables:
+           jpegtables:
                 JPEG quantization and/or Huffman tables.
                 Use for copying pre-compressed JPEG segments.
                 The value is written to the JPEGTables tag.
@@ -2397,9 +2397,6 @@ class TiffWriter:
                 int(enumarg(EXTRASAMPLE, x)) for x in sequence(extrasamples)
             )
 
-        if compressionargs is None:
-            compressionargs = {}
-
         if compression:
             if isinstance(compression, (tuple, list)):
                 # TODO: unreachable
@@ -2422,26 +2419,33 @@ class TiffWriter:
                 compression = compression[0]
             if isinstance(compression, str):
                 compression = compression.upper()
-                if compression == 'ZLIB':
-                    compression = 8  # ADOBE_DEFLATE
-            elif isinstance(compression, bool):
-                compression = 8  # ADOBE_DEFLATE
+            # Boolean or other truthy object should be cast back to string
+            elif compression:
+                compression = 'ADOBE_DEFLATE'
+            else:
+                compression = 'NONE'
             compressiontag = enumarg(COMPRESSION, compression).value
-            compression = True
         else:
             compressiontag = 1
-            compression = False
 
-        if compressiontag == 1:
+        if compressionargs is None:
+            compressionargs = {}
+
+        if compressiontag == 1 and len(compressionargs) > 0:
+            warnings.warn(
+                "Passed 'compressionargs' to uncompressed image. "
+                "This will raise an error in the future.",
+                DeprecationWarning,
+                stacklevel=2
+            )
             compressionargs = {}
         elif compressiontag in {33003, 33004, 33005, 34712}:
             # JPEG2000: use J2K instead of JP2
-            compressionargs['codecformat'] = 0  # OPJ_CODEC_J2K
-
-        assert compressionargs is not None
+            if 'codecformat' not in compressionargs:
+                compressionargs['codecformat'] = 0  # OPJ_CODEC_J2K
 
         if predictor:
-            if not compression:
+            if compression == 'NONE':
                 raise ValueError('cannot use predictor without compression')
             if compressiontag in TIFF.IMAGE_COMPRESSIONS:
                 # don't use predictor with JPEG, JPEG2000, WEBP, PNG, ...
@@ -2848,7 +2852,7 @@ class TiffWriter:
             and bitspersample <= datadtype.itemsize * 8
         ):
             raise ValueError(f'{bitspersample=} out of range of {datadtype=}')
-        elif compression:
+        elif compression != 'NONE':
             if bitspersample != datadtype.itemsize * 8:
                 raise ValueError(
                     f'{bitspersample=} cannot be used with compression'
@@ -3168,7 +3172,7 @@ class TiffWriter:
             addtag(tags, 296, 3, 1, resolutionunit)  # ResolutionUnit
 
         # can save data array contiguous
-        contiguous = not (compression or packints or bilevel)
+        contiguous = not ((compression != 'NONE') or packints or bilevel)
         if tile:
             # one chunk per tile per plane
             if len(tile) == 2:
@@ -3209,7 +3213,7 @@ class TiffWriter:
                 if dataarray is not None:
                     dataiter = iter_tiles(dataarray, tile, tiles)
                 elif dataiter is None and not (
-                    compression or packints or bilevel
+                    (compression != 'NONE') or packints or bilevel
                 ):
 
                     def dataiter_(
@@ -3258,7 +3262,7 @@ class TiffWriter:
             if rowsperstrip is None:
                 # compress ~256 KB chunks by default
                 # TIFF-EP requires <= 64 KB
-                if compression:
+                if compression != 'NONE':
                     rowsperstrip = 262144 // rowsize
                 else:
                     rowsperstrip = storedshape.length
@@ -3372,7 +3376,7 @@ class TiffWriter:
             else:
                 raise NotImplementedError('cannot compress bilevel image')
 
-        elif compression:
+        elif compression != 'NONE':
             compressor = TIFF.COMPRESSORS[compressiontag]
 
             if compressiontag == 32773:
@@ -16772,6 +16776,14 @@ class CompressionCodec(
                     codec = imagecodecs.jpeg_decode
             elif key in {8, 32946, 50013}:
                 if (
+                    hasattr(imagecodecs, 'ZLIBNG') and
+                    imagecodecs.ZLIBNG.available
+                ):
+                    if self._encode:
+                        codec = imagecodecs.zlibng_encode
+                    else:
+                        codec = imagecodecs.zlibng_decode
+                elif (
                     hasattr(imagecodecs, 'DEFLATE')
                     and imagecodecs.DEFLATE.available
                 ):
@@ -17074,6 +17086,11 @@ class COMPRESSION(enum.IntEnum):
     JPEG = 7
     """New style JPEG."""
     ADOBE_DEFLATE = 8
+    # Aliases for ADOBE_DEFLATE
+    # These aliases allow the user to specify the details
+    # of which compressor implementation to use
+    ZLIB = 8
+    ZLIB_NG = 8
     """Deflate, aka ZLIB."""
     JBIG_BW = 9  # VC5
     JBIG_COLOR = 10

The text was updated successfully, but these errors were encountered:

hmaarrfk · 2024-10-17T16:37:39Z

cgholke's other response

There are already three "zlib" codecs in imagecodecs. I think libdeflate is a good default for TIFF. It is also used by libtiff if enabled at build time. I rather add an option to pass an encoder callback function with tifffile's compressionargs argument such that any compatible codec (for example, zlib-ng or isa-l) can be used for experimentation.

For encoding this seems fine.

What would you suggest for decoding? The user passing in a custom dictionary for their preferred decoders based on the compressiontag?

hmaarrfk mentioned this issue Oct 17, 2024

More options for zlib encoder cgohlke/imagecodecs#117

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Providing the ability to specify zlib compressor #273

Providing the ability to specify zlib compressor #273

hmaarrfk commented Oct 17, 2024

hmaarrfk commented Oct 17, 2024

Providing the ability to specify zlib compressor #273

Providing the ability to specify zlib compressor #273

Comments

hmaarrfk commented Oct 17, 2024

hmaarrfk commented Oct 17, 2024