Add wrappers for zarr v3 #524

Merged: 62 commits, Nov 8, 2024
Changes from 58 commits
Commits
4f7c12c  import zarr-python#1839 (normanrz, May 6, 2024)
f6d2652  pep8 (normanrz, May 6, 2024)
9593c43  remove try/catch (normanrz, May 8, 2024)
2dc9281  pep8 (normanrz, May 8, 2024)
1822576  update to latest zarr-python interfaces (normanrz, May 17, 2024)
636624b  Merge remote-tracking branch 'origin/main' into zarr3-codecs (normanrz, Jun 12, 2024)
8ba9da7  flake (normanrz, Jun 12, 2024)
3459871  add zarr-python to ci (normanrz, Jun 12, 2024)
8e10124  fix import (normanrz, Jun 12, 2024)
14807a5  tests (normanrz, Jun 12, 2024)
9932a1d  fixes (normanrz, Jun 12, 2024)
2a12c40  skip zarr3 tests on older python versions (normanrz, Jun 13, 2024)
92f247d  merge (normanrz, Jun 24, 2024)
cd49cf7  ruff (normanrz, Jun 24, 2024)
5083d66  add zfpy and pcodec (normanrz, Jun 25, 2024)
64e081a  Merge remote-tracking branch 'origin/main' into zarr3-codecs (normanrz, Jun 25, 2024)
7a40530  remove zarr from dependencies (normanrz, Jun 25, 2024)
2654737  change prefix (normanrz, Jun 25, 2024)
8c37d5d  fixes for ci (normanrz, Jun 25, 2024)
37700a5  fix for tests (normanrz, Jun 25, 2024)
c9c8f5e  Merge branch 'main' into zarr3-codecs (normanrz, Jul 9, 2024)
57fd71b  Merge branch 'main' into zarr3-codecs (normanrz, Sep 16, 2024)
b75e41e  pr feedback (normanrz, Sep 24, 2024)
a4cf7ad  Sync with zarr 3 beta (#597) (mpiannucci, Oct 16, 2024)
860956f  Update numcodecs/tests/test_zarr3.py (normanrz, Oct 16, 2024)
a62d258  moves zarr3 to private module, adds test for zarr-python2 installs (normanrz, Oct 18, 2024)
6d8bad2  Merge remote-tracking branch 'origin/main' into zarr3-codecs (normanrz, Oct 18, 2024)
f28775d  add typing_extensions as dep (normanrz, Oct 18, 2024)
6a8115c  tests (normanrz, Oct 18, 2024)
82fa7a8  importorskip minversion (normanrz, Oct 18, 2024)
20aa698  ci install (normanrz, Oct 18, 2024)
bd426e7  Merge remote-tracking branch 'origin/main' into zarr3-codecs (normanrz, Oct 25, 2024)
0ad1fdf  drop zarr 2 in ci (normanrz, Oct 25, 2024)
5a956a9  no zarr2 + make zarr3 a public module (normanrz, Oct 25, 2024)
aa3f708  pre-commit (normanrz, Oct 25, 2024)
8cfce5b  fixes? (normanrz, Oct 25, 2024)
ce6b4b5  fix validate (normanrz, Oct 25, 2024)
6a795b6  fix pcodec test (normanrz, Oct 25, 2024)
26a0374  fix pcodec test (normanrz, Oct 25, 2024)
77f9e05  codecov (normanrz, Oct 25, 2024)
dd8afc5  codecov (normanrz, Oct 25, 2024)
97bea53  fix error match (normanrz, Oct 25, 2024)
795c899  codecov (normanrz, Oct 25, 2024)
0285e22  codecov (normanrz, Oct 25, 2024)
a4ef678  Merge remote-tracking branch 'origin/main' into zarr3-codecs (normanrz, Oct 28, 2024)
da7538e  merge (normanrz, Nov 5, 2024)
bc2e704  coverage (normanrz, Nov 5, 2024)
e31bb2f  wip docs (normanrz, Nov 5, 2024)
b2e18ca  docs and renames all codecs (normanrz, Nov 5, 2024)
c37e7cf  docs (normanrz, Nov 5, 2024)
efcf24f  new zarr beta (normanrz, Nov 5, 2024)
bf9b18e  no zfpy for macos-14 (normanrz, Nov 5, 2024)
73d4dc4  xfail (normanrz, Nov 5, 2024)
71178b0  rm dead code (normanrz, Nov 5, 2024)
d0f2ab9  Update .github/workflows/ci.yaml (normanrz, Nov 6, 2024)
5d03a0f  debug rtd (normanrz, Nov 6, 2024)
9f55819  debug ci (normanrz, Nov 6, 2024)
a774df7  Merge branch 'main' into zarr3-codecs (dstansby, Nov 7, 2024)
9ea916f  Filter warnings in zarr3 tests (dstansby, Nov 7, 2024)
96747f4  Fix warning ignore (dstansby, Nov 8, 2024)
f39b32d  Merge remote-tracking branch 'origin/main' into zarr3-codecs (normanrz, Nov 8, 2024)
153d340  pr feedback (normanrz, Nov 8, 2024)
10 changes: 8 additions & 2 deletions .github/workflows/ci.yaml
@@ -13,7 +13,7 @@ jobs:
       fail-fast: false
       matrix:
         python-version: ["3.11", "3.12", "3.13"]
-        # macos-12 is an intel runner, macos-14 is a arm64 runner
+        # macos-13 is an intel runner, macos-14 is a arm64 runner
         platform: [ubuntu-latest, windows-latest, macos-13, macos-14]

     steps:
@@ -70,10 +70,16 @@ jobs:
           conda activate env
           python -m pip install -v ".[pcodec]"

+      - name: Install zarr-python
+        shell: "bash -l {0}"
+        run: |
+          conda activate env
+          # TODO: remove --pre option when zarr v3 is out
+          python -m pip install --pre zarr

       # This is used to test with zfpy, which does not yet support numpy 2.0
       - name: Install older numpy and zfpy
-        if: matrix.python-version == '3.11'
+        if: matrix.python-version == '3.11' && matrix.platform != 'macos-14'
         shell: "bash -l {0}"
         run: |
           conda activate env
4 changes: 4 additions & 0 deletions .readthedocs.yaml
@@ -7,6 +7,9 @@ build:
   os: ubuntu-20.04
   tools:
     python: "3.12"
+  jobs:
+    post_install:
+      - python -m pip install --pre 'zarr'

 sphinx:
   configuration: docs/conf.py
@@ -19,3 +22,4 @@ python:
         - docs
         - msgpack
         - zfpy
+        - crc32c
1 change: 1 addition & 0 deletions docs/api.rst
@@ -10,3 +10,4 @@ API reference
     checksum32
     abc
     registry
+    zarr3
99 changes: 99 additions & 0 deletions docs/zarr3.rst
@@ -0,0 +1,99 @@
Zarr 3 codecs
=============
.. automodule:: numcodecs.zarr3


Bytes-to-bytes codecs
---------------------
.. autoclass:: Blosc()

    .. autoattribute:: codec_name

.. autoclass:: LZ4()

    .. autoattribute:: codec_name

.. autoclass:: Zstd()

    .. autoattribute:: codec_name

.. autoclass:: Zlib()

    .. autoattribute:: codec_name

.. autoclass:: GZip()

    .. autoattribute:: codec_name

.. autoclass:: BZ2()

    .. autoattribute:: codec_name

.. autoclass:: LZMA()

    .. autoattribute:: codec_name

.. autoclass:: Shuffle()

    .. autoattribute:: codec_name


Array-to-array codecs
---------------------
.. autoclass:: Delta()

    .. autoattribute:: codec_name

.. autoclass:: BitRound()

    .. autoattribute:: codec_name

.. autoclass:: FixedScaleOffset()

    .. autoattribute:: codec_name

.. autoclass:: Quantize()

    .. autoattribute:: codec_name

.. autoclass:: PackBits()

    .. autoattribute:: codec_name

.. autoclass:: AsType()

    .. autoattribute:: codec_name


Bytes-to-bytes checksum codecs
------------------------------
.. autoclass:: CRC32()

    .. autoattribute:: codec_name

.. autoclass:: CRC32C()

    .. autoattribute:: codec_name

.. autoclass:: Adler32()

    .. autoattribute:: codec_name

.. autoclass:: Fletcher32()

    .. autoattribute:: codec_name

.. autoclass:: JenkinsLookup3()

    .. autoattribute:: codec_name


Array-to-bytes codecs
---------------------
.. autoclass:: PCodec()

    .. autoattribute:: codec_name

.. autoclass:: ZFPY()

    .. autoattribute:: codec_name
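
Note on the codec categories in docs/zarr3.rst: the page groups the wrappers into array-to-array, array-to-bytes, and bytes-to-bytes codecs. In a Zarr v3 codec pipeline these run in that order on encode and in reverse on decode. The following is a conceptual sketch of that ordering using only the Python standard library. It does not use the numcodecs or zarr APIs, and every function name in it is hypothetical.

```python
import struct
import zlib


def delta_encode(values: list[int]) -> list[int]:
    """Array-to-array step: keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """Inverse of delta_encode: a running sum restores the original values."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def to_bytes(values: list[int]) -> bytes:
    """Array-to-bytes step: serialise as little-endian signed 64-bit integers."""
    return struct.pack(f"<{len(values)}q", *values)


def from_bytes(raw: bytes) -> list[int]:
    """Inverse of to_bytes."""
    return list(struct.unpack(f"<{len(raw) // 8}q", raw))


def encode(values: list[int]) -> bytes:
    # array-to-array -> array-to-bytes -> bytes-to-bytes
    return zlib.compress(to_bytes(delta_encode(values)))


def decode(blob: bytes) -> list[int]:
    # The same steps, undone in reverse order.
    return delta_decode(from_bytes(zlib.decompress(blob)))


data = list(range(256))
assert decode(encode(data)) == data
```

This matches the ordering used in the tests in this PR, where array-to-array codecs such as `Delta` are listed before `BytesCodec()` and bytes-to-bytes codecs such as `Blosc` after it.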
233 changes: 233 additions & 0 deletions numcodecs/tests/test_zarr3.py
@@ -0,0 +1,233 @@
from __future__ import annotations

import numpy as np
import pytest

import numcodecs.zarr3

zarr = pytest.importorskip("zarr")

pytestmark = pytest.mark.skipif(
    zarr.__version__ < "3.0.0", reason="zarr 3.0.0 or later is required"
)

get_codec_class = zarr.registry.get_codec_class
Array = zarr.Array
JSON = zarr.core.common.JSON
BytesCodec = zarr.codecs.BytesCodec
Store = zarr.abc.store.Store
MemoryStore = zarr.storage.MemoryStore
StorePath = zarr.storage.StorePath


EXPECTED_WARNING_STR = "Numcodecs codecs are not in the Zarr version 3.*"


@pytest.fixture
def store() -> Store:
    return StorePath(MemoryStore(mode="w"))


ALL_CODECS = [getattr(numcodecs.zarr3, cls_name) for cls_name in numcodecs.zarr3.__all__]


@pytest.mark.parametrize("codec_class", ALL_CODECS)
def test_entry_points(codec_class: type[numcodecs.zarr3._NumcodecsCodec]):
    codec_name = codec_class.codec_name
    assert get_codec_class(codec_name) == codec_class


@pytest.mark.parametrize("codec_class", ALL_CODECS)
def test_docstring(codec_class: type[numcodecs.zarr3._NumcodecsCodec]):
    assert "See :class:`numcodecs." in codec_class.__doc__


@pytest.mark.parametrize(
    "codec_class",
    [
        numcodecs.zarr3.Blosc,
        numcodecs.zarr3.LZ4,
        numcodecs.zarr3.Zstd,
        numcodecs.zarr3.Zlib,
        numcodecs.zarr3.GZip,
        numcodecs.zarr3.BZ2,
        numcodecs.zarr3.LZMA,
        numcodecs.zarr3.Shuffle,
    ],
)
def test_generic_codec_class(store: Store, codec_class: type[numcodecs.zarr3._NumcodecsCodec]):
    data = np.arange(0, 256, dtype="uint16").reshape((16, 16))

    with pytest.warns(UserWarning, match=EXPECTED_WARNING_STR):
        a = Array.create(
            store / "generic",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype=data.dtype,
            fill_value=0,
            codecs=[BytesCodec(), codec_class()],
        )

    a[:, :] = data.copy()
    np.testing.assert_array_equal(data, a[:, :])


@pytest.mark.parametrize(
    ("codec_class", "codec_config"),
    [
        (numcodecs.zarr3.Delta, {"dtype": "float32"}),
        (numcodecs.zarr3.FixedScaleOffset, {"offset": 0, "scale": 25.5}),
        (numcodecs.zarr3.FixedScaleOffset, {"offset": 0, "scale": 51, "astype": "uint16"}),
        (numcodecs.zarr3.AsType, {"encode_dtype": "float32", "decode_dtype": "float64"}),
    ],
    ids=[
        "delta",
        "fixedscaleoffset",
        "fixedscaleoffset2",
        "astype",
    ],
)
def test_generic_filter(
    store: Store, codec_class: type[numcodecs.zarr3._NumcodecsCodec], codec_config: dict[str, JSON]
):
    data = np.linspace(0, 10, 256, dtype="float32").reshape((16, 16))

    with pytest.warns(UserWarning, match=EXPECTED_WARNING_STR):
        a = Array.create(
            store / "generic",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype=data.dtype,
            fill_value=0,
            codecs=[
                codec_class(**codec_config),
                BytesCodec(),
            ],
        )

    a[:, :] = data.copy()
    a = Array.open(store / "generic")
    np.testing.assert_array_equal(data, a[:, :])


def test_generic_filter_bitround(store: Store):
    data = np.linspace(0, 1, 256, dtype="float32").reshape((16, 16))

    with pytest.warns(UserWarning, match=EXPECTED_WARNING_STR):
        a = Array.create(
            store / "generic_bitround",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype=data.dtype,
            fill_value=0,
            codecs=[numcodecs.zarr3.BitRound(keepbits=3), BytesCodec()],
        )

    a[:, :] = data.copy()
    a = Array.open(store / "generic_bitround")
    assert np.allclose(data, a[:, :], atol=0.1)


def test_generic_filter_quantize(store: Store):
    data = np.linspace(0, 10, 256, dtype="float32").reshape((16, 16))

    with pytest.warns(UserWarning, match=EXPECTED_WARNING_STR):
        a = Array.create(
            store / "generic_quantize",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype=data.dtype,
            fill_value=0,
            codecs=[numcodecs.zarr3.Quantize(digits=3), BytesCodec()],
        )

    a[:, :] = data.copy()
    a = Array.open(store / "generic_quantize")
    assert np.allclose(data, a[:, :], atol=0.001)


def test_generic_filter_packbits(store: Store):
    data = np.zeros((16, 16), dtype="bool")
    data[0:4, :] = True

    with pytest.warns(UserWarning, match=EXPECTED_WARNING_STR):
        a = Array.create(
            store / "generic_packbits",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype=data.dtype,
            fill_value=0,
            codecs=[numcodecs.zarr3.PackBits(), BytesCodec()],
        )

    a[:, :] = data.copy()
    a = Array.open(store / "generic_packbits")
    np.testing.assert_array_equal(data, a[:, :])

    with pytest.raises(ValueError, match=".*requires bool dtype.*"):
        Array.create(
            store / "generic_packbits_err",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype="uint32",
            fill_value=0,
            codecs=[numcodecs.zarr3.PackBits(), BytesCodec()],
        )


@pytest.mark.parametrize(
    "codec_class",
    [
        numcodecs.zarr3.CRC32,
        numcodecs.zarr3.CRC32C,
        numcodecs.zarr3.Adler32,
        numcodecs.zarr3.Fletcher32,
        numcodecs.zarr3.JenkinsLookup3,
    ],
)
def test_generic_checksum(store: Store, codec_class: type[numcodecs.zarr3._NumcodecsCodec]):
    data = np.linspace(0, 10, 256, dtype="float32").reshape((16, 16))

    with pytest.warns(UserWarning, match=EXPECTED_WARNING_STR):
        a = Array.create(
            store / "generic_checksum",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype=data.dtype,
            fill_value=0,
            codecs=[BytesCodec(), codec_class()],
        )

    a[:, :] = data.copy()
    a = Array.open(store / "generic_checksum")
    np.testing.assert_array_equal(data, a[:, :])


@pytest.mark.parametrize("codec_class", [numcodecs.zarr3.PCodec, numcodecs.zarr3.ZFPY])
def test_generic_bytes_codec(store: Store, codec_class: type[numcodecs.zarr3._NumcodecsCodec]):
    try:
        codec_class()._codec  # noqa: B018
    except ValueError as e:
        if "codec not available" in str(e):
            pytest.xfail(f"{codec_class.codec_name} is not available: {e}")
        else:
            raise  # pragma: no cover
    except ImportError as e:
        pytest.xfail(f"{codec_class.codec_name} is not available: {e}")

    data = np.arange(0, 256, dtype="float32").reshape((16, 16))

    with pytest.warns(UserWarning, match=EXPECTED_WARNING_STR):
        a = Array.create(
            store / "generic",
            shape=data.shape,
            chunk_shape=(16, 16),
            dtype=data.dtype,
            fill_value=0,
            codecs=[
                codec_class(),
            ],
        )

    a[:, :] = data.copy()
    np.testing.assert_array_equal(data, a[:, :])
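
Note on the `skipif` guard at the top of test_zarr3.py: it compares `zarr.__version__` with "3.0.0" as plain strings. String comparison is lexicographic, so it gives the intended answer for versions such as "2.18.3" or "3.0.0b1" but would misorder a hypothetical "10.0.0". The following is a minimal sketch of a numeric comparison using only the standard library. The helper names `release_tuple` and `is_older_than` are hypothetical and not part of this PR, and pre-release suffixes are deliberately ignored.

```python
import re


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. "3.0.0b1" -> (3, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_older_than(version: str, minimum: str) -> bool:
    """Compare two versions numerically, padding the shorter one with zeros."""
    a, b = release_tuple(version), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


# Lexicographic comparison looks at characters, so "1" sorts before "3".
assert "10.0.0" < "3.0.0"
# Numeric comparison orders the same pair the other way round.
assert not is_older_than("10.0.0", "3.0.0")
assert is_older_than("2.18.3", "3.0.0")
# A beta of 3.0.0 is treated as new enough, because the suffix is dropped.
assert not is_older_than("3.0.0b1", "3.0.0")
```

Where the third-party `packaging` library is available, its version parser would be the more complete choice, since it also orders pre-releases correctly.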
13 changes: 13 additions & 0 deletions numcodecs/tests/test_zarr3_import.py
@@ -0,0 +1,13 @@
from __future__ import annotations

import pytest


def test_zarr3_import():
    ERROR_MESSAGE_MATCH = "zarr 3.0.0 or later.*"

    try:
        import zarr  # noqa: F401
    except ImportError:  # pragma: no cover
        with pytest.raises(ImportError, match=ERROR_MESSAGE_MATCH):
            import numcodecs.zarr3  # noqa: F401
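
Note on test_zarr3_import.py: the test expects `import numcodecs.zarr3` to raise an `ImportError` whose message matches "zarr 3.0.0 or later.*" when zarr is not installed. The module's own guard is not part of the files shown here, so the following is only a sketch of one way such a guard could be written. The helper `require_module`, the message text, and the missing module name are all hypothetical.

```python
import importlib
import re


def require_module(name: str, hint: str):
    """Import `name`; on failure, raise ImportError carrying `hint` instead."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(hint) from exc


# Hypothetical message and module name, chosen only for this illustration.
HINT = "zarr 3.0.0 or later is required to use the zarr3 codecs."

try:
    require_module("hypothetical_missing_module_xyz", HINT)
except ImportError as exc:
    # pytest.raises(match=...) applies re.search to str(exception).
    assert re.search("zarr 3.0.0 or later.*", str(exc))
```

Because `match` is a regular expression, the dots in "3.0.0" match any character. Wrapping the literal part in `re.escape` would make the check stricter.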