Commit

move kerchunk discussion to separate section
TomNicholas committed Dec 8, 2024
1 parent 0c80f7b commit c9f5104
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions README.md
@@ -18,13 +18,9 @@

The best way to distribute large scientific datasets is via the Cloud, in [Cloud-Optimized formats](https://guide.cloudnativegeo.org/) [^1]. But often this data is stuck in legacy pre-Cloud file formats such as netCDF.

-**VirtualiZarr makes it easy to create "Virtual" Zarr stores, allowing access to data in legacy formats as if it were in the Cloud-Optimized [Zarr format](https://zarr.dev/), _without duplicating any data_.**
+**VirtualiZarr makes it easy to create "Virtual" Zarr stores, allowing performant access to legacy data as if it were in the Cloud-Optimized [Zarr format](https://zarr.dev/), _without duplicating any data_.**

-VirtualiZarr (pronounced like "virtualizer" but more piratey) grew out of [discussions](https://github.com/fsspec/kerchunk/issues/377) on the [kerchunk repository](https://github.com/fsspec/kerchunk), and is an attempt to provide the game-changing power of kerchunk in a zarr-native way, and with a familiar array-like API.
-
-You now have a choice between using VirtualiZarr and Kerchunk: VirtualiZarr provides [almost all the same features](https://virtualizarr.readthedocs.io/en/latest/faq.html#how-do-virtualizarr-and-kerchunk-compare) as Kerchunk.
-
-**Please see the [documentation](https://virtualizarr.readthedocs.io/en/stable/index.html)**
+Please see the [documentation](https://virtualizarr.readthedocs.io/en/stable/index.html).

[^1]: [_Cloud-Native Repositories for Big Scientific Data_, Abernathey et al., _Computing in Science & Engineering_.](https://ieeexplore.ieee.org/abstract/document/9354557)

@@ -36,6 +32,12 @@ You now have a choice between using VirtualiZarr and Kerchunk: VirtualiZarr prov
* Commit the virtual references to storage either using the [Kerchunk references specification](https://fsspec.github.io/kerchunk/spec.html) or the [Icechunk transactional storage engine](https://icechunk.io/).
* Users access the virtual dataset using [`xarray.open_dataset`](https://docs.xarray.dev/en/stable/generated/xarray.open_dataset.html#xarray.open_dataset).

+### VirtualiZarr vs Kerchunk?
+
+VirtualiZarr (pronounced like "virtualizer" but more piratey) grew out of [discussions](https://github.com/fsspec/kerchunk/issues/377) on the [kerchunk repository](https://github.com/fsspec/kerchunk), and is an attempt to provide the game-changing power of kerchunk in a zarr-native way, and with a familiar array-like API.
+
+You now have a choice between using VirtualiZarr and Kerchunk: VirtualiZarr provides [almost all the same features](https://virtualizarr.readthedocs.io/en/latest/faq.html#how-do-virtualizarr-and-kerchunk-compare) as Kerchunk.
+
### Development Status and Roadmap

VirtualiZarr version 1 (mostly) achieves [feature parity](https://virtualizarr.readthedocs.io/en/latest/faq.html#how-do-virtualizarr-and-kerchunk-compare) with kerchunk's logic for combining datasets, providing an easier way to manipulate kerchunk references in memory and generate kerchunk reference files on disk.
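
As a rough illustration of what a kerchunk-style reference is, the sketch below builds a small set of references in memory, serialises them to JSON, and follows them to read bytes out of an existing file without copying the data. It uses only the Python standard library and is not VirtualiZarr's or kerchunk's API. The file contents, the `temperature/0` style keys, and the `read_chunk` helper are hypothetical, and a real reference set would also carry Zarr metadata entries.

```python
import json
import os
import tempfile

# Hypothetical stand-in for a legacy file: a 4-byte header followed by
# two 8-byte "chunks" of array data.
payload = b"HEAD" + b"chunk-00" + b"chunk-01"

with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as f:
    f.write(payload)
    legacy_path = f.name

# Kerchunk-style references: each chunk key maps to [url, offset, length].
# Key names are illustrative; Zarr metadata entries are omitted for brevity.
refs = {
    "version": 1,
    "refs": {
        "temperature/0": [legacy_path, 4, 8],
        "temperature/1": [legacy_path, 12, 8],
    },
}


def read_chunk(references, key):
    """Fetch the bytes for one chunk by following its byte-range reference."""
    url, offset, length = references["refs"][key]
    with open(url, "rb") as fh:
        fh.seek(offset)
        return fh.read(length)


# The references are small and JSON-serialisable, so they can be written
# to disk and reloaded while the original file stays untouched.
restored = json.loads(json.dumps(refs))

first = read_chunk(restored, "temperature/0")
second = read_chunk(restored, "temperature/1")

os.remove(legacy_path)
```

Only the byte ranges are stored, so the reference set stays tiny relative to the data it points at.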
@@ -66,5 +68,3 @@ This package was originally developed by [Tom Nicholas](https://github.com/TomNi
### Licence

Apache 2.0
-
-### References