Refactoring? #266
Hi @hmaarrfk,
The TIFF format was designed such that it is easy to optimize for specific cases. It is really impossible to efficiently support all TIFF options, TIFF based variants, and handle possible corruptions. What layouts/metadata are you trying to optimize? Maybe it is easier to implement from scratch.
I don't think I have the time to refactor the TiffWriter.write method. It was made to dump numpy arrays contiguously to TIFF. Everything else was squeezed into the method later and somehow works. Unfortunately that makes it impossible to write other interesting layouts such as COG (neither does libtiff I think). If I had the time, I would try to support the COG layout.
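For reference, that fast path looks roughly like this (a minimal sketch; the shape, dtype, and filename are arbitrary):

```python
import numpy as np
import tifffile

# The fast path described above: a numpy array dumped contiguously to TIFF
# in a single call, so the bulk of the work is one sequential write.
data = np.zeros((64, 2048, 2048), dtype=np.uint16)
tifffile.imwrite('stack.tif', data, bigtiff=True)
```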
I would be interested to see those optimizations. I experimented with Windows native file handles, which was significantly faster than the C runtime. The ScanImage reader takes that approach, but it is optimized to read the specific data layout in ScanImage files. The reasons for using FileHandle are in the class docstring. FileHandle is part of the public API and I use it in my other libraries too. For sure FileHandle slows down things and there is room for optimizations.
To be clear, I wouldn't expect you to do this on your own. I might need assistance early on to enable GitHub Actions (or some other CI of your choice) to ensure we are testing the same use cases (I believe your test suite is quite comprehensive). We would be adding test cases as we refactor things to ensure that the behavior stays consistent. If you have other projects on GitHub that use FileHandle, it would help to know about them. To start, I might just need you to enable GitHub Actions, if you aren't opposed to it, and I can start to make PRs. You might need to accept a dummy PR from me to avoid giving me too many rights (sometimes GitHub disallows automatically running Actions on PRs from new contributors to a project).
Generally pretty simple: metadata is pretty limited to ImageJ's baseline functionality for now, but eventually we'll likely opt to support OME-TIFFs (many of our users use ImageJ, so the extra step of using Bio-Formats is somewhat annoying; maybe one day I'll pull out my Java skills and try to help out there). The truth is that on a standard SSD with standard …
I think this would be a shame. Your work has been super useful to me and many others, and I think we can generally help each other build better data management flows.
On my Windows system, I get up to 3.3 GB/s writing a 2 MB image 20000 times in contiguous mode (that is on an encrypted volume, with virus scanning disabled). A potential 2x improvement would be nice.
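A rough reproduction of that benchmark could look like this (a sketch; the exact frame shape is an assumption, and the Windows specifics such as encryption and virus scanning are outside the script):

```python
import time
import numpy as np
import tifffile

# Write a ~2 MB frame 20000 times in contiguous mode and report throughput.
frame = np.zeros((1024, 1024), dtype=np.uint16)  # 2 MiB per frame
start = time.perf_counter()
with tifffile.TiffWriter('bench.tif', bigtiff=True) as tif:
    for _ in range(20000):
        tif.write(frame, contiguous=True)
elapsed = time.perf_counter() - start
print(f'{frame.nbytes * 20000 / elapsed / 1e9:.2f} GB/s')
```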
That should work well with TiffWriter.write. I'm not sure what's missing that makes refactoring necessary?
Do you mean reopening the file and appending pages? That is generally a bad idea, performance wise and metadata may get out of sync.
Mostly czifile (which is tested with test_tifffile) and some private scripts. CziFile has effectively been abandoned for many years. It should be OK to refactor FileHandle.
I do not use GitHub to develop my projects; it's used for publishing the code and for support. Also, testing is local on Windows since most test files are not public and the tests read and write on the order of 100 GB from disk. Can you develop on a fork? The tests should be able to skip all files and features that are inaccessible on CI.
I will hand it to Windows: getting 3.3 GB/s is actually quite hard on Ubuntu when sticking to the default settings. PS. Writing over the same image many times might trigger some shortcuts in your SSD, like writing to a temporary cache.
I can definitely do that, but the first few commits will likely be to set up the CI; these tend to be on the order of 500 LOC to get Windows/macOS/Linux up and running. What would the workflow be?
I'm OK with any workflow of your choosing; this is your project after all.
Even if fixed in numpy 2, I don't think I'm quite ready to remove support for 1.26 yet (I am still on it). It seems like you are open to a few contributions that help the overall flow. Let me try to find some time in the coming weeks/months to start on a few low-hanging fruits.
That might be the case. Or encryption took a shortcut? I had a Python extension using the Windows Driver Kit some time ago to reset the Windows file system cache for specific files.
Yes, thank you. Just ping me or reference this issue if you have something to review.
For a 50 GB file, I got write speeds around 2 GB/s with tifffile. The results are indistinguishable from writing bytes to a Python binary file handle.
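For comparison, that raw baseline can be measured with plain buffered writes (a sketch; the chunk size and filename are arbitrary):

```python
import time
import numpy as np

# Baseline: sequential writes through a plain Python binary file handle,
# with no TIFF structure at all. ~50 GB total, matching the test above.
chunk = np.zeros(64 * 1024 * 1024, dtype=np.uint8)  # 64 MiB buffer
start = time.perf_counter()
with open('raw.bin', 'wb') as fh:
    for _ in range(800):
        fh.write(chunk)
elapsed = time.perf_counter() - start
print(f'{chunk.nbytes * 800 / elapsed / 1e9:.2f} GB/s')
```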
There are 2 tricks I've been promoting for large file access (you can find them if you search for my username across GitHub). Essentially, unaligned accesses kill hardware patterns, so your peripheral needs to transfer through a bounce buffer allocated in the kernel. For this you really want to make room for your metadata to be laid out well, ideally in as few 4096-byte pages of the file as possible, so that it can be read in quickly later. O_DIRECT is admittedly a hack, but I tried the madvise stuff and it didn't have the desired effects. I'm pretty convinced that the tiny ARM processors on the drives are faster than the big overdesigned kernel algorithms for regular access patterns of big image data (large contiguous writes and reads). There are likely a few small changes we can make to make things threadsafe by taking advantage of system calls that take in the file position, as in the sketch below. All in all, tifffile has been rock solid for us. A few small refactors might just make us able to expose more features as we start to enable compression (which can easily break optimization 1).
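A minimal Linux-only sketch of both tricks together, assuming a 4096-byte logical block size (filename and sizes are made up):

```python
import mmap
import os

# Trick 1: O_DIRECT bypasses the page cache, but requires the buffer
# address, file offset, and transfer length to be multiples of the
# logical block size.
# Trick 2: positional syscalls (os.pwrite) avoid the shared seek pointer,
# so multiple threads can write without locking the file handle.
BLOCK = 4096
size = 64 * 1024 * 1024  # a multiple of BLOCK

buf = mmap.mmap(-1, size)  # anonymous mmap is page aligned
buf[:] = b'\xab' * size

fd = os.open('scratch.bin', os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o644)
try:
    os.pwrite(fd, buf, 0)  # offset 0 is aligned; no seek involved
finally:
    os.close(fd)
```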
Also note that without enterprise drives, your SSD can lie to you about sustained speeds: for the first few GB on the smaller consumer-grade drives, and for up to 100-200 GB on the larger ones. It makes benchmarking really annoying.
This tool doesn't run on Linux, but I do think these are numbers that we are capable of approaching. You can also see some benchmarks for consumer SSDs that show they sustain 7 GB/s for about 50 seconds, so maybe 250-400 GB of "faster" storage?
The Python reference code writes should be aligned and are not significantly faster than tifffile. Everything seems dependent on the state of the file system cache. Bypassing the CRT and the file cache seems like the only way to significantly improve things for tifffile, assuming chunk sizes and storage media are already optimal.

TIFF is really not a high performance large file format. The specification recommends 8 KB chunks and requires 2 byte alignment. It's a bad match for the large overhead of Python/numpy/zarr. TIFF readers should make no assumptions about the positions of any structure in the file (except for the file header).

Are you suggesting to align many offsets in the TIFF file (page structures, large tag values, large stripes/tiles) at 4096 instead of 2? Some of those should be easy to add, but at least for stripes/tiles that will probably not work well with tifffile reading. There are some optimizations to read chunked but contiguous image data in larger batches. I never really confirmed how effective they are.

Regarding metadata: often "global" metadata are available in the first page (like ImageJ or OME-XML strings in the first ImageDescription tag), which should make it efficient to read. However, often those metadata are written last, appended to the end of the file. Really bad for performance. Then there are TIFF files where metadata are distributed among millions of pages in the file...
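If it were added, the writer-side change could be as small as padding before large structures (pad_to_boundary is a hypothetical helper, not part of the tifffile API):

```python
def pad_to_boundary(fh, alignment=4096):
    # Advance the file to the next alignment boundary by writing zeros,
    # trading a little slack space for aligned strip/tile offsets.
    pos = fh.tell()
    pad = -pos % alignment
    if pad:
        fh.write(b'\0' * pad)
    return pos + pad
```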
I was testing on the 990 Pro 4 TB and was not able to get anything close to 6-7 GB/s sequential writes (reported by Samsung software) with Python.
I guess I'm in a uniquely privileged position where I can write the TIFF files the way I want and read them the way I want. I may be wrong, but I really do think our application hits above 2 GB/s with your library as a backend, so I don't think it's the metadata handling. Perhaps I can write a few small benchmarks to share. PS. For the 990 Pro, you need to make sure you are on a semi-modern motherboard in a PCIe gen 4 slot. The laptop I have on me is too old to get past gen 3 speeds.
I'm happy to do a video call too if you want to discuss in real time.
I tested a ctypes implementation writing a sector-aligned buffer with Windows' low-level API, along the lines of the sketch below. Re 990 Pro: it's on PCIe gen 4 and Samsung software measures 6.6 GB/s sequential write speed (maybe testing on unencrypted overprovisioning space?).
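A Windows-only sketch of that experiment, bypassing the system cache with FILE_FLAG_NO_BUFFERING (the Windows analogue of O_DIRECT); the constants are from <winbase.h>, the exact flags are my assumption, and error handling is omitted:

```python
import ctypes
import mmap
from ctypes import wintypes

GENERIC_WRITE = 0x40000000
CREATE_ALWAYS = 2
FILE_FLAG_NO_BUFFERING = 0x20000000
FILE_FLAG_WRITE_THROUGH = 0x80000000

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
kernel32.CreateFileW.restype = wintypes.HANDLE  # avoid handle truncation

size = 16 * 1024 * 1024
buf = mmap.mmap(-1, size)  # anonymous mmap is page aligned, which satisfies
                           # the sector alignment requirement

handle = kernel32.CreateFileW(
    'scratch.bin', GENERIC_WRITE, 0, None, CREATE_ALWAYS,
    FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH, None)
written = wintypes.DWORD(0)
kernel32.WriteFile(
    handle, (ctypes.c_char * size).from_buffer(buf), size,
    ctypes.byref(written), None)
kernel32.CloseHandle(handle)
```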
I think there are two things that might be happening here.
I think with my testing on Linux and yours on Windows we may be able to get some serious performance from TIFF. As a start, 2 GB/s is nothing to sneeze at! I'll try to create a benchmarking project. Thrashing my SSDs isn't my favorite activity, so I've dragged my feet on this for a while.
Hey Christoph,
Thank you for championing TIFF in Python all these years! I really don't think the Python scientific ecosystem would be the same without all your efforts.
We are doing some work optimizing reading and writing of TIFF files; often it has to do with the layout of the metadata and how the data is written.
I'm wondering if some refactors would be welcome.
There are 2 places where I think we can help (slowly):

1. tifffile/tifffile/tifffile.py, line 2100 in d173acd: del is usually a telltale sign that you are trying to ensure that temporary variables are not accidentally accessed, which might make this a good place to start refactoring.
2. The write_array function of our FileHandle. However, lines like tifffile/tifffile/tifffile.py, line 4263 in d173acd, mean we have to carry our FileHandle optimizations ahead of working to submit them upstream.

Let me know if you have any appetite for this kind of (collaborative) work.
Best,
Mark