
Added more metadata to Dask Dataframe creation #19

Merged: 3 commits merged into main from fix-fast-zarr-open on Feb 19, 2024

Conversation

alxmrs (Owner) commented on Feb 19, 2024

Fixed #17. It looks like the dataset is, in fact, opened lazily; `len(era5_df)`, however, requires a full scan. I opened #18 to address the length issue.
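A minimal sketch of the distinction above (the `make_chunk` helper and the column name are illustrative, not this project's code): building the Dask DataFrame returns immediately, but `len()` must compute every partition.

```python
import dask.dataframe as dd
import pandas as pd

def make_chunk(i):
    # Stand-in for converting one Zarr/xarray chunk to a pandas DataFrame.
    return pd.DataFrame({"t2m": [280.0 + i, 281.0 + i, 282.0 + i]})

ddf = dd.from_map(make_chunk, range(1_000))  # lazy: returns right away
n = len(ddf)  # full scan: every one of the 1,000 partitions is computed
```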

It should return right away, since we want to convert the chunks lazily. From the profile traces, though, it looks like `to_dd` converts the chunks eagerly.
I found that both `from_delayed` and `from_map` took forever to get the length of the ERA5 dataframe. This looks like a more fundamental issue with Dask DataFrames. Instead, I checked how fast it was to get the columns.
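A hedged sketch of what the PR title suggests: pass explicit metadata (column names and dtypes) when creating the Dask DataFrame, so Dask can answer schema questions like `.columns` without evaluating any chunk. The `make_chunk` helper and the column names here are assumptions for illustration, not the project's actual `to_dd` implementation.

```python
import dask.dataframe as dd
import pandas as pd

def make_chunk(i):
    # Stand-in for converting one Zarr/xarray chunk to a pandas DataFrame.
    return pd.DataFrame({"lat": [0.0, 0.25, 0.5], "t2m": [280.0, 281.0, 282.0]})

# An empty frame describing the schema shared by every chunk.
meta = pd.DataFrame({
    "lat": pd.Series(dtype="float64"),
    "t2m": pd.Series(dtype="float64"),
})

ddf = dd.from_map(make_chunk, range(1_000), meta=meta)  # no chunk is evaluated here
print(ddf.columns)  # answered from `meta`; returns immediately
# len(ddf) still requires a full scan -- that is tracked in #18.
```

`dd.from_delayed` accepts the same `meta=` keyword, so the idea applies to either construction path mentioned above.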
alxmrs merged commit 7da2184 into main on Feb 19, 2024
alxmrs deleted the fix-fast-zarr-open branch on February 19, 2024 at 11:14
Development

Successfully merging this pull request may close these issues:

Opening large Zarr datasets should be lazy (and fast)