
Troubleshoot ADCP dataset #17

Open
pramod-thupaki opened this issue Aug 28, 2020 · 2 comments

pramod-thupaki (Contributor) commented Aug 28, 2020

Some ADCP files** are not being imported into the ERDDAP dataset. Initial tests by @n-a-t-e suggest that this is due to insufficient memory while the ERDDAP dataset is being constructed.

Other points:

  • The ADCP dataset is the largest of the IOS datasets
  • Dividing the millar1* file into 3 smaller files seems to get around the problem, though it's not clear why. Splitting the file is not an ideal solution; ERDDAP ought to be able to handle much larger files. (A sketch of one way to do the split follows the file list below.)

** Problem file(s):
millar1_20171007_20181015_0018m.adcp.L1.nc
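
For reference, here's one way to do that kind of split (a sketch only - it assumes Python with xarray installed, a time dimension literally named "time", and hypothetical output file names; it may not match how the file was actually divided):

```python
# Sketch: split a NetCDF file into smaller files along the time dimension.
# Assumes xarray/netCDF4 are installed and the time dimension is named "time";
# the part count and output names are hypothetical.
import numpy as np
import xarray as xr

ds = xr.open_dataset("millar1_20171007_20181015_0018m.adcp.L1.nc")
n_parts = 3  # 3 smaller files reportedly got around the problem
for i, idx in enumerate(np.array_split(np.arange(ds.sizes["time"]), n_parts)):
    ds.isel(time=idx).to_netcdf(f"millar1_part{i + 1}.adcp.L1.nc")
ds.close()
```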

pramod-thupaki added the bug (Something isn't working) label on Aug 28, 2020
sjbruce commented Aug 28, 2020

Last year at the Ann Arbor Code Sprint I asked Bob about the size of files in ERDDAP - whether it's better to use a few large files or many smaller ones. His answer was that it's better to use many smaller files; according to him, this is true whether the files are local or remote (and especially when retrieving them from a remote location like Amazon S3).

ERDDAP does some internal indexing to map which files contain which data, so many smaller files actually end up being more efficient.

From the release notes for the most recent version (2.02):

If it is convenient, it's still always a good idea to split huge tabular data files into several smaller files based on some criteria like stationID and/or time. ERDDAP will often only have to open one of the small files in response to a user's request, and thus be able to respond much faster.

Also, you might want to try bumping the memory available to ERDDAP up to 8GB (or higher); that might help as well - when in doubt, throw more RAM at the problem!
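
For ERDDAP running under Tomcat, that usually means editing tomcat/bin/setenv.sh - something along these lines (a sketch; the exact flags, values, and path depend on your deployment):

```sh
# tomcat/bin/setenv.sh - sketch only; values depend on the deployment.
# -Xms/-Xmx set the JVM's initial and maximum heap (8GB here, per the
# suggestion above).
export JAVA_OPTS='-server -Djava.awt.headless=true -Xms8G -Xmx8G'
```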

Found this thread on the ERDDAP Google Group that goes into more detail on why to split up large files: https://groups.google.com/g/erddap/c/OaX7JjV18pg?pli=1

n-a-t-e (Member) commented Sep 8, 2020

This was a strange one, as the problematic file isn't really that big, and also isn't the largest one in the dataset. I tried bumping the RAM to 12GB and ERDDAP still crashed. But splitting up the file does make it work, so we can do that for now.
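
One thing worth checking is the uncompressed size: a compressed NetCDF file can expand many times over when read, so a file that looks small on disk can still blow the heap. A quick comparison (a sketch, assuming Python with xarray; the path is just the file named above):

```python
# Sketch: compare the problem file's on-disk size to its uncompressed
# in-memory size. Heavy compression would explain crashes on a file
# that "isn't really that big".
import os
import xarray as xr

path = "millar1_20171007_20181015_0018m.adcp.L1.nc"
ds = xr.open_dataset(path)
print(f"on disk:   {os.path.getsize(path) / 1e9:.2f} GB")
print(f"in memory: {ds.nbytes / 1e9:.2f} GB")
print(dict(ds.sizes))  # dimension lengths, e.g. time, depth
ds.close()
```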
