Reorganisation of the MC data directory tree #42
Comments
Thanks, looks good to me. Other opinions?
@Voutsi In your screenshot I see that you are using gzip compression for the simtel files. We should use zstd: it is slightly faster to write, gives somewhat smaller files, and is much faster to read. It is also what is used for standard CTA productions.
Thanks @maxnoe, I was not aware of that. As long as the pipeline can digest .zstd files I agree we should change.
@Voutsi thanks, I agree with the proposed organization, but according to Daniel Mazin's comment from today, you may have trouble writing all the MC in your home folder. Shall we start transferring them elsewhere in the organization tree?
@rlopezcoto the MC is stored in fefs: /fefs/aswg/workspace/georgios.voutsinas/AllSky/. I have a symlink in my home folder pointing to the storage space at fefs, and this is what I showed in the slides today (I agree it was confusing...). Or do you mean that I will also have problems storing it in /fefs/aswg/workspace/georgios.voutsinas/?
No, in the workspace folder it should be fine; there are no limits there so far.
Since those will be the first really official MCs to be used by many analyzers, maybe it would be good for the main path to specify that it is LSTProd2. Another thing: while corsika is expected to be just one directory, sim_telarray will be run multiple times with various settings, so I think it would be good to add to "sim_telarray" some tags describing the time period for which they are produced (dates or analysis periods) and the settings ("nominal", "low_NSB" or something like this).
Hi @jsitarek, sounds good to me. So I will create /fefs/aswg/mc/LSTProd2/, recreate the same directory structure there, and then symlink the data files, configs & logs. Sure, I can add a suffix to the sim_telarray directories. I understand that for now we are producing the nominal ones.
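The mirroring step described above (same directory structure, with symlinks to the data files, configs and logs) could be sketched roughly as follows. This is only a minimal illustration, not the procedure actually used: the function name `mirror_with_symlinks` is hypothetical, and the source and destination roots are passed in by the caller.

```python
import os
from pathlib import Path


def mirror_with_symlinks(src_root, dst_root):
    """Recreate the directory tree of src_root under dst_root and
    symlink every regular file entry into the matching directory.

    Hypothetical helper for illustration only. Directories that are
    themselves symlinks are not descended into (os.walk default).
    Returns the list of links that were newly created.
    """
    src_root = Path(src_root).resolve()
    dst_root = Path(dst_root)
    linked = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        # Path of the current directory relative to the source root
        rel = Path(dirpath).relative_to(src_root)
        target_dir = dst_root / rel
        target_dir.mkdir(parents=True, exist_ok=True)
        for name in filenames:
            link = target_dir / name
            # Skip entries that already exist so the call can be repeated
            if not link.exists() and not link.is_symlink():
                link.symlink_to(Path(dirpath) / name)
                linked.append(link)
    return linked
```

Calling it a second time with the same arguments creates nothing new, so it can be re-run safely as more files are produced.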
Hi @maxnoe, zstd is not installed and I don't have the privileges to install it. Shall we ask the admins, or can someone else install it?
@Voutsi To have it in the system, yes, ask the admins. However, it is also already available in the lstchain conda environments.
Some of this discussion happened in emails; my bad, I missed the discussion in this repo!
A final structure could look like this:
EDIT: Georgios made me realise that declination should be lower in the tree, so I edited the example accordingly.
Currently the data of the training dataset are stored at:
/home/georgios.voutsinas/ws/AllSky/TrainingDataset
There are 2 directories, one for protons and one for diffuse gammas. For each particle type, we have one directory per declination band (the exception is the Crab's band, whose files are stored simply in directories called Corsika & sim_telarray; I will move them this weekend to a directory called dec_2276). Each declination band's directory splits into a Corsika and a sim_telarray directory, and each of these contains one directory per node.
The structure is illustrated in the example directory tree attached below.
Please let me know whether this scheme is satisfactory or whether we should organise the data in a better way.
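For reference, the nesting described above (particle type, then declination band, then Corsika or sim_telarray, then node) can be written as a small path builder. This is only a sketch under assumptions: the function name `node_dir`, the particle directory name "Protons" and the node name "node_example" are hypothetical placeholders; only dec_2276, Corsika and sim_telarray come from the description.

```python
from pathlib import Path

# The two per-band subdirectories named in the description
STAGES = ("Corsika", "sim_telarray")


def node_dir(root, particle, dec_band, stage, node):
    """Build root/particle/dec_band/stage/node (hypothetical helper)."""
    if stage not in STAGES:
        raise ValueError(f"stage must be one of {STAGES}, got {stage!r}")
    return Path(root) / particle / dec_band / stage / node


# "Protons" and "node_example" are placeholder names, not real directories
example = node_dir(
    "/home/georgios.voutsinas/ws/AllSky/TrainingDataset",
    "Protons",
    "dec_2276",
    "sim_telarray",
    "node_example",
)
print(example)
```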