funannotate_annotate: run with --writable-tmpfs #1246

Merged
merged 1 commit into master from bgruening-patch-10 on Jul 3, 2024

Conversation

@bgruening (Member)

@sanjaysrikakulam @mira-miracoli anything against this?

I see some read-only /tmp in the logs:

FileNotFoundError: [Errno 2] No such file or directory: '/data/jwd02f/main/071/349/71349627/working/output/predict_misc/protein_alignments.gff3'
galaxy@vgcnbwc-worker-c8m40g1-0000:/data/jwd02f/main/071/349/71349627$ cat outputs/tool_stderr 
-------------------------------------------------------
[Jul 02 02:04 PM]: OS: Debian GNU/Linux 10, 8 cores, ~ 41 GB RAM. Python: 3.8.15
[Jul 02 02:04 PM]: Running funannotate v1.8.15
[Jul 02 02:04 PM]: Skipping CodingQuarry as no --rna_bam passed
[Jul 02 02:04 PM]: Parsed training data, run ab-initio gene predictors as follows:
  Program      Training-Method
  augustus     busco          
  glimmerhmm   busco          
  snap         busco          
[Jul 02 02:11 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 616, in _run_server
    server.serve_forever()
  File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 182, in serve_forever
    sys.exit(0)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/usr/local/lib/python3.8/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/multiprocessing/util.py", line 133, in _remove_temp_dir
    rmtree(tempdir)
  File "/usr/local/lib/python3.8/shutil.py", line 718, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/usr/local/lib/python3.8/shutil.py", line 675, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/usr/local/lib/python3.8/shutil.py", line 673, in _rmtree_safe_fd
    os.unlink(entry.name, dir_fd=topfd)
OSError: [Errno 16] Device or resource busy: '.nfs00000000068a0f1b00000001'
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/local/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 616, in _run_server
    server.serve_forever()
  File "/usr/local/lib/python3.8/multiprocessing/managers.py", line 182, in serve_forever
    sys.exit(0)
SystemExit: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/multiprocessing/util.py", line 300, in _run_finalizers
    finalizer()
  File "/usr/local/lib/python3.8/multiprocessing/util.py", line 224, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.8/multiprocessing/util.py", line 133, in _remove_temp_dir
    rmtree(tempdir)
  File "/usr/local/lib/python3.8/shutil.py", line 718, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/usr/local/lib/python3.8/shutil.py", line 675, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/usr/local/lib/python3.8/shutil.py", line 673, in _rmtree_safe_fd
    os.unlink(entry.name, dir_fd=topfd)
OSError: [Errno 16] Device or resource busy: '.nfs00000000068a0ce900000002'
[Jul 02 02:20 PM]: Genome loaded: 28,306 scaffolds; 775,487,987 bp; 41.27% repeats masked
[Jul 02 02:20 PM]: Mapping 557,291 proteins to genome using diamond and exonerate
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-p2g.py", line 252, in <module>
    os.makedirs(tmpdir)
  File "/usr/local/lib/python3.8/os.py", line 223, in makedirs
    mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '/tmp/p2g_65434e2d-85a3-4a9f-9652-b201aea1592c'
Traceback (most recent call last):
  File "/usr/local/bin/funannotate", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/funannotate/funannotate.py", line 716, in main
    mod.main(arguments)
  File "/usr/local/lib/python3.8/site-packages/funannotate/predict.py", line 1558, in main
    lib.exonerate2hints(Exonerate, hintsP)
  File "/usr/local/lib/python3.8/site-packages/funannotate/library.py", line 4600, in exonerate2hints
    with open(file, "r") as input:
FileNotFoundError: [Errno 2] No such file or directory: '/data/jwd02f/main/071/349/71349627/working/output/predict_misc/protein_alignments.gff3'

@bgruening bgruening marked this pull request as ready for review July 2, 2024 13:37
@sanjaysrikakulam (Member) left a comment

This should be fine. Hopefully, none of the tools we have in the containers misbehaves.

@mira-miracoli (Contributor) commented Jul 2, 2024

Currently we use vda as the device for /tmp, which is the same as the root disk and currently has 50G; however, there is a vdb device for /scratch that has 1 TB capacity.

Maybe it would be safer to use that one?

@bgruening (Member, Author)

How do you interpret the above error? I thought it's not writable at all?

@sanjaysrikakulam (Member) commented Jul 2, 2024

How do you interpret the above error? I thought it's not writable at all?

I interpret it the same way. Docker is configured to use /scratch/docker as its root, so all containers use it to store data, volumes, images, tmp, etc.

--- Edit ---
Sorry, this is about Singularity. I think if we want to use /scratch, then it needs to be bind-mounted as tmp in every Singularity job. I don't know if there is a shortcut.

@sanjaysrikakulam (Member)

To avoid storage problems and arbitrary mount points, can we use the tmp dir in the JWD, for example: $job_directory/tmp in the extra args?
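
As a rough sketch of what that could look like at the Singularity level (my illustration only; the $job_directory variable, the reused image path, and the exact command wiring are assumptions, not something this PR changes):

# Hypothetical sketch: bind a per-job directory over /tmp inside the container,
# either from the JWD or from the large /scratch volume.
mkdir -p "$job_directory/tmp"
singularity exec \
  --bind "$job_directory/tmp:/tmp" \
  /cvmfs/singularity.galaxyproject.org/f/u/funannotate\:1.8.13--pyhdfd78af_0 \
  funannotate predict ...
# alternative bind source on the big disk: --bind "/scratch/<job id>/tmp:/tmp"

In practice this would presumably be injected through the job destination's extra args rather than typed by hand.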

@kysrpex (Contributor) commented Jul 2, 2024

To avoid storage problems and arbitrary mount points, can we use the tmp dir in the JWD, for example: $job_directory/tmp in the extra args?

+ 💯 This would be best imo. For security reasons I do not think you want jobs to share /tmp. It does not seem like Singularity provides any good alternatives. There exists --writable-tmpfs, but it uses memory and provides too little space. Someone proposed --writable-scratch to use disk, but it has not been implemented.

Edit: I see that you are actually enabling --writable-tmpfs via an environment variable, and moreover it is just for a single tool. I think it makes sense to have a look at apptainer/singularity#5718. It's OK anyway: no security concerns apply, and we are for sure better off with it enabled.

@bgruening (Member, Author)

Edit: I see that you are actually enabling --writable-tmpfs via an environment variable, and moreover it is just for a single tool. I think it makes sense to have a look at apptainer/singularity#5718. It's OK anyway: no security concerns apply, and we are for sure better off with it enabled.

Yes. This is really just for tools that have /tmp hardcoded or do other bad stuff. Galaxy should already set TMPDIR and co. to the JWD by default.
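
To illustrate why setting TMPDIR alone is not enough for such tools (my own example with placeholder paths, not taken from the PR): Python's tempfile honours TMPDIR, but funannotate-p2g.py builds a directory name under a hardcoded /tmp, which bypasses it:

$ mkdir -p /data/jwd/tmp        # placeholder for the real JWD tmp directory
$ TMPDIR=/data/jwd/tmp python3 -c 'import tempfile; print(tempfile.gettempdir())'
/data/jwd/tmp
$ TMPDIR=/data/jwd/tmp python3 -c 'import os; os.makedirs("/tmp/p2g_example", exist_ok=True)'

The second call still writes under /tmp regardless of TMPDIR, which is exactly why funannotate-p2g.py dies with "[Errno 30] Read-only file system" in the read-only container.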

@mira-miracoli (Contributor)

There exists --writable-tmpfs, but it uses memory and provides too little space.

We should just keep in mind that the OOM killer can then kill the jobs if we do not provision enough memory for the extra tmp.

Maybe I am blind here, but why do we not mount the tmp from the JWD, as @sanjaysrikakulam suggested?

@kysrpex (Contributor) commented Jul 3, 2024

There exists --writable-tmpfs, but it uses memory and provides too little space.

We should just keep in mind that the OOM killer can then kill the jobs if we do not provision enough memory for the extra tmp.

Maybe I am blind here, but why do we not mount the tmp from the JWD, as @sanjaysrikakulam suggested?

It's not either this or mounting tmp from the JWD; those are not mutually exclusive. You may check out apptainer/singularity#798. Running with --writable-tmpfs means you get a throwaway overlay mounted on / (only 64 MB in size). Thus, if the program tries to write anywhere (no matter whether /tmp or somewhere else), it will be able to do so (up to 64 MB, or more if that location has something else mounted on it). I hope the following example clarifies it:

centos@vgcnbwc-worker-c36m100-0013:~$ singularity shell /cvmfs/singularity.galaxyproject.org/f/u/funannotate\:1.8.13--pyhdfd78af_0 
Singularity> df -h
Filesystem                Size      Used Available Use% Mounted on
overlay                  64.0M     12.0K     64.0M   0% /
devtmpfs                  4.0M         0      4.0M   0% /dev
tmpfs                    48.9G         0     48.9G   0% /dev/shm
/dev/vda1                49.9G      8.9G     41.0G  18% /etc/localtime
/dev/vda1                49.9G      8.9G     41.0G  18% /etc/hosts
/dev/vda1                49.9G      8.9G     41.0G  18% /home/centos
/dev/vda1                49.9G      8.9G     41.0G  18% /tmp
/dev/vda1                49.9G      8.9G     41.0G  18% /var/tmp
tmpfs                    64.0M     12.0K     64.0M   0% /etc/resolv.conf
tmpfs                    64.0M     12.0K     64.0M   0% /etc/passwd
tmpfs                    64.0M     12.0K     64.0M   0% /etc/group
Singularity> echo a > /test
bash: /test: Read-only file system
Singularity> exit
centos@vgcnbwc-worker-c36m100-0013:~$ singularity shell --writable-tmpfs /cvmfs/singularity.galaxyproject.org/f/u/funannotate\:1.8.13--pyhdfd78af_0 
Singularity> df -h
Filesystem                Size      Used Available Use% Mounted on
fuse-overlayfs           64.0M     12.0K     64.0M   0% /
devtmpfs                  4.0M         0      4.0M   0% /dev
tmpfs                    48.9G         0     48.9G   0% /dev/shm
/dev/vda1                49.9G      8.9G     41.0G  18% /etc/localtime
/dev/vda1                49.9G      8.9G     41.0G  18% /etc/hosts
/dev/vda1                49.9G      8.9G     41.0G  18% /home/centos
/dev/vda1                49.9G      8.9G     41.0G  18% /tmp
/dev/vda1                49.9G      8.9G     41.0G  18% /var/tmp
tmpfs                    64.0M     12.0K     64.0M   0% /etc/resolv.conf
tmpfs                    64.0M     12.0K     64.0M   0% /etc/passwd
tmpfs                    64.0M     12.0K     64.0M   0% /etc/group
Singularity> echo a > /test
Singularity> 

Björn is suggesting to enable this for toolshed.g2.bx.psu.edu/repos/iuc/funannotate_annotate/funannotate_annotate/.*. I do not know if that solves the problem, but the cost of letting him try is close to zero.
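
For reference, a hedged sketch of the environment-variable route mentioned above (the SINGULARITY_WRITABLE_TMPFS name relies on Singularity's general convention of mapping CLI flags to SINGULARITY_* variables and is my assumption; the PR's actual wiring may differ):

# Assumed to be equivalent to passing --writable-tmpfs on the command line:
$ SINGULARITY_WRITABLE_TMPFS=true singularity exec \
    /cvmfs/singularity.galaxyproject.org/f/u/funannotate\:1.8.13--pyhdfd78af_0 \
    sh -c 'echo a > /test'      # succeeds only if the throwaway overlay on / is active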

@kysrpex changed the title from "Make /tmp writable" to "funannotate_annotate: run with --writable-tmpfs" on Jul 3, 2024
@mira-miracoli (Contributor)

Thank you, I think I understand this better now. I was confused because I thought it was only about /tmp.
64 MB should be fine memory-wise, if the tmp files the tool creates are small enough, of course.

@sanjaysrikakulam sanjaysrikakulam merged commit f77c166 into master Jul 3, 2024
4 checks passed
@kysrpex kysrpex deleted the bgruening-patch-10 branch July 3, 2024 08:52
@bgruening (Member, Author)

Thanks for merging. Can I redeploy this? We just got another bug report today for this tool with the NFS lock issue :(
