adds checkm #773

Merged
merged 4 commits into from
Oct 31, 2023

Conversation

Kincekara
Collaborator

Closes #766

Pull Request (PR) checklist:

  • Include a description of what is in this pull request in this message.
  • The dockerfile successfully builds to a test target for the user creating the PR. (i.e. docker build --tag samtools:1.15test --target test docker-builds/samtools/1.15 )
  • Directory structure as name of the tool in lower case with special characters removed with a subdirectory of the version number (i.e. spades/3.12.0/Dockerfile)
    • (optional) All test files are located in same directory as the Dockerfile (i.e. shigatyper/2.0.1/test.sh)
  • Create a simple container-specific README.md in the same directory as the Dockerfile (i.e. spades/3.12.0/README.md)
    • If this README is longer than 30 lines, there is an explanation as to why more detail was needed
  • Dockerfile includes the recommended LABELS
  • Main README.md has been updated to include the tool and/or version of the dockerfile(s) in this PR
  • Program_Licenses.md contains the tool(s) used in this PR and has been updated for any missing

@erinyoung
Contributor

Was this the step that was killed due to memory errors? It looks similar to Ecogenomics/CheckM#372.

#10 [test 2/2] RUN checkm test ./checkm_test_results &&    tail -n2 ./checkm_test_results/checkm.log &&    head ./checkm_test_results/results/qa_test.tsv
#10 0.870 [2023-10-26 16:57:11] INFO: CheckM v1.2.2
#10 0.871 [2023-10-26 16:57:11] INFO: checkm test ./checkm_test_results
#10 0.871 [2023-10-26 16:57:11] INFO: CheckM data: /data/checkm_db
#10 0.871 [2023-10-26 16:57:11] INFO: [CheckM - Test] Processing E.coli K12-W3310 to verify operation of CheckM.
#10 0.871 [2023-10-26 16:57:11] INFO: [Step 1]: Verifying tree command.
#10 0.871 [2023-10-26 16:57:11] INFO: [CheckM - tree] Placing bins in reference genome tree.
#10 1.004 [2023-10-26 16:57:11] INFO: Identifying marker genes in 1 bins with 1 threads:
#10 1.038     Finished processing 0 of 1 (0.00%) bins.
    Finished processing 1 of 1 (100.00%) bins.
#10 17.80 [2023-10-26 16:57:28] INFO: Saving HMM info to file.
#10 17.81 [2023-10-26 16:57:28] INFO: Calculating genome statistics for 1 bins with 1 threads:
#10 17.81     Finished processing 0 of 1 (0.00%) bins.
    Finished processing 1 of 1 (100.00%) bins.
#10 18.03 [2023-10-26 16:57:28] INFO: Extracting marker genes to align.
#10 18.03 [2023-10-26 16:57:28] INFO: Parsing HMM hits to marker genes:
#10 18.03     Finished parsing hits for 1 of 1 (100.00%) bins.
#10 18.11 [2023-10-26 16:57:28] INFO: Extracting 43 HMMs with 1 threads:
#10 18.12     Finished extracting 0 of 43 (0.00%) HMMs.
    Finished extracting 1 of 43 (2.33%) HMMs.
    Finished extracting 2 of 43 (4.65%) HMMs.
    Finished extracting 3 of 43 (6.98%) HMMs.
    Finished extracting 4 of 43 (9.30%) HMMs.
    Finished extracting 5 of 43 (11.63%) HMMs.
    Finished extracting 6 of 43 (13.95%) HMMs.
    Finished extracting 7 of 43 (16.28%) HMMs.
    Finished extracting 8 of 43 (18.60%) HMMs.
    Finished extracting 9 of 43 (20.93%) HMMs.
    Finished extracting 10 of 43 (23.26%) HMMs.
    Finished extracting 11 of 43 (25.58%) HMMs.
    Finished extracting 12 of 43 (27.91%) HMMs.
    Finished extracting 13 of 43 (30.23%) HMMs.
    Finished extracting 14 of 43 (32.56%) HMMs.
    Finished extracting 15 of 43 (34.88%) HMMs.
    Finished extracting 16 of 43 (37.21%) HMMs.
    Finished extracting 17 of 43 (39.53%) HMMs.
    Finished extracting 18 of 43 (41.86%) HMMs.
    Finished extracting 19 of 43 (44.19%) HMMs.
    Finished extracting 20 of 43 (46.51%) HMMs.
    Finished extracting 21 of 43 (48.84%) HMMs.
    Finished extracting 22 of 43 (51.16%) HMMs.
    Finished extracting 23 of 43 (53.49%) HMMs.
    Finished extracting 24 of 43 (55.81%) HMMs.
    Finished extracting 25 of 43 (58.14%) HMMs.
    Finished extracting 26 of 43 (60.47%) HMMs.
    Finished extracting 27 of 43 (62.79%) HMMs.
    Finished extracting 28 of 43 (65.12%) HMMs.
    Finished extracting 29 of 43 (67.44%) HMMs.
    Finished extracting 30 of 43 (69.77%) HMMs.
    Finished extracting 31 of 43 (72.09%) HMMs.
    Finished extracting 32 of 43 (74.42%) HMMs.
    Finished extracting 33 of 43 (76.74%) HMMs.
    Finished extracting 34 of 43 (79.07%) HMMs.
    Finished extracting 35 of 43 (81.40%) HMMs.
    Finished extracting 36 of 43 (83.72%) HMMs.
    Finished extracting 37 of 43 (86.05%) HMMs.
    Finished extracting 38 of 43 (88.37%) HMMs.
    Finished extracting 39 of 43 (90.70%) HMMs.
    Finished extracting 40 of 43 (93.02%) HMMs.
    Finished extracting 41 of 43 (95.35%) HMMs.
    Finished extracting 42 of 43 (97.67%) HMMs.
    Finished extracting 43 of 43 (100.00%) HMMs.
#10 18.46 [2023-10-26 16:57:28] INFO: Aligning 43 marker genes with 1 threads:
#10 18.47     Finished aligning 0 of 43 (0.00%) marker genes.
    Finished aligning 1 of 43 (2.33%) marker genes.
    Finished aligning 2 of 43 (4.65%) marker genes.
    Finished aligning 3 of 43 (6.98%) marker genes.
    Finished aligning 4 of 43 (9.30%) marker genes.
    Finished aligning 5 of 43 (11.63%) marker genes.
    Finished aligning 6 of 43 (13.95%) marker genes.
    Finished aligning 7 of 43 (16.28%) marker genes.
    Finished aligning 8 of 43 (18.60%) marker genes.
    Finished aligning 9 of 43 (20.93%) marker genes.
    Finished aligning 10 of 43 (23.26%) marker genes.
    Finished aligning 11 of 43 (25.58%) marker genes.
    Finished aligning 12 of 43 (27.91%) marker genes.
    Finished aligning 13 of 43 (30.23%) marker genes.
    Finished aligning 14 of 43 (32.56%) marker genes.
    Finished aligning 15 of 43 (34.88%) marker genes.
    Finished aligning 16 of 43 (37.21%) marker genes.
    Finished aligning 17 of 43 (39.53%) marker genes.
    Finished aligning 18 of 43 (41.86%) marker genes.
    Finished aligning 19 of 43 (44.19%) marker genes.
    Finished aligning 20 of 43 (46.51%) marker genes.
    Finished aligning 21 of 43 (48.84%) marker genes.
    Finished aligning 22 of 43 (51.16%) marker genes.
    Finished aligning 23 of 43 (53.49%) marker genes.
    Finished aligning 24 of 43 (55.81%) marker genes.
    Finished aligning 25 of 43 (58.14%) marker genes.
    Finished aligning 26 of 43 (60.47%) marker genes.
    Finished aligning 27 of 43 (62.79%) marker genes.
    Finished aligning 28 of 43 (65.12%) marker genes.
    Finished aligning 29 of 43 (67.44%) marker genes.
    Finished aligning 30 of 43 (69.77%) marker genes.
    Finished aligning 31 of 43 (72.09%) marker genes.
    Finished aligning 32 of 43 (74.42%) marker genes.
    Finished aligning 33 of 43 (76.74%) marker genes.
    Finished aligning 34 of 43 (79.07%) marker genes.
    Finished aligning 35 of 43 (81.40%) marker genes.
    Finished aligning 36 of 43 (83.72%) marker genes.
    Finished aligning 37 of 43 (86.05%) marker genes.
    Finished aligning 38 of 43 (88.37%) marker genes.
    Finished aligning 39 of 43 (90.70%) marker genes.
    Finished aligning 40 of 43 (93.02%) marker genes.
    Finished aligning 41 of 43 (95.35%) marker genes.
    Finished aligning 42 of 43 (97.67%) marker genes.
    Finished aligning 43 of 43 (100.00%) marker genes.
#10 19.29 [2023-10-26 16:57:29] INFO: Reading marker alignment files.
#10 19.29 [2023-10-26 16:57:29] INFO: Concatenating alignments.
#10 19.30 [2023-10-26 16:57:29] INFO: Placing 1 bins into the genome tree with pplacer (be patient).
#10 96.69 Killed
#10 96.78 Uncaught exception: Sys_error("./checkm_test_results/results/storage/tree/concatenated.pplacer.json: No such file or directory")
#10 96.78 Fatal error: exception Sys_error("./checkm_test_results/results/storage/tree/concatenated.pplacer.json: No such file or directory")
#10 96.89 [2023-10-26 16:58:47] INFO: { Current stage: 0:01:35.926 || Total: 0:01:35.926 }
#10 97.48 [2023-10-26 16:58:47] INFO: [Passed]
#10 97.48 [2023-10-26 16:58:47] INFO: [Step 2]: Verifying tree_qa command.
#10 97.48 [2023-10-26 16:58:47] INFO: [CheckM - tree_qa] Assessing phylogenetic markers found in each bin.
#10 97.49 [2023-10-26 16:58:47] INFO: Reading HMM info from file.
#10 97.54 [2023-10-26 16:58:47] INFO: Parsing HMM hits to marker genes:
#10 97.54     Finished parsing hits for 1 of 1 (100.00%) bins.
#10 97.78 
#10 97.78 Unexpected error: <class 'FileNotFoundError'>
#10 97.78 Traceback (most recent call last):
#10 97.78   File "/usr/local/bin/checkm", line 856, in <module>
#10 97.79     checkmParser.parseOptions(args)
#10 97.79   File "/usr/local/lib/python3.10/dist-packages/checkm/main.py", line 1031, in parseOptions
#10 97.80     self.test(options)
#10 97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/main.py", line 940, in test
#10 97.80     verifyEcoli.run(self, options.output_dir)
#10 97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/test/test_ecoli.py", line 75, in run
#10 97.80     parser.treeQA(options)
#10 97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/main.py", line 225, in treeQA
#10 97.80     treeParser.printSummary(
#10 97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/treeParser.py", line 45, in printSummary
#10 97.80     self.reportBinTaxonomy(outDir, resultsParser, bTabTable, outFile, binStats, bLineageStatistics=False)
#10 97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/treeParser.py", line 641, in reportBinTaxonomy
#10 97.80     binIdToTaxonomy = self.getBinTaxonomy(outDir, binIds)
#10 97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/treeParser.py", line 191, in getBinTaxonomy
#10 97.80     tree = dendropy.Tree.get_from_path(treeFile, schema='newick', rooting="force-rooted", preserve_underscores=True)
#10 97.80   File "/usr/local/lib/python3.10/dist-packages/dendropy/datamodel/basemodel.py", line 217, in get_from_path
#10 97.81     with open(src, *open_args) as fsrc:
#10 97.81 FileNotFoundError: [Errno 2] No such file or directory: './checkm_test_results/results/storage/tree/concatenated.tre'
#10 ERROR: process "/bin/sh -c checkm test ./checkm_test_results &&    tail -n2 ./checkm_test_results/checkm.log &&    head ./checkm_test_results/results/qa_test.tsv" did not complete successfully: exit code: 1
------
 > [test 2/2] RUN checkm test ./checkm_test_results &&    tail -n2 ./checkm_test_results/checkm.log &&    head ./checkm_test_results/results/qa_test.tsv:
97.80     treeParser.printSummary(
97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/treeParser.py", line 45, in printSummary
97.80     self.reportBinTaxonomy(outDir, resultsParser, bTabTable, outFile, binStats, bLineageStatistics=False)
97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/treeParser.py", line 641, in reportBinTaxonomy
97.80     binIdToTaxonomy = self.getBinTaxonomy(outDir, binIds)
97.80   File "/usr/local/lib/python3.10/dist-packages/checkm/treeParser.py", line 191, in getBinTaxonomy
97.80     tree = dendropy.Tree.get_from_path(treeFile, schema='newick', rooting="force-rooted", preserve_underscores=True)
97.80   File "/usr/local/lib/python3.10/dist-packages/dendropy/datamodel/basemodel.py", line 217, in get_from_path
97.81     with open(src, *open_args) as fsrc:
97.81 FileNotFoundError: [Errno 2] No such file or directory: './checkm_test_results/results/storage/tree/concatenated.tre'
------
WARNING: local cache import at /tmp/.buildx-cache-checkm not found due to err: could not read /tmp/.buildx-cache-checkm/index.json: open /tmp/.buildx-cache-checkm/index.json: no such file or directory
Dockerfile:48
--------------------
  47 |     # run an internal test
  48 | >>> RUN checkm test ./checkm_test_results &&\
  49 | >>>     tail -n2 ./checkm_test_results/checkm.log &&\
  50 | >>>     head ./checkm_test_results/results/qa_test.tsv
  51 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c checkm test ./checkm_test_results &&    tail -n2 ./checkm_test_results/checkm.log &&    head ./checkm_test_results/results/qa_test.tsv" did not complete successfully: exit code: 1
Error: buildx failed with: ERROR: failed to solve: process "/bin/sh -c checkm test ./checkm_test_results &&    tail -n2 ./checkm_test_results/checkm.log &&    head ./checkm_test_results/results/qa_test.tsv" did not complete successfully: exit code: 1
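
In the log above, the pplacer step ends with a bare `Killed` line, CheckM still reports `[Passed]` for Step 1, and the build only fails in Step 2 when `concatenated.tre` is missing. A minimal sketch for spotting such kills in a saved build log follows. The function name `find_killed_steps` and the file name `build.log` are hypothetical, and the pattern assumes the BuildKit `#<step> <seconds> <message>` layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: print build-log lines that report a killed process.
# Assumes the BuildKit "#<step> <seconds> <message>" layout seen in the log above.
find_killed_steps() {
    log_file="$1"
    # Match lines whose whole message is "Killed", e.g. "#10 96.69 Killed".
    # grep exits non-zero when no such line exists.
    grep -E '^#[0-9]+ [0-9.]+ Killed$' "$log_file"
}

# Usage (build.log is an assumed file name):
#   docker build --progress=plain --target test . 2>&1 | tee build.log
#   find_killed_steps build.log
```

A match does not prove an out-of-memory condition on its own, but a `Killed` line right before a missing-output error is consistent with the memory problem suspected here.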

@Kincekara
Collaborator Author

@erinyoung Thank you! I'll try to find a workaround.

@Kincekara
Collaborator Author

taxonomy_wf requires less memory, so I replaced lineage_wf with taxonomy_wf in the test. This is ready for review.
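
For readers unfamiliar with the two workflows: `taxonomy_wf` uses a taxon-specific marker set and does not run the pplacer tree placement that was killed in the log above. The exact command used in this PR is not shown in the conversation, so the following is only a sketch. The rank and taxon (`domain Bacteria`), the `fna` extension, the directory names, and the `CHECKM` override variable are all illustrative assumptions.

```shell
#!/bin/sh
# Sketch of a lower-memory CheckM smoke test (not the PR's actual test).
# Assumptions: rank/taxon "domain Bacteria", bins with an "fna" extension,
# and caller-supplied input/output directories.
# CHECKM may be overridden (e.g. CHECKM=echo for a dry run); defaults to "checkm".
run_taxonomy_wf_test() {
    bins_dir="$1"
    out_dir="$2"
    # taxonomy_wf takes: <rank> <taxon> <bin_input> <output_dir>
    "${CHECKM:-checkm}" taxonomy_wf -x fna -t 1 domain Bacteria "$bins_dir" "$out_dir"
}

# Usage:
#   run_taxonomy_wf_test ./bins ./checkm_taxonomy_results
```

Setting `CHECKM=echo` prints the command instead of running it, which is a cheap way to check the arguments before committing to a full build.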

@Kincekara Kincekara marked this pull request as ready for review October 31, 2023 13:12
Contributor

@erinyoung erinyoung left a comment

I have no changes to recommend.

I'm going to

  1. merge this PR
  2. deploy to dockerhub and quay using both the '1.2.2' and 'latest' tags
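
As a rough illustration of what "both registries, both tags" amounts to, the sketch below lists the four resulting image references. The repository path `staphb/checkm` and the function name `list_image_refs` are assumptions for illustration; the actual deploy runs through the project's GitHub Actions workflow, which is not shown here.

```shell
#!/bin/sh
# Print the image references implied by deploying one version to two
# registries under both a version tag and "latest".
# The "staphb/checkm" repository path is an assumption for illustration.
list_image_refs() {
    version="$1"
    for registry in docker.io quay.io; do
        for tag in "$version" latest; do
            echo "${registry}/staphb/checkm:${tag}"
        done
    done
}

list_image_refs 1.2.2
```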

@erinyoung
Contributor

I'm glad you were able to find a lower-memory test!

@erinyoung erinyoung merged commit 55d0b32 into StaPH-B:master Oct 31, 2023
2 checks passed
@erinyoung
Contributor

Thank you for your contribution!

You can check the status of the deploy here : https://github.com/StaPH-B/docker-builds/actions/runs/6710642946

@Kincekara Kincekara deleted the checkm branch October 31, 2023 18:14
Development

Successfully merging this pull request may close these issues.

[Container Request]: CheckM
2 participants