nifH dada2

This is a nifH database formatted for the dada2 pipeline.

The database is based on the June 2017 version of the nifH ARB database from the Zehr Lab (Heller et al. 2014), which was exported from ARB in XML format. The XML file was then modified, updated, and reformatted. The modifications were based on information available on NCBI, and a log of them can be found in this repository (Taxonomic_modifications.pdf).

v1.1.0 has been updated with additional sequences, including nifH homologs from Cluster V (chlL/bchL) (see Documentation).

Database generation

The modified XML file included in this repository (nifH_mod_v1.xml.zip) was used to generate a dada2-formatted database. The XML file contains detailed information for each entry, including the accession number, full taxon name, taxonomy, and the amino acid and nucleotide sequences. The script that parses the data and generates the fasta file is also included in this repository (nifH_xml_to_dada2_fasta.R).

Taxonomy in the fasta file is in the following format:

>Domain; Phylum; Class; Order; Family; Genus
ACCTAGAAAGTCGTAGATCGAAGTTGAAGCATCGCCCGATGATCGTCTGAAGCTGTAGCATGAGTCGATTTTCACATTCAGGGATACCATAGGATAC
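
To illustrate the header format above, here is a minimal Python sketch that reads such records back into a list of ranks and a sequence. It is not part of this repository: the function name `read_dada2_fasta` is hypothetical, the example record is made up for illustration, and the sketch assumes each record is a `>` header with semicolon-separated ranks followed by one or more sequence lines.

```python
from typing import Iterable, Iterator, List, Tuple

# Rank order used in the fasta headers, as described in this README.
RANKS = ["Domain", "Phylum", "Class", "Order", "Family", "Genus"]


def read_dada2_fasta(lines: Iterable[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (ranks, sequence) pairs from dada2-style fasta lines.

    Hypothetical helper: assumes '>' headers with ';'-separated ranks.
    """
    header = None
    seq_parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Emit the previous record before starting a new one.
            if header is not None:
                yield header, "".join(seq_parts)
            # Split on ';' and drop empty fields (e.g. from a trailing ';').
            header = [r.strip() for r in line[1:].split(";") if r.strip()]
            seq_parts = []
        else:
            seq_parts.append(line)
    if header is not None:
        yield header, "".join(seq_parts)


# Made-up example record; the sequence may span several lines.
example = [
    ">Bacteria; Cyanobacteria; Cyanophyceae; Nostocales; Nostocaceae; Nostoc",
    "ACCTAGAAAGTCGTAGATCG",
    "AAGTTGAAGC",
]
records = list(read_dada2_fasta(example))
for ranks, seq in records:
    # Pair each rank name with its value; zip stops at the shorter list.
    print(dict(zip(RANKS, ranks)), len(seq))
```

Because `zip` stops at the shorter list, a header that names fewer than six ranks simply produces a shorter mapping.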

Three versions of the database are included:

Only sequences identified to at least the Phylum level (nifH_dada2_phylum_v1.1.0.fasta) (**recommended version**)
Only sequences identified to at least the Domain level (nifH_dada2_domain_v1.1.0.fasta)
All sequences from the original nifH database (nifH_dada2_all_v1.1.0.fasta)
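
The difference between the three versions can be pictured as a filter on how many ranks each entry names. The Python sketch below is only an illustration under a strong assumption: that unidentified ranks are simply absent from an entry's taxonomy. The repository's R script may represent them differently, and the function name `filter_by_depth` and the sample records are hypothetical.

```python
from typing import List, Tuple

Record = Tuple[List[str], str]  # (ranks from Domain downward, sequence)


def filter_by_depth(records: List[Record], min_ranks: int) -> List[Record]:
    """Keep records whose taxonomy names at least `min_ranks` ranks.

    Assumption: a rank that was not identified is missing from the list.
    """
    return [(tax, seq) for tax, seq in records if len(tax) >= min_ranks]


# Made-up records for illustration only.
records: List[Record] = [
    (["Bacteria", "Proteobacteria", "Gammaproteobacteria"], "ACCTAG"),
    (["Bacteria"], "GATCGA"),
    ([], "TTGAAG"),
]

phylum_or_deeper = filter_by_depth(records, 2)  # Domain + Phylum at minimum
domain_or_deeper = filter_by_depth(records, 1)  # Domain at minimum
all_records = filter_by_depth(records, 0)       # no filtering

print(len(phylum_or_deeper), len(domain_or_deeper), len(all_records))
```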

Please cite

M. A. Moynihan. 2020. nifHdada2 GitHub repository. Zenodo. http://doi.org/10.5281/zenodo.3958370

References

Heller, P., Tripp, H. J., Turk-Kubo, K., & Zehr, J. P. (2014). ARBitrator: A software pipeline for on-demand retrieval of auto-curated nifH sequences from GenBank. Bioinformatics, btu417.

ARB nifH database: https://wwwzehr.pmc.ucsc.edu/nifH_Database_Public/

NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2018;46(D1):D8-D13. doi:10.1093/nar/gkx1095
