Allows for CDS (as well as gene) features to generate a new gene reference

The newreference.py script processes a GenBank file, producing new gene
reference files (GenBank and FASTA) which are required for creating the
Nextstrain gene tree views.

Because we intend to use this script as a template for other viruses (e.g. dengue, measles),
it needs to accept CDS features in the GenBank file, not just gene features.
j23414 committed Mar 8, 2024
1 parent b9f4f4f commit 570daa6
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion scripts/newreference.py
@@ -6,10 +6,12 @@

 def new_reference(referencefile, outgenbank, outfasta, gene):
     ref = SeqIO.read(referencefile, "genbank")
+    startofgene = None
+    endofgene = None
     for feature in ref.features:
         if feature.type == 'source':
             ref_source_feature = feature
-        if feature.type =='gene':
+        if feature.type =='gene' or feature.type == 'CDS':
             a = list(feature.qualifiers.items())[0][-1][0]
             if a == gene:
                 startofgene = int(list(feature.location)[0])
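
The matching logic in this hunk can be tried out without a GenBank file. The following is a minimal sketch, not the script itself: `Feature` is a hypothetical stand-in for Biopython's feature objects, `find_gene_bounds` is an illustrative name, and the handling of `end` and of empty qualifiers is assumed, since the rest of the function is not shown in the diff. It illustrates why `startofgene` and `endofgene` are initialised to `None` (so they are defined when nothing matches) and that the comparison uses the first value of the first qualifier, whatever that qualifier happens to be.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Hypothetical stand-in for a Biopython feature (not the real class)."""
    type: str
    start: int
    end: int
    qualifiers: dict = field(default_factory=dict)


def find_gene_bounds(features, gene):
    """Return (start, end) of the last 'gene' or 'CDS' feature whose first
    qualifier value equals `gene`, or (None, None) if nothing matches."""
    startofgene = None
    endofgene = None
    for feature in features:
        if feature.type in ("gene", "CDS"):
            if not feature.qualifiers:
                continue  # assumption: skip features with no qualifiers
            # Same lookup as the diff: first value of the first qualifier.
            first_value = list(feature.qualifiers.items())[0][-1][0]
            if first_value == gene:
                startofgene = feature.start
                endofgene = feature.end  # assumption: end handling not shown in the diff
    return startofgene, endofgene


# Made-up example features; names and coordinates are illustrative only.
features = [
    Feature("source", 0, 1000, {"organism": ["example virus"]}),
    Feature("CDS", 100, 400, {"gene": ["E"], "product": ["envelope protein"]}),
    Feature("gene", 500, 800, {"gene": ["NS1"]}),
]
```

Because the lookup depends on qualifier order, a CDS whose first qualifier is something other than the gene name (for example a locus tag) would not match; reading `feature.qualifiers.get("gene")` explicitly would be one possible alternative.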