Skip to content

BioPython

Meg Staton edited this page Oct 20, 2020 · 4 revisions

Biopython

Demo on ISAAC

module load biopython/1.72

Example script 1 - Lets create a sequence object and do some common sequence manipulations.

from Bio.Seq import Seq
my_seq = Seq("AGTACACTGGT")
print(my_seq)
print(my_seq.complement())
print(my_seq.reverse_complement())
print(my_seq)

Example script 2 - We'll parse through the sequences in a fasta file and print their lengths. Get the fasta file first.

On command line:

wget https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta

How many sequences does it have?

grep -c '^>' ls_orchid.fasta

And the script:

from Bio import SeqIO

for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
    print(seq_record.id)
    print(len(seq_record))

What if we wanted the count the number of sequences?

from Bio import SeqIO

seq_count = 0
for seq_record in SeqIO.parse("ls_orchid.fasta", "fasta"):
    print(seq_record.id)
    print(len(seq_record))
    seq_count += 1

print("The number of sequences is: " + str(seq_count))