hadeelsharaf/Bioinformatic-docs

Python for Bioinformatics

Bioinformatics is the field that develops methods and software tools for understanding biological data. Units 6 and 7 in this course will help with understanding the basics of biology for this field.

Next-generation sequencing (NGS) is one of the fundamental technological developments in the field. Whole-genome sequencing (WGS), restriction site-associated DNA sequencing (RAD-Seq), ribonucleic acid sequencing (RNA-Seq), chromatin immunoprecipitation sequencing (ChIP-Seq), and several other technologies are routinely used to investigate important biological problems. Together these are called high-throughput (HT) sequencing technologies. See this for a Python package that helps with HT sequencing data.

DNA in text files is represented as a string made up of a small set of characters (A, C, G, and T), so knowing about the following topics will be helpful:

  • File processing (txt and csv).
  • String and Regex functions.
  • BioPython.
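Because a DNA sequence is just a string, ordinary string and regex tools apply to it directly. The following is a minimal sketch, assuming only the four unambiguous bases are allowed; the names `DNA_PATTERN` and `is_dna` are hypothetical and chosen for illustration.

```python
import re

# Assumption: a valid sequence contains only A, C, G and T (either case).
DNA_PATTERN = re.compile(r"[ACGT]+", re.IGNORECASE)


def is_dna(text: str) -> bool:
    """Return True if text consists only of the characters A, C, G and T."""
    return DNA_PATTERN.fullmatch(text.strip()) is not None


print(is_dna("ACGTTGCA"))  # True
print(is_dna("ACGU"))      # False, U belongs to RNA
```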

Examples of Bioinformatics Functions:

  • Counting bases in a DNA sequence (Tetranucleotide Frequency): Code1

  • Reverse Complement of DNA:

    Code3

  • Computing GC Content: A higher GC content level indicates a relatively higher melting temperature in molecular biology, and DNA sequences that encode proteins tend to be found in GC-rich regions.

    Code4

  • Transcribing DNA into mRNA: to make protein, regions of DNA must first be transcribed into a form of RNA called messenger RNA (mRNA). Code5

  • Translating mRNA into Protein: the mRNA is read three bases (one codon) at a time, and each codon specifies one amino acid of the protein.

    Code6

    Note on transcription and translation (points 4 and 5): both can be done with string replacement and regex, but using BioPython is the recommended approach.

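  • Finding Open Reading Frames (ORFs): an ORF is a region of DNA or RNA that can be translated into protein. In a translated sequence it can be found with a regex: the region starts with M (methionine, produced by the start codon) and ends with * (a stop codon).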

    Code7

    The following section is applied after a series of transcription and translation steps:

    Code8
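The functions listed above can be sketched with the standard library alone. This is an illustrative sketch and not the repository's own code: the README recommends BioPython for transcription and translation, and all function names below are hypothetical. It assumes sequences contain only unambiguous bases and uses the standard genetic code, with `*` marking a stop codon.

```python
import re
from collections import Counter
from itertools import product


def count_bases(dna):
    """Count how often each of A, C, G and T occurs."""
    counts = Counter(dna.upper())
    return {base: counts[base] for base in "ACGT"}


def reverse_complement(dna):
    """Complement every base, then reverse the string."""
    complement = str.maketrans("ACGT", "TGCA")
    return dna.upper().translate(complement)[::-1]


def gc_content(dna):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna) if dna else 0.0


def transcribe(dna):
    """Coding-strand DNA to mRNA: replace T with U."""
    return dna.upper().replace("T", "U")


# Standard genetic code, codons enumerated in U, C, A, G order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate mRNA codon by codon; trailing bases that do not fill a codon are ignored."""
    mrna = mrna.upper()
    usable = len(mrna) - len(mrna) % 3
    return "".join(CODON_TABLE[mrna[i:i + 3]] for i in range(0, usable, 3))


def find_orfs(protein):
    """Non-overlapping regions that start with M and end with '*'."""
    return re.findall(r"M[^*]*\*", protein)


dna = "ATGGCCTAA"
print(count_bases(dna))             # {'A': 3, 'C': 2, 'G': 2, 'T': 2}
print(reverse_complement(dna))      # TTAGGCCAT
print(transcribe(dna))              # AUGGCCUAA
print(translate(transcribe(dna)))   # MA*
print(find_orfs("KMA*LLMGG*"))      # ['MA*', 'MGG*']
```

Note that `find_orfs` scans a single reading frame and returns non-overlapping matches only; a fuller search would translate all three frames of both strands before applying the regex.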

Sequence file extensions:

Code9

  • To read or write to a file:

Code10
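As an illustration of what reading and writing a sequence file involves, here is a minimal sketch that handles FASTA with plain file operations. FASTA is an assumption made for this example, and `read_fasta` and `write_fasta` are hypothetical helper names; in practice a library such as BioPython is the more robust choice.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def write_fasta(path, records, width=60):
    """Write (header, sequence) pairs, wrapping sequences at `width` characters."""
    with open(path, "w") as handle:
        for header, sequence in records:
            handle.write(f">{header}\n")
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")
```

A round trip such as `write_fasta("out.fasta", read_fasta("in.fasta"))` would copy the records while re-wrapping the sequence lines; both file names are placeholders.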

For compressed FASTQ files:

Code11
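A gzip-compressed FASTQ file can be read without decompressing it to disk first. The sketch below uses only the standard `gzip` module and assumes the common four-line record layout (`@id`, sequence, `+`, quality) with no blank lines between records; `read_fastq_gz` is a hypothetical helper name.

```python
import gzip


def read_fastq_gz(path):
    """Yield (read_id, sequence, quality) from a gzip-compressed FASTQ file."""
    # "rt" opens the compressed file in text mode, so lines arrive as str.
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip()
            handle.readline()  # the '+' separator line is skipped
            quality = handle.readline().rstrip()
            yield header[1:], sequence, quality
```

Because the function is a generator, reads are produced one at a time, which keeps memory use low for large sequencing files.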
