Home

Abstract

Biopython is a set of open source python packages and modules for bioinformatics works. In the Bio.Phylo package, there are already implementations for some basic phylogenetics tasks: basic tree operations, parsers for Newick, Nexus and PhyloXML, and wrapers for Phyml, Raxml and PAML. While there are some important components that remain to be implemented to better support phylogenetic workflows. These include simple tree construction algorithms(UPGMA, NJ and MP), consensus tree searching(Stric, Majority-Rule and Adam Consensus), tree comparison and tree visualization. In this project, the first two will be implemented.

Approach & Goals

Implement simple tree inference algorithms of Unweighted Pair Group Method with Arithmetic Mean(UPGMA), Neighbour Joining(NJ) and Maximum Parsimony(MP).
Implement consensus tree search functions of multiple trees, including Strict consensus tree, majority-rule consensus tree and Adams consensus tree.
Implement branch support calculation functions given a target tree and a list of bootstrap replicate trees.
Implement a bootstrap method for a given alignment and provide two interface methods to generate a tree list and construct a consensus tree(given the parameters of treeMethod, consensusMethod and bootstrapTime).

Key Deliverables

Tree construction module: UPGMA, NJ and MP algorithms, branch support method
Consensus tree module: strict, majority-rule and adams consensus tree methods.

Project Timeline

Week1(6.17-6.23): Distance Matrix and Calculation

A DistanceMatrix class with __getitem__, __setitem__, __delitem__, __len__ and insert(name, distances) methods;
A DistanceCalculator class to calculate and return a DistanceMatrix object from a dna or protein 'MSA' object(if time permmited);

Week2(6.24-6.30): UPGMA & NJ

Write a method to calculate the clade height -- the longest path to one of the terminals
Implement the UPGMA algorithm by porting the Java code from BlastGraph;
Implement NJ algorithm by porting the Java code;

Week3(7.1-7.7): Parsimony Score Method for MP

Get more clear understanding of the parsimony methods for both DNA and protein sequences.
Design the parsimony score method and write document and tests for it;
Implement method to calculate the parsimony score for a given tree and an alignment;

Week4(7.8-7.14): Bioinformatics Course

Work as TA for the bioinformatics course hosted by WHIOV and NEScent.

Week5(7.15-7.21): Tree Searching Method for MP

Design the parsimony tree searching method and write document and tests for it.
Implement the Nearest-neighbour interchanges algorithm to search for a tree minimizing the score. A compatible tree manipulation method is needed to interchange the tree branches.

Week6(7.22-7.28): Binary Array Class for Consensus Tree Search

To be efficient in consensus tree search, design a binary array class with binary like operations to store and count clades, and write document and tests for it.
Implement the binary array manipulation class using a normal way for each methods at the beginning and improve the performance later(with the same API).

Week7(7.29-8.4): Cleanup and Mid-term Evaluations

Cleanup existing code, improve tests and document;
Write and submit mid-term evaluations.

Week8(8.5-8.11): Strict and Majority-rule Consensus Tree

Design the strict and majority-rule consensus tree methods and write document and tests for them;
Implement a method for counting the presence time of each clade given a list of trees. It will be used by both strict consensus and majority-rule consensus methods;
Implement the strict consensus tree method and majority-rule consensus tree method by porting my Java code into python.

Week9(8.12-8.18): Adams Consensus Tree

Design the adams consensus tree method and write document and tests for it;
Get familiar with the adams consensus tree algorithm and implement it.

Week10(8.19-8.25): Branch Support

Design the branch support calculation method and write document and tests for it;
Implement the branch support calculation method given a tree and a list of trees;

Week11(8.26-9.1): Bootstrap Method

Design the bootstrap and some interface methods, and write document and tests for these methods;
Implement the bootstrap method;
Write a interface method to generate a bootstrapped tree list providing the parameter of tree method(UPGMA,NJ,MP) and bootstrap time;
Write another one for consensus tree given the tree method, consensus method and bootstrap time;

Provide feedback

Saved searches

Use saved searches to filter your results more quickly