Home
yeyanbo edited this page Jun 13, 2013
·
11 revisions
Biopython is a set of open-source Python packages and modules for bioinformatics work. The Bio.Phylo package already implements some basic phylogenetics tasks: basic tree operations, parsers for Newick, Nexus and PhyloXML, and wrappers for PhyML, RAxML and PAML. However, some important components remain to be implemented to better support phylogenetic workflows. These include simple tree construction algorithms, consensus tree searching, and tree visualization. This project will implement the first two.
- Implement simple tree inference algorithms: Unweighted Pair Group Method with Arithmetic Mean (UPGMA), Neighbour Joining (NJ) and Maximum Parsimony (MP).
- Implement consensus tree search functions for multiple trees, including the strict consensus tree, the majority-rule consensus tree and the Adams consensus tree.
- Implement branch support calculation functions given a target tree and a list of bootstrap replicate trees.
- Implement a bootstrap method for a given alignment and provide two interface methods: one to generate a tree list and one to construct a consensus tree (given the parameters treeMethod, consensusMethod and bootstrapTime).
- Design the distance matrix calculation method and write documentation and tests for it.
- Implement the distance matrix calculation method for the multiple sequence alignment object.
- Design the UPGMA method and write documentation and tests for it.
- Implement the UPGMA method by porting my Java code.
- Design the NJ method and write documentation and tests for it.
- Implement the NJ method by porting my Java code.
- Design the parsimony score method and write documentation and tests for it.
- Implement a method to calculate the parsimony score for a given tree and alignment.
- Design the parsimony tree searching method and write documentation and tests for it.
- Implement the nearest-neighbour interchange (NNI) algorithm to search for a tree that minimizes the parsimony score. This requires a compatible tree manipulation method for interchanging tree branches.
- To make the consensus tree search efficient, design a binary array class with bitwise-style operations for storing and counting clades, and write documentation and tests for it.
- Implement the binary array manipulation class with a straightforward version of each method at first, and improve the performance later (keeping the same API).
- Clean up existing code and improve tests and documentation.
- Write and submit mid-term evaluations.
- Design the strict and majority-rule consensus tree methods and write documentation and tests for them.
- Implement a method that counts how many times each clade occurs in a given list of trees. Both the strict consensus and majority-rule consensus methods will use it.
- Implement the strict consensus tree method and the majority-rule consensus tree method by porting my Java code into Python.
- Design the Adams consensus tree method and write documentation and tests for it.
- Get familiar with the Adams consensus tree algorithm and implement it.
- Design the branch support calculation method and write documentation and tests for it.
- Implement the branch support calculation method given a target tree and a list of trees.
- Design the bootstrap method and its interface methods, and write documentation and tests for them.
- Implement the bootstrap method.
- Write an interface method that generates a list of bootstrapped trees given the tree method (UPGMA, NJ or MP) and the number of bootstrap replicates.
- Write another interface method that builds a consensus tree given the tree method, the consensus method and the number of bootstrap replicates.
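
As a rough illustration of the bootstrap tasks above, the following Python sketch resamples alignment columns with replacement and passes each replicate to a tree-building callable. It is only a sketch under stated assumptions: the names `bootstrap_alignments` and `bootstrap_trees`, the list-of-strings alignment, and the `tree_constructor` parameter are hypothetical and are not part of Bio.Phylo or of the planned API.

```python
import random


def bootstrap_alignments(alignment, times, seed=None):
    """Yield `times` bootstrap replicates of `alignment`.

    `alignment` is assumed to be a list of equal-length sequence strings,
    a stand-in for a real multiple sequence alignment object.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    rng = random.Random(seed)
    for _ in range(times):
        # Sample column indices with replacement. The same indices are
        # applied to every sequence, so each column stays intact.
        columns = [rng.randrange(length) for _ in range(length)]
        yield ["".join(seq[i] for i in columns) for seq in alignment]


def bootstrap_trees(alignment, times, tree_constructor, seed=None):
    """Build one tree per replicate with a caller-supplied callable.

    `tree_constructor` is a hypothetical placeholder for a UPGMA, NJ or
    MP tree builder that accepts an alignment and returns a tree.
    """
    return [tree_constructor(replicate)
            for replicate in bootstrap_alignments(alignment, times, seed)]


if __name__ == "__main__":
    toy_alignment = ["ACGTAC", "ACGTTC", "TCGAAC"]
    for replicate in bootstrap_alignments(toy_alignment, times=3, seed=0):
        print(replicate)
```

Passing the tree builder as a callable keeps the resampling step independent of any particular tree method, so the same replicates could feed a consensus step afterwards.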