Update project.md
nekrut authored Apr 23, 2024
1 parent dca45f8 commit 481c2c3
Showing 1 changed file with 37 additions and 1 deletion.
38 changes: 37 additions & 1 deletion 2024/project.md
@@ -14,7 +14,7 @@ I have subdivided this class into the following groups:

These numbers correspond to colony labels in Fig S3 of the [supplement](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5534434/bin/NIHMS874162-supplement-Supplemental_Methods_And_Figures.pdf).

## Task 1
## Task 1: Prep
(due March 26, 2024 in class)

1. Go to NCBI SRA
@@ -34,3 +34,39 @@ Here:
**Start point** - point with the number that was assigned to your group (e.g., Point 4 for Manifold4)

Now all numbers were sequenced. So your objective is to find an adaptation trajectory that has data in SRA.

## Task 2: Analysis
(due May 1st, 2024 by email)

### Assumptions

1. You have a final variant dataset for your samples (see below for an example).
2. You have created a mapping between your samples and accession numbers as was [described here](https://github.com/nekrut/BMMB554/blob/master/2024/assessimg_variants.md#establish-the-relationship-between-samples-and-accessions) and downloaded it as a .csv file named `names.csv`.

Example of variant dataset:

```
Sample CHROM POS REF ALT AF DP DP4 EFF[*].GENE EFF[*].CODON EFF[*].FUNCLASS
SRR3722117 CP009273 360103 C T 0.075949 79 30,43,0,6 . . NONE
SRR3722117 CP009273 870516 G T 0.157895 19 10,6,2,1 yliE ctG/ctT SILENT
SRR3722117 CP009273 1330682 G T 0.363636 11 4,3,2,2 acnA Ggt/Tgt MISSENSE
SRR3722117 CP009273 1631797 C A 0.25 16 4,8,2,2 ydfJ . NONE
```
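
Under the two assumptions above, loading the data can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes the variant table is tab-delimited with the header shown in the example, and that `names.csv` has two columns, hypothetically called `accession` and `sample`. The label `example_label` is made up.

```python
import csv
import io

# Assumption: tab-delimited variant table with the header from the example.
VARIANTS_TSV = (
    "Sample\tCHROM\tPOS\tREF\tALT\tAF\tDP\tDP4\t"
    "EFF[*].GENE\tEFF[*].CODON\tEFF[*].FUNCLASS\n"
    "SRR3722117\tCP009273\t360103\tC\tT\t0.075949\t79\t30,43,0,6\t.\t.\tNONE\n"
    "SRR3722117\tCP009273\t1330682\tG\tT\t0.363636\t11\t4,3,2,2\tacnA\tGgt/Tgt\tMISSENSE\n"
)

# Assumption: hypothetical column names and a made-up sample label.
NAMES_CSV = "accession,sample\nSRR3722117,example_label\n"


def read_variants(handle):
    """Parse a tab-delimited variant table into dicts with typed fields."""
    variants = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["POS"] = int(row["POS"])
        row["AF"] = float(row["AF"])
        row["DP"] = int(row["DP"])
        variants.append(row)
    return variants


def read_names(handle):
    """Map accession -> sample label (column names are hypothetical)."""
    return {row["accession"]: row["sample"] for row in csv.DictReader(handle)}


variants = read_variants(io.StringIO(VARIANTS_TSV))
names = read_names(io.StringIO(NAMES_CSV))

# Attach a label to every variant; fall back to the accession if unmapped.
for v in variants:
    v["label"] = names.get(v["Sample"], v["Sample"])
```

With real files, replace the `io.StringIO(...)` objects with `open("variants.tsv")` and `open("names.csv")`, adjusting the file name, delimiter, and column names to match your data.
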
### What to do

1. Create a copy of this notebook -> https://colab.research.google.com/drive/1hnoNGQx7MEORWv7KcQAMzpFIsRxvdYVM?usp=sharing
2. Upload your `names.csv` file to the notebook disk
3. Run your samples through the notebook

#### Group 1 (Sym)

1. Identify which FIXED mutations are shared within clusters.
2. Plot a comparison of these mutations across clusters
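
One possible way to approach step 1 is sketched below. The assignment does not define "fixed" numerically, so the threshold `FIXED_AF = 0.9` is an assumption, as are the cluster names, sample names, and variant records, all of which are made up for illustration. A mutation counts as shared within a cluster here only if it is fixed in every sample of that cluster.

```python
from collections import defaultdict

FIXED_AF = 0.9  # Assumption: hypothetical cutoff for calling a mutation fixed.


def fixed_by_sample(variants, threshold=FIXED_AF):
    """Collect the set of fixed mutations for each sample."""
    fixed = defaultdict(set)
    for v in variants:
        if v["AF"] >= threshold:
            fixed[v["Sample"]].add((v["CHROM"], v["POS"], v["REF"], v["ALT"]))
    return fixed


def shared_within_clusters(variants, clusters, threshold=FIXED_AF):
    """For each cluster, keep mutations fixed in all of its samples."""
    fixed = fixed_by_sample(variants, threshold)
    shared = {}
    for cluster, samples in clusters.items():
        sets = [fixed.get(s, set()) for s in samples]
        shared[cluster] = set.intersection(*sets) if sets else set()
    return shared


def var(sample, pos, af):
    """Build a made-up variant record for the demonstration."""
    return {"Sample": sample, "CHROM": "chr", "POS": pos,
            "REF": "G", "ALT": "T", "AF": af}


# Made-up data: two hypothetical clusters of two samples each.
demo_variants = [
    var("S1", 100, 1.0), var("S1", 200, 0.95), var("S1", 300, 0.2),
    var("S2", 100, 0.97), var("S2", 200, 0.5),
    var("S3", 100, 1.0), var("S3", 400, 0.99),
    var("S4", 400, 0.92),
]
demo_clusters = {"A": ["S1", "S2"], "B": ["S3", "S4"]}

shared = shared_within_clusters(demo_variants, demo_clusters)
for cluster, mutations in shared.items():
    print(cluster, sorted(mutations))
```

The resulting per-cluster sets can feed the plot in step 2, for example as a presence/absence matrix of mutations by cluster.
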

#### Groups 2 - 4 (Manifold)

1. Identify all fixed mutations that are present in the terminal points but are absent at the start
2. Trace their trajectories through the time points from beginning to end
3. Plot the change in frequencies
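
The three steps above can be sketched as follows, under stated assumptions: "fixed" is taken as an allele frequency of at least `FIXED_AF = 0.9` (a hypothetical cutoff), "absent at the start" is taken to mean the mutation is not reported at all in the start sample, and a mutation missing from an intermediate time point is given a frequency of 0.0. The sample names and variant records are made up.

```python
FIXED_AF = 0.9  # Assumption: hypothetical cutoff for calling a mutation fixed.


def trace_new_fixed(variants, trajectory, threshold=FIXED_AF):
    """Return AF series for mutations fixed at the end but absent at the start.

    `trajectory` is the ordered list of sample names from start to end.
    """
    af = {}
    for v in variants:
        key = (v["CHROM"], v["POS"], v["REF"], v["ALT"])
        af[(v["Sample"], key)] = v["AF"]

    start, end = trajectory[0], trajectory[-1]
    at_start = {key for (sample, key) in af if sample == start}
    new_fixed = {
        key
        for (sample, key), freq in af.items()
        if sample == end and freq >= threshold and key not in at_start
    }
    # Unreported time points are treated as frequency 0.0 (an assumption).
    return {
        key: [af.get((sample, key), 0.0) for sample in trajectory]
        for key in sorted(new_fixed)
    }


def var(sample, pos, af):
    """Build a made-up variant record for the demonstration."""
    return {"Sample": sample, "CHROM": "chr", "POS": pos,
            "REF": "G", "ALT": "T", "AF": af}


# Made-up trajectory with three time points.
demo_trajectory = ["T0", "T1", "T2"]
demo_variants = [
    var("T1", 100, 0.4), var("T2", 100, 1.0),                        # new and fixed
    var("T0", 200, 0.1), var("T1", 200, 0.6), var("T2", 200, 0.95),  # present at start
    var("T2", 300, 0.3),                                             # not fixed
]

series = trace_new_fixed(demo_variants, demo_trajectory)
for mutation, freqs in series.items():
    print(mutation, freqs)
```

Each list in `series` holds one frequency per time point, in trajectory order, so it can be passed directly to a plotting library to draw one line per mutation.
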

