Commit

Update alignment.md
nekrut authored Apr 18, 2024
1 parent 2ff8016 commit 5d347df
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions 2024/alignment.md
@@ -80,9 +80,9 @@ where $\delta(x,y) = 0$ if $x = y$ (nucleotides match) and $\delta(x,y) = 1$ if

The take-home message here is that it takes a very long time to compute the edit distance between two sequences that are only **nine** nucleotides long! Why is this happening? Figure 1 below shows a small subset of the situations the algorithm evaluates for two very short strings, $\texttt{TAG}$ and $\texttt{TAC}$:

-![](http://www.bx.psu.edu/~anton/bioinf-images/editDist.png)
+![image](https://github.com/nekrut/BMMB554/assets/4291636/399468e5-cc12-4a84-969e-ce4c1e5186a4)

-**Figure 1** | A fraction of situations evaluated by the naïve algorithm for computing the edit distance. Just like in the case of the change problem discussed in the previous lecture, a lot of time is wasted on computing distances between suffixes that have been seen more than once (shown in red).
+<small>**Figure 1** | A fraction of situations evaluated by the naïve algorithm for computing the edit distance. Just like in the case of the change problem discussed in the previous lecture, a lot of time is wasted on computing distances between suffixes that have been seen more than once (shown in red).</small>

To understand the magnitude of this problem, let's look at a slightly modified version of the previous Python code below. All we do here is keep track of how many times a particular pair of suffixes (in this case $\texttt{AC}$ and $\texttt{AC}$) is seen by the program. The number is staggering: 48,639. So this algorithm is **extremely** wasteful.
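
The lecture's own code is not visible in this hunk, so the block below is only a minimal sketch of the idea, not the author's implementation. It assumes a suffix-based naive recursion and uses a hypothetical function name (`edit_distance_naive`) and a hypothetical `seen` counter, applied to the short strings $\texttt{TAG}$ and $\texttt{TAC}$ from Figure 1 so that it finishes instantly:

```python
from collections import Counter


def edit_distance_naive(x, y, seen):
    """Naive recursive edit distance (hypothetical helper, not the lecture's code).

    `seen` is a Counter that records how many times each pair of
    suffixes (x, y) is handed to the function.
    """
    seen[(x, y)] += 1
    if not x:
        return len(y)  # only insertions remain
    if not y:
        return len(x)  # only deletions remain
    delta = 0 if x[0] == y[0] else 1  # 0 on a match, 1 on a mismatch
    return min(
        edit_distance_naive(x[1:], y[1:], seen) + delta,  # match or substitution
        edit_distance_naive(x[1:], y, seen) + 1,          # deletion from x
        edit_distance_naive(x, y[1:], seen) + 1,          # insertion into x
    )


seen = Counter()
distance = edit_distance_naive("TAG", "TAC", seen)
print(distance)            # → 1
print(seen[("AG", "AC")])  # → 3
print(seen[("G", "C")])    # → 13
```

Even for three-letter strings the last pair of suffixes is evaluated 13 times. In this sketch the number of visits depends only on how far into each string the recursion has moved, not on the letters themselves, and it grows quickly with string length; for two nine-nucleotide strings the same counting argument gives 48,639 visits for the pair of two-letter suffixes, consistent with the figure quoted above.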
