Update psa.py docstring and README.md to document the addition of stretcher implementation
aziele committed Oct 30, 2024
1 parent 1722b9e commit de764b1
Showing 3 changed files with 21 additions and 9 deletions.
12 changes: 8 additions & 4 deletions README.md
@@ -1,5 +1,7 @@
# pairwise-sequence-alignment (psa)

![PyPI - Version](https://img.shields.io/pypi/v/pairwise-sequence-alignment)

This is a Python module to calculate a pairwise alignment between biological sequences (protein or nucleic acid). This module uses the [needle](https://www.ebi.ac.uk/Tools/psa/emboss_needle/), [stretcher](https://www.ebi.ac.uk/jdispatcher/psa/emboss_stretcher) and [water](https://www.ebi.ac.uk/Tools/psa/emboss_water/) tools from the EMBOSS package to calculate an optimal, global/local pairwise alignment.

I wrote this module for two reasons. First, the needle and water tools are faster than any Python implementation. Second, Biopython has dropped support for tools from the EMBOSS package and recommends running them via the subprocess module directly.
@@ -28,11 +30,13 @@ I wrote this module for two reasons. First, the needle and water tools are faste


## Introduction
Pairwise sequence alignment is used to identify regions of similarity that may indicate functional, structural and/or evolutionary relationships between two sequences.
Pairwise sequence alignment identifies regions of similarity between two sequences, which can indicate functional, structural, or evolutionary relationships.

1. Global alignment aligns two sequences from start to finish, ideal for sequences that are similar and of comparable length.
* *needle* ([Needleman-Wunsch algorithm](https://en.wikipedia.org/wiki/Needleman–Wunsch_algorithm)) calculates a full-length global alignment by maximizing similarity across both sequences through a dynamic programming approach.
* *stretcher* ([documentation](https://galaxy-iuc.github.io/emboss-5.0-docs/stretcher.html)) performs global alignment using a modified dynamic programming algorithm optimized for linear space efficiency.

1. Global alignment (*needle*; [the Needleman-Wunsch algorithm](https://en.wikipedia.org/wiki/Needleman–Wunsch_algorithm)) aligns two sequences across their entire length, from beginning to end. It is most useful when sequences you are aligning are similar and roughly the same size.
1a. *stretcher*; [explaination](https://galaxy-iuc.github.io/emboss-5.0-docs/stretcher.html) calculates a global alignment of two sequences using a modification of the classic dynamic programming algorithm which uses linear space.
2. Local alignment (*water*; [the Smith-Waterman algorithm](https://en.wikipedia.org/wiki/Smith–Waterman_algorithm)) finds the region with the highest level of similarity between the two sequences. It is suitable for sequences that are not assumed to be similar over the entire length.
2. Local alignment (*water*; [Smith-Waterman algorithm](https://en.wikipedia.org/wiki/Smith–Waterman_algorithm)) finds the region with the highest level of similarity between the two sequences. It is suitable for sequences that are not assumed to be similar over the entire length.
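
The difference between the global and local modes described above can be sketched in plain Python. This is a minimal illustration only, not how the module or EMBOSS computes alignments: the function names `global_score` and `local_score` are hypothetical, and the scoring (match +1, mismatch -1, linear gap -1) is an assumption chosen for readability, whereas the EMBOSS tools use substitution matrices and separate gap open/extend penalties. Both functions keep only two rows of the dynamic programming table, which is enough to obtain the score in linear space; recovering the alignment itself in linear space, as *stretcher* does, needs a more involved divide-and-conquer scheme that is not shown here.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch-style score: both sequences are aligned end to end."""
    # Row 0: aligning a prefix of b against nothing costs one gap per residue.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman-style score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment start anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    query, subject = "AAAA", "TTAAAATT"
    print(global_score(query, subject))  # flanking residues are penalised as gaps
    print(local_score(query, subject))   # only the shared core contributes
```

With these assumed scores, a short query embedded in a longer subject is penalised by the global score for every unmatched flanking residue, while the local score reflects only the matching core.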


## Requirements
6 changes: 3 additions & 3 deletions psa.py
@@ -1,12 +1,12 @@
"""Global and local pairwise alignments between nucleotide/protein sequences.
The module uses needle/water from the EMBOSS package to compute an optimal
global/local alignment between a pair of sequences (query and subject).
The module uses needle/stretcher and water from the EMBOSS package to compute
an optimal global and local alignment between a pair of sequences (query and
subject).
Copyright 2022 Andrzej Zielezinski ([email protected])
https://github.com/aziele/pairwise-sequence-alignment
Adapted to also use stretcher based on needle.
"""

from __future__ import annotations
12 changes: 10 additions & 2 deletions pyproject.toml
@@ -11,11 +11,19 @@ readme = "README.md"
license = {file = "LICENSE"}
classifiers = [
"License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Operating System :: POSIX :: Linux",
"Natural Language :: English",
"Intended Audience :: Developers",
"Intended Audience :: Science/Research",
"Topic :: Scientific/Engineering",
"Topic :: Scientific/Engineering :: Bio-Informatics",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
]
keywords = [
"sequence alignment",