Releases: biotite-dev/biotite

Biotite 0.19.1

24 Dec 11:59
fb0f96a

Changelog

Additions

  • FASTA files can be downloaded from RCSB PDB via database.rcsb
  • Added structure.rotate_about_axis() and structure.align_vectors()
  • Added shape property and copy() method to structure.Atom
  • All array-like objects can be used to set an annotation array in an atom array (stack)
  • Added structure.info.residue() for getting the standard atoms and their coordinates for a given residue name
  • Added structure.graphics.plot_atoms() for interactive molecular visualization
  • Added exclusive_stop parameter to structure.get_residue_starts and structure.get_chain_starts
  • Added connect_via_residue_names() and connect_via_distances() for calculating a structure.BondList for a structure.AtomArray
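
To illustrate what the exclusive_stop parameter listed above adds, here is a minimal pure-Python sketch. It is not Biotite's implementation: the helper name residue_starts and the plain-list input are hypothetical, and a new residue is assumed to begin wherever the residue ID changes.

```python
def residue_starts(res_ids, exclusive_stop=False):
    """Return the indices at which a new residue begins.

    Hypothetical helper: a residue change is detected purely from a
    change in residue ID between consecutive atoms.
    """
    starts = [
        i for i in range(len(res_ids))
        if i == 0 or res_ids[i] != res_ids[i - 1]
    ]
    if exclusive_stop:
        # Append the total length so that consecutive index pairs
        # delimit every residue, including the last one
        starts.append(len(res_ids))
    return starts


res_ids = [1, 1, 1, 2, 2, 3]
bounds = residue_starts(res_ids, exclusive_stop=True)
# Each (start, stop) pair slices out the atoms of one residue
residues = [res_ids[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
```

With the exclusive stop appended, the last residue can be sliced in the same way as all others, without special-casing the end of the array.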

Changes

  • structure.rotate() does not rotate the box of an atom array (stack) anymore
  • structure.BondList equality is not order dependent
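
The order-independent equality noted above can be pictured with the following sketch. It is an assumption-laden illustration, not the structure.BondList code: bonds are modelled as hypothetical (atom_i, atom_j, bond_type) tuples, and a bond is assumed to be the same regardless of which of its two atoms is listed first.

```python
def normalize(bonds):
    """Return an order-independent representation of a list of bonds.

    Each bond is a hypothetical (atom_i, atom_j, bond_type) tuple.
    """
    # Sort the two atom indices within each bond (assumption: a bond
    # has no direction), then collect the bonds in a set
    return {(min(i, j), max(i, j), bond_type) for i, j, bond_type in bonds}


def bonds_equal(bonds_a, bonds_b):
    # Sets ignore the order in which the bonds were added
    return normalize(bonds_a) == normalize(bonds_b)


a = [(0, 1, 1), (1, 2, 2)]
b = [(2, 1, 2), (0, 1, 1)]
same = bonds_equal(a, b)
```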

Fixes

  • structure.BondList accepts all dtypes for integer arrays
  • structure.BondList accepts negative integers as indices
  • sequence.io.fasta.FastaFile: Tests for invalid or empty files
  • structure.io.pdb.PDBFile: Exception is raised if an invalid field in extra_fields is given
  • structure.rotate(): Fixed rotation direction

Biotite 0.18.0

20 Nov 15:12

Changelog

Additions

  • Added shape property to structure.AtomArray() and
    structure.AtomArrayStack()
  • structure.Atom() has default values for annotation arrays
  • The functions structure.rmsf(), structure.rmsd() and structure.average()
    accept coordinates directly
  • Added use_author_fields parameter to structure.io.pdbx.get_structure(),
    which selects between the label_xxx and auth_xxx fields
  • Added chunk_size parameter to read() method of trajectory files to
    resolve memory issues
  • Added density() function for calculating atom densities.
  • Added sequence.align.get_pairwise_sequence_identity()
  • API reference shows source files of Cython modules
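
As a rough picture of what a pairwise sequence identity matrix contains, the sketch below computes one for gapped strings. It does not reproduce sequence.align.get_pairwise_sequence_identity(): the function names are hypothetical, and the denominator (columns where neither sequence has a gap) is only one of several possible conventions, chosen here as an assumption.

```python
from itertools import combinations


def identity(seq_a, seq_b, gap="-"):
    """Fraction of identical symbols among columns without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("Aligned sequences must have equal length")
    # Keep only columns where both sequences have a symbol
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def pairwise_identity(rows):
    """Symmetric matrix of identities between all rows of an alignment."""
    n = len(rows)
    matrix = [[1.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = identity(rows[i], rows[j])
    return matrix


rows = ["ACGT-A", "ACCTTA", "A-GTTA"]
matrix = pairwise_identity(rows)
```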

Changes

  • The module name (__module__ attribute) of functions/classes is
    changed to the name of the respective Biotite subpackage
    (e.g. biotite.structure.atoms to biotite.structure)
  • Changed handling of PDB insertion codes:
    • Atoms with insertion codes are not filtered out
    • Removed insertion_code parameter in
      biotite.structure.io.xxx.get_structure()
    • New mandatory annotation category ins_code
    • Changed structure.filter_inscode_and_altloc() to
      structure.filter_altloc()

Fixes

  • The step parameter in the read() method of trajectory files does not
    increase the stop frame
  • Negative residue IDs are handled correctly by structure file readers/writers
  • Fixed issues with indexing behavior in sequence.align.Alignment class
  • structure.remove_pbc() raises a proper error message when the box is
    missing in the given atom array (stack)
  • sequence.align.align_multiple() raises a proper error message if the
    pairwise distance cannot be calculated due to high sequence dissimilarity
  • In sequence.io.genbank.get_annotation() qualifier keys without values
    (e.g. /pseudo) are handled properly
  • Added pyproject.toml specifying build dependencies for setup.py

Biotite 0.17.0

20 Sep 12:59
aa1b1f9

Changelog

Additions

  • Support for hybrid-36 encoding in structure.io.pdb.PDBFile
  • Added get_coord() method in structure.io.pdb.PDBFile for efficiently reading only the coordinates from a file
  • structure.CellList can be configured to put only a subset of atoms into the cells via the selection parameter
  • Improved functionalities in database subpackage.
    • A lot of new query types in database.rcsb
    • The min and max parameter of some database.rcsb queries are now optional
    • database.rcsb.fetch() and database.entrez.fetch() are able to write the downloaded files into a file-like object instead of writing them to the hard drive
    • database.entrez.fetch() properly checks for invalid responses from server based on https://github.com/kblin/ncbi-entrez-error-messages
    • database.entrez.fetch() also supports common database names
    • database.entrez.SimpleQuery also supports abbreviated field names
  • structure.io.load_structure() and structure.io.save_structure() support keyword arguments that are forwarded to the respective read() or get_structure() method.
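
For readers unfamiliar with the hybrid-36 encoding mentioned in the additions above, here is a minimal sketch of the scheme: values that fit into the field width are written as decimals, the next block uses upper-case base 36, and the block after that uses lower-case base 36. The function names are hypothetical and this is not the code used by structure.io.pdb.PDBFile; negative values are left out for brevity.

```python
import string

DIGITS_UPPER = string.digits + string.ascii_uppercase
DIGITS_LOWER = string.digits + string.ascii_lowercase


def _to_base36(value, digits):
    """Convert a non-negative integer into base 36 with the given digits."""
    out = ""
    while value:
        value, remainder = divmod(value, 36)
        out = digits[remainder] + out
    return out or "0"


def hy36_encode(width, value):
    """Encode a non-negative integer into a field of the given width."""
    if value < 0:
        raise ValueError("Negative values are not handled in this sketch")
    if value < 10 ** width:
        # Plain decimal, right-justified as in fixed-width columns
        return str(value).rjust(width)
    value -= 10 ** width
    block = 26 * 36 ** (width - 1)   # number of values per letter block
    offset = 10 * 36 ** (width - 1)  # forces the leading digit to be a letter
    if value < block:
        return _to_base36(value + offset, DIGITS_UPPER)
    value -= block
    if value < block:
        return _to_base36(value + offset, DIGITS_LOWER)
    raise ValueError("Value is out of range for this width")


def hy36_decode(width, text):
    """Decode a hybrid-36 field of the given width into an integer."""
    stripped = text.strip()
    if not stripped:
        raise ValueError("Empty field")
    first = stripped[0]
    if first.isdigit() or first == "-":
        return int(stripped)
    number = int(stripped, 36)  # int() accepts both letter cases
    if first.isupper():
        return number - 10 * 36 ** (width - 1) + 10 ** width
    return number + 16 * 36 ** (width - 1) + 10 ** width


encoded = hy36_encode(5, 100000)
decoded = hy36_decode(5, encoded)
```

In a PDB file this allows the five-column atom ID field to go beyond 99999 without changing the column layout.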

Changes

  • database subpackage raises database.RequestError objects when the server gives an invalid response

Fixes

  • Fixed cross references in the API reference
  • sequence.io.genbank.GenBankFile raises a warning instead of an exception if the feature's location identifier is not understood and skips the feature
  • structure.io.pdb.PDBFile properly checks whether all models have the same number of atoms when building a structure.AtomArrayStack

Biotite 0.16.0

16 Aug 13:31
bed7b6d

Changelog

Additions

  • New alignment color schemes
    • Color schemes for protein sequence alignments created with Gecos software
      • Including a color scheme adapted for red-green blindness
    • Color scheme for protein block sequence alignments created with Gecos software
    • Color schemes for protein sequence alignments adapted from JalView
  • More functionalities for external MSA software (application.MSAApp subclasses)
    • Additional CLI options can be set via add_additional_options()
    • The executed command of application.LocalApp can be obtained via get_command()
    • Most MSA software interfaces allow setting and getting the distance matrix and the guide tree
      • The corresponding methods are get_guide_tree(), set_guide_tree(), get_distance_matrix() and set_distance_matrix()
    • MSA software supporting custom substitution matrices can be used to align almost any type of sequence, even if the type is not directly supported by the underlying software
  • Added equality operator for sequence.align.Alignment objects
  • sequence.phylo.Tree supports non-binary trees
    • sequence.phylo.TreeNode can handle more than two child nodes
    • len() gives the number of leaves in sequence.phylo.Tree
    • sequence.phylo.Tree and sequence.phylo.TreeNode support hash and equality operator
    • sequence.phylo.as_binary() function converts non-binary tree into binary tree, as required for guide trees
  • Added sequence.phylo.neighbor_joining() for hierarchical clustering

Changes

  • Removed X as symbol for ambiguous nucleotides, use N instead
  • Removed protected method get_default_bin_path() from application.MSAApp
  • Renamed protected method set_options() to set_arguments() in application.LocalApp
  • Renamed set_matrix() to set_substitution_matrix() in application.muscle.MuscleApp
  • Removed protected method get_cli_arguments() in application.LocalApp
  • Adapted constructor of sequence.phylo.TreeNode for a variable number of child nodes
  • application.MSAApp subclasses must implement abstract static methods describing which sequence types they support and whether they support custom substitution matrices

Fixes

  • U is automatically converted to T when loading nucleotide sequences from FASTA files
  • Score matrix in sequence.align.SubstitutionMatrix is now truly read-only via ndarray flag
  • application.Application subclasses (all external software interfaces) now properly check whether the corresponding objects are in the correct application.AppState
  • Error in evaluation step of application.Application now leaves application in application.AppState.CANCELLED state
  • Fixed InvalidFileError not being exposed to user
  • Symmetry checks in sequence.phylo.upgma() allow for small rounding errors
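
The tolerance in the symmetry check mentioned above matters because distances computed in floating-point arithmetic are rarely bit-identical. The following sketch shows the idea with the standard library only; the function name and the tolerance values are assumptions and not taken from sequence.phylo.upgma().

```python
import math


def is_symmetric(matrix, rel_tol=1e-9, abs_tol=1e-12):
    """Check whether a square matrix is symmetric within a tolerance."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    # Compare each element above the diagonal with its mirror element
    return all(
        math.isclose(matrix[i][j], matrix[j][i],
                     rel_tol=rel_tol, abs_tol=abs_tol)
        for i in range(n)
        for j in range(i + 1, n)
    )


# 0.1 + 0.2 differs from 0.3 by a rounding error
distances = [
    [0.0, 0.1 + 0.2],
    [0.3, 0.0],
]
exactly_equal = distances[0][1] == distances[1][0]
symmetric = is_symmetric(distances)
```

An exact comparison would reject this matrix, although it is symmetric for all practical purposes.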

Biotite 0.15.1

22 Jul 09:31
eb1bb5e

Changelog

Additions

  • Increased performance of sequence.NucleotideSequence.translate() method
    • Added map_codon_codes() method to sequence.CodonTable for efficient codon to amino acid mapping
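
One way to see why mapping codon codes can be fast: if each nucleotide is stored as an integer code, a codon becomes a single index into a 64-element lookup table. The sketch below illustrates this idea with the standard genetic code. It is not the implementation of sequence.CodonTable.map_codon_codes(); the names and the A, C, G, T code order are assumptions, and start/stop codon handling is left out.

```python
from itertools import product

ALPHABET = "ACGT"
CODE = {symbol: code for code, symbol in enumerate(ALPHABET)}

# Standard genetic code, listed in the conventional T, C, A, G order
AMINO_TCAG = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Lookup table: the index is 16 * first + 4 * second + third code
TABLE = [None] * 64
for amino_acid, codon in zip(AMINO_TCAG, product("TCAG", repeat=3)):
    first, second, third = (CODE[symbol] for symbol in codon)
    TABLE[16 * first + 4 * second + third] = amino_acid


def translate(sequence):
    """Translate the complete first reading frame of a DNA string."""
    codes = [CODE[symbol] for symbol in sequence]
    usable = len(codes) - len(codes) % 3  # ignore an incomplete last codon
    return "".join(
        TABLE[16 * codes[i] + 4 * codes[i + 1] + codes[i + 2]]
        for i in range(0, usable, 3)
    )


protein = translate("ATGGCCTAA")
```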

Biotite 0.15.0

16 Jul 11:35
8dfa3ed

Changelog

Additions

  • Greatly increased performance when encoding sequences into sequence code and decoding them again
    • sequence.Alphabet.decode_multiple() accepts any array-like object as sequence code
  • Added read/write support for GFF3 files
    • Added sequence.io.gff subpackage
    • Contains sequence.io.gff.GFFFile as low level interface to GFF3 files
    • Contains get_annotation() and set_annotation() functions as high level interface to GFF3 files
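
For orientation, a GFF3 feature line consists of nine tab-separated columns, the last of which holds percent-encoded tag=value attributes. The sketch below parses one such line with the standard library; it only illustrates the kind of entry a low level interface deals with and is not the sequence.io.gff.GFFFile API. The function name and the returned dictionary layout are hypothetical.

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Split one GFF3 feature line into its nine columns."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        raise ValueError("A GFF3 feature line must have 9 columns")
    seqid, source, type_, start, end, score, strand, phase, attrs = columns
    attributes = {}
    if attrs != ".":
        for entry in attrs.split(";"):
            if not entry:
                continue
            key, _, value = entry.partition("=")
            # Reserved characters are percent-encoded in GFF3
            attributes[unquote(key)] = unquote(value)
    return {
        "seqid": seqid,
        "source": source,
        "type": type_,
        "start": int(start),  # 1-based, inclusive
        "end": int(end),      # 1-based, inclusive
        "score": None if score == "." else float(score),
        "strand": strand,
        "phase": None if phase == "." else int(phase),
        "attributes": attributes,
    }


entry = parse_gff3_line(
    "chr1\tsketch\tgene\t100\t200\t.\t+\t.\tID=gene1;Name=my%20gene\n"
)
```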

Biotite 0.14.0

29 May 13:50
0a10ef9

Changelog

Additions

  • Added convenience functions for chains, similar to the functions for residues
    • Added get_chain_starts()
    • Added get_chains()
    • Added get_chain_count()
    • Added chain_iter()
  • Revamped interface for GenBank files
    • sequence.io.genbank.GenBankFile provides a low-level API for obtaining field names and the corresponding lines and subfields
    • sequence.io.genbank.GenBankFile is now used for both GenBank and GenPept files
    • High level objects are obtained via module-level functions
      • get_locus(), get_definition(), get_accession(), get_version(), get_gi(), get_source(), get_db_link(), get_annotation(), get_sequence() and get_annotated_sequence() are now functions in sequence.io.genbank
    • Added set_locus(), set_annotation(), set_sequence() and set_annotated_sequence() to sequence.io.genbank for creating and editing GenBank files

Changes

  • structure.dihedral_backbone() does not require a chain ID anymore
    • The dihedrals are automatically calculated over all chains
    • Dihedrals at the transition of one chain to the next one are NaN
  • Completely changed usage of sequence.io.genbank.GenBankFile (see above)

Fixes

  • Atom IDs above 99999 and residue IDs above 9999 do not break writing structure.io.gro.GROFile
    • In case of overflow, the ID restarts at 1
  • Dummy boxes in .gro files are not converted into a real box attribute of a structure.AtomArray anymore
  • When creating a sequence.Alignment from strings, it is checked whether at least two strings (sequences) are given
  • Fixed annotation equality checks when setting a structure.AtomArrayStack element with a structure.AtomArray
  • Fixed indexing a sequence.AnnotatedSequence with a sequence.Feature containing multiple locations
    • Previously the locations were merged in a random order resulting in wrong sequence.Sequence objects
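
The ID overflow behavior described in the .gro fix above (the ID restarts at 1) can be expressed with a single modulo operation. This is a sketch under the assumption of positive, 1-based IDs; the function and constant names are hypothetical and not part of structure.io.gro.

```python
MAX_ATOM_ID = 99999  # largest atom ID that fits into five columns
MAX_RES_ID = 9999    # largest residue ID named in the fix above


def wrap_id(index, max_id):
    """Map a positive 1-based ID into 1..max_id, restarting at 1."""
    return (index - 1) % max_id + 1


atom_ids = [wrap_id(i, MAX_ATOM_ID) for i in (1, 99999, 100000, 100001)]
res_ids = [wrap_id(i, MAX_RES_ID) for i in (9999, 10000)]
```

Subtracting 1 before the modulo and adding it back afterwards keeps the result in the range 1..max_id, so an ID of 0 is never written.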

Biotite 0.13.1

29 Mar 15:46

Changelog

Fixes

  • structure.io.gro.GROFile appends a newline at the end of file
    • This allows PyMOL to open .gro files
  • structure.io.pdb.PDBFile raises an exception when coordinates contain NaN values
  • structure.io.pdb.PDBFile works properly for residue IDs greater than 9999
  • structure.io.pdb.PDBFile works properly for atom array lengths greater than 199998

Biotite 0.13.0

18 Mar 13:56
521c320

Changelog

Additions

  • structure.hbond() supports periodic boundary conditions
  • Added structure.remove_pbc() and structure.remove_pbc_coord(), that sanitize structures that are segmented due to periodic boundaries
  • Write support for trajectory files
    • This includes structure.io.save_structure()
  • Support for DCD and NetCDF trajectory formats (structure.io.dcd and structure.io.netcdf)

Changes

  • The coord and box attribute in structure.AtomArray and structure.AtomArrayStack are stored as float32 arrays
    • File readers are changed accordingly
    • Previously, both float64 and float32 were allowed, which made type conversions necessary for some functions
    • If an atom array (stack) is provided with a float64 array, the type is implicitly converted.
  • Faster encoding and decoding in sequence.LetterAlphabet
    • Internally uses bytes instead of str for symbols
    • sequence.LetterAlphabet accepts only ASCII characters
  • Discontinued support for Python type annotations
    • May be re-enabled when official type annotations for NumPy arrive
  • Changed protected abstract methods in structure.io.TrajectoryFile
  • structure.filter_solvent() regards only the res_name and ignores the hetero field
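
The connection between the bytes-based symbols and the ASCII restriction in sequence.LetterAlphabet noted above can be pictured as follows: when every symbol is a single byte, encoding reduces to one table lookup per character. The class below is a hypothetical stand-in written for illustration only, not Biotite's code; it uses 255 as a marker for unknown symbols, which assumes fewer than 255 symbols.

```python
class LetterAlphabetSketch:
    """Hypothetical stand-in that maps ASCII letters to integer codes."""

    def __init__(self, symbols):
        # Raises UnicodeEncodeError if a symbol is not ASCII
        self._symbols = symbols.encode("ascii")
        table = bytearray([255] * 256)  # 255 marks an unknown symbol
        for code, byte in enumerate(self._symbols):
            table[byte] = code
        self._table = bytes(table)

    def encode(self, text):
        # bytes.translate() performs one table lookup per byte
        codes = text.encode("ascii").translate(self._table)
        if 255 in codes:
            raise ValueError("Text contains symbols outside the alphabet")
        return list(codes)

    def decode(self, codes):
        return bytes(self._symbols[code] for code in codes).decode("ascii")


alphabet = LetterAlphabetSketch("ACGT")
codes = alphabet.encode("GATTACA")
text = alphabet.decode(codes)
```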

Fixes

  • structure.io.mmtf.MMTFFile fields can be set with ndarray objects for non-encoded fields
    • The ndarray objects are implicitly converted into list objects
  • The box attribute of structure.AtomArrayStack objects is correctly sliced when slicing the atom array stack
  • structure.displacement(), structure.distance(), structure.angle() and structure.dihedral() accept structure.Atom instances as parameter

Biotite 0.12.0

14 Feb 12:59
8f6b0db

Changelog

Additions

  • Added new structure.info subpackage that contains all kinds of basic structure-related data
    • structure.info.mass() function provides the mass of elements, residue names and entire structures
    • structure.info.vdw_radius_single() and structure.info.vdw_radius_protor() functions provide Van-der-Waals radii for single elements or atoms with bonded hydrogens, respectively
    • structure.info.bond_order() and structure.info.bonds_in_residue() provide information about bonded atoms
    • structure.full_name() provides the full name of an up to 3-letter residue/compound name
    • structure.link_type() provides the link type for a residue name
  • Added structure.rdf() function for calculation of the radial distribution function of positions in an AtomArray or AtomArrayStack
  • Added structure.residue_iter() function, that yields each residue in an AtomArray or AtomArrayStack as a subarray (stack)

Changes

  • Removed structure.mass_of_element() and structure.atom_masses()
  • When writing MMTF files, the chemCompType in groupList is determined via structure.link_type()

Fixes

  • PDB files can be written for structures with more than 100,000 atoms
    • When the number of atoms exceeds 99,999, the atom ID starts over again
  • dir() function gives proper results for AtomArray and AtomArrayStack
  • Fixed ProtOr radii in structure.sasa()
  • structure.stack() includes the box attribute when stacking AtomArray objects