Releases: seqan/seqan3

SeqAn 3.0.2

13 Oct 10:39
3.0.2
370673f

Despite all circumstances, we are excited to present a new update of our SeqAn library, with some great new features and a lot of usability improvements. Among other things, this release fully complies with the final C++20 standard.

⚠️ In this release we harmonised the algorithm configurations for a better user experience. This, much like 2020, will break a lot of code. But rest assured that the changes are easy to apply and are worth every bit. 😄

You can find a comprehensive list of the changes in our changelog.

Note that 3.1.0 will be the first API stable release and interfaces in this release might still change.

🎉 Notable new features

  • We added the seqan3::views::minimiser and seqan3::views::minimiser_hash views, which compute the minimum in a sliding window and combine that with hashing, respectively.

  • The seqan3::search_cfg::hit configuration can now be set dynamically.

    Example:
    #include <vector>
    
    #include <seqan3/alphabet/nucleotide/dna4.hpp>
    #include <seqan3/core/debug_stream.hpp>
    #include <seqan3/search/configuration/max_error.hpp>
    #include <seqan3/search/configuration/hit.hpp>
    #include <seqan3/search/fm_index/fm_index.hpp>
    #include <seqan3/search/search.hpp>
    
    int main()
    {
        using seqan3::operator""_dna4;
    
        std::vector<seqan3::dna4_vector> text{"CGCTGTCTGAAGGATGAGTGTCAGCCAGTGTA"_dna4,
                                            "ACCCGATGAGCTACCCAGTAGTCGAACTG"_dna4,
                                            "GGCCAGACAACCCGGCGCTAATGCACTCA"_dna4};
        seqan3::dna4_vector query{"GCT"_dna4};
        seqan3::fm_index index{text};
    
        // Use the dynamic hit configuration to set hit_all_best mode.
        seqan3::configuration search_config = seqan3::search_cfg::max_error_total{seqan3::search_cfg::error_count{1}} |
                                            seqan3::search_cfg::hit{seqan3::search_cfg::hit_all_best{}};
    
        seqan3::debug_stream << "All single best hits:\n";
        for (auto && hit : search(query, index, search_config)) // Find all best hits:
            seqan3::debug_stream << hit << '\n';
    
        // Change the hit configuration to the strata mode with a stratum of 1.
        using seqan3::get;
        get<seqan3::search_cfg::hit>(search_config).hit_variant = seqan3::search_cfg::hit_strata{1};
    
        seqan3::debug_stream << "\nAll x+1 hits:\n";
        for (auto && hit : search(query, index, search_config)) // Find all strata hits.
            seqan3::debug_stream << hit << '\n';
    }
  • The search algorithm now returns a lazy range over the hits found during the search, and its return type no longer depends on the type of FM-index used.

    Example:
    #include <vector>
    
    #include <seqan3/alphabet/nucleotide/dna4.hpp>
    #include <seqan3/core/debug_stream.hpp>
    #include <seqan3/search/configuration/max_error.hpp>
    #include <seqan3/search/configuration/hit.hpp>
    #include <seqan3/search/search.hpp>
    #include <seqan3/search/fm_index/fm_index.hpp>
    
    int main()
    {
        using seqan3::operator""_dna4;
    
        std::vector<seqan3::dna4_vector> text{"CGCTGTCTGAAGGATGAGTGTCAGCCAGTGTA"_dna4,
                                            "ACCCGATGAGCTACCCAGTAGTCGAACTG"_dna4,
                                            "GGCCAGACAACCCGGCGCTAATGCACTCA"_dna4};
        seqan3::dna4_vector query{"GCT"_dna4};
    
        seqan3::configuration const search_config = seqan3::search_cfg::max_error_total{seqan3::search_cfg::error_count{1}} |
                                                    seqan3::search_cfg::hit_all_best{};
    
        // The interface over the found hits is the same regardless of the index's text layout.
        seqan3::debug_stream << "Search in text collection:\n";
        seqan3::fm_index index_collection{text};
        for (auto && hit : search(query, index_collection, search_config)) // Over a text collection.
            seqan3::debug_stream << hit << '\n';
    
        seqan3::debug_stream << "\nSearch in single text:\n";
        seqan3::fm_index index_single{text[0]};
        for (auto && hit : search(query, index_single, search_config)) // Over a single text.
            seqan3::debug_stream << hit << '\n';
    }
  • We added a data structure called interleaved Bloom filter, which can answer set-membership queries efficiently.

  • The pairwise alignment can now be configured with a user-defined callback, which is called for every computed alignment
    result instead of returning a lazy range over the alignment results.

    Example:
    #include <mutex>
    #include <utility>
    #include <vector>
    
    #include <seqan3/alignment/configuration/align_config_edit.hpp>
    #include <seqan3/alignment/configuration/align_config_method.hpp>
    #include <seqan3/alignment/configuration/align_config_on_result.hpp>
    #include <seqan3/alignment/configuration/align_config_parallel.hpp>
    #include <seqan3/alignment/pairwise/align_pairwise.hpp>
    #include <seqan3/alphabet/nucleotide/dna4.hpp>
    #include <seqan3/core/debug_stream.hpp>
    
    int main()
    {
    
        // Generate some sequences.
        using seqan3::operator""_dna4;
        using sequence_pair_t = std::pair<seqan3::dna4_vector, seqan3::dna4_vector>;
        std::vector<sequence_pair_t> sequences{100, {"AGTGCTACG"_dna4, "ACGTGCGACTAG"_dna4}};
    
        std::mutex write_to_debug_stream{}; // Need mutex to synchronise the output.
    
        // Use edit distance with 4 threads.
        auto const alignment_config = seqan3::align_cfg::method_global{} |
                                      seqan3::align_cfg::edit_scheme |
                                      seqan3::align_cfg::parallel{4} |
                                      seqan3::align_cfg::on_result{[&] (auto && result)
                                                    {
                                                        std::lock_guard sync{write_to_debug_stream}; // critical section
                                                        seqan3::debug_stream << result << '\n';
                                                    }};
    
        // Compute the alignments in parallel, and output them unordered using the callback (order is not deterministic).
        seqan3::align_pairwise(sequences, alignment_config);  // seqan3::align_pairwise is now declared void.
    }
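
The windowed minimum behind the minimiser views listed above can be illustrated without SeqAn. The following is a minimal sketch in standard C++, not the library's implementation: the function name window_minima is hypothetical, it reports one minimum per window, and the real view may differ in how it reports repeated minima.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of SeqAn): returns the minimum of every
// window of `window_size` consecutive values, one entry per window.
std::vector<std::uint64_t> window_minima(std::vector<std::uint64_t> const & values,
                                         std::size_t const window_size)
{
    assert(window_size > 0);

    std::vector<std::uint64_t> minima;
    if (values.size() < window_size)
        return minima; // No complete window fits into the input.

    for (std::size_t begin = 0; begin + window_size <= values.size(); ++begin)
    {
        auto const first = values.begin() + static_cast<std::ptrdiff_t>(begin);
        auto const last = first + static_cast<std::ptrdiff_t>(window_size);
        minima.push_back(*std::min_element(first, last));
    }

    return minima;
}
```

Feeding this function k-mer hash values instead of raw values gives the general idea of combining hashing with a windowed minimum. This naive version rescans every window; a production implementation would typically update the minimum incrementally.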

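The interleaved Bloom filter mentioned in the feature list can be pictured as several Bloom filters (bins) whose bits are stored side by side, so that one query yields an answer for every bin. The sketch below uses only standard C++ and makes several assumptions for illustration: the class name, the member names, the hash mixing, and the memory layout are all hypothetical and are not the SeqAn API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration (not the SeqAn data structure).
class interleaved_bloom_filter_sketch
{
public:
    interleaved_bloom_filter_sketch(std::size_t bins, std::size_t bits_per_bin, std::size_t hash_functions) :
        bins_{bins}, bits_per_bin_{bits_per_bin}, hash_functions_{hash_functions}, data_(bins * bits_per_bin, false)
    {
        assert(bins > 0 && bits_per_bin > 0 && hash_functions > 0);
    }

    // Records `value` as a member of the given bin.
    void insert(std::uint64_t value, std::size_t bin)
    {
        assert(bin < bins_);
        for (std::size_t i = 0; i < hash_functions_; ++i)
            data_[row(value, i) * bins_ + bin] = true;
    }

    // Returns one flag per bin: true means "possibly contained",
    // false means "definitely not contained".
    std::vector<bool> membership(std::uint64_t value) const
    {
        std::vector<bool> result(bins_, true);
        for (std::size_t i = 0; i < hash_functions_; ++i)
        {
            // All bins share the same row, so their bits are adjacent in memory.
            std::size_t const offset = row(value, i) * bins_;
            for (std::size_t bin = 0; bin < bins_; ++bin)
                result[bin] = result[bin] && data_[offset + bin];
        }
        return result;
    }

private:
    // Assumed hash family: a simple multiplicative mix, varied by the index `i`.
    std::size_t row(std::uint64_t value, std::size_t i) const
    {
        std::uint64_t h = (value + i) * 0x9E3779B97F4A7C15ULL;
        h ^= h >> 32;
        return static_cast<std::size_t>(h % bits_per_bin_);
    }

    std::size_t bins_;
    std::size_t bits_per_bin_;
    std::size_t hash_functions_;
    std::vector<bool> data_;
};
```

As with any Bloom filter, a bin can report a false positive but never a false negative. Storing the bits of all bins for one hash position next to each other is what lets a single lookup answer the query for every bin.
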
:trollface: Notable API changes

  • The alignment and search configurations have been refactored and improved.
  • Some type traits and concepts have been added to the seqan3/std module to comply with the C++20 standard.

🐛 Notable bug fixes

  • FM-index-based search now produces the correct results when using quality sequences.
  • The parallel search was fixed. So no time for ☕ here, sorry.
  • Fixed an issue with spawning too many threads in parallel pairwise alignment.
  • Various fixes to make our views and ranges comply with the C++20 standard.

🔌 External dependencies

  • SeqAn 3.0.2 is known to compile with GCC 7.5, 8.4, 9.3 and 10.2. Future versions (e.g. GCC 10.3 and 11) might work,
    but were not yet available at the time of this release.
  • We now support range-v3 versions >= 0.11.0 and < 0.12.0, increasing the previous requirement of >= 0.10.0 and < 0.11.0.

Note: We changed the naming scheme of our source package from seqan-[VERSION]-with-submodules.tar.gz to seqan3-[VERSION]-Source.tar.xz. Please use the new package seqan3-[VERSION]-Source.tar.xz.

SeqAn 3.0.1

22 Jan 14:34
a88bfb6

We are excited to present a new update of our SeqAn library. This release has been in the making for roughly half a year now and we are proud to present some great new features and also a lot of improvements with respect to runtime and usability. You can find a comprehensive list of the changes in our changelog.

Note that 3.1.0 will be the first API stable release and interfaces in this release might still change.

🎉 Notable new features

  • We added support for type-erased semialphabets, which allows you to manage semialphabets with the same alphabet size in one container. This can have a big effect on your compile time, in case you don't drink as much ☕ as we do.

  • We added parallel support for the alignment algorithm. You can now configure the number of threads you want to use for the alignment computation.

    Example:
    #include <iostream>
    #include <tuple>
    
    #include <seqan3/alphabet/nucleotide/dna4.hpp>
    #include <seqan3/alignment/pairwise/all.hpp>
    
    int main()
    {
        using seqan3::operator""_dna4;
    
        auto sequence1{"ACCA"_dna4};
        auto sequence2{"ATTA"_dna4};
    
        seqan3::configuration alignment_config = seqan3::align_cfg::edit |
                                                 seqan3::align_cfg::parallel{4};
    
        for (auto const & res : seqan3::align_pairwise(std::tie(sequence1, sequence2),
                                                       alignment_config))
            std::cout << "Score: " << res.score() << '\n';
    }
  • One to command them all: Our argument parser now supports subcommands, such as git pull. A How-to will guide you through setting this up for your tool.

  • The performance of the I/O was improved to allow faster file reading. Further, we added support for reading and writing the CIGAR string through alignment files.

  • We added several new ranges and views. Most notably, the seqan3::views::kmer_hash view efficiently transforms a sequence into a range of k-mer hashes. Another useful addition is seqan3::views::to, which converts a view into a container. We also added seqan3::dynamic_bitset, which is a dynamic version of std::bitset.

  • Memory consumption of the (bidirectional) FM-Index for text collections was reduced by 10%.
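
To illustrate what a range of k-mer hashes looks like, here is a minimal sketch in standard C++. It is not the seqan3::views::kmer_hash implementation: the helper names rank_of and kmer_hashes are hypothetical, the input is assumed to contain only the characters A, C, G and T, and each k-mer is encoded as a base-4 number.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical helper: maps A, C, G, T to the ranks 0, 1, 2, 3.
inline std::uint64_t rank_of(char const base)
{
    switch (base)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  assert(base == 'T'); return 3;
    }
}

// Hypothetical helper: returns one hash per k-mer, in order of occurrence.
// Each k-mer is read as a base-4 number, so 2 bits are used per character.
std::vector<std::uint64_t> kmer_hashes(std::string_view const sequence, std::size_t const k)
{
    assert(k > 0 && k <= 32); // 32 characters * 2 bits fit into 64 bits.

    std::vector<std::uint64_t> hashes;
    if (sequence.size() < k)
        return hashes;

    // Keeps only the lowest 2 * k bits, i.e. the most recent k characters.
    std::uint64_t const mask = (k == 32) ? ~std::uint64_t{0} : ((std::uint64_t{1} << (2 * k)) - 1);

    std::uint64_t hash = 0;
    for (std::size_t i = 0; i < sequence.size(); ++i)
    {
        // Rolling update: shift in the new character, drop the oldest one.
        hash = ((hash << 2) | rank_of(sequence[i])) & mask;
        if (i + 1 >= k)
            hashes.push_back(hash);
    }

    return hashes;
}
```

The rolling update is what makes this efficient: each new hash is derived from the previous one in constant time instead of re-reading all k characters.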

:trollface: Notable API changes

As much as we'd like to reduce inconsistencies between releases, we are sometimes forced to change an interface either to improve usability or to follow changes made by the ISO C++ committee.

  • All our concepts are now named in snake_case style (e.g. seqan3::WritableAlphabet -> seqan3::writable_alphabet).
  • The directory seqan3/range/view has been renamed to seqan3/range/views.
  • The namespace seqan3::view has been renamed to seqan3::views.
  • The CMake variable SEQAN3_VERSION_STRING defined by find_package(SEQAN3) was renamed to SEQAN3_VERSION.

🐛 Notable bug fixes

  • Copying and moving the seqan3::fm_index and seqan3::bi_fm_index now work properly.
  • The translation table for nucleotide to amino acid translation was corrected.
  • The amino acid score matrices were corrected.

🔌 External dependencies

  • We now support range-v3 versions >= 0.10.0 and < 0.11.0, increasing the previous requirement of >= 0.5.0 and < 0.6.0.
  • We now support cereal version 1.3.0, increasing the previous requirement of 1.2.2.

Note: We changed the naming scheme of our source package from seqan-[VERSION]-with-submodules.tar.gz to seqan3-[VERSION]-Source.tar.xz. Please use the new package seqan3-[VERSION]-Source.tar.xz.

SeqAn 3.0.0 "Escala"

06 Jun 18:27
@h-2 h-2
5b59e64

This is the initial release of SeqAn3. It is an entirely new library so there is no changelog that covers the differences to SeqAn2.

Please see the release announcement:
https://www.seqan.de/announcing-seqan3/

See the porting guide for some help on porting:
http://docs.seqan.de/seqan/3-master-user/howto_porting.html

Note that 3.1.0 will be the first API stable release and interfaces in this release might still change.


Note: We changed the naming scheme of our source package from seqan-[VERSION]-with-submodules.tar.gz to seqan3-[VERSION]-Source.tar.xz. Please use the new package seqan3-[VERSION]-Source.tar.xz.