chrom_mini_graph

chrom_mini_graph is a tool for generating a chromatic (coloured) minimizer pangenome graph and mapping reads onto it.

Requirements

Rust and associated tools such as cargo are required and are assumed to be in PATH.
CMake version >= 3.12 must be in PATH, because the build depends on cmake's --parallel option.

Install

git clone https://github.com/bluenote-1577/chrom_mini_graph
cd chrom_mini_graph
cargo build --release
./target/release/chrom_mini_graph generate test_refs/*
./target/release/chrom_mini_graph map -a -b test_bam.bam serialized_mini_graph.bin test_reads/hg_01243_pacbio_reads.fastq > output.txt

cargo build --release first builds the chrom_mini_graph binary, which is found in the ./target/release/ directory.
The chrom_mini_graph generate command generates a coloured minimizer pangenome graph.
The chrom_mini_graph map command chains reads onto the generated graph and produces an alignment. The -a option outputs a BAM file with the name specified by the -b option.

Six 1 Mbp reference segments of chromosome 20 are provided in the test_refs folder.
Simulated PacBio CLR reads for hg01243 are available in the test_reads folder.

Using chrom_mini_graph

generate

Run chrom_mini_graph generate ref_1.fasta ref_2.fasta ... -o output_from_generate to create a coloured minimizer pangenome graph for the references ref_1.fasta, ref_2.fasta, etc. The output specified by the -o option is used in the mapping step.

The window size can be changed in the src/bin/chrom_mini_graph.rs file. The default value is 16.
generate outputs a *.bin file to be used for mapping, along with other auxiliary information.
Each fasta file can contain multiple contigs. Each contig is treated as its own reference genome.
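
To make the role of the window size concrete, here is a minimal Python sketch of window minimizer selection. It is an illustration only and not the code in src/bin/chrom_mini_graph.rs: the function name, the k-mer length k, and the use of plain lexicographic order (rather than a hash or canonical k-mers) are all assumptions made for the example.

```python
def window_minimizers(seq, k, w):
    """Return sorted (position, k-mer) pairs selected as window minimizers.

    Illustrative assumption: the smallest k-mer in each window of w
    consecutive k-mers is chosen by lexicographic order, leftmost on ties.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    chosen = set()
    # Slide a window over the k-mer list and keep the position of its minimum.
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (kmers[i], i))
        chosen.add(best)
    return [(i, kmers[i]) for i in sorted(chosen)]
```

A larger w selects fewer k-mers per sequence, so the resulting graph has fewer nodes at the cost of sparser anchoring.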

Ordering for `generate`

The first reference used (i.e. ref_1.fasta) serves as the backbone for the minimizer graph. Make sure that this first reference is the most contiguous contig.

map

map is a proof-of-concept read-to-graph chainer. It chains the minimizers in a read onto the graph without knowledge of colour, and then finds the best colours (reference genomes) for the chain.
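
The second step, choosing colours for an already-built chain, can be pictured with the following Python sketch. It is one plausible scoring scheme under stated assumptions, not necessarily what chrom_mini_graph does: it assumes each chained graph node carries a colour bitmask (bit i set when reference i contains that minimizer), and it ranks references by how many chained nodes they cover. The name rank_colours and the bitmask representation are hypothetical.

```python
def rank_colours(chain_colours, num_refs):
    """Rank references by the number of chained nodes carrying their colour.

    chain_colours: one integer bitmask per node in the chain (assumed layout:
    bit i is set when reference i passes through that node).
    Returns a list of (reference_index, count), best first.
    """
    counts = [0] * num_refs
    for mask in chain_colours:
        for ref in range(num_refs):
            if (mask >> ref) & 1:
                counts[ref] += 1
    # Sort by descending count; break ties by the lower reference index.
    order = sorted(range(num_refs), key=lambda ref: (-counts[ref], ref))
    return [(ref, counts[ref]) for ref in order]
```

For example, a chain whose nodes have masks 0b011, 0b010 and 0b110 over three references is fully covered only by reference 1, which is therefore ranked first.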

Running chrom_mini_graph map -a output_from_generate.bin your_reads.fastq -b bam_name.bam > output.txt writes the BAM file bam_name.bam and redirects stdout to the log file output.txt.

The file best_genome_reads.txt is also written. It lists the top 5 (or fewer) candidate reference genomes for each read. The format is

>read_1
chrom_1 score_1
chrom_2 score_2
...
>read_2
chrom_1 score_1
chrom_2 score_2
...

More than 5 candidates may be output because of secondary alignments, and fewer than 5 may be output if the read is deemed unmappable to certain references.
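
For downstream analysis, the format above can be read with a few lines of Python. This is a minimal sketch under assumptions: it takes the score to be a single numeric field separated from the reference name by whitespace, which the format description suggests but does not state, and the function name parse_best_genome_reads is hypothetical.

```python
def parse_best_genome_reads(lines):
    """Map each read name to its list of (reference, score) candidates.

    Assumes '>' starts a read record and that every other non-empty line
    is 'reference score' with a numeric score as the last field.
    """
    candidates = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            candidates.setdefault(current, [])
        elif current is not None:
            # Split on the last run of whitespace so the score is the final field.
            ref, score = line.rsplit(None, 1)
            candidates[current].append((ref, float(score)))
    return candidates
```

A file can be parsed by passing the open file handle, e.g. parse_best_genome_reads(open("best_genome_reads.txt")). Candidates are kept in file order, so the first entry for a read is the first one listed in the file.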

Caveats:

Only the best candidate genome is aligned to.
No supplementary or secondary alignments are output in the BAM file; MAPQ defaults to 60.
