tipp-help.md

usage: run_tipp.py [-h] [-v] [-A N] [-P N] [-F N] [--distance DISTANCE]
                   [-M DIAMETER] [-S DECOMP] [-p DIR] [-o OUTPUT]
                   [-d OUTPUT_DIR] [-c CONFIG] [-t TREE] [-r RAXML] [-a ALIGN]
                   [-f FRAG] [-m MOLECULE] [-x N] [-cp CHCK_FILE] [-cpi N]
                   [-seed N] [-R N] [-at N] [-D] [-pt N] [-PD N]
                   [-tx TAXONOMY] [-txm MAPPING] [-adt TREE] [-C N]

This script runs the SEPP algorithm on an input tree, alignment, fragment
file, and RAxML info file.

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit

DECOMPOSITION OPTIONS:
  These options determine the alignment decomposition size and taxon
  insertion size. If neither is given, the default is to align/place at
  10% of total taxa. The alignment decomposition size must be less than
  the taxon insertion size.

  -A N, --alignmentSize N
                        max alignment subset size of N [default: 10% of the
                        total number of taxa or the placement subset size if
                        given]
  -P N, --placementSize N
                        max placement subset size of N [default: 10% of the
                        total number of taxa or the alignment length
                        (whichever is bigger)]
  -F N, --fragmentChunkSize N
                        maximum fragment chunk size of N. Helps control
                        memory use. [default: 5000]
  --distance DISTANCE   minimum p-distance before stopping the
                        decomposition [default: 1]
  -M DIAMETER, --diameter DIAMETER
                        maximum tree diameter before stopping the
                        decomposition [default: None]
  -S DECOMP, --decomp_strategy DECOMP
                        decomposition strategy [default: using tree branch
                        length]

OUTPUT OPTIONS:
  These options control output.

  -p DIR, --tempdir DIR
                        Temporary files will be written to DIR. Full path
                        required. [default: /tmp/sepp]
  -o OUTPUT, --output OUTPUT
                        output files with prefix OUTPUT. [default: output]
  -d OUTPUT_DIR, --outdir OUTPUT_DIR
                        output to OUTPUT_DIR directory. Full path required.
                        [default: .]

INPUT OPTIONS:
  These options control input. To run SEPP, the following are required: a
  backbone tree (in newick format), a RAxML_info file (the file generated
  by RAxML during estimation of the backbone tree; pplacer uses this info
  file to set model parameters), a backbone alignment file (in fasta
  format), and a fasta file of fragments. The input sequences are assumed
  to be DNA unless specified otherwise.

  -c CONFIG, --config CONFIG
                        A config file, including options used to run SEPP.
                        Options provided as command line arguments overwrite
                        config file values for those options. [default: None]
  -t TREE, --tree TREE  Input tree file (newick format) [default: None]
  -r RAXML, --raxml RAXML
                        RAxML_info file including model parameters, generated
                        by RAxML. [default: None]
  -a ALIGN, --alignment ALIGN
                        Aligned fasta file [default: None]
  -f FRAG, --fragment FRAG
                        fragment file [default: None]
  -m MOLECULE, --molecule MOLECULE
                        Molecule type of sequences. Can be amino, dna, or rna
                        [default: dna]
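
As a rough illustration of the inputs listed above, the sketch below assembles a run_tipp.py command line in Python. Only the flags are taken from this help text; the helper name `build_tipp_command` and every file name are hypothetical placeholders, and the command is printed rather than executed.

```python
import shlex


def build_tipp_command(tree, raxml_info, alignment, fragments,
                       molecule="dna", output="output", outdir="."):
    """Return an argv list for run_tipp.py (hypothetical helper).

    The flags mirror the INPUT and OUTPUT options in the help text;
    the defaults for -m, -o and -d are the ones the help text states.
    """
    return [
        "run_tipp.py",
        "-t", tree,        # backbone tree, newick format
        "-r", raxml_info,  # RAxML_info file written by RAxML
        "-a", alignment,   # backbone alignment, fasta format
        "-f", fragments,   # fragment sequences, fasta format
        "-m", molecule,    # amino, dna, or rna
        "-o", output,      # output file prefix
        "-d", outdir,      # output directory
    ]


# Placeholder file names, assumed for illustration only.
cmd = build_tipp_command("backbone.tree", "RAxML_info.backbone",
                         "backbone.fasta", "fragments.fasta")
print(shlex.join(cmd))
```

Building the arguments as a list keeps each path a single argument even when it contains spaces; the same list could be passed to `subprocess.run` once the real input files exist.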

OTHER OPTIONS:
  These options control how SEPP is run

  -x N, --cpu N         Use N cpus [default: number of cpus available on the
                        machine]
  -cp CHCK_FILE, --checkpoint CHCK_FILE
                        checkpoint file [default: no checkpointing]
  -cpi N, --interval N  Interval (in seconds) between checkpoint writes. Has
                        effect only with -cp provided. [default: 3600]
  -seed N, --randomseed N
                        random seed number. [default: 297834]

TIPP OPTIONS:
  These options control settings specific to TIPP

  -R N, --reference_pkg N
                        Use a pre-computed reference package [default: None]
  -at N, --alignmentThreshold N
                        Enough alignment subsets are selected to reach a
                        cumulative probability of N. This should be a number
                        between 0 and 1 [default: 0.95]
  -D, --dist            Treat fragments as distribution
  -pt N, --placementThreshold N
                        Enough placements are selected to reach a cumulative
                        probability of N. This should be a number between 0
                        and 1 [default: 0.95]
  -PD N, --push_down N  Whether to classify based on children below or above
                        insertion point. [default: True]
  -tx TAXONOMY, --taxonomy TAXONOMY
                        A file describing the taxonomy. This is a comma-
                        separated text file that has the following fields:
                        taxon_id,parent_id,taxon_name,rank. If there are other
                        columns, they are ignored. The first line is also
                        ignored.
  -txm MAPPING, --taxonomyNameMapping MAPPING
                        A comma-separated text file mapping alignment sequence
                        names to taxonomic ids. Format (each line):
                        sequence_name,taxon_id. If there are other columns,
                        they are ignored. The first line is also ignored.
  -adt TREE, --alignmentDecompositionTree TREE
                        A newick tree file used for decomposing taxa into
                        alignment subsets. [default: the backbone tree]
  -C N, --cutoff N      Placement probability requirement to count toward the
                        distribution. This should be a number between 0 and 1
                        [default: 0.0]
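
The file layouts described for -tx and -txm can be sketched with a small Python reader. This is a minimal sketch based only on the help text (comma-separated, first line ignored, extra columns ignored), not TIPP's own parser; the function names and the sample rows are made up for illustration, and skipping rows with too few columns is an assumption.

```python
import csv
import io


def read_taxonomy(handle):
    """Map taxon_id -> (parent_id, taxon_name, rank).

    The first line is skipped and columns past the fourth are ignored,
    as the -tx help text describes.
    """
    reader = csv.reader(handle)
    next(reader, None)  # first line is ignored
    taxonomy = {}
    for row in reader:
        if len(row) < 4:
            continue  # assumption: skip blank or short rows
        taxon_id, parent_id, name, rank = (field.strip() for field in row[:4])
        taxonomy[taxon_id] = (parent_id, name, rank)
    return taxonomy


def read_mapping(handle):
    """Map sequence_name -> taxon_id, following the -txm description."""
    reader = csv.reader(handle)
    next(reader, None)  # first line is ignored
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


# Made-up sample content, for illustration only.
taxonomy_text = (
    "taxon_id,parent_id,taxon_name,rank,extra\n"
    "1,1,root,root,x\n"
    "2,1,Bacteria,superkingdom,y\n"
)
mapping_text = "sequence_name,taxon_id\nseqA,2\n"

taxonomy = read_taxonomy(io.StringIO(taxonomy_text))
mapping = read_mapping(io.StringIO(mapping_text))
print(taxonomy[mapping["seqA"]])
```

Joining the two dictionaries on taxon_id, as in the last line, gives the taxonomy record for an alignment sequence; real files would be opened with `open(path, newline="")` instead of `io.StringIO`.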