DESCRIPTION

SMALT efficiently aligns DNA sequencing reads with a reference genome. Reads from a wide range of sequencing platforms, for example Illumina, Roche-454, Ion Torrent, PacBio or ABI-Sanger, can be processed, including paired reads.

The software employs a perfect hash index of short words (< 20 nucleotides long), sampled at equidistant steps along the genomic reference sequences.
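As a rough illustration of the indexing idea, the following Python sketch records the positions of fixed-length words sampled at a constant step along a reference string. It is not SMALT's implementation: a plain dictionary stands in for the perfect hash, and the function names and the default values of word_len and step are assumptions chosen for the example.

```python
from collections import defaultdict


def build_kmer_index(reference, word_len=13, step=6):
    """Map each sampled word to the reference offsets where it starts.

    word_len and step are illustrative values, not documented defaults.
    A dict replaces the perfect hash used by the real software.
    """
    index = defaultdict(list)
    # Sample words at equidistant steps along the reference sequence.
    for pos in range(0, len(reference) - word_len + 1, step):
        index[reference[pos:pos + word_len]].append(pos)
    return dict(index)


def seed_hits(index, read, word_len):
    """Yield (read_offset, ref_offset) for every read word found in the index.

    Every word of the read is looked up, because only a subset of the
    reference words was indexed.
    """
    for i in range(len(read) - word_len + 1):
        for pos in index.get(read[i:i + word_len], ()):
            yield i, pos
```

A shorter step indexes more words and finds more seeds at the cost of memory and time; a longer step does the opposite.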

For each read, potentially matching segments in the reference are identified from seed matches in the index and subsequently aligned with the read using a banded Smith-Waterman algorithm.
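The banded alignment step can be sketched as below. This is a minimal example under stated assumptions, not the routine used by SMALT: it uses a linear gap penalty, returns only the best local score, and the scoring values and parameter names (diagonal, band) are hypothetical. The diagonal would typically be taken from a seed match as the reference offset minus the read offset.

```python
def banded_smith_waterman(read, ref, diagonal=0, band=3,
                          match=1, mismatch=-2, gap=-3):
    """Best local alignment score within a band around one diagonal.

    Only cells (i, j) with |(j - i) - diagonal| <= band are filled.
    Scores and the linear gap penalty are illustrative assumptions.
    """
    best = 0
    prev = {}  # scores of the previous row, keyed by column j
    for i in range(1, len(read) + 1):
        cur = {}
        lo = max(1, i + diagonal - band)
        hi = min(len(ref), i + diagonal + band)
        for j in range(lo, hi + 1):
            sub = match if read[i - 1] == ref[j - 1] else mismatch
            # Cells outside the band count as 0, i.e. a fresh local start.
            score = max(0,
                        prev.get(j - 1, 0) + sub,   # match or mismatch
                        prev.get(j, 0) + gap,       # gap in the reference
                        cur.get(j - 1, 0) + gap)    # gap in the read
            cur[j] = score
            best = max(best, score)
        prev = cur
    return best
```

Restricting the computation to a band keeps the cost proportional to the read length times the band width rather than to the full length of the reference segment.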

The best gapped alignment of each read is reported, including a score for the reliability of the best mapping. The user can adjust the trade-off between sensitivity and speed by tuning the length and spacing of the hashed words.

A mode for the detection of split (chimeric) reads is provided. Multi-threaded program execution is supported.

SYNOPSIS

smalt <task> [TASK_OPTIONS] [<index_name> <file_name_A> [<file_name_B>]]

Available tasks:

  • smalt check - checks FASTA/FASTQ input

  • smalt help - prints a brief summary of this software

  • smalt index - builds an index of k-mer words for the reference

  • smalt map - maps single or paired reads onto the reference

  • smalt sample - samples insert sizes for paired reads

  • smalt version - prints version information

Help on individual tasks:

  • smalt <task> -H