DESCRIPTION

The chimera.uchime command reads a FASTA file and a reference file and outputs potentially chimeric sequences. The original uchime program was written by Robert C. Edgar.

SYNOPSIS

uchime --input query.fasta [--db db.fasta] [--uchimeout results.uchime] [--uchimealns results.alns]

OPTIONS

--input filename

  • Query sequences in FASTA format. If the --db option is not specified, uchime uses de novo detection. In de novo mode, relative abundance must be given by a string /ab=xxx/ somewhere in the label, where xxx is a floating-point number, e.g. >F00QGH67HG/ab=1.2/.
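
The /ab=xxx/ label convention can be checked before running in de novo mode. The following is a minimal Python sketch, not part of uchime itself; the helper name parse_abundance and the exact number syntax accepted by the regular expression are assumptions made for illustration.

```python
import re

# Matches the /ab=xxx/ annotation described above, where xxx is a
# floating-point number. The accepted number syntax is an assumption.
AB_PATTERN = re.compile(r"/ab=([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)/")


def parse_abundance(label):
    """Return the abundance encoded in a FASTA label, or None if absent."""
    match = AB_PATTERN.search(label)
    return float(match.group(1)) if match else None


print(parse_abundance(">F00QGH67HG/ab=1.2/"))  # 1.2
print(parse_abundance(">F00QGH67HG"))          # None
```

A label that returns None here would lack the annotation that de novo mode requires.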

--db filename

  • Reference database in FASTA format. Optional; if not specified, uchime uses de novo mode.

  • ***WARNING*** The database is searched ONLY on the plus strand. You MUST include reverse-complemented sequences in the database if you want both strands to be searched.
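
Because only the plus strand is searched, one way to cover both strands is to append a reverse-complemented copy of every reference sequence to the database before running uchime. The sketch below is an illustration in Python under stated assumptions: the helper names and the "_rc" label suffix are hypothetical, and only the bases A, C, G, T and N are complemented.

```python
# Complement table for A, C, G, T and N; other IUPAC codes are left unchanged.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(lines):
    """Yield (label, sequence) pairs from an iterable of FASTA lines."""
    label, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if label is not None:
                yield label, "".join(chunks)
            label, chunks = line[1:], []
        else:
            chunks.append(line)
    if label is not None:
        yield label, "".join(chunks)


def with_reverse_complements(records, suffix="_rc"):
    """Yield each record followed by its reverse complement.

    The suffix added to the label is a hypothetical naming choice.
    """
    for label, seq in records:
        yield label, seq
        yield label + suffix, reverse_complement(seq)


# Tiny in-memory database used only for illustration.
db = [">ref1", "ACGTTG", ">ref2", "GGA", "TC"]
for label, seq in with_reverse_complements(read_fasta(db)):
    print(f">{label}\n{seq}")
```

In practice the same functions could be applied to the lines of a real database file, with the output written to a new FASTA file that is then passed to --db.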

--abskew x

  • Minimum abundance skew. Default 1.9. De novo mode only. Abundance skew is: min [ abund(parent1), abund(parent2) ] / abund(query).
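
As a worked illustration of the abundance skew formula, here is a small Python sketch. The function names are hypothetical, and treating the threshold as inclusive (skew >= abskew) is an assumption; the text above only says that --abskew is a minimum.

```python
def abundance_skew(parent1_abund, parent2_abund, query_abund):
    """min[abund(parent1), abund(parent2)] / abund(query)."""
    return min(parent1_abund, parent2_abund) / query_abund


def passes_abskew(parent1_abund, parent2_abund, query_abund, abskew=1.9):
    """True if the skew reaches the minimum (inclusive comparison assumed)."""
    return abundance_skew(parent1_abund, parent2_abund, query_abund) >= abskew


# Parents seen 10 and 6 times, query seen twice: skew = 6 / 2 = 3.0
print(abundance_skew(10, 6, 2))  # 3.0
print(passes_abskew(10, 6, 2))   # True
print(passes_abskew(3, 3, 2))    # False (skew is 1.5)
```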

--uchimeout filename

  • Output in tabbed format with one record per query sequence. First field is score (h), second field is query label. For details, see manual.
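
The first two fields of the --uchimeout file are enough for simple post-processing. The Python sketch below reads the score and query label from each record and lists the queries whose score reaches a chosen threshold. The function name and the sample records are made up for illustration, and the remaining fields (described in the manual) are ignored.

```python
import io


def read_uchimeout(handle):
    """Yield (score, query_label) from tab-separated uchimeout lines."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        yield float(fields[0]), fields[1]


# Made-up records for illustration; real files carry more fields per line.
sample = io.StringIO("0.0000\tseqA/ab=10.0/\n1.2500\tseqB/ab=1.0/\n")
minh = 0.3  # same value as the default --minh
flagged = [label for score, label in read_uchimeout(sample) if score >= minh]
print(flagged)  # ['seqB/ab=1.0/']
```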

--uchimealns filename

  • Multiple alignments of query sequences to parents in human-readable format. Alignments show columns with differences that support or contradict a chimeric model.

--minh h

  • Minimum score to report chimera. Default 0.3. Values from 0.1 to 5 might be reasonable. Lower values increase sensitivity but may report more false positives. If you decrease --xn, you may need to increase --minh, and vice versa.

--mindiv div

  • Minimum divergence ratio. Default 0.5. The divergence ratio is 100% - %identity between the query sequence and the closest candidate parent. If you don't care about very close chimeras, you could increase --mindiv to, say, 1.0 or 2.0, and also decrease --minh, say to 0.1, to increase sensitivity. How well this works will depend on your data. It is best to tune parameters on a good benchmark.
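
The divergence calculation is simple arithmetic, sketched here in Python. The function names are hypothetical, and the inclusive comparison against --mindiv is an assumption.

```python
def divergence(percent_identity):
    """Divergence in percent: 100% minus %identity to the closest parent."""
    return 100.0 - percent_identity


def passes_mindiv(percent_identity, mindiv=0.5):
    """True if divergence reaches the minimum (inclusive comparison assumed)."""
    return divergence(percent_identity) >= mindiv


print(divergence(99.5))          # 0.5
print(passes_mindiv(99.5))       # True
print(passes_mindiv(99.8))       # False (divergence is about 0.2)
print(passes_mindiv(98.0, 2.0))  # True
```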

--xn beta

  • Weight of a no vote, also called the beta parameter. Default 8.0. Decreasing this weight to around 3 or 4 may give better performance on denoised data.

--dn n

  • Pseudo-count prior on number of no votes. Default 1.4. Probably no good reason to change this unless you can retune to a good benchmark for your data. Reasonable values are probably in the range from 0.2 to 2.

--xa w

  • Weight of an abstain vote. Default 1. So far, results do not seem to be very sensitive to this parameter, but if you have a good training set it might be worth trying other values. Reasonable values might range from 0.1 to 2.

--chunks n

  • Number of chunks to extract from the query sequence when searching for parents. Default 4.

--[no]ovchunks

  • [Do not] use overlapping chunks. Default do not.

--minchunk n

  • Minimum length of a chunk. Default 64.

--idsmoothwindow w

  • Length of id smoothing window. Default 32.

--minsmoothid f

  • Minimum fractional identity over smoothed window of candidate parent. Default 0.95.

--maxp n

  • Maximum number of candidate parents to consider. Default 2. In tests so far, increasing --maxp gives only a very small improvement in sensitivity but tends to increase the error rate quite a bit.

--[no]skipgaps --[no]skipgaps2

  • These options control how gapped columns affect counting of diffs. If --skipgaps is specified, columns containing gaps are not counted as diffs. If --skipgaps2 is specified, a column immediately adjacent to a column containing a gap is not counted as a diff. Default is --skipgaps --skipgaps2.
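
The effect of the two gap rules can be illustrated on a pairwise alignment. The Python sketch below is a simplification under stated assumptions: uchime works on alignments of the query with its candidate parents, whereas this example compares only two aligned rows, and the function name count_diffs is hypothetical.

```python
def count_diffs(row_a, row_b, skipgaps=True, skipgaps2=True, gap="-"):
    """Count differing columns between two aligned rows.

    skipgaps:  ignore columns that contain a gap.
    skipgaps2: ignore columns immediately adjacent to a gapped column.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    n = len(row_a)
    has_gap = [a == gap or b == gap for a, b in zip(row_a, row_b)]
    diffs = 0
    for i, (a, b) in enumerate(zip(row_a, row_b)):
        if a == b:
            continue
        if skipgaps and has_gap[i]:
            continue
        next_to_gap = (i > 0 and has_gap[i - 1]) or (i + 1 < n and has_gap[i + 1])
        if skipgaps2 and next_to_gap:
            continue
        diffs += 1
    return diffs


a = "ACGT-ACGTA"
b = "ACCTTGCGTA"
# Differing columns: index 2 (G/C), index 4 (gap), index 5 (A/G, next to the gap).
print(count_diffs(a, b))                                   # 1
print(count_diffs(a, b, skipgaps2=False))                  # 2
print(count_diffs(a, b, skipgaps=False, skipgaps2=False))  # 3
```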

--minlen L --maxlen L

  • Minimum and maximum sequence length. Defaults 10, 10000. Applies to both query and reference sequences.

--ucl

  • Use local-X alignments. Default is global-X. On tests so far, global-X is always better; this option is retained because it just might work well on some future type of data.

--queryfract f

  • Minimum fraction of the query sequence that must be covered by a local-X alignment. Default 0.5. Applies only when --ucl is specified.

--quiet

  • Do not display progress messages on stderr.

--log filename

  • Write miscellaneous information to the log file. Mostly of interest to me (the algorithm developer). Use --verbose to get more info.

--self

  • In reference database mode, exclude a reference sequence if it has the same label as the query. This is useful for benchmarking by using the ref db as a query to test for false positives.
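
The label-based exclusion that --self describes can be pictured as a filter over the reference set. This Python sketch is only an illustration of that idea, not uchime's implementation; the function name and the tuple layout of the records are assumptions.

```python
def candidate_references(query_label, references, exclude_self=True):
    """Return reference records, dropping any whose label equals the query's."""
    return [
        (label, seq)
        for label, seq in references
        if not (exclude_self and label == query_label)
    ]


# Made-up reference records for illustration.
refs = [("ref1", "ACGT"), ("ref2", "GGCC"), ("ref3", "TTAA")]
print(candidate_references("ref2", refs))
# [('ref1', 'ACGT'), ('ref3', 'TTAA')]
```

When the reference database itself is used as the query, this kind of exclusion keeps each sequence from being matched to itself, so any reported chimera counts as a false positive.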

--abskew <float>

help

--absort <str>

help

--abx <float>

help

--allpairs <str>

help

--alpha <str>

help

--band <uint>

help

--blast6out <str>

help

--[no]blast_termgaps

help

--blastout <str>

help

--bump <uint>

help

--[no]cartoon_orfs

help

--cc <str>

help

--chain_evalue <float>

help

--chain_targetfract <float>

help

--chainhits <str>

help

--chainout <str>

help

--chunks <uint>

help

--clstr2uc <str>

help

--clump <str>

help

--clump2fasta <str>

help

--clumpfasta <str>

help

--clumpout <str>

help

--cluster <str>

help

--compilerinfo

Write info about compiler types and #defines to stdout.

--computekl <str>

help

--db <str>

help

--dbstep <uint>

help

--[no]denovo

help

--derep

help

--diffchar <str>

help

--dn <float>

help

--doug <str>

help

--droppct <uint>

help

--evalue <float>

help

--evalue_g <float>

help

--exact

help

--[no]fastalign

help

--fastapairs <str>

help

--fastq2fasta <str>

help

--findorfs <str>

help

--[no]flushuc

help

--frame <int>

help

--fspenalty <float>

help

--gapext <str>

help

--gapopen <str>

help

--getseqs <str>

help

--global

help

--hash

help

--hashsize <uint>

help

--help

Display command-line options.

--hireout <str>

help

--hspalpha <str>

help

--id <float>

help

--idchar <str>

help

--iddef <uint>

help

--idprefix <uint>

help

--ids <str>

help

--idsmoothwindow <uint>

help

--idsuffix <uint>

help

--indexstats <str>

help

--input <str>

help

--[no]isort

help

--k <uint>

help

--ka_dbsize <float>

help

--ka_gapped_k <float>

help

--ka_gapped_lambda <float>

help

--ka_ungapped_k <float>

help

--ka_ungapped_lambda <float>

help

--[no]label_ab

help

--labels <str>

help

--[no]leftjust

help

--lext <float>

help

--local

help

--log <str>

Log file name.

--[no]log_hothits

help

--[no]log_query

help

--[no]logmemgrows

help

--logopts

Log options.

--[no]logwordstats

help

--lopen <float>

help

--makeindex <str>

help

--match <float>

help

--matrix <str>

help

--max2 <uint>

help

--maxaccepts <uint>

help

--maxclump <uint>

help

--maxlen <uint>

help

--maxovd <uint>

help

--maxp <uint>

help

--maxpoly <uint>

help

--maxqgap <uint>

help

--maxrejects <uint>

help

--maxspan1 <uint>

help

--maxspan2 <uint>

help

--maxtargets <uint>

help

--maxtgap <uint>

help

--mcc <str>

help

--mergeclumps <str>

help

--mergesort <str>

help

--minchunk <uint>

help

--mincodons <uint>

help

--mindiffs <uint>

help

--mindiv <float>

help

--minh <float>

help

--minhsp <uint>

help

--minlen <uint>

help

--minorfcov <uint>

help

--minspanratio1 <float>

help

--minspanratio2 <float>

help

--[no]minus_frames

help

--mismatch <float>

help

--mkctest <str>

help

--[no]nb

help

--optimal

help

--orfstyle <uint>

help

--otusort <str>

help

--output <str>

help

--[no]output_rejects

help

--probmx <str>

help

--query <str>

help

--queryfract <float>

help

--querylen <uint>

help

--quiet

Turn off progress messages.

--randseed <uint>

help

--realign

help

--[no]rev

help

--[no]rightjust

help

--rowlen <uint>

help

--secs <uint>

help

--seeds <str>

help

--seedsout <str>

help

--seedt1 <float>

help

--seedt2 <float>

help

--self

help

--[no]selfid

help

--simcl <str>

help

--[no]skipgaps

help

--[no]skipgaps2

help

--sort <str>

help

--sortuc <str>

help

--sparsedist <str>

help

--sparsedistparams <str>

help

--split <float>

help

--[no]ssort

help

--sspenalty <float>

help

--[no]stable_sort

help

--staralign <str>

help

--stepwords <uint>

help

--strand <str>

help

--targetfract <float>

help

--targetlen <uint>

help

--tmpdir <str>

help

--[no]trace

help

--tracestate <str>

help

--[no]trunclabels

help

--[no]twohit

help

--uc <str>

help

--uc2clstr <str>

help

--uc2fasta <str>

help

--uc2fastax <str>

help

--uchime <str>

help

--uchimealns <str>

help

--uchimeout <str>

help

--[no]ucl

help

--uhire <str>

help

--ungapped

help

--userfields <str>

help

--userout <str>

help

--usersort

help

--uslink <str>

help

--[no]usort

help

--utax <str>

help

--[no]verbose

help

--version

Show version and exit.

--w <uint>

help

--weak_evalue <float>

help

--weak_id <float>

help

--[no]wordcountreject

help

--[no]wordweight

help

--xa <float>

help

--xdrop_g <float>

help

--xdrop_nw <float>

help

--xdrop_u <float>

help

--xdrop_ug <float>

help

--xframe <str>

help

--xlat

help

--xn <float>

help

AUTHOR

Robert C. Edgar

SEE ALSO

http://www.drive5.com/uchime