SYNOPSIS

readseq [-options] in.seq > out.seq

DESCRIPTION

This manual page briefly documents the readseq command. It was written for the Debian GNU/Linux distribution because the original program does not have a manual page; its documentation is instead provided in text form (see below).

readseq reads and writes biosequences (nucleic/protein) in various formats. Data files may have multiple sequences. readseq is particularly useful as it automatically detects many sequence formats, and interconverts among them.
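As a minimal sketch of the automatic detection described above, the following shell fragment writes a small Pearson/Fasta file and, if readseq is installed, converts it to GenBank format without naming the input format. The file names demo.fasta and demo.gb and the sequence data are made up for illustration; the options are the ones listed under OPTIONS.

```shell
# Write a tiny two-sequence Pearson/Fasta file (illustrative data only).
cat > demo.fasta <<'EOF'
>seq1 first demo sequence
ACGTACGTACGTACGT
>seq2 second demo sequence
TTGACCAGTTGACCAG
EOF

# Count the records we just wrote; each record starts with '>'.
n_records=$(grep -c '^>' demo.fasta)
echo "wrote $n_records sequences to demo.fasta"

# Convert only when readseq is actually on PATH (an assumption about the host).
if command -v readseq >/dev/null 2>&1; then
    # The input format is not specified; readseq is expected to detect it.
    readseq demo.fasta -all -format=genbank -output=demo.gb
else
    echo "readseq not found; skipping conversion" >&2
fi
```

The -all option is included because the input holds more than one sequence.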

FORMATS

Formats which readseq currently understands:

  * IG/Stanford, used by Intelligenetics and others
  * GenBank/GB, genbank flatfile format
  * NBRF format
  * EMBL, EMBL flatfile format
  * GCG, single sequence format of GCG software
  * DNAStrider, used by a common Mac program
  * Fitch format, limited use
  * Pearson/Fasta, a common format used by Fasta programs and others
  * Zuker format, limited use. Input only.
  * Olsen, format printed by Olsen VMS sequence editor. Input only.
  * Phylip3.2, sequential format for Phylip programs
  * Phylip, interleaved format for Phylip programs (v3.3, v3.4)
  * Plain/Raw, sequence data only (no name, document, numbering)
  * MSF multi sequence format used by GCG software
  * PAUP's multiple sequence (NEXUS) format
  * PIR/CODATA format used by PIR
  * ASN.1 format used by NCBI
  * Pretty print with various options for nice looking output. Output only.
  * LinAll format, limited use (LinAll and ConStruct programs)
  * Vienna format used by ViennaRNA programs

See the included "Formats" file for details on the file formats.

OPTIONS

-help

Show summary of options.

-a[ll]

Select All sequences

-c[aselower]

Change to lower case

-C[ASEUPPER]

Change to UPPER CASE

-degap[=-]

Remove gap symbols

-i[tem=2,3,4]

Select Item number(s) from several

-l[ist]

List sequences only

-o[utput=]out.seq

Redirect Output

-p[ipe]

Pipe (command line, <stdin, >stdout)

-r[everse]

Change to Reverse-complement

-v[erbose]

Verbose progress

-f[ormat=]#

Format number for output, or

-f[ormat=]Name

Format name for output:

    1. IG/Stanford           11. Phylip3.2
    2. GenBank/GB            12. Phylip
    3. NBRF                  13. Plain/Raw
    4. EMBL                  14. PIR/CODATA
    5. GCG                   15. MSF
    6. DNAStrider            16. ASN.1
    7. Fitch                 17. PAUP/NEXUS
    8. Pearson/Fasta         18. Pretty (out-only)
    9. Zuker (in-only)       19. LinAll
   10. Olsen (in-only)       20. Vienna
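
The table above can be turned into a small lookup, which may be handy in scripts that pass a format number to -format. This is only a sketch: readseq_format_name and readseq_can_write are hypothetical helper names, not part of readseq, and the mapping is copied from the table.

```shell
# Map a readseq format number to the name shown in the table above.
# (Hypothetical helper; not provided by readseq itself.)
readseq_format_name() {
    case "$1" in
        1)  echo "IG/Stanford" ;;
        2)  echo "GenBank/GB" ;;
        3)  echo "NBRF" ;;
        4)  echo "EMBL" ;;
        5)  echo "GCG" ;;
        6)  echo "DNAStrider" ;;
        7)  echo "Fitch" ;;
        8)  echo "Pearson/Fasta" ;;
        9)  echo "Zuker" ;;
        10) echo "Olsen" ;;
        11) echo "Phylip3.2" ;;
        12) echo "Phylip" ;;
        13) echo "Plain/Raw" ;;
        14) echo "PIR/CODATA" ;;
        15) echo "MSF" ;;
        16) echo "ASN.1" ;;
        17) echo "PAUP/NEXUS" ;;
        18) echo "Pretty" ;;
        19) echo "LinAll" ;;
        20) echo "Vienna" ;;
        *)  echo "unknown format number: $1" >&2; return 1 ;;
    esac
}

# Succeed only for numbers that are valid as an output format.
readseq_can_write() {
    case "$1" in
        9|10) return 1 ;;   # Zuker and Olsen are input-only
        *)    readseq_format_name "$1" >/dev/null 2>&1 ;;
    esac
}

fmt=8
if readseq_can_write "$fmt"; then
    echo "format $fmt ($(readseq_format_name "$fmt")) can be used with -format=$fmt"
fi
```

Checking for the input-only formats before calling readseq lets a script fail early with its own message.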

Pretty format options:

-wid[th]=#

Sequence line width

-tab=#

Left indent

-col[space]=#

Column space within sequence line on output

-gap[count]

Count gap chars in sequence numbers

-nameleft, -nameright[=#]

Name on left/right side [=max width]

-nametop

Name at top/bottom

-numleft, -numright

Seq index on left/right side

-numtop, -numbot

Index on top/bottom

-match[=.]

Use match base for 2..n species

-inter[line=#]

Blank line(s) between sequence blocks
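
To illustrate how the pretty format options combine, the sketch below assembles a command line from a few of them and prints it rather than running it. pretty_cmd is a hypothetical helper, in.seq is a placeholder file name, and the width of 60 is an arbitrary choice.

```shell
# Build a readseq pretty-print command from options listed above.
# (Hypothetical helper; it only prints the command, it does not run it.)
pretty_cmd() {
    input=$1
    width=$2
    # Reuse the positional parameters as an argument list, then join with spaces.
    set -- readseq "$input" -all -format=pretty "-width=$width" \
        -nameleft -numright -numtop
    echo "$*"
}

cmd=$(pretty_cmd in.seq 60)
echo "$cmd"
```

Printing the command first makes it easy to inspect the option set before running it against real data.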

EXAMPLES

  readseq
      -- for interactive use
  readseq my.1st.seq  my.2nd.seq  -all  -format=genbank  -output=my.gb
      -- convert all of two input files to one genbank format output file
  readseq my.seq -all -form=pretty -nameleft=3 -numleft -numright -numtop -match
      -- output to standard output a file in a pretty format
  readseq my.seq -item=9,8,3,2 -degap -CASE -rev -f=msf -out=my.rev
      -- select 4 items from input, degap, reverse, and uppercase them
  cat *.seq | readseq -pipe -all -format=asn > bunch-of.asn
      -- pipe a bunch of data thru readseq, converting all to asn
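
The second example above merges its inputs into one output file. As a sketch of the alternative, the loop below converts each .seq file in the current directory to its own GenBank file. gb_name is a hypothetical helper, and the .gb suffix is just a naming choice; when readseq is not installed the loop only prints what it would run.

```shell
# Derive an output name by swapping the .seq suffix for .gb.
# (Hypothetical helper used only in this sketch.)
gb_name() {
    echo "${1%.seq}.gb"
}

for f in *.seq; do
    # If no .seq files exist, the pattern stays literal; skip it.
    [ -e "$f" ] || continue
    out=$(gb_name "$f")
    if command -v readseq >/dev/null 2>&1; then
        readseq "$f" -all -format=genbank -output="$out"
    else
        echo "would run: readseq $f -all -format=genbank -output=$out"
    fi
done
```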

SEE ALSO

The program is fully documented in text form; see the files in /usr/share/doc/readseq.

AUTHOR

This manual page was written by Stephane Bortzmeyer for the Debian GNU/Linux system (but may be used by others).