SYNOPSIS

cat in.vcf | vcf-annotate [OPTIONS] > out.vcf

DESCRIPTION

About: Annotates a VCF file, adding filters or custom annotations. Requires a tabix-indexed file with the annotations.

  • Currently annotates only the INFO column, but it will be extended on demand.

OPTIONS

-a, --annotations <file.gz>

The tabix-indexed file with the annotations, one tab-delimited record per line: CHR\tFROM[\tTO][\tVALUE]+.

-c, --columns <list>

The list of columns in the annotation file, e.g. CHROM,FROM,TO,-,INFO/STR,INFO/GN. The dash in this example indicates that the fourth column should be ignored. If TO is not present, TO is assumed to equal FROM.
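As a rough sketch of how the annotation file and the column list fit together, the commands below build a small tab-delimited file, sort it, and (only if bgzip and tabix are installed) compress and index it. The file names annotations.tab and annotations.sorted.tab, the gene names, and the coordinates are made-up placeholders, not part of vcf-annotate.

```shell
# Hypothetical annotation rows: CHR, FROM, TO, gene name, strand (tab-delimited).
printf '1\t1000\t2000\tGENE_A\t1\n'  > annotations.tab
printf '1\t5000\t5000\tGENE_B\t-1\n' >> annotations.tab
printf '2\t300\t900\tGENE_C\t1\n'    >> annotations.tab

# tabix expects the file sorted by chromosome, then by start position.
sort -k1,1 -k2,2n annotations.tab > annotations.sorted.tab

# Compress and index only when both tools are available.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    bgzip -c annotations.sorted.tab > annotations.gz
    # -s: sequence (chromosome) column, -b: start column, -e: end column
    tabix -f -s 1 -b 2 -e 3 annotations.gz
fi
```

For this assumed layout, the matching column list would be -c CHROM,FROM,TO,INFO/GN,INFO/STR.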

-d, --description <file|string>

Header annotation, e.g. key=INFO,ID=HM2,Number=0,Type=Flag,Description='HapMap2 membership'. The descriptions can be read from a file, one annotation per line.

-f, --filter <list>

Apply filters. The list has the format flt1=value/flt2/flt3=value/etc.

-h, -?, --help

This help message.

Filters:

+

Apply all filters with default values (can be overridden, see the example below).

-X

Exclude the filter X.

1, StrandBias

FLOAT Min P-value for strand bias (given PV4) [0.0001]

2, BaseQualBias

FLOAT Min P-value for baseQ bias [1e-100]

3, MapQualBias

FLOAT Min P-value for mapQ bias [0]

4, EndDistBias

FLOAT Min P-value for end distance bias [0.0001]

a, MinAB

INT Minimum number of alternate bases [2]

c, SnpCluster

INT1,INT2 Filters clusters of 'INT1' or more SNPs within a run of 'INT2' bases []

D, MaxDP

INT Maximum read depth [10000000]

d, MinDP

INT Minimum read depth [2]

q, MinMQ

INT Minimum RMS mapping quality for SNPs [10]

Q, Qual

INT Minimum value of the QUAL field [10]

r, RefN

Reference base is N []

W, GapWin

INT Window size for filtering adjacent gaps [10]

w, SnpGap

INT SNP within INT bp around a gap to be filtered [10]
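To make the filter list format easier to read, the following sketch splits a specification on '/' and prints what each token means according to the descriptions above. The helper names describe_token and explain_filters are hypothetical and exist only for illustration. This is not how vcf-annotate itself parses the list.

```shell
# Filter specification of the form flt1=value/flt2/flt3=value.
spec='+/-a/c=3,10/q=3/d=5/-D'

# Describe one '/'-separated token of the filter list.
describe_token() {
    case "$1" in
        +)   echo "apply all filters with default values" ;;
        -?*) echo "exclude filter ${1#-}" ;;
        *=*) echo "set filter ${1%%=*} to ${1#*=}" ;;
        *)   echo "enable filter $1 without an explicit value" ;;
    esac
}

# Split the specification on '/' and describe each token in order.
explain_filters() {
    old_ifs=$IFS
    IFS='/'
    set -f              # avoid glob expansion of the tokens
    for token in $1; do
        describe_token "$token"
    done
    set +f
    IFS=$old_ifs
}

explain_filters "$spec"
```

Read this way, the specification applies all defaults, drops MinAB (a) and MaxDP (D), and overrides SnpCluster (c), MinMQ (q), and MinDP (d).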

Example:

  • zcat in.vcf.gz | vcf-annotate -a annotations.gz -d descriptions.txt | bgzip -c >out.vcf.gz
  • zcat in.vcf.gz | vcf-annotate -f +/-a/c=3,10/q=3/d=5/-D -a annotations.gz -d descriptions.txt | bgzip -c >out.vcf.gz

Where descriptions.txt contains:

  • key=INFO,ID=GN,Number=1,Type=String,Description='Gene Name'
  • key=INFO,ID=STR,Number=1,Type=Integer,Description='Strand'
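One way to create such a file is with a here-document, as sketched below. The guarded pipeline at the end runs only if vcf-annotate, bgzip, and the input files are present. The -c column list in it is an assumption about the layout of annotations.gz (chromosome, start, end, gene name, strand) and is not taken from the example above.

```shell
# Write the two header descriptions, one annotation per line.
cat > descriptions.txt <<'EOF'
key=INFO,ID=GN,Number=1,Type=String,Description='Gene Name'
key=INFO,ID=STR,Number=1,Type=Integer,Description='Strand'
EOF

# Run the annotation only if the tools and the input files are available.
if command -v vcf-annotate >/dev/null 2>&1 \
   && command -v bgzip >/dev/null 2>&1 \
   && [ -f in.vcf.gz ] && [ -f annotations.gz ]; then
    # Assumed annotation layout: CHR, FROM, TO, gene name, strand.
    zcat in.vcf.gz \
        | vcf-annotate -a annotations.gz -d descriptions.txt \
                       -c CHROM,FROM,TO,INFO/GN,INFO/STR \
        | bgzip -c > out.vcf.gz
fi
```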