gmt music path-scan

VERSION

This document describes gmt music path-scan version 0.04 (2013-05-14 at 16:03:05)

SYNOPSIS

gmt music path-scan --gene-covg-dir=? --bam-list=? --pathway-file=? --maf-file=? --output-file=? [--bmr=?] [--genes-to-ignore=?] [--min-mut-genes-per-path=?] [--skip-non-coding] [--skip-silent]

 ... music path-scan \
        --bam-list input_dir/bam_file_list \
        --gene-covg-dir output_dir/gene_covgs/ \
        --maf-file input_dir/myMAF.tsv \
        --output-file output_dir/sm_pathways \
        --pathway-file input_dir/pathway_dbs/KEGG.txt \
        --bmr 8.7E-07

REQUIRED ARGUMENTS

gene-covg-dir Text

Directory containing per-gene coverage files (Created using music bmr calc-covg)

bam-list Text

Tab-delimited list of BAM files [sample_name, normal_bam, tumor_bam] (See Description)

pathway-file Text

Tab-delimited file of pathway information (See Description)

maf-file Text

List of mutations using TCGA MAF specifications v2.3

output-file Text

Output file that will list the significant pathways and their p-values

OPTIONAL ARGUMENTS

bmr Number

Background mutation rate in the targeted regions. Default value '1e-06' if not specified.

genes-to-ignore Text

Comma-delimited list of genes whose mutations should be ignored

min-mut-genes-per-path Number

Pathways with fewer mutated genes than this will be ignored. Default value '1' if not specified.

skip-non-coding Boolean

Skip non-coding mutations from the provided MAF file. Default value 'true' if not specified.

skip-silent Boolean

Skip silent mutations from the provided MAF file. Default value 'true' if not specified.

DESCRIPTION

Only the following four columns in the MAF are used. All other columns may be left blank.

 Col 1: Hugo_Symbol (Need not be HUGO, but must match gene names used in the pathway file)
 Col 2: Entrez_Gene_Id (Matching Entrez IDs trump gene name matches between pathway file and MAF)
 Col 9: Variant_Classification
 Col 16: Tumor_Sample_Barcode (Must match the sample name in the bam-list file, or contain it as a substring)

The Entrez_Gene_Id can also be left blank (or set to 0), but providing it is highly recommended in case genes are named differently in the pathway file and the MAF file.
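
As an illustration of the column layout described above, the following Python sketch pulls the four used fields out of tab-delimited MAF rows. It is not part of the tool: the function name read_maf_fields is hypothetical, and the handling of header and comment lines is an assumption about typical MAF files.

```python
import csv
import io

# 1-based MAF columns 1, 2, 9 and 16, written as 0-based indexes.
COL_HUGO, COL_ENTREZ, COL_CLASS, COL_SAMPLE = 0, 1, 8, 15


def read_maf_fields(handle):
    """Yield (gene, entrez_id, classification, tumor_sample) per MAF row.

    Hypothetical helper for illustration only.
    """
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Assumption: skip blank lines, '#' comment lines and the header row.
        if not row or row[0].startswith("#") or row[0] == "Hugo_Symbol":
            continue
        # Pad short rows so that unused trailing columns may be absent.
        row = row + [""] * (COL_SAMPLE + 1 - len(row))
        entrez = row[COL_ENTREZ].strip() or "0"  # a blank ID is treated like 0
        yield row[COL_HUGO], entrez, row[COL_CLASS], row[COL_SAMPLE]


# One made-up row with only the four used columns filled in.
example_row = (["TP53", "7157"] + [""] * 6 + ["Missense_Mutation"]
               + [""] * 6 + ["SAMPLE_01"])
example_maf = io.StringIO("\t".join(example_row) + "\n")
records = list(read_maf_fields(example_maf))
```

The padding step reflects the statement that all other columns may be left blank; the sample row is invented for the example.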

ARGUMENTS

--pathway-file

For example, a line in the pathway-file would look like:

 hsa00061  Fatty acid biosynthesis  Lipid Metabolism  31:ACACA|32:ACACB|27349:MCAT|2194:FASN|54995:OXSM|55301:OLAH

Ensure that the gene names and Entrez IDs used match those used in the MAF file. Entrez IDs are not mandatory (use a 0 if the Entrez ID is unknown). But if a gene name in the MAF does not match any gene name in this file, the Entrez IDs are used to find a match (unless the ID is 0).
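
The pathway line format and the matching rule above can be sketched in Python as follows. This is a minimal illustration, not the tool's own code: parse_pathway_line and gene_in_pathway are hypothetical names, and the sketch assumes the first four tab-separated fields are the pathway ID, name, class and gene list, in that order.

```python
def parse_pathway_line(line):
    """Split one pathway-file line into (path_id, name, pathway_class, genes).

    genes maps gene name -> Entrez ID string ("0" means unknown).
    Assumption: the first four tab-separated fields are ID, name, class
    and a '|'-separated list of entrez_id:gene_name pairs.
    """
    fields = line.rstrip("\n").split("\t")
    path_id, name, pathway_class, gene_field = fields[:4]
    genes = {}
    for entry in gene_field.split("|"):
        entrez_id, _, gene_name = entry.partition(":")
        genes[gene_name] = entrez_id
    return path_id, name, pathway_class, genes


def gene_in_pathway(gene_name, entrez_id, genes):
    """Match by gene name first; fall back to the Entrez ID unless it is 0."""
    if gene_name in genes:
        return True
    return entrez_id not in ("", "0") and entrez_id in genes.values()


example_line = ("hsa00061\tFatty acid biosynthesis\tLipid Metabolism\t"
                "31:ACACA|32:ACACB|27349:MCAT|2194:FASN|54995:OXSM|55301:OLAH")
path_id, name, pathway_class, genes = parse_pathway_line(example_line)
```

With this sketch, a MAF gene named differently from the pathway file still matches when its Entrez ID agrees, for example gene_in_pathway("FAS", "2194", genes).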

--gene-covg-dir

--bam-list

Provide a file containing sample names and normal/tumor BAM locations for each. Use the tab-delimited format [sample_name normal_bam tumor_bam] per line. This tool only needs sample_name, so all other columns can be skipped. The sample_name must be the same as the tumor sample names used in the MAF file (16th column, with the header Tumor_Sample_Barcode).
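
A short Python sketch of how sample names from the bam-list could be paired with Tumor_Sample_Barcode values. The function names are hypothetical, and the choice to try an exact match before a substring match is an assumption; the documentation only states that the barcode must match the sample name or contain it as a substring.

```python
def read_sample_names(lines):
    """Return the first tab-separated field of every non-blank line."""
    names = []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip():
            names.append(line.split("\t")[0])
    return names


def match_sample(barcode, sample_names):
    """Return the sample name equal to, or contained in, barcode (else None).

    Assumption: exact matches win over substring matches.
    """
    for sample in sample_names:
        if sample == barcode:
            return sample
    for sample in sample_names:
        if sample in barcode:
            return sample
    return None


# Made-up bam-list lines; the second omits the optional BAM columns.
bam_list_lines = ["S1\tnormal_1.bam\ttumor_1.bam\n", "S2\n"]
sample_names = read_sample_names(bam_list_lines)
```

Because only sample_name is needed, the sketch accepts lines with or without the two BAM columns.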

--bmr

--genes-to-ignore

A comma-delimited list of genes to ignore from the MAF file. This is useful when there are recurrently mutated genes like TP53 which might mask the significance of other genes.
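
The effect of a comma-delimited ignore list can be illustrated with a small Python sketch. The helper names and the (gene, classification) tuples are hypothetical and only demonstrate the idea of dropping mutations in listed genes before any pathway test is run.

```python
def parse_ignore_list(value):
    """Turn a comma-delimited string such as 'TP53,TTN' into a set of names."""
    return {gene.strip() for gene in value.split(",") if gene.strip()}


def drop_ignored(mutations, ignore):
    """Keep only the mutations whose gene (first tuple item) is not ignored."""
    return [mutation for mutation in mutations if mutation[0] not in ignore]


# Made-up mutations as (gene, variant classification) pairs.
mutations = [
    ("TP53", "Missense_Mutation"),
    ("FASN", "Nonsense_Mutation"),
    ("TP53", "Frame_Shift_Del"),
]
kept = drop_ignored(mutations, parse_ignore_list("TP53, TTN"))
```

Here both TP53 entries are removed, so only the FASN mutation remains.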

AUTHORS

Michael Wendl, Ph.D.

CREDITS

This module uses reformatted copies of data from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database:

* KEGG - http://www.genome.jp/kegg/