diff snpSift_filter.xml @ 4:baf6602903e1
Uploaded
| author | jjohnson |
|---|---|
| date | Wed, 09 Dec 2015 14:03:26 -0500 |
| parents | 796388c291d3 |
| children | 675fa55f5c02 824f78c0d0df |
--- a/snpSift_filter.xml	Thu Oct 23 06:06:25 2014 -0500
+++ b/snpSift_filter.xml	Wed Dec 09 14:03:26 2015 -0500
@@ -1,11 +1,13 @@
-<tool id="snpSift_filter" name="SnpSift Filter" version="4.0.0">
+<tool id="snpSift_filter" name="SnpSift Filter" version="@WRAPPER_VERSION@.0">
     <options sanitize="False" />
     <description>Filter variants using arbitrary expressions</description>
+    <macros>
+        <import>snpSift_macros.xml</import>
+    </macros>
     <expand macro="requirements" />
-    <macros>
-        <import>snpEff_macros.xml</import>
-    </macros>
-    <command>
+    <expand macro="stdio" />
+    <expand macro="version_command" />
+    <command><![CDATA[
         java -Xmx6G -jar \$SNPEFF_JAR_PATH/SnpSift.jar filter -f $input -e $exprFile $inverse
         #if $filtering.mode == 'field':
             #if $filtering.replace.pass:
@@ -22,6 +24,7 @@
             #end if
         #end if
         > $output
+]]>
     </command>
     <inputs>
         <param format="vcf" name="input" type="data" label="Variant input file in VCF format"/>
@@ -57,7 +60,6 @@
     <outputs>
         <data format="vcf" name="output" />
     </outputs>
-    <expand macro="stdio" />
     <tests>
         <test>
             <param name="input" ftype="vcf" value="test01.vcf"/>
@@ -85,7 +87,7 @@
 
         <test>
             <param name="input" ftype="vcf" value="test01.vcf"/>
-            <param name="expr" value="(POS >= 20175) & (POS <= 35549)"/>
+            <param name="expr" value="(POS >= 20175) & (POS <= 35549)"/>
             <param name="mode" value="entries"/>
             <output name="output">
                 <assert_contents>
@@ -111,11 +113,11 @@
             </output>
         </test>
     </tests>
-    <help>
+    <help><![CDATA[
 
 **SnpSift filter**
 
-You can filter ia vcf file using arbitrary expressions, for instance "(QUAL > 30) | (exists INDEL) | ( countHet() > 2 )". The actual expressions can be quite complex, so it allows for a lot of flexibility.
+You can filter a VCF file using arbitrary expressions, for instance "(QUAL > 30) | (exists INDEL) | ( countHet() > 2 )". The actual expressions can be quite complex, so it allows for a lot of flexibility.
 
 Some examples:
 
@@ -123,7 +125,7 @@
 
     ::
 
-      ( CHROM = 'chr1' ) & ( POS > 1000000 ) & ( POS < 2000000 )
+      ( CHROM = 'chr1' ) & ( POS > 1000000 ) & ( POS < 2000000 )
 
   - *Filter value is either 'PASS' or it is missing*:
 
@@ -131,11 +133,13 @@
 
       (FILTER = 'PASS') | ( na FILTER )
 
-  - *I want to filter lines with an EFF of 'frameshift_variant' ( for vcf files using Sequence Ontology terms )*:
+  - *I want to filter lines with an ANN annotation EFFECT of 'frameshift_variant' ( for vcf files using Sequence Ontology terms )*:
 
     ::
 
-      ( EFF[*].EFFECT = 'frameshift_variant' )
+      ( ANN[*].EFFECT has 'frameshift_variant' )
+
+    **Important** According to the specification, there can be more than one EFFECT separated by & (e.g. 'missense_variant&splice_region_variant', thus using has operator is better than using equality operator (=). For instance 'missense_variant&splice_region_variant' = 'missense_variant' is false, whereas 'missense_variant&splice_region_variant' has 'missense_variant' is true.
 
   - *I want to filter lines with an EFF of 'FRAME_SHIFT' ( for vcf files using Classic Effect names )*:
 
@@ -147,31 +151,31 @@
 
     ::
 
-      ( QUAL > 30 )
+      ( QUAL > 30 )
 
   - *...but we also want InDels that have quality 20 or more*:
 
     ::
 
-      (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
+      (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
 
   - *...or any homozygous variant present in more than 3 samples*:
 
     ::
 
-      (countHom() > 3) | (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
+      (countHom() > 3) | (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
 
   - *...or any heterozygous sample with coverage 25 or more*:
 
     ::
 
-      ((countHet() > 0) & (DP >= 25)) | (countHom() > 3) | (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
+      ((countHet() > 0) & (DP >= 25)) | (countHom() > 3) | (( exists INDEL ) & (QUAL >= 20)) | (QUAL >= 30 )
 
   - *I want to keep samples where the genotype for the first sample is homozygous variant and the genotype for the second sample is reference*:
 
     ::
 
-      (isHom( GEN[0] ) & isVariant( GEN[0] ) & isRef( GEN[1] ))
+      (isHom( GEN[0] ) & isVariant( GEN[0] ) & isRef( GEN[1] ))
 
 **For information regarding HGVS and Sequence Ontology terms versus classic names**:
 
@@ -185,5 +189,6 @@
 
 @CITATION_SECTION@
 
+]]>
     </help>
 </tool>
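
The **Important** note added to the help text contrasts the `has` operator with `=` for EFFECT values that hold several terms joined by `&`. A minimal Python sketch of that distinction follows. It is only a model of the behaviour described in the note, not SnpSift's implementation, and the function names `effect_equals` and `effect_has` are hypothetical.

```python
def effect_equals(field: str, term: str) -> bool:
    """Model of `=`: the whole field value must match the term exactly."""
    return field == term


def effect_has(field: str, term: str) -> bool:
    """Model of `has`: true if any '&'-separated item equals the term.

    Assumption: splitting on '&' stands in for how multiple effects are
    separated, as described in the help text; this is not SnpSift code.
    """
    return term in field.split("&")


combined = "missense_variant&splice_region_variant"
print(effect_equals(combined, "missense_variant"))  # False
print(effect_has(combined, "missense_variant"))     # True
```

This mirrors the example in the note: equality fails on the combined value, while the membership check succeeds.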
