annotate filtering.xml @ 4:a188de29f06e

Uploaded
author g2cmnty@test-web1.g2.bx.psu.edu
date Tue, 28 Jun 2011 10:29:09 -0400
parents 6ad594db0143
children
<tool id="Filter1" name="Filter" version="1.0.1">
  <description>data on any column using simple expressions</description>
  <command interpreter="python">
    filtering.py $input $out_file1 "$cond" ${input.metadata.columns} "${input.metadata.column_types}"
  </command>
  <inputs>
    <param format="tabular" name="input" type="data" label="Filter" help="Query missing? See TIP below."/>
    <param name="cond" size="40" type="text" value="c1=='chr22'" label="With following condition" help="Double equal signs, ==, must be used as shown above. To filter for an arbitrary string, use the Select tool.">
      <validator type="empty_field" message="Enter a valid filtering condition, see syntax and examples below."/>
    </param>
  </inputs>
  <outputs>
    <data format="input" name="out_file1" metadata_source="input"/>
  </outputs>
  <tests>
    <test>
      <param name="input" value="1.bed"/>
      <param name="cond" value="c1=='chr22'"/>
      <output name="out_file1" file="filter1_test1.bed"/>
    </test>
    <test>
      <param name="input" value="7.bed"/>
      <param name="cond" value="c1=='chr1' and c3-c2>=2000 and c6=='+'"/>
      <output name="out_file1" file="filter1_test2.bed"/>
    </test>
  </tests>
  <help>

.. class:: warningmark

Double equal signs, ==, must be used as *"equal to"* (e.g., **c1 == 'chr22'**)

.. class:: infomark

**TIP:** Applying a filtering condition may throw exceptions if, on some line, the data type (e.g., string, integer) of a column being filtered is not appropriate for the condition (e.g., attempting certain numerical calculations on strings). If an exception is thrown when applying the condition to a line, that line is skipped as invalid for the filter condition. The number of invalid skipped lines is documented in the resulting history item as a "Condition/data issue".

.. class:: infomark

**TIP:** If your data is not TAB delimited, use *Text Manipulation-&gt;Convert*

-----

**Syntax**

The filter tool allows you to restrict the dataset using simple conditional statements.

- Columns are referenced with **c** and a **number**. For example, **c1** refers to the first column of a tab-delimited file
- Make sure that multi-character operators contain no white space ( e.g., **&lt;=** is valid while **&lt; =** is not valid )
- When using the 'equal-to' operator, **double equal sign '==' must be used** ( e.g., **c1=='chr1'** )
- Non-numerical values must be enclosed in single or double quotes ( e.g., **c6=='+'** )
- The filtering condition can include logical operators, but **make sure operators are all lower case** ( e.g., **(c1!='chrX' and c1!='chrY') or not c6=='+'** )

-----

**Example**

- **c1=='chr1'** selects lines in which the first column is chr1
- **c3-c2&lt;100*c4** selects lines where column 3 minus column 2 is less than the value of column 4 times 100
- **len(c2.split(',')) &lt; 4** selects lines where the second column has fewer than four comma-separated elements
- **c2>=1** selects lines in which the value of column 2 is greater than or equal to 1
- Numbers should not contain commas - **c2&lt;=44,554,350** will not work, but **c2&lt;=44554350** will
- Some words in the data can be used, but must be single or double quoted ( e.g., **c3=='exon'** )

  </help>
</tool>
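<!--
Illustrative note, not part of the tool: the help text above says that each
column is exposed as c1, c2, ... and that a line is skipped when the condition
raises an exception on it. The sketch below shows one way that behaviour could
be implemented. It is NOT the actual filtering.py, whose source is not shown
here. The function name filter_lines, the type names "int" and "float", and the
sample rows are assumptions made for illustration only.

```python
def filter_lines(lines, cond, column_types):
    """Return (kept_lines, skipped_count) for tab-delimited `lines`.

    Hypothetical helper: casts each field by its declared type, binds the
    values to c1, c2, ... and evaluates `cond` as a Python expression.
    """
    casts = {"int": int, "float": float}  # any other type name stays a string
    code = compile(cond, "<cond>", "eval")  # a malformed condition fails here
    kept, skipped = [], 0
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        try:
            env = {"len": len}  # the only builtin this sketch exposes
            for i, (value, ctype) in enumerate(zip(fields, column_types), start=1):
                env["c%d" % i] = casts.get(ctype, str)(value)
            if eval(code, {"__builtins__": {}}, env):
                kept.append(line)
        except Exception:
            # Bad cast, missing column, or type error: count and skip the line
            skipped += 1
    return kept, skipped


rows = [
    "chr1\t100\t5000\tf1\t0\t+\n",
    "chr1\t100\t200\tf2\t0\t+\n",
    "chr2\tx\t900\tf3\t0\t-\n",  # "x" cannot be cast to int, so it is skipped
]
types = ["str", "int", "int", "str", "int", "str"]
kept, skipped = filter_lines(rows, "c1=='chr1' and c3-c2>=2000 and c6=='+'", types)
print(len(kept), skipped)  # prints "1 1"
```

This sketch casts every column, so a bad value in a column the condition never
mentions also causes a skip; a real implementation may cast only the columns it
needs. Emptying __builtins__ is not a security sandbox, so eval should not be
given untrusted conditions on this basis alone.
-->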