annotate fastx_collapser.xml @ 1:460c78dbadf8

Remove spurious version strings.
author Dave Bouvier <dave@bx.psu.edu>
date Tue, 26 Nov 2013 12:47:40 -0500
parents 9246516d9dd5
children 17bfc147c9ea
<tool id="cshl_fastx_collapser" version="1.0.0" name="Collapse">
  <description>sequences</description>
  <requirements>
    <requirement type="package" version="0.0.13">fastx_toolkit</requirement>
  </requirements>
  <command>zcat -f '$input' | fastx_collapser -v -o '$output'
#if $input.ext == "fastqsanger":
-Q 33
#end if
  </command>

  <inputs>
    <param format="fasta,fastqsanger,fastqsolexa" name="input" type="data" label="Library to collapse" />
  </inputs>
  <!-- The order of sequences in the test output differs between 32-bit and 64-bit machines.
  <tests>
    <test>
      <param name="input" value="fasta_collapser1.fasta" />
      <output name="output" file="fasta_collapser1.out" />
    </test>
  </tests>
  -->
  <outputs>
    <data format="fasta" name="output" metadata_source="input" />
  </outputs>
  <help>

**What it does**

This tool collapses identical sequences in a FASTA file into a single sequence.

--------

**Example**

Example Input File (Sequence "ATAT" appears multiple times)::

    >CSHL_2_FC0042AGLLOO_1_1_605_414
    TGCG
    >CSHL_2_FC0042AGLLOO_1_1_537_759
    ATAT
    >CSHL_2_FC0042AGLLOO_1_1_774_520
    TGGC
    >CSHL_2_FC0042AGLLOO_1_1_742_502
    ATAT
    >CSHL_2_FC0042AGLLOO_1_1_781_514
    TGAG
    >CSHL_2_FC0042AGLLOO_1_1_757_487
    TTCA
    >CSHL_2_FC0042AGLLOO_1_1_903_769
    ATAT
    >CSHL_2_FC0042AGLLOO_1_1_724_499
    ATAT

Example Output file::

    >1-1
    TGCG
    >2-4
    ATAT
    >3-1
    TGGC
    >4-1
    TGAG
    >5-1
    TTCA

.. class:: infomark

Original Sequence Names / Lane descriptions (e.g. "CSHL_2_FC0042AGLLOO_1_1_742_502") are discarded.

The output sequence name is composed of two numbers: the first is the sequence's number, the second is the multiplicity value.

The following output::

    >2-4
    ATAT

means that the sequence "ATAT" is the second sequence in the file, and it appeared 4 times in the input FASTA file.


------

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

.. __: http://hannonlab.cshl.edu/fastx_toolkit/

  </help>
</tool>
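The naming scheme described in the help text (sequence number, then multiplicity) can be illustrated with a small Python sketch. This is not the fastx_collapser implementation: the function names `read_fasta`, `collapse`, and `parse_collapsed_name` are hypothetical, and the sketch numbers sequences by first appearance only so that it reproduces the help example. The real tool may order its output differently, as the commented-out test notes.

```python
from collections import Counter


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def collapse(records):
    """Collapse identical sequences into (number, multiplicity, sequence).

    Assumption: sequences are numbered in order of first appearance,
    which matches the help example but not necessarily the real tool.
    """
    # Counter keeps insertion order on Python 3.7+.
    counts = Counter(seq for _, seq in records)
    return [(i, n, seq) for i, (seq, n) in enumerate(counts.items(), start=1)]


def parse_collapsed_name(name):
    """Split a collapsed name such as '>2-4' into (number, multiplicity)."""
    number, multiplicity = name.lstrip(">").split("-")
    return int(number), int(multiplicity)


# The input from the help example above.
EXAMPLE = """\
>CSHL_2_FC0042AGLLOO_1_1_605_414
TGCG
>CSHL_2_FC0042AGLLOO_1_1_537_759
ATAT
>CSHL_2_FC0042AGLLOO_1_1_774_520
TGGC
>CSHL_2_FC0042AGLLOO_1_1_742_502
ATAT
>CSHL_2_FC0042AGLLOO_1_1_781_514
TGAG
>CSHL_2_FC0042AGLLOO_1_1_757_487
TTCA
>CSHL_2_FC0042AGLLOO_1_1_903_769
ATAT
>CSHL_2_FC0042AGLLOO_1_1_724_499
ATAT
"""

if __name__ == "__main__":
    for number, multiplicity, seq in collapse(read_fasta(EXAMPLE.splitlines())):
        print(f">{number}-{multiplicity}")
        print(seq)
```

Downstream scripts that need the read count back can call `parse_collapsed_name(">2-4")`, which returns the sequence number and the multiplicity as integers.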