annotate segemehl.xml @ 2:dc63d59e7bf8 draft

Uploaded
author bgruening
date Sat, 18 Jan 2014 05:43:29 -0500
parents 468b59eae694
children e1d38fef6dd5
rev   line source
1
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
1 <tool id="segemehl" name="segemehl" version="0.1.6.0">
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
2 <description>based short read aligner</description>
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
3 <requirements>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
4 <requirement type="package" version="0.1.6">segemehl</requirement>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
5 </requirements>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
6 <command>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
7 ## prepare segemehl index if no reference genome is supplied
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
8 temp_index=`mktemp`;
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
9 #if $refGenomeSource.genomeSource == "history":
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
10 segemehl.x -x \$temp_index -d $refGenomeSource.own_reference_genome;
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
11 #else:
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
12 temp_index=${refGenomeSource.index.fields.index_path};
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
13 #end if
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
14
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
15
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
16 ## execute segemehl
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
17 segemehl.x
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
18
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
19 ## number of threads
1
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
20 -t "\${GALAXY_SLOTS:-12}"
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
21
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
22 ## db file path
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
23 -d ${refGenomeSource.index.fields.db_path}
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
24
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
25 -i \$temp_index
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
26
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
27 ## check for single/pair-end
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
28 #if str( $library.type ) == "single":
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
29 #set $query_list = list()
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
30 ## prepare inputs
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
31 #for $fastq in $library.reads:
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
32 $query_list.append('%s' %($fastq.input_query))
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
33 #end for
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
34 -q "#echo ' '.join( $query_list )#"
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
35 #else
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
36 ## prepare inputs
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
37
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
38 #set $mate1 = list()
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
39 #set $mate2 = list()
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
40 #for $mate_pair in $library.mate_list:
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
41 $mate1.append( str($mate_pair.first_strand_query) )
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
42 $mate2.append( str($mate_pair.second_strand_query) )
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
43 #end for
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
44
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
45 -q #echo ','.join($mate1)
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
46 -p #echo ','.join($mate2)
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
47
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
48 -I $library.maxinsertsize
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
49 #end if
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
50 -m $minsize
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
51 -A $accuracy
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
52 -H $hitstrategy
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
53 #if str( $prime5 ).strip():
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
54 -P $prime5
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
55 #end if
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
56 #if str( $prime3 ).strip():
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
57 -Q $prime3
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
58 #end if
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
59 $polyA
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
60 $autoclip
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
61 $hardclip
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
62 $order
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
63 -s
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
64 -o $segemehl_out
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
65 </command>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
66 <stdio>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
67 <regex match="Exit forced"
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
68 source="both"
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
69 level="fatal"
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
70 description="Execution halted." />
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
71 </stdio>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
72 <inputs>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
73
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
74 <conditional name="refGenomeSource">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
75 <param name="genomeSource" type="select" label="Will you select a reference genome from your history or use a built-in index?" help="Built-ins were indexed using default options">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
76 <option value="indexed">Use a built-in index</option>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
77 <option value="history">Use one from the history</option>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
78 </param>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
79 <when value="indexed">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
80 <param name="index" type="select" label="Select a reference genome" help="If your genome of interest is not listed, contact your Galaxy admin">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
81 <options from_data_table="segemehl_indexes">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
82 <column name="value" index="0"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
83 <column name="dbkey" index="1"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
84 <column name="name" index="2"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
85 <column name="db_path" index="3"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
86 <column name="index_path" index="4"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
87 <filter type="sort_by" column="2"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
88 <validator type="no_options" message="No indexes are available for the selected input dataset"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
89 </options>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
90 </param>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
91 </when> <!-- built-in -->
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
92 <when value="history">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
93 <param name="own_reference_genome" type="data" format="fasta" metadata_name="dbkey" label="Select the reference genome" />
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
94 </when> <!-- history -->
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
95 </conditional> <!-- refGenomeSource -->
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
96
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
97
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
98 <conditional name="library">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
99 <param name="type" type="select" label="Is this library paired-end?">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
100 <option value="single">Single-end</option>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
101 <option value="paired">Paired-end</option>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
102 </param>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
103 <when value="single">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
104 <repeat name="reads" title="FASTQ/FASTA files">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
105 <param name="input_query" type="data" format="fastqsanger,fastqillumina,fastq,fasta" label="Reads fasta/fastq file" />
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
106 </repeat>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
107 </when>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
108 <when value="paired">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
109 <repeat name="mate_list" title="Paired End Pairs" min="1">
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
110 <param name="first_strand_query" type="data" format="fastqsanger,fastqillumina,fastq,fasta" label="Reads from first strand" />
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
111 <param name="second_strand_query" type="data" format="fastqsanger,fastqillumina,fastq,fasta" label="Reads from second strand" />
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
112 </repeat>
2
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
113 <param name="maxinsertsize" type="integer" value="5000" label="Maximum size of the inserts (paired end)" help="default: 5000 (-I)" />
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
114 </when>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
115 </conditional>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
116
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
117
2
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
118 <param name="minsize" type="integer" value="12" size="5" label="Minimum size of queries" help="default: 12 (-m)">
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
119 <validator type="in_range" min="1"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
120 </param>
2
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
121 <param name="accuracy" type="integer" value="85" size="5" label="Min percentage of matches per read in semi-global alignment" help="default: 85 (-A)" >
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
122 <validator type="in_range" min="1" max="100"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
123 </param>
2
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
124 <param name="hitstrategy" type="select" label="Hits to report?" help="(-H)">
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
125 <option value="1">report only best scoring hits</option>
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
126 <option value="0">report all scoring hits</option>
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
127 </param>
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
128 <param name="prime5" type="text" size="80" label="add 5' adapter" help="default: none (-P)" />
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
129 <param name="prime3" type="text" size="80" label="add 3' adapter" help="default: none (-Q)"/>
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
130 <param name="polyA" type="boolean" truevalue="--polyA" falsevalue="" checked="false" label="Clip polyA tail" help="(-T)"/>
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
131 <param name="autoclip" type="boolean" truevalue="--autoclip" falsevalue="" checked="false" label="Autoclip unknown 3prime adapter" help="(-Y)"/>
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
132 <param name="hardclip" type="boolean" truevalue="--hardclip" falsevalue="" checked="false" label="Enable hard clipping" help="(-C)"/>
dc63d59e7bf8 Uploaded
bgruening
parents: 1
diff changeset
133 <param name="order" type="boolean" truevalue="--order" falsevalue="" checked="false" label="Sort the output by chromosome and position" help="(-O)"/>
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
134 </inputs>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
135
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
136 <outputs>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
137 <data format="sam" name="segemehl_out" label="Read alignments on ${on_string}"/>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
138 </outputs>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
139 <help>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
140
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
141 .. class:: infomark
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
142
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
143 **What it does**
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
144
1
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
145 Segemehl_ is a short read mapper with gaps.
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
146
1
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
147 Segemehl_ is software for mapping short sequencer reads to reference genomes.
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
148 Unlike other methods, segemehl is able to detect not only mismatches but also insertions and deletions.
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
149 Furthermore, segemehl is not limited to a specific read length and is able to map primer- or polyadenylation-contaminated reads correctly.
1
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
150 segemehl implements a matching strategy based on enhanced suffix arrays (ESA). Segemehl_ allows bisulfite sequencing mapping and split read mapping.
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
151
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
152 .. _Segemehl: http://www.bioinf.uni-leipzig.de/Software/segemehl/
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
153
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
154 **References**
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
155
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
156 Hoffmann S, Otto C, Kurtz S, Sharma CM, Khaitovich P, Vogel J, Stadler PF, Hackermueller J: "Fast mapping of short sequences with mismatches, insertions and deletions using index structures", PLoS Comput Biol (2009) vol. 5 (9) pp. e1000502
468b59eae694 Uploaded
bgruening
parents: 0
diff changeset
157 Latest version: 0.1.6. New: faster multiple split read mapping. Bug fixes: increased sensitivity for strand switches. Changes: default accuracy is now 90%; older segemehl indices are still usable. Known issues: untraceable errors with gcc-4.5; zlib linker problems with some Ubuntu versions.
0
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
158
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
159 </help>
94926c35b6f3 intial uploaded
rnateam
parents:
diff changeset
160 </tool>
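
The single-end and paired-end branches of the command template (source lines 27 to 49) can be hard to follow in Cheetah syntax. The following is a minimal Python sketch of the same logic, not part of the tool itself. The function name `build_read_args` and its parameters are hypothetical and chosen only for illustration; the flags (-q, -p, -I) and the separators are those used in the template above.

```python
def build_read_args(library_type, reads=(), mate_pairs=(), max_insert_size=5000):
    """Return the read-related segemehl arguments as a list of strings.

    Hypothetical helper mirroring the Cheetah template; not part of Galaxy
    or segemehl.
    """
    if library_type == "single":
        # The template joins all single-end files with spaces and passes
        # them as one quoted value to -q.
        return ["-q", " ".join(reads)]

    # Paired-end: first mates go to -q and second mates to -p, each list
    # joined with commas, followed by the maximum insert size (-I).
    mate1 = [first for first, _ in mate_pairs]
    mate2 = [second for _, second in mate_pairs]
    return [
        "-q", ",".join(mate1),
        "-p", ",".join(mate2),
        "-I", str(max_insert_size),
    ]
```

For example, `build_read_args("paired", mate_pairs=[("a_1.fq", "a_2.fq")])` yields the -q, -p and -I arguments for one mate pair, with the insert size taken from the default of 5000 that the tool form also uses.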