comparison tools/mira4/mira4_mapping.xml @ 6:626d5cfd01aa draft

Uploaded v0.0.1 preview 6, support for fragment length (using mira4_validator.py)
author peterjc
date Mon, 21 Oct 2013 12:01:47 -0400
parents ffefb87bd414
children 902f01c1084b
5:ffefb87bd414 6:626d5cfd01aa
1 <tool id="mira_4_0_mapping" name="MIRA v4.0 mapping" version="0.0.1"> 1 <tool id="mira_4_0_mapping" name="MIRA v4.0 mapping" version="0.0.1">
2 <description>Maps Sanger, Roche 454, Solexa/Illumina, Ion Torrent and PacBio reads</description> 2 <description>Maps Sanger, Roche 454, Solexa/Illumina, Ion Torrent and PacBio reads</description>
3 <requirements> 3 <requirements>
4 <requirement type="python-module">Bio</requirement>
5 <requirement type="binary">mira</requirement> 4 <requirement type="binary">mira</requirement>
6 <requirement type="package" version="4.0">MIRA</requirement> 5 <requirement type="package" version="4.0">MIRA</requirement>
7 </requirements> 6 </requirements>
8 <version_command interpreter="python">mira4.py --version</version_command> 7 <version_command interpreter="python">mira4.py --version</version_command>
9 <command interpreter="python"> 8 <command interpreter="python">
36 <option value="iontor">Ion Torrent</option> 35 <option value="iontor">Ion Torrent</option>
37 <option value="pcbiolq">PacBio low quality (raw)</option> 36 <option value="pcbiolq">PacBio low quality (raw)</option>
38 <option value="pcbiohq">PacBio high quality (corrected)</option> 37 <option value="pcbiohq">PacBio high quality (corrected)</option>
39 <option value="text">Synthetic reads (database entries, consensus sequences, artificial reads, etc)</option> 38 <option value="text">Synthetic reads (database entries, consensus sequences, artificial reads, etc)</option>
40 </param> 39 </param>
41 <param name="segment_placement" type="select" label="Pairing type (segment placing)"> 40 <conditional name="segments">
42 <option value="">None (e.g. single end sequencing)</option> 41 <param name="type" type="select" label="Are these paired reads?">
43 <option value="FR">---&gt; &lt;--- (e.g. Sanger capillary or Solexa/Illumina paired-end library)</option> 42 <option value="paired">Paired reads</option>
44 <option value="RF">&lt;--- ---&gt; (e.g. Solexa/Illumina mate-pair library)</option> 43 <option value="none">Single reads or not relevant (e.g. primer walking with Sanger capillary sequencing)</option>
45 <option value="SB">2---&gt; 1---&gt; (e.g. Roche 454 paired-end libraries or IonTorrent long-mate; see note)</option> 44 </param>
46 <option value="?">Unknown or not relevant (e.g. primer walking with Sanger capillary sequencing)</option> 45 <when value="paired">
47 </param> 46 <param name="placement" type="select" label="Pairing type (segment placing)">
47 <option value="FR">---&gt; &lt;--- (e.g. Sanger capillary or Solexa/Illumina paired-end library)</option>
48 <option value="RF">&lt;--- ---&gt; (e.g. Solexa/Illumina mate-pair library)</option>
49 <option value="SB">2---&gt; 1---&gt; (e.g. Roche 454 paired-end libraries or IonTorrent long-mate; see note)</option>
50 </param>
51 <param name="naming" type="select" label="Pair naming convention">
52 <option value="solexa">Solexa/Illumina (using '/1' and '/2' suffixes)</option>
53 <option value="FR">Forward/Reverse scheme (using '.f*' and '.r*' suffixes)</option>
54 <option value="tigr">TIGR scheme (using 'TF*' and 'TR*' suffixes)</option>
55 <option value="sanger">Sanger scheme (see notes)</option>
56 <option value="stlouis">St. Louis scheme (see notes)</option>
57 </param>
58 </when>
59 <when value="none" /><!-- no further questions -->
60 </conditional>
48 <param name="filenames" type="data" format="fastq,mira" multiple="true" required="true" label="Read file(s)" 61 <param name="filenames" type="data" format="fastq,mira" multiple="true" required="true" label="Read file(s)"
49 help="Multiple files allowed, for example paired reads can be given as two files (MIRA looks at read names to identify pairs)." /> 62 help="Multiple files allowed, for example paired reads can be given as two files (MIRA looks at read names to identify pairs)." />
50 </repeat> 63 </repeat>
51 </inputs> 64 </inputs>
52 <outputs> 65 <outputs>
102 technology = ${rg.technology} 115 technology = ${rg.technology}
103 #if str($strain_setup)=="same" 116 #if str($strain_setup)=="same"
104 ##This is perhaps redundant as MIRA defaults to StrainX for the reads: 117 ##This is perhaps redundant as MIRA defaults to StrainX for the reads:
105 strain = StrainX 118 strain = StrainX
106 #end if 119 #end if
107 #if str($rg.segment_placement) != ""
108 ##Record the segment placement (if any) 120 ##Record the segment placement (if any)
109 segmentplacement = ${rg.segment_placement} 121 #if str($rg.segments.type) == "paired"
122 segmentplacement = ${rg.segments.placement}
123 segmentnaming = ${rg.segments.naming}
124 #end if
125 #if str($rg.segments.type) == "none"
126 segmentplacement = ?
110 #end if 127 #end if
111 ##MIRA will accept multiple filenames on one data line, or multiple data lines 128 ##MIRA will accept multiple filenames on one data line, or multiple data lines
112 #for $f in $rg.filenames 129 #for $f in $rg.filenames
113 ##Must now map Galaxy datatypes to MIRA file types... 130 ##Must now map Galaxy datatypes to MIRA file types...
114 #if $f.ext.startswith("fastq") 131 #if $f.ext.startswith("fastq")
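
Reviewer sketch (not part of the changeset): the segment handling that the new revision adds to the manifest template (new lines 120-127) can be approximated in plain Python as below. The function name render_segment_lines is hypothetical, and the value checks are an assumption based only on the option values in the new <conditional name="segments"> block; the real template does no such validation.

```python
# Hypothetical helper mirroring the Cheetah logic at new lines 120-127.
# The allowed values are copied from the <option> elements in the diff.
PLACEMENTS = ("FR", "RF", "SB")
NAMINGS = ("solexa", "FR", "tigr", "sanger", "stlouis")


def render_segment_lines(segment_type, placement=None, naming=None):
    """Return the manifest lines emitted for one read group."""
    if segment_type == "paired":
        # Assumption: reject values the tool form could never submit.
        if placement not in PLACEMENTS:
            raise ValueError("Unexpected placement: %r" % (placement,))
        if naming not in NAMINGS:
            raise ValueError("Unexpected naming: %r" % (naming,))
        return [
            "segmentplacement = %s" % placement,
            "segmentnaming = %s" % naming,
        ]
    if segment_type == "none":
        # Single reads: the template writes the '?' placement explicitly.
        return ["segmentplacement = ?"]
    raise ValueError("Unexpected segment type: %r" % (segment_type,))


if __name__ == "__main__":
    for line in render_segment_lines("paired", "FR", "solexa"):
        print(line)
```

Unlike the previous revision, which skipped the line when no placement was chosen, the single-read case now always produces "segmentplacement = ?".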
158 a range of platforms (Sanger capillary, Solexa/Illumina, Roche 454, Ion Torrent 175 a range of platforms (Sanger capillary, Solexa/Illumina, Roche 454, Ion Torrent
159 and also PacBio). 176 and also PacBio).
160 177
161 It is particularly suited to small genomes such as bacteria. 178 It is particularly suited to small genomes such as bacteria.
162 179
163 **Notes** 180
181 **Notes on paired reads**
164 182
165 .. class:: warningmark 183 .. class:: warningmark
166 184
167 Note that the raw data for Roche 454 and Ion Torrent paired-end libraries 185 MIRA uses read naming conventions to identify paired read partners
168 sequences a circularised fragment such that the raw data starts with the 186 (and does not care about their order in the input files). In most cases,
169 end of the fragment, a linker, then the start of the fragment. This means 187 the Solexa/Illumina setting is fine. For Sanger capillary sequencing,
170 both the start and end are sequenced from the same strand, and thus should 188 you may need to rename your reads to match one of the standard conventions
171 be given to MIRA as orientation "2---&gt; 1---&gt;". However, in order to 189 supported by MIRA. For Roche 454 or Ion Torrent the appropriate settings
172 use this data with traditional tools expecting Sanger capillary style 190 depend on how the FASTQ file was produced:
173 libraries which expect "---&gt; &lt;---" your FASTQ files may have been 191
174 pre-processed to mimic this by reverse complementing one of the pair. 192 * If using Roche's ``sffinfo`` or older versions of ``sff_extract``
193 to convert SFF files to FASTQ, your reads will probably have the
194 ``---&gt; &lt;---`` orientation and use the ``.f`` and ``.r``
195 suffixes (FR naming).
196
197 * If using a recent version of ``sff_extract``, then the ``/1`` and ``/2``
198 suffixes are used (Solexa/Illumina style naming) and the original
199 ``2---&gt; 1---&gt;`` orientation is preserved.
200
201 The reason for this is that Roche 454 and Ion Torrent paired-end libraries
202 sequence a circularised fragment, so the raw data begins with the end of
203 the fragment, then a linker, then the start of the fragment.
204 This means both the start and end are sequenced from the same strand, and
205 have the orientation ``2---&gt; 1---&gt;``. However, in order to use the data
206 with traditional tools expecting the Sanger capillary style ``---&gt; &lt;---``
207 orientation, it was common to reverse complement one of the pair to mimic this.
208
175 209
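
Reviewer sketch (not part of the changeset): since the new help text asks users to pick a naming convention, a quick check of the read names in a FASTQ file may help them choose. The Python below is a minimal illustration that covers only the two conventions the help text discusses for Roche 454 and Ion Torrent data. The function name guess_pair_naming is hypothetical, and the regular expressions are an assumed reading of the '/1', '/2' and '.f*', '.r*' suffix descriptions; they are not MIRA's own matching rules.

```python
import re

# Assumed patterns: '/1' or '/2' at the end of the name (Solexa/Illumina),
# or '.f' / '.r' followed by optional extra characters (FR scheme).
_SOLEXA = re.compile(r"/[12]$")
_FR = re.compile(r"\.[fr][^./]*$")


def guess_pair_naming(read_name):
    """Guess the pair naming option for a read name, or None if unclear."""
    if _SOLEXA.search(read_name):
        return "solexa"
    if _FR.search(read_name):
        return "FR"
    return None


if __name__ == "__main__":
    for name in ("read42/1", "read42.f", "read42.r1", "read42"):
        print(name, guess_pair_naming(name))
```

A None result suggests the reads are unpaired or follow another scheme (TIGR, Sanger or St. Louis), in which case the MIRA documentation should be consulted.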
176 **Citation** 210 **Citation**
177 211
178 If you use this Galaxy tool in work leading to a scientific publication please 212 If you use this Galaxy tool in work leading to a scientific publication please
179 cite the following papers: 213 cite the following papers: