<tool id="spotyping" name="Spoligotype Prediction" version="0.1.5">
    <description>fast and accurate in silico Mycobacterium spoligotyping from sequence reads</description>
    <requirements>
        <requirement type="package" version="2.1">spotyping</requirement>
    </requirements>
    <command detect_errors="aggressive"><![CDATA[
        SpoTyping.py
        $advanced.seq
        $advanced.swift
        --min=$advanced.min
        --rmin=$advanced.min_relax
        #if str( $data_input.data_selector ) == "paired"
            '$data_input.input1.forward' '$data_input.input1.reverse'
        #end if
        #if str( $data_input.data_selector ) == "single"
            '$data_input.input2'
        #end if
        && cp SITVIT_ONLINE.*.xls spotyping.xls
    ]]>
    </command>
    <inputs>
        <conditional name="data_input">
            <param name="data_selector" type="select" label="Single or Paired-end Data" help="Select whether the input is paired-end or single-end data">
                <option value="paired">Paired</option>
                <option value="single">Single</option>
            </param>
            <when value="paired">
                <param name="input1" format="data" type="data_collection" collection_type="paired" label="Select a paired collection" help="A paired collection of forward and reverse reads"/>
            </when>
            <when value="single">
                <param name="input2" format="data" type="data" label="Input" help="Specify dataset with single reads"/>
            </when>
        </conditional>
        <section name="advanced" title="Advanced options" expanded="false">
            <param type="boolean" argument="--seq" label="Input is assembled sequence" help="Input is either a complete genomic sequence or assembled contigs from an isolate" truevalue="--seq" falsevalue="" checked="false" />
            <param type="boolean" argument="--swift" label="Swift mode" checked="true" truevalue="--swift=on" falsevalue="--swift=off" />
            <param name="min" type="integer" value="5" label="MIN" help="Minimum number of error-free hits to support presence of a spacer" />
            <param name="min_relax" type="integer" value="6" label="MIN RELAX" help="Minimum number of 1-error-tolerant hits to support presence of a spacer" />
        </section>
    </inputs>
    <outputs>
        <data name="output1" label="spoligotyping results" format="txt" from_work_dir="SpoTyping"/>
        <data name="output2" label="spoligotyping log" format="txt" from_work_dir="SpoTyping.log"/>
        <data name="output3" label="query" format="excel.xls" from_work_dir="spotyping.xls"/>
    </outputs>
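    <!-- A minimal test sketch, not part of the original wrapper. It assumes a
         hypothetical single-end FASTQ file named reads.fastq has been placed in the
         tool's test-data directory, and that the result file reports the spoligotype
         as a 43-digit binary code. Running it also needs network access, because the
         command copies the SITVIT report. Adjust the file name and the assertion to
         match the real test data. -->
    <tests>
        <test expect_num_outputs="3">
            <conditional name="data_input">
                <param name="data_selector" value="single"/>
                <!-- reads.fastq is a placeholder name, not a file shipped with the tool -->
                <param name="input2" value="reads.fastq"/>
            </conditional>
            <section name="advanced">
                <param name="swift" value="true"/>
            </section>
            <output name="output1">
                <assert_contents>
                    <!-- assumed output shape: a line containing the 43-digit binary code -->
                    <has_text_matching expression="[01]{43}"/>
                </assert_contents>
            </output>
        </test>
    </tests>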
    <help><![CDATA[
This is a modified version of the IUC wrapper for SpoTyping that omits the concatenation and renaming of input files. The wrapper also runs properly when supplied with paired-end reads.

SpoTyping_ is software for predicting a spoligotype_ from sequencing reads, complete genomic sequences and assembled contigs.

**Input:**

- Fastq file - if paired end data is used, you may choose to concatenate paired reads into a single input (e.g. using the cat tool)
- Fasta file of a complete genomic sequence or assembled contigs of an isolate (with --seq option)

*Note on input size*: In swift mode the sampling threshold is reached at approximately 30x coverage when using
paired end sequencing of a *M. tuberculosis* genome.

**Output:**

The count of BLAST hits for each spacer sequence, and the predicted spoligotype in binary and octal code.

**Options:**

\--noQuery
    Avoid querying the SITVIT_ online service to describe the prevalence of the reported spoligotype.

\--seq
    Set this if input is a fasta file that contains only complete genomic sequence or assembled contigs from an isolate. [Default is off]

\-s SWIFT, --swift=SWIFT
    Swift mode, either "on" or "off" [Default: on] - swift mode samples 250 million bases to use for spoligotyping

\--sorted
    Set if input reads are sorted relative to positions on a reference genome. If reads are sorted and swift mode is used, swift mode's sampling is adjusted
    to sample reads across positions in the genome evenly.

\--filter
    Filter reads such that:

    1. Leading and trailing 'N's are removed.
    2. Any read with more than 3 'N's in the middle is removed.
    3. Any read with more than 7 consecutive identical bases is trimmed or filtered out, given
       the length of the flanking regions.

**Got weird spoligotype prediction?**

If sequencing throughput is very low (<40 Mbp, for example), SpoTyping may not be able to give an accurate prediction because of the relatively low read depth.

**Interpreting the spoligotype**

The binary or octal spoligotype can be used to look up lineage information using a service
like `TB Lineage`_.

**SITVIT reports**

Optionally, a report on the detected spoligotype can be retrieved from the SITVIT_ database. If such a report is requested, it can also be
illustrated as a (PDF format) plot.

.. _SpoTyping: https://github.com/xiaeryu/SpoTyping
.. _spoligotype: https://www.ncbi.nlm.nih.gov/pubmed/19521871
.. _TB Lineage: http://tbinsight.cs.rpi.edu/run_tb_lineage.html
.. _SITVIT: http://www.pasteur-guadeloupe.fr:8081/SITVIT_ONLINE/
    ]]></help>
    <citations>
        <citation type="bibtex">
@misc{githubSpoTyping,
  author = {Xia, Eryu},
  year = {2016},
  title = {SpoTyping},
  publisher = {GitHub},
  journal = {GitHub repository},
  url = {https://github.com/xiaeryu/SpoTyping},
}</citation>
        <citation type="doi">10.1186/s13073-016-0270-7</citation>
    </citations>
</tool>