comparison gene_family_aligner.xml @ 16:4a0837f2b995 draft

Uploaded
author greg
date Fri, 28 Apr 2017 09:20:56 -0400
parents 5a5f80ea6306
children 23e20d346539
15:5a5f80ea6306 16:4a0837f2b995
1 <tool id="plant_tribes_gene_family_aligner" name="GeneFamilyAligner" version="@WRAPPER_VERSION@.0"> 1 <tool id="plant_tribes_gene_family_aligner" name="GeneFamilyAligner" version="@WRAPPER_VERSION@.0">
2 <description>aligns gene family sequences</description> 2 <description>aligns integrated orthologous gene family clusters</description>
3 <macros> 3 <macros>
4 <import>macros.xml</import> 4 <import>macros.xml</import>
5 </macros> 5 </macros>
6 <expand macro="requirements_gene_family_aligner" /> 6 <expand macro="requirements_gene_family_aligner" />
7 <expand macro="stdio" /> 7 <expand macro="stdio" />
80 #end if 80 #end if
81 ]]> 81 ]]>
82 </command> 82 </command>
83 <inputs> 83 <inputs>
84 <conditional name="input_format_cond"> 84 <conditional name="input_format_cond">
85 <param name="input_format" type="select" label="Select type of data to sub sample"> 85 <param name="input_format" type="select" label="Classified orthogroup fasta files">
86 <option value="ptortho">Gene family clusters</option> 86 <option value="ptortho">Proteins orthogroup fasta files</option>
87 <option value="ptorthocs">Gene family clusters with corresponding coding sequences</option> 87 <option value="ptorthocs">Protein and coding sequences orthogroup fasta files</option>
88 </param> 88 </param>
89 <when value="ptortho"> 89 <when value="ptortho">
90 <param name="input_ptortho" format="ptortho" type="data" label="Gene family clusters"> 90 <param name="input_ptortho" format="ptortho" type="data" label="Proteins orthogroup fasta files">
91 <!-- <validator type="empty_files_path" /> --> 91 <!-- <validator type="empty_files_path" /> -->
92 </param> 92 </param>
93 <expand macro="cond_alignment_method" /> 93 <expand macro="cond_alignment_method" />
94 </when> 94 </when>
95 <when value="ptorthocs"> 95 <when value="ptorthocs">
96 <param name="input_ptorthocs" format="ptorthocs" type="data" label="Gene family clusters with corresponding coding sequences"> 96 <param name="input_ptorthocs" format="ptorthocs" type="data" label="Protein and coding sequences orthogroup fasta files">
97 <!-- <validator type="empty_files_path" /> --> 97 <!-- <validator type="empty_files_path" /> -->
98 </param> 98 </param>
99 <expand macro="cond_alignment_method" /> 99 <expand macro="cond_alignment_method" />
100 <expand macro="param_codon_alignments" /> 100 <expand macro="param_codon_alignments" />
101 </when> 101 </when>
128 </test> 128 </test>
129 --> 129 -->
130 </tests> 130 </tests>
131 <help> 131 <help>
132 This tool is one of the PlantTribes collection of automated modular analysis pipelines for comparative and evolutionary 132 This tool is one of the PlantTribes collection of automated modular analysis pipelines for comparative and evolutionary
133 analyses of genome-scale gene families and transcriptomes. This tool aligns gene family sequences. 133 analyses of genome-scale gene families and transcriptomes. This tool estimates protein and codon multiple sequence alignments
134 of integrated orthologous gene family fasta files produced by the GeneFamilyIntegrator tool.
134 135
135 ----- 136 -----
136 137
137 **Required options** 138 **Required options**
138 139
139 * **Select type of data to sub sample** 140 * **Classified orthogroup fasta files** - orthogroup fasta files produced by the GeneFamilyClassifier tool selected from your history. Depending on how the GeneFamilyClassifier tool was executed, these could either be proteins or proteins and their corresponding coding sequences.
140 141
141 - **Gene family clusters** - sequences classified into gene family clusters. 142 - **Proteins orthogroup fasta files** - proteins fasta files.
142 - **Gene family clusters with corresponding coding sequences** - sequences classified into gene family clusters including corresponding coding sequences. 143 - **Protein and coding sequences orthogroup fasta files** - proteins and their corresponding coding sequences fasta files.
143 144
144 - **Construct orthogroup multiple codon alignments** - construct orthogroup multiple codon alignments. 145 - **Construct orthogroup multiple codon alignments** - construct orthogroup multiple codon alignments.
145 146
146 * **Select method for multiple sequence alignments** 147 * **Multiple sequence alignment method** - method for estimating orthogroup multiple sequence alignments. PlantTribes estimates alignments using either MAFFT's L-INS-i algorithm or the divide and conquer approach implemented in the PASTA pipeline for large alignments.
147 148
148 - **MAFFT algorithm** - mafft algorithm. 149 - **MAFFT** - MAFFT algorithm.
149 - **Pasta algorithm** - pasta algorithm. 150 - **PASTA** - PASTA algorithm.
150 151
151 - **Maximum number of iterations that the PASTA algorithm will execute** - maximum number of iterations that the PASTA algorithm will execute. 152 - **PASTA iteration limit** - number of PASTA iterations. By default, PASTA performs 3 iterations.
152 153
153 **Other options** 154 **Other options**
154 155
155 * **Remove gappy sequences in alignments** 156 * **Alignment post-processing configuration** - select 'Yes' to enable multiple sequence alignment post-processing configuration options.
156 157
157 - **Select process used for gap trimming** - either nucleotide based or using trimAl's ML heuristic trimming approach 158 - **Trimming method** - multiple sequence alignment trimming method. PlantTribes trims alignments using two automated approaches implemented in trimAl. Gap score based trimming removes alignment sites that do not achieve a user specified gap score. For example, a setting of 0.1 removes sites that have gaps in 90% or more of the sequences in the multiple sequence alignment. The automated heuristic trimming approach determines the best automated trimAl method to trim a given alignment as described in the trimAl tutorial `trimAl`_.
159
160 .. _trimAl: http://trimal.cgenomics.org
158 161
159 - **Nucleotide based** 162 - **Nucleotide based**
160 163
161 - **Remove sites in alignments with gaps of** 164 - **Gap score** - 1.0 - (the fraction of sequences with gap allowed in an alignment site). The score is restricted to the range 0.0 - 1.0. Zero value has no effect.
162 - **Maximum number of iterations** - maximum number of iterations for iterative orthogroups realignment, trimming and filtering 165
166 - **Remove sequences** - select 'Yes' to remove sequences in multiple sequence alignments that do not achieve a user specified alignment coverage score. For example, a setting of 0.7 removes sequences with more than 30% gaps in the alignment. This option requires one of the trimming methods to be set.
167
168 - **Coverage score** - minimum fraction of sites without gaps for a sequence in a multiple sequence alignment. The score is restricted to the range 0.0 - 1.0. Zero value has no effect.
169
170 - **Realignment iteration limit** - number of iterations to perform trimming, removal of sequences, and realignment of orthogroup sequences. Zero value has no effect.
163 171
164 </help> 172 </help>
165 <citations> 173 <citations>
166 <expand macro="citation1" /> 174 <expand macro="citation1" />
167 <expand macro="citations2to4" /> 175 <citation type="bibtex">
176 @article{Wall2008,
177 journal = {Nucleic Acids Research},
178 author = {2. Wall PK, Leebens-Mack J, Muller KF, Field D, Altman NS},
179 title = {PlantTribes: a gene and gene family resource for comparative genomics in plants},
180 year = {2008},
181 volume = {36},
182 number = {suppl 1},
183 pages = {D970-D976},}
184 </citation>
185 <citation type="bibtex">
186 @article{Katoh2013,
187 journal = {Molecular biology and evolution},
188 author = {3. Katoh K, Standley DM},
189 title = {MAFFT multiple sequence alignment software version 7: improvements in performance and usability},
190 year = {2013},
191 volume = {30},
192 number = {4},
193 pages = {772-780},}
194 </citation>
195 <citation type="bibtex">
196 @article{Mirarab2014,
197 journal = {Research in Computational Molecular Biology (RECOMB)},
198 author = {4. Mirarab S, Nguyen N, Warnow T},
199 title = {PASTA: Ultra-Large Multiple Sequence Alignment. In R. Sharan (Ed.)},
200 year = {2014},
201 pages = {177–191},
202 url = {https://github.com/smirarab/pasta},}
203 </citation>
204 <citation type="bibtex">
205 @article{Capella-Gutierrez2009,
206 journal = {Bioinformatics},
207 author = {5. Capella-Gutierrez S, Silla-Martínez JM, Gabaldón T},
208 title = {trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses},
209 year = {2009},
210 volume = {25},
211 number = {15},
212 pages = {1972-1973},}
213 </citation>
168 </citations> 214 </citations>
169 </tool> 215 </tool>
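
Note on the gap score and coverage score described in the revised help text: the two thresholds can be illustrated with a small Python sketch. This is not trimAl's or PlantTribes' implementation. The function names `trim_sites` and `remove_sequences` are hypothetical, the alignment is a plain dict of equal-length gapped strings, and the handling of values exactly at the threshold is an assumption taken from the help wording ("gaps in 90% or more", "more than 30% gaps").

```python
def trim_sites(alignment, gap_score):
    """Drop alignment columns whose gap fraction is >= 1.0 - gap_score.

    Hypothetical helper: follows the help text, where a gap score of 0.1
    removes sites with gaps in 90% or more of the sequences.
    A gap score of 0 leaves the alignment unchanged.
    """
    if gap_score == 0 or not alignment:
        return dict(alignment)
    names = list(alignment)
    n_seqs = len(names)
    length = len(alignment[names[0]])
    max_gap_fraction = 1.0 - gap_score
    keep = [
        i for i in range(length)
        if sum(alignment[n][i] == "-" for n in names) / n_seqs < max_gap_fraction
    ]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


def remove_sequences(alignment, coverage_score):
    """Keep sequences whose fraction of non-gap sites is >= coverage_score.

    Hypothetical helper: a coverage score of 0.7 removes sequences with
    more than 30% gaps. A coverage score of 0 leaves the alignment unchanged.
    """
    if coverage_score == 0:
        return dict(alignment)
    kept = {}
    for name, seq in alignment.items():
        if not seq:
            continue  # an empty row has no coverage to measure
        coverage = 1.0 - seq.count("-") / len(seq)
        if coverage >= coverage_score:
            kept[name] = seq
    return kept


# Toy protein alignment (made-up data, for illustration only).
toy = {
    "seq_a": "MK-LV",
    "seq_b": "MK-LV",
    "seq_c": "M--L-",
    "seq_d": "MKALV",
}
trimmed = trim_sites(toy, gap_score=0.5)
filtered = remove_sequences(trimmed, coverage_score=0.7)
```

In the toy data, the third column has gaps in three of four sequences, so a gap score of 0.5 removes it. After trimming, `seq_c` has gaps in half of its remaining sites and is dropped by a coverage score of 0.7. The realignment iteration limit in the help text would repeat trimming, sequence removal, and realignment; that loop is omitted here because it needs an external aligner.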