comparison kaks_analysis.xml @ 34:8ff9aac5419f draft

Uploaded
author greg
date Fri, 05 May 2017 09:56:35 -0400
parents 9821735dccc4
children 5c246135e07d
comparison
33:c7a46686427a 34:8ff9aac5419f
200 from an external source. Optionally, the resulting set of estimated Ks values can be clustered into components using a mixture 200 from an external source. Optionally, the resulting set of estimated Ks values can be clustered into components using a mixture
201 of multivariate normal distributions to identify significant duplication event(s) in a species or a pair of species. 201 of multivariate normal distributions to identify significant duplication event(s) in a species or a pair of species.
202 202
203 ----- 203 -----
204 204
205 **Options** 205 * **Required options**
206
207 * **Required**
208 206
209 - **Coding sequences for the first species** - coding sequence fasta file for the first species either produced by the AssemblyPostProcessor tool or from an external source selected from your history. 207 - **Coding sequences for the first species** - coding sequence fasta file for the first species either produced by the AssemblyPostProcessor tool or from an external source selected from your history.
210 - **Protein sequences for the first species** - corresponding protein sequence fasta files for the first species either produced by the AssemblyPostProcessor tool or from an external source selected from your history. 208 - **Protein sequences for the first species** - corresponding protein sequence fasta files for the first species either produced by the AssemblyPostProcessor tool or from an external source selected from your history.
211 - **Type of sequence comparison** - pairwise sequence comparison to determine homolgous pairs. This can be either paralogous for self-species comparison or orthologous for cross-species comparison. Cross-species comparision requires input for the second species. 209 - **Type of sequence comparison** - pairwise sequence comparison to determine homologous pairs. This can be either paralogous for self-species comparison or orthologous for cross-species comparison. Cross-species comparison requires input for the second species.
212 210
213 * **Optional** 211 * **Other options**
214 212
215 - **Coding sequences for the second species** - coding sequence fasta file for the second species either produced by the AssemblyPostProcessor tool or from an external source selected from your history. This option is required only for orthologous comparison. 213 - **Coding sequences for the second species** - coding sequence fasta file for the second species either produced by the AssemblyPostProcessor tool or from an external source selected from your history. This option is required only for orthologous comparison.
216 - **Protein sequences for the second species** - corresponding protein sequence fasta files for the second species either produced by the AssemblyPostProcessor tool or from an external source selected from your history. This option is required only for orthologous comparison. 214 - **Protein sequences for the second species** - corresponding protein sequence fasta files for the second species either produced by the AssemblyPostProcessor tool or from an external source selected from your history. This option is required only for orthologous comparison.
217 - **Alignment coverage configuration** - select 'Yes' to set the minimum allowable alignment coverage length between homologous pairs. PlantTribes uses global codon alignment match score to determine the pairwise alignment coverage. By default, the match score is set to 0.5 if 'No' is selected. 215 - **Alignment coverage configuration** - select 'Yes' to set the minimum allowable alignment coverage length between homologous pairs. PlantTribes uses global codon alignment match score to determine the pairwise alignment coverage. By default, the match score is set to 0.5 if 'No' is selected.
218 216
225 - **PAML codeml configuration** - select 'Yes' to enable selection of a PAML codeml control file to carry out maximum likelihood analysis of protein-coding DNA sequences using codon substitution models. Template file "codeml.ctl.args" can be found in the scaffold data installed into Galaxy via the PlantTribes Scaffolds Download Data Manager tool, and are also available at the PlantTribes GitHub `repository`_. Default settings shown in the template are used if 'No' is selected. 223 - **PAML codeml configuration** - select 'Yes' to enable selection of a PAML codeml control file to carry out maximum likelihood analysis of protein-coding DNA sequences using codon substitution models. Template file "codeml.ctl.args" can be found in the scaffold data installed into Galaxy via the PlantTribes Scaffolds Download Data Manager tool, and are also available at the PlantTribes GitHub `repository`_. Default settings shown in the template are used if 'No' is selected.
226 - **Rates clustering configuration** - select 'Yes' to estimate clusters of synonymous substitution rates using a mixture of multivariate normal distributions which represent putative duplication event(s). 224 - **Rates clustering configuration** - select 'Yes' to estimate clusters of synonymous substitution rates using a mixture of multivariate normal distributions which represent putative duplication event(s).
227 225
228 - **Number of components** - number of components to include in the normal mixture model. 226 - **Number of components** - number of components to include in the normal mixture model.
229 227
230 - **Lower limit synonymous subsitution rates configuration** - select 'Yes' to set the minimum allowable synonymous substitution rate to use in the normal mixtures cluster analysis to exclude young paralogs that arise from normal gene births and deaths in a genome. 228 - **Lower limit synonymous substitution rates configuration** - select 'Yes' to set the minimum allowable synonymous substitution rate to use in the normal mixtures cluster analysis to exclude young paralogs that arise from normal gene births and deaths in a genome.
231 229
232 - **Minimum rate** - minimum allowable synonymous substitution rate. 230 - **Minimum rate** - minimum allowable synonymous substitution rate.
233 231
234 - **Upper limit synonymous subsitution rates configuration** - select 'Yes' to set the maximum allowable synonymous substitution rate to use in the normal mixtures cluster analysis to exclude likely ancient paralogs in a genome. 232 - **Upper limit synonymous substitution rates configuration** - select 'Yes' to set the maximum allowable synonymous substitution rate to use in the normal mixtures cluster analysis to exclude likely ancient paralogs in a genome.
235 233
236 - **Maximum rate** - maximum allowable synonymous substitution rate. 234 - **Maximum rate** - maximum allowable synonymous substitution rate.
237 235
238 .. _repository: https://github.com/dePamphilis/PlantTribes/blob/master/config/codeml.ctl.args 236 .. _repository: https://github.com/dePamphilis/PlantTribes/blob/master/config/codeml.ctl.args
239 237
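The lower and upper limit options described in the help text above restrict which synonymous substitution rates (Ks) reach the normal mixtures cluster analysis. The following is a minimal sketch of that filtering step only, not the tool's actual code: the function name `filter_ks`, its parameters, the sample values, and the choice of inclusive bounds are all assumptions made for illustration.

```python
def filter_ks(ks_values, min_rate=None, max_rate=None):
    """Keep Ks values inside an optional [min_rate, max_rate] window.

    Hypothetical helper: bounds are treated as inclusive here, which is
    an assumption and may differ from what the tool does.
    """
    kept = []
    for ks in ks_values:
        # Lower limit: drop very low rates (young paralogs from ordinary
        # gene births and deaths).
        if min_rate is not None and ks < min_rate:
            continue
        # Upper limit: drop very high rates (likely ancient paralogs).
        if max_rate is not None and ks > max_rate:
            continue
        kept.append(ks)
    return kept


# Made-up Ks estimates, for illustration only.
ks = [0.01, 0.05, 0.42, 0.77, 1.30, 2.90, 4.75]
filtered = filter_ks(ks, min_rate=0.1, max_rate=2.0)
print(filtered)
```

Leaving either limit as `None` mirrors selecting 'No' for the corresponding configuration option, so no bound is applied on that side. Only the surviving values would then be passed to the mixture model with the chosen number of components.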