comparison ordinate.xml @ 0:918c4abfacf4 draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/ampvis2 commit 9ed0c3078be166bd22136771f517ae91a5198ecf
author iuc
date Fri, 16 Aug 2024 08:52:52 +0000
1 <tool id="ampvis2_ordinate" name="ampvis2 ordination plot" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@" license="MIT">
2 <description></description>
3 <macros>
4 <import>macros.xml</import>
5 <xml name="distmeasure_macro">
6 <param argument="distmeasure" type="select" label="Distance measure">
7 <option value="wunifrac">Weighted UniFrac. Requires a rooted phylogenetic tree.</option>
8 <option value="unifrac">Unweighted UniFrac. Requires a phylogenetic tree.</option>
9 <option value="jsd">Jensen-Shannon Divergence</option>
10 <option value="manhattan">manhattan</option>
11 <option value="euclidean">euclidean</option>
12 <option value="canberra">canberra</option>
13 <option value="bray" selected="true">bray</option>
14 <option value="kulczynski">kulczynski</option>
15 <option value="jaccard">jaccard</option>
16 <option value="gower">gower</option>
17 <option value="altGower">altGower</option>
18 <option value="morisita">morisita</option>
19 <option value="horn">horn</option>
20 <option value="mountford">mountford</option>
21 <option value="raup">raup</option>
22 <option value="binomial">binomial</option>
23 <option value="chao">chao</option>
24 <option value="cao">cao</option>
25 <option value="mahalanobis">mahalanobis</option>
26 <option value="clark">Clark</option>
27 <option value="chisq">Chi-square</option>
28 <option value="chord">Chord</option>
29 <option value="hellinger">Hellinger</option>
30 <option value="aitchison">Aitchison</option>
31 <option value="robust.aitchison">Robust Aitchison</option>
32 </param>
33 </xml>
34 <!-- default is most often hellinger, but for N/MMDS none is suggested -->
35 <xml name="transform_macro">
36 <param argument="transform" type="select" label="Transforms the abundances before ordination" help="See details in decostand. The Hellinger transformation is a good choice for PCA/RDA because it produces a more ecologically meaningful result (read about the double-zero problem in Numerical Ecology). With CA/CCA, the Hellinger transformation helps reduce the impact of low-abundance species. For nMDS or PCoA (a.k.a. mMDS), an additional data transformation is not recommended because it obscures the chosen distance measure.">
37 <option value="none">No transformation</option>
38 <option value="total">divide by margin total (total)</option>
39 <option value="max">divide by margin maximum (max)</option>
40 <option value="freq">divide by margin total and multiply by the number of non-zero items, so that the average of non-zero entries is one (freq)</option>
41 <option value="normalize">make margin sum of squares equal to one (normalize)</option>
42 <option value="range">standardize values into range 0:1. If all values are constant, they will be transformed to 0. (range)</option>
43 <option value="standardize">scale x to zero mean and unit variance (standardize)</option>
44 <option value="pa">to presence/absence scale 0/1: absence/presence (pa)</option>
45 <option value="chi.square">divide by row sums and square root of column sums, and adjust for square root of matrix total (chi.square)</option>
46 <option value="hellinger">square root of method = "total" (hellinger)</option>
47 <option value="log">logarithmic transformation (log)</option>
48 <option value="sqrt">square root (sqrt)</option>
49 </param>
50 </xml>
51 <xml name="transform_macro_hellinger">
52 <expand macro="transform_macro">
53 <option value="hellinger" selected="true">square root of method = "total" (hellinger)</option>
54 </expand>
55 </xml>
56 <xml name="constrain_macro">
57 <expand macro="metadata_select_discrete" argument="constrain" optional="false" multiple="true" label="Constrain analysis by" help="Variable(s) in the metadata for constrained analyses. Multiple variables can be provided, but keep in mind that the more variables are selected, the more the result will resemble an unconstrained analysis."/>
58 </xml>
59 </macros>
60 <expand macro="header"/>
61 <command detect_errors="exit_code"><![CDATA[
62 Rscript '$rscript'
63 ]]></command>
64 <configfiles>
65 <configfile name="rscript"><![CDATA[
66 #if $type_cond.type in ['RDA', 'CCA'] and $type_cond.constrain
67 #set constrain_list='c("' + '", "'.join(str($type_cond.constrain).split(",")) + '")'
68 #else
69 #set constrain_list='NULL'
70 #end if
71 library(ampvis2, quietly = TRUE)
72 data <- readRDS("$data")
73 details <- amp_ordinate(
74 data,
75 filter_species = $filter_species,
76 type = "$type_cond.type",
77 #if $type_cond.type in ['MMDS', 'NMDS']
78 distmeasure = "$type_cond.distmeasure",
79 #end if
80 transform = "$type_cond.transform",
81 #if $type_cond.type in ['RDA', 'CCA']
82 constrain = $constrain_list,
83 #end if
84 ## x_axis = 1,
85 ## y_axis = 2,
86 print_caption = $print_caption,
87 #if $sample_color_by
88 sample_color_by = "$sample_color_by",
89 #end if
90 ## sample_color_order = NULL,
91 #if $sample_shape_by
92 sample_shape_by = "$sample_shape_by",
93 #end if
94 #if $sample_colorframe
95 sample_colorframe = "$sample_colorframe",
96 #end if
97 #if $sample_colorframe_label
98 sample_colorframe_label = "$sample_colorframe_label",
99 #end if
100 ## sample_colorframe_label_size = 3,
101 #if $sample_label_by
102 sample_label_by = "$sample_label_by",
103 #end if
104 ## sample_label_size = 4,
105 ## sample_label_segment_color = "black",
106 ## sample_point_size = 2,
107 #if $sample_trajectory
108 sample_trajectory = "$sample_trajectory",
109 #if $sample_trajectory_group
110 sample_trajectory_group = "$sample_trajectory_group",
111 #end if
112 #end if
113 ## sample_plotly = NULL,
114 #if $species_plot_cond.species_plot == "TRUE"
115 species_plot = $species_plot_cond.species_plot,
116 species_nlabels = $species_plot_cond.species_nlabels,
117 species_label_taxonomy = "$species_plot_cond.species_label_taxonomy",
118 species_label_size = $species_plot_cond.species_label_size,
119 ## species_label_color = "grey10",
120 ## species_rescale = FALSE,
121 ## species_point_size = 2,
122 ## species_shape = 20,
123 ## species_plotly = FALSE,
124 #end if
125 #if $envfit_factor
126 envfit_factor = "$envfit_factor",
127 #end if
128 #if $envfit_numeric
129 envfit_numeric = "$envfit_numeric",
130 #end if
131 envfit_signif_level = $envfit_signif_level,
132 ## envfit_textsize = 3,
133 ## envfit_textcolor = "darkred",
134 ## envfit_numeric_arrows_scale = 1,
135 ## envfit_arrowcolor = "darkred",
136 ## envfit_show = TRUE,
137 repel_labels = $repel_labels,
138 opacity = $opacity,
139 tax_empty = "$tax_empty",
140 detailed_output = TRUE,
141 ## num_threads = 1L,
142 )
143 plot <- details\$plot
144 @OUTPUT_TOKEN@
145 ggsave("$screeplot", print(details\$screeplot), device="$output_options.out_format")
146 ## TODO output more from model
147 ## $model
148 ## $dsites
149 ## $dspecies
150 ## $evf_factor_model
151 ## $evf_numeric_model
152 ## #else
153 ## write.table(raw, file = "$plot_raw", sep = "\t")
154 ## #end if
155 ]]></configfile>
156 </configfiles>
157 <inputs>
158 <expand macro="rds_metadata_input_macro"/>
159 <param argument="filter_species" type="float" value="0.1" min="0" optional="true" label="Abundance threshold" help="Remove OTUs whose abundance is below this threshold (in percent) across all samples. Setting this to 0 may drastically increase computation time."/>
160 <conditional name="type_cond">
161 <param argument="type" type="select" label="Ordination method" help="Note that PCoA is not performed by the vegan package, but by the pcoa function from the ape package.">
162 <option value="PCA">(PCA) Principal Components Analysis</option>
163 <option value="RDA">(RDA) Redundancy Analysis (considered the constrained version of PCA)</option>
164 <option value="CA">(CA) Correspondence Analysis</option>
165 <option value="CCA">(CCA) Canonical Correspondence Analysis (considered the constrained version of CA)</option>
166 <option value="DCA">(DCA) Detrended Correspondence Analysis</option>
167 <option value="NMDS">(NMDS) non-metric Multidimensional Scaling</option>
168 <option value="MMDS">(MMDS) metric Multidimensional Scaling, a.k.a. Principal Coordinates Analysis (not to be confused with PCA)</option>
169 </param>
170 <when value="PCA">
171 <expand macro="transform_macro_hellinger"/>
172 </when>
173 <when value="RDA">
174 <expand macro="transform_macro_hellinger"/>
175 <expand macro="constrain_macro"/>
176 </when>
177 <when value="CA">
178 <expand macro="transform_macro_hellinger"/>
179 </when>
180 <when value="CCA">
181 <expand macro="transform_macro_hellinger"/>
182 <expand macro="constrain_macro"/>
183 </when>
184 <when value="DCA">
185 <expand macro="transform_macro_hellinger"/>
186 </when>
187 <when value="NMDS">
188 <expand macro="distmeasure_macro"/>
189 <expand macro="transform_macro">
190 <option value="none" selected="true">No transformation</option>
191 </expand>
192 </when>
193 <when value="MMDS">
194 <expand macro="distmeasure_macro"/>
195 <expand macro="transform_macro">
196 <option value="none" selected="true">No transformation</option>
197 </expand>
198 </when>
199 </conditional>
200 <param argument="print_caption" type="boolean" truevalue="TRUE" falsevalue="FALSE" label="Auto-generate a figure caption" help="Based on the used arguments. The caption includes a description of how the result has been generated as well as references for the methods used."/>
201 <expand macro="metadata_select" argument="sample_color_by" label="Color sample points by"/>
202 <expand macro="metadata_select_discrete" argument="sample_shape_by" label="Shape sample points by"/>
203 <expand macro="metadata_select_discrete" argument="sample_colorframe" label="Frame the sample points with a polygon by" help="Split by the variable defined by sample_color_by"/>
204 <expand macro="metadata_select" argument="sample_colorframe_label" label="Label Frame by"/>
205 <expand macro="metadata_select" argument="sample_label_by" label="Label sample points by"/>
206 <expand macro="metadata_select" argument="sample_trajectory" label="Make a trajectory between sample points by"/>
207 <expand macro="metadata_select" argument="sample_trajectory_group" label="Make a trajectory between sample points by the sample_trajectory argument, but within individual groups."/>
208
209 <conditional name="species_plot_cond">
210 <param argument="species_plot" type="select" label="Plot species points">
211 <option value="TRUE">Yes</option>
212 <option value="FALSE" selected="true">No</option>
213 </param>
214 <when value="TRUE">
215 <param argument="species_nlabels" type="integer" value="10" min="1" label="Number of the most extreme species labels to plot" help="Ordered by the sum of the numerical values of the x,y coordinates. Only makes sense with PCA/RDA."/>
216 <expand macro="taxlevel_macro" argument="species_label_taxonomy" label="Taxonomic level by which to label the species points">
217 <option value="Genus" selected="true">Genus</option>
218 </expand>
219 <param argument="species_label_size" type="integer" value="3" min="1" label="Size of the species text labels"/>
220 </when>
221 <when value="FALSE"/>
222 </conditional>
223
224 <expand macro="metadata_select_discrete" argument="envfit_factor" label="Categorical variables to fit onto the ordination plot"/>
225 <expand macro="metadata_select_numeric" argument="envfit_numeric" label="Numerical variables to fit arrows onto the ordination plot" help="The lengths of the arrows are scaled by significance."/>
226 <param argument="envfit_signif_level" type="float" value="0.005" min="0" max="1" label="Significance threshold" help="For displaying the fitting results"/>
227
228 <param argument="repel_labels" type="boolean" truevalue="TRUE" falsevalue="FALSE" label="Repel all labels" help="To prevent cluttering of the plot"/>
229 <param argument="opacity" type="float" value="0.8" min="0" max="1" label="Opacity of plotted points and colorframe" help="0: invisible, 1: opaque."/>
230 <expand macro="tax_empty_macro"/>
231 <expand macro="out_format_macro"/>
232 <param name="output_screeplot" type="boolean" checked="false" label="Output screeplot" help="Plot of the variances against the number of the principal component."/>
233 </inputs>
234 <outputs>
235 <expand macro="out_macro"/>
236 <expand macro="out_macro" name="screeplot" label=": screeplot">
237 <filter>output_screeplot</filter>
238 </expand>
239 </outputs>
240 <tests>
241 <!-- defaults (PCA) -->
242 <test expect_num_outputs="2">
243 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
244 <param name="output_screeplot" value="true"/>
245 <output name="plot" value="AalborgWWTPs-ordinate.pdf" ftype="pdf" compare="sim_size"/>
246 <output name="screeplot" value="AalborgWWTPs-ordinate-screeplot.pdf" ftype="pdf" compare="sim_size"/>
247 </test>
248 <!-- RDA + caption -->
249 <test expect_num_outputs="1">
250 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
251 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
252 <conditional name="type_cond">
253 <param name="type" value="RDA"/>
254 <param name="distmeasure" value="manhattan"/>
255 <param name="transform" value="chi.square"/>
256 <param name="constrain" value="Plant"/>
257 </conditional>
258 <output name="plot" value="AalborgWWTPs-ordinate-rda.pdf" ftype="pdf" compare="sim_size"/>
259 </test>
260 <!-- MMDS + unifrac (which requires a tree) -->
261 <test expect_num_outputs="1">
262 <param name="data" value="AalborgWWTPs-complete.rds" ftype="ampvis2"/>
263 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
264 <conditional name="type_cond">
265 <param name="type" value="MMDS"/>
266 <param name="distmeasure" value="unifrac"/>
267 </conditional>
268 <output name="plot" value="AalborgWWTPs-ordinate-nmds.pdf" ftype="pdf" compare="sim_size"/>
269 </test>
270 <!-- color, shape and colorframe -->
271 <test expect_num_outputs="1">
272 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
273 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
274 <param name="sample_color_by" value="Year"/>
275 <param name="sample_shape_by" value="Plant"/>
276 <param name="sample_colorframe" value="Year"/>
277 <param name="sample_colorframe_label" value="Year"/>
278 <output name="plot" value="AalborgWWTPs-ordinate-color-shape-frame.pdf" ftype="pdf" compare="sim_size"/>
279 </test>
280 <!-- label, trajectory -->
281 <test expect_num_outputs="1">
282 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
283 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
284 <param name="sample_label_by" value="Plant"/>
285 <param name="sample_trajectory" value="Year"/>
286 <output name="plot" value="AalborgWWTPs-ordinate-label-traj.pdf" ftype="pdf" compare="sim_size"/>
287 </test>
288 <!-- trajectory group -->
289 <test expect_num_outputs="1">
290 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
291 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
292 <param name="sample_trajectory_group" value="Plant"/>
293 <output name="plot" value="AalborgWWTPs-ordinate-label-traj-group.pdf" ftype="pdf" compare="sim_size"/>
294 </test>
295 <!-- species plot -->
296 <test expect_num_outputs="1">
297 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
298 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
299 <conditional name="species_plot_cond">
300 <param name="species_plot" value="TRUE"/>
301 <param name="species_nlabels" value="2"/>
302 <param name="species_label_taxonomy" value="Family"/>
303 <param name="species_label_size" value="4"/>
304 </conditional>
305 <output name="plot" value="AalborgWWTPs-ordinate-species.pdf" ftype="pdf" compare="sim_size"/>
306 </test>
307 <!-- envfit factor-->
308 <test expect_num_outputs="1">
309 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
310 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
311 <param name="envfit_factor" value="Plant"/>
312 <output name="plot" value="AalborgWWTPs-ordinate-envfit-factor.pdf" ftype="pdf" compare="sim_size"/>
313 </test>
314 <!-- envfit numeric-->
315 <test expect_num_outputs="1">
316 <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/>
317 <param name="metadata_list" value="AalborgWWTPs-metadata.list"/>
318 <param name="envfit_numeric" value="Year"/>
319 <output name="plot" value="AalborgWWTPs-ordinate-envfit-num.pdf" ftype="pdf" compare="sim_size"/>
320 </test>
321 </tests>
322 <help><![CDATA[
323 What it does
324 ============
325
326 Generate ordination plots suited for analysis and comparison of microbial communities.
327
328 The Galaxy tool calls the `amp_ordinate
329 <https://kasperskytte.github.io/ampvis2/reference/amp_ordinate.html>`_ function
330 of the ampvis2 package, which is a wrapper around the vegan package to generate
331 ggplot2 ordination plots.
332
333 Details
334 =======
335
336 The ``amp_ordinate`` function is primarily based on two packages:
337
338 1. the ``vegan`` package, which performs the actual ordination, and
339 2. the ``ggplot2`` package, which generates the plot.
340
341 The function generates an ordination plot by the following process:
342
343 - Various input argument checks and error messages
344 - OTU-table filtering, where OTUs with low abundance across all samples are removed (unless **Abundance threshold** (``filter_species``) is set to 0)
345 - Data transformation (unless **Transforms the abundances before ordination** (``transform``) is set to "No transformation").
346 See details in `decostand <https://rdrr.io/cran/vegan/man/decostand.html>`_
347 - Calculation of a distance matrix based on the chosen distance measure (``distmeasure``) if the chosen ordination method is PCoA (MMDS) or nMDS.
348 See details in `vegdist <https://rdrr.io/cran/vegan/man/vegdist.html>`_
349 - Perform the actual ordination and calculate the axis scores for both samples and species/OTU's
350 - Visualise the result with ggplot2 or plotly in various ways defined by the user
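
The filtering, transformation and distance steps above can be sketched in plain Python. This is only an illustration of the arithmetic, not the ampvis2 or vegan code: the helper names ``filter_otus``, ``hellinger`` and ``bray_curtis``, the toy count table, and the rule that an OTU is kept if it reaches the threshold in at least one sample are all assumptions made for this example. The ordination step itself (eigenanalysis or MDS) is left out.

```python
import math

def filter_otus(table, threshold_pct):
    """Keep OTU columns that reach threshold_pct (relative abundance in %)
    in at least one sample. Assumed rule, for illustration only."""
    n_otus = len(table[0])
    keep = [
        any(100.0 * row[j] / sum(row) >= threshold_pct for row in table)
        for j in range(n_otus)
    ]
    filtered = [[v for v, k in zip(row, keep) if k] for row in table]
    return filtered, keep

def hellinger(table):
    """Hellinger transformation: square root of the row-wise proportions."""
    return [[math.sqrt(v / sum(row)) for v in row] for row in table]

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    shared = sum(abs(x - y) for x, y in zip(a, b))
    return shared / sum(x + y for x, y in zip(a, b))

# Hypothetical table: rows are samples, columns are OTUs (read counts).
# Every sample is assumed to contain at least one read.
counts = [
    [120, 30, 0, 1],
    [100, 45, 5, 0],
    [10, 80, 60, 0],
    [5, 90, 55, 0],
]

filtered, keep = filter_otus(counts, threshold_pct=1.0)
# Path for PCA/RDA/CA/CCA/DCA: transform, then ordinate.
transformed = hellinger(filtered)
# Path for nMDS/PCoA: no transformation, distance matrix on the filtered counts.
distances = [[bray_curtis(a, b) for b in filtered] for a in filtered]

print(keep)
print(round(distances[0][1], 4))
```

The two paths mirror the defaults of this tool: Hellinger for the eigenanalysis-based methods, and no transformation plus a distance measure for nMDS and PCoA.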
351
352 When the chosen ordination method is an eigenanalysis-based method, the
353 relative contribution (eigenvalue) of each axis to the total inertia in the data
354 (sum of all eigenvalues, including those of the constrained space) is shown
355 in percent in the axis titles. When one of the constrained ordination methods
356 (RDA or CCA) is used, a second value is also shown, which indicates
357 the relative contribution of that axis to the total
358 constrained space only.
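
As a worked example of the two percentages, the following sketch computes them from a list of eigenvalues. The function name ``axis_contributions`` and the eigenvalues are invented for illustration. The sketch assumes the constrained axes come first in the list, and it is not how ampvis2 computes the axis titles internally.

```python
def axis_contributions(eigenvalues, n_constrained=0):
    """Return (percent of total inertia per axis,
    percent of constrained inertia per constrained axis)."""
    total = sum(eigenvalues)
    pct_total = [100.0 * ev / total for ev in eigenvalues]
    constrained = eigenvalues[:n_constrained]
    constrained_total = sum(constrained)
    pct_constrained = [100.0 * ev / constrained_total for ev in constrained]
    return pct_total, pct_constrained

# Hypothetical RDA result: two constrained axes followed by two unconstrained axes.
eigenvalues = [4.0, 1.0, 3.0, 2.0]
pct_total, pct_constrained = axis_contributions(eigenvalues, n_constrained=2)

print(pct_total)
print(pct_constrained)
```

With these made-up numbers, the first constrained axis accounts for 40 percent of the total inertia but 80 percent of the constrained inertia, which is why the two values in an axis title can differ substantially.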
359
360 Input
361 =====
362
363 @HELP_RDS_INPUT@
364
365 @HELP_METADATA_LIST_INPUT@
366
367 Output
368 ======
369
370 An ordination plot in the chosen output format and, if **Output screeplot** is enabled, a screeplot in the same format.
371 ]]></help>
372 <expand macro="citations"/>
373 </tool>