annotate readme.rst @ 20:bb725f6d6d38 draft default tip

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/rglasso commit 344140b8df53b8b7024618bb04594607a045c03a
author iuc
date Mon, 04 May 2015 22:47:29 -0400
parents e0e11c2cae3f
e0e11c2cae3f Uploaded
fubar
glmnet wrappers
===============

This is a self-installing Galaxy tool exposing the glmnet_ R package, which has excellent documentation at
glmnet_. Only minimal details are provided in this wrapper; please read the manual to get the best out of it.

The tool exposes the full range of penalised maximum likelihood
GLM models, from pure lasso (set alpha to 1) to pure ridge regression (set alpha to 0).
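
As a rough illustration of what alpha controls, the sketch below computes the elastic-net penalty in the form given in the glmnet documentation, ``lambda * (alpha * L1 + (1 - alpha) / 2 * L2^2)``. The function name and the example coefficients are made up for illustration and are not part of this wrapper.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic-net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * squared L2).

    alpha = 1 gives the pure lasso penalty, alpha = 0 the pure ridge penalty.
    Hypothetical helper for illustration only.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


# Illustrative coefficients, not taken from any real model.
coefs = [0.5, -2.0, 0.0]
lasso_penalty = elastic_net_penalty(coefs, lam=1.0, alpha=1.0)  # L1 term only
ridge_penalty = elastic_net_penalty(coefs, lam=1.0, alpha=0.0)  # L2 term only
mixed_penalty = elastic_net_penalty(coefs, lam=1.0, alpha=0.5)  # equal mix
```

Values of alpha between 0 and 1 blend the two penalties, so the mixed penalty always lies between the ridge and lasso values for the same coefficients.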

These models can be internally cross-validated (k-fold) to help select an "optimal" predictive or classification
model. Predictive coefficients for each included independent variable are output for each model.

Predictors can be forced into models to adjust for known confounders or explanatory factors.

The glmnet_ implementation of the coordinate descent algorithm is fast and efficient even on relatively large problems
with tens of thousands of predictors and thousands of samples, for example normalised microarray intensities and anthropometry
on a very large sample of obese patients.
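
For readers unfamiliar with coordinate descent, the following is a minimal pure-Python sketch of the cyclic update for a Gaussian elastic-net model. It is not the glmnet code, which is far more optimised. It assumes centred data with no intercept and no all-zero columns, and every name in it is hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; return exactly 0.0 inside [-gamma, gamma]."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_sweeps=200):
    """Cyclic coordinate descent for
    (1 / 2n) * sum((y - X b)^2) + lam * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2).

    Assumes centred X and y (no intercept) and no all-zero columns.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Covariance of column j with the partial residual
            # (the residual with predictor j's own contribution added back).
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1.0 - alpha))
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_b
    return beta


# Toy data: predictor 0 carries a strong signal, predictor 1 a very weak one.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
lasso_beta = elastic_net_cd(X, y, lam=0.1, alpha=1.0)
ridge_beta = elastic_net_cd(X, y, lam=0.5, alpha=0.0)
```

On this toy data the lasso fit sets the weak predictor's coefficient exactly to zero, while the ridge fit only shrinks it. That is the practical difference between the two ends of the alpha range.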

The user supplies a tabular file with rows as samples and columns containing observations, then chooses
as many predictors as required. A separate model is output for each of potentially multiple dependent
variables. Models are reported as the coefficients for terms in an 'optimal' model.
These optimal predictors are selected by repeatedly setting
aside a random subsample, building a model in the remainder and estimating AUC or deviance
using k-fold (default k = 10) internal cross-validation. For each of these steps, a random 1/k
of the samples is set aside and used to estimate the performance of an optimal model fitted
to the remaining samples. Plots are provided showing the range of these (e.g. 10) internal validation
estimates and the mean model AUC (binomial) or residual deviance at each penalty increment step.
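
The hold-out scheme described above can be sketched in a few lines of Python. This is only an illustration of k-fold splitting under assumed names (``k_fold_splits``, ``seed``). The wrapper itself relies on the cross-validation built into glmnet_, and that may assign folds differently.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Return k (training, held_out) pairs of sample indices.

    Each sample is held out exactly once; every held-out set holds roughly
    1/k of the samples. Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[f::k] for f in range(k)]
    splits = []
    for f in range(k):
        held_out = sorted(folds[f])
        training = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((training, held_out))
    return splits


splits = k_fold_splits(25, k=10, seed=1)
for training, held_out in splits:
    # A model would be fitted on `training` and scored (AUC or deviance)
    # on `held_out`; the k scores are then summarised per penalty value.
    pass
```

Averaging the k held-out scores at each penalty value gives the kind of curve, with its spread, that the plots summarise.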

A range of model families is available in this wrapper, including Gaussian, Poisson, binomial and
Cox proportional hazards (time to failure for censored data).

Note that multinomial and multiresponse Gaussian models are NOT yet implemented since I have not yet
had a use for them - send code!

.. _glmnet: http://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html

Wrapper author: Ross Lazarus
19 October 2014