
--- a/multigps.xml	Wed Dec 14 10:46:50 2016 -0500
+++ b/multigps.xml	Wed Dec 14 11:30:21 2016 -0500
@@ -444,59 +444,54 @@

 **Options**

-* **Input files, attributes and options**
-
- - **Is this experiment signal or control?** - Designate the associated input file as a “signal” or “control” experiment.
- - **Condition name** - Condition name.
- - **Replicate name** - This is optional for control experiments, and if defined, the control will only be used for the corresponding named signal replicate.
- - **Read distribution file** - Optional binding event read distribution file (appropriate for the specified replicate) for initializing models.  If not specified, the default distribution is used.  The true distribution of reads around binding events is estimated during MultiGPS training.
- - **Use fixed per-base read count limit for this replicate?** - Optional fixed per-base read count limit for the specified replicate.  Selecting "Yes" sets a read count limit that varies along the genome according to how neighboring bases are distributed, while selecting "No" sets a global per-base limit that is estimated from a Poisson distribution.
-
-* **Perform motif-finding or use a motif-prior?** -  Integrate motif-finding or use a motif-prior via MEME.
+* **Loading data:**
+ - **Optional file containing reads from a control experiment** - Must be in the same format as the input experiment.
+ - **Fixed per-base limit** - Fixed per-base limit (default: estimated from background model).
+ - **Poisson threshold for filtering per base** - Look at neighboring positions to decide what the per-base limit should be.
+ - **Use non-unique reads** - Use non-unique reads.
+ - **Fraction of the genome that is mappable for these experiments** - Fraction of the genome that is mappable for these experiments
+ - **Turn off caching of the entire set of experiments?** - Flag to turn off caching of the entire set of experiments (i.e. run slower with less memory).

- - **Choose the source for the reference genome** - Reference data can be locally cached or selected from the Galaxy history.
- - **Perform inter-experiment positional prior?** - Perform inter-experiment positional prior.
- - **Probability that events are shared across conditions** - Probability that events are shared across conditions.
- - **Perform both motif-finding and motif priors?** - Select "No" to turn off motif-finding and motif priors.
- - **Perform motif-finding only?** - Select "Yes" to turn off motif priors, performing motif-finding only.
- - **Number of motifs MEME should find for each condition** - Number of motifs MEME should find for each condition.
- - **Minimum motif width for MEME** - Minimum motif width argument for MEME.
- - **Maximum motif width for MEME** - Maximum motif width argument for MEME.
+* **Scaling control vs signal counts:**
+ - **Use signal vs control scaling?** - Flag to turn off automatic estimation of the signal vs control scaling factor.
+ - **Use the median signal/control ratio as the scaling factor?** - Flag to use scaling by median ratio (default = scaling by NCIS).
+ - **Use scaling by regression on binned tag counts?** - Flag to use scaling by regression (default = scaling by NCIS).
+ - **Estimate scaling factor by SES?** - Specify whether to estimate scaling factor by SES.
+ - **Multiply control counts by total tag count ratio and then by this factor** - Multiply control counts by total tag count ratio and then by this factor (default: NCIS).
+ - **Window size for estimating scaling ratios** - Window size in base pairs for estimating scaling ratios
+ - **Plot diagnostic information for the chosen scaling method?** - Flag to plot diagnostic information for the chosen scaling method.

-* **General Advanced Options**
-
- - **Maximum number of training rounds for updating binding event read distributions** - Maximum number of training rounds for updating binding event read distributions
- - **Optional file containing a set of regions to ignore during MultiGPS training** - It’s a good idea to exclude the mitochondrial genome and other ‘blacklisted’ regions that contain artifactual accumulations of reads in both ChIP-seq and control experiments. MultiGPS will waste time trying to model binding events in these regions, even though they will not typically appear significantly enriched over the control (and thus will not be reported to the user).
+* **Running MultiGPS:**
+ - **Binding event read distribution file** - Binding event read distribution file for initializing models. The true distribution of reads around binding events is estimated during MultiGPS training. A default initial distribution appropriate for ChIP-seq data is used if this option is not specified.
+ - **Maximum number of training rounds for updating binding event read distributions** - Maximum number of training rounds for updating binding event read distributions.
  - **Perform binding model updates?** - Perform binding model updates?
  - **Minimum number of events to support an update of the read distribution** - Minimum number of events to support an update of the read distribution
  - **Perform binding model smoothing?** - Smooth with a cubic spline using a specified smoothing factor.
+ - **Spline smoothing parameter** - Smoothing parameter for smoothing cubic spline.
  - **Perform Gaussian model smoothing?** - Select "Yes" to use Gaussian model smoothing using a specified smoothing factor if binding model smoothing is not performed.
  - **Allow joint events in model updates?** - Specify whether to allow joint events in model updates.
- - **Share component configs in the ML step?** - Specify whether to share component configs in the ML step.  This mainly affects the quantification of binding levels for binding events that are not shared but are located at nearby locations across experiments.
-
-* **Set limits on how many reads can have their 5′ end at the same position in each replicate?**
-
- - **Fixed per-base limit** - Fixed per-base limit.
- - **Poisson threshold for filtering per base** - Look at neighboring positions to decide what the per-base limit should be.
- - **Use non-unique reads** - Use non-unique reads.
+ - **Keep binding model range fixed to initial size?** - Flag to keep the binding model range fixed to its initial size (default: vary automatically).
+ - **Poisson log threshold for potential region scanning** - Poisson log threshold for potential region scanning.
+ - **Alpha scaling factor** - Alpha scaling factor. Increasing this parameter results in stricter binding event calls.
+ - **Impose this alpha** - The alpha parameter is a sparse prior on binding events in the MultiGPS model. It can be interpreted as a minimum number of reads that each binding event must be responsible for in the model. Default: estimate alpha automatically.
+ - **Share component configs in the ML step?** - Flag to not share component configs in the ML step
+ - **Optional file containing a set of regions to ignore during MultiGPS training** - File containing a set of regions to ignore during MultiGPS training. It’s a good idea to exclude the mitochondrial genome and other ‘blacklisted’ regions that contain artifactual accumulations of reads in both ChIP-seq and control experiments. MultiGPS will waste time trying to model binding events in these regions, even though they will not typically appear significantly enriched over the control (and thus will not be reported to the user).

-* **Set data scaling parameters?**
+* **MultiGPS priors:**
+ - **Perform inter-experiment positional prior?** - Flag to turn off inter-experiment positional prior (default=on).
+ - **Probability that events are shared across conditions** - Probability that events are shared across conditions.
+ - **Perform both motif-finding and motif priors?** - Flag to turn off motif-finding and motif priors.
+ - **Perform motif-finding only?** - Flag to turn off motif priors only.
+ - **Number of motifs MEME should find for each condition** - Number of motifs MEME should find for each condition.
+ - **Minimum motif width for MEME** - Minimum motif width (the minw argument for MEME).
+ - **Maximum motif width for MEME** - Maximum motif width (the maxw argument for MEME).

- - **Use signal vs control scaling?** - Specify whether to use signal vs control scaling.
- - **Use the median signal/control ratio as the scaling factor?** - Specify whether to use the median signal/control ratio as the scaling factor.
- - **Estimate scaling factor by SES?** - Specify whether to estimate scaling factor by SES.
- - **Window size for estimating scaling ratios** - Window size in base pairs for estimating scaling ratios
-
-* **Report binding events?**
-
+* **Reporting binding events:**
  - **Minimum Q-value (corrected p-value) of reported binding events** - Minimum Q-value (corrected p-value) of reported binding events.
  - **Minimum event fold-change vs scaled control** - Minimum event fold-change vs scaled control.
  - **Run differential enrichment tests?** - Choose whether to run differential enrichment tests.
  - **EdgeR over-dispersion parameter value** - EdgeR over-dispersion parameter value.
  - **Minimum p-value for reporting differential enrichment** - Minimum p-value for reporting differential enrichment.
-
-* **Output MultiGPS process log?** - Select "Yes" to produce a second output dataset that contains the MultiGPS process log.
-
     </help>
     <expand macro="citations" />
 </tool>