genetrack: genetrack.xml comparison

comparison genetrack.xml @ 17:5a6ea187933b draft

Uploaded

author	greg
date	Wed, 16 Dec 2015 19:53:24 -0500
parents	b40ad4bee6cb
children	e1d437bd7d36

comparison

-:b40ad4bee6cb
+:5a6ea187933b
 * **Sigma to use when smoothing reads** - Smooths clusters of tags via a Gaussian distribution.
 * **Peak exclusion zone** - Exclusion zone around each peak, eliminating all other peaks on the same strand that are within a ± bp distance of the peak.
 * **Exclusion zone of upstream called peaks** - Defines the exclusion zone centered over peaks upstream of a peak.
 * **Exclusion zone of downstream called peaks** - Defines the exclusion zone centered over peaks downstream of a peak.
 * **Filter** - Absolute read filter, restricts output to only peaks with larger peak height.
+-----
+**Output gff Columns**
+1. Chromosome
+2. Script
+3. Placeholder (no meaning)
+4. Start of peak exclusion zone (-e 20)
+5. End of peak exclusion zone
+6. Tag sum (not peak height or area under curve, which LionDB provides)
+7. Strand
+8. Placeholder (no meaning)
+9. Attributes (standard deviation of reads located within the exclusion zone, i.e. the fuzziness of the peak)
+-----
+**Considerations**
+In principle, the width of the exclusion zone may be as large as the DNA region occupied by the native protein
+plus a steric exclusion zone between the protein and the exonuclease.  On the other hand, the site might be considerably
+smaller if the protein is in a denatured state during exonuclease digestion (since it is pre-treated with SDS).
+In general, higher resolution data or smaller binding site size data should use smaller sigma values.  Large binding site
+size data, such as 147 bp nucleosomal DNA, should use a larger sigma value such as 20 (-s 20).  For transcription factors
+mapped by ChIP-exo, sigma may initially be set at 5 and the exclusion zone at 20 (-s 5 -e 20).  Sigma is typically varied
+between ~3 and ~20.  Too high a sigma value may merge two independent nearby binding events.  This may be desirable if
+closely bound factors are not distinguishable.  Too low a sigma value will cause some tags that contribute to a binding
+event to be excluded, because they may not be located sufficiently close to the main peak.  If alternative (mutually
+exclusive) binding is expected for two overlapping sites, and these sites are to be independently recorded, then an
+empirically determined smaller exclusion zone width should be set.  Thus, the value of sigma is set empirically for each
+mapped factor depending upon the resolution and binding site size of the binding event.
+It might make sense to exclude peaks that have only a single tag, where -F 1 is used, or whose tags are all located on
+a single coordinate (called singletons, where stddev=0 in the output file).  However, low coverage datasets might be
+improved by including them, if additional analysis (e.g., motif discovery) validates them.  In addition, idealized action
+of the exonuclease in ChIP-exo might place all tags for a peak on a single coordinate.
 </help>
 <expand macro="citations" />
 </tool>
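
The column layout and the singleton discussion in the added help text can be illustrated with a short post-processing sketch. This is not part of the tool or of this changeset. It assumes the output is tab-separated with the nine columns listed above, and that column 9 carries the standard deviation as `stddev=<number>` (an assumption based on the "stddev=0" wording). The function names `parse_peaks` and `drop_weak_peaks`, the `min_tags` threshold, and the example rows are all hypothetical.

```python
import io


def parse_peaks(handle):
    """Yield one dict per peak from tab-separated, nine-column GFF-like lines."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip rows that do not match the nine-column layout
        # Column 9: assumed to be ';'-separated key=value pairs such as "stddev=3.2".
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        yield {
            "chrom": cols[0],
            "start": int(cols[3]),        # start of peak exclusion zone
            "end": int(cols[4]),          # end of peak exclusion zone
            "tag_sum": float(cols[5]),    # tag sum, not peak height
            "strand": cols[6],
            "stddev": float(attrs.get("stddev", "nan")),
        }


def drop_weak_peaks(peaks, min_tags=2.0):
    """Keep peaks with at least min_tags tags that are not singletons.

    A singleton is taken to be a peak with stddev == 0. Peaks with no stddev
    attribute parse as NaN and are also dropped, because NaN > 0 is False.
    """
    return [p for p in peaks if p["tag_sum"] >= min_tags and p["stddev"] > 0]


# Hypothetical example rows, invented for illustration only.
example = (
    "chr1\tgenetrack\t.\t100\t120\t12\t+\t.\tstddev=3.2\n"
    "chr1\tgenetrack\t.\t300\t320\t1\t+\t.\tstddev=0.0\n"
    "chr1\tgenetrack\t.\t500\t520\t4\t-\t.\tstddev=0.0\n"
)
kept = drop_weak_peaks(parse_peaks(io.StringIO(example)))
for peak in kept:
    print(peak["chrom"], peak["start"], peak["end"], peak["strand"], peak["tag_sum"])
```

Whether to apply such a filter is a judgment call: as the help text notes, low coverage datasets may benefit from keeping singletons, and ideal exonuclease action in ChIP-exo can legitimately place all tags for a peak on one coordinate.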
