comparison genetrack.xml @ 20:2f0dede41f69 draft

Uploaded
author greg
date Wed, 16 Dec 2015 20:09:08 -0500
parents f45571c6e3dd
children c868ac2145c4
19:f45571c6e3dd 20:2f0dede41f69
136 * **Peak exclusion zone** - Exclusion zone around each peak, eliminating all other peaks on the same strand that are within a ± bp distance of the peak. 136 * **Peak exclusion zone** - Exclusion zone around each peak, eliminating all other peaks on the same strand that are within a ± bp distance of the peak.
137 * **Exclusion zone of upstream called peaks** - Defines the exclusion zone centered over peaks upstream of a peak. 137 * **Exclusion zone of upstream called peaks** - Defines the exclusion zone centered over peaks upstream of a peak.
138 * **Exclusion zone of downstream called peaks** - Defines the exclusion zone centered over peaks downstream of a peak. 138 * **Exclusion zone of downstream called peaks** - Defines the exclusion zone centered over peaks downstream of a peak.
139 * **Filter** - Absolute read filter; restricts output to peaks whose peak height is larger than this value. 139 * **Filter** - Absolute read filter; restricts output to peaks whose peak height is larger than this value.
140 140
141 ----- 141 -----
142 142
143 **Output gff Columns** 143 **Output gff Columns**
144 144
145 * Chromosome 145 1. Chromosome
146 * Script 146 2. Script
147 * Placeholder (no meaning) 147 3. Placeholder (no meaning)
148 * Start of peak exclusion zone (-e 20) 148 4. Start of peak exclusion zone (-e 20)
149 * End of peak exclusion zone 149 5. End of peak exclusion zone
150 * Tag sum (not peak height or area under curve, which LionDB provides) 150 6. Tag sum (not peak height or area under curve, which LionDB provides)
151 * Strand 151 7. Strand
152 * Placeholder (no meaning) 152 8. Placeholder (no meaning)
153 * Attributes (standard deviation of reads located within exclusion zone) = fuzziness of peak 153 9. Attributes (standard deviation of reads located within exclusion zone) = fuzziness of peak
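The nine columns listed above can be read into named fields. The following is a minimal sketch, not part of the tool itself: it assumes the output is tab-separated, and the `Peak` type, the `parse_peak` helper, the example line, and the `stddev=` attribute spelling are illustrative assumptions.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    """One row of the output gff file (hypothetical container type)."""
    chrom: str       # column 1: chromosome
    script: str      # column 2: script name
    start: int       # column 4: start of peak exclusion zone
    end: int         # column 5: end of peak exclusion zone
    tag_sum: float   # column 6: tag sum
    strand: str      # column 7: strand
    attributes: str  # column 9: attributes (fuzziness of peak)


def parse_peak(line: str) -> Peak:
    """Split one line into its nine columns (assumed tab-separated)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}")
    # Columns 3 and 8 are placeholders with no meaning, so they are dropped.
    chrom, script, _, start, end, tag_sum, strand, _, attributes = fields
    return Peak(chrom, script, int(start), int(end), float(tag_sum), strand, attributes)


# Made-up example line; the values are not taken from a real run.
example = "chr1\tgenetrack\t.\t100\t120\t35\t+\t.\tstddev=4.2"
peak = parse_peak(example)
```

Dropping the two placeholder columns keeps the record to the fields that carry meaning.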
154 154
155 ----- 155 -----
156 156
157 **Considerations** 157 **Considerations**
158 158
159 In principle, the width of the exclusion zone may be as large as the DNA region occupied by the native protein 159 In principle, the width of the exclusion zone may be as large as the DNA region occupied by the native protein
160 plus a steric exclusion zone between the protein and the exonuclease. On the other hand the site might be considerably 160 plus a steric exclusion zone between the protein and the exonuclease. On the other hand the site might be considerably
161 smaller if the protein is in a denatured state during exonuclease digestion (since it is pre-treated with SDS). 161 smaller if the protein is in a denatured state during exonuclease digestion (since it is pre-treated with SDS).
162 162
163 In general, higher resolution data or smaller binding site size data should use smaller sigma values. Large binding site 163 In general, higher resolution data or smaller binding site size data should use smaller sigma values. Large binding site
164 size data such as 147 bp nucleosomal DNA use a larger sigma value like 20 (-s 20). For transcription factors mapped by 164 size data such as 147 bp nucleosomal DNA use a larger sigma value like 20 (-s 20). For transcription factors mapped by
165 ChIP-exo, sigma may initially be set at 5, and the exclusion zone set at 20 (-s 5 -e 20). Sigma is typically varied 165 ChIP-exo, sigma may initially be set at 5, and the exclusion zone set at 20 (-s 5 -e 20). Sigma is typically varied
166 between ~3 and ~20. Too high of a sigma value may merge two independent nearby binding events. This may be desirable if 166 between ~3 and ~20. Too high of a sigma value may merge two independent nearby binding events. This may be desirable if
167 closely bound factors are not distinguishable. Too low of a sigma value will cause some tags that contribute to a binding 167 closely bound factors are not distinguishable. Too low of a sigma value will cause some tags that contribute to a binding
168 event to be excluded, because they may not be located sufficiently close to the main peak. If alternative (mutually 168 event to be excluded, because they may not be located sufficiently close to the main peak. If alternative (mutually
169 exclusive) binding is expected for two overlapping sites, and these sites are to be independently recorded, then an 169 exclusive) binding is expected for two overlapping sites, and these sites are to be independently recorded, then an
170 empirically determined smaller exclusion zone width is set. Thus, the value of sigma is set empirically for each mapped 170 empirically determined smaller exclusion zone width is set. Thus, the value of sigma is set empirically for each mapped factor depending upon the resolution and binding site size of the binding event.
171 factor depending upon the resolution and binding site size of the binding event.
172 171
173 It might make sense to exclude peaks that have only a single tag, where -F 1 is used, or have their tags located on only 172 It might make sense to exclude peaks that have only a single tag, where -F 1 is used, or have their tags located on only
174 a single coordinate (called Singletons, where stddev=0 in the output file). However, low coverage datasets might be 173 a single coordinate (called Singletons, where stddev=0 in the output file). However, low coverage datasets might be
175 improved by including them, if additional analysis (e.g., motif discovery) validates them. In addition, idealized action 174 improved by including them, if additional analysis (e.g., motif discovery) validates them. In addition, idealized action
176 of the exonuclease in ChIP-exo might place all tags for a peak on a single coordinate. 175 of the exonuclease in ChIP-exo might place all tags for a peak on a single coordinate.
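The single-tag and singleton exclusions described above could be applied to already-called peaks as a post-processing step. The sketch below is one possible reading, not the tool's own logic: `stddev_of` and `keep_peak` are hypothetical helpers, and the attribute column is assumed to hold `key=value` items separated by semicolons, with the fuzziness stored as `stddev=<number>`.

```python
def stddev_of(attributes: str) -> float:
    """Extract the stddev value from the attribute column (assumed key=value format)."""
    for item in attributes.split(";"):
        key, sep, value = item.strip().partition("=")
        if sep and key == "stddev":
            return float(value)
    raise ValueError("no stddev attribute found")


def keep_peak(tag_sum: float, attributes: str,
              min_tags: float = 1.0, drop_singletons: bool = True) -> bool:
    """Decide whether a peak survives the two optional exclusions."""
    # Mirrors the effect described for -F 1: peaks with a single tag are dropped.
    if tag_sum <= min_tags:
        return False
    # Singletons have all their tags on one coordinate, so stddev is 0.
    if drop_singletons and stddev_of(attributes) == 0.0:
        return False
    return True
```

For low-coverage or ChIP-exo data, where singletons may be real binding events, calling `keep_peak(..., drop_singletons=False)` retains them for later validation such as motif discovery.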