comparison hicDetectLoops.xml @ 1:2d1988b74bf6 draft
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/hicexplorer commit 4b602d427e0fc0da5353a4510798349de98e4ae4"
| author | iuc |
|---|---|
| date | Wed, 11 Mar 2020 09:34:24 +0000 |
| parents | f8a8ca1ae303 |
| children | cabc992c6207 |
| 0:f8a8ca1ae303 | 1:2d1988b74bf6 |
|---|---|
| 45 #set $chromosome = ' '.join([ '\'%s\'' % $chrom for $chrom in str($chromosomes).split(' ') ]) | 45 #set $chromosome = ' '.join([ '\'%s\'' % $chrom for $chrom in str($chromosomes).split(' ') ]) |
| 46 --chromosomes $chromosome | 46 --chromosomes $chromosome |
| 47 #end if | 47 #end if |
| 48 | 48 |
| 49 --statisticalTest $statisticalTest_selector | 49 --statisticalTest $statisticalTest_selector |
| 50 | 50 |
| 51 --outFileName output_loop.bedgraph | 51 --outFileName output_loop.bedgraph |
| 52 | 52 |
| 53 --threads @THREADS@ | 53 --threads @THREADS@ -tpc @THREADS@ |
| 54 ]]> | 54 ]]> |
| 55 </command> | 55 </command> |
| 56 <inputs> | 56 <inputs> |
| 57 <expand macro="matrix_h5_cooler_macro" /> | 57 <expand macro="matrix_h5_cooler_macro" /> |
| 58 <param argument="--peakWidth" type="integer" optional='true' label="Peak width" help= "The width of the peak region in bins. The square around the peak will include (2 * peakWidth)^2 bins." /> | 58 <param argument="--peakWidth" type="integer" optional='true' label="Peak width" help= "The width of the peak region in bins. The square around the peak will include (2 * peakWidth)^2 bins." /> |
| 59 <param argument="--windowSize" type="integer" optional='true' label="Window size" help= "The window size for the neighborhood region the peak is located in. All values from this region (exclude the values from the peak | 59 <param argument="--windowSize" type="integer" optional='true' label="Window size" help= "The window size for the neighborhood region the peak is located in. All values from this region (exclude the values from the peak |
| 60 region) are tested against the peak region for significant difference. The square will have the size of (2 * windowSize)^2 bins" /> | 60 region) are tested against the peak region for significant difference. The square will have the size of (2 * windowSize)^2 bins" /> |
| 61 <param argument="--pValuePreselection" type="float" label="P-value preselection" help= "Only candidates with p-values less the given threshold will be considered as candidates. | 61 <param argument="--pValuePreselection" type="float" label="P-value preselection" help= "Only candidates with p-values less the given threshold will be considered as candidates. |
| 62 For each genomic distance a negative binomial distribution is fitted and for each pixel a p-value given by the cumulative density function is given. | 62 For each genomic distance a negative binomial distribution is fitted and for each pixel a p-value given by the cumulative density function is given. |
| 63 This does NOT influence the p-value for the neighborhood testing." value='0.05'/> | 63 This does NOT influence the p-value for the neighborhood testing." value='0.05'/> |
| 64 <param argument="--peakInteractionsThreshold" type="integer" label="Minimum interaction number" help= "The minimum number of interactions a detected peaks needs to have to be considered." value='5' /> | 64 <param argument="--peakInteractionsThreshold" type="integer" label="Minimum interaction number" help= "The minimum number of interactions a detected peaks needs to have to be considered." value='5' /> |
| 65 <param argument="--maximumInteractionPercentageThreshold" type="float" value='0.1' label="Maximum interaction share" help= "For each genomic distance the maximum value is considered and all candidates need to have at least \'max_value * maximumInteractionPercentageThreshold\' interactions." /> | 65 <param argument="--maximumInteractionPercentageThreshold" type="float" value='0.1' label="Maximum interaction share" help= "For each genomic distance the maximum value is considered and all candidates need to have at least \'max_value * maximumInteractionPercentageThreshold\' interactions." /> |
| 66 <param argument="--pValue" type="float" label="P-value" help= "Rejection level for the statistical test for H0. H0 is peak region and background have the same distribution." value='0.05'/> | 66 <param argument="--pValue" type="float" label="P-value" help= "Rejection level for the statistical test for H0. H0 is peak region and background have the same distribution." value='0.05'/> |
| 67 <param argument="--maxLoopDistance" optional='true' type="integer" label="Maximal loop distance" help= "Maximum genomic distance of a loop, usually loops are within a distance of ~2MB." value='2000000'/> | 67 <param argument="--maxLoopDistance" optional='true' type="integer" label="Maximal loop distance" help= "Maximum genomic distance of a loop, usually loops are within a distance of ~2MB." value='2000000'/> |
| 71 <param name="statisticalTest_selector" type="select" label="Statistical test"> | 71 <param name="statisticalTest_selector" type="select" label="Statistical test"> |
| 72 <option value="wilcoxon-rank-sum" selected="True">Wilcoxon rank-sum</option> | 72 <option value="wilcoxon-rank-sum" selected="True">Wilcoxon rank-sum</option> |
| 73 <option value="anderson-darling">Anderson-Darling</option> | 73 <option value="anderson-darling">Anderson-Darling</option> |
| 74 </param> | 74 </param> |
| 75 </inputs> | 75 </inputs> |
| 76 <outputs> | 76 <outputs> |
| 77 <data name='output_loops' from_work_dir='output_loop.bedgraph' format='bedgraph' label='Computed loops'/> | 77 <data name='output_loops' from_work_dir='output_loop.bedgraph' format='bedgraph' label='Computed loops'/> |
| 78 </outputs> | 78 </outputs> |
| 79 <tests> | 79 <tests> |
| 80 <test> | 80 <test> |
| 81 <param name="matrix_h5_cooler" value="small_test_matrix.cool"/> | 81 <param name="matrix_h5_cooler" value="small_test_matrix.cool"/> |
| 93 Loop detection | 93 Loop detection |
| 94 ============== | 94 ============== |
| 95 | 95 |
| 96 Computes enriched regions (peaks) or long range contacts on the given contact matrix. | 96 Computes enriched regions (peaks) or long range contacts on the given contact matrix. |
| 97 | 97 |
| | 98 hicDetectLoops can detect enriched interaction regions (peaks / loops) based on a strict candidate selection, negative binomial distributions and Anderson-Darling / Wilcoxon rank-sum tests. |
| | 99 |
| | 100 The algorithm was mainly developed on GM12878 cells from Rao 2014 on 10kb and 5kb fixed bin size resolution. |
| | 101 |
| | 102 _________________ |
| | 103 |
| | 104 Usage |
| | 105 ----- |
| | 106 |
| | 107 A command line example is available below (easily matchable in Galaxy using each field information): |
| | 108 |
| | 109 `$ hicDetectLoops -m matrix.cool -o loops.bedgraph --maxLoopDistance 2000000 --windowSize 10 --peakWidth 6 --pValuePreselection 0.05 --pValue 0.05 --peakInteractionsThreshold 20 --maximumInteractionPercentageThreshold 0.1 --statisticalTest anderson-darling` |
| | 110 |
| | 111 The candidate selection is based on the restriction of the maximum genomic distance, here 2MB. This distance is given by Rao 2014. For each genomic distance a negative binomial distribution is computed and only interaction pairs with a threshold less than ``--pValuePreselection`` are accepted. Detected candidates need to have at least an interaction count of ``--maximumInteractionPercentageThreshold`` times the maximum value for their genomic distance. Please note that ``--maximumInteractionPercentageThreshold`` was introduced with HiCExplorer release 3.2. Earlier versions did not have this parameter yet and therefore their outputs may differ. In a second step, each candidate is considered compared to its neighborhood. This neighborhood is defined by the ``--windowSize`` parameter in the x and y dimension. Per neighborhood only one candidate is considered, therefore only the candidate with the highest peak values is accepted. As a last step, the neighborhood is split into a peak and background region (parameter ``--peakWidth``). The peakWidth can never be larger than the windowSize. However, we recommend for 10kb matrices a windowSize of 10 and a peakWidth of 6. |
| | 112 |
| | 113 The output file (``-o loops.bedgraph``) contains the x and y position of each loop and its corresponding p-value of the Anderson-Darling test. |
| | 114 |
| | 115 `1 120000000 122500000 1 145000000 147500000 0.001` |
| | 116 |
| | 117 The results can be visualized via hicPlotMatrix: |
| | 118 |
| | 119 `$ hicPlotMatrix -m matrix.cool -o plot.png --log1p --region 1:18000000-22000000 --loops loops.bedgraph` |
| | 120 |
| | 121 .. image:: $PATH_TO_IMAGES/hicDetectLoops.png |
| | 122 :width: 50% |
| | 123 |
| | 124 |
| 98 For more information about HiCExplorer please consider our documentation on readthedocs.io_. | 125 For more information about HiCExplorer please consider our documentation on readthedocs.io_. |
| 99 | 126 |
| 100 .. _readthedocs.io: http://hicexplorer.readthedocs.io/en/latest/index.html | 127 .. _readthedocs.io: http://hicexplorer.readthedocs.io/en/latest/index.html |
| 101 | 128 |
| 102 ]]></help> | 129 ]]></help> |

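The help text added in this changeset describes two count-based filters in the candidate selection: a minimum interaction count (``--peakInteractionsThreshold``) and a share of the per-distance maximum (``--maximumInteractionPercentageThreshold``). The following is a minimal Python sketch of those two filters only, not HiCExplorer's implementation. The function name `filter_candidates`, the `(bin_x, bin_y, count)` tuple layout, and the use of bin offsets as the genomic distance are assumptions made for illustration; the negative binomial preselection and the neighborhood test are left out.

```python
from collections import defaultdict


def filter_candidates(pixels, max_share=0.1, min_interactions=5):
    """Keep pixels that pass both count-based candidate filters.

    pixels: list of (bin_x, bin_y, count) tuples (hypothetical layout).
    max_share: stands in for --maximumInteractionPercentageThreshold.
    min_interactions: stands in for --peakInteractionsThreshold.
    """
    # Find the maximum count observed at each genomic distance (in bins).
    max_per_distance = defaultdict(float)
    for bin_x, bin_y, count in pixels:
        distance = abs(bin_y - bin_x)
        max_per_distance[distance] = max(max_per_distance[distance], count)

    kept = []
    for bin_x, bin_y, count in pixels:
        distance = abs(bin_y - bin_x)
        # A candidate needs at least max_value * max_share interactions
        # for its distance, and at least min_interactions overall.
        if count >= min_interactions and count >= max_per_distance[distance] * max_share:
            kept.append((bin_x, bin_y, count))
    return kept


example = [(0, 5, 100), (1, 6, 8), (2, 7, 12), (0, 3, 4), (1, 4, 6)]
print(filter_candidates(example))
```

In the example, `(1, 6, 8)` is dropped because 8 is below 10% of the distance-5 maximum of 100, and `(0, 3, 4)` is dropped because it has fewer than 5 interactions.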