annotate compute_motifs_frequency.xml @ 1:918324d122bd draft default tip

planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
author devteam
date Tue, 13 Oct 2015 12:15:23 -0400
parents d66f925bfbeb
children
rev   line source
0
d66f925bfbeb Uploaded tool tarball.
devteam
parents:
diff changeset
1 <tool id="compute_motifs_frequency" name="Compute Motif Frequencies" version="1.0.0">
2 <description>in indel flanking regions</description>
3
4
5 <command interpreter="perl">
6 compute_motifs_frequency.pl $inputFile1 $inputFile2 $inputNumber3 $outputFile1 $outputFile2
7 </command>
8
9
10 <inputs>
11
12 <param format="tabular" name="inputFile1" type="data" label="Select motifs file"/>
13
14 <param format="tabular" name="inputFile2" type="data" label="Select indel flanking regions file from your history"/>
15
1
918324d122bd planemo upload commit 33927a87ba2eee9bf0ecdd376a66241b17b3d734
devteam
parents: 0
diff changeset
16 <param type="integer" name="inputNumber3" value="0" label="What is the size of each window?" help="A value of 0 treats the whole upstream flanking sequence as a single window, and likewise the whole downstream flanking sequence."/>
0
17
18 </inputs>
19
20
21 <outputs>
22 <data format="tabular" name="outputFile1"/>
23 <data format="tabular" name="outputFile2"/>
24 </outputs>
25
26 <tests>
27 <test>
28 <param name="inputFile1" value="motifs1.tabular" />
29 <param name="inputFile2" value="indelsFlankingSequences1.tabular" />
30 <param name="inputNumber3" value="0" />
31 <output name="outputFile1" file="flankingSequencesWindows0.tabular" />
32 <output name="outputFile2" file="motifFrequencies0.tabular" />
33 </test>
34
35 <test>
36 <param name="inputFile1" value="motifs1.tabular" />
37 <param name="inputFile2" value="indelsFlankingSequences1.tabular" />
38 <param name="inputNumber3" value="10" />
39 <output name="outputFile1" file="flankingSequencesWindows10.tabular" />
40 <output name="outputFile2" file="motifFrequencies10.tabular" />
41 </test>
42 </tests>
43
44
45 <help>
46
47 .. class:: infomark
48
49 **What it does**
50
51 This program computes the frequency of motifs in the flanking regions of indels found in a chromosome or a genome.
52 Each indel has an upstream flanking sequence and a downstream flanking sequence. Each flanking sequence is divided
53 into windows whose size is set by the user.
54 The frequency of a motif in a given window of one of the two flanking sequences is the total number of occurrences of
55 that motif in that window of that flanking sequence, summed over all indels. The indel flanking regions file can be taken
56 from your history or uploaded, whereas the motifs file should be uploaded.
57
58 - The first input file is the motifs file and it is a tabular file consisting of two columns:
59
60 - the first column represents the motif name
61 - the second column represents the motif sequence, as follows::
62
63 dnaPolPauseFrameshift1 GAG
64 dnaPolPauseFrameshift2 ACG
65 xSites1 CCG
66
67 - The second input file is the indels flanking regions file and it is a tabular file consisting of five columns:
68
69 - the first column represents the indel start coordinate
70 - the second column represents the indel end coordinate
71 - the third column represents the indel length
72 - the fourth column represents the upstream flanking sequence
73 - the fifth column represents the downstream flanking sequence, as follows::
74
75 16694766 16694768 3 GTGGGTCCTGCCCAGCCTCTGCCTCAGAGGGAAGAGTAGAGAACTGGG AGAGCAGGTCCTTAGGGAGCCCGAGGAAGTCCCTGACGCCAGCTGTTCTCGCGGACGAA
76 25169542 25169545 4 caagcccacaagccttcagaccatagcaCGGGCTCCAGAGGTGTGAGG CAGGTCAGGTGCTTTAGAAGTCAAAAACTCTCAGTAAGGCAAATCACCCCCTATCTCCT
77 41929580 41929585 6 ggctgtcgtatggaatctggggctcaggactctgtcccatttctctaa accattctgcTTCAACCCAGACACTGACTGTTTTCCAAATTTACTTGTTTGTTTGTTTT
78
79
80 -----
81
82 .. class:: warningmark
83
84 **Notes**
85
86 - The lengths of the upstream flanking sequences must be equal for all indels.
87 - The lengths of the downstream flanking sequences must be equal for all indels.
88 - If the length L of the upstream flanking sequence is not an integer multiple of the window size S, that is, if L = m * S + r where m is the integer quotient and r is the nonzero remainder, then the upstream flanking sequence is divided into only m windows, counted starting from the indel, and the remaining r bases farthest from the indel are not considered. The same rule applies to the downstream flanking sequence.
89
90 -----
91
92 The **output** of this program is two files:
93
94 - The first output file is a tabular file and represents the windows of both upstream and downstream flanking sequences. It consists of multiple left columns representing the windows of the upstream flanking sequence, followed by one column representing the indels, then followed by multiple right columns representing the windows of the downstream flanking sequence, as follows::
95
96 cgaggtcagg agatcgagac catcctggct aacatggtga aatcccgtct ctactaaaaa indel aaatttatat ttataaacaa ttttaataca cctatgttta ttatacattt
97 GCCAGTTTAT GGTCTAACAA GGAGAGAAAC AGGGGGCTGA AGGGGTTTCT TAACCTCCAG indel TTCCGGGCTC TGTCCCTAAC CCCCAGCTAG GTAAGTGGCA AAGCACTTCT
98 CAGTGGGACC AAGCACTGAA CCACTTTGGG GAGAATCTCA CACTGGGGCC CTCTGACACC indel tatatatttt tttttttttt tttttttttt tttttttttg agatggtgtc
99 AGAGCAGCAG CACCCACTTT TGCAGTGTGT GACGTTGGTG GAGCCATCGA AGTCTGTGCT indel GAGCCCTCCC CAGTGCTCCG AGGAGCTGCT GTTCCCCCTG GAGCTCAGAA
100
101 - The second output file is a tabular file and represents the motif frequencies in every window of every flanking sequence. The first column on the left represents the names of motifs. The other columns represent the frequencies of motifs in the windows that correspond to the ones in the first output file, as follows::
102
103 dnaPolPauseFrameshift1 2 3 1 0 1 2 indel 0 2 2 1 3
104 dnaPolPauseFrameshift2 2 3 1 0 1 2 indel 0 2 2 1 3
105 xSites1 3 2 0 1 1 2 indel 1 1 3 2 3
106
107 </help>
108
109 </tool>
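
The windowing rule and the frequency definition given in the help text above can be sketched in Python as follows. This is only an illustration, not the logic of compute_motifs_frequency.pl itself: all function names are hypothetical, and the sketch assumes case-insensitive matching with overlapping occurrences counted separately, neither of which the help text states.

```python
def upstream_windows(seq, size):
    """Split an upstream flank into windows counted from the indel (right end).

    Any remainder at the far (left) end is dropped. size == 0 means one window.
    Windows are returned in left-to-right order.
    """
    if size == 0:
        return [seq]
    m = len(seq) // size                      # number of whole windows
    trimmed = seq[len(seq) - m * size:]       # drop the r leftmost bases
    return [trimmed[i:i + size] for i in range(0, len(trimmed), size)]


def downstream_windows(seq, size):
    """Split a downstream flank into windows counted from the indel (left end)."""
    if size == 0:
        return [seq]
    m = len(seq) // size
    return [seq[i * size:(i + 1) * size] for i in range(m)]


def count_occurrences(motif, window):
    """Count occurrences of motif in window (assumed: case-insensitive, overlapping)."""
    motif, window = motif.upper(), window.upper()
    k = len(motif)
    return sum(1 for i in range(len(window) - k + 1) if window[i:i + k] == motif)


def motif_frequencies(motifs, flanks, size):
    """Sum per-window motif counts over all indels.

    motifs: dict mapping motif name -> motif sequence
    flanks: list of (upstream, downstream) sequence pairs, one per indel
    Returns: dict mapping motif name -> (upstream totals, downstream totals)
    """
    freqs = {}
    for name, motif in motifs.items():
        up_totals, down_totals = [], []
        for upstream, downstream in flanks:
            up = [count_occurrences(motif, w) for w in upstream_windows(upstream, size)]
            down = [count_occurrences(motif, w) for w in downstream_windows(downstream, size)]
            # The first indel initialises the totals; later indels add element-wise.
            up_totals = up if not up_totals else [a + b for a, b in zip(up_totals, up)]
            down_totals = down if not down_totals else [a + b for a, b in zip(down_totals, down)]
        freqs[name] = (up_totals, down_totals)
    return freqs
```

The element-wise sum relies on the note that all upstream flanks share one length and all downstream flanks share another, so every indel yields the same number of windows on each side.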