comparison glimmer_long_orfs.xml @ 0:943d09a12602 draft default tip
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/glimmer commit 37388949e348d221170659bbee547bf4ac67ef1a
| author | iuc |
|---|---|
| date | Tue, 28 Nov 2017 09:56:41 -0500 |
| parents | |
| children |
| -1:000000000000 | 0:943d09a12602 |
|---|---|
| 1 <tool id="glimmer_long_orfs" name="Glimmer long ORFs" version="@WRAPPER_VERSION@"> | |
| 2 <description>identify long, non-overlapping ORFs</description> | |
| 3 <macros> | |
| 4 <import>macros.xml</import> | |
| 5 </macros> | |
| 6 <expand macro="requirements"/> | |
| 7 <command><![CDATA[ | |
| 8 long-orfs | |
| 9 -n -t | |
| 10 $cutoff | |
| 11 '$inputfile' | |
| 12 '$output' | |
| 13 2>&1 | |
| 14 ]]></command> | |
| 15 <inputs> | |
| 16 <param name="inputfile" type="data" format="fasta" label="Genome Sequence" help="Dataset missing? See TIP below"/> | |
| 17 <param name='cutoff' type='float' label='cutoff' value='1.5'/> | |
| 18 </inputs> | |
| 19 <outputs> | |
| 20 <data format="tabular" name="output" /> | |
| 21 </outputs> | |
| 22 <tests> | |
| 23 <test> | |
| 24 <param name="inputfile" value='streptomyces_Tu6071_genomic.fasta'/> | |
| 25 <param name='cutoff' value='1.5'/> | |
| 26 <output name="output" file='longORFSTestOutput.dat'/> | |
| 27 </test> | |
| 28 </tests> | |
| 29 <help><![CDATA[ | |
| 30 | |
| 31 **What it does** | |
| 32 | |
| 33 This program identifies long, non-overlapping open reading frames (ORFs) in a DNA sequence file. | |
| 34 These ORFs are very likely to contain genes and can be used as a set of training sequences for build-icm. | |
| 35 More specifically, among all ORFs longer than a minimum length, those that do not overlap any others are output. The start codon used for | |
| 36 each ORF is the first possible one. By default, the program automatically determines the minimum length | |
| 37 that maximizes the number of ORFs output. With the -t option, the initial | |
| 38 set of candidate ORFs can also be filtered using entropy distance, which generally produces | |
| 39 a larger, more accurate training set, particularly for high-GC-content genomes. | |
| 40 | |
| 41 | |
| 42 | |
| 43 ----- | |
| 44 | |
| 45 **Glimmer Overview** | |
| 46 | |
| 47 :: | |
| 48 | |
| 49 ************** ************** ************** ************** | |
| 50 * * * * * * * * | |
| 51 * long-orfs * ===> * Extract * ===> * build-icm * ===> * glimmer3 * | |
| 52 * * * * * * * * | |
| 53 ************** ************** ************** ************** | |
| 54 | |
| 55 ----- | |
| 56 | |
| 57 **Example** | |
| 58 | |
| 59 | |
| 60 * input:: | |
| 61 | |
| 62 - Genome Sequence | |
| 63 | |
| 64 CELF22B7 Caenorhabditis elegans (Bristol N2) cosmid F22B7 | |
| 65 GATCCTTGTAGATTTTGAATTTGAAGTTTTTTCTCATTCCAAAACTCTGT | |
| 66 GATCTGAAATAAAATGTCTCAAAAAAATAGAAGAAAACATTGCTTTATAT | |
| 67 TTATCAGTTATGGTTTTCAAAATTTTCTGACATACCGTTTTGCTTCTTTT | |
| 68 TTTCTCATCTTCTTCAAATATCAATTGTGATAATCTGACTCCTAACAATC | |
| 69 GAATTTCTTTTCCTTTTTCTTTTTCCAACAACTCCAGTGAGAACTTTTGA | |
| 70 ATATCTTCAAGTGACTTCACCACATCAGAAGGTGTCAACGATCTTGTGAG | |
| 71 AACATCGAATGAAGATAATTTTAATTTTAGAGTTACAGTTTTTCCTCCGA | |
| 72 CAATTCCTGATTTACGAACATCTTCTTCAAGCATTCTACAGATTTCTTGA | |
| 73 TGCTCTTCTAGGAGGATGTTGAAATCCGAAGTTGGAGAAAAAGTTCTCTC | |
| 74 AACTGAAATGCTTTTTCTTCGTGGATCCGATTCAGATGGACGACCTGGCA | |
| 75 GTCCGAGAGCCGTTCGAAGGAAAGATTCTTGTGAGAGAGGCGTGAAACAC | |
| 76 AAAGGGTATAGGTTCTTCTTCAGATTCATATCACCAACAGTTTGAATATC | |
| 77 CATTGCTTTCAGTTGAGCTTCGCATACACGACCAATTCCTCCAACCTAAA | |
| 78 AAATTATCTAGGTAAAACTAGAAGGTTATGCTTTAATAGTCTCACCTTAC | |
| 79 GAATCGGTAAATCCTTCAAAAACTCCATAATCGCGTTTTTATCATTTTCT | |
| 80 ..... | |
| 81 | |
| 82 - Cutoff 1.5 | |
| 83 | |
| 84 * output:: | |
| 85 | |
| 86 Sequence file = /home/mohammed/galaxy-central/database/files/000/dataset_34.dat | |
| 87 Excluded regions file = none | |
| 88 Circular genome = true | |
| 89 Initial minimum gene length = 90 bp | |
| 90 Determine optimal min gene length to maximize number of genes | |
| 91 Maximum overlap bases = 30 | |
| 92 Start codons = atg,gtg,ttg | |
| 93 Stop codons = taa,tag,tga | |
| 94 Sequence length = 40222 | |
| 95 Final minimum gene length = 97 | |
| 96 | |
| 97 Putative Genes: | |
| 98 00001 40137 52 +2 0.892 | |
| 99 00002 1319 1095 -3 0.654 | |
| 100 00003 1555 1391 -2 0.793 | |
| 101 00004 1953 2066 +3 1.078 | |
| 102 00005 2045 2146 +2 0.919 | |
| 103 00006 4463 4759 +2 0.985 | |
| 104 00007 6785 6582 -3 1.033 | |
| 105 00008 6862 7020 +1 0.915 | |
| 106 00009 7300 7488 +1 0.900 | |
| 107 00010 7463 7570 +2 0.912 | |
| 108 00011 8399 8527 +2 1.044 | |
| 109 00012 10652 10545 -3 0.895 | |
| 110 00013 12170 12066 -3 1.108 | |
| 111 00014 13891 13748 -2 0.998 | |
| 112 00015 14157 14044 -1 1.026 | |
| 113 00016 15285 15410 +3 0.928 | |
| 114 00017 15829 15704 -2 0.949 | |
| 115 | |
| 116 .... | |
| 117 ]]></help> | |
| 118 <expand macro="citation" /> | |
| 119 </tool> |
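
The "Putative Genes" records in the example output above can be read back with a few lines of Python. This is a minimal sketch, not part of the wrapper: it assumes each record has five whitespace-separated fields (ID, start, end, reading frame, and a final numeric column taken here to be the entropy-distance score), and that the genome is circular, as the example header reports. The names `Orf`, `parse_orf_lines`, and `orf_length` are hypothetical helpers introduced only for illustration.

```python
from typing import Iterable, List, NamedTuple


class Orf(NamedTuple):
    orf_id: str
    start: int
    end: int
    frame: int    # sign gives the strand, magnitude the frame (assumption)
    score: float  # assumed to be the entropy-distance score


def parse_orf_lines(lines: Iterable[str]) -> List[Orf]:
    """Keep only lines that look like ORF records; skip header lines."""
    orfs = []
    for line in lines:
        fields = line.split()
        # A record has exactly five fields and starts with a numeric ID.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        orfs.append(Orf(fields[0], int(fields[1]), int(fields[2]),
                        int(fields[3]), float(fields[4])))
    return orfs


def orf_length(orf: Orf, genome_length: int) -> int:
    """Span in bases between the two coordinates, inclusive.

    On a circular genome an ORF may wrap past the origin, in which case
    the naive span is non-positive and the genome length is added back.
    """
    if orf.frame > 0:
        span = orf.end - orf.start + 1
    else:
        span = orf.start - orf.end + 1
    if span <= 0:
        span += genome_length
    return span


# Lines copied from the example output in the help text.
sample = [
    "Sequence length = 40222",
    "Putative Genes:",
    "00001 40137 52 +2 0.892",
    "00002 1319 1095 -3 0.654",
    "00004 1953 2066 +3 1.078",
]

orfs = parse_orf_lines(sample)
for orf in orfs:
    print(orf.orf_id, orf.frame, orf_length(orf, 40222))
```

The first record wraps around the origin of the 40222 bp sequence, which is why the length calculation needs the genome length. If the tool is run on a linear sequence, the wrap-around branch should not be applied.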
