Mercurial > repos > peterjc > tmhmm_and_signalp
comparison tools/protein_analysis/tmhmm2.xml @ 0:a2eeeaa6f75e
Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
| author | peterjc |
|---|---|
| date | Tue, 07 Jun 2011 17:37:26 -0400 |
| parents | |
| children | 9a8a7f680dd6 |
| -1:000000000000 | 0:a2eeeaa6f75e |
|---|---|
| 1 <tool id="tmhmm2" name="TMHMM 2.0" version="0.0.1"> | |
| 2 <description>Find transmembrane domains in protein sequences</description> | |
| 3 <command interpreter="python"> | |
| 4 tmhmm2.py 8 $fasta_file $tabular_file | |
| 5 ##I want the number of threads to be a Galaxy config option... | |
| 6 </command> | |
| 7 <inputs> | |
| 8 <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/> | |
| 9 <!-- | |
| 10 <param name="version" type="select" display="radio" label="Model version"> | |
| 11 <option value="">Version 1 (old)</option> | |
| 12 <option value="" selected="True">Version 2 (default)</option> | |
| 13 </param> | |
| 14 --> | |
| 15 </inputs> | |
| 16 <outputs> | |
| 17 <data name="tabular_file" format="tabular" label="TMHMM results" /> | |
| 18 </outputs> | |
| 19 <requirements> | |
| 20 <requirement type="binary">tmhmm</requirement> | |
| 21 </requirements> | |
| 22 <tests> | |
| 23 <test> | |
| 24 <param name="fasta_file" value="four_human_proteins.fasta" ftype="fasta"/> | |
| 25 <output name="tabular_file" file="four_human_proteins.tmhmm2.tsv" ftype="tabular"/> | |
| 26 </test> | |
| 27 </tests> | |
| 28 <help> | |
| 29 | |
| 30 **What it does** | |
| 31 | |
| 32 This calls the TMHMM v2.0 tool for prediction of transmembrane (TM) helices in proteins using a hidden Markov model (HMM). | |
| 33 | |
| 34 The input is a FASTA file of protein sequences, and the output is tabular with six columns (one row per protein): | |
| 35 | |
| 36 1. Sequence identifier | |
| 37 2. Sequence length | |
| 38 3. Expected number of amino acids in TM helices (ExpAA). If this number is larger than 18 it is very likely to be a transmembrane protein (OR have a signal peptide). | |
| 39 4. Expected number of amino acids in TM helices in the first 60 amino acids of the protein (Exp60). If this number is more than a few, be aware that a predicted transmembrane helix in the N-term could be a signal peptide. | |
| 40 5. Number of transmembrane helices predicted by N-best. | |
| 41 6. Topology predicted by N-best (encoded as a string using o for outside and i for inside) | |
| 42 | |
| 43 Predicted TM segments in the N-terminal region sometimes turn out to be signal peptides. | |
| 44 | |
| 45 One of the most common mistakes by the program is to reverse the direction of proteins with one TM segment. | |
| 46 | |
| 47 Do not use the program to predict whether a non-membrane protein is cytoplasmic or not. | |
| 48 | |
| 49 **Notes** | |
| 50 | |
| 51 The raw output from TMHMM v2.0 looks like this (six columns tab separated): | |
| 52 | |
| 53 =================================== ======= =========== ============= ========= ============================= | |
| 54 gi|2781234|pdb|1JLY|B len=304 ExpAA=0.01 First60=0.00 PredHel=0 Topology=o | |
| 55 gi|4959044|gb|AAD34209.1|AF069992_1 len=600 ExpAA=0.00 First60=0.00 PredHel=0 Topology=o | |
| 56 gi|671626|emb|CAA85685.1| len=473 ExpAA=0.19 First60=0.00 PredHel=0 Topology=o | |
| 57 gi|3298468|dbj|BAA31520.1| len=107 ExpAA=59.37 First60=31.17 PredHel=3 Topology=o23-45i52-74o89-106i | |
| 58 =================================== ======= =========== ============= ========= ============================= | |
| 59 | |
| 60 In order to make it easier to use in Galaxy, the wrapper script simplifies this to remove the redundant tags, and instead adds a comment line at the top with the column names: | |
| 61 | |
| 62 =================================== === ===== ======= ======= ==================== | |
| 63 #ID len ExpAA First60 PredHel Topology | |
| 64 gi|2781234|pdb|1JLY|B 304 0.01 0.00 0 o | |
| 65 gi|4959044|gb|AAD34209.1|AF069992_1 600 0.00 0.00 0 o | |
| 66 gi|671626|emb|CAA85685.1| 473 0.19 0.00 0 o | |
| 67 gi|3298468|dbj|BAA31520.1| 107 59.37 31.17 3 o23-45i52-74o89-106i | |
| 68 =================================== === ===== ======= ======= ==================== | |
| 69 | |
| 70 **References** | |
| 71 | |
| 72 Krogh, Larsson, von Heijne, and Sonnhammer. | |
| 73 Predicting Transmembrane Protein Topology with a Hidden Markov Model: Application to Complete Genomes. | |
| 74 J. Mol. Biol. 305:567-580, 2001. | |
| 75 | |
| 76 Sonnhammer, von Heijne, and Krogh. | |
| 77 A hidden Markov model for predicting transmembrane helices in protein sequences. | |
| 78 In J. Glasgow et al., eds.: Proc. Sixth Int. Conf. on Intelligent Systems for Molecular Biology, pages 175-182. AAAI Press, 1998. | |
| 79 | |
| 80 </help> | |
| 81 </tool> |
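
The help text above describes how the wrapper strips the redundant `len=`, `ExpAA=`, `First60=`, `PredHel=` and `Topology=` tags from the raw TMHMM output and adds a header line. The wrapper script `tmhmm2.py` itself is not shown in this changeset view, so the following is only a minimal sketch of that transformation, assuming the raw lines are tab separated with exactly six columns as in the example. The function names `simplify_line`, `simplify` and `parse_topology` are hypothetical and are not taken from the real wrapper.

```python
import re

# Column names for the header comment line, as shown in the help text.
COLUMNS = ["#ID", "len", "ExpAA", "First60", "PredHel", "Topology"]
# Tags expected on raw columns 2 to 6 (assumption: always in this order).
TAGS = ["len=", "ExpAA=", "First60=", "PredHel=", "Topology="]


def simplify_line(raw_line):
    """Remove the 'tag=' prefixes from one raw TMHMM short-format line."""
    parts = raw_line.rstrip("\n").split("\t")
    if len(parts) != len(COLUMNS):
        raise ValueError("Expected %d columns, got %d" % (len(COLUMNS), len(parts)))
    values = [parts[0]]  # the sequence identifier carries no tag
    for tag, field in zip(TAGS, parts[1:]):
        if not field.startswith(tag):
            raise ValueError("Expected %r prefix in %r" % (tag, field))
        values.append(field[len(tag):])
    return "\t".join(values)


def simplify(raw_lines):
    """Yield the header line, then one simplified line per non-blank raw line."""
    yield "\t".join(COLUMNS)
    for line in raw_lines:
        if line.strip():
            yield simplify_line(line)


def parse_topology(topology):
    """Split a topology string into its starting side and helix ranges.

    Returns ('o' or 'i', [(start, end), ...]) with positions as integers.
    """
    helices = [(int(a), int(b)) for a, b in re.findall(r"(\d+)-(\d+)", topology)]
    return topology[0], helices
```

For the last example row, `parse_topology` reports a protein that starts outside and has three predicted helices, which matches its `PredHel` value of 3.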
