comparison tools/protein_analysis/predictnls.xml @ 0:beaa52cd2954 draft

Uploaded v0.0.4, first public release
author peterjc
date Mon, 23 Sep 2013 10:03:02 -0400
-1:000000000000 0:beaa52cd2954
1 <tool id="predictnls" name="PredictNLS" version="0.0.4">
2 <description>Find nuclear localization signals (NLSs) in protein sequences</description>
3 <command interpreter="python">
4 predictnls.py $fasta_file $tabular_file
5 </command>
6 <inputs>
7 <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/>
8 </inputs>
9 <outputs>
10 <data name="tabular_file" format="tabular" label="predictNLS results" />
11 </outputs>
12 <tests>
13 <test>
14 <param name="fasta_file" value="four_human_proteins.fasta"/>
15 <output name="tabular_file" file="four_human_proteins.predictnls.tabular"/>
16 </test>
17 </tests>
18 <requirements>
19 <requirement type="binary">predictnls</requirement>
20 </requirements>
21 <help>
22
23 **What it does**
24
25 This calls a Python re-implementation of the PredictNLS tool for prediction of
26 nuclear localization signals (NLSs), which works by looking for matches to
27 a known set of patterns (described using regular expressions).
28
29 The input is a FASTA file of protein sequences, and the output is tabular with
30 these columns (multiple rows per protein):
31
32 ====== ==========================================================================
33 Column Description
34 ------ --------------------------------------------------------------------------
35 1      Sequence identifier
36 2      Start of NLS
37 3      NLS sequence
38 4      NLS pattern (regular expression)
39 5      Number of reference proteins with this NLS
40 6      Percentage of reference proteins with this NLS that are nuclear localized
41 7      Comma-separated list of reference proteins
42 8      Comma-separated list of reference proteins' localizations
43 ====== ==========================================================================
44
45 If a sequence has no predicted NLS, then there is no line in the output file
46 for it. This is a simplification of the text-rich output from the command-line
47 tool, to give a tabular file suitable for use within Galaxy.
48
49 Information about potential DNA binding (shown in the original predictnls
50 tool) is not given.
51
52 **Localizations**
53
54 The following abbreviations are used (derived from SWISS-PROT):
55
56 ==== =======================
57 Abbr Localization
58 ---- -----------------------
59 cyt  Cytoplasm
60 pla  Chloroplast
61 ret  Endoplasmic reticulum
62 ext  Extracellular
63 gol  Golgi
64 lys  Lysosomal
65 mit  Mitochondria
66 nuc  Nuclear
67 oxi  Peroxisome
68 vac  Vacuolar
69 rip  Periplasmic
70 ==== =======================
71
72 **References**
73
74 Murat Cokol, Rajesh Nair, and Burkhard Rost.
75 Finding nuclear localization signals.
76 EMBO reports 1(5), 411–415, 2000
77 http://dx.doi.org/10.1093/embo-reports/kvd092
78
79 http://rostlab.org
80
81 </help>
82 </tool>
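
The help text above says the tool looks for matches to a known set of regular-expression patterns and reports one row per hit. The following is only a minimal sketch of that idea, not the code in predictnls.py: the two patterns are made up for illustration and are not taken from the PredictNLS pattern set, the function names are hypothetical, 1-based start positions are an assumption, and only the first four output columns (identifier, start, NLS sequence, pattern) are produced.

```python
import re

# Hypothetical patterns for illustration only; NOT the real PredictNLS set.
EXAMPLE_PATTERNS = (
    r"K[KR]{3}",   # a lysine followed by three basic residues
    r"RK.{2}KR",   # two basic pairs separated by any two residues
)


def find_nls(seq_id, sequence, patterns=EXAMPLE_PATTERNS):
    """Yield (identifier, start, matched text, pattern) for each match."""
    for pattern in patterns:
        # re.finditer reports non-overlapping matches, scanning left to right.
        for match in re.finditer(pattern, sequence):
            # Assumption: start positions are reported 1-based.
            yield (seq_id, match.start() + 1, match.group(), pattern)


def to_tabular(rows):
    """Format rows as tab-separated lines, one row per line."""
    return "".join("\t".join(str(value) for value in row) + "\n" for row in rows)


rows = list(find_nls("demo", "MAAKRRRLSTRKAAKRQ"))
print(to_tabular(rows), end="")
```

A sequence with no match yields no rows, which mirrors the behaviour described in the help text where such sequences have no line in the output file.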