diff tools/protein_analysis/predictnls.xml @ 0:beaa52cd2954 draft

Uploaded v0.0.4, first public release
author peterjc
date Mon, 23 Sep 2013 10:03:02 -0400
parents
children
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/protein_analysis/predictnls.xml	Mon Sep 23 10:03:02 2013 -0400
@@ -0,0 +1,82 @@
+<tool id="predictnls" name="PredictNLS" version="0.0.4">
+    <description>Find nuclear localization signals (NLSs) in protein sequences</description>
+    <command interpreter="python">
+      predictnls.py $fasta_file $tabular_file
+    </command>
+    <inputs>
+        <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/> 
+    </inputs>
+    <outputs>
+        <data name="tabular_file" format="tabular" label="predictNLS results" />
+    </outputs>
+    <tests>
+        <test>
+             <param name="fasta_file" value="four_human_proteins.fasta"/>
+             <output name="tabular_file" file="four_human_proteins.predictnls.tabular"/>
+        </test>
+    </tests>
+    <requirements>
+        <requirement type="binary">predictnls</requirement>
+    </requirements>
+    <help>
+    
+**What it does**
+
+This calls a Python re-implementation of the PredictNLS tool for prediction of
+nuclear localization signals (NLSs), which works by looking for matches to
+a known set of patterns (described using regular expressions).
+
+The input is a FASTA file of protein sequences, and the output is tabular with
+these columns (multiple rows per protein):
+
+====== ==========================================================================
+Column Description
+------ --------------------------------------------------------------------------
+     1 Sequence identifier
+     2 Start of NLS
+     3 NLS sequence
+     4 NLS pattern (regular expression)
+     5 Number of reference proteins with this NLS
+     6 Percentage of reference proteins with this NLS which are nuclear localized
+     7 Comma-separated list of reference proteins
+     8 Comma-separated list of reference proteins' localizations
+====== ==========================================================================
+
+If a sequence has no predicted NLS, then there is no line in the output file
+for it. This is a simplification of the text-rich output from the command-line
+tool, to give a tabular file suitable for use within Galaxy.
+
+Information about potential DNA binding (shown in the original predictnls
+tool) is not given.
+
+**Localizations**
+
+The following abbreviations are used (derived from SWISS-PROT):
+
+==== =======================
+Abbr Localization         
+---- -----------------------
+cyt  Cytoplasm
+pla  Chloroplast
+ret  Endoplasmic reticulum
+ext  Extracellular
+gol  Golgi
+lys  Lysosomal
+mit  Mitochondria
+nuc  Nuclear
+oxi  Peroxisome
+vac  Vacuolar
+rip  Periplasmic
+==== =======================
+
+**References**
+
+Murat Cokol, Rajesh Nair, and Burkhard Rost.
+Finding nuclear localization signals.
+EMBO reports 1(5), 411–415, 2000
+http://dx.doi.org/10.1093/embo-reports/kvd092
+
+http://rostlab.org
+
+    </help>
+</tool>
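
The eight-column output described in the help text could be consumed downstream roughly as follows. This is a minimal sketch, not part of the changeset. It assumes the file is tab-separated with exactly the eight columns listed above, and that any line starting with "#" is a header or comment to be skipped. The names NlsHit and read_predictnls_table, and the sample row, are invented for illustration only.

```python
import io
from collections import namedtuple

# Hypothetical record type; field names follow the column table in the help text.
NlsHit = namedtuple("NlsHit", [
    "identifier", "start", "nls_sequence", "pattern",
    "reference_count", "percent_nuclear",
    "reference_proteins", "reference_localizations",
])


def split_list(field):
    """Split a comma-separated field, returning [] for an empty field."""
    return field.split(",") if field else []


def read_predictnls_table(handle):
    """Parse tabular PredictNLS output into a list of NlsHit records.

    Assumption: tab-separated, eight columns, "#" lines are skipped.
    """
    hits = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 8:
            raise ValueError("Expected 8 columns, got %d" % len(fields))
        hits.append(NlsHit(
            identifier=fields[0],
            start=int(fields[1]),
            nls_sequence=fields[2],
            pattern=fields[3],
            reference_count=int(fields[4]),
            percent_nuclear=fields[5],  # kept as text; exact format not assumed
            reference_proteins=split_list(fields[6]),
            reference_localizations=split_list(fields[7]),
        ))
    return hits


# Invented sample row, for illustration only (not real PredictNLS output).
sample = io.StringIO("seq1\t5\tKKKRK\tK{3}RK\t2\t100\tP1,P2\tnuc,nuc\n")
hits = read_predictnls_table(sample)
```

Because a sequence without a predicted NLS produces no row, a caller that needs every input identifier would have to compare the parsed records against the original FASTA file.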