diff tools/protein_analysis/predictnls.xml @ 0:beaa52cd2954 draft
Uploaded v0.0.4, first public release
author: peterjc
date: Mon, 23 Sep 2013 10:03:02 -0400
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/protein_analysis/predictnls.xml Mon Sep 23 10:03:02 2013 -0400
@@ -0,0 +1,82 @@
+<tool id="predictnls" name="PredictNLS" version="0.0.4">
+    <description>Find nuclear localization signals (NLSs) in protein sequences</description>
+    <command interpreter="python">
+        predictnls.py $fasta_file $tabular_file
+    </command>
+    <inputs>
+        <param name="fasta_file" type="data" format="fasta" label="FASTA file of protein sequences"/>
+    </inputs>
+    <outputs>
+        <data name="tabular_file" format="tabular" label="predictNLS results" />
+    </outputs>
+    <tests>
+        <test>
+            <param name="fasta_file" value="four_human_proteins.fasta"/>
+            <output name="tabular_file" file="four_human_proteins.predictnls.tabular"/>
+        </test>
+    </tests>
+    <requirements>
+        <requirement type="binary">predictnls</requirement>
+    </requirements>
+    <help>
+
+**What it does**
+
+This calls a Python re-implementation of the PredictNLS tool for prediction of
+nuclear localization signals (NLSs), which works by looking for matches to
+a known set of patterns (described using regular expressions).
+
+The input is a FASTA file of protein sequences, and the output is tabular with
+these columns (multiple rows per protein):
+
+====== ==========================================================================
+Column Description
+------ --------------------------------------------------------------------------
+     1 Sequence identifier
+     2 Start of NLS
+     3 NLS sequence
+     4 NLS pattern (regular expression)
+     5 Number of reference proteins with this NLS
+     6 Percentage of reference proteins with this NLS which are nuclear localized
+     7 Comma separated list of reference proteins
+     8 Comma separated list of reference proteins' localizations
+====== ==========================================================================
+
+If a sequence has no predicted NLS, then there is no line in the output file
+for it. This is a simplification of the text rich output from the command line
+tool, to give a tabular file suitable for use within Galaxy.
+
+Information about potential DNA binding (shown in the original predictnls
+tool) is not given.
+
+**Localizations**
+
+The following abbreviations are used (derived from SWISS-PROT):
+
+==== =======================
+Abbr Localization
+---- -----------------------
+cyt  Cytoplasm
+pla  Chloroplast
+ret  Endoplasmic reticulum
+ext  Extracellular
+gol  Golgi
+lys  Lysosomal
+mit  Mitochondria
+nuc  Nuclear
+oxi  Peroxisome
+vac  Vacuolar
+rip  Periplasmic
+==== =======================
+
+**References**
+
+Murat Cokol, Rajesh Nair, and Burkhard Rost.
+Finding nuclear localization signals.
+EMBO reports 1(5), 411–415, 2000
+http://dx.doi.org/10.1093/embo-reports/kvd092
+
+http://rostlab.org
+
+    </help>
+</tool>
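
Note on the approach described in the help text: the tool is said to look for matches to a known set of regular-expression patterns and to write one tab-separated row per hit, with no row for sequences without a hit. The following is a minimal Python sketch of that idea, not the code of `predictnls.py` itself. The pattern in `EXAMPLE_PATTERNS` and the function names are hypothetical and chosen only for illustration; the real pattern set is not part of this changeset. Only columns 1 to 4 are produced, because columns 5 to 8 depend on reference-protein data that is not shown here, and reporting the start position as 1-based is an assumption.

```python
import re

# Hypothetical pattern for illustration only; it is NOT taken from the
# PredictNLS pattern set, which is not included in this changeset.
EXAMPLE_PATTERNS = ("K[KR].[KR]",)


def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    identifier, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if identifier is not None:
                yield identifier, "".join(chunks)
            # The identifier is the first word after ">" (empty if absent).
            identifier, chunks = (line[1:].split() or [""])[0], []
        elif line and identifier is not None:
            chunks.append(line)
    if identifier is not None:
        yield identifier, "".join(chunks)


def find_matches(identifier, sequence, patterns=EXAMPLE_PATTERNS):
    """Yield (identifier, start, matched text, pattern) for each regex hit.

    The start position is reported 1-based; this is an assumption about
    the convention, not something confirmed by the changeset.
    """
    for pattern in patterns:
        for match in re.finditer(pattern, sequence):
            yield identifier, match.start() + 1, match.group(), pattern


def to_tabular(fasta_text, patterns=EXAMPLE_PATTERNS):
    """Return tab-separated rows (columns 1 to 4 only), one per match.

    Sequences without any match contribute no row.
    """
    rows = []
    for identifier, sequence in parse_fasta(fasta_text):
        for row in find_matches(identifier, sequence, patterns):
            rows.append("\t".join(str(value) for value in row))
    return "\n".join(rows)


if __name__ == "__main__":
    demo = ">p1 demo protein\nMAPKKKRKVE\n>p2\nMAAAA\n"
    print(to_tabular(demo))
```

In this sketch `re.finditer` reports non-overlapping matches only; whether the actual tool also reports overlapping hits is not stated in the help text.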