comparison tools/protein_analysis/README.txt @ 20:a538e182fab3 draft

Uploaded v0.2.5 preview 4, adding Cock et al. 2013 citation information.
author peterjc
date Tue, 10 Sep 2013 08:55:19 -0400
parents
children 4cee8236c77b
19:4cd848c5590b 20:a538e182fab3
1 This package contains Galaxy wrappers for a selection of standalone command
2 line protein analysis tools:
3
4 * SignalP 3.0, TMHMM 2.0, Promoter 2.0 from the Center for Biological
5 Sequence Analysis at the Technical University of Denmark,
6 http://www.cbs.dtu.dk/cbs/
7
8 * WoLF PSORT v0.2 from http://wolfpsort.org/
9
10 * PSORTb v3 from http://www.psort.org/downloads/index.html
11
12 Also, the RXLR motif tool uses SignalP 3.0 and HMMER 2.3.2 internally.
13
14 To use these Galaxy wrappers you must first install the command line tools.
15 At the time of writing they are all free for academic use, or open source.
16
17 These wrappers are copyright 2010-2013 by Peter Cock, James Hutton Institute
18 (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
19 Contributions/revisions copyright 2011 Konrad Paszkiewicz. All rights reserved.
20 See the included LICENCE file for details (an MIT style open source licence).
21
22 The wrappers are available from the Galaxy Tool Shed
23 http://toolshed.g2.bx.psu.edu/view/peterjc/tmhmm_and_signalp
24
25 Citation
26 ========
27
28 If you use any of these Galaxy tools in work leading to a scientific
29 publication, in addition to citing the individual underlying tools, please cite:
30
31 Peter Cock, Bjoern Gruening, Konrad Paszkiewicz and Leighton Pritchard (2013).
32 Galaxy tools and workflows for sequence analysis with applications
33 in molecular plant pathology. PeerJ 1:e167
34 http://dx.doi.org/10.7717/peerj.167
35
36 Full reference information is included in the help text for each tool.
37
38
39 Requirements
40 ============
41
42 First install the command line tools for which you wish to use the wrappers:
43
44 1. Install the command line version of SignalP 3.0 and ensure "signalp" is
45 on the PATH, see: http://www.cbs.dtu.dk/services/SignalP/
46
47 2. Install the command line version of TMHMM 2.0 and ensure "tmhmm" is on
48 the PATH, see: http://www.cbs.dtu.dk/services/TMHMM/
49
50 3. Install the command line version of Promoter 2.0 and ensure "promoter" is
51 on the PATH, see: http://www.cbs.dtu.dk/services/Promoter
52
53 4. Install the WoLF PSORT v0.2 package, and ensure "runWolfPsortSummary"
54 is on the PATH (we use an extra wrapper script to change to the WoLF PSORT
55 directory, run runWolfPsortSummary, and then change back to the original
56 directory), see: http://wolfpsort.org/WoLFPSORT_package/version0.2/
57
58 5. Install hmmsearch from HMMER 2.3.2 (the last stable release of HMMER 2)
59 but put it on the path under the name hmmsearch2 (allowing it to co-exist
60 with HMMER 3), or edit rxlr_motifs.py accordingly.
61
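The extra wrapper script mentioned in item 4 is not reproduced in this file. As a rough illustration only of the "change directory, run, change back" idea, here is a minimal Python 3 sketch. The function name run_in_directory and the install path in the usage comment are hypothetical, and the sketch relies on the cwd argument of subprocess.run rather than an explicit change of directory, so it is not the author's script.

```python
import subprocess


def run_in_directory(command, directory):
    """Run command with its working directory set to directory.

    cwd= only affects the child process, so the caller's own working
    directory is unchanged afterwards and no explicit "cd back" is needed.
    Returns the child's standard output as text.
    """
    result = subprocess.run(
        command,
        cwd=directory,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Hypothetical usage: the install path is an assumption, and any arguments
# or input redirection that runWolfPsortSummary needs are omitted here.
# output = run_in_directory(["./runWolfPsortSummary"], "/opt/WoLFPSORT/bin")
```

Because the directory change is confined to the child process, a failure in the tool cannot leave the calling process stranded in the WoLF PSORT directory.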
62 Verify each of the tools is installed and working from the command line
63 (when logged in as the Galaxy user if appropriate).
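One way to do a first check is to confirm that each executable name listed above can be found on the PATH. The following is a minimal Python 3 sketch under that assumption. The helper name missing_tools is hypothetical, and finding a name on the PATH does not prove the tool itself runs correctly, so it complements rather than replaces running each tool by hand.

```python
import shutil

# Executable names taken from the requirements list above.
REQUIRED_TOOLS = [
    "signalp",
    "tmhmm",
    "promoter",
    "runWolfPsortSummary",
    "hmmsearch2",
]


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Run it as the same user that runs Galaxy, since the PATH may differ between accounts.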
64
65
66 Manual Installation
67 ===================
68
69 1. Create a folder tools/protein_analysis under your Galaxy installation.
70 This folder name is not critical and can be changed if desired, but you
71 must then update the paths used in tool_conf.xml to match.
72
73 2. Copy/move the following files (from this archive) there:
74
75 tmhmm2.xml (Galaxy tool definition)
76 tmhmm2.py (Python wrapper script)
77
78 signalp3.xml (Galaxy tool definition)
79 signalp3.py (Python wrapper script)
80
81 promoter2.xml (Galaxy tool definition)
82 promoter2.py (Python wrapper script)
83
84 psortb.xml (Galaxy tool definition)
85 psortb.py (Python wrapper script)
86
87 wolf_psort.xml (Galaxy tool definition)
88 wolf_psort.py (Python wrapper script)
89
90 rxlr_motifs.xml (Galaxy tool definition)
91 rxlr_motifs.py (Python script)
92
93 seq_analysis_utils.py (shared Python code)
94 LICENCE
95 README (this file)
96
97 3. Edit your Galaxy configuration file tool_conf.xml (to use the tools) AND
98 also tool_conf.xml.sample (to run the tests) to include the new tools
99 by adding:
100
101 <section name="Protein sequence analysis" id="protein_analysis">
102 <tool file="protein_analysis/tmhmm2.xml" />
103 <tool file="protein_analysis/signalp3.xml" />
104 <tool file="protein_analysis/psortb.xml" />
105 <tool file="protein_analysis/wolf_psort.xml" />
106 <tool file="protein_analysis/rxlr_motifs.xml" />
107 </section>
108 <section name="Nucleotide sequence analysis" id="nucleotide_analysis">
109 <tool file="protein_analysis/promoter2.xml" />
110 </section>
111
112 Leave out the lines for any tools you do not wish to use in Galaxy.
113
114 4. Copy/move the test-data files (from this archive) to Galaxy's
115 subfolder test-data.
116
117 5. Run the Galaxy functional tests for these new wrappers with:
118
119 ./run_functional_tests.sh -id tmhmm2
120 ./run_functional_tests.sh -id signalp3
121 ./run_functional_tests.sh -id Psortb
122 ./run_functional_tests.sh -id rxlr_motifs
123
124 Alternatively, this should work (assuming you left the name and id as shown in
125 the XML file tool_conf.xml.sample):
126
127 ./run_functional_tests.sh -sid Protein_sequence_analysis-protein_analysis
128
129 To check the section ID expected, use ./run_functional_tests.sh -list
130
131 6. Restart Galaxy and check the new tools are shown and work.
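If the tools do not appear after restarting, it can help to confirm what tool_conf.xml actually lists for each section from step 3. The sketch below is a minimal Python 3 example, not part of the wrappers. The function name tool_files_in_section is hypothetical, and the example input assumes the usual toolbox root element of Galaxy's tool_conf.xml.

```python
import xml.etree.ElementTree as ET


def tool_files_in_section(tool_conf_xml, section_id):
    """Return the file attribute of each <tool> in the given <section>."""
    root = ET.fromstring(tool_conf_xml)
    for section in root.iter("section"):
        if section.get("id") == section_id:
            return [tool.get("file") for tool in section.findall("tool")]
    return []  # no section with that id


# Shortened example input modelled on the snippet in step 3.
EXAMPLE = """<toolbox>
  <section name="Protein sequence analysis" id="protein_analysis">
    <tool file="protein_analysis/tmhmm2.xml" />
    <tool file="protein_analysis/signalp3.xml" />
  </section>
  <section name="Nucleotide sequence analysis" id="nucleotide_analysis">
    <tool file="protein_analysis/promoter2.xml" />
  </section>
</toolbox>"""

print(tool_files_in_section(EXAMPLE, "protein_analysis"))
```

To inspect a real installation, read the contents of tool_conf.xml into a string and pass that in place of EXAMPLE.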
132
133
134 History
135 =======
136
137 v0.0.1 - Initial release
138 v0.0.2 - Corrected some typos in the help text
139 - Renamed test output file to use Galaxy convention of *.tabular
140 v0.0.3 - Check for tmhmm2 silent failures (no output)
141 - Additional unit tests
142 v0.0.4 - Ignore comment lines in tmhmm2 output.
143 v0.0.5 - Explicitly request tmhmm short output (may not be the default)
144 v0.0.6 - Improvement to how sub-jobs are run (should be faster)
145 v0.0.7 - Change SignalP default truncation from 60 to 70 to match the
146 SignalP webservice.
147 v0.0.8 - Added WoLF PSORT wrapper to the suite.
148 v0.0.9 - Added our RXLR motifs tool to the suite.
149 v0.1.0 - Added Promoter 2.0 wrapper (similar to SignalP & TMHMM wrappers)
150 - Support Galaxy's <parallelism> tag for SignalP, TMHMM & Promoter
151 v0.1.1 - Fixed an error in the header of the tabular output from Promoter
152 v0.1.2 - Use the new <stdio> settings in the XML wrappers to catch errors
153 - Use SGE style $NSLOTS for thread count (otherwise default to 4)
154 v0.1.3 - Added missing file whisson_et_al_rxlr_eer_cropped.hmm to Tool Shed
155 v0.2.0 - Added PSORTb wrapper to the suite, based on earlier work
156 contributed by Konrad Paszkiewicz.
157 v0.2.1 - Use a script to create the Tool Shed tar-ball (removed some stray
158 files accidentally included previously via a wildcard).
159 v0.2.2 - Include missing test files.
160 v0.2.3 - Added unit tests for WoLF PSORT.
161 v0.2.4 - Added unit tests for Promoter 2
162 v0.2.5 - Link to Tool Shed added to help text and this documentation.
163 - More unit tests.
164 - Fixed bug with RXLR tool and empty FASTA files.
165 - Fixed typo in the RXLR tool help text.
166 - Updated citation information (Cock et al. 2013).
167
168
169 Developers
170 ==========
171
172 This script and other tools are being developed on the following hg branch:
173 http://bitbucket.org/peterjc/galaxy-central/src/tools
174
175 This incorporates the previously used hg branch:
176 http://bitbucket.org/peterjc/galaxy-central/src/seq_analysis
177
178 To make the tarball for the Galaxy Tool Shed (http://community.g2.bx.psu.edu/),
179 run the following command from the Galaxy root folder:
180
181 $ ./tools/protein_analysis/make_tmhmm_and_signalp.sh
182
183 This simplifies ensuring a consistent set of files is bundled each time,
184 including all the relevant test files.
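The contents of make_tmhmm_and_signalp.sh are not shown here. Purely as an illustration of bundling from an explicit file list instead of a wildcard, here is a minimal Python 3 sketch. The function name make_tarball and the archive name in the usage comment are hypothetical, and this is not the author's shell script.

```python
import os
import tarfile


def make_tarball(archive_path, root_dir, relative_paths):
    """Bundle exactly the listed files (relative to root_dir) into a .tar.gz.

    Raises FileNotFoundError before writing anything if a listed file is
    missing, so an incomplete archive is never produced.
    """
    missing = [
        path
        for path in relative_paths
        if not os.path.isfile(os.path.join(root_dir, path))
    ]
    if missing:
        raise FileNotFoundError("Missing files: " + ", ".join(missing))
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in relative_paths:
            # arcname keeps the path inside the archive relative to root_dir.
            archive.add(os.path.join(root_dir, path), arcname=path)
    return archive_path


# Hypothetical usage from the Galaxy root folder (archive name assumed):
# make_tarball("tmhmm_and_signalp.tar.gz", ".", [
#     "tools/protein_analysis/tmhmm2.xml",
#     "tools/protein_analysis/tmhmm2.py",
#     "tools/protein_analysis/seq_analysis_utils.py",
# ])
```

Listing every file explicitly means stray files are never picked up by accident, and a forgotten test file shows up as an error rather than a silently incomplete upload.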