annotate README.txt @ 2:5b930e77b1f3

Better readability of Dockerfile, fix editing of userid for Dockerfile in DockerToolFactory.py.
author mvdbeek
date Wed, 03 Dec 2014 00:26:43 +0100
parents 7e0392d4531c
children
rev   line source
0
7e0392d4531c Initial Commit.
m.vandenbeek@gmail.com
parents:
diff changeset
# WARNING before you start
# Install this tool for test purposes only
# Please NEVER install it on a public or production instance
# Updated August 8, 2014 to fix bugs reported by Marius van den Beek

Please report bugs concerning Docker to m.vandenbeek at gmail . com
or at https://bitbucket.org/mvdbeek/dockertoolfactory.


*Installation instructions*

This is a fork of the ToolFactory that uses Docker to sandbox the generated script.
The system user under which Galaxy tools are executed must therefore be able to run Docker. On Ubuntu you can do this by
adding your galaxy user to the docker group (http://askubuntu.com/questions/477551/how-can-i-use-docker-without-sudo).
Here is the short form for installing Docker from the official Docker Ubuntu Trusty repository:

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9
sudo sh -c "echo deb https://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
sudo apt-get update
sudo apt-get install lxc-docker
sudo gpasswd -a galaxy docker
sudo service docker restart

The Galaxy process may also need to be restarted afterwards.
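
To check that the steps above took effect, something like the following Python sketch could be run as the galaxy user.
It is not part of this tool. The function names are made up for illustration, and it assumes a Unix system with
Python 3.7 or later. Note that the group check only sees supplementary group members, so a user whose primary group
is docker would not be reported.

```python
import grp
import shutil
import subprocess


def in_group(user, group):
    """Return True if `user` is listed as a supplementary member of `group`."""
    try:
        return user in grp.getgrnam(group).gr_mem
    except KeyError:
        # The group does not exist on this machine.
        return False


def docker_usable():
    """Return True if the docker client is on the PATH and can reach the daemon."""
    if shutil.which("docker") is None:
        return False
    try:
        # "docker info" fails when the current user may not talk to the daemon.
        result = subprocess.run(["docker", "info"], capture_output=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("galaxy in docker group:", in_group("galaxy", "docker"))
    print("docker usable by this user:", docker_usable())
```

If the second line prints False right after running gpasswd, the galaxy user probably needs a fresh login session
before the new group membership is picked up.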

Note that this could cause severe security problems if untrusted users can become this user.
If you want to use this tool, read and understand the following article:
https://docs.docker.com/articles/security/#docker-daemon-attack-surface

Work is ongoing; some important features are still missing, such as the ability to manage containers and
to limit resource usage.

This is an alpha-stage, potentially dangerous tool.


Please cite:
http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref
if you use this tool in your published work.

*Short Story*

This is an unusual Galaxy tool that exposes unrestricted and therefore extremely dangerous
scripting to designated administrative users of a Galaxy server, allowing them to run scripts
in R, python, sh and perl over a single input data set, writing a single new data set as output.

In addition, this tool optionally generates very simple new Galaxy tools that effectively
freeze the supplied script into a new, ordinary Galaxy tool that runs it over one or more input files,
working just like any other Galaxy tool for your users.

To use the ToolFactory, you should have prepared a script to paste into a text box,
and a small test input example ready to select from your history to test your new script.
There is an example in each scripting language on the Tool Factory form. You can just
cut and paste these to try it out - remember to select the right interpreter please. You'll
also need to create a small test data set using the Galaxy history add new data tool.

If the script fails somehow, use the "redo" button on the tool output in your history to
recreate the form complete with broken script. Fix the bug and execute again. Rinse, wash, repeat.

Once the script runs successfully, a new Galaxy tool that runs your script can be generated.
Select the "generate" option and supply some help text and names. The new tool will be
generated in the form of a new Galaxy datatype - toolshed.gz - as the name suggests,
it's an archive ready to upload to a Galaxy ToolShed as a new tool repository.

Once it's in a ToolShed, it can be installed into any local Galaxy server from
the server administrative interface.

Once the new tool is installed, local users can run it - each time, the script that was supplied
when it was built will be executed with the input chosen from the user's history. In other words,
the tools you generate with the ToolFactory run just like any other Galaxy tool,
but run your script every time.

Tool factory tools are perfect for workflow components. One input, one output, no variables.

*Reasons to read further*

If you use Galaxy to support your research;

You and fellow users are sometimes forced to take data out of Galaxy, process it with ugly
little perl/awk/sed/R... scripts and put it back;

You do this when you can't do some transformation in Galaxy (the 90/10 rule);

You don't have enough developer resources for wrapping dozens of even relatively simple tools;

Your research and your institution would be far better off if those feral scripts were all tucked
safely in your local toolshed and Galaxy histories.

*The good news* If it can be trivially scripted, it can be running safely in your
local Galaxy via your own local toolshed in a few minutes - with functional tests.


*Value proposition* The ToolFactory allows Galaxy to efficiently take over most of your lab's
dark script matter, making it reproducible in Galaxy and shareable through the ToolShed.

That's what this tool does. You paste a simple script and the tool returns
a new, real Galaxy tool, ready to be installed from the local toolshed to local servers.
Scripts can be wrapped and online literally within minutes.

*To fully and safely exploit the awesome power* of this tool, Galaxy and the ToolShed,
you should be a developer installing this tool on a private/personal/scratch local instance where you
are an admin_user. Then, if you break it, you get to keep all the pieces -
see https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

** Installation **
This is a Galaxy tool. You can install it most conveniently using the administrative "Search and browse tool sheds" link.
Find the Galaxy Test toolshed (not main) and search for the toolfactory repository.
Open it, review the code and select the option to install it.

If you can't get the tool that way, the xml and py files here need to be copied into a new tools
subdirectory such as tools/toolfactory. Your tool_conf.xml needs a new entry pointing to the xml
file - something like::

    <section name="Tool building tools" id="toolbuilders">
        <tool file="toolfactory/rgToolFactory.xml"/>
    </section>
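
If you would rather script that edit than make it by hand, a minimal Python sketch follows. It is an illustration only,
not something shipped with this tool: the function name is hypothetical, and it assumes the section can simply be
appended to the root element of tool_conf.xml. ElementTree drops XML comments when parsing, so keep a backup
of the original file.

```python
import xml.etree.ElementTree as ET


def add_toolfactory_section(tool_conf_xml):
    """Return tool_conf XML text with the tool factory section appended once."""
    root = ET.fromstring(tool_conf_xml)
    for section in root.findall("section"):
        if section.get("id") == "toolbuilders":
            # The section is already present, so leave the text untouched.
            return tool_conf_xml
    section = ET.SubElement(
        root, "section", {"name": "Tool building tools", "id": "toolbuilders"}
    )
    ET.SubElement(section, "tool", {"file": "toolfactory/rgToolFactory.xml"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # A tiny stand-in for a real tool_conf.xml.
    print(add_toolfactory_section("<toolbox></toolbox>"))
```

Running the function a second time on its own output returns the text unchanged, so it is safe to repeat.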

If not already there (I just added it to datatypes_conf.xml.sample), please add:
<datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary" mimetype="multipart/x-gzip" subclass="True" />
to your local datatypes_conf.xml.

Ensure that html sanitization is set to False and uncommented in universe_wsgi.ini

You'll have to restart the server for the new tool to be available.

Of course, R, python, perl etc. are needed on your path if you want to test scripts using those interpreters.
Adding new ones to this tool code should be easy enough. Please make suggestions as bitbucket issues and code.
The HTML file code automatically shrinks R's bloated pdfs, and depends on ghostscript. The thumbnails require imagemagick.

* Restricted execution *
The new tool factory tool will then be usable ONLY by admin users - people with IDs in admin_users in universe_wsgi.ini
**Yes, that's right. ONLY admin_users can run this tool** Think about it for a moment. If allowed to run any
arbitrary script on your Galaxy server, the only thing that would impede a miscreant bent on destroying all your
Galaxy data would probably be lack of appropriate technical skills.
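
To see who would be allowed to run the tool on a given server, the admin_users setting can be read with a few lines
of Python. This is a sketch under assumptions, not the check Galaxy itself performs: it assumes admin_users is a
comma separated list inside an [app:main] section of universe_wsgi.ini, and the function names are made up for
illustration.

```python
import configparser


def admin_users(ini_text, section="app:main"):
    """Return the set of entries listed in admin_users (empty if unset)."""
    # Interpolation is disabled so values containing % are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    raw = parser.get(section, "admin_users", fallback="")
    return {name.strip() for name in raw.split(",") if name.strip()}


def may_run_toolfactory(user_id, ini_text):
    """Return True only if user_id appears in admin_users."""
    return user_id in admin_users(ini_text)


if __name__ == "__main__":
    example = "[app:main]\nadmin_users = alice@example.org, bob@example.org\n"
    print(sorted(admin_users(example)))
    print(may_run_toolfactory("mallory@example.org", example))
```

A commented out admin_users line yields an empty set, which means nobody could run the tool.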

*What it does* This is a tool factory for simple scripts in python, R and perl currently.
Functional tests are automatically generated. How cool is that.

LIMITED to simple scripts that read one input from the history.
Optionally can write one new history dataset,
and optionally collect any number of outputs into links on an autogenerated HTML
index page for the user to navigate - useful if the script writes images and output files - pdf outputs
are shown as thumbnails and R's bloated pdfs are shrunk with ghostscript, so ghostscript and imagemagick need to
be available.

Generated tools can be edited and enhanced like any Galaxy tool, so start small and build up since
a generated script gets you a serious leg up to a more complex one.

*What you do* You paste and run your script,
you fix the syntax errors and eventually it runs.
You can use the redo button and edit the script before
trying to rerun it as you debug - it works pretty well.

Once the script works on some test data, you can
generate a toolshed compatible gzip file
containing your script ready to run as an ordinary Galaxy tool in a
repository on your local toolshed. That means safe and largely automated installation in any
production Galaxy configured to use your toolshed.

*Generated tool Security* Once you install a generated tool, it's just
another tool - assuming the script is safe. Generated tools just run normally and their users cannot do anything unusually insecure,
but please, practice safe toolshed.
Read the fucking code before you install any tool.
Especially this one - it is really scary.
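
One way to read the code before installing is to look inside the archive without unpacking it. The Python sketch
below assumes the toolshed.gz file is a gzipped tar archive, which this README does not state explicitly. The
function names and the file name in the usage note are hypothetical.

```python
import io
import tarfile


def list_archive(fileobj):
    """Return the member names of a gzipped tar archive without extracting it."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return archive.getnames()


def read_member(fileobj, name):
    """Return the text of one archive member so it can be reviewed."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        member = archive.extractfile(name)
        if member is None:
            # Directories and special files have no content to show.
            return ""
        return member.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Build a tiny archive in memory so the sketch runs without any files.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        payload = b"print('hello')\n"
        info = tarfile.TarInfo("demo/demo.py")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    print(list_archive(buffer))
```

For a real download, open the file in binary mode first, for example open("mytool.toolshed.gz", "rb"), and pass
the resulting file object to list_archive.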

If you opt for an HTML output, you get all the script outputs arranged
as a single HTML history item - all output files are linked, with thumbnails for all the pdfs.
Ugly but really inexpensive.

Patches and suggestions welcome as bitbucket issues please?

long route to June 2012 product
derived from an integrated script model
called rgBaseScriptWrapper.py
Note to the unwary:
This tool allows arbitrary scripting on your Galaxy as the Galaxy user
There is nothing stopping a malicious user doing whatever they choose
Extremely dangerous!!
Totally insecure. So, trusted users only




copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012

all rights reserved
Licensed under the LGPL if you want to improve it, feel free https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

Material for our more enthusiastic and voracious readers continues below - we salute you.

**Motivation** Simple transformation, filtering or reporting scripts get written, run and lost every day in most busy labs
- even ours where Galaxy is in use. This 'dark script matter' is pervasive and generally not reproducible.

**Benefits** For our group, this allows Galaxy to fill that important dark script gap - all those "small" bioinformatics
tasks. Once a user has a working R (or python or perl) script that does something Galaxy cannot currently do (eg transpose a
tabular file) and takes parameters the way Galaxy supplies them (see example below), they:

1. Install the tool factory on a personal private instance

2. Upload a small test data set

3. Paste the script into the 'script' text box and iteratively run the insecure tool on test data until it works right -
there is absolutely no reason to do this anywhere other than on a personal private instance.

4. Once it works right, set the 'Generate toolshed gzip' option and run it again.

5. A toolshed style gzip appears ready to upload and install like any other Toolshed entry.

6. Upload the new tool to the toolshed

7. Ask the local admin to check the new tool to confirm it's not evil and install it in the local production galaxy

**Simple examples on the tool form**

A simple Rscript "filter" showing how the command line parameters can be handled, takes an input file,
does something (transpose in this case) and writes the results to a new tabular file::

    # transpose a tabular input file and write as a tabular output file
    ourargs = commandArgs(TRUE)
    inf = ourargs[1]
    outf = ourargs[2]
    inp = read.table(inf,head=F,row.names=NULL,sep='\t')
    outp = t(inp)
    write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F)

Calculate a multiple test adjusted p value from a column of p values - for this script to be useful,
it needs the right column for the input to be specified in the code for the
given input file type(s) specified when the tool is generated ::

    # use p.adjust - assumes a HEADER row and column 1 - please fix for any real use
    column = 1 # adjust if necessary for some other kind of input
    fdrmeth = 'BH'
    ourargs = commandArgs(TRUE)
    inf = ourargs[1]
    outf = ourargs[2]
    inp = read.table(inf,head=T,row.names=NULL,sep='\t')
    p = inp[,column]
    q = p.adjust(p,method=fdrmeth)
    newval = paste(fdrmeth,'p-value',sep='_')
    q = data.frame(q)
    names(q) = newval
    outp = cbind(inp,newval=q)
    write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=T)
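
For readers curious what the 'BH' method computes, here is a small Python sketch of the Benjamini-Hochberg
adjustment. It is a plain reimplementation for illustration, not code from this tool. The function name is made up,
and unlike p.adjust it does no handling of missing values.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    n = len(pvalues)
    # Indices of the p values, ordered from the largest p value to the smallest.
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this p value when sorted ascending
        # Scale by n / rank, then keep the running minimum so the adjusted
        # values never increase as the raw p values get smaller.
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Each raw p value is multiplied by the number of tests divided by its rank, and the running minimum keeps the
adjusted values in the same order as the raw ones while capping them at 1.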
7e0392d4531c Initial Commit.
m.vandenbeek@gmail.com
parents:
diff changeset
243
7e0392d4531c Initial Commit.
m.vandenbeek@gmail.com
parents:
diff changeset
244
7e0392d4531c Initial Commit.
m.vandenbeek@gmail.com
parents:
diff changeset
245
7e0392d4531c Initial Commit.
m.vandenbeek@gmail.com
parents:
diff changeset
Another Rscript example, this time without any input file: it generates a random heatmap PDF. The option to create an HTML output file must be
turned on for this to work. The heatmap is presented as a thumbnail linked to the PDF in the resulting HTML page::

 # note this script takes NO input or output because it generates random data
 foo = data.frame(a=runif(100),b=runif(100),c=runif(100),d=runif(100),e=runif(100),f=runif(100))
 bar = as.matrix(foo)
 pdf( "heattest.pdf" )
 heatmap(bar,main='Random Heatmap')
 dev.off()

A Python example that reverses the order of the columns in each row of a tabular file. You'll need to remove the leading spaces for this to work if cut
and pasted into the script box. Note that you can already do this in Galaxy by setting up the cut columns tool with the
correct number of columns in reverse order, but this script works for any number of columns, so it is completely generic::

 # reverse order of columns in a tabular file
 import sys
 inp = sys.argv[1]
 outp = sys.argv[2]
 i = open(inp,'r')
 o = open(outp,'w')
 for row in i:
     # strip only the line ending so that empty trailing columns are kept
     rs = row.rstrip('\r\n').split('\t')
     rs.reverse()
     o.write('\t'.join(rs))
     o.write('\n')
 i.close()
 o.close()

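To check the row-reversal logic without running it through Galaxy, the same step can be sketched as a small function applied to a list of lines. The name `reverse_columns` and the sample rows are hypothetical, and the sketch strips only the line ending, so that empty trailing columns survive.

```python
def reverse_columns(lines):
    """Yield each tab-separated line with its fields in reverse order."""
    for line in lines:
        # Remove only the line ending; a trailing tab marks an empty column.
        fields = line.rstrip('\r\n').split('\t')
        fields.reverse()
        yield '\t'.join(fields) + '\n'


# Made-up rows: the second one ends with an empty column.
rows = ['a\tb\tc\n', '1\t2\t\n']
result = list(reverse_columns(rows))
print(result)
```

The second sample row shows why the bare line ending is stripped rather than all trailing whitespace: its empty last column becomes an empty first column instead of disappearing.
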
**Galaxy as an IDE for developing API scripts**
If you need to develop Galaxy API scripts and you like to live dangerously, please read on.

**Galaxy as an IDE?**
Amazingly enough, blend-lib API scripts run perfectly well *inside* Galaxy when pasted into a Tool Factory form. No need to generate a new tool. Galaxy+Tool_Factory = IDE. I think we need a new t-shirt. Seriously, it is actually quite usable.

**Why bother - what's wrong with Eclipse?**
Nothing. But compared with developing API scripts in the usual way outside Galaxy, you get persistence and other framework benefits plus, at absolutely no extra charge, a ginormous security problem if you share the history or any outputs, because they contain the API script with its key. Development servers only, please!

**Workflow**
Fire up the Tool Factory in Galaxy.

Leave the input box empty, set the interpreter to python, then paste and run an API script, e.g. the working example below (substitute the url and key).

It took me a few iterations to develop the example below because I know almost nothing about the API. I started with very simple code from one of the samples. After each run, the (edited) API script is conveniently recreated using the redo button on the history output item, so each successive version of the developing API script you run is persisted, ready to be edited and rerun easily. It is *very* handy to be able to add a line of code to the script and run it, then view the output to (e.g.) inspect dicts returned by API calls and so move progressively deeper.

Give the below a whirl on a private clone (install the tool factory from the main toolshed) and try adding complexity with a few rerun/edit/rerun cycles.

Example Tool Factory API script::

 import sys
 from blend.galaxy import GalaxyInstance
 # substitute the url and API key of your own development server
 ourGal = 'http://x.x.x.x:xxxx'
 ourKey = 'xxx'
 gi = GalaxyInstance(ourGal, key=ourKey)
 libs = gi.libraries.get_libraries()
 res = []
 # libs looks like
 # u'url': u'/galaxy/api/libraries/441d8112651dc2f3', u'id': u'441d8112651dc2f3', u'name':.... u'Demonstration sample RNA data',
 for lib in libs:
     res.append('%s:\n' % lib['name'])
     res.append(str(gi.libraries.show_library(lib['id'],contents=True)))
 # the second command line argument is the output file
 outf=open(sys.argv[2],'w')
 outf.write('\n'.join(res))
 outf.close()

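When inspecting the dicts returned by API calls, the plain `str()` output used above can be hard to read once the structures get nested. A minimal sketch of an alternative, using only the standard library `pprint` module: the helper name `format_result` and the sample dict are invented for illustration and stand in for whatever an API call actually returns.

```python
from pprint import pformat


def format_result(label, result):
    """Return a labelled, pretty-printed text block for one API result."""
    return '%s:\n%s\n' % (label, pformat(result, width=80))


# Stand-in for a dict returned by an API call; the values are made up.
sample = {
    'id': 'abc123',
    'name': 'Demo library',
    'url': '/api/libraries/abc123',
}
report = format_result(sample['name'], sample)
print(report)
```

In an API script, the returned string could be appended to the `res` list in place of the `str(...)` call, so the history output shows one readable block per result.
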
**Attribution**
Creating re-usable tools from scripts: The Galaxy Tool Factory
Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team
Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573

http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref

**Licensing**
Copyright Ross Lazarus 2010
ross lazarus at g mail period com

All rights reserved.

Licensed under the LGPL

**Obligatory screenshot**

http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png