# HG changeset patch
# User peterjc
# Date 1414770203 14400
# Node ID d0de6862cda19966557ef8f49c4e9fa2d50d41f5
# Parent 148eceb80cbb7597620dd53dc353ad54eb8ed5f9
Uploaded v0.1.01, embed citation info, GI and SeqID filters.

diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/README.rst
--- a/tools/ncbi_blast_plus/README.rst Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/README.rst Fri Oct 31 11:43:23 2014 -0400
@@ -1,10 +1,13 @@
 Galaxy wrappers for NCBI BLAST+ suite
 =====================================

-These wrappers are copyright 2010-2013 by Peter Cock (The James Hutton Institute,
-UK) and additional contributors. All rights reserved. See the licence text below.
+These wrappers are copyright 2010-2014 by Peter Cock (The James Hutton Institute,
+UK) and additional contributors including Edward Kirton, John Chilton,
+Nicola Soranzo, Jim Johnson, and Bjoern Gruening.

-Currently tested with NCBI BLAST 2.2.28+ (i.e. version 2.2.28 of BLAST+),
+See the licence text below.
+
+Currently tested with NCBI BLAST 2.2.29+ (i.e. version 2.2.29 of BLAST+),
 and does not work with the NCBI 'legacy' BLAST suite (e.g. ``blastall``).

 Note that these wrappers (and the associated datatypes) were originally
@@ -52,8 +55,7 @@
 You will also need to install ``blast_datatypes`` from the Tool Shed. This
-defines the BLAST XML file format (``blastxml``) and protein and nucleotide
-BLAST databases composite file formats (``blastdbp`` and ``blastdbn``):
+defines the BLAST XML file format (``blastxml``), BLAST databases, etc:

 * http://toolshed.g2.bx.psu.edu/view/devteam/blast_datatypes
@@ -62,7 +64,7 @@
 files.

 You must install the NCBI BLAST+ standalone tools somewhere on the system
-path. Currently the unit tests are written using "BLAST 2.2.29+".
+path. Currently the unit tests are written using BLAST 2.2.29+.
 Run the functional tests (adjusting the section identifier to match your
 ``tool_conf.xml.sample`` file)::
@@ -73,10 +75,10 @@
 =============

 You must tell Galaxy about any system level BLAST databases using configuration
-files blastdb.loc (nucleotide databases like NT) and blastdb_p.loc (protein
-databases like NR), and blastdb_d.loc (protein domain databases like CDD or
-SMART) which are located in the tool-data/ folder. Sample files are included
-which explain the tab-based format to use.
+files ``blastdb.loc`` (nucleotide databases like NT) and ``blastdb_p.loc``
+(protein databases like NR), and ``blastdb_d.loc`` (protein domain databases
+like CDD or SMART) which are located in the ``tool-data/`` folder. Sample
+files are included which explain the tab-based format to use.

 You can download the NCBI provided databases as tar-balls from here:
@@ -130,7 +132,7 @@
           get one of the missing columns like query or subject length)
 v0.0.18 - Defensive quoting of filenames in case of spaces (where possible,
           BLAST+ handling of some multi-file arguments is problematic).
-v0.0.19 - Added wrappers for rpsblast and rpstblastn, and new blastdb_d.loc
+v0.0.19 - Added wrappers for rpsblast and rpstblastn, and new ``blastdb_d.loc``
           for the domain databases they use (e.g. CDD, PFAM or SMART).
         - Correct case of exception regular expression (for error handling
           fall-back in case the return code is not set properly).
@@ -177,6 +179,8 @@
           (based on contribution from Bjoern Gruening).
         - The RPS-BLAST and RPS-TBLASTN wrappers support using a protein
           domain database from the user's history.
+        - Tool definitions now embed citation information (by John Chilton).
+        - BLAST tools support GI and SeqID filters (added by Bjoern Gruening).
 ======= ======================================================================

diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/blastxml_to_tabular.py
--- a/tools/ncbi_blast_plus/blastxml_to_tabular.py Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/blastxml_to_tabular.py Fri Oct 31 11:43:23 2014 -0400
@@ -66,7 +66,7 @@
 from optparse import OptionParser

 if "-v" in sys.argv or "--version" in sys.argv:
-    print "v0.1.00"
+    print "v0.1.01"
     sys.exit(0)

 if sys.version_info[:2] >= ( 2, 5 ):
@@ -85,7 +85,20 @@
 if len(sys.argv) == 4 and sys.argv[3] in ["std", "x22", "ext"]:
     #False positive if user really has a BLAST XML file called 'std' or 'ext'...
-    stop_err("ERROR: The script API has changed, sorry.")
+    stop_err("""ERROR: The script API has changed, sorry.
+
+Instead of the old style:
+
+$ python blastxml_to_tabular.py input.xml output.tabular std
+
+Please use:
+
+$ python blastxml_to_tabular.py -o output.tabular -c std input.xml
+
+For more information, use:
+
+$ python blastxml_to_tabular.py -h
+""")

 usage = """usage: %prog [options] blastxml[,...]
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/blastxml_to_tabular.xml
--- a/tools/ncbi_blast_plus/blastxml_to_tabular.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/blastxml_to_tabular.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Convert BLAST XML output to tabular
 blastxml_to_tabular.py --version
@@ -209,4 +209,7 @@
 This wrapper is available to install into other Galaxy Instances via the Galaxy
 Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+
+ 10.7717/peerj.167
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_blastdbcmd_info.xml
--- a/tools/ncbi_blast_plus/ncbi_blastdbcmd_info.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastdbcmd_info.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Show BLAST database information from blastdbcmd
 blastdbcmd
@@ -32,4 +32,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastdbcmd_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Extract sequence(s) from BLAST database
 blastdbcmd
@@ -104,4 +104,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastn_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Search nucleotide database with nucleotide query sequence(s)
@@ -24,6 +24,7 @@
 -perc_identity $adv_opts.identity_cutoff
 #end if
 $adv_opts.ungapped
+@ADV_ID_LIST_FILTER@
 ## End of advanced options:
 #end if
@@ -62,6 +63,7 @@
+
@@ -128,4 +130,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastp_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Search protein database with protein query sequence(s)
@@ -22,6 +22,7 @@
 @ADVANCED_OPTIONS@
 ##Ungapped disabled for now - see comments below
 ##$adv_opts.ungapped
+@ADV_ID_LIST_FILTER@
 ## End of advanced options:
 #end if
@@ -52,6 +53,7 @@
 -->
+
@@ -144,4 +146,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_blastx_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Search protein database with translated nucleotide query sequence(s)
@@ -22,6 +22,7 @@
 -matrix $adv_opts.matrix
 @ADVANCED_OPTIONS@
 $adv_opts.ungapped
+@ADV_ID_LIST_FILTER@
 ## End of advanced options:
 #end if
@@ -45,6 +46,7 @@
+
@@ -123,4 +125,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_convert2blastmask_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_convert2blastmask_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_convert2blastmask_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Convert masking information in lower-case masked FASTA input to file formats suitable for makeblastdb
 convert2blastmask
@@ -84,4 +84,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_dustmasker_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_dustmasker_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_dustmasker_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 masks low complexity regions
@@ -96,4 +96,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_macros.xml
--- a/tools/ncbi_blast_plus/ncbi_macros.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_macros.xml Fri Oct 31 11:43:23 2014 -0400
@@ -113,8 +113,9 @@
+
-
+
@@ -321,8 +322,42 @@
-
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+#if $adv_opts.adv_optional_id_files_opts.adv_optional_id_files_opts_selector == 'negative_gilist':
+    -negative_gilist $adv_opts.adv_optional_id_files_opts.negative_gilist
+#elif $adv_opts.adv_optional_id_files_opts.adv_optional_id_files_opts_selector == 'gilist':
+    -gilist $adv_opts.adv_optional_id_files_opts.gilist
+#elif $adv_opts.adv_optional_id_files_opts.adv_optional_id_files_opts_selector == 'seqidlist':
+    -seqidlist $adv_opts.adv_optional_id_files_opts.seqidlist
+#end if
+
 -num_threads "\${GALAXY_SLOTS:-8}"
 #if $db_opts.db_opts_selector == "db":
@@ -379,6 +414,13 @@
 This wrapper is available to install into other Galaxy Instances via the Galaxy
 Tool Shed at http://toolshed.g2.bx.psu.edu/view/devteam/ncbi_blast_plus
+
+
+ 10.1186/1471-2105-10-421
+ 10.7717/peerj.167
+
+
+
 **Output format**

 Because Galaxy focuses on processing tabular data, the default output of this
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_makeblastdb.xml
--- a/tools/ncbi_blast_plus/ncbi_makeblastdb.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_makeblastdb.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Make BLAST database
 makeblastdb
@@ -180,4 +180,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_makeprofiledb.xml
--- a/tools/ncbi_blast_plus/ncbi_makeprofiledb.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_makeprofiledb.xml Fri Oct 31 11:43:23 2014 -0400
@@ -124,4 +124,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_rpsblast_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_rpsblast_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_rpsblast_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -104,4 +104,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_rpstblastn_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -102,4 +102,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_segmasker_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_segmasker_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_segmasker_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 low-complexity regions in protein sequences
 segmasker
@@ -98,4 +98,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_tblastn_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Search translated nucleotide database with protein query sequence(s)
@@ -22,6 +22,7 @@
 @ADVANCED_OPTIONS@
 ##Ungapped disabled for now - see comments below
 ##$adv_opts.ungapped
+@ADV_ID_LIST_FILTER@
 ## End of advanced options:
 #end if
@@ -49,6 +50,7 @@
 -->
+
@@ -158,4 +160,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml
--- a/tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/ncbi_tblastx_wrapper.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
 Search translated nucleotide database with translated nucleotide query sequence(s)
@@ -24,6 +24,7 @@
 ## Need int(str(...)) because $adv_opts.max_hits is an InputValueWrapper object not a string
 ## Note -max_target_seqs overrides -num_descriptions and -num_alignments
 @ADVANCED_OPTIONS@
+@ADV_ID_LIST_FILTER@
 ## End of advanced options:
 #end if
@@ -49,6 +50,7 @@
+
@@ -74,7 +76,7 @@
 **What it does**

-Search a *translated nucleotide database* using a *protein query*,
+Search a *translated nucleotide database* using a *translated nucleotide query*,
 using the NCBI BLAST+ tblastx command line tool.
 @FASTA_WARNING@
@@ -92,4 +94,5 @@
 @REFERENCES@
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/repository_dependencies.xml
--- a/tools/ncbi_blast_plus/repository_dependencies.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/repository_dependencies.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,4 +1,4 @@
-
+
diff -r 148eceb80cbb -r d0de6862cda1 tools/ncbi_blast_plus/tool_dependencies.xml
--- a/tools/ncbi_blast_plus/tool_dependencies.xml Wed Mar 19 10:51:45 2014 -0400
+++ b/tools/ncbi_blast_plus/tool_dependencies.xml Fri Oct 31 11:43:23 2014 -0400
@@ -1,6 +1,6 @@
-
+