comparison blast_xml_to_tabular.xml @ 0:efe0c7b8fb78 draft
planemo upload for repository https://github.com/dfornika/galaxy/tree/master/tools/blast_xml_to_tabular commit 006cbba6513492f5a06b573c676400a2d464520b-dirty
author | dfornika |
---|---|
date | Mon, 09 Sep 2019 17:17:09 -0400 |
parents | |
children |
-1:000000000000 | 0:efe0c7b8fb78 |
---|---|
1 <tool id="blast_xml_to_tabular" name="BLAST XML to tabular" version="1.1.0"> | |
2 <description>Convert BLAST XML output to tabular</description> | |
3 <command detect_errors="exit_code"> | |
4 <![CDATA[ | |
5 '$__tool_directory__/blast_xml_to_tabular.py' | |
6 '${blastxml_file}' | |
7 '${tabular_file}' | |
8 '${out_format}' | |
9 ]]> | |
10 </command> | |
11 <inputs> | |
12 <param name="blastxml_file" type="data" format="blastxml" label="BLAST results as XML"/> | |
13 <param name="out_format" type="select" label="Output format"> | |
14 <option value="std" selected="True">Tabular (standard 12 columns)</option> | |
15 <option value="ext">Tabular (extended 24 columns)</option> | |
16 <option value="ext+">Tabular (extended 26 columns)</option> | |
17 </param> | |
18 </inputs> | |
19 <outputs> | |
20 <data name="tabular_file" format="tabular" label="BLAST results: data $blastxml_file.hid as tabular" /> | |
21 </outputs> | |
22 <help> | |
23 | |
24 .. class:: infomark | |
25 | |
26 **What it does** | |
27 | |
28 NCBI BLAST+ (and the older NCBI 'legacy' BLAST) can output in a range of | |
29 formats including tabular and a more detailed XML format. A complex workflow | |
30 may need both the XML and the tabular output - but running BLAST twice is | |
31 slow and wasteful. | |
32 | |
33 This tool takes the BLAST XML output and by default converts it into the | |
34 standard 12 column tabular equivalent: | |
35 | |
36 ====== ========= ============================================ | |
37 Column NCBI name Description | |
38 ------ --------- -------------------------------------------- | |
39 1      qseqid    Query Seq-id (ID of your sequence) | |
40 2      sseqid    Subject Seq-id (ID of the database hit) | |
41 3      pident    Percentage of identical matches | |
42 4      length    Alignment length | |
43 5      mismatch  Number of mismatches | |
44 6      gapopen   Number of gap openings | |
45 7      qstart    Start of alignment in query | |
46 8      qend      End of alignment in query | |
47 9      sstart    Start of alignment in subject (database hit) | |
48 10     send      End of alignment in subject (database hit) | |
49 11     evalue    Expectation value (E-value) | |
50 12     bitscore  Bit score | |
51 ====== ========= ============================================ | |
52 | |
53 The BLAST+ tools can optionally output additional columns of information, | |
54 but this takes longer to calculate. Most (but not all) of these columns are | |
55 included by selecting the extended tabular output. The extra columns are | |
56 included *after* the standard 12 columns. This is so that you can write | |
57 workflow filtering steps that accept the 12, 24 or 26 column tabular | |
58 BLAST output. | |
59 | |
60 ====== ============= =========================================== | |
61 Column NCBI name     Description | |
62 ------ ------------- ------------------------------------------- | |
63 13     sallseqid     All subject Seq-id(s), separated by a ';' | |
64 14     score         Raw score | |
65 15     nident        Number of identical matches | |
66 16     positive      Number of positive-scoring matches | |
67 17     gaps          Total number of gaps | |
68 18     ppos          Percentage of positive-scoring matches | |
69 19     qframe        Query frame | |
70 20     sframe        Subject frame | |
71 21     qseq          Aligned part of query sequence | |
72 22     sseq          Aligned part of subject sequence | |
73 23     qlen          Query sequence length | |
74 24     slen          Subject sequence length | |
75 ====== ============= =========================================== | |
76 | |
77 Very slight modifications were made to the "BLAST XML to tabular" tool that | |
78 ships with Galaxy to output two more columns: | |
79 | |
80 ====== ============= =========================================== | |
81 Column NCBI name     Description | |
82 ------ ------------- ------------------------------------------- | |
83 25     pcov          Percentage coverage | |
84 26     sallseqdescr  All subject Seq-descr(s), separated by a ',' | |
85 ====== ============= =========================================== | |
86 | |
87 ---- | |
88 | |
89 .. class:: infomark | |
90 | |
91 This is a slightly modified version of a tool that ships with Galaxy. | |
92 If the 12 or 24 column formats are desired, use the original tool. | |
93 | |
94 .. class:: warningmark | |
95 | |
96 Beware that the XML file (and thus the conversion) and the tabular output | |
97 direct from BLAST+ may differ in the presence of XXXX masking on regions | |
98 of low complexity (columns 21 and 22), and thus also calculated figures like | |
99 the percentage identity (column 3). | |
100 | |
101 </help> | |
102 </tool> |
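
The help text above explains that the extra columns are appended after the standard 12 so that downstream filtering steps can accept any of the output formats. A minimal Python sketch of such a filter follows. It is not part of this tool or of `blast_xml_to_tabular.py`. The column names are taken from the tables in the help text; the function names `read_blast_tabular` and `filter_hits`, the thresholds, and the sample row are hypothetical and chosen only for illustration.

```python
import csv
import io

# Column names as listed in the tool's help text.
STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
EXT_COLUMNS = STD_COLUMNS + [
    "sallseqid", "score", "nident", "positive", "gaps", "ppos",
    "qframe", "sframe", "qseq", "sseq", "qlen", "slen",
]
EXT_PLUS_COLUMNS = EXT_COLUMNS + ["pcov", "sallseqdescr"]

# Map a row's field count (12, 24 or 26) to the matching column layout.
LAYOUTS = {len(cols): cols for cols in (STD_COLUMNS, EXT_COLUMNS, EXT_PLUS_COLUMNS)}


def read_blast_tabular(handle):
    """Yield one dict per hit, keyed by NCBI column name (hypothetical helper)."""
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue  # skip blank lines
        columns = LAYOUTS.get(len(row))
        if columns is None:
            raise ValueError("unexpected column count: %d" % len(row))
        yield dict(zip(columns, row))


def filter_hits(handle, max_evalue=1e-5, min_pident=90.0):
    """Keep hits using only the standard 12 columns (thresholds are assumptions)."""
    for hit in read_blast_tabular(handle):
        if float(hit["evalue"]) <= max_evalue and float(hit["pident"]) >= min_pident:
            yield hit


if __name__ == "__main__":
    # Made-up 12-column row, for illustration only.
    sample = "q1\ts1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-50\t370\n"
    for hit in filter_hits(io.StringIO(sample)):
        print(hit["qseqid"], hit["sseqid"], hit["evalue"])
```

Because `filter_hits` reads only `evalue` and `pident`, which sit in the first 12 columns, the same function should work unchanged on the 12, 24 and 26 column outputs.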