annotate blast_xml_to_tabular.py @ 0:efe0c7b8fb78 draft
planemo upload for repository https://github.com/dfornika/galaxy/tree/master/tools/blast_xml_to_tabular commit 006cbba6513492f5a06b573c676400a2d464520b-dirty
author | dfornika |
---|---|
date | Mon, 09 Sep 2019 17:17:09 -0400 |
parents | |
children |
#!/usr/bin/env python

"""Convert a BLAST XML file to 12 column tabular output

Takes three command line options, input BLAST XML filename, output tabular
BLAST filename, output format (std for standard 12 columns, or ext for the
extended 24 columns offered in the BLAST+ wrappers).

The 12 columns output are 'qseqid sseqid pident length mismatch gapopen qstart
qend sstart send evalue bitscore' or 'std' at the BLAST+ command line, which
mean:

====== ========= ============================================
Column NCBI name Description
------ --------- --------------------------------------------
     1 qseqid    Query Seq-id (ID of your sequence)
     2 sseqid    Subject Seq-id (ID of the database hit)
     3 pident    Percentage of identical matches
     4 length    Alignment length
     5 mismatch  Number of mismatches
     6 gapopen   Number of gap openings
     7 qstart    Start of alignment in query
     8 qend      End of alignment in query
     9 sstart    Start of alignment in subject (database hit)
    10 send      End of alignment in subject (database hit)
    11 evalue    Expectation value (E-value)
    12 bitscore  Bit score
====== ========= ============================================

The additional columns offered in the Galaxy BLAST+ wrappers are:

====== ============= ===========================================
Column NCBI name     Description
------ ------------- -------------------------------------------
    13 sallseqid     All subject Seq-id(s), separated by a ';'
    14 score         Raw score
    15 nident        Number of identical matches
    16 positive      Number of positive-scoring matches
    17 gaps          Total number of gaps
    18 ppos          Percentage of positive-scoring matches
    19 qframe        Query frame
    20 sframe        Subject frame
    21 qseq          Aligned part of query sequence
    22 sseq          Aligned part of subject sequence
    23 qlen          Query sequence length
    24 slen          Subject sequence length
====== ============= ===========================================

Most of these fields are given explicitly in the XML file, while some others,
like the percentage identity and the number of gap openings, must be calculated.

Be aware that the sequence in the extended tabular output or XML direct from
BLAST+ may or may not use XXXX masking on regions of low complexity. This
can throw off the calculation of percentage identity and gap openings.
[In fact, both BLAST 2.2.24+ and 2.2.25+ have a subtle bug in this regard,
with these numbers changing depending on whether or not the low complexity
filter is used.]

This script attempts to produce identical output to what BLAST+ would have done.
However, check this with "diff -b ..." since BLAST+ sometimes includes an extra
space character (probably a bug).

Usage:

    python blast_xml_to_tabular.py in_file out_file out_format
"""
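The docstring notes that percentage identity and gap openings are not stored in the XML and have to be derived. The following is a minimal sketch of one way to do that, not necessarily the method this script uses. It assumes the identity count and alignment length are already known, and that gaps appear as `-` in the aligned sequences; the function names and the sample alignment are hypothetical.

```python
import re


def percent_identity(nident, length):
    """Identical matches as a percentage of alignment length, to two decimals."""
    return "%0.2f" % (100.0 * nident / length)


def count_gap_openings(aligned_query, aligned_subject):
    """Count runs of '-' in both aligned sequences; each run is one opening."""
    runs_in_query = len(re.findall(r"-+", aligned_query))
    runs_in_subject = len(re.findall(r"-+", aligned_subject))
    return runs_in_query + runs_in_subject


# Hypothetical aligned pair, for illustration only.
qseq = "ACGT--ACGTA"
sseq = "ACGTTTAC-TA"
length = len(qseq)  # alignment length, gaps included
nident = 8          # assumed to be supplied by the caller

print(percent_identity(nident, length), count_gap_openings(qseq, sseq))
```

Counting runs rather than individual `-` characters is what separates gap openings (column 6) from the total number of gaps (column 17).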

import sys
import re

# cElementTree was removed in Python 3.9; ElementTree picks up the C
# accelerator automatically when it is available.
import xml.etree.ElementTree as ElementTree


def stop_err(msg):
    """Write msg to stderr and exit with a non-zero status."""
    sys.stderr.write("%s\n" % msg)
    sys.exit(1)

# Parse command line
try:
    in_file, out_file, out_fmt = sys.argv[1:]
except ValueError:
    # Raised when the number of arguments is not exactly three
    stop_err("Expect 3 arguments: input BLAST XML file, output tabular file, out format (std, ext or ext+)")

if out_fmt == "std":
    extended = False
elif out_fmt == "x22":
    stop_err("Format argument x22 has been replaced with ext (extended 24 columns)")
elif out_fmt == "ext" or out_fmt == "ext+":
    extended = True
else:
    stop_err("Format argument should be std (12 column) or ext (extended 24 columns) or ext+ (extended 26 columns)")

extended_plus = False
if '+' in out_fmt:
    extended_plus = True

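As a quick illustration of how the format argument relates to the documented columns, here is a small sketch that builds the column lists from the names given in the docstring. The helper `columns_for` and both constants are hypothetical and are not part of the script. The `ext+` format is left out because this listing does not name its two extra columns.

```python
# Column names copied from the docstring tables.
STD_COLUMNS = ("qseqid sseqid pident length mismatch gapopen "
               "qstart qend sstart send evalue bitscore").split()
EXT_EXTRA_COLUMNS = ("sallseqid score nident positive gaps ppos "
                     "qframe sframe qseq sseq qlen slen").split()


def columns_for(out_fmt):
    """Return the documented column names for 'std' or 'ext' (hypothetical helper)."""
    if out_fmt == "std":
        return list(STD_COLUMNS)
    if out_fmt == "ext":
        return STD_COLUMNS + EXT_EXTRA_COLUMNS
    raise ValueError("Unsupported format: %r" % out_fmt)


# A tab-separated header line for the standard format.
print("\t".join(columns_for("std")))
```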
# get an iterable
try:
    context = ElementTree.iterparse(in_file, events=("start", "end"))
except Exception:
    stop_err("Invalid data format.")
# turn it into an iterator
context = iter(context)
# get the root element
try:
    # next() works on Python 2.6+ and Python 3; context.next() is Python 2 only
    event, root = next(context)
except Exception:
    stop_err("Invalid data format.")

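The iterparse pattern above can be tried on its own. The sketch below streams a tiny in-memory document and collects the query IDs; the XML snippet is a hypothetical, heavily trimmed stand-in for real BLAST output, and the call to `elem.clear()` is an assumption about how memory might be kept bounded, not something shown in this listing.

```python
import io
import xml.etree.ElementTree as ElementTree

# Hypothetical, heavily trimmed stand-in for a BLAST XML file.
xml_bytes = b"""<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_iterations>
    <Iteration><Iteration_query-ID>Query_1</Iteration_query-ID></Iteration>
    <Iteration><Iteration_query-ID>Query_2</Iteration_query-ID></Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""

context = iter(ElementTree.iterparse(io.BytesIO(xml_bytes), events=("start", "end")))
# The first event is the "start" of the root element.
event, root = next(context)

query_ids = []
for event, elem in context:
    # An "end" event means the element and all of its children are complete.
    if event == "end" and elem.tag == "Iteration":
        query_ids.append(elem.findtext("Iteration_query-ID"))
        elem.clear()  # discard the children of the finished element

print(root.tag, query_ids)
```

Requesting both "start" and "end" events is what makes the root element available before any child has been fully parsed.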
# Raw strings keep \d from being treated as an invalid string escape
re_default_query_id = re.compile(r"^Query_\d+$")
assert re_default_query_id.match("Query_101")
assert not re_default_query_id.match("Query_101a")
assert not re_default_query_id.match("MyQuery_101")
re_default_subject_id = re.compile(r"^Subject_\d+$")
assert re_default_subject_id.match("Subject_1")
assert not re_default_subject_id.match("Subject_")
assert not re_default_subject_id.match("Subject_12a")
assert not re_default_subject_id.match("TheSubject_1")

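The patterns above recognise placeholder identifiers such as `Query_1` that BLAST assigns when it has no real sequence ID. This listing does not show what the script does once a placeholder is detected, so the sketch below is only one plausible fallback: take the first word of the definition line instead. The helper `resolve_query_id` is hypothetical.

```python
import re

re_default_query_id = re.compile(r"^Query_\d+$")


def resolve_query_id(query_id, query_def):
    """Hypothetical helper: prefer the first word of the definition line
    when the ID is a BLAST placeholder such as 'Query_1'."""
    if re_default_query_id.match(query_id) and query_def.strip():
        return query_def.split(None, 1)[0]
    return query_id


print(resolve_query_id("Query_1", "Sample"))
print(resolve_query_id("sp|Q9BS26|ERP44_HUMAN", "Endoplasmic reticulum resident protein 44"))
```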
efe0c7b8fb78
planemo upload for repository https://github.com/dfornika/galaxy/tree/master/tools/blast_xml_to_tabular commit 006cbba6513492f5a06b573c676400a2d464520b-dirty
dfornika
parents:
diff
changeset
|
117 outfile = open(out_file, 'w') |
efe0c7b8fb78
planemo upload for repository https://github.com/dfornika/galaxy/tree/master/tools/blast_xml_to_tabular commit 006cbba6513492f5a06b573c676400a2d464520b-dirty
dfornika
parents:
diff
changeset
|
118 blast_program = None |
efe0c7b8fb78
planemo upload for repository https://github.com/dfornika/galaxy/tree/master/tools/blast_xml_to_tabular commit 006cbba6513492f5a06b573c676400a2d464520b-dirty
dfornika
parents:
diff
changeset
|
119 for event, elem in context: |
efe0c7b8fb78
planemo upload for repository https://github.com/dfornika/galaxy/tree/master/tools/blast_xml_to_tabular commit 006cbba6513492f5a06b573c676400a2d464520b-dirty
dfornika
parents:
diff
changeset
|
120 if event == "end" and elem.tag == "BlastOutput_program": |
efe0c7b8fb78
        blast_program = elem.text
    # for every <Iteration> tag
    if event == "end" and elem.tag == "Iteration":
        #Expecting either this, from BLAST 2.2.25+ using FASTA vs FASTA
        # <Iteration_query-ID>sp|Q9BS26|ERP44_HUMAN</Iteration_query-ID>
        # <Iteration_query-def>Endoplasmic reticulum resident protein 44 OS=Homo sapiens GN=ERP44 PE=1 SV=1</Iteration_query-def>
        # <Iteration_query-len>406</Iteration_query-len>
        # <Iteration_hits></Iteration_hits>
        #
        #Or, from BLAST 2.2.24+ run online
        # <Iteration_query-ID>Query_1</Iteration_query-ID>
        # <Iteration_query-def>Sample</Iteration_query-def>
        # <Iteration_query-len>516</Iteration_query-len>
        # <Iteration_hits>...
        qseqid = elem.findtext("Iteration_query-ID")
        if re_default_query_id.match(qseqid):
            #Placeholder ID, take the first word of the query definition
            qseqid = elem.findtext("Iteration_query-def").split(None,1)[0]
        qlen = int(elem.findtext("Iteration_query-len"))

        # for every <Hit> within <Iteration>
        for hit in elem.findall("Iteration_hits/Hit"):
            #Expecting either this,
            # <Hit_id>gi|3024260|sp|P56514.1|OPSD_BUFBU</Hit_id>
            # <Hit_def>RecName: Full=Rhodopsin</Hit_def>
            # <Hit_accession>P56514</Hit_accession>
            #or,
            # <Hit_id>Subject_1</Hit_id>
            # <Hit_def>gi|57163783|ref|NP_001009242.1| rhodopsin [Felis catus]</Hit_def>
            # <Hit_accession>Subject_1</Hit_accession>
            #
            #apparently depending on the parse_deflines switch
            sseqid = hit.findtext("Hit_id").split(None,1)[0]
            hit_def = sseqid + " " + hit.findtext("Hit_def")
            if re_default_subject_id.match(sseqid) \
               and sseqid == hit.findtext("Hit_accession"):
                #Placeholder ID, take the first word of the subject definition
                hit_def = hit.findtext("Hit_def")
                sseqid = hit_def.split(None,1)[0]
            # for every <Hsp> within <Hit>
            for hsp in hit.findall("Hit_hsps/Hsp"):
                nident = hsp.findtext("Hsp_identity")
                length = hsp.findtext("Hsp_align-len")
                pident = "%0.2f" % (100*float(nident)/float(length))

                q_seq = hsp.findtext("Hsp_qseq")
                h_seq = hsp.findtext("Hsp_hseq")
                m_seq = hsp.findtext("Hsp_midline")
                assert len(q_seq) == len(h_seq) == len(m_seq) == int(length)
                gapopen = str(len(q_seq.replace('-', ' ').split())-1 + \
                              len(h_seq.replace('-', ' ').split())-1)

                mismatch = m_seq.count(' ') + m_seq.count('+') \
                           - q_seq.count('-') - h_seq.count('-')
                #TODO - Remove this alternative mismatch calculation and test
                #once satisfied there are no problems
                expected_mismatch = len(q_seq) \
                                    - sum(1 for q,h in zip(q_seq, h_seq) \
                                          if q == h or q == "-" or h == "-")
                xx = sum(1 for q,h in zip(q_seq, h_seq) if q=="X" and h=="X")
                if not (expected_mismatch - q_seq.count("X") <= int(mismatch) <= expected_mismatch + xx):
                    stop_err("%s vs %s mismatches, expected %i <= %i <= %i" \
                             % (qseqid, sseqid, expected_mismatch - q_seq.count("X"),
                                int(mismatch), expected_mismatch + xx))

                #TODO - Remove this alternative identity calculation and test
                #once satisfied there are no problems
                expected_identity = sum(1 for q,h in zip(q_seq, h_seq) if q == h)
                if not (expected_identity - xx <= int(nident) <= expected_identity + q_seq.count("X")):
                    stop_err("%s vs %s identities, expected %i <= %i <= %i" \
                             % (qseqid, sseqid, expected_identity - xx, int(nident),
                                expected_identity + q_seq.count("X")))

                evalue = hsp.findtext("Hsp_evalue")
                if evalue == "0":
                    evalue = "0.0"
                else:
                    evalue = "%0.0e" % float(evalue)

                bitscore = float(hsp.findtext("Hsp_bit-score"))
                if bitscore < 100:
                    #Seems to show one decimal place for lower scores
                    bitscore = "%0.1f" % bitscore
                else:
                    #Note BLAST does not round to nearest int, it truncates
                    bitscore = "%i" % bitscore

                qstart = hsp.findtext("Hsp_query-from")
                qend = hsp.findtext("Hsp_query-to")
                values = [qseqid,
                          sseqid,
                          pident,
                          length, #hsp.findtext("Hsp_align-len")
                          str(mismatch),
                          gapopen,
                          #hsp.findtext("Hsp_query-from"), #qstart,
                          #hsp.findtext("Hsp_query-to"), #qend,
                          qstart,
                          qend,
                          hsp.findtext("Hsp_hit-from"), #sstart,
                          hsp.findtext("Hsp_hit-to"), #send,
                          evalue, #hsp.findtext("Hsp_evalue") in scientific notation
                          bitscore, #hsp.findtext("Hsp_bit-score") rounded
                          ]

                if extended:
                    sallseqid = ";".join(name.split(None,1)[0] for name in hit_def.split(">"))
                    #print hit_def, "-->", sallseqid
                    positive = hsp.findtext("Hsp_positive")
                    ppos = "%0.2f" % (100*float(positive)/float(length))
                    qframe = hsp.findtext("Hsp_query-frame")
                    sframe = hsp.findtext("Hsp_hit-frame")
                    if blast_program == "blastp":
                        #Probably a bug in BLASTP that they use 0 or 1 depending on format
                        if qframe == "0": qframe = "1"
                        if sframe == "0": sframe = "1"
                    slen = int(hit.findtext("Hit_len"))
                    values.extend([sallseqid,
                                   hsp.findtext("Hsp_score"), #score,
                                   nident,
                                   positive,
                                   hsp.findtext("Hsp_gaps"), #gaps,
                                   ppos,
                                   qframe,
                                   sframe,
                                   #NOTE - for blastp, XML shows original seq, tabular uses XXX masking
                                   q_seq,
                                   h_seq,
                                   str(qlen),
                                   str(slen)
                                   ])

                if extended_plus:
                    pcov = "%0.2f" % (float(int(qend) - int(qstart) + 1)/qlen * 100)
                    sallseqdescr = ";".join(name.split(None,1)[1] for name in hit_def.split(">"))
                    values.extend([pcov, sallseqdescr])

                #print "\t".join(values)
                outfile.write("\t".join(values) + "\n")
        # prevents ElementTree from growing a large data structure
        root.clear()
        elem.clear()
outfile.close()
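
The derived columns in the listing above (gapopen, mismatch, evalue and bitscore) are computed from the aligned strings and raw XML text rather than read directly from the file. The following is a minimal standalone sketch of those same calculations, assuming a toy protein alignment. The helper names (`count_gap_openings`, `count_mismatches`, `format_evalue`, `format_bitscore`) and the example sequences are hypothetical and are not part of the script itself.

```python
# Hypothetical helpers that mirror the per-HSP arithmetic in the listing.
# The names and the toy alignment below are illustrative assumptions only.

def count_gap_openings(aligned_seq):
    """Count runs of '-' that lie between residue segments."""
    # Turning gaps into spaces and splitting yields the residue segments;
    # n segments are separated by n - 1 gap runs.
    return len(aligned_seq.replace("-", " ").split()) - 1


def count_mismatches(q_seq, h_seq, m_seq):
    """Count aligned columns that are neither identical nor gapped."""
    # A midline space marks a mismatch or a gap, and '+' marks a
    # conservative substitution, so gap columns are subtracted back out.
    return (m_seq.count(" ") + m_seq.count("+")
            - q_seq.count("-") - h_seq.count("-"))


def format_evalue(text):
    """Format an e-value string the way the listing does."""
    return "0.0" if text == "0" else "%0.0e" % float(text)


def format_bitscore(text):
    """One decimal place below 100, truncated integer otherwise."""
    score = float(text)
    return "%0.1f" % score if score < 100 else "%i" % score


# Toy alignment: one gap in each sequence and two '+' substitutions.
q_seq = "MKT-AYIAK"
h_seq = "MRTGAY-AR"
m_seq = "M+T AY A+"

gapopen = count_gap_openings(q_seq) + count_gap_openings(h_seq)
mismatch = count_mismatches(q_seq, h_seq, m_seq)
print(gapopen, mismatch, format_evalue("3.2e-20"), format_bitscore("245.9"))
```

Splitting the arithmetic into small functions like this is only one possible arrangement; it makes each column easy to check against BLAST's own tabular output without parsing a full XML file.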