annotate blastxml_to_gapped_gff3.py @ 16:318a0aa5075a draft

Uploaded manually
author iuc
date Tue, 29 Dec 2015 15:31:54 -0500
parents 67fb31daef0e
children 6bfd32bd1000
rev   line source
13
67fb31daef0e planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/jbrowse commit 399061eca3a2956704522974446601755503c96d-dirty
iuc
parents:
diff changeset
1 #!/usr/bin/env python
2 import re
3 import sys
4 import copy
5 import argparse
6 from BCBio import GFF
7 import logging
8 logging.basicConfig(level=logging.INFO)
9 log = logging.getLogger(name='blastxml2gff3')
10
11 __author__ = "Eric Rasche"
12 __version__ = "0.4.0"
13 __maintainer__ = "Eric Rasche"
14 __email__ = "esr@tamu.edu"
15
16 __doc__ = """
17 BlastXML files, when transformed to GFF3, do not normally show gaps in the
18 blast hits. This tool aims to fill that "gap".
19 """
20
21
22 def blastxml2gff3(blastxml, min_gap=3, trim=False, trim_end=False):
23 from Bio.Blast import NCBIXML
24 from Bio.Seq import Seq
25 from Bio.SeqRecord import SeqRecord
26 from Bio.SeqFeature import SeqFeature, FeatureLocation
27
28 blast_records = NCBIXML.parse(blastxml)
29 records = []
30 for record in blast_records:
16
318a0aa5075a Uploaded manually
iuc
parents: 13
diff changeset
31 # http://www.sequenceontology.org/browser/release_2.4/term/SO:0000343
32 match_type = { # Currently we can only handle BLASTN, BLASTP
33 'BLASTN': 'nucleotide_match',
34 'BLASTP': 'protein_match',
35 }.get(record.application, 'match')
36
13
37 rec = SeqRecord(Seq("ACTG"), id=record.query)
38 for hit in record.alignments:
39 for hsp in hit.hsps:
40 qualifiers = {
41 "source": "blast",
42 "score": hsp.expect,
43 "accession": hit.accession,
44 "hit_id": hit.hit_id,
45 "length": hit.length,
46 "hit_titles": hit.title.split(' >')
47 }
48 desc = hit.title.split(' >')[0]
49 qualifiers['description'] = desc[desc.index(' '):]
50
51 # This required a fair bit of sketching out/math to figure out
52 # the first time.
53 #
54 # the match_start location must account for queries and
55 # subjects that start at locations other than 1
56 parent_match_start = hsp.query_start - hsp.sbjct_start
57 # The end is the start + hit.length because the match itself
58 # may be longer than the parent feature, so we use the supplied
59 # subject/hit length to calculate the real ending of the target
60 # protein.
61 parent_match_end = hsp.query_start + hit.length + hsp.query.count('-')
62
63 # However, if the user requests that we trim the feature, then
64 # we need to cut the ``match`` start to 0 to match the parent feature.
65 # We'll also need to cut the end to match the query's end. It (maybe)
66 # should be the feature end? But we don't have access to that data, so
67 # we settle for this.
68 if trim:
69 if parent_match_start < 1:
70 parent_match_start = 0
71
72 if trim or trim_end:
73 if parent_match_end > hsp.query_end:
74 parent_match_end = hsp.query_end + 1
75
16
76 # The ``match`` feature will hold one or more ``match_part``s
13
77 top_feature = SeqFeature(
78 FeatureLocation(parent_match_start, parent_match_end),
16
79 type=match_type, strand=0,
13
80 qualifiers=qualifiers
81 )
82
83 # Unlike the parent feature, ``match_part``s have sources.
84 part_qualifiers = {
85 "source": "blast",
86 }
87 top_feature.sub_features = []
88 for start, end, cigar in generate_parts(hsp.query, hsp.match,
89 hsp.sbjct,
90 ignore_under=min_gap):
91 part_qualifiers['Gap'] = cigar
92 part_qualifiers['ID'] = hit.hit_id
93
94 if trim:
95 # If trimming, then we start relative to the
16
96 # match's start
13
97 match_part_start = parent_match_start + start
98 else:
99 # Otherwise, we have to account for the subject start's location
100 match_part_start = parent_match_start + hsp.sbjct_start + start - 1
101
102 # We used to use hsp.align_length here, but that includes
103 # gaps in the parent sequence
104 #
105 # Furthermore align_length will give calculation errors in weird places
106 # So we just use (end-start) for simplicity
107 match_part_end = match_part_start + (end - start)
108
109 top_feature.sub_features.append(
110 SeqFeature(
111 FeatureLocation(match_part_start, match_part_end),
112 type="match_part", strand=0,
113 qualifiers=copy.deepcopy(part_qualifiers))
114 )
115
116 rec.features.append(top_feature)
16
117 rec.annotations = {}
13
118 records.append(rec)
119 return records
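The parent-feature coordinate arithmetic in blastxml2gff3 can be checked in isolation. The following is a minimal sketch, not part of the annotated file: the helper name parent_match_bounds and its plain-integer parameters are hypothetical and stand in for the hsp.query_start, hsp.query_end, hsp.sbjct_start, hit.length and hsp.query.count('-') values used above.

```python
def parent_match_bounds(query_start, query_end, sbjct_start, hit_length,
                        query_gaps, trim=False, trim_end=False):
    """Hypothetical helper mirroring the start/end arithmetic of blastxml2gff3."""
    # The subject may begin before the query does, so the start can go negative.
    start = query_start - sbjct_start
    # The end uses the full hit length plus any gap characters in the query.
    end = query_start + hit_length + query_gaps

    # ``trim`` clamps the start to 0.
    if trim and start < 1:
        start = 0
    # ``trim`` or ``trim_end`` clamps the end to just past the query end.
    if (trim or trim_end) and end > query_end:
        end = query_end + 1
    return start, end


# Illustrative numbers only: a query aligned at 10..60 against a subject
# position of 25 on a hit of length 100, with 2 gap characters in the query.
untrimmed = parent_match_bounds(10, 60, 25, 100, 2)
trimmed = parent_match_bounds(10, 60, 25, 100, 2, trim=True)
end_only = parent_match_bounds(10, 60, 25, 100, 2, trim_end=True)
```

With these made-up inputs the untrimmed feature extends past both ends of the query, trim clamps both ends, and trim_end clamps only the end.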
120
121
122 def __remove_query_gaps(query, match, subject):
123 """remove positions in all three based on gaps in query
124
125 In order to simplify math and calculations...we remove all of the gaps
126 based on gap locations in the query sequence::
127
128 Q:ACTG-ACTGACTG
129 S:ACTGAAC---CTG
130
131 will become::
132
133 Q:ACTGACTGACTG
134 S:ACTGAC---CTG
135
136 which greatly simplifies the process of identifying the correct location
137 for a match_part
138 """
139 prev = 0
140 fq = ''
141 fm = ''
142 fs = ''
143 for position in re.finditer('-', query):
144 fq += query[prev:position.start()]
145 fm += match[prev:position.start()]
146 fs += subject[prev:position.start()]
147 prev = position.start() + 1
67fb31daef0e planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/jbrowse commit 399061eca3a2956704522974446601755503c96d-dirty
iuc
parents:
diff changeset
148 fq += query[prev:]
67fb31daef0e planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/jbrowse commit 399061eca3a2956704522974446601755503c96d-dirty
iuc
parents:
diff changeset
149 fm += match[prev:]
67fb31daef0e planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/jbrowse commit 399061eca3a2956704522974446601755503c96d-dirty
iuc
parents:
diff changeset
150 fs += subject[prev:]
67fb31daef0e planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/jbrowse commit 399061eca3a2956704522974446601755503c96d-dirty
iuc
parents:
diff changeset
151
67fb31daef0e planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/jbrowse commit 399061eca3a2956704522974446601755503c96d-dirty
iuc
parents:
diff changeset
152 return (fq, fm, fs)
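

# Illustrative sketch only, not part of the original script: the same
# gap-column removal written as a column filter. The helper name
# _example_drop_query_gap_columns is hypothetical, and the sketch assumes the
# three strings have equal length, as the lines of one BLAST HSP should.
def _example_drop_query_gap_columns(query, match, subject):
    # Keep every alignment column whose query character is not a gap.
    kept = [(q, m, s) for q, m, s in zip(query, match, subject) if q != '-']
    if not kept:
        return ('', '', '')
    # Transpose the surviving columns back into three strings.
    fq, fm, fs = (''.join(column) for column in zip(*kept))
    return (fq, fm, fs)
# For example, ("AC-GT", "|| ||", "ACTGT") becomes ("ACGT", "||||", "ACGT").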


def generate_parts(query, match, subject, ignore_under=3):
    region_q = []
    region_m = []
    region_s = []

    (query, match, subject) = __remove_query_gaps(query, match, subject)

    region_start = -1
    region_end = -1
    mismatch_count = 0
    for i, (q, m, s) in enumerate(zip(query, match, subject)):

        # If we have a match. Any non-space match character counts,
        # including '+' (a conservative substitution).
        if m != ' ':
            if region_start == -1:
                region_start = i
                # It's a new region, we need to reset or it's pre-seeded with
                # spaces
                region_q = []
                region_m = []
                region_s = []
            region_end = i
            mismatch_count = 0
        else:
            mismatch_count += 1

        region_q.append(q)
        region_m.append(m)
        region_s.append(s)

        # Once ignore_under consecutive mismatches follow a region, drop
        # those trailing mismatches and emit the region as one part.
        if mismatch_count >= ignore_under and region_start != -1 and region_end != -1:
            region_q = region_q[0:-ignore_under]
            region_m = region_m[0:-ignore_under]
            region_s = region_s[0:-ignore_under]
            yield region_start, region_end + 1, \
                cigar_from_string(region_q, region_m, region_s, strict_m=True)
            region_q = []
            region_m = []
            region_s = []

            region_start = -1
            region_end = -1
            mismatch_count = 0

    # Emit whatever is left once the alignment has been fully consumed.
    yield region_start, region_end + 1, \
        cigar_from_string(region_q, region_m, region_s, strict_m=True)


def _qms_to_matches(query, match, subject, strict_m=True):
    matchline = []

    for (q, m, s) in zip(query, match, subject):
        ret = ''

        # Any non-space match character (including '+') is treated as a match.
        if m != ' ':
            ret = '='
        elif m == ' ':
            if q == '-':
                ret = 'D'
            elif s == '-':
                ret = 'I'
            else:
                ret = 'X'
        else:
            # Defensive only: the two branches above cover every value of m.
            log.warning("Bad data: \n\t%s\n\t%s\n\t%s\n", query, match, subject)

        # In strict mode, matches and mismatches both collapse to 'M'.
        if strict_m:
            if ret == '=' or ret == 'X':
                ret = 'M'

        matchline.append(ret)
    return matchline


def _matchline_to_cigar(matchline):
    cigar_line = []
    last_char = matchline[0]
    count = 0
    for char in matchline:
        if char == last_char:
            count += 1
        else:
            cigar_line.append("%s%s" % (last_char, count))
            count = 1
            last_char = char
    cigar_line.append("%s%s" % (last_char, count))
    return ' '.join(cigar_line)
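

# Illustrative sketch only, not part of the original script: the same
# run-length encoding expressed with itertools.groupby. The helper name
# _example_run_length_gap is hypothetical. Unlike _matchline_to_cigar above,
# it returns an empty string for an empty matchline instead of failing on
# matchline[0].
def _example_run_length_gap(matchline):
    from itertools import groupby
    # groupby yields one (operation, run) pair per run of equal characters.
    return ' '.join(
        '%s%d' % (op, len(list(run)))
        for op, run in groupby(matchline)
    )
# For example, list('MMMDMM') is encoded as 'M3 D1 M2'.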


def cigar_from_string(query, match, subject, strict_m=True):
    matchline = _qms_to_matches(query, match, subject, strict_m=strict_m)
    if len(matchline) > 0:
        return _matchline_to_cigar(matchline)
    else:
        return ""


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Convert Blast XML to gapped GFF3', epilog='')
    # argparse.FileType('r') replaces the Python 2-only builtin `file`.
    parser.add_argument('blastxml', type=argparse.FileType('r'), help='Blast XML Output')
    parser.add_argument('--min_gap', type=int, help='Maximum gap size before generating a new match_part', default=3)
    parser.add_argument('--trim', action='store_true', help='Trim blast hits to be only as long as the parent feature')
    parser.add_argument('--trim_end', action='store_true', help='Cut blast results off at end of gene')
    args = parser.parse_args()

    result = blastxml2gff3(**vars(args))
    GFF.write(result, sys.stdout)