annotate getLongestORF.py @ 0:e09750baa9ac draft default tip
planemo upload for repository https://github.com/bernt-matthias/mb-galaxy-tools/blob/master/tools/longorf/ commit 8e118a4d24047e2c62912b962e854f789d6ff559-dirty
| author | matthias | 
|---|---|
| date | Wed, 20 Jun 2018 10:55:21 -0400 | 
| parents | |
| children | 
#!/usr/bin/env python

"""
usage: getLongestORF.py input.fas output.fas output.tab


input.fas: an amino acid FASTA file of all open reading frames (ORFs) listed by transcript (output of the Galaxy tool "getorf")
output.fas: FASTA file with the longest ORF of each transcript
output.tab: table with seqID, start, end, length, orientation, and longest flag for all ORFs

example:

>253936-254394(+)_1 [28 - 63]
LTNYCQMVHNIL
>253936-254394(+)_2 [18 - 77]
HKLIDKLLPNGAQYFVKSTQ
>253936-254394(+)_3 [32 - 148]
QTTAKWCTIFCKKYPVAPFHTMYLNYAVTWHHRSLLVAV
>253936-254394(+)_4 [117 - 152]
LGIIVPSLLLCN
>248351-252461(+)_1 [14 - 85]
VLARKYPRCLSPSKKSPCQLRQRS
>248351-252461(+)_2 [21 - 161]
PGNTHDASAHRKSLRVNSDKEVKCLFTKNAASEHPDHKRRRVSEHVP
>248351-252461(+)_3 [89 - 202]
VPLHQECCIGAPRPQTTACVRACAMTNTPRSSMTSKTG
>248351-252461(+)_4 [206 - 259]
SRTTSGRQSVLSEKLWRR
>248351-252461(+)_5 [263 - 313]
CLSPLWVPCCSRHSCHG
"""

import re
import sys


def findlongestOrf(transcriptDict, old_seqID):
    # Write the summary rows for the previous seqID.
    prevTranscript = transcriptDict[old_seqID]
    i_max = 0
    # Find the longest ORF in the transcript; on ties the last one wins.
    for i in range(0, len(prevTranscript)):
        if prevTranscript[i][2] >= prevTranscript[i_max][2]:
            i_max = i
    for i in range(0, len(prevTranscript)):
        prevStart = prevTranscript[i][0]
        prevEnd = prevTranscript[i][1]
        prevLength = prevTranscript[i][2]
        output = str(old_seqID) + "\t" + str(prevStart) + "\t" + str(prevEnd) + "\t" + str(prevLength)
        # Orientation comes from this ORF's own coordinates, not from the
        # module-level start/end of the header currently being read.
        if prevEnd - prevStart > 0:
            output += "\tForward"
        else:
            output += "\tReverse"
        if i == i_max:
            output += "\ty\n"
        else:
            output += "\tn\n"
        OUTPUT_ORF_SUMMARY.write(output)
    transcriptDict.pop(old_seqID, None)
    return None

INPUT = open(sys.argv[1], "r")
OUTPUT_FASTA = open(sys.argv[2], "w")
OUTPUT_ORF_SUMMARY = open(sys.argv[3], "w")

seqID = ""
old_seqID = ""
lengthDict = {}
seqDict = {}
headerDict = {}
transcriptDict = {}
skip = False

OUTPUT_ORF_SUMMARY.write("seqID\tstart\tend\tlength\torientation\tlongest\n")

for line in INPUT:
    line = line.strip()
#    print line
    if line.startswith(">"):  # header
        seqID = "_".join(line.split(">")[1].split("_")[:-1])
        #seqID = line.split(">")[1].split("_")[0]
        start = int(re.search(r' \[(\d+) -', line).group(1))
        end = int(re.search(r'- (\d+)\]', line).group(1))
        length = abs(end - start)
        if seqID not in transcriptDict and old_seqID != "":  # new transcript
            findlongestOrf(transcriptDict, old_seqID)
        if seqID not in transcriptDict:
            transcriptDict[seqID] = []
        transcriptDict[seqID].append([start, end, length])
        if seqID not in lengthDict and old_seqID != "":  # new transcript
            # write FASTA
            OUTPUT_FASTA.write(headerDict[old_seqID] + "\n" + seqDict[old_seqID] + "\n")
            # delete old dict entry
            headerDict.pop(old_seqID, None)
            seqDict.pop(old_seqID, None)
            lengthDict.pop(old_seqID, None)
        # If several longest sequences have the same length, the dictionary keeps the last occurring one.
        if seqID not in lengthDict or length >= lengthDict[seqID]:
            headerDict[seqID] = line
            lengthDict[seqID] = length
            seqDict[seqID] = ""
            skip = False
        else:
            # Shorter than the current longest ORF: ignore its sequence lines.
            skip = True
        old_seqID = seqID
    elif skip:
        continue
    else:
        seqDict[seqID] += line

OUTPUT_FASTA.write(headerDict[old_seqID] + "\n" + seqDict[old_seqID])
findlongestOrf(transcriptDict, old_seqID)
INPUT.close()
OUTPUT_FASTA.close()
OUTPUT_ORF_SUMMARY.close()
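
The header parsing and the longest-ORF selection that the script performs can be sketched in isolation, which makes them easier to check against the example in the docstring. This is a minimal sketch under stated assumptions, not the tool's code: `HEADER_RE`, `parse_header`, and `longest_orf_index` are hypothetical names, and the pattern assumes headers of the form `>name_index [start - end]` with no whitespace in `name`. Ties are resolved in favor of the last ORF, matching the `>=` comparison used in the script.

```python
import re

# Hypothetical pattern for getorf-style headers such as
# ">253936-254394(+)_3 [32 - 148]". Assumes the transcript name has no spaces.
HEADER_RE = re.compile(
    r"^>(?P<name>\S+)_(?P<index>\d+) \[(?P<start>\d+) - (?P<end>\d+)\]"
)


def parse_header(line):
    """Return (seqID, start, end, length) for one FASTA header line."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized header: %r" % line)
    start = int(match.group("start"))
    end = int(match.group("end"))
    # Length is the absolute coordinate difference, as in the script.
    return match.group("name"), start, end, abs(end - start)


def longest_orf_index(orfs):
    """Index of the longest (start, end, length) entry; the last one wins ties."""
    best = 0
    for i, (_start, _end, length) in enumerate(orfs):
        if length >= orfs[best][2]:
            best = i
    return best


# Headers of the first transcript from the docstring example.
headers = [
    ">253936-254394(+)_1 [28 - 63]",
    ">253936-254394(+)_2 [18 - 77]",
    ">253936-254394(+)_3 [32 - 148]",
    ">253936-254394(+)_4 [117 - 152]",
]
parsed = [parse_header(h) for h in headers]
orfs = [(start, end, length) for _seq_id, start, end, length in parsed]

print(parsed[2])                # ('253936-254394(+)', 32, 148, 116)
print(longest_orf_index(orfs))  # 2
```

Keeping these two steps as pure functions, separate from file handling, is one possible design; it lets the tie-breaking rule and the orientation logic be tested without input files.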
