annotate venn_list.py @ 15:367a0403b7d2 draft

planemo upload for repository https://github.com/peterjc/pico_galaxy/tools/venn_list commit 6c4ac223d511bbcd0ec9cbada730613a5fe9f1af-dirty
author peterjc
date Thu, 07 May 2015 05:43:50 -0400
#!/usr/bin/env python
"""Plot up to 3-way Venn Diagram using R limma vennDiagram (via rpy)

This script is copyright 2010 by Peter Cock, The James Hutton Institute
(formerly SCRI), UK. All rights reserved.
See accompanying text file for licence details (MIT/BSD style).

This is version 0.0.8 of the script.
"""


import sys

def sys_exit(msg, error_level=1):
    """Print error message to stderr and quit with given error level."""
    sys.stderr.write("%s\n" % msg)
    sys.exit(error_level)

try:
    import rpy
except ImportError:
    sys_exit("Requires the Python library rpy (to call R)")
except RuntimeError as e:
    sys_exit("The Python library rpy is not available for the current R version\n\n%s" % e)

try:
    rpy.r.library("limma")
except Exception:
    sys_exit("Requires the R library limma (for vennDiagram function)")


if len(sys.argv)-1 not in [7, 10, 13]:
    sys_exit("Expected 7, 10 or 13 arguments (for 1, 2 or 3 sets), not %i" % (len(sys.argv)-1))

all_file, all_type, all_label = sys.argv[1:4]
set_data = []
if len(sys.argv)-1 >= 7:
    set_data.append(tuple(sys.argv[4:7]))
if len(sys.argv)-1 >= 10:
    set_data.append(tuple(sys.argv[7:10]))
if len(sys.argv)-1 >= 13:
    set_data.append(tuple(sys.argv[10:13]))
pdf_file = sys.argv[-1]
n = len(set_data)
print("Doing %i-way Venn Diagram" % n)

def load_ids(filename, filetype):
    """Yield the identifiers found in a tabular, FASTA, FASTQ or SFF file."""
    if filetype == "tabular":
        for line in open(filename):
            line = line.rstrip("\n")
            if line and not line.startswith("#"):
                yield line.split("\t", 1)[0]
    elif filetype == "fasta":
        for line in open(filename):
            if line.startswith(">"):
                yield line[1:].rstrip("\n").split(None, 1)[0]
    elif filetype.startswith("fastq"):
        # Use the Galaxy library not Biopython to cope with CS
        from galaxy_utils.sequence.fastq import fastqReader
        handle = open(filename, "rU")
        for record in fastqReader(handle):
            # The [1:] is because the fastqReader leaves the @ on the identifier.
            yield record.identifier.split()[0][1:]
        handle.close()
    elif filetype == "sff":
        try:
            from Bio.SeqIO import index
        except ImportError:
            sys_exit("Require Biopython 1.54 or later (to read SFF files)")
        # This will read the SFF index block if present (very fast)
        for name in index(filename, "sff"):
            yield name
    else:
        sys_exit("Unexpected file type %s" % filetype)

def load_ids_whitelist(filename, filetype, whitelist):
    """Yield identifiers as load_ids does, aborting on any ID not in whitelist."""
    for name in load_ids(filename, filetype):
        if name in whitelist:
            yield name
        else:
            sys_exit("Unexpected ID %s in %s file %s" % (name, filetype, filename))

if all_file in ["", "-", '""', '"-"']:
    # Load without white list
    sets = [set(load_ids(f, t)) for (f, t, c) in set_data]
    # Take union
    all = set()
    for s in sets:
        all.update(s)
    print("Inferred total of %i IDs" % len(all))
else:
    all = set(load_ids(all_file, all_type))
    print("Total of %i IDs" % len(all))
    sets = [set(load_ids_whitelist(f, t, all)) for (f, t, c) in set_data]

for s, (f, t, c) in zip(sets, set_data):
    print("%i in %s" % (len(s), c))

# Now call R library to draw simple Venn diagram
try:
    # Create dummy Venn diagram counts object for the n groups
    cols = 'c("%s")' % '","'.join("Set%i" % (i+1) for i in range(n))
    rpy.r('groups <- cbind(%s)' % ','.join(['1']*n))
    rpy.r('colnames(groups) <- %s' % cols)
    rpy.r('vc <- vennCounts(groups)')
    # Populate the 2^n classes with real counts
    # Don't make any assumptions about the class order
    # print(rpy.r('vc'))
    for index, row in enumerate(rpy.r('vc[,%s]' % cols)):
        if isinstance(row, int) or isinstance(row, float):
            # Hack for rpy being too clever for single element row
            row = [row]
        names = all
        for wanted, s in zip(row, sets):
            if wanted:
                names = names.intersection(s)
            else:
                names = names.difference(s)
        rpy.r('vc[%i,"Counts"] <- %i' % (index+1, len(names)))
    # print(rpy.r('vc'))
    if n == 1:
        # Single circle, don't need to add (Total XXX) line
        names = [c for (f, t, c) in set_data]
    else:
        names = ["%s\n(Total %i)" % (c, len(s)) for s, (f, t, c) in zip(sets, set_data)]
    rpy.r.assign("names", names)
    rpy.r.assign("colors", ["red", "green", "blue"][:n])
    rpy.r.pdf(pdf_file, 8, 8)
    rpy.r("""vennDiagram(vc, include="both", names=names,
                         main="%s", sub="(Total %i)",
                         circle.col=colors)
                         """ % (all_label, len(all)))
    rpy.r.dev_off()
except Exception as exc:
    sys_exit(str(exc))
rpy.r.quit(save="no")
print("Done")
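
The loop that fills the 2^n classes of the vennCounts table can be tried without R or rpy. The following is a minimal sketch, not part of venn_list.py: the helper name venn_region_counts and the example IDs are hypothetical, and itertools.product stands in for the membership rows that the script reads back from R. For each pattern of 0/1 flags it starts from the full ID set and intersects with, or subtracts, each input set in turn, as the script does.

```python
from itertools import product


def venn_region_counts(universe, sets):
    """Map each membership pattern to the number of IDs in that region.

    A pattern is a tuple of 0/1 flags, one per set: 1 means the IDs must
    be inside that set, 0 means they must be outside it.
    (Hypothetical helper for illustration; not defined in venn_list.py.)
    """
    counts = {}
    for pattern in product((0, 1), repeat=len(sets)):
        names = set(universe)
        for wanted, s in zip(pattern, sets):
            if wanted:
                names = names.intersection(s)
            else:
                names = names.difference(s)
        counts[pattern] = len(names)
    return counts


# Made-up example data, not taken from the tool's test files.
set_a = {"id1", "id2", "id3"}
set_b = {"id2", "id3", "id4"}
universe = set_a | set_b | {"id5"}
counts = venn_region_counts(universe, [set_a, set_b])
```

Because every ID in the universe falls into exactly one region, the region counts should add up to the total shown in the diagram subtitle.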