annotate create_reference_dataset.xml @ 46:e500b50b72fd draft default tip
Uploaded
author | jjohnson |
---|---|
date | Thu, 19 Oct 2017 10:05:54 -0400 |
parents | aedaa66483f1 |
children |
rev | line source |
---|---|
44 | 1 <tool id="create_defuse_reference" name="Create DeFuse Reference" version="@DEFUSE_VERSION@.1"> |
12 | 2 <description>create a defuse reference from Ensembl and UCSC sources</description> |
44 | 3 <macros> |
4 <import>macros.xml</import> | |
5 </macros> | |
6 <requirements> | |
7 <expand macro="defuse_requirement" /> | |
8 </requirements> | |
19 | 9 <command interpreter="command"> /bin/bash $defuse_script </command> |
12 | 10 <inputs> |
18 | 11 <conditional name="genome"> |
12 <param name="choice" type="select" label="Select a Genome Build"> |
44 | 13 <option value="GRCh38">Homo_sapiens GRCh38 hg38</option> |
18 | 14 <option value="GRCh37">Homo_sapiens GRCh37 hg19</option> |
15 <option value="NCBI36">Homo_sapiens NCBI36 hg18</option> |
16 <option value="GRCm38">Mus_musculus GRCm38 mm10</option> |
17 <option value="NCBIM37">Mus_musculus NCBIM37 mm9</option> |
18 <option value="Rnor_5.0">Rattus_norvegicus Rnor_5.0 rn5</option> |
19 <option value="user_specified">User specified</option> |
20 </param> |
44 | 21 <when value="GRCh38"> |
22 <param name="ensembl_organism" type="hidden" value="homo_sapiens"/> | |
23 <param name="ensembl_prefix" type="hidden" value="Homo_sapiens"/> | |
24 <param name="ensembl_genome_version" type="hidden" value="GRCh38"/> | |
25 <param name="ensembl_version" type="hidden" value="80"/> | |
26 <param name="ncbi_organism" type="hidden" value="Homo_sapiens"/> | |
27 <param name="ncbi_prefix" type="hidden" value="Hs"/> | |
28 <param name="ucsc_genome_version" type="hidden" value="hg38"/> | |
29 <param name="chromosomes" type="hidden" value="1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT"/> | |
30 <param name="mt_chromosome" type="hidden" value="MT"/> | |
31 <param name="gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,processed_transcript,protein_coding"/> | |
32 <param name="ig_gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,IG_pseudogene"/> | |
33 <param name="rrna_gene_sources" type="hidden" value="Mt_rRNA,rRNA,rRNA_pseudogene"/> | |
34 </when> | |
18 | 35 <when value="GRCh37"> |
36 <param name="ensembl_organism" type="hidden" value="homo_sapiens"/> |
37 <param name="ensembl_prefix" type="hidden" value="Homo_sapiens"/> |
38 <param name="ensembl_genome_version" type="hidden" value="GRCh37"/> |
39 <param name="ensembl_version" type="hidden" value="71"/> |
40 <param name="ncbi_organism" type="hidden" value="Homo_sapiens"/> |
41 <param name="ncbi_prefix" type="hidden" value="Hs"/> |
42 <param name="ucsc_genome_version" type="hidden" value="hg19"/> |
43 <param name="chromosomes" type="hidden" value="1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT"/> |
44 <param name="mt_chromosome" type="hidden" value="MT"/> |
45 <param name="gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,processed_transcript,protein_coding"/> |
46 <param name="ig_gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,IG_pseudogene"/> |
47 <param name="rrna_gene_sources" type="hidden" value="Mt_rRNA,rRNA,rRNA_pseudogene"/> |
48 </when> |
49 <when value="NCBI36"> |
50 <param name="ensembl_organism" type="hidden" value="homo_sapiens"/> |
51 <param name="ensembl_prefix" type="hidden" value="Homo_sapiens"/> |
52 <param name="ensembl_genome_version" type="hidden" value="NCBI36"/> |
53 <param name="ensembl_version" type="hidden" value="54"/> |
54 <param name="ncbi_organism" type="hidden" value="Homo_sapiens"/> |
55 <param name="ncbi_prefix" type="hidden" value="Hs"/> |
56 <param name="ucsc_genome_version" type="hidden" value="hg18"/> |
57 <param name="chromosomes" type="hidden" value="1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT"/> |
58 <param name="mt_chromosome" type="hidden" value="MT"/> |
59 <param name="gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,processed_transcript,protein_coding"/> |
60 <param name="ig_gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,IG_pseudogene"/> |
61 <param name="rrna_gene_sources" type="hidden" value="Mt_rRNA,rRNA,rRNA_pseudogene"/> |
62 </when> |
63 <when value="GRCm38"> |
64 <param name="ensembl_organism" type="hidden" value="mus_musculus"/> |
65 <param name="ensembl_prefix" type="hidden" value="Mus_musculus"/> |
66 <param name="ensembl_genome_version" type="hidden" value="GRCm38"/> |
67 <param name="ensembl_version" type="hidden" value="71"/> |
68 <param name="ncbi_organism" type="hidden" value="Mus_musculus"/> |
69 <param name="ncbi_prefix" type="hidden" value="Mm"/> |
70 <param name="ucsc_genome_version" type="hidden" value="mm10"/> |
71 <param name="chromosomes" type="hidden" value="1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X,Y,MT"/> |
72 <param name="mt_chromosome" type="hidden" value="MT"/> |
73 <param name="gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,processed_transcript,protein_coding"/> |
74 <param name="ig_gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,IG_pseudogene"/> |
75 <param name="rrna_gene_sources" type="hidden" value="Mt_rRNA,rRNA,rRNA_pseudogene"/> |
76 </when> |
77 <when value="NCBIM37"> |
78 <param name="ensembl_organism" type="hidden" value="mus_musculus"/> |
79 <param name="ensembl_prefix" type="hidden" value="Mus_musculus"/> |
80 <param name="ensembl_genome_version" type="hidden" value="NCBIM37"/> |
81 <param name="ensembl_version" type="hidden" value="67"/> |
82 <param name="ncbi_organism" type="hidden" value="Mus_musculus"/> |
83 <param name="ncbi_prefix" type="hidden" value="Mm"/> |
84 <param name="ucsc_genome_version" type="hidden" value="mm9"/> |
85 <param name="chromosomes" type="hidden" value="1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X,Y,MT"/> |
86 <param name="mt_chromosome" type="hidden" value="MT"/> |
87 <param name="gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,processed_transcript,protein_coding"/> |
88 <param name="ig_gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,IG_pseudogene"/> |
89 <param name="rrna_gene_sources" type="hidden" value="Mt_rRNA,rRNA,rRNA_pseudogene"/> |
90 </when> |
91 <when value="Rnor_5.0"> |
92 <param name="ensembl_organism" type="hidden" value="rattus_norvegicus"/> |
93 <param name="ensembl_prefix" type="hidden" value="Rattus_norvegicus"/> |
94 <param name="ensembl_genome_version" type="hidden" value="Rnor_5.0"/> |
95 <param name="ensembl_version" type="hidden" value="71"/> |
96 <param name="ncbi_organism" type="hidden" value="Rattus_norvegicus"/> |
97 <param name="ncbi_prefix" type="hidden" value="Rn"/> |
98 <param name="ucsc_genome_version" type="hidden" value="rn5"/> |
99 <param name="chromosomes" type="hidden" value="1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X,MT"/> |
100 <param name="mt_chromosome" type="hidden" value="MT"/> |
101 <param name="gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,processed_transcript,protein_coding"/> |
102 <param name="ig_gene_sources" type="hidden" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,IG_pseudogene"/> |
103 <param name="rrna_gene_sources" type="hidden" value="Mt_rRNA,rRNA,rRNA_pseudogene"/> |
104 </when> |
105 <when value="user_specified"> |
34 | 106 <param name="ensembl_organism" type="text" value="" label="Ensembl Organism Name"> |
107 <help> | |
108 Examples: homo_sapiens, mus_musculus, rattus_norvegicus | |
109 ftp://ftp.ensembl.org/pub/release-$ensembl_version/fasta/$ensembl_organism/dna/$ensembl_prefix.$ensembl_genome_version.$ensembl_version.dna.chromosome.$chromosome.fa.gz | |
110 </help> | |
111 </param> | |
18 | 112 <param name="ensembl_prefix" type="text" value="" label="Ensembl Organism prefix" help="Examples: Homo_sapiens, Mus_musculus, Rattus_norvegicus"/> |
113 <param name="ensembl_genome_version" type="text" value="" label="Ensembl Genome Version" help="Examples: GRCh37, GRCm38, Rnor_5.0"/> |
114 <param name="ensembl_version" type="integer" value="" label="Ensembl Release Version" help="Example: 71"/> |
115 <param name="ncbi_organism" type="text" value="" label="NCBI Organism Name" help="Examples: Homo_sapiens, Mus_musculus, Rattus_norvegicus"/> |
116 <param name="ncbi_prefix" type="text" value="" label="NCBI Organism Unigene prefix" help="Examples: Hs, Mm, Rn"/> |
117 <param name="ucsc_genome_version" type="text" value="" label="UCSC Genome Version" help="Examples: hg19, mm10, rn5"/> |
118 <param name="chromosomes" type="text" value="" label="Chromosomes for Ensembl genome build" > |
119 <help> Examples: |
120 Homo_sapiens: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT |
121 Mus_musculus: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X,Y,MT |
122 Rattus_norvegicus: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,X,MT |
123 ( ftp://ftp.ensembl.org/pub/release-71/fasta/homo_sapiens/dna/ ) |
124 </help> |
125 </param> |
126 <param name="mt_chromosome" type="text" value="MT" label="Ensembl Mitochondrial Chromosome name" /> |
127 <param name="gene_sources" type="text" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,processed_transcript,protein_coding" label="Gene sources" /> |
128 <param name="ig_gene_sources" type="text" value="IG_C_gene,IG_D_gene,IG_J_gene,IG_V_gene,IG_pseudogene" label="IG Gene sources" /> |
129 <param name="rrna_gene_sources" type="text" value="Mt_rRNA,rRNA,rRNA_pseudogene" label="Ribosomal Gene sources" /> |
130 </when> |
131 </conditional> |
12 | 132 </inputs> |
133 <outputs> | |
19 | 134 <data format="defuse.conf" name="config_txt" label="${tool.name} on ${genome.ensembl_genome_version} : config.txt"/> |
12 | 135 </outputs> |
14 | 136 <stdio> |
137 <exit_code range="1:" level="fatal" description="Error running Create DeFuse Reference" /> |
138 <regex match="Error:" |
139 source="both" |
140 level="fatal" |
141 description="Error running Create DeFuse Reference" /> |
142 |
143 </stdio> |
12 | 144 <configfiles> |
145 <configfile name="defuse_config"> | |
146 # | |
147 # Configuration file for defuse | |
148 # | |
19 | 149 # Variables that designate the PATH to an application, e.g. __SAMTOOLS_BIN__ |
150 # will be set by the runtime script using the ENV PATH |
12 | 151 # |
152 | |
153 # Directory where the defuse code was unpacked | |
154 source_directory = __DEFUSE_PATH__ | |
155 | |
19 | 156 # Organism IDs |
18 | 157 ensembl_organism = $genome.ensembl_organism |
158 ensembl_prefix = $genome.ensembl_prefix |
159 ensembl_version = $genome.ensembl_version |
160 ensembl_genome_version = $genome.ensembl_genome_version |
161 ucsc_genome_version = $genome.ucsc_genome_version |
162 ncbi_organism = $genome.ncbi_organism |
163 ncbi_prefix = $genome.ncbi_prefix |
12 | 164 |
165 # Directory where you want your dataset | |
35 | 166 dataset_directory = $config_txt.dataset.extra_files_path |
12 | 167 |
168 #raw | |
169 # Input genome and gene models | |
18 | 170 gene_models = $(dataset_directory)/$(ensembl_prefix).$(ensembl_genome_version).$(ensembl_version).gtf |
171 genome_fasta = $(dataset_directory)/$(ensembl_prefix).$(ensembl_genome_version).$(ensembl_version).dna.chromosomes.fa |
12 | 172 |
173 # Repeat table from ucsc genome browser | |
174 repeats_filename = $(dataset_directory)/repeats.txt | |
175 | |
176 # EST info downloaded from ucsc genome browser | |
177 est_fasta = $(dataset_directory)/est.fa | |
178 est_alignments = $(dataset_directory)/intronEst.txt | |
179 | |
180 # Unigene clusters downloaded from ncbi | |
18 | 181 unigene_fasta = $(dataset_directory)/$(ncbi_prefix).seq.uniq |
12 | 182 #end raw |
183 | |
184 # Paths to external tools | |
185 samtools_bin = __SAMTOOLS_BIN__ | |
186 bowtie_bin = __BOWTIE_BIN__ | |
187 bowtie_build_bin = __BOWTIE_BUILD_BIN__ | |
188 blat_bin = __BLAT_BIN__ | |
189 fatotwobit_bin = __FATOTWOBIT_BIN__ | |
190 gmap_bin = __GMAP_BIN__ | |
191 gmap_setup_bin = __GMAP_SETUP_BIN__ | |
192 r_bin = __R_BIN__ | |
193 rscript_bin = __RSCRIPT_BIN__ | |
194 | |
195 #raw | |
196 # Directory for the GMAP index | |
197 gmap_index_directory = $(dataset_directory)/gmap | |
198 #end raw | |
199 | |
200 #raw | |
201 # Dataset files | |
202 dataset_prefix = $(dataset_directory)/defuse | |
203 chromosome_prefix = $(dataset_prefix).dna.chromosomes | |
204 exons_fasta = $(dataset_prefix).exons.fa | |
205 cds_fasta = $(dataset_prefix).cds.fa | |
206 cdna_regions = $(dataset_prefix).cdna.regions | |
207 cdna_fasta = $(dataset_prefix).cdna.fa | |
208 reference_fasta = $(dataset_prefix).reference.fa | |
209 rrna_fasta = $(dataset_prefix).rrna.fa | |
210 ig_gene_list = $(dataset_prefix).ig.gene.list | |
211 repeats_regions = $(dataset_directory)/repeats.regions | |
212 est_split_fasta1 = $(dataset_directory)/est.1.fa | |
213 est_split_fasta2 = $(dataset_directory)/est.2.fa | |
214 est_split_fasta3 = $(dataset_directory)/est.3.fa | |
215 est_split_fasta4 = $(dataset_directory)/est.4.fa | |
216 est_split_fasta5 = $(dataset_directory)/est.5.fa | |
217 est_split_fasta6 = $(dataset_directory)/est.6.fa | |
218 est_split_fasta7 = $(dataset_directory)/est.7.fa | |
219 est_split_fasta8 = $(dataset_directory)/est.8.fa | |
220 est_split_fasta9 = $(dataset_directory)/est.9.fa | |
221 | |
222 # Fasta files with bowtie indices for prefiltering reads for concordantly mapping pairs | |
223 prefilter1 = $(unigene_fasta) | |
224 | |
225 # deFuse scripts and tools | |
226 scripts_directory = $(source_directory)/scripts | |
227 tools_directory = $(source_directory)/tools | |
228 data_directory = $(source_directory)/data | |
229 #end raw | |
230 | |
231 # Parameters for building the dataset | |
18 | 232 chromosomes = $genome.chromosomes |
233 mt_chromosome = $genome.mt_chromosome |
234 gene_sources = $genome.gene_sources |
235 ig_gene_sources = $genome.ig_gene_sources |
236 rrna_gene_sources = $genome.rrna_gene_sources |
44 | 237 gene_biotypes = $genome.gene_sources |
238 ig_gene_biotypes = $genome.ig_gene_sources | |
239 rrna_gene_biotypes = $genome.rrna_gene_sources | |
12 | 240 |
241 #raw | |
242 # Remove temp files | |
243 remove_job_files = yes | |
244 remove_job_temp_files = yes | |
245 #end raw | |
246 </configfile> | |
19 | 247 <configfile name="defuse_script"> |
12 | 248 #!/bin/bash |
249 ## define some things for cheetah processing | |
250 #set $amp = chr(38) | |
251 #set $gt = chr(62) | |
252 ## substitute pathnames into config file | |
253 if `grep __DEFUSE_PATH__ $defuse_config ${gt} /dev/null`;then sed -i'.tmp' "s#__DEFUSE_PATH__#\${DEFUSE_PATH}#" $defuse_config; fi | |
254 if `grep __SAMTOOLS_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} SAMTOOLS_BIN=`which samtools`;then sed -i'.tmp' "s#__SAMTOOLS_BIN__#\${SAMTOOLS_BIN}#" $defuse_config; fi | |
255 if `grep __BOWTIE_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} BOWTIE_BIN=`which bowtie`;then sed -i'.tmp' "s#__BOWTIE_BIN__#\${BOWTIE_BIN}#" $defuse_config; fi | |
256 if `grep __BOWTIE_BUILD_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} BOWTIE_BUILD_BIN=`which bowtie-build`;then sed -i'.tmp' "s#__BOWTIE_BUILD_BIN__#\${BOWTIE_BUILD_BIN}#" $defuse_config; fi | |
257 if `grep __BLAT_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} BLAT_BIN=`which blat`;then sed -i'.tmp' "s#__BLAT_BIN__#\${BLAT_BIN}#" $defuse_config; fi | |
258 if `grep __FATOTWOBIT_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} FATOTWOBIT_BIN=`which faToTwoBit`;then sed -i'.tmp' "s#__FATOTWOBIT_BIN__#\${FATOTWOBIT_BIN}#" $defuse_config; fi | |
259 if `grep __GMAP_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} GMAP_BIN=`which gmap`;then sed -i'.tmp' "s#__GMAP_BIN__#\${GMAP_BIN}#" $defuse_config; fi | |
260 if `grep __GMAP_SETUP_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} GMAP_SETUP_BIN=`which gmap_setup`;then sed -i'.tmp' "s#__GMAP_SETUP_BIN__#\${GMAP_SETUP_BIN}#" $defuse_config; fi | |
261 if `grep __GMAP_INDEX_DIR__ $defuse_config ${gt} /dev/null` ${amp}${amp} GMAP_INDEX_DIR=`pwd`/gmap;then sed -i'.tmp' "s#__GMAP_INDEX_DIR__#\${GMAP_INDEX_DIR}#" $defuse_config; fi | |
262 if `grep __R_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} R_BIN=`which R`;then sed -i'.tmp' "s#__R_BIN__#\${R_BIN}#" $defuse_config; fi | |
263 if `grep __RSCRIPT_BIN__ $defuse_config ${gt} /dev/null` ${amp}${amp} RSCRIPT_BIN=`which Rscript`;then sed -i'.tmp' "s#__RSCRIPT_BIN__#\${RSCRIPT_BIN}#" $defuse_config; fi | |
264 ## copy config to output | |
265 cp $defuse_config $config_txt | |
266 ## make the extra_files_path directory for the config output | |
35 | 267 mkdir -p $config_txt.dataset.extra_files_path |
268 ## create_reference_dataset.pl |
12 | 269 perl \${DEFUSE_PATH}/scripts/create_reference_dataset.pl -c $defuse_config |
270 </configfile> | |
271 </configfiles> | |
272 | |
273 <tests> | |
274 </tests> | |
275 <help> | |
276 **DeFuse** | |
277 | |
278 DeFuse_ is a software package for gene fusion discovery using RNA-Seq data. The software uses clusters of discordant paired-end alignments to inform a split-read alignment analysis for finding fusion boundaries. The software also employs a number of heuristic filters to reduce the number of false positives and produces a fully annotated output for each predicted fusion. See the DeFuse_Version_0.6_ manual for details. |
279 |
280 DeFuse uses a Reference Dataset to search for gene fusions. The Reference Dataset is generated from the following sources in DeFuse_Version_0.6_: |
281 - genome_fasta from Ensembl |
282 - gene_models from Ensembl |
283 - repeats_filename from UCSC RepeatMasker rmsk.txt |
284 - est_fasta from UCSC |
285 - est_alignments from UCSC intronEst.txt |
286 - unigene_fasta from NCBI |
287 |
288 The create_defuse_reference Galaxy tool downloads the reference genome and other source files, and builds any derivative files including bowtie indices, gmap indices, and 2bit files. Expect this step to take at least 12 hours. |
289 |
290 |
291 The tool generates a config.txt file that can be used as input to the deFuse Galaxy tool. |
12 | 292 |
293 Journal reference: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1001138 | |
294 | |
295 .. _DeFuse: http://sourceforge.net/apps/mediawiki/defuse/index.php?title=Main_Page | |
296 | |
297 .. _DeFuse_Version_0.6: http://sourceforge.net/apps/mediawiki/defuse/index.php?title=DeFuse_Version_0.6.1 |
12 | 298 |
299 ------ | |
300 | |
301 **Outputs** | |
302 | |
303 The Galaxy history will contain the config.txt file, which provides DeFuse with the reference data paths. |
12 | 304 |
305 </help> | |
44 | 306 <expand macro="citations"/> |
12 | 307 </tool> |