annotate protxml_to_gff.xml @ 1:20639ed90568 draft default tip

Uploaded
author iracooke
date Tue, 18 Mar 2014 22:41:32 -0400
parents 1bea12973e53
children
<tool id="protxml_to_gff" name="ProtXML to GFF" version="1.0.1">
  <requirements>
    <requirement type="package" version="1.2.6">protk</requirement>
    <requirement type="package" version="2.2.29">blast+</requirement>
  </requirements>

  <description>Map peptides from a protXML file to genomic coordinates</description>

  <command>
    protxml_to_gff.rb -p $protxml_file
    -g $genome_fasta_file
    -d $protein_fasta_file
    -o $output
    --threshold $peptide_threshold
    --prot-threshold $protein_threshold
    $stack_charges
    $collapse_redundant_proteins
  </command>

  <stdio>
    <exit_code range="1:" level="fatal" description="Failure" />
  </stdio>

  <inputs>
    <param name="protxml_file" type="data" format="protxml" help="ProtXML containing combined results from all searches" label="ProtXML File" />
    <param name="genome_fasta_file" type="data" format="fasta" help="The genome against which peptides will be mapped" label="Genome fasta file" />
    <param name="protein_fasta_file" type="data" format="fasta" help="The database used for MS/MS searches (must have genomic coordinates encoded in the FASTA header)" label="Protein fasta file" />

    <param name="peptide_threshold" help="Peptide Probability Threshold" type="float" value="0.95" min="0" max="1" label="Peptide Probability Threshold" />
    <param name="protein_threshold" help="Protein Probability Threshold" type="float" value="0.99" min="0" max="1" label="Protein Probability Threshold" />

    <param name="stack_charges" type="boolean" label="Stack Charges" help="Different peptide charge states get separate GFF entries" truevalue="--stack-charge-states" falsevalue=""/>

    <param name="collapse_redundant_proteins" type="boolean" label="Collapse Redundant Proteins" help="Proteins that cover genomic regions already covered will be skipped" truevalue="--collapse-redundant-proteins" falsevalue=""/>

  </inputs>

  <outputs>
    <data format="gff3" name="output" />
  </outputs>
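
  <!--
    The block below is a minimal sketch of a Galaxy functional test for the
    inputs and output declared in this tool. It is an illustration under
    stated assumptions, not part of the uploaded changeset: every file name
    in it is hypothetical, and matching files would have to be placed in the
    tool's test-data directory before the test could run. The threshold
    values simply repeat the defaults declared in the inputs section.
  -->
  <tests>
    
  </tests>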


  <help>

**What it does**

Generates a GFF file containing genomic coordinates for peptides present in a protXML file.

For this tool to work, the inputs must satisfy the following requirements.

1. The genome FASTA should encode the scaffold numbers in its headers, as in the following example

>scaffoldXXX

or

>scaffold_XXX

where XXX represents the digits encoding the scaffold number. Any number of digits is allowed.

2. The protXML should have been generated by searching a database generated using the protk Generate 6 frame translation tool and the extract proteins from gff3 tool. Both of those tools should be run with the genomic coordinates included in the output file.

----

**References**

  </help>

</tool>