comparison bed_to_protein_map.xml @ 0:4702e7f629bb draft default tip
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/bed_to_protein_map commit 38e8d0e983c3aa314e13bdc9ea98f4a728b7772c-dirty
| author | jjohnson |
|---|---|
| date | Mon, 20 Nov 2017 14:58:18 -0500 |
| parents | |
| children | |
comparison
| -1:000000000000 | 0:4702e7f629bb |
|---|---|
| 1 <tool id="bed_to_protein_map" name="bed to protein map" version="0.1.0"> | |
| 2 <description>genomic location of proteins for MVP</description> | |
| 3 <requirements> | |
| 4 </requirements> | |
| 5 <stdio> | |
| 6 <exit_code range="1:" /> | |
| 7 </stdio> | |
| 8 <command><![CDATA[ | |
| 9 python '$__tool_directory__/bed_to_protein_map.py' -i '$input' -o '$output' | |
| 10 ]]></command> | |
| 11 <inputs> | |
| 12 <param name="input" type="data" format="bed" label="A BED file with 12 columns; thickStart and thickEnd define the protein coding region"/> | |
| 13 </inputs> | |
| 14 <outputs> | |
| 15 <data name="output" format="tabular"> | |
| 16 <actions> | |
| 17 <action name="column_names" type="metadata" default="name,chrom,start,end,strand,cds_start,cds_end"/> | |
| 18 </actions> | |
| 19 </data> | |
| 20 </outputs> | |
| 21 <tests> | |
| 22 <test> | |
| 23 <param name="input" ftype="bed" value="input.bed"/> | |
| 24 <output name="output" file="output.tabular"/> | |
| 25 </test> | |
| 26 </tests> | |
| 27 <help><![CDATA[ | |
| 28 Convert a BED format file of the proteins from a proteomics search database into a tabular format for the Multiomics Visualization Platform (MVP). | |
| 29 | |
| 30 Example input BED dataset:: | |
| 31 | |
| 32 X 276352 291629 ENST00000430923 20 + 284187 291629 80,80,80 5 42,148,137,129,131 0,7814,12380,14295,15146 | |
| 33 X 304749 318819 ENST00000326153 20 - 305073 318787 80,80,80 10 448,153,149,209,159,68,131,71,138,381 0,2610,2982,6669,8016,9400,10140,10479,12164,13689 | |
| 34 | |
| 35 | |
| 36 Output:: | |
| 37 | |
| 38 name chrom start end strand cds_start cds_end | |
| 39 ENST00000430923 X 284187 284314 + 0 127 | |
| 40 ENST00000430923 X 288732 288869 + 127 264 | |
| 41 ENST00000430923 X 290647 290776 + 264 393 | |
| 42 ENST00000430923 X 291498 291629 + 393 524 | |
| 43 ENST00000326153 X 318438 318787 - 0 349 | |
| 44 ENST00000326153 X 316913 317051 - 349 487 | |
| 45 ENST00000326153 X 315228 315299 - 487 558 | |
| 46 ENST00000326153 X 314889 315020 - 558 689 | |
| 47 ENST00000326153 X 314149 314217 - 689 757 | |
| 48 ENST00000326153 X 312765 312924 - 757 916 | |
| 49 ENST00000326153 X 311418 311627 - 916 1125 | |
| 50 ENST00000326153 X 307731 307880 - 1125 1274 | |
| 51 ENST00000326153 X 307359 307512 - 1274 1427 | |
| 52 ENST00000326153 X 305073 305197 - 1427 1551 | |
| 53 | |
| 54 | |
| 55 The tabular output can be converted to a SQLite database using the Query_Tabular_ tool. | |
| 56 | |
| 57 The SQLite table should be named: feature_cds_map | |
| 58 The names for the columns should be: name,chrom,start,end,strand,cds_start,cds_end | |
| 59 | |
| 60 This SQL query returns the genomic location of a peptide sequence within a protein (multiply the amino acid position by 3 to get the cds location):: | |
| 61 | |
| 62 SELECT distinct chrom, CASE WHEN strand = '+' THEN start + cds_offset - cds_start ELSE end - (cds_offset - cds_start) END as "pos" | |
| 63 FROM feature_cds_map | |
| 64 WHERE name = acc_name AND cds_offset >= cds_start AND cds_offset < cds_end | |
| 65 | |
| 66 | |
| 67 .. _Query_Tabular: https://toolshed.g2.bx.psu.edu/view/iuc/query_tabular/1ea4e668bf73 | |
| 68 | |
| 69 ]]></help> | |
| 70 </tool> | |
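The help text above shows an example BED input and the tabular output, but bed_to_protein_map.py itself is not part of this diff. The following is a minimal sketch of how such a conversion could work, inferred only from that example; the function name `bed12_to_cds_map` is hypothetical and this is not the author's script. It assumes a standard 12-column BED line, clips each exon to the thickStart/thickEnd interval, and accumulates CDS offsets in transcript order (reversed for the minus strand).

```python
def bed12_to_cds_map(bed_line):
    """Return (name, chrom, start, end, strand, cds_start, cds_end) rows
    for the coding part of each exon in one BED12 line.

    Hypothetical helper: a sketch inferred from the example in the help
    text, not the code of bed_to_protein_map.py.
    """
    f = bed_line.split()
    chrom, chrom_start, name, strand = f[0], int(f[1]), f[3], f[5]
    thick_start, thick_end = int(f[6]), int(f[7])
    # blockSizes / blockStarts may carry a trailing comma in BED files.
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    offsets = [int(x) for x in f[11].rstrip(",").split(",")]

    # Clip each exon to [thickStart, thickEnd); skip exons with no coding part.
    segments = []
    for size, offset in zip(sizes, offsets):
        start = max(chrom_start + offset, thick_start)
        end = min(chrom_start + offset + size, thick_end)
        if start < end:
            segments.append((start, end))

    # Transcript order runs from high to low coordinates on the minus strand.
    if strand == "-":
        segments.reverse()

    rows, cds_pos = [], 0
    for start, end in segments:
        length = end - start
        rows.append((name, chrom, start, end, strand, cds_pos, cds_pos + length))
        cds_pos += length
    return rows


if __name__ == "__main__":
    line = ("X 276352 291629 ENST00000430923 20 + 284187 291629 80,80,80 5 "
            "42,148,137,129,131 0,7814,12380,14295,15146")
    for row in bed12_to_cds_map(line):
        print("\t".join(str(v) for v in row))
```

Run on the first example line, the sketch yields the four ENST00000430923 rows listed in the help text; the non-coding first exon is dropped because it ends before thickStart.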

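The lookup query in the help text can be tried with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the rows are copied from the example output, `acc_name` and `cds_offset` are bound as named parameters, the helper names `build_db` and `genomic_position` are hypothetical, the amino acid position is treated as zero-based, and the minus-strand branch is written as `end - (cds_offset - cds_start)`, which is an interpretation of the intended arithmetic rather than something the source confirms.

```python
import sqlite3

# Rows copied from the example output in the help text.
ROWS = [
    ("ENST00000430923", "X", 284187, 284314, "+", 0, 127),
    ("ENST00000430923", "X", 288732, 288869, "+", 127, 264),
    ("ENST00000326153", "X", 318438, 318787, "-", 0, 349),
    ("ENST00000326153", "X", 316913, 317051, "-", 349, 487),
]

# "end" is quoted because END is also an SQL keyword.
QUERY = """
SELECT DISTINCT chrom,
       CASE WHEN strand = '+'
            THEN start + (:cds_offset - cds_start)
            ELSE "end" - (:cds_offset - cds_start)  -- assumed minus-strand form
       END AS pos
FROM feature_cds_map
WHERE name = :acc_name
  AND :cds_offset >= cds_start
  AND :cds_offset < cds_end
"""


def build_db(rows):
    """Create an in-memory feature_cds_map table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE feature_cds_map (name TEXT, chrom TEXT, start INTEGER, '
        '"end" INTEGER, strand TEXT, cds_start INTEGER, cds_end INTEGER)'
    )
    conn.executemany("INSERT INTO feature_cds_map VALUES (?,?,?,?,?,?,?)", rows)
    return conn


def genomic_position(conn, acc_name, aa_position):
    """Map an amino acid position (assumed zero-based) to (chrom, pos) rows."""
    cds_offset = aa_position * 3  # three nucleotides per amino acid
    params = {"acc_name": acc_name, "cds_offset": cds_offset}
    return conn.execute(QUERY, params).fetchall()


if __name__ == "__main__":
    conn = build_db(ROWS)
    print(genomic_position(conn, "ENST00000430923", 50))
    print(genomic_position(conn, "ENST00000326153", 120))
```

An offset that falls outside every segment loaded into the table returns an empty list, since the WHERE clause only matches the segment whose cds_start/cds_end range contains the offset.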