| 0 | 1 <tool id="bamtools_split_ref" name="Split BAM by Reference" version="2.4.0"> | 
|  | 2     <description>into dataset list collection</description> | 
|  | 3     <requirements> | 
|  | 4         <requirement type="package" version="2.4.0">bamtools</requirement> | 
|  | 5     </requirements> | 
|  | 6     <command> | 
|  | 7         <![CDATA[ | 
|  | 8             ln -s '${input_bam}' 'localbam.bam' && | 
|  | 9             ln -s '${input_bam.metadata.bam_index}' 'localbam.bam.bai' && | 
|  | 10             bamtools split -reference | 
|  | 11             -in localbam.bam | 
|  | 12             -stub split_bam | 
|  | 13             ## Preserve order from metadata in the output collection | 
|  | 14             #import re | 
|  | 15             #set $name = $re.sub('\W','_',$re.sub('\.bam$','',$input_bam.name)) | 
|  | 16             #if str($refs) != 'None': | 
|  | 17                 #set $ref_list = ' '.join(str($refs).split(",")) | 
|  | 18             #else | 
|  | 19                 #set $ref_list = ' '.join([$re.sub('^.*__sq__(.+)__sq__.*$','\\1',n) if n.find('__sq__') >= 0 else n for n in str($input_bam.metadata.reference_names).split(',')]) | 
|  | 20             #end if | 
|  | 21             && mkdir -p outputs | 
|  | 22             && (export I=0; | 
|  | 23               for i in $ref_list; | 
|  | 24                 do I=\$((I+1)); SN=`printf "split_bam.REF_%s.bam" "\$i"`; | 
|  | 25                   if [ -e "\$SN" ]; then FN=`printf "outputs/split_bam%05d%s.%s.bam" \$((I)) '$name' "\$i"`; mv "\$SN" "\$FN"; fi; | 
|  | 26                 done) | 
|  | 27         ]]> | 
|  | 28     </command> | 
|  | 29     <inputs> | 
|  | 30         <param name="input_bam" type="data" format="bam" label="BAM dataset to split by reference"/> | 
|  | 31         <param name="refs" type="select" optional="True" multiple="True" label="Select the references (chromosomes and contigs) to restrict the BAM to"> | 
|  | 32             <help><![CDATA[Click and type in the box above to see options. You can select multiple entries. | 
|  | 33                   If "No options available" is displayed, you need to re-detect metadata on the input dataset. | 
|  | 34             ]]></help> | 
|  | 35             <options> | 
|  | 36                 <filter type="data_meta" ref="input_bam" key="reference_names" /> | 
|  | 37             </options> | 
|  | 38         </param> | 
|  | 39     </inputs> | 
|  | 40     <outputs> | 
|  | 41         <collection name="output_bams" type="list" label="${input_bam.name} Split List"> | 
|  | 42             <discover_datasets pattern="split_bam\d*(?P<designation>.+)\.bam" ext="bam" directory="outputs" visible="false"/> | 
|  | 43         </collection> | 
|  | 44     </outputs> | 
|  | 45     <tests> | 
|  | 46         <test> | 
|  | 47             <param name="input_bam" ftype="bam" value="bamtools-input2.bam"/> | 
|  | 48             <output_collection name="output_bams"  type="list"> | 
|  | 49                 <element name="bamtools_input2.chr1"  file="bamtools_input2.chr1" compare="sim_size" delta="500" /> | 
|  | 50             </output_collection> | 
|  | 51         </test> | 
|  | 52     </tests> | 
|  | 53     <help> | 
|  | 54 **What it does** | 
|  | 55 | 
|  | 56 BAMtools split is a utility for splitting BAM files. It is based on the BAMtools suite of tools by Derek Barnett (https://github.com/pezmaster31/bamtools). | 
|  | 57 | 
|  | 58 ----- | 
|  | 59 | 
|  | 60 .. class:: warningmark | 
|  | 61 | 
|  | 62 **DANGER: Multiple Outputs** | 
|  | 63 | 
|  | 64 As described below, splitting a BAM dataset by reference name can produce a very large number of outputs. Read the section below before running this tool. | 
|  | 65 | 
|  | 66 ----- | 
|  | 67 | 
|  | 68 **How it works** | 
|  | 69 | 
|  | 70 Splits alignments by reference name into a dataset list collection. The collection elements are in the same order as the references in the input BAM. | 
|  | 71 | 
|  | 72 For unfinished genomes with a very large number of reference sequences (scaffolds), | 
|  | 73 this tool can generate thousands (if not millions) of output datasets. | 
|  | 74 | 
|  | 75 | 
|  | 76 ----- | 
|  | 77 | 
|  | 78 .. class:: infomark | 
|  | 79 | 
|  | 80 **More information** | 
|  | 81 | 
|  | 82 Additional information about BAMtools can be found at https://github.com/pezmaster31/bamtools/wiki | 
|  | 83 | 
|  | 84     </help> | 
|  | 85     <citations> | 
|  | 86         <citation type="doi">10.1093/bioinformatics/btr174</citation> | 
|  | 87     </citations> | 
|  | 88 </tool> |