view gene_family_classifier.xml @ 81:d8975fc916e0 draft
Uploaded
author | greg |
---|---|
date | Thu, 23 Feb 2017 08:46:07 -0500 |
parents | f71d34c9e6fc |
children | 539acfede958 |
<tool id="plant_tribes_gene_family_classifier" name="Classify gene sequences" version="0.4">
    <description>into precomputed orthologous gene family clusters</description>
    <requirements>
        <requirement type="package" version="0.4">plant_tribes_gene_family_classifier</requirement>
    </requirements>
    <stdio>
        <!-- Anything other than zero is an error -->
        <exit_code range="1:" />
        <exit_code range=":-1" />
        <!-- In case the return code has not been set properly check stderr too -->
        <regex match="Error:" />
        <regex match="Exception:" />
    </stdio>
    <command>
        <![CDATA[
            #import os
            #if str($options_type.options_type_selector) == 'advanced':
                #set specify_super_orthogroups_cond = $options_type.specify_super_orthogroups_cond
                #set specify_super_orthogroups = $specify_super_orthogroups_cond.specify_super_orthogroups
                #set create_orthogroup_cond = $options_type.create_orthogroup_cond
                #set create_orthogroup = $create_orthogroup_cond.create_orthogroup
                #set specify_single_copy_cond = $options_type.specify_single_copy_cond
                #set specify_single_copy = $specify_single_copy_cond.specify_single_copy
                #if str($specify_super_orthogroups) == 'yes':
                    #set specify_super_orthos = True
                    #set super_orthogroups = $specify_super_orthogroups_cond.super_orthogroups
                #else:
                    #set specify_super_orthos = False
                #end if
                #if str($create_orthogroup) == 'yes':
                    #set create_ortho_sequences = True
                    #set orthogroups_fasta_src_dir = $os.path.join('geneFamilyClassification_dir', 'orthogroups_fasta')
                    #set create_corresponding_coding_sequences_cond = $create_orthogroup_cond.create_corresponding_coding_sequences_cond
                    #if str($create_corresponding_coding_sequences_cond.create_corresponding_coding_sequences) == 'yes':
                        #set create_corresponding_coding_sequences = True
                        #set orthogroups_fasta_dest_dir = $output_ptorthocs.files_path
                    #else:
                        #set create_corresponding_coding_sequences = False
                        #set orthogroups_fasta_dest_dir = $output_ptortho.files_path
                    #end if
                    mkdir -p $orthogroups_fasta_dest_dir &&
                #else:
                    #set create_ortho_sequences = False
                    #set create_corresponding_coding_sequences = False
                #end if
                #if str($specify_single_copy) == 'yes':
                    #set single_copy_orthogroup = True
                    #set single_copy_cond = $specify_single_copy_cond.single_copy_cond
                    #set single_copy = $single_copy_cond.single_copy
                    #if $create_ortho_sequences:
                        #set single_copy_fasta_src_dir = $os.path.join('geneFamilyClassification_dir', 'single_copy_fasta')
                        #set single_copy_fasta_dest_dir = $output_ptsco.extra_files_path
                        mkdir -p $single_copy_fasta_dest_dir &&
                    #end if
                #else:
                    #set single_copy_orthogroup = False
                #end if
            #else:
                #set single_copy_orthogroup = False
                #set create_ortho_sequences = False
                #set create_corresponding_coding_sequences = False
            #end if
            GeneFamilyClassifier
            --proteins '$input'
            --scaffold '$scaffold.fields.path'
            --method $method
            --classifier $save_hmmscan_log_cond.classifier
            --config_dir '$scaffold.fields.path'
            --num_threads \${GALAXY_SLOTS:-4}
            #if str($options_type.options_type_selector) == 'advanced':
                #if $specify_super_orthos:
                    --super_orthogroups $super_orthogroups
                #end if
                #if $single_copy_orthogroup:
                    #if str($single_copy) == 'custom':
                        #set single_copy_custom_cond = $single_copy_cond.single_copy_custom_cond
                        #set single_copy_custom = $single_copy_custom_cond.single_copy_custom
                        #if str($single_copy_custom) == 'no':
                            --single_copy_custom 'default'
                        #else:
                            --single_copy_custom '$single_copy_custom_cond.single_copy_custom_config'
                        #end if
                    #else:
                        --single_copy_taxa $single_copy_cond.single_copy_taxa
                        --taxa_present $single_copy_cond.taxa_present
                    #end if
                #end if
                #if str($create_orthogroup) == 'yes':
                    --orthogroup_fasta
                    #if $create_corresponding_coding_sequences:
                        --coding_sequences '$create_corresponding_coding_sequences_cond.coding_sequences'
                    #end if
                #end if
            #end if
            >/dev/null
            #if str($save_hmmscan_log_cond.classifier) == 'hmmscan' or str($save_hmmscan_log_cond.classifier) == 'both':
                #if str($save_hmmscan_log_cond.save_hmmscan_log) == 'yes':
                    && mv geneFamilyClassification_dir/hmmscan.log $hmmscan_log
                #else:
                    && rm geneFamilyClassification_dir/hmmscan.log
                #end if
            #end if
            #if $create_ortho_sequences:
                #if $create_corresponding_coding_sequences:
                    #set out_file = $output_ptorthocs
                #else:
                    #set out_file = $output_ptortho
                #end if
                && echo '<html>\n<head>\n<title>Galaxy - GeneFamilyClassifier Output</title>\n</head>\n<body>\n<p/>\n<ul>\n' > $out_file
                #for $fname in sorted($os.listdir($orthogroups_fasta_src_dir)):
                    && echo '<li><a href="$fname">$fname</a></li>\n' >> $out_file
                #end for
                && echo '</ul>\n</body>\n</html>\n' >> $out_file
                && mv $orthogroups_fasta_src_dir/* $orthogroups_fasta_dest_dir || true
            #end if
            #if $single_copy_orthogroup:
                #if $create_ortho_sequences:
                    && echo '<html>\n<head>\n<title>Galaxy - GeneFamilyClassifier Output</title>\n</head>\n<body>\n<p/>\n<ul>\n' > $output_ptsco
                    #for $fname in sorted($os.listdir($single_copy_fasta_src_dir)):
                        && echo '<li><a href="$fname">$fname</a></li>\n' >> $output_ptsco
                    #end for
                    && echo '</ul>\n</body>\n</html>\n' >> $output_ptsco
                    && mv $single_copy_fasta_src_dir/* $single_copy_fasta_dest_dir || true
                #end if
            #end if
        ]]>
    </command>
    <inputs>
        <param name="input" format="fasta" type="data" label="Amino acids (proteins) sequences fasta file"/>
        <param name="scaffold" type="select" label="Orthogroups or gene families proteins scaffold">
            <options from_data_table="plant_tribes_scaffolds" />
            <validator type="no_options" message="No PlantTribes scaffolds are available. Use the PlantTribes Scaffolds Download Data Manager tool in Galaxy to install and populate the PlantTribes scaffolds data table."/>
        </param>
        <param name="method" type="select" label="Protein clustering method">
            <option value="gfam" selected="true">GFam</option>
            <option value="orthofinder">OrthoFinder</option>
            <option value="orthomcl">OrthoMCL</option>
        </param>
        <conditional name="save_hmmscan_log_cond">
            <param name="classifier" type="select" label="Protein classification method">
                <option value="blastp" selected="true">blastp</option>
                <option value="hmmscan">HMMScan</option>
                <option value="both">Both blastp and HMMScan</option>
            </param>
            <when value="blastp" />
            <when value="hmmscan">
                <param name="save_hmmscan_log" type="select" label="Save hmmscan log?" help="Save the hmmscan log in an additional output dataset">
                    <option value="no" selected="true">No</option>
                    <option value="yes">Yes</option>
                </param>
            </when>
            <when value="both">
                <param name="save_hmmscan_log" type="select" label="Save hmmscan log?" help="Save the hmmscan log in an additional output dataset">
                    <option value="no" selected="true">No</option>
                    <option value="yes">Yes</option>
                </param>
            </when>
        </conditional>
        <conditional name="options_type">
            <param name="options_type_selector" type="select" label="Options Configuration">
                <option value="basic" selected="true">Basic</option>
                <option value="advanced">Advanced</option>
            </param>
            <when value="basic" />
            <when value="advanced">
                <conditional name="specify_super_orthogroups_cond">
                    <param name="specify_super_orthogroups" type="select" label="Specify super orthogroups?"
                        help="Secondary MCL clusters of orthogroups">
                        <option value="no" selected="true">No</option>
                        <option value="yes">Yes</option>
                    </param>
                    <when value="no"/>
                    <when value="yes">
                        <param name="super_orthogroups" type="select" label="Super Orthogroups">
                            <option value="min_evalue" selected="true">Minimum e-value</option>
                            <option value="avg_evalue">Average e-value</option>
                        </param>
                    </when>
                </conditional>
                <conditional name="specify_single_copy_cond">
                    <param name="specify_single_copy" type="select" label="Specify single copy orthogroup selection?">
                        <option value="no" selected="true">No</option>
                        <option value="yes">Yes</option>
                    </param>
                    <when value="no"/>
                    <when value="yes">
                        <conditional name="single_copy_cond">
                            <param name="single_copy" type="select" label="Select single copy orthogroup configuration option">
                                <option value="custom" selected="true">Single copy orthogroup custom configuration</option>
                                <option value="taxa">Minimum single copy taxa required in orthogroup</option>
                            </param>
                            <when value="custom">
                                <conditional name="single_copy_custom_cond">
                                    <param name="single_copy_custom" type="select" label="Select single copy orthogroup custom configuration from the current history?"
                                        help="Select No to use the default configuration">
                                        <option value="no" selected="true">No</option>
                                        <option value="yes">Yes</option>
                                    </param>
                                    <when value="no"/>
                                    <when value="yes">
                                        <param name="single_copy_custom_config" format="txt" type="data" label="Single copy orthogroup custom configuration file"/>
                                    </when>
                                </conditional>
                            </when>
                            <when value="taxa">
                                <param name="single_copy_taxa" type="integer" value="20" label="Minimum single copy taxa required in orthogroup"/>
                                <param name="taxa_present" type="integer" value="21" label="Minimum taxa required in single copy orthogroup"/>
                            </when>
                        </conditional>
                    </when>
                </conditional>
                <conditional name="create_orthogroup_cond">
                    <param name="create_orthogroup" type="select" label="Create orthogroup fasta files?">
                        <option value="no" selected="true">No</option>
                        <option value="yes">Yes</option>
                    </param>
                    <when value="no" />
                    <when value="yes">
                        <conditional name="create_corresponding_coding_sequences_cond">
                            <param name="create_corresponding_coding_sequences" type="select" label="Create corresponding coding sequences?">
                                <option value="no" selected="true">No</option>
                                <option value="yes">Yes</option>
                            </param>
                            <when value="no" />
                            <when value="yes">
                                <param name="coding_sequences" format="fasta" type="data" label="Corresponding coding sequences (CDS) fasta file"/>
                            </when>
                        </conditional>
                    </when>
                </conditional>
            </when>
        </conditional>
    </inputs>
    <outputs>
        <data name="hmmscan_log" format="txt" label="Protein classification hmmscan.log on ${on_string}">
            <filter>save_hmmscan_log_cond['classifier'] in ['hmmscan', 'both'] and save_hmmscan_log_cond['save_hmmscan_log'] == 'yes'</filter>
        </data>
        <data name="output_ptortho" format="ptortho" label="Gene family clusters on ${on_string}">
            <filter>options_type['options_type_selector'] == 'advanced' and options_type['create_orthogroup_cond']['create_orthogroup'] == 'yes' and options_type['create_orthogroup_cond']['create_corresponding_coding_sequences_cond']['create_corresponding_coding_sequences'] == 'no'</filter>
        </data>
        <data name="output_ptorthocs" format="ptorthocs" label="Gene family clusters and corresponding coding sequences on ${on_string}">
            <filter>options_type['options_type_selector'] == 'advanced' and options_type['create_orthogroup_cond']['create_orthogroup'] == 'yes' and options_type['create_orthogroup_cond']['create_corresponding_coding_sequences_cond']['create_corresponding_coding_sequences'] == 'yes'</filter>
        </data>
        <data name="output_ptsco" format="tabular" label="Single copy orthogroups on ${on_string}">
            <filter>options_type['options_type_selector'] == 'advanced' and options_type['create_orthogroup_cond']['create_orthogroup'] == 'yes' and options_type['specify_single_copy_cond']['specify_single_copy'] == 'yes'</filter>
            <change_format>
                <when input="options_type.create_orthogroup_cond.create_corresponding_coding_sequences_cond.create_corresponding_coding_sequences" value="no" format="ptortho" />
                <when input="options_type.create_orthogroup_cond.create_corresponding_coding_sequences_cond.create_corresponding_coding_sequences" value="yes" format="ptorthocs" />
            </change_format>
        </data>
        <collection name="orthos" type="list">
            <discover_datasets pattern="__name__" directory="geneFamilyClassification_dir" visible="false" ext="tabular" />
        </collection>
    </outputs>
    <tests>
        <test>
            <param name="input" value="transcripts.cleaned.nr.pep" ftype="fasta" />
            <param name="scaffold" value="22Gv1.1"/>
            <param name="method" value="orthomcl"/>
            <param name="classifier" value="blastp"/>
            <param name="dereplicate" value="yes"/>
            <param name="min_length" value="200"/>
            <output_collection name="orthos" type="list">
                <element name="proteins.blastp.22Gv1.1" file="proteins.blastp.22Gv1.1" ftype="tabular"/>
                <element name="proteins.blastp.22Gv1.1.bestOrthos" file="proteins.blastp.22Gv1.1.bestOrthos" ftype="tabular"/>
                <element name="proteins.blastp.22Gv1.1.bestOrthos.summary" file="proteins.blastp.22Gv1.1.bestOrthos.summary" ftype="tabular"/>
            </output_collection>
        </test>
    </tests>
    <help>
This tool is part of the PlantTribes collection of automated modular analysis pipelines that use objective classifications of complete protein sequences from sequenced plant genomes to perform comparative evolutionary studies. It classifies gene sequences into precomputed orthologous gene family clusters using either blastp (faster), HMMScan (slower but more sensitive to remote homologs) or both (more exhaustive).

This tool accepts any of the following as input:

 * the postprocessed assemblies produced by the **Postprocess de novo assembly transcripts into putative coding sequences** tool
 * externally predicted coding sequences and their corresponding amino acid translations derived from a transcriptome assembly
 * gene predictions from a sequenced genome

-----

**Options**

 * **Orthogroups or gene families proteins scaffold** - PlantTribes scaffolds data installed into Galaxy by the PlantTribes Scaffolds Download Data Manager tool.
 * **Protein clustering method** - One of GFam (domain architecture based clustering), OrthoFinder (broadly defined clusters) or OrthoMCL (narrowly defined clusters).
 * **Protein classification method** - blastp (faster), HMMScan (slower but more sensitive to remote homologs) or both (more exhaustive).
 * **Super Orthogroups** - Secondary MCL clusters of orthogroups.
 * **Specify single copy orthogroup selection?** - Specify a single copy orthogroup custom configuration or the minimum single copy taxa required in the orthogroup.
 * **Select single copy orthogroup custom configuration from the current history?** - If a custom configuration is chosen, the configuration can be selected from the current history or the default configuration can be used.
 * **Minimum single copy taxa required in orthogroup** - Used with "Minimum single copy taxa required in orthogroup" configuration only.
 * **Minimum taxa required in single copy orthogroup** - Used with "Minimum single copy taxa required in orthogroup" configuration only.
 * **Corresponding coding sequences (CDS) fasta file** - Used only when "Create orthogroup fasta files?" and "Create corresponding coding sequences?" are both set to Yes.

    </help>
    <citations>
        <citation type="bibtex">
            @unpublished{None,
            author = {Eric Wafula},
            title = {None},
            year = {None},
            url = {https://github.com/dePamphilis/PlantTribes}
            }</citation>
        <citation type="doi">10.1186/1471-2105-10-421</citation>
        <citation type="bibtex">
            @unpublished{None,
            author = {None},
            title = {HMMER 3.1+ hmmscan search sequence(s) against a profile database},
            year = {2013},
            url = {http://hmmer.org/}
            }</citation>
    </citations>
</tool>