diff hyphy_slac.xml @ 36:183cf2baf56e draft default tip

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/hyphy/ commit d97b1b98a3a621c93a7ed9e7db16bda47eefcb92
author iuc
date Tue, 07 Oct 2025 20:40:22 +0000
parents 6130cd31dfb9
children
--- a/hyphy_slac.xml	Thu Mar 02 15:05:59 2023 +0000
+++ b/hyphy_slac.xml	Tue Oct 07 20:40:22 2025 +0000
@@ -7,7 +7,7 @@
     <expand macro="requirements"/>
     <command detect_errors="exit_code"><![CDATA[
         @SYMLINK_FILES@
-        hyphy slac
+        @HYPHYMP@ slac
             --alignment ./$input_file
             @INPUT_TREE@
             --code '$gencodeid'
@@ -15,6 +15,7 @@
             --samples '$number_of_samples'
             --pvalue '$p_value'
             --output '$slac_output'
+            --kill-zero-lengths $kill_zero_lengths > slac_stdout.md
         @ERRORS@
     ]]></command>
     <inputs>
@@ -22,21 +23,29 @@
         <expand macro="gencode"/>
         <expand macro="branches"/>
         <param argument="--pvalue" name="p_value" type="float" value=".1" min="0" max="1" label="P-value"  />
-        <param argument="--samples" name="number_of_samples" type="integer" value="100" min="0" max="100000" label="Number of samples used to assess ancestral reconstruction uncertainty"/>
+        <param argument="--samples" name="number_of_samples" type="integer" value="0" min="0" max="10000" label="Number of samples used to assess ancestral reconstruction uncertainty"/>
+        <expand macro="kill_zero_lengths_param"/>
     </inputs>
     <outputs>
+        <data name="slac_md_report" format="markdown" from_work_dir="slac_stdout.md" label="SLAC Report (Markdown) for ${tool.name} on ${on_string}" />
         <data name="slac_output" format="hyphy_results.json" />
     </outputs>
     <tests>
         <test>
             <param name="input_file" ftype="fasta" value="absrel-in1.fa"/>
             <param name="input_nhx" ftype="nhx" value="absrel-in1.nhx"/>
+            <param name="number_of_samples" value="100"/>
             <output name="slac_output">
                 <assert_contents>
-                    <has_size value="280000" delta="8000"/>
-                    <has_text text="tested"/>
-                    <has_text text="sample-median"/>
+                    <has_text text="sample-2.5"/>
                     <has_text text="sample-97.5"/>
+                    <has_text text="Global MG94xREV"/>
+                </assert_contents>
+            </output>
+            <output name="slac_md_report">
+                <assert_contents>
+                    <has_text text="Performing joint maximum likelihood ancestral state reconstruction"/>
+                    <has_text text="Selected 5 branches to include in SLAC calculations: `Pig, Cow, Node2, Baboon, Rat`"/>
                 </assert_contents>
             </output>
         </test>
@@ -48,7 +57,7 @@
 What question does this method answer?
 --------------------------------------
 
-Which site(s) in a gene are subject to pervasive, i.e. consistently across the entire phylogeny, diversifying selection?
+SLAC (Single Likelihood Ancestor Counting) is designed to identify individual sites within a gene that are subject to pervasive diversifying selection, meaning selection that acts consistently across the entire phylogeny. It helps answer: Which specific sites in a gene show evidence of positive selection that has been maintained throughout the evolutionary history of the analyzed sequences?
 
 Recommended Applications
 ------------------------
@@ -62,18 +71,20 @@
 Brief description
 -----------------
 
-SLAC (Single Likelihood Ancestor Counting) uses a maximum likelihood
-ancestral state reconstruction and minimum path substitution counting to
-estimate site - level dS and dN, and applies a simple binomial - based
-test to test if dS differs drom dN. The estimates aggregate information
-over all branches, so the signal is derived from pervasive
-diversification or conservation. A subset of branches can be selected
-for testing as well.
+SLAC (Single Likelihood Ancestor Counting) is a counting-based method designed to detect pervasive positive or negative selection at individual sites within a gene. It operates by first inferring ancestral sequences at each node of the provided phylogenetic tree using a maximum likelihood approach. This reconstruction allows for the estimation of synonymous (dS) and non-synonymous (dN) substitution rates at each site across the entire phylogeny. Finally, a binomial test is applied to determine if the observed number of non-synonymous substitutions significantly deviates from the expected number under neutrality (dN = dS). The method aggregates information across all branches of the phylogeny, making it suitable for detecting pervasive diversifying selection (dN > dS) or purifying selection (dN < dS) that acts consistently throughout the evolutionary history of the analyzed sequences. While generally less statistically robust than likelihood-based methods like FEL or FUBAR, SLAC offers direct interpretability of its results.
+
+How it works
+------------
+
+1.  Ancestral Sequence Reconstruction: SLAC begins by reconstructing the most likely ancestral sequences at each internal node of the provided phylogenetic tree.
+2.  Counting Substitutions: Once ancestral sequences are inferred, the method counts the synonymous and non-synonymous substitutions that have occurred along each branch of the phylogeny; these counts are the basis for the site-level dS and dN estimates.
+3.  Binomial Test: For each site, SLAC applies a binomial test. The null hypothesis is that non-synonymous and synonymous mutations occur at equal rates (i.e., no selection).
+4.  Pervasive Selection: By aggregating evidence across all branches of the tree, SLAC identifies sites that have been under consistent selective pressure throughout the evolutionary history of the gene.
 
 Input
 -----
 
-1. A *FASTA* sequence alignment.
+1. A coding multiple sequence alignment.
 2. A phylogenetic tree in the *Newick* format
 
 Note: the names of sequences in the alignment must match the names of the sequences in the tree.
@@ -82,9 +93,7 @@
 Output
 ------
 
-A JSON file with analysis results (http://hyphy.org/resources/json-fields.pdf).
-
-A custom visualization module for viewing these results is available (see http://vision.hyphy.org/SLAC for an example)
+A JSON file with analysis results (http://hyphy.org/resources/json-fields.pdf). This JSON output can be visualized using the HyPhy Vision platform at http://vision.hyphy.org/SLAC/.
 
 Further reading
 ---------------
@@ -96,26 +105,25 @@
 ------------
 ::
 
+    --alignment         [required] An in-frame codon alignment in one of the formats supported by HyPhy.
+    --tree              [conditionally required] A phylogenetic tree (optionally annotated with {}).
 
-    --code              Which genetic code to use
+    --code              Which genetic code to use (see tool form for available options).
 
     --branches          Which branches should be tested for selection?
                             All [default] : test all branches
-
-                            Internal : test only internal branches (suitable for
-                            intra-host pathogen evolution for example, where terminal branches
-                            may contain polymorphism data)
-
+                            Internal : test only internal branches (suitable for intra-host pathogen evolution for example, where terminal branches may contain polymorphism data)
                             Leaves: test only terminal (leaf) branches
+                            Unlabeled: if the Newick string is labeled using the {} notation, test only branches without explicit labels (see http://hyphy.org/tutorials/phylotree/)
+                            Custom : Enter a branch label.
 
-                            Unlabeled: if the Newick string is labeled using the {} notation,
-                            test only branches without explicit labels
-                            (see http://hyphy.org/tutorials/phylotree/)
+    --pvalue            The p-value threshold used to call a site significant (default: 0.1, range: 0 to 1).
+    --samples           Draw this many alternative ancestral state reconstructions to evaluate uncertainty (default: 0, range: 0 to 10000).
 
-     --pvalue           The significance level used to determine significance
-
-     --samples          Draw this many alternative ancestral state reconstructions
-                        to evaluate uncertainty
+    --kill-zero-lengths Automatically delete internal zero-length branches for computational efficiency.
+                            Yes [default] : Automatically delete internal zero-length branches for computational efficiency (will not affect results otherwise).
+                            Constrain : Keep zero-length branches, but constrain their values to 0.
+                            No : Keep all branches.
 
   ]]>
   </help>
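
The binomial test described in the "How it works" steps of the help text can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not HyPhy's implementation: the function names, the integer substitution counts, and the `nonsyn_site_fraction` parameter (the expected share of substitutions that would be non-synonymous under neutrality) are all hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def site_binomial_test(n_nonsyn, n_syn, nonsyn_site_fraction):
    """Simplified, hypothetical site-level test in the spirit of SLAC.

    n_nonsyn, n_syn: integer substitution counts inferred at one site
        (an assumption of this sketch).
    nonsyn_site_fraction: expected proportion of non-synonymous
        substitutions under neutrality (dN = dS).

    Returns (p_positive, p_negative): the probability of seeing at least
    this many non-synonymous (resp. synonymous) substitutions by chance.
    """
    total = n_nonsyn + n_syn
    if total == 0:
        # No substitutions at this site: no evidence either way.
        return 1.0, 1.0
    p_positive = binomial_tail(n_nonsyn, total, nonsyn_site_fraction)
    p_negative = binomial_tail(n_syn, total, 1 - nonsyn_site_fraction)
    return p_positive, p_negative


# Example: 4 non-synonymous and 0 synonymous substitutions at a site,
# assuming half of all substitutions would be non-synonymous under neutrality.
p_pos, p_neg = site_binomial_test(4, 0, 0.5)
```

In this sketch, a site would be reported as positively selected when `p_pos` falls below the threshold given by `--pvalue`, and as negatively selected when `p_neg` does.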