changeset 12:7945134d3956 draft
Uploaded
author | greg |
---|---|
date | Wed, 01 Mar 2017 15:01:15 -0500 |
parents | a0b1f599becc |
children | 09ef5cc67c78 |
files | kaks_analysis.xml |
diffstat | 1 files changed, 13 insertions(+), 23 deletions(-) |
--- a/kaks_analysis.xml Wed Mar 01 14:17:43 2017 -0500
+++ b/kaks_analysis.xml Wed Mar 01 15:01:15 2017 -0500
@@ -184,31 +184,20 @@

 * **Required options**

- - **Select gene family clusters** - Sequences classified into gene family clusters, optionally including corresponding coding sequences.
- - **Orthogroups or gene families proteins scaffold** - PlantTribes scaffolds data.
- - **Protein clustering method** - One of GFam (domain architecture based clustering), OrthoFinder (broadly defined clusters) or OrthoMCL (narrowly defined clusters).
-
- * **Multiple sequence alignments options**
+ - **Coding sequences (CDS) fasta file for the species** - Coding sequences (CDS) fasta file for the first species.
+ - **Aamino acids (proteins) sequences fasta file for the species** - Aamino acids (proteins) sequences fasta file for the first species
+ - **Select method for pairwise sequence comparison to determine homolgous pairs** - Pairwise sequence comparison to determine homolgous pairs (cross species comparison requires selection of inputs for second species).
+ - **Orthogroups or gene families proteins scaffold** - PlantTribes scaffolds data installed into Galaxy by the PlantTribes Scaffolds Download Data Manager tool.

- - **Select method for multiple sequence alignments** - Method used for setting multiple sequence alignments.
- - **Input sequences include corresponding coding sequences?** - Selecting 'Yes' for this option requires that the selected input data format is 'ptorthocs'.
- - **Construct orthogroup multiple codon alignments?** - Construct orthogroup multiple codon alignments.
- - **Sequence type used in the phylogenetic inference** - Sequence type (dna or amino acid) used in the phylogenetic inference.
- - **Use corresponding coding sequences?** - Selecting 'Yes' for this option requires that the selected input data format is 'ptorthocs' or this tool will produce an error.
+ * **Other (optional) options**

- * **Phylogenetic trees options**
-
- - **Phylogenetic trees inference method** - Phylogenetic trees inference method.
- - **Select rooting order configuration for rooting trees??** - If 'No' is selected, trees will be rooted using the most distant taxon present in the orthogroup.
- - **Number of replicates for rapid bootstrap analysis and search for the best-scoring ML tree** - Number of replicates for rapid bootstrap analysis and search for the best-scoring ML tree.
- - **Maximum number of sequences in orthogroup alignments** - Maximum number of sequences in orthogroup alignments.
- - **Minimum number of sequences in orthogroup alignments** - Minimum number of sequences in orthogroup alignments.
-
- * **MSA quality control options**
-
- - **Remove sequences with gaps of** - Removes gappy sequences in alignments (i.e., 0.5 removes sequences with 50% gaps).
- - **Select process used for gap trimming** - Either nucleotide based trimming or alignments are trimed using using trimAl's ML heuristic trimming approach.
- - **Remove sites in alignments with gaps of** - If the process used for gap trimming is nucleotide based, this is the gap value used when removing gappy sites in alignments (i.e., 0.1 removes sites with 90% gaps).
+ - **Minimum sequence pairwise coverage length between homologous pairs** - Minimum sequence pairwise coverage length between homologous pairs (e.g., 0.5 results in 50% coverage. Legal values lie between 0.3 and 1.0.
+ - **Evolutionary rate for recalibrating synonymous subsitutions (ks) of species** - (applies to paralogous ks analysis) Recalibrate synonymous subsitutions (ks) of species using a predetermined evoutionary rate that can be determined from a species tree inferred from a collection single copy genes from taxa of interest (Cui et al., 2006).
+ - **Select PAML codeml control file?** - Select PAML's codeml control file from your history. This file is used to to perfom ML analysis of protein-coding DNA sequences using codon substitution models. Selecting No uses the default file which does not include input (seqfile, treefile) and output (outfile) parameters of codeml.
+ - **Fit a mixture model of multivariate normal components to synonymous (ks) distribution?** - Fit a mixture model of multivariate normal components to synonymous (ks) distribution to identify significant duplication event(s) in a genome.
+ - **Number components to fit to synonymous subsitutions (ks) distribution** - Number components to fit to synonymous subsitutions (ks) distribution.
+ - **Lower limit of synonymous subsitutions (ks)** - Lower limit of synonymous subsitutions (ks) - necessary if fitting components to the distribution to reduce background noise from young paralogous pairs due to normal gene births and deaths in a genome.
+ - **Upper limit of synonymous subsitutions (ks)** - Upper limit of synonymous subsitutions (ks) - necessary if fitting components to the distribution to exclude likey ancient paralogous pairs.

 </help>
 <citations>
@@ -220,6 +209,7 @@
 url = {https://github.com/dePamphilis/PlantTribes}
 }
 </citation>
+ <citation type="doi">10.1093/bioinformatics/btw412</citation>
 <citation type="doi">10.1186/1471-2105-10-421</citation>
 <citation type="doi">10.1093/molbev/msm088</citation>
 <citation type="doi">10.18637/jss.v004.i02</citation>
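
The help text added in this changeset describes a minimum coverage fraction (legal values between 0.3 and 1.0) and lower and upper Ks limits applied before fitting mixture components. The following is only an illustrative sketch of those two checks and is not taken from the tool's source: the function names `validate_coverage` and `filter_ks` are hypothetical, and treating both Ks limits as inclusive is an assumption.

```python
from typing import Iterable, List


def validate_coverage(min_coverage: float) -> float:
    """Check a coverage fraction against the range stated in the help text.

    The 0.3 to 1.0 range comes from the help text; this function name
    is hypothetical and not part of the tool.
    """
    if not 0.3 <= min_coverage <= 1.0:
        raise ValueError(
            f"min_coverage must lie between 0.3 and 1.0, got {min_coverage}"
        )
    return min_coverage


def filter_ks(ks_values: Iterable[float], lower: float, upper: float) -> List[float]:
    """Keep Ks values within [lower, upper].

    Inclusive bounds are an assumption made for this sketch. Dropping
    values below `lower` removes young paralogous pairs; dropping values
    above `upper` removes likely ancient pairs.
    """
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    return [ks for ks in ks_values if lower <= ks <= upper]


if __name__ == "__main__":
    ks = [0.01, 0.2, 0.8, 1.5, 3.2]
    print(filter_ks(ks, lower=0.1, upper=2.0))
```

Filtering before fitting keeps the mixture model from spending components on the near-zero peak and the long saturated tail, which is the motivation the help text gives for the two limits.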