# HG changeset patch
# User abossers
# Date 1414511945 -3600
# Node ID c1c38335322e5fd7d221bd4217258b8f19149c61
# Parent 59f302448cf6d1f440bfe1f28db167c9a348899b
Add revised mummer toolshed files to testtoolshed

diff -r 59f302448cf6 -r c1c38335322e MUMmer/README_mummer
--- a/MUMmer/README_mummer Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,56 +0,0 @@
-# Created/shared May 2011
-#
-# Alex Bossers
-# Central Veterinary Institute
-# Wageningen University and Research centre
-# Lelystad, The Netherlands
-#
-# Comments/improvements/bugs: Alex (dot) Bossers (at) wur (dot) nl
-
-
-# WHAT IT DOES
-The MUMmer suite is a set of very basic wrappers for the MUMmer genome comparison tools. Most common operations should be possible
-by using these wrappers. MUMmer works fast on smaller (bacterial) genomes but can also cope with eukaryotic genomes.
-
-In addition to the original MUMmer tools it also contains an additional conversion script to convert MUMmer comparison files,
-the so-called coords files into a readible format for Artemis Comparison Tool (ACT; Sanger UK).
-
-
-# REQUIREMENTS
-- Perl
-- Galaxy :)
-- MUMmer newer than version 3.20;
-  even though older versions might work as well.
-  Get your MUMmer here: http://mummer.sourceforge.net/
-  Make sure MUMmer is in your PATH and/or update the tool xml configs and wrappers for the full MUMmer path
-  if it is different from /opt/MUMmer/MUMmer.
-- ACT can be run locally or via Webstart if you want to visualise genome comparisons in detail: http://www.sanger.ac.uk/resources/software/act
-- GNUplot is a requirement for the MUMmerplot part (see MUMmer installation documentation)
-
-
-# SETUP
-Just unpack the tool xml and perl script somewhere appropriate and adapt the MUMmer installation part if different from above. Plug the tool in the tool_config.xml
-of your galaxy instance and refresh the tools or restart the galaxy server.
-
-
-# TESTING
-You can test the code by running Nucmer on the test data and visualise the results in MUMmerplot.
-It should return a MUMmerplot identical to the image provided. For reference I also included the corresponding log file.
-
-
-# LICENSE
-Copyright (c) 2011 Central Veterinary Institute of Wageningen UR, Lelystad, The Netherlands.
-MUMmer is copyright by its respective owner. See their licensing details.
-
-Our wrappers/programs are free software; you can redistribute it and/or modify
-it under the terms of the GNU General Public License as published by
-the Free Software Foundation; either version 3 of the License, or
-(at your option) any later version.
-
-When distributing the tools please include this original reference.
-
-Use this tool at your own risk. Even though we tried to build tools and wrappers that free of errors,
-check your output since it might be erroneous. We will not be relyable to any failure this may have caused.
-
-If you like these scripts, please acknowledge our work.
-
diff -r 59f302448cf6 -r c1c38335322e MUMmer/mummer_clustering.xml
--- a/MUMmer/mummer_clustering.xml Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,237 +0,0 @@
-
-  : order sequence matches in clusters
-
-
-    /opt/MUMmer/MUMmer/$tool.cmd
-    #if $tool.cmd=="gaps":
-      $in_reference
-      #if $tool.gaps_r=="yes":
-        -r
-      #end if
-    #end if
-    #if $tool.cmd=="mgaps":
-      #if $tool.cmd_C=="yes":
-        -C
-      #end if
-      -d $tool.cmd_d
-      #if $tool.cmd_e=="yes":
-        -e
-      #end if
-      -f $tool.cmd_f
-      -l $tool.cmd_l
-      -s $tool.cmd_s
-    #end if
-    < $tool.in_match_list
-    > $out_tool
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-    gaps
-    mgaps
-
-
-
-
-
-
-|
-
-
-**Reference**
-=============
-
-- **MUMmer clustering Galaxy tool wrapper:** Alex Bossers, CVI of Wageningen UR, The Netherlands.
-
-- **MUMmer suite v3.22:** http://mummer.sourceforge.net
-
-- **MUMmer tutorials:** http://mummer.sourceforge.net/examples/
-
-If you found these tools/wrappers usefull in your research, please acknowledge our work. If you improve
-or modify the wrappers please add instead of substitute yourself into the acknowlegement section :)
-
-
-**MUMmer Clustering**
-=====================
-
-MUMmer's clustering algorithms attempt to order small individual matches into larger match clusters
-in order to make the output of mummer more intelligible. A dot plot makes it easy to spot alignment
-regions from a match list, however when examining the data without graphic aids, it is very difficult
-to draw any reasonable conclusions from the simple flat file list of matches. Clustering the matches
-together into larger groups of neighboring matches makes this process much easier by ordering the
-data and removing spurious matches.
-
-
-Gaps
-----
-
-*gaps* is the primary clustering algorithm for run-mummer1, and although classified as a "clustering"
-step, gaps is more of a sorting routine. It implements the LIS (longest increasing subset) algorithm
-to extract the longest consistent set of matches between two sequences, and generates a single
-cluster that represents the best "straight-line" arrangement of matches between the sequences. By
-straight-line, we mean no rearrangements or inversions, just a simple path of agreeing matches
-between the two sequences. This limits the usability of this program to the alignment of genomes
-that are very similar and with no large scale mutations. *gaps* is best suited for the comparison of
-near identical sequences with the goal of finding minor mutations like SNPs and small indels.
-
-Input can be filtered mummer output. The strange syntax is a result of a legacy issue described in
-the Known problems (manual) section, and requires the header be stripped from the mummer output. In
-addition, gaps is only designed to handle a single reference and a single query sequence, thus the
-preceding mummer run must also follow this constraint. The -r is optional and designates the incoming
-matches as reverse complement matches which must reference the reverse complement of the sequence,
-therefore forcing mummer to be run without the -c option.
-
-Reference: http://mummer.sourceforge.net/manual/#gaps
-
-**Output:**
-::
-
-  > /home/aphillip/data/GHP.1con Consistent matches
-  183 17 22 none - -
-  238 72 108 none 33 33
-  347 181 92 none 1 1
-  458 292 50 none 19 19
-  705 539 44 none 1 1
-  750 584 38 none 1 1
-  807 641 23 -16 0 4
-  (output continues ...)
-  > Wrap around
-  334398 329917 47 none - 225
-  334446 329965 62 none 1 1
-  334539 330058 20 none 31 31
-  334560 330079 92 none 1 1
-  334653 330172 77 none 1 1
-  334740 330259 41 none 10 10
-  (output continues ...)
-  > /home/aphillip/data/GHP.1con Other matches
-  1317231 4891 21 none - -
-  1317275 4927 21 none - -
-  1317804 5399 25 none 508 451
-  947580 5436 36 none - -
-  23406 5518 34 none - -
-  333079 6592 32 none - -
-  (output continues ...)
-
-Where the first line is the location of the reference file, and the first three columns are the same
-as the three column match format described in the mummer section. The final three columns are the
-overlap between this match and the previous match, the gap between the start of this match and the
-end of the previous match in the reference, and the gap between the start of this match and the end
-of the previous match in the query respectively.
-
-
-mgaps
------
-
-*mgaps* was introduced into the MUMmer pipeline in an effort to better handle large-scale
-rearrangements and duplications. Unlike gaps, mgaps is a full clustering algorithm that is capable
-of generating multiple groups of consistently ordered matches. Clustering is controlled by a set of
-command-line parameters that adjust the minimum cluster size, maximum gap between matches, etc. Only
-matches that were included in clusters will appear in the output, so by adjusting the command-line
-parameters it is possible to filter out many of the spurious matches, thus leaving only the larger
-areas of conservation between the input sequences. The major advantage of mgaps is its ability to
-identify these "islands" of conservation. This frees the user from the single LIS restraints of the
-gaps program and allows for the identification of large-scale rearrangements, duplications, gene
-families and so on.
-
-Gaps can fail to identify clusters because they were not consistent with the LIS. However, by using
-mgaps, all regions of conservation can now been identified. The only fallback being the increased
-complexity of the output, where you once had only one cluster for the whole comparison, you usually
-now get more. Because of this, it can sometimes be difficult separating the repetitive clusters from
-"correct" clusters, *making mgaps more suited for global alignments instead of localized error detection*.
-
-Input can be raw mummer output. *mgaps* is only designed to handle a single reference and one or
-more query sequences, thus the preceding mummer run must also follow this constraint. Please refer
-to the run-mummer3 script (see online manual) for an example of how to use this program in an
-alignment pipeline. Note that in order to cluster reverse complement matches, the reverse complement
-matches must reference the reverse complement strand of the query sequence, therefore forcing mummer
-to be run without the -c option. A rewrite of this algorithm to handle multiple reference sequences
-and a better coordinate system (forward coordinates for reverse complement matches) is doubtful but
-may eventually appear.
-
-The -d option can be interpreted as the number of insertions allowed between two matches in the same
-cluster, while the -f option is a fraction equal to (diagonal difference / match separation) where
-a higher value will increase the indel tolerance. Minimum cluster length is the sum of the contained
-matches unless the -e option is used. The best way to get a feel for what each parameter controls
-is to cluster the same data set numerous times with different values and observe the resulting
-differences. It can also be helpful to set these parameters to the size of the element you wish to
-capture, i.e. set the minimum cluster size to say the smallest exon you expect and set the max gap
-to the smallest intron you expect to obtain clusters that could represent single exons (depending
-of course of the similarity of the two sequences).
-
-Reference: http://mummer.sourceforge.net/manual/#mgaps
-
-**Output format**
-
-Output of *mgaps* shares much in common with the output of mummer and gaps, with a slightly different
-header formatting than gaps to allow for multiple query sequences and multiple clusters. The output
-of mgaps run on both forward and reverse complement matches is as follows:
-::
-
-  > ID41
-  > ID41 Reverse
-  5177399 1 232 none - -
-  5177632 234 6794 none 1 1
-  5184433 7035 24 none 7 7
-  5184468 7069 23 none 11 10
-  > ID42
-  10181 43 1521 none - -
-  > ID42 Reverse
-  4654536 17 36 none - -
-  4654578 57 298 none 6 4
-  4654877 356 226 none 1 1
-  #
-  4655139 845 28 none - -
-  4655178 884 694 none 11 11
-  4655873 1579 20 none 1 1
-  #
-  4850044 17 1492 none - -
-  4851537 1510 711 none 1 1
-  4852249 2222 42 none 1 1
-  (output continues ...)
-
-
-Headers containing the ID for each query sequence are listed after the '>' characters, and a
-following Reverse keyword identifies the reverse matches for that query sequence. Individual clusters
-for each sequence are separated by a '#' character, and the six columns are exactly the same as the
-gaps output (see the gaps section for more details).
-
-
-|
-|
-
-
-
-
diff -r 59f302448cf6 -r c1c38335322e MUMmer/mummer_maxmatch.xml
--- a/MUMmer/mummer_maxmatch.xml Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,170 +0,0 @@
-
-  : Maximal exact sequence matching
-
-
-    /opt/MUMmer/MUMmer/$tool.cmd
-    #if $tool.cmd=="mummer":
-      $tool.cmd_extra
-      $tool.mum_ref_in
-      $tool.mum_q_in
-    #end if
-    #if $tool.cmd=="repeat-match":
-      -n $tool.rm_n
-      #if $tool.rm_E=="yes":
-        -E
-      #end if
-      $tool.cmd_extra
-      $tool.in_seq
-    #end if
-    #if $tool.cmd=="exact-tandems":
-      $tool.in_seq
-      $tool.et_minl
-    #end if
-
-    2>&-
-    > $out_tool
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-    mummer
-    repeat-match
-    exact-tandems
-
-
-
-
-
-
-|
-
-
-**Reference**
-=============
-
-- **MUMmer MaxExactMatch Galaxy tool wrapper:** Alex Bossers, CVI of Wageningen UR, The Netherlands.
-
-- **MUMmer suite v3.22:** http://mummer.sourceforge.net
-
-- **MUMmer tutorials:** http://mummer.sourceforge.net/examples/
-
-Please do not use any of the command line options that modify prefixes or file names. As obvious
-they are quite useless within galaxy and are likely to fail the routine!
-
-If you found these tools/wrappers usefull in your research, please acknowledge our work. If you improve
-or modify the wrappers please add instead of substitute yourself into the acknowlegement section :)
-
-
-
-**MUMmer Maximal exact matching**
-=================================
-
-The heart of the MUMmer package is its suffix tree based maximal matching routines. These can be
-used for repeat detection within a single sequence as is done by *repeat-match* and *exact-tandems*,
-or can be used for the alignment of two or more sequences as is done by *mummer*.
-
-Mummer
-------
-
-mummer is a suffix tree algorithm designed to find maximal exact matches of some minimum length
-between two input sequences. by default mummer will only find maximal matches that are unique in
-the entire set of reference sequences. The match lists produced by mummer can be used alone to
-generate alignment dot plots, or can be passed on to the clustering algorithms for the identification
-of longer non-exact regions of conservation. These match lists have great versatility because they
-contain huge amounts of information and can be passed forward to other interpretation programs for
-clustering, analysis, searching, etc.
-
-
-Repeat-match
-------------
-
-repeat-match is a suffix tree algorithm designed to find maximal exact repeats within a single input
-sequence. It uses a similar algorithm to mummer, but altered slightly to find maximal exact matches
-within a single sequence.
-
-Output formatting varies depending on the command line parameters and the output can be quite large.
-The standard output format that results from running repeat-match with default parameters is as follows:
-::
-
-  Long Exact Matches:
-  Start1 Start2 Length
-  4919485 4919506r 22
-
-The three columns are the first position of the repeat, the second position of the repeat, and the
-length of the repeat respectively. Reverse complement repeat positions are denoted by an 'r'
-following the Start2 position, and are relative to the forward strand of the sequence.
-
-
-Exact-tandems
--------------
-
-exact-tandems is a wrapper script for the repeat-match program. It provides a list of exact tandem
-repeats within a single input sequence. As with repeat-match the sequence file should contain only
-one sequence in FastA format, however if multiple sequences exist the first one will be used. The
-sequence may contain any set of upper and lowercase characters, thus DNA and protein sequence are
-both allowed and matching is case insensitive. The minimum match length parameter should be a
-positive integer, this value will be passed to the repeat-match program via the -n option.
-
-The output format of exact-tandems is as follows:
-::
-
-  Finding matches
-  Tandem repeats
-  Start Extent UnitLen Copies
-  416173 150 45 3.3
-
-The four columns are the first position of the tandem, the extent of the repeat region, the length
-of each tandem repeat unit, and the number of repeat units respectively.
-
-
-
-**Manuals and CMD line options (specific for each tool!):**
-===========================================================
-
-**Mummer**
-
-http://mummer.sourceforge.net/manual/#mummer
-
-**Repeat-match**
-
-http://mummer.sourceforge.net/manual/#repeat
-
-**exact-tandems**
-
-http://mummer.sourceforge.net/manual/#exact
-
-|
-|
-
-
-
-
diff -r 59f302448cf6 -r c1c38335322e MUMmer/mummer_tool.sh
--- a/MUMmer/mummer_tool.sh Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,107 +0,0 @@
-#!/bin/bash
-## use #!/bin/bash -x for debugging
-
-## Galaxy wrapper for MUMmer (nucmer/promer)
-## Alex Bossers, CVI of Wageningen UR, NL
-## alex_dot_bossers_at_wur_dot_nl
-##
-## Sep 2010
-##
-## Wrapper runs MUMmer nucmer/promer and additional args
-## Calculates the comparison scores (delta and optional coords file)
-## Generates the optional STATIC comparison mummerplot to png (from delta file)
-##
-## finally the script renames (optional) output files to outfiles expected by Galaxy
-##
-##
-## INPUT args:
-## nucmer_tool.sh $input_ref $input_query $out_delta $out_coords $out_png $logfile
-## @0 @1 @2 @3 @4 @5
-## $algorithm $keep_delta $make_coords $keep_log $make_image $cmd_extra
-## @6 @7 @8 @9 @10 @11
-##
-
-# path to where mummer suite is installed
-# adjust this for your machine
-# this is the only hard coded path in the scripts
-mum_path="/opt/MUMmer/MUMmer"
-
-# since we have more than 9 arguments we need to shift the sections or use own array
-args=("$@")
-# to keep things readible assign vars
-input_ref="${args[0]}"
-input_query="${args[1]}"
-out_delta="${args[2]}"
-out_coords="${args[3]}"
-out_png="${args[4]}"
-logfile="${args[5]}"
-algorithm="${args[6]}"
-keep_delta="${args[7]}"
-make_coords="${args[8]}"
-keep_log="${args[9]}"
-make_image="${args[10]}"
-cmd_extra="${args[11]}"
-
-# enable/disable the STDOUT log file
-if [ "$keep_log" == "yes" ]; then
-    logfile_c="2>$logfile"
-    logfile_a="2>>$logfile"
-else
-    #dump to dev/null
-    logfile_c="2>&-"
-    logfile_a="2>&-"
-fi
-
-# extra mummer cmd line options
-
-## generate coords file on the fly?
-if [ "$make_coords" == "yes" ]; then
-    options=" --coords"
-fi
-## extra cmd line args to be concatenated in options? We need to prevent extra spaces!
-if [ "$cmd_extra" != "" ]; then
-    if [ "$options" == "" ]; then
-        options=" $cmd_extra"
-    else
-        options="$options $cmd_extra"
-    fi
-fi
-
-# run nucmer/promer
-eval "$mum_path/$algorithm$options $input_ref $input_query $logfile_c"
-
-## generate large png if option make_image = yes
-## suppress error from mummerplot since some is deprecated but not a real error
-## error can be easily avoided by modifying the source of mummerplot... just in case
-## however we need to check if a valid png was generated. This is not the case is alignment is none
-## 1 is stderr and 2 stdout. redirect to dev/null
-
-if [ "${make_image}" == "yes" ]; then
-    eval "$mum_path/mummerplot --large --png out.delta 1>&- $logfile_a"
-    if [ -f "out.png" ]; then
-        mv out.png $out_png
-        #cleanup temp gnuplot file
-        rm out.gp
-    else
-        echo "not exist the req png file!"
-    fi
-
-    ## clean up remaining files
-    rm out.fplot
-    rm out.rplot
-
-fi
-
-# keep/rename or delete delta file
-if [ "$keep_delta" == "yes" ]; then
-    mv out.delta "$out_delta"
-else
-    rm out.delta
-fi
-
-# keep/rename coords file if it was created
-if [ "$make_coords" == "yes" ]; then
-    mv out.coords "$out_coords"
-fi
-
-# end script
diff -r 59f302448cf6 -r c1c38335322e MUMmer/mummer_tool.xml
--- a/MUMmer/mummer_tool.xml Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,104 +0,0 @@
-
-  : Compare genomes (Nucmer or Promer)
-
-    mummer_tool.sh
-    $input_ref $input_query
-    $out_delta $out_coords $out_png $out_log
-    $algorithm
-    $keep_delta $make_coords $keep_log $make_image
-    $cmd_extra
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-      make_coords=="yes"
-
-
-      keep_delta=="yes"
-
-
-      make_image=="yes"
-
-
-      keep_log=="yes"
-
-
-
-    nucmer
-    promer
-
-
-
-
-
-
-|
-
-
-**Reference**
--------------
-
-- **Nucmer Galaxy tool wrapper: Alex Bossers, CVI of Wageningen UR, The Netherlands.**
-
-- **Nucmer or Promer of MUMmer suite:** v3.22 http://mummer.sourceforge.net/manual/
-
-- **MUMmer tutorials:** http://mummer.sourceforge.net/examples/
-
-
-If you found these tools/wrappers useful in your research, please acknowledge our work. If you improve
-or modify the wrappers please add instead of substitute yourself into the acknowlegement section :)
-
-
-**Command line arguments**
---------------------------
-
---mum  Use anchor matches that are unique in both the reference and query
---mumreference  Use anchor matches that are unique in the reference but not necessarily unique in the query (default behavior)
---maxmatch  Use all anchor matches regardless of their uniqueness
---breaklen  Distance an alignment extension will attempt to extend poor scoring regions before giving up (default 200)
---mincluster  Minimum cluster length (default 65)
---delta  Toggle the creation of the delta file. Setting --nodelta prevents the alignment extension step and only outputs the match clusters (default --delta)
---depend  Print the dependency information and exit
---diagfactor  Maximum diagonal difference factor for clustering, i.e. diagonal difference / match separation (default 0.12)
---extend  Toggle the outward extension of alignments from their anchoring clusters. Setting --noextend will prevent alignment extensions but still align the DNA between clustered matches and create the .delta file (default --extend)
---forward  Align only the forward strands of each sequence
---maxgap  Maximum gap between two adjacent matches in a cluster (default 90)
---help  Print the help information and exit
---minmatch  Minimum length of an maximal exact match (default 20)
---optimize  Toggle alignment score optimization. Setting --nooptimize will prevent alignment score optimization and result in sometimes longer, but lower scoring alignments (default --optimize)
---reverse  Align only the reverse strand of the query sequence to the forward strand of the reference
---simplify  Simplify alignments by removing shadowed clusters. Turn this option off (--nosimplify) if aligning a sequence to itself to look for repeats (default --simplify)
---version  Print the version information and exit
---coords  **Automatically ON in galaxy wrapper!** It generates the .coords file using the 'show-coords' program with the -r option.
---prefix  **Do NOT use in Galaxy wrapper!** Set the output file prefix (default out)
-
-|
-|
-
-
-
-
diff -r 59f302448cf6 -r c1c38335322e MUMmer/mummer_utilities_tool.xml
--- a/MUMmer/mummer_utilities_tool.xml Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,187 +0,0 @@
-
-  : Show and filter on sequence delta file
-
-
-    /opt/MUMmer/MUMmer/$tool.cmd
-    $cmd_extra
-    $input_delta
-    #if $tool.cmd=="show-aligns":
-      $tool.aligns1
-      $tool.aligns2
-    #end if
-    > $out_tool
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-    mummer-tiling
-    mummer-snps
-    mummer-diff
-    mummer-coords
-    mummer-aligns
-    delta-filter
-
-
-
-
-
-
-|
-
-
-**Reference**
-=============
-
-- **MUMmer_utilities Galaxy tool wrapper:** Alex Bossers, CVI of Wageningen UR, The Netherlands.
-
-- **MUMmer utilities running on MUMmer delta file:** http://mummer.sourceforge.net/manual
-
-- **MUMmer tutorials:** http://mummer.sourceforge.net/examples/
-
-If you found these tools/wrappers usefull in your research, please acknowledge our work. If you improve
-or modify the wrappers please add instead of substitute yourself into the acknowlegement section :)
-
-
-**MUMmer Utilities**
-====================
-
-All tools are using the MUMmer generated DELTA file! Additional arguments are only required for show-aligns.
-
-Show-coords
------------
-
-show-coords parses the delta alignment output of NUCmer and PROmer, and displays summary
-information such as position, percent identity and so on, of each alignment. It is the most
-commonly used tool for analyzing the delta files. *Usually the -r is used to sort lines by reference*
-
-
-Show-tiling
------------
-
-show-tiling attempts to construct a tiling path out of the query contigs as mapped to the reference
-sequences. Given the delta alignment information of a few long reference sequences and many small
-query contigs, show-tiling will determine the best mapped location of each query contig. Note that
-each contig may only be tiled once, so repetitive regions may cause this program some difficulty.
-This program is useful for aiding in the scaffolding and closure of an unfinished set of contigs,
-if a suitable, high similarity reference genome is available. Or, if using PROmer, show-tiling will
-help in the identification of syntenic regions and their contig's mapping to the references.
-
-This program is not suitable for "many vs. many" assembly comparisons, however a new tool based on
-the concepts of show-tiling should be available in the near future that will facilitate the mapping
-of assembly contigs.
-
-
-Show-snps
----------
-
-show-snps is a utility program for reporting polymorphisms contained in a delta encoded alignment
-file output by NUCmer or PROmer. It catalogs all of the single nucleotide polymorphisms (SNPs) and
-insertions/deletions within the delta file alignments. Polymorphisms are reported one per line, in
-a delimited fashion similar to show-coords. Pairing this program with the appropriate MUMmer tools
-can create an easy to use SNP pipeline for the rapid identification of putative SNPs between any
-two sequence sets, as demonstrated in the manual SNP detection section.
-
-
-Show-diff
----------
-
-Outputs a list of structural differences for each sequence in
-the reference and query, sorted by position. For a reference
-sequence R, and its matching query sequence Q, differences are
-categorized as GAP (gap between two mutually consistent alignments),
-DUP (inserted duplication), BRK (other inserted sequence), JMP
-(rearrangement), INV (rearrangement with inversion), SEQ
-(rearrangement with another sequence). The first five columns of
-the output are seq ID, feature type, feature start, feature end,
-and feature length. Additional columns are added depending on the
-feature type. Negative feature lengths indicate overlapping adjacent
-alignment blocks.
-::
-
-  IDR GAP gap-start gap-end gap-length-R gap-length-Q gap-diff
-  IDR DUP dup-start dup-end dup-length
-  IDR BRK gap-start gap-end gap-length
-  IDR JMP gap-start gap-end gap-length
-  IDR INV gap-start gap-end gap-length
-  IDR SEQ gap-start gap-end gap-length prev-sequence next-sequence
-
-Positions always reference the sequence with the given ID. The
-sum of the fifth column (ignoring negative values) is the total
-amount of inserted sequence. Summing the fifth column after removing
-DUP features is total unique inserted sequence. Note that unaligned
-sequence are not counted, and could represent additional "unique"
-sequences. See documentation for tips on how to interpret these
-alignment break features.
-
-
-Show-aligns
------------
-
-show-aligns parses the delta encoded alignment output of NUCmer and PROmer, and displays
-the pair-wise alignments from the two sequences specified on the command line. It is handy
-for identifying the exact location of errors and looking for SNPs between two sequences.
-
-
-Delta-filter
-------------
-
-delta-filter is a utility program for the manipulation of the delta encoded alignment files output
-by the NUCmer and PROmer pipelines. It takes a delta file as input and filters the information based
-on the various command line switches, outputting only the desired alignments to stdout. Options to filter by
-alignment length, identity, uniqueness and consistency are provided. Certain combinations of these
-options can greatly reduce the number of unwanted alignments in the delta file, thus making the output
-of programs such as show-coords more comprehendible.
-
-
-
-**CMD line options (specific for each tool!):**
-===============================================
-
-**Show-coords**
-
-http://mummer.sourceforge.net/manual/#coords
-
-**Show-tiling**
-
-http://mummer.sourceforge.net/manual/#tiling
-
-**Show-snps**
-
-http://mummer.sourceforge.net/manual/#snps
-
-**Show-aligns**
-
-http://mummer.sourceforge.net/manual/#aligns
-
-**Delta-filter**
-
-http://mummer.sourceforge.net/manual/#filter
-
-
-
-
-
diff -r 59f302448cf6 -r c1c38335322e MUMmer/mummerplot_tool.sh
--- a/MUMmer/mummerplot_tool.sh Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,45 +0,0 @@
-#!/bin/bash
-
-## simple bash to generate mummerplot of MATCH file
-##
-## Galaxy wrapper by Alex Bossers, CVI of Wageningen UR, Lelystad, NL
-## alex_dot_bossers_at_wur_dot_nl
-##
-##
-## needs a rename of the fixed name to something recognised by galaxy
-## needs cleanout of temp files
-##
-## call is mummerplot $format $in_match $out_file $cmd_extra
-## $0 $1 $2 $3 $4
-##
-## since mummerplot uses some deprecated syntax which can be fixed in the source
-## we redirect STDERR to dev/null to circumvent errorstatus in galaxy
-## io redirects 0=stdin 1=stdout 2=stderr to dev/null (or &-)
-
-# path to where mummer suite is installed
-# adjust this for your machine
-# this is the only hard coded path in the scripts
-mum_path="/opt/MUMmer/MUMmer"
-
-# some default options to generate a LARGE fixed PNG/POSTSCRIPT image and not an interactive one.
-
-if [ "$1" = "png" ]; then
-    extension="png"
-else
-    extension="ps"
-fi
-
-eval "$mum_path/mummerplot --large --$1 $2 1>&- 2>&-"
-
-if [ -f "out.$extension" ]; then
-    #conditional move to something known by galaxy
-    mv out.$extension $3
-    #remove gnuplot file
-    rm out.gp
-fi
-
-## clean up
-rm out.fplot
-rm out.rplot
-
-#end script
diff -r 59f302448cf6 -r c1c38335322e MUMmer/mummerplot_tool.xml
--- a/MUMmer/mummerplot_tool.xml Tue Jun 07 17:22:27 2011 -0400
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,111 +0,0 @@
-
-  : Generate MUMmerplots from MUMmer match file
-
-    mummerplot_tool.sh
-    #if $img_format=="png"
-      png $input_match $out_png
-    #else
-      postscript $input_match $out_postscript
-    #end if
-    $cmd_extra
-
-
-
-
-
-
-
-
-
-
-
-      img_format=="png"
-
-
-      img_format=="postscript"
-
-
-
-    mummerplot
-
-
-
-
-
-
-|
-
-
-**Reference**
-=============
-
-- **MUMmerplot Galaxy tool wrapper: Alex Bossers, CVI of Wageningen UR, The Netherlands**
-
-- **MUMmerplot running on MUMmer-match file:** http://mummer.sourceforge.net/manual#mummerplot
-
-- **MUMmer tutorials:** http://mummer.sourceforge.net/examples/
-
-If you found these tools/wrappers usefull in your research, please acknowledge our work. If you improve
-or modify the wrappers please add instead of substitute yourself into the acknowlegement section :)
-
-
-**MUMmerplot**
-==============
-
-| This plotting tool requires a MUMmer match file (either the delta file or the tiling result file)!
-| MUMmerplot requires gnuplot (www.gnuplot.info) to be installed.
-|
-| **The plotting has by default set the arguments --large and --png/--postscript to generate a fixed image instead of an interactive view!** Optional cmd line arguments can be used.
-|
-
-
-
-Mummerplot is a script utility that takes output from *MUMmer, nucmer or promer* as DELTA file, or the
-*show-tiling* result file, and converts it to a format suitable for plotting with gnuplot. The primary
-plot type is an alignment dotplot where a sequence is laid out on each axis and a point is plotted at
-every position where the two sequences show similarity. As an extension to this plot style, mummerplot
-is also able to offset multiple 1-vs-1 dotplots to form a multiplot where multiple sequences can be
-laid out on each axis. This plot style is especially handy for browsing an alignment of two contig
-sets. Identity plots are also possible by coloring each data point with a color gradient representing
-identity, or by collapsing the y-axis data onto a single line and then vertically offsetting the
-data points by their identities. In addition to producing the plot data, mummerplot also generates a
-gnuplot script that will be evaluated in order to generate the graph.
-
-
-The *match file* can either be a three column match list from mummer (either 3 or 4 column format),
-the delta file from nucmer or promer, or the default output from show-tiling. mummerplot will
-automatically detect the type of input file it is given, regardless of its file extension, or it
-will fail if the input file is of an unrecognized type.
- - - -Optional command line arguments -------------------------------- - ---breaklen Highlight alignments with a breakpoint further than the given distance from the nearest sequence end ---nocolor Color plot lines with a percent similarity gradient or turn off all color (default color by match direction) ---coverage Generate a reference coverage plot, also known as a percent identity plot (default behavior for show-tiling input) ---depend Print dependency information and exit ---filter Only display alignments which represent the "best" one-to-one mapping of reference and query subsequences (requires delta formatted input) ---help Print help information and exit ---layout Lay out a multiplot by ordering and orienting sequences such that the largest hits cluster near the main diagonal (requires delta formatted input) ---prefix *do not use in galaxy!* Set the output file prefix (default 'out') ---rv Reverse video, swap the foreground and background colors for x11 plots (requires x11 terminal) ---IdR Select a specific reference sequence for the x-axis ---IdQ Select a specific query sequence for the y-axis ---Rfile Generate a multiplot by using the order and length information contained in this file, either a FastA file of the desired reference sequences or a tab-delimited list of sequence IDs, lengths and orientations [ +-] ---Qfile Generate a multiplot by using the order and length information contained in this file, either a FastA file of the desired query sequences or a tab-delimited list of sequence IDs, lengths and orientations [ +-] ---size Set the output size to small, medium or large ---large **enabled by default to generate a high-resolution image**. Other sizes have no effect: --small --medium --large ---SNP Highlight SNP locations in the alignment ---terminal *do not use in galaxy* Set the output terminal to x11, postscript or png ---png **either png or postscript for a fixed image**. The interactive x11 terminal is not enabled ---postscript Alternate output format instead of png. 
---xrange Set the x-range for the plot in the form "[min,max]" ---yrange Set the y-range for the plot in the form "[min,max]" ---version Display version information and exit - - - - - diff -r 59f302448cf6 -r c1c38335322e MUMmer/nucmer_coords2ACT_galaxy.pl --- a/MUMmer/nucmer_coords2ACT_galaxy.pl Tue Jun 07 17:22:27 2011 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,42 +0,0 @@ -#!/usr/bin/perl - -# converts the MUMmer-nucmer coords file into a file readable by the Artemis Comparison Tool -# Output format resembles BLAST crunch output -# -# [nov 2010] Galaxy wrapped up version -# -# Alex.Bossers@wur.nl - - -use warnings; -use strict; - -#$filename=shift; - #$ARGV[0] =~ m/^([A-Z0-9_.-]+)$/ig; -my $filename = $ARGV[0]; - #$ARGV[1] =~ m/^([A-Z0-9_.-]+)$/ig; -my $fileout = $ARGV[1]; -#my $filename = "Curated_vs_noncurated_8067_01.nucmer.coords"; -#my $fileout = "Curated_vs_noncurated_8067_01.nucmer.tab"; - -open (COORDS,$filename) || die "error opening input coords file"; -open (OUT,">$fileout") || die "error opening tab output file"; - -while (<COORDS>) - { - unless ($_ =~ /^(\s*)\d/){next} - $_ =~ s/\|//g; - - my @f = split; - # create crude match score = ((length_of_match * %identity)-(length_of_match * (100 - %identity))) /20 - my $crude_plus_score=($f[4]*$f[6]); - my $crude_minus_score=($f[4]*(100-$f[6])); - my $crude_score= int(($crude_plus_score - $crude_minus_score) / 20); - # reorganise columns and print crunch format to the output file - # score %id S1 E1 seq1 S2 E2 seq2 (description) - print OUT " $crude_score $f[6] $f[0] $f[1] $f[7] $f[2] $f[3] $f[8] nucmer comparison coordinates\n" - } - -close (COORDS); -close (OUT); -print "Done!\n\n"; diff -r 59f302448cf6 -r c1c38335322e MUMmer/nucmer_coords2ACT_galaxy.xml --- a/MUMmer/nucmer_coords2ACT_galaxy.xml Tue Jun 07 17:22:27 2011 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,39 +0,0 @@ - - : convert MUMmer comparison (coords) file to ACT (Artemis) - - nucmer_coords2ACT_galaxy.pl $in_coords $out_act - - - - - - - - - 
nucmer_coords2ACT_galaxy.pl - - - - - - -| -| - -**Info** --------- - -This tool will convert the MUMmer comparison file (run MUMmer with the coords option) into a "blast crunch" file -that can be read as a comparison file in the Artemis Comparison Tool (ACT). - -It will output a single tabular crunch file (save it with the extension .tab on Windows systems). - -**Reference/questions/remarks** - -- *Conversion perl script and wrapper:* Alex Bossers, CVI of Wageningen UR, The Netherlands. - - - - - - diff -r 59f302448cf6 -r c1c38335322e MUMmer/suite_config.xml --- a/MUMmer/suite_config.xml Tue Jun 07 17:22:27 2011 -0400 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,22 +0,0 @@ - - This suite contains MUMmer genome alignment tools and parsers - - : Compare genomes by alignment (Nucmer or Promer) - - - : Maximal exact sequence matching - - - : order sequence matches in clusters - - - : Show and filter on sequence delta file - - - : Generate MUMmerplots from MUMmer match file - - - : convert MUMmer comparison (coords) file to ACT (Artemis) - - - diff -r 59f302448cf6 -r c1c38335322e MUMmer/tool_dependencies.xml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/MUMmer/tool_dependencies.xml Tue Oct 28 16:59:05 2014 +0100 @@ -0,0 +1,13 @@ + + + + + + + + + + + \ No newline at end of file
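For reviewers checking the removed nucmer_coords2ACT_galaxy.pl, the crude match score it writes into the first crunch column can be reproduced by hand. The following is a minimal sketch in shell with awk, assuming the formula stated in the script's own comment: ((length × %identity) − (length × (100 − %identity))) / 20, truncated to an integer. The function name `crude_score` and the sample values are illustrative assumptions and are not part of the original wrappers.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original suite): reproduces the
# crude score formula from nucmer_coords2ACT_galaxy.pl.
#   $1 = alignment length, $2 = percent identity
crude_score() {
    awk -v len="$1" -v idy="$2" 'BEGIN {
        plus  = len * idy            # contribution of matching bases
        minus = len * (100 - idy)    # contribution of mismatching bases
        # int() truncates toward zero, as the int() call in the Perl script does
        printf "%d\n", int((plus - minus) / 20)
    }'
}

crude_score 1000 99.5   # prints 4950
crude_score 101 87.5    # prints 378
```

A 1000 bp alignment at 99.5% identity therefore scores 4950, while identities below 50% give negative scores.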