comparison salsa2.xml @ 3:5af503c47367 draft
"planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/salsa2 commit 4904594e8df7cbd6eeee4be24023c6bd15e162de"
| author | iuc |
|---|---|
| date | Thu, 11 Nov 2021 15:02:50 +0000 |
| parents | e15025381b3b |
| children | 41c4e48b0617 |
| 2:e15025381b3b | 3:5af503c47367 |
|---|---|
| 3 <xrefs> | 3 <xrefs> |
| 4 <xref type="bio.tools">SALSA</xref> | 4 <xref type="bio.tools">SALSA</xref> |
| 5 </xrefs> | 5 </xrefs> |
| 6 <macros> | 6 <macros> |
| 7 <token name="@TOOL_VERSION@">2.3</token> | 7 <token name="@TOOL_VERSION@">2.3</token> |
| 8 <token name="@VERSION_SUFFIX@">1</token> | 8 <token name="@VERSION_SUFFIX@">2</token> |
| 9 </macros> | 9 </macros> |
| 10 <requirements> | 10 <requirements> |
| 11 <requirement type="package" version="@TOOL_VERSION@">salsa2</requirement> | 11 <requirement type="package" version="@TOOL_VERSION@">salsa2</requirement> |
| 12 <requirement type="package" version="1.11">samtools</requirement> | 12 <requirement type="package" version="1.11">samtools</requirement> |
| 13 </requirements> | 13 </requirements> |
| 41 #end if | 41 #end if |
| 42 -o ./out | 42 -o ./out |
| 43 ]]></command> | 43 ]]></command> |
| 44 <inputs> | 44 <inputs> |
| 45 <param name="fasta_in" type="data" format="fasta" label="Initial assembly file" help="Headers must not contain ':'."/> | 45 <param name="fasta_in" type="data" format="fasta" label="Initial assembly file" help="Headers must not contain ':'."/> |
| 46 <param name="bed_file" type="data" format="bed" label="Bed alignment" help="Sorted by read names"/> | 46 <param name="bed_file" type="data" format="bed" label="Bed alignment" help="To start scaffolding with SALSA, reads need to be mapped to the assembly. |
| 47 <param name="cutoff" argument="-c" type="integer" min="1" label="Cutoff" optional="true" help="Minimum contig length to scaffold."/> | 47 BWA or BOWTIE2 are recommended. SALSA requires a bed file as the input. The alignment bam file can be converted using the bamToBed command from |
| 48 <param name="gfa_file" argument="-g" type="data" format="gfa1,gfa2" optional="true" label="Sequence graphs" help="Sequence graphs encoded in GFA format."/> | 48 the Bedtools package."/> |
| 49 <param name="cutoff" argument="-c" type="integer" min="1" label="Cutoff" optional="true" help="Minimum contig length to scaffold"/> | |
| 50 <param name="gfa_file" argument="-g" type="data" format="gfa1,gfa2" optional="true" label="Sequence graphs" | |
| 51 help="An assembly graph can be optionally provided to guide the scaffolding, potentially reducing the scaffolding errors"/> | |
| 49 <conditional name="enzyme_conditional"> | 52 <conditional name="enzyme_conditional"> |
| 50 <param name="enzyme_options" type="select" label="Enzyme selection" help="TODO"> | 53 <param name="enzyme_options" type="select" label="Enzyme selection" help="Hi-C experiments can use different restriction enzymes. |
| 54 The enzyme frequency in contigs is used to normalize the Hi-C interaction frequency. Note that you need to specify the actual | |
| 55 sequence of the cutting site for a restriction enzyme and not the enzyme name. You can also specify DNASE as an enzyme if you | |
| 56 use an enzyme-free prep, e.g. Omin-C."> | |
| 51 <option value="preconfigured">Preconfigured restriction enzymes</option> | 57 <option value="preconfigured">Preconfigured restriction enzymes</option> |
| 52 <option value="specific">Enter a specific sequence</option> | 58 <option value="specific">Enter a specific sequence</option> |
| 53 </param> | 59 </param> |
| 54 <when value="preconfigured"> | 60 <when value="preconfigured"> |
| 55 <param name="preconfigured_enzymes" type="select" multiple="true" label="Preconfigured enzymes"> | 61 <param name="preconfigured_enzymes" type="select" multiple="true" label="Preconfigured enzymes"> |
| 63 help="Restriction enzyme sequence. If multiple were used, include all as a comma separated list without spaces (ex. 'GATC,AAGCTT')."> | 69 help="Restriction enzyme sequence. If multiple were used, include all as a comma separated list without spaces (ex. 'GATC,AAGCTT')."> |
| 64 <validator type="expression" message="Only alphabetical letters and the comma can be used in to define restriction enzym sequences.">value.replace(',', '').isalpha()</validator> | 70 <validator type="expression" message="Only alphabetical letters and the comma can be used in to define restriction enzym sequences.">value.replace(',', '').isalpha()</validator> |
| 65 </param> | 71 </param> |
| 66 </when> | 72 </when> |
| 67 </conditional> | 73 </conditional> |
| 68 <param name="iter" argument="-i" type="integer" min="0" label="Iterations" optional="true" help="Number of iterations to run"/> | 74 <param name="iter" argument="-i" type="integer" min="0" max="20" label="Iterations" optional="true" |
| 75 help="SALSA will scaffold through sequential iterations. The default number of iterations is 3. Increasing the number of iterations will | |
| 76 potentially increase the number of joins, however it could also introduce additional misjoins"/> | |
| 69 </inputs> | 77 </inputs> |
| 70 <outputs> | 78 <outputs> |
| 71 <data name="scaffolds_fasta" format="fasta" from_work_dir="out/scaffolds_FINAL.fasta" label="${tool.name} on ${on_string}: FASTA assembly"/> | 79 <data name="scaffolds_fasta" format="fasta" from_work_dir="out/scaffolds_FINAL.fasta" label="${tool.name} on ${on_string}: FASTA assembly"/> |
| 72 <data name="scaffolds_agp" format="tabular" from_work_dir="out/scaffolds_FINAL.agp" label="${tool.name} on ${on_string}: agp output"/> | 80 <data name="scaffolds_agp" format="tabular" from_work_dir="out/scaffolds_FINAL.agp" label="${tool.name} on ${on_string}: agp output"/> |
| 73 </outputs> | 81 </outputs> |
| 115 <output name="scaffolds_fasta" file="out.fasta"/> | 123 <output name="scaffolds_fasta" file="out.fasta"/> |
| 116 <output name="scaffolds_agp" file="out.agp"/> | 124 <output name="scaffolds_agp" file="out.agp"/> |
| 117 </test> | 125 </test> |
| 118 </tests> | 126 </tests> |
| 119 <help><![CDATA[ | 127 <help><![CDATA[ |
| 120 **What is does** | 128 .. class:: infomark |
| 129 | |
| 130 **Purpose** | |
| 121 | 131 |
| 122 SALSA (Simple AssembLy ScAffolder) is a scaffolding tool based on a computational method that exploits the genomic proximity | 132 SALSA (Simple AssembLy ScAffolder) is a scaffolding tool based on a computational method that exploits the genomic proximity |
| 123 information in Hi-C data sets for long range scaffolding of de novo genome assemblies. | 133 information in Hi-C data sets for long range scaffolding of de novo genome assemblies. |
| 134 | |
| 135 ---- | |
| 136 | |
| 137 .. class:: infomark | |
| 138 | |
| 139 **Mapping reads** | |
| 140 | |
| 141 To start the scaffolding, first step is to map reads to the assembly. We recommend using `BWA <https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.17.2>`_ | |
| 142 or `BOWTIE2 <https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/devteam/bowtie2/bowtie2/2.4.2+galaxy0>`_ aligner to map reads. The read mapping generates a bam file. SALSA requires | |
| 143 BED file as the input. This can be done using the bamToBed command from the `Bedtools package <http://bedtools.readthedocs.io/en/latest/>`_. Also, SALSA requires BED files to be sorted by the | |
| 144 read name, rather than the alignment coordinates. Once you have bam file, you can run following commands to get the bam file needed as an input to SALSA. | |
| 145 | |
| 146 Since Hi-C reads and alignments contain experimental artifacts, the alignments needs some postprocessing. To align and postprocess | |
| 147 the alignments, you can use the pipeline released by Arima Genomics which can be found in the `GitHub repository <https://github.com/ArimaGenomics>`_. | |
| 148 | |
| 149 Additional information on how to generate/filter the bam `here <https://github.com/marbl/SALSA#mapping-reads>`_. | |
| 124 | 150 |
| 125 ]]></help> | 151 ]]></help> |
| 126 <citations> | 152 <citations> |
| 127 <citation type="doi">10.1101/261149</citation> | 153 <citation type="doi">10.1101/261149</citation> |
| 128 <citation type="doi">10.1186/s12864-017-3879-z</citation> | 154 <citation type="doi">10.1186/s12864-017-3879-z</citation> |
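
The `validator` expression on the custom enzyme parameter, `value.replace(',', '').isalpha()`, is plain Python, so its behavior can be checked outside Galaxy. The following is a minimal sketch under that assumption; the wrapper name `is_valid_enzyme_list` and the sample inputs are hypothetical and not part of the tool.

```python
def is_valid_enzyme_list(value: str) -> bool:
    """Apply the same expression as the tool's validator (hypothetical wrapper)."""
    # Strip the separators, then require that every remaining character is a letter.
    return value.replace(',', '').isalpha()


if __name__ == "__main__":
    # Hypothetical sample inputs, chosen to exercise the edge cases.
    samples = [
        "GATC",           # single cutting-site sequence
        "GATC,AAGCTT",    # comma-separated list without spaces
        "GATC, AAGCTT",   # space after the comma
        "GATC;AAGCTT",    # wrong separator
        "",               # empty input
        "GATC,,AAGCTT",   # empty item between two commas
        "XYZ",            # letters that are not nucleotide codes
    ]
    for text in samples:
        print(f"{text!r}: {is_valid_enzyme_list(text)}")
```

The expression rejects spaces, other separators, and empty input, but it accepts any alphabetic string and tolerates empty list items, so it checks the character set only and does not confirm that the input is a real restriction site.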
