annotate README.md @ 1:b0f2a3fd3c86 draft default tip

planemo upload
author yating-l
date Mon, 15 May 2017 15:03:14 -0400
parents 036cbfb47ee2
children
WindowMasker
------------

This is a Galaxy wrapper, created by Wilson Leung, for WindowMasker. WindowMasker is a program that masks highly repetitive and low-complexity DNA sequences within a genome using the sequence of the genome itself.

The WinMask module works in two stages. During Stage 1, unit counts are collected and stored in a separate file. During Stage 2, that file is used to mask the input sequences. Usually the unit counts file is created once per genome and then reused for multiple masking runs.
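The two-stage workflow can be sketched as a pair of command lines: one counts run per genome, then one masking run per input. The sketch below only builds argument lists and does not run anything. The file names and the helper names `mk_counts_cmd` and `ustat_cmd` are hypothetical; the flags are taken from the usage lines in this README.

```python
def mk_counts_cmd(genome_fasta, counts_file):
    """Stage 1: collect unit counts for a genome (run once per genome)."""
    return ["windowmasker", "-mk_counts", "-in", genome_fasta, "-out", counts_file]


def ustat_cmd(counts_file, input_fasta, output_file):
    """Stage 2: mask input sequences using an existing unit counts file."""
    return ["windowmasker", "-ustat", counts_file, "-in", input_fasta, "-out", output_file]


# Hypothetical file names, for illustration only.
counts = "genome.counts"
stage1 = mk_counts_cmd("genome.fa", counts)

# The same counts file is reused for every masking run.
stage2 = [ustat_cmd(counts, fa, fa + ".masked") for fa in ["chr1.fa", "chr2.fa"]]

for cmd in [stage1] + stage2:
    print(" ".join(cmd))
```

To actually run a command, each list could be passed to `subprocess.run(cmd, check=True)`, assuming `windowmasker` is on the `PATH`.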

WindowMasker_mkcounts
======================
Stage 1: Generate a counts file

$ windowmasker -mk_counts [-in input_file_name] [-out output_file_name] [-checkdup check_duplicates] [-t_low T_low] [-t_high T_high] [-fa_list input_is_a_list] [-mem available_memory] [-unit unit_length] [-genome_size genome_size] [-exclude_ids exclude_id_list] [-ids id_list] [-infmt input_format] [-sformat unit_counts_format] [-smem available_memory] [-use_ba use_bit_arrays]
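As an illustration of the optional Stage 1 parameters, the sketch below turns a dictionary of options into a `-mk_counts` command line and rejects any flag that does not appear in the usage line above. The helper name `build_mk_counts` and the example values are hypothetical. Writing booleans as `true`/`false` is an assumption based on the `-dust true` example in this README, and the units of values such as `-mem` are not stated here, so check them against the WindowMasker documentation.

```python
# Optional flags copied from the -mk_counts usage line in this README.
MK_COUNTS_FLAGS = {
    "in", "out", "checkdup", "t_low", "t_high", "fa_list", "mem", "unit",
    "genome_size", "exclude_ids", "ids", "infmt", "sformat", "smem", "use_ba",
}


def build_mk_counts(options):
    """Return an argument list for Stage 1, validating the flag names."""
    unknown = sorted(set(options) - MK_COUNTS_FLAGS)
    if unknown:
        raise ValueError("unknown -mk_counts option(s): " + ", ".join(unknown))
    cmd = ["windowmasker", "-mk_counts"]
    for name, value in options.items():
        # Assumption: booleans are written as the strings "true"/"false".
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd += ["-" + name, str(value)]
    return cmd


# Hypothetical values, for illustration only.
cmd = build_mk_counts({
    "in": "genome.fa",
    "out": "genome.counts",
    "checkdup": True,
    "mem": 2048,
})
print(" ".join(cmd))
```

Validating against a fixed set catches misspelled flag names before the tool is invoked.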

WindowMasker_ustat
===================
Stage 2: WindowMasker reads the data generated in Stage 1 and a set of input DNA sequences, and outputs information about the masked subintervals. If "-dust true" is specified, the DUST module's algorithm is applied to the input sequences in addition to window-based masking. When the DUST module is run, the results of the DUST and WinMask modules are merged in the output: a base is masked if it is masked by either DUST or WinMask.
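The merge rule (a base is masked if either module masks it) amounts to taking the union of two interval sets. The sketch below illustrates that rule on plain Python tuples. It is not WindowMasker's implementation; the 0-based half-open `(start, end)` convention and the function name `merge_masks` are assumptions made for the example.

```python
def merge_masks(winmask, dust):
    """Union of two lists of half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(winmask + dust):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical masked intervals from each module.
winmask = [(10, 50), (200, 260)]
dust = [(40, 70), (120, 130)]
print(merge_masks(winmask, dust))  # [(10, 70), (120, 130), (200, 260)]
```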

$ windowmasker -ustat unit_counts [-in input_file_name] [-out output_file_name] [-window window_size] [-t_thres T_threshold] [-t_extend T_extend] [-t_low T_low] [-t_high T_high] [-set_t_low score] [-set_t_high score] [-infmt input_format] [-outfmt output_format] [-dust use_dust] [-exclude_ids exclude_id_list] [-ids id_list] [-text_match text_match_ids] [-use_ba use_bit_arrays]
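A minimal sketch of a Stage 2 call that also enables DUST and selects an output format. The function name `build_ustat` and the file names are hypothetical. The `-outfmt` value is passed through unchanged; `maskinfo_asn1_bin` is used on the assumption that it names the binary maskinfo ASN.1 format, so confirm the accepted values with `windowmasker -help`.

```python
import shlex


def build_ustat(counts_file, input_file, output_file, outfmt=None, dust=False):
    """Return an argument list for Stage 2 (masking with a unit counts file)."""
    cmd = ["windowmasker", "-ustat", counts_file, "-in", input_file, "-out", output_file]
    if outfmt is not None:
        cmd += ["-outfmt", outfmt]
    if dust:
        # "-dust true" applies the DUST module in addition to WinMask.
        cmd += ["-dust", "true"]
    return cmd


# Hypothetical file names; the -outfmt value is an assumption (see above).
cmd = build_ustat("genome.counts", "genome.fa", "genome.mask.asnb",
                  outfmt="maskinfo_asn1_bin", dust=True)
print(" ".join(shlex.quote(arg) for arg in cmd))
```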

Output formats:
* Use the binary or text maskinfo ASN.1 output formats to generate the mask file for the NCBI BLAST+ makeblastdb tool
* Use the BED output format to generate a list of masked regions
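Once a BED file of masked regions is available, a quick sanity check is to total the masked bases per sequence. The sketch below assumes standard three-column BED (tab-separated name, 0-based start, exclusive end) with non-overlapping intervals. The sample lines and the function name `masked_bases` are made up for illustration.

```python
from collections import defaultdict


def masked_bases(bed_lines):
    """Sum masked interval lengths per sequence from BED lines."""
    totals = defaultdict(int)
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        name, start, end = line.split("\t")[:3]
        # BED is half-open, so the interval length is end - start.
        totals[name] += int(end) - int(start)
    return dict(totals)


# Made-up sample; in practice, pass an open file handle instead.
sample = ["chr1\t10\t70", "chr1\t120\t130", "chr2\t0\t25"]
print(masked_bases(sample))  # {'chr1': 70, 'chr2': 25}
```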

Reference
==========
[NCBI C++ Toolkit Cross Reference -- WindowMasker](https://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/lxr/source/src/app/winmasker/README)

Citation
=========

[1] Morgulis A, Gertz EM, Schaffer AA, Agarwala R. WindowMasker:
Window based masker for sequenced genomes. Submitted for publication.