changeset 3:2890083b1a84 draft
"planemo upload for repository https://github.com/mesocentre-clermont-auvergne/galaxy-tools/tree/master/tools/recentrifuge commit 000b196a8781301582ee706ab287f65f27478a12-dirty"
author | pimarin |
---|---|
date | Wed, 06 Apr 2022 14:54:52 +0000 |
parents | b135c5908e8c |
children | 512dc05a0e5a |
files | recentrifuge.xml |
diffstat | 1 files changed, 140 insertions(+), 125 deletions(-) |
--- a/recentrifuge.xml Wed Apr 06 13:52:48 2022 +0000
+++ b/recentrifuge.xml Wed Apr 06 14:54:52 2022 +0000
@@ -365,130 +365,145 @@
             <output name="logfile" file="kraken_test/test3_tsv.log" lines_diff="20"/>
         </test>
     </tests>
-    <help>
-    <![CDATA[
-    =-= /home/pierre/anaconda3/envs/rcf/bin/rcf =-= v1.8.1 - Mar 2022 =-= by Jose Manuel Martí =-=
-    usage: rcf [-h] [-V] [-n PATH] [--format GENERIC_FORMAT]
-               (-f FILE | -g FILE | -l FILE | -r FILE | -k FILE) [-o FILE]
-               [-e OUTPUT_TYPE] [-p] [--nohtml] [-a | -c CONTROLS_NUMBER]
-               [-s SCORING] [-y NUMBER] [-m INT] [-x TAXID] [-i TAXID] [-z NUMBER]
-               [-w INT] [-u SUMMARY_BEHAVIOR] [-t] [--nokollapse] [-d] [--strain]
-               [--sequential]
-    Robust comparative analysis and contamination removal for metagenomics
-    options:
-      -h, --help            show this help message and exit
-      -V, --version         show program's version number and exit
-    input:
-      Define Recentrifuge input files and formats
-      -n PATH, --nodespath PATH
-                            path for the nodes information files (nodes.dmp and
-                            names.dmp from NCBI)
-      --format GENERIC_FORMAT
-                            format of the output files from a generic classifier
-                            included with the option -g; It is a string like
-                            "TYP:csv,TID:1,LEN:3,SCO:6,UNC:0" where valid file
-                            TYPes are csv/tsv/ssv, and the rest of fields indicate
-                            the number of column used (starting in 1) for the
-                            TaxIDs assigned, the LENgth of the read, the SCOre
-                            given to the assignment, and the taxid code used for
-                            UNClassified reads
-      -f FILE, --file FILE  Centrifuge output files; if a single directory is
-                            entered, every .out file inside will be taken as a
-                            different sample; multiple -f is available to include
-                            several Centrifuge samples
-      -g FILE, --generic FILE
-                            output file from a generic classifier; it requires the
-                            flag --format (see such option for details); multiple
-                            -g is available to include several generic samples
-      -l FILE, --lmat FILE  LMAT output dir or file prefix; if just "." is
-                            entered, every subdirectory under the current
-                            directory will be taken as a sample and scanned
-                            looking for LMAT output files; multiple -l is
-                            available to include several samples
-      -r FILE, --clark FILE
-                            CLARK full-mode output files; if a single directory is
-                            entered, every .csv file inside will be taken as a
-                            different sample; multiple -r is available to include
-                            several CLARK, CLARK-l, and CLARK-S full-mode samples
-      -k FILE, --kraken FILE
-                            Kraken output files; if a single directory is entered,
-                            every .krk file inside will be taken as a different
-                            sample; multiple -k is available to include several
-                            Kraken (version 1 or 2) samples
-    output:
-      Related to the Recentrifuge output files
-      -o FILE, --outprefix FILE
-                            output prefix; if not given, it will be inferred from
-                            input files; an HTML filename is still accepted for
-                            backwards compatibility with legacy --outhtml option
-      -e OUTPUT_TYPE, --extra OUTPUT_TYPE
-                            type of extra output to be generated, and can be one
-                            of ['FULL', 'CSV', 'MULTICSV', 'TSV', 'DYNOMICS']
-      -p, --pickle          pickle (serialize) statistics and data results in
-                            pandas DataFrames (format affected by selection of
-                            --extra)
-      --nohtml              suppress saving the HTML output file
-    tuning:
-      Coarse tuning of algorithm parameters
-      -a, --avoidcross      avoid cross analysis
-      -c CONTROLS_NUMBER, --controls CONTROLS_NUMBER
-                            this number of first samples will be treated as
-                            negative controls; default is no controls
-      -s SCORING, --scoring SCORING
-                            type of scoring to be applied, and can be one of
-                            ['SHEL', 'LENGTH', 'LOGLENGTH', 'NORMA', 'LMAT',
-                            'CLARK_C', 'CLARK_G', 'KRAKEN', 'GENERIC']
-      -y NUMBER, --minscore NUMBER
-                            minimum score/confidence of the classification of a
-                            read to pass the quality filter; all pass by default
-      -m INT, --mintaxa INT
-                            minimum taxa to avoid collapsing one level into the
-                            parent (if not specified a value will be automatically
-                            assigned)
-      -x TAXID, --exclude TAXID
-                            NCBI taxid code to exclude a taxon and all underneath
-                            (multiple -x is available to exclude several taxid)
-      -i TAXID, --include TAXID
-                            NCBI taxid code to include a taxon and all underneath
-                            (multiple -i is available to include several taxid);
-                            by default, all the taxa are considered for inclusion
-    fine tuning:
-      Fine tuning of algorithm parameters
-      -z NUMBER, --ctrlminscore NUMBER
-                            minimum score/confidence of the classification of a
-                            read in control samples to pass the quality filter; it
-                            defaults to "minscore"
-      -w INT, --ctrlmintaxa INT
-                            minimum taxa to avoid collapsing one level into the
-                            parent (if not specified a value will be automatically
-                            assigned)
-      -u SUMMARY_BEHAVIOR, --summary SUMMARY_BEHAVIOR
-                            choice for summary behaviour, and can be one of
-                            ['ADD', 'ONLY', 'AVOID']
-      -t, --takeoutroot     remove counts directly assigned to the "root" level
-      --nokollapse          show the "cellular organisms" taxon
-    advanced:
-      Advanced modes of running
-      -d, --debug           increase output verbosity and perform additional
-                            checks
-      --strain              set strain level instead of species as the resolution
-                            limit for the robust contamination removal algorithm;
-                            use with caution, this is an experimental feature
-      --sequential          deactivate parallel processing
-    rcf - Release 1.8.1 - Mar 2022
-        Copyright (C) 2017–2022, Jose Manuel Martí Martínez
-        This program is free software: you can redistribute it and/or modify
-        it under the terms of the GNU Affero General Public License as
-        published by the Free Software Foundation, either version 3 of the
-        License, or (at your option) any later version.
-        This program is distributed in the hope that it will be useful,
-        but WITHOUT ANY WARRANTY; without even the implied warranty of
-        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-        GNU Affero General Public License for more details.
-        You should have received a copy of the GNU Affero General Public License
-        along with this program. If not, see <https://www.gnu.org/licenses/>.
-
-    ]]>
-    </help>
+    <help><![CDATA[
+usage: rcf [-h] [-V] [-n PATH] [--format GENERIC_FORMAT]
+           (-f FILE | -g FILE | -l FILE | -r FILE | -k FILE) [-o FILE]
+           [-e OUTPUT_TYPE] [-p] [--nohtml] [-a | -c CONTROLS_NUMBER]
+           [-s SCORING] [-y NUMBER] [-m INT] [-x TAXID] [-i TAXID] [-z NUMBER]
+           [-w INT] [-u SUMMARY_BEHAVIOR] [-t] [--nokollapse] [-d] [--strain]
+           [--sequential]
+
+Robust comparative analysis and contamination removal for metagenomics
+
+options:
+-h, --help              show this help message and exit
+-V, --version           show program's version number and exit
+
+input:
+Define Recentrifuge input files and formats
+
+-n PATH, --nodespath PATH
+                        path for the nodes information files (nodes.dmp and
+                        names.dmp from NCBI)
+--format GENERIC_FORMAT
+                        format of the output files from a generic classifier
+                        included with the option -g; It is a string like
+                        "TYP:csv,TID:1,LEN:3,SCO:6,UNC:0" where valid file
+                        TYPes are csv/tsv/ssv, and the rest of fields indicate
+                        the number of column used (starting in 1) for the
+                        TaxIDs assigned, the LENgth of the read, the SCOre
+                        given to the assignment, and the taxid code used for
+                        UNClassified reads
+-f FILE, --file FILE    Centrifuge output files; if a single directory is
+                        entered, every .out file inside will be taken as a
+                        different sample; multiple -f is available to include
+                        several Centrifuge samples
+-g FILE, --generic FILE
+                        output file from a generic classifier; it requires the
+                        flag --format (see such option for details); multiple
+                        -g is available to include several generic samples
+-l FILE, --lmat FILE    LMAT output dir or file prefix; if just "." is
+                        entered, every subdirectory under the current
+                        directory will be taken as a sample and scanned
+                        looking for LMAT output files; multiple -l is
+                        available to include several samples
+-r FILE, --clark FILE
+                        CLARK full-mode output files; if a single directory is
+                        entered, every .csv file inside will be taken as a
+                        different sample; multiple -r is available to include
+                        several CLARK, CLARK-l, and CLARK-S full-mode samples
+-k FILE, --kraken FILE
+                        Kraken output files; if a single directory is entered,
+                        every .krk file inside will be taken as a different
+                        sample; multiple -k is available to include several
+                        Kraken (version 1 or 2) samples
+
+output:
+Related to the Recentrifuge output files
+
+-o FILE, --outprefix FILE
+                        output prefix; if not given, it will be inferred from
+                        input files; an HTML filename is still accepted for
+                        backwards compatibility with legacy --outhtml option
+-e OUTPUT_TYPE, --extra OUTPUT_TYPE
+                        type of extra output to be generated, and can be one
+                        of ['FULL', 'CSV', 'MULTICSV', 'TSV', 'DYNOMICS']
+-p, --pickle            pickle (serialize) statistics and data results in
+                        pandas DataFrames (format affected by selection of
+                        --extra)
+--nohtml                suppress saving the HTML output file
+
+tuning:
+Coarse tuning of algorithm parameters
+
+-a, --avoidcross        avoid cross analysis
+-c CONTROLS_NUMBER, --controls CONTROLS_NUMBER
+                        this number of first samples will be treated as
+                        negative controls; default is no controls
+-s SCORING, --scoring SCORING
+                        type of scoring to be applied, and can be one of
+                        ['SHEL', 'LENGTH', 'LOGLENGTH', 'NORMA', 'LMAT',
+                        'CLARK_C', 'CLARK_G', 'KRAKEN', 'GENERIC']
+-y NUMBER, --minscore NUMBER
+                        minimum score/confidence of the classification of a
+                        read to pass the quality filter; all pass by default
+-m INT, --mintaxa INT
+                        minimum taxa to avoid collapsing one level into the
+                        parent (if not specified a value will be automatically
+                        assigned)
+-x TAXID, --exclude TAXID
+                        NCBI taxid code to exclude a taxon and all underneath
+                        (multiple -x is available to exclude several taxid)
+-i TAXID, --include TAXID
+                        NCBI taxid code to include a taxon and all underneath
+                        (multiple -i is available to include several taxid);
+                        by default, all the taxa are considered for inclusion
+
+fine tuning:
+Fine tuning of algorithm parameters
+
+-z NUMBER, --ctrlminscore NUMBER
+                        minimum score/confidence of the classification of a
+                        read in control samples to pass the quality filter; it
+                        defaults to "minscore"
+-w INT, --ctrlmintaxa INT
+                        minimum taxa to avoid collapsing one level into the
+                        parent (if not specified a value will be automatically
+                        assigned)
+-u SUMMARY_BEHAVIOR, --summary SUMMARY_BEHAVIOR
+                        choice for summary behaviour, and can be one of
+                        ['ADD', 'ONLY', 'AVOID']
+-t, --takeoutroot       remove counts directly assigned to the "root" level
+--nokollapse            show the "cellular organisms" taxon
+
+advanced:
+Advanced modes of running
+
+-d, --debug             increase output verbosity and perform additional
+                        checks
+--strain                set strain level instead of species as the resolution
+                        limit for the robust contamination removal algorithm;
+                        use with caution, this is an experimental feature
+--sequential            deactivate parallel processing
+
+rcf - Release 1.8.1 - Mar 2022
+
+    Copyright (C) 2017–2022, Jose Manuel Martí Martínez
+
+    This program is free software: you can redistribute it and/or modify
+    it under the terms of the GNU Affero General Public License as
+    published by the Free Software Foundation, either version 3 of the
+    License, or (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+    GNU Affero General Public License for more details.
+
+    You should have received a copy of the GNU Affero General Public License
+    along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+
+    ]]></help>
     <expand macro="citations"/>
 </tool>
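
The help text in this changeset describes the `--format` string for generic classifier output ("TYP:csv,TID:1,LEN:3,SCO:6,UNC:0") only in prose. As an illustration of how such a string breaks down, here is a minimal Python sketch. It is not Recentrifuge's own parser: the function name `parse_generic_format`, the returned key names, and the separator assumed for each file type (comma, tab, space) are hypothetical choices made for this example.

```python
# Illustrative sketch only; NOT taken from the Recentrifuge source code.
# Function name, dict keys and the separator mapping are assumptions.

REQUIRED_KEYS = {"TYP", "TID", "LEN", "SCO", "UNC"}

# Assumed meaning of the three file types named in the help text.
SEPARATORS = {"csv": ",", "tsv": "\t", "ssv": " "}


def parse_generic_format(spec: str) -> dict:
    """Split a string like 'TYP:csv,TID:1,LEN:3,SCO:6,UNC:0' into fields."""
    fields = {}
    for item in spec.split(","):
        key, _, value = item.partition(":")
        fields[key.strip().upper()] = value.strip()

    missing = REQUIRED_KEYS - fields.keys()
    if missing:
        raise ValueError(f"missing field(s): {', '.join(sorted(missing))}")

    file_type = fields["TYP"].lower()
    if file_type not in SEPARATORS:
        raise ValueError(f"unknown file type: {file_type!r}")

    return {
        "separator": SEPARATORS[file_type],
        # Column numbers start at 1, as stated in the help text.
        "taxid_col": int(fields["TID"]),
        "length_col": int(fields["LEN"]),
        "score_col": int(fields["SCO"]),
        # Taxid code used for unclassified reads, kept as a string.
        "unclassified_taxid": fields["UNC"],
    }


if __name__ == "__main__":
    print(parse_generic_format("TYP:csv,TID:1,LEN:3,SCO:6,UNC:0"))
```

With the example string from the help text, the sketch reports a comma separator, TaxIDs in column 1, read lengths in column 3, scores in column 6, and "0" as the code for unclassified reads.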