comparison kraken-filter.xml @ 6:ccfb9cbfcc72 draft
planemo upload for repository https://github.com/galaxyproject/tools-iuc/blob/master/tool_collections/kraken/kraken_filter/ commit e8fc7c9dad5f583ad6763ecb9bd8c924832abacd
| author | iuc |
|---|---|
| date | Mon, 07 Aug 2017 17:27:49 -0400 |
| parents | d246279116a4 |
| children | 6e690205b306 |
| 5:d246279116a4 | 6:ccfb9cbfcc72 |
|---|---|
| 1 <tool id="kraken-filter" name="Kraken-filter" version="1.2.1"> | 1 <tool id="kraken-filter" name="Kraken-filter" version="@WRAPPER_VERSION@"> |
| 2 <description> | 2 <description>filter classification by confidence score</description> |
| 3 filter classification by confidence score | |
| 4 </description> | |
| 5 <macros> | 3 <macros> |
| 6 <import>macros.xml</import> | 4 <import>macros.xml</import> |
| 7 </macros> | 5 </macros> |
| 8 <expand macro="requirements" /> | 6 <expand macro="requirements" /> |
| 9 <expand macro="stdio" /> | |
| 10 <expand macro="version_command" /> | 7 <expand macro="version_command" /> |
| 11 <command> | 8 <command detect_errors="exit_code"><![CDATA[ |
| 12 <![CDATA[ | |
| 13 @SET_DATABASE_PATH@ && | 9 @SET_DATABASE_PATH@ && |
| 14 kraken-filter @INPUT_DATABASE@ --threshold $threshold "${input}" > "$filtered_output" | 10 |
| 15 ]]> | 11 kraken-filter |
| 16 </command> | 12 @INPUT_DATABASE@ |
| | 13 --threshold $threshold |
| | 14 '${input}' |
| | 15 > '$filtered_output' |
| | 16 ]]></command> |
| 17 <inputs> | 17 <inputs> |
| 18 <param format="tabular" label="Kraken output" name="input" type="data" help="Select taxonomy classification produced by kraken"/> | 18 <param name="input" type="data" format="tabular" label="Kraken output" help="Select taxonomy classification produced by kraken"/> |
| 19 <param label="Confidence threshold" max="1" min="0" name="threshold" type="float" value="0" help="--threshold; A number between 0 and 1; default=0"/> | 19 <param argument="--threshold" type="float" value="0" min="0" max="1" |
| | 20 label="Confidence threshold" help="A floating point number between 0 and 1; default=0"/> |
| | 21 |
| 20 <expand macro="input_database" /> | 22 <expand macro="input_database" /> |
| 21 </inputs> | 23 </inputs> |
| 22 <outputs> | 24 <outputs> |
| 23 <data format="tabular" name="filtered_output" /> | 25 <data format="tabular" name="filtered_output" /> |
| 24 </outputs> | 26 </outputs> |
| 25 <tests> | 27 <tests> |
| 26 <test> | 28 <test> |
| 27 <param name="input" value="kraken_filter_test1.tab"/> | 29 <param name="input" value="kraken_filter_test1.tab"/> |
| 28 <param name="threshold" value="0"/> | 30 <param name="threshold" value="0"/> |
| 29 <param name="kraken_database" value="test_db"/> | 31 <param name="kraken_database" value="test_db"/> |
| 30 <output name="output" file="kraken_filter_test1_output.tab" ftype="tabular"/> | 32 |
| | 33 <output name="filtered_output" file="kraken_filter_test1_output.tab" ftype="tabular"/> |
| 31 </test> | 34 </test> |
| 32 </tests> | 35 </tests> |
| 33 | 36 |
| 34 <help> | 37 <help> |
| 35 <![CDATA[ | 38 <![CDATA[ |
| 44 | 47 |
| 45 At present, we have not yet developed a confidence score with a solid probabilistic interpretation for Kraken. However, we have developed a simple scoring scheme that has yielded good results for us, and we've made that available in the kraken-filter script. The approach we use allows a user to specify a threshold score in the [0,1] interval; the ``kraken-filter`` script then will adjust labels up the tree until the label's score (described below) meets or exceeds that threshold. If a label at the root of the taxonomic tree would not have a score exceeding the threshold, the sequence is called unclassified by ``kraken-filter``. | 48 At present, we have not yet developed a confidence score with a solid probabilistic interpretation for Kraken. However, we have developed a simple scoring scheme that has yielded good results for us, and we've made that available in the kraken-filter script. The approach we use allows a user to specify a threshold score in the [0,1] interval; the ``kraken-filter`` script then will adjust labels up the tree until the label's score (described below) meets or exceeds that threshold. If a label at the root of the taxonomic tree would not have a score exceeding the threshold, the sequence is called unclassified by ``kraken-filter``. |
| 46 | 49 |
| 47 A sequence label's score is a fraction C/Q, where C is the number of k-mers mapped to LCA values in the clade rooted at the label, and Q is the number of k-mers in the sequence that lack an ambiguous nucleotide (i.e., they were queried against the database). Consider the example of the LCA mappings in Kraken's output:: | 50 A sequence label's score is a fraction C/Q, where C is the number of k-mers mapped to LCA values in the clade rooted at the label, and Q is the number of k-mers in the sequence that lack an ambiguous nucleotide (i.e., they were queried against the database). Consider the example of the LCA mappings in Kraken's output:: |
| 48 | 51 |
| 49 562:13 561:4 A:31 0:1 562:3 | 52 562:13 561:4 A:31 0:1 562:3 |
| 50 | 53 |
| 51 would indicate that:: | 54 would indicate that:: |
| 52 | 55 |
| 53 the first 13 k-mers mapped to taxonomy ID #562 | 56 the first 13 k-mers mapped to taxonomy ID #562 |
| 54 the next 4 k-mers mapped to taxonomy ID #561 | 57 the next 4 k-mers mapped to taxonomy ID #561 |
