<tool id="gd_evaluate_population_numbers" name="Evaluate" version="1.0.0">
  <description>possible numbers of populations</description>

  <command interpreter="bash">
    evaluate_population_numbers.bash "${input.extra_files_path}/admix.ped" "$output" "$max_populations"
  </command>

  <inputs>
    <param name="input" type="data" format="wped" label="Dataset" />
    <param name="max_populations" type="integer" min="1" value="5" label="Maximum number of populations" />
  </inputs>

  <outputs>
    <data name="output" format="txt" />
  </outputs>

  <tests>
    <test>
      <param name='input' value='fake' ftype='wped' >
        <metadata name='base_name' value='admix' />
        <composite_data value='genome_diversity/test_out/prepare_population_structure/prepare_population_structure.html' />
        <composite_data value='genome_diversity/test_out/prepare_population_structure/admix.ped' />
        <composite_data value='genome_diversity/test_out/prepare_population_structure/admix.map' />
        <edit_attributes type='name' value='fake' />
      </param>
      <param name='max_populations' value='2' />

      <output name="output" file="genome_diversity/test_out/evaluate_population_numbers/evaluate_population_numbers.txt" />
    </test>
  </tests>

  <help>
**What it does**

The user selects a dataset generated by the Galaxy tool to "prepare
to look for population structure". For each number K of ancestral
populations, from 1 up to a user-specified maximum, this tool produces a value
that indicates how well the data can be explained as genotypes from individuals
derived from K ancestral populations. These values are computed by a 5-fold
cross-validation procedure, so that a good choice for K will exhibit a low
cross-validation error compared with other potential settings for K.

**Acknowledgments**

We use the program "Admixture", downloaded from

http://www.genetics.ucla.edu/software/admixture/

and described in the paper "Fast model-based estimation of ancestry in
unrelated individuals" by David H. Alexander, John Novembre and Kenneth Lange,
Genome Research 19 (2009), pp. 1655-1664. Admixture is called with the "--cv"
flag to produce these values.
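As a rough illustration of the cross-validation step, the sketch below loops
over K, calls Admixture with "--cv" for each value, and then reports the K with
the lowest error. It is not the tool's actual evaluate_population_numbers.bash
script. The function names run_cv and best_k, the log file names, and the
assumption that Admixture prints lines of the form "CV error (K=2): 0.52305"
are illustrative only.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the actual evaluate_population_numbers.bash.

# Run Admixture with cross-validation for K = 1..max_k, one log per K.
# Assumes an `admixture` binary is on PATH (hypothetical setup).
run_cv() {
    local ped="$1" max_k="$2" k
    for k in $(seq 1 "$max_k"); do
        admixture --cv "$ped" "$k" > "cv_K${k}.log"
    done
}

# Read log text on stdin and print the K with the lowest CV error.
# Assumes lines of the form: CV error (K=2): 0.52305
best_k() {
    awk '/^CV error/ {
        k = $3; gsub(/[^0-9]/, "", k)   # "(K=2):" becomes "2"
        err = $4 + 0                     # force a numeric comparison
        if (best == "" || min > err) { min = err; best = k }
    } END { print best }'
}
```

Under these assumptions, running run_cv admix.ped 5 and then
cat cv_K*.log | best_k would print the K whose log reports the smallest
cross-validation error.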
  </help>
</tool>