annotate EDeN_cross_validation.xml @ 11:bf63bd4cf462 draft default tip

Uploaded
author bgruening
date Thu, 15 May 2014 17:25:44 -0400
parents 5be8af51780d
5be8af51780d Uploaded
bgruening
<tool id="bg_eden_cross_validation" name="EDeN Crossvalidation" version="0.1">
    <description></description>
    <macros>
        <import>eden_macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <command>
        EDeN --action CROSS_VALIDATION

            --input_data_file_name $sparse_vector_infile
            --file_type "SPARSE_VECTOR"

            ## target_file_name is a file with 1 or -1 in each row, indicating the class
            --target_file_name $target_infile
            --binary_file_type

            --num_cross_validation_folds ${num_cross_validation_folds}
        ;
        cat cv_predictions | tr ' ' \\t > $outfile;

    </command>
    <inputs>
        <param format="eden_sparse_vector" name="sparse_vector_infile" type="data" label="Input File" help="(--input_data_file_name/-f)"/>
        <param format="txt" name="target_infile" type="data" label="Target file" help="One class label per row: 1 or -1"/>

        <param name="num_cross_validation_folds" type="integer" value="10" label="Number of cross-validation folds" help="(--num_cross_validation_folds/-c)">
            <validator type="in_range" min="1" />
        </param>
    </inputs>
    <outputs>
        <data format="tabular" name="outfile" label="Crossvalidation of ${on_string}"/>
    </outputs>
    <tests>
        <test>
        </test>
    </tests>
    <help>

.. class:: infomark

**What it does**

Runs a cross-validation of the EDeN linear model on the sparse vector input and writes the predictions as a tab-separated table.

The linear model is induced using the accelerated stochastic gradient descent technique by Léon Bottou and Yann LeCun.
When the target of an instance is 0, a self-training algorithm is used to impute a positive or negative class to the unlabeled instances.
If the target information is imbalanced, a minority class resampling technique is used to rebalance the training set.

@references@

    </help>
</tool>
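
The target file described above holds one class label per row. A quick check before running the tool might look like the sketch below. It is not part of EDeN or of this wrapper: the names `parse_targets` and `class_counts` are hypothetical, and the accepted label set (1 and -1 for the two classes, 0 for instances without a known class) is an assumption taken from the tool's comment and help text, not from the EDeN parser itself.

```python
from collections import Counter

# Assumption: 1 and -1 mark the two classes (per the tool's comment),
# and 0 marks an instance without a known class (per the help text).
ALLOWED_LABELS = {-1, 0, 1}


def parse_targets(lines):
    """Return one integer label per row; raise ValueError on a bad row."""
    labels = []
    for line_number, line in enumerate(lines, start=1):
        text = line.strip()
        try:
            label = int(text)  # an empty row also fails here
        except ValueError:
            raise ValueError(
                f"line {line_number}: not an integer: {text!r}"
            ) from None
        if label not in ALLOWED_LABELS:
            raise ValueError(f"line {line_number}: unexpected label {label}")
        labels.append(label)
    return labels


def class_counts(labels):
    """Count how many rows carry each label, to spot class imbalance."""
    return Counter(labels)


if __name__ == "__main__":
    # Hypothetical in-memory example; a real file could be passed as
    # parse_targets(open(path)) since file objects iterate line by line.
    example_rows = ["1", "-1", "-1", "0", "1", "-1"]
    print(class_counts(parse_targets(example_rows)))
```

Blank rows are rejected rather than skipped, because skipping one would shift every later label against its instance in the sparse vector file. The counts give a rough view of how imbalanced the two classes are, which is the situation the resampling step in the help text is meant to handle.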