Mercurial > repos > bgruening > text_processing
comparison easyjoin.xml @ 0:ec66f9d90ef0 draft
initial uploaded
| author | bgruening |
|---|---|
| date | Thu, 05 Sep 2013 04:58:21 -0400 |
| parents | |
| children | 7068d1548234 |
| -1:000000000000 | 0:ec66f9d90ef0 |
|---|---|
| 1 <tool id="unixtools_easyjoin_tool" name="Join" version="0.1.1"> | |
| 2 <requirements> | |
| 3 <requirement type="package" version="8.21">gnu_coreutils</requirement> | |
| 4 </requirements> | |
| 5 <description>two files</description> | |
| 6 <command interpreter="perl">easyjoin $jointype | |
| 7 -t ' ' | |
| 8 $header | |
| 9 -e '$empty_string_filler' | |
| 10 -o auto | |
| 11 $ignore_case | |
| 12 -1 '$column1' | |
| 13 -2 '$column2' | |
| 14 "$input1" "$input2" | |
| 15 > '$output' | |
| 16 </command> | |
| 17 | |
| 18 <inputs> | |
| 19 <param format="txt" name="input1" type="data" label="1st file" /> | |
| 20 <param name="column1" label="Column to use from 1st file" type="data_column" data_ref="input1" accept_default="true" /> | |
| 21 | |
| 22 <param format="txt" name="input2" type="data" label="2nd file" /> | |
| 23 <param name="column2" label="Column to use from 2nd file" type="data_column" data_ref="input2" accept_default="true" /> | |
| 24 | |
| 25 <param name="jointype" type="select" label="Output lines appearing in"> | |
| 26 <option value=" ">BOTH 1st &amp; 2nd file.</option> | |
| 27 <option value="-v 1">1st but not in 2nd file. [-v 1]</option> | |
| 28 <option value="-v 2">2nd but not in 1st file. [-v 2]</option> | |
| 29 <option value="-a 1">both 1st &amp; 2nd file, plus unpairable lines from 1st file. [-a 1]</option> | |
| 30 <option value="-a 2">both 1st &amp; 2nd file, plus unpairable lines from 2nd file. [-a 2]</option> | |
| 31 <option value="-a 1 -a 2">All Lines [-a 1 -a 2]</option> | |
| 32 <option value="-v 1 -v 2">All unpairable lines [-v 1 -v 2]</option> | |
| 33 </param> | |
| 34 | |
| 35 <param name="header" type="boolean" checked="false" truevalue="--header" falsevalue="" label="First line is a header line" help="Use if first line contains column headers. It will not be sorted." /> | |
| 36 | |
| 37 <param name="ignore_case" type="boolean" checked="false" truevalue="-i" falsevalue="" label="Ignore case" help="Sort and Join key column values regardless of upper/lower case letters." /> | |
| 38 | |
| 39 <param name="empty_string_filler" type="text" size="20" value="0" label="Value to put in unpaired (empty) fields"> | |
| 40 <sanitizer> | |
| 41 <valid initial="string.printable"> | |
| 42 <remove value="'"/> | |
| 43 </valid> | |
| 44 </sanitizer> | |
| 45 </param> | |
| 46 | |
| 47 </inputs> | |
| 48 <outputs> | |
| 49 <data name="output" format="input" metadata_source="input1"/> | |
| 50 </outputs> | |
| 51 | |
| 52 <help> | |
| 53 **What it does** | |
| 54 | |
| 55 This tool joins two tabular files based on a common key column. | |
| 56 | |
| 57 ----- | |
| 58 | |
| 59 **Example** | |
| 60 | |
| 61 **First file**:: | |
| 62 | |
| 63 Fruit Color | |
| 64 Apple red | |
| 65 Banana yellow | |
| 66 Orange orange | |
| 67 Melon green | |
| 68 | |
| 69 **Second file**:: | |
| 70 | |
| 71 Fruit Price | |
| 72 Orange 7 | |
| 73 Avocado 8 | |
| 74 Apple 4 | |
| 75 Banana 3 | |
| 76 | |
| 77 **Joining** both files, using **key column 1** and a **header line**, will return:: | |
| 78 | |
| 79 Fruit Color Price | |
| 80 Apple red 4 | |
| 81 Avocado . 8 | |
| 82 Banana yellow 3 | |
| 83 Melon green . | |
| 84 Orange orange 7 | |
| 85 | |
| 86 # Input files need not be sorted. | |
| 87 # The header line (**Fruit Color Price**) was joined and kept as first line. | |
| 88 # Missing values (such as Avocado's color, which is absent from the first file) are replaced with the chosen filler value, here a period character. | |
| 89 | |
| 90 ----- | |
| 91 | |
| 92 *easyjoin* was written by A. Gordon | |
| 93 | |
| 94 </help> | |
| 95 </tool> |
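
The join behaviour described in the tool's help text can be illustrated with a short Python sketch. This is not the easyjoin script or GNU join itself: `join_tables`, `_rest`, and all parameter names are hypothetical and exist only for this illustration. The sketch also assumes every row in a table has the same number of fields and that plain Python string ordering is acceptable, whereas the real tools sort according to the locale. `keep_left` and `keep_right` loosely stand in for the `-a 1` and `-a 2` options, and `filler` for the `-e` value.

```python
from collections import defaultdict


def _rest(row, key):
    """Return the row without its key field (hypothetical helper)."""
    return row[:key] + row[key + 1:]


def join_tables(left, right, key1=0, key2=0, header=False,
                keep_left=True, keep_right=True, filler="."):
    """Join two tables (lists of lists of strings) on a key column.

    Illustrative only: a sketch of the semantics in the help text,
    not the code that the Galaxy tool runs.
    """
    # Filler fields used when a key is missing from one of the tables.
    pad_left = [filler] * (len(left[0]) - 1)
    pad_right = [filler] * (len(right[0]) - 1)

    out = []
    if header:
        # Header lines are joined with each other and are never sorted.
        out.append([left[0][key1]] + _rest(left[0], key1) + _rest(right[0], key2))
        left, right = left[1:], right[1:]

    # Group the non-key fields of each table by key value.
    left_by_key = defaultdict(list)
    for row in left:
        left_by_key[row[key1]].append(_rest(row, key1))
    right_by_key = defaultdict(list)
    for row in right:
        right_by_key[row[key2]].append(_rest(row, key2))

    # Walk the keys in sorted order, so the inputs need not be sorted.
    for key in sorted(set(left_by_key) | set(right_by_key)):
        l_rows = left_by_key.get(key)
        r_rows = right_by_key.get(key)
        if l_rows and r_rows:
            # Paired lines: every combination of matching rows.
            for l in l_rows:
                for r in r_rows:
                    out.append([key] + l + r)
        elif l_rows and keep_left:
            # Unpairable lines from the first table.
            out.extend([key] + l + pad_right for l in l_rows)
        elif r_rows and keep_right:
            # Unpairable lines from the second table.
            out.extend([key] + pad_left + r for r in r_rows)
    return out


# The two example files from the help text, as lists of fields.
first = [["Fruit", "Color"], ["Apple", "red"], ["Banana", "yellow"],
         ["Orange", "orange"], ["Melon", "green"]]
second = [["Fruit", "Price"], ["Orange", "7"], ["Avocado", "8"],
          ["Apple", "4"], ["Banana", "3"]]

result = join_tables(first, second, header=True)
for row in result:
    print("\t".join(row))
```

With both `keep_left` and `keep_right` enabled, the sketch yields the same six rows as the joined table in the help text. Setting both to `False` keeps only the keys present in both tables.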
