diff replace_text_in_column.xml @ 6:8928e6d1e7ba draft
Uploaded
author | bgruening |
---|---|
date | Thu, 08 Jan 2015 09:07:31 -0500 |
parents | 56e80527c482 |
children | d64eace4f9f3 |
--- a/replace_text_in_column.xml Wed Jan 07 11:15:41 2015 -0500
+++ b/replace_text_in_column.xml Thu Jan 08 09:07:31 2015 -0500
@@ -7,16 +7,14 @@
  <requirement type="package" version="4.1.0">gnu_awk</requirement>
  </expand>
  <version_command>awk --version | head -n 1</version_command>
- <command interpreter="sh">
+ <command>
  <![CDATA[
- ##adapt to awk's quirks - to pass an acutal backslash - two backslashes are required (just like in a C string)
- REPLACE_PATTERN=\${$replace_pattern//\\/\\\\};
  awk
- -v OFS="\t"
+ -v OFS=" "
  --re-interval
- --sandbox "{ \$$column = gensub( /$find_pattern/, \"$replace_pattern\", \"g\", \$$column ) ; print \$0 ; }"
+ --sandbox '{ \$$column = gensub( /$find_pattern/, "$replace_pattern", "g", \$$column ) ; print \$0 ; }'
  "$infile"
- > "$output"
+ > "$outfile"
  ]]>
  </command>
  <inputs>
@@ -39,22 +37,22 @@
  </param>
  </inputs>
  <outputs>
- <data format="input" name="output" metadata_source="infile" />
+ <data name="outfile" format_source="infile" metadata_source="infile" />
  </outputs>
  <tests>
  <test>
- <param name="infile" value="replace_text_in_column_in1.txt" ftype="tabular" />
+ <param name="infile" value="replace_text_in_column1.txt" ftype="tabular" />
  <param name="column" value="4" />
  <param name="find_pattern" value=".+_(R.)" />
- <param name="replace_pattern" value="\1" />
- <output name="output" file="replace_text_in_column_output1.txt" />
+ <param name="replace_pattern" value="\\1" />
+ <output name="outfile" file="replace_text_in_column_results1.txt" />
  </test>
  </tests>
  <help>
  <![CDATA[
  **What it does**

-This tool performs find & replace operation on a specified column in a given file.
+This tool performs find & replace operation on a specified column in a given file. 

  .. class:: infomark
@@ -79,7 +77,7 @@
  **Examples of Replace Patterns**

  - **WORLD** The word 'WORLD' will be placed whereever the find pattern was found.
- - **FOO-&-BAR** Each time the find pattern is found, it will be surrounded with 'FOO-' at the begining and '-BAR' at the end. **&** (ampersand) represents the matched find pattern.
+ - **FOO-&-BAR** Each time the find pattern is found, it will be surrounded with 'FOO-' at the begining and '-BAR' at the end. **&** (ampersand) represents the matched find pattern.
  - **\\1** The text which matched the first parenthesis in the Find Pattern.
@@ -97,7 +95,7 @@
  **Example 2**

  **Find Pattern:** ^(.{4})
-**Replace Pattern:** &\\t
+**Replace Pattern:** &\\t 

  Find the first four characters in each line, and replace them with the same text, followed by a tab character. In practice - this will split the first line into two columns. This operation affects only the selected column.
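
The column-scoped replacement described in the help text can be illustrated with a short Python sketch. This is only an analogue of the `gensub( ..., "g", $column )` call in the diff, not the tool's implementation: the function name `replace_in_column` and the sample rows are hypothetical, fields are assumed to be tab-separated, and Python's `re` syntax differs from gawk's (for example, the whole match is written `\g<0>` where gawk uses `&`).

```python
import re


def replace_in_column(line, column, find_pattern, replacement, sep="\t"):
    """Apply a global regex replacement to one 1-based column of a delimited line.

    Hypothetical helper for illustration; it mimics the effect of
    gensub(/find/, "replace", "g", $column) on a single field.
    """
    fields = line.split(sep)
    # Replace every match, but only inside the selected field.
    fields[column - 1] = re.sub(find_pattern, replacement, fields[column - 1])
    return sep.join(fields)


# Same parameters as the test case in the diff: column 4, find ".+_(R.)",
# replace with the first capture group. The input row is made up.
row = "chr1\t100\t200\tsample_A_R1"
print(replace_in_column(row, 4, r".+_(R.)", r"\1"))

# Same parameters as Example 2: keep the first four characters and append a tab.
# gawk's "&" (whole match) corresponds to \g<0> in Python.
print(replace_in_column("abcdefgh\tx", 1, r"^(.{4})", "\\g<0>\t"))
```

Only the selected field changes; the other fields are passed through untouched, which is the behaviour the help text describes as "affects only the selected column".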