comparison replace_text_in_column.xml @ 7:d64eace4f9f3 draft

Uploaded
author bgruening
date Sat, 17 Jan 2015 08:30:15 -0500
parents 8928e6d1e7ba
children c78b1767db2b
comparison
6:8928e6d1e7ba 7:d64eace4f9f3
19 </command> 19 </command>
20 <inputs> 20 <inputs>
21 <param format="tabular" name="infile" type="data" label="File to process" /> 21 <param format="tabular" name="infile" type="data" label="File to process" />
22 <param name="column" label="in column" type="data_column" data_ref="infile" accept_default="true" /> 22 <param name="column" label="in column" type="data_column" data_ref="infile" accept_default="true" />
23 23
24 <param name="find_pattern" type="text" size="20" label="Find pattern" help="Use simple text, or a valid regular expression (without the enclosing slashes // ) " > 24 <param name="find_pattern" type="text" size="20" label="Find pattern" help="Use simple text, or a valid regular expression (without the enclosing slashes // ) " >
25 <sanitizer> 25 <sanitizer>
26 <valid initial="string.printable"> 26 <valid initial="string.printable">
27 <remove value="&apos;"/> 27 <remove value="&apos;"/>
28 </valid> 28 </valid>
29 </sanitizer> 29 </sanitizer>
92 92
93 ----- 93 -----
94 94
95 **Example 2** 95 **Example 2**
96 96
97 **Find Pattern:** ^(.{4}) 97 **Find Pattern:** ^(.{4})
98 **Replace Pattern:** &\\t 98 **Replace Pattern:** &\\t
99 99
100 Find the first four characters in each line, and replace them with the same text, followed by a tab character. In practice - this will split the first line into two columns. This operation affects only the selected column. 100 Find the first four characters in each line, and replace them with the same text, followed by a tab character. In practice - this will split the first line into two columns. This operation affects only the selected column.
101 101
102 102
103 ----- 103 -----
104 104
105 **Extended Regular Expression Syntax** 105 **Extended Regular Expression Syntax**
106 106
107 The select tool searches the data for lines containing or not containing a match to the given pattern. A Regular Expression is a pattern describing a certain amount of text. 107 The select tool searches the data for lines containing or not containing a match to the given pattern. A Regular Expression is a pattern describing a certain amount of text.
108 108
109 - **( ) { } [ ] . * ? + \ ^ $** are all special characters. **\\** can be used to "escape" a special character, allowing that special character to be searched for. 109 - **( ) { } [ ] . * ? + \ ^ $** are all special characters. **\\** can be used to "escape" a special character, allowing that special character to be searched for.
110 - **^** matches the beginning of a string (but not an internal line). 110 - **^** matches the beginning of a string (but not an internal line).
111 - **(** .. **)** groups a particular pattern. 111 - **(** .. **)** groups a particular pattern.
112 - **{** n or n, or n,m **}** specifies an expected number of repetitions of the preceding pattern. 112 - **{** n or n, or n,m **}** specifies an expected number of repetitions of the preceding pattern.
113 113
114 - **{n}** The preceding item is matched exactly n times. 114 - **{n}** The preceding item is matched exactly n times.
115 - **{n,}** The preceding item is matched n or more times. 115 - **{n,}** The preceding item is matched n or more times.
116 - **{n,m}** The preceding item is matched at least n times but not more than m times. 116 - **{n,m}** The preceding item is matched at least n times but not more than m times.
117 117
118 - **[** ... **]** creates a character class. Within the brackets, single characters can be placed. A dash (-) may be used to indicate a range such as **a-z**. 118 - **[** ... **]** creates a character class. Within the brackets, single characters can be placed. A dash (-) may be used to indicate a range such as **a-z**.
119 - **.** Matches any single character except a newline. 119 - **.** Matches any single character except a newline.
120 - ***** The preceding item will be matched zero or more times. 120 - ***** The preceding item will be matched zero or more times.
121 - **?** The preceding item is optional and matched at most once. 121 - **?** The preceding item is optional and matched at most once.
122 - **+** The preceding item will be matched one or more times. 122 - **+** The preceding item will be matched one or more times.
123 - **^** has two meanings: 123 - **^** has two meanings:
124 - matches the beginning of a line or string. 124 - matches the beginning of a line or string.
125 - indicates negation in a character class. For example, [^...] matches every character except the ones inside brackets. 125 - indicates negation in a character class. For example, [^...] matches every character except the ones inside brackets.
126 - **$** matches the end of a line or string. 126 - **$** matches the end of a line or string.
127 - **\|** Separates alternate possibilities. 127 - **\|** Separates alternate possibilities.
128 128
129 129
130 **Note**: AWK uses extended regular expression syntax, not Perl syntax. **\\d**, **\\w**, **\\s** etc. are **not** supported. 130 **Note**: AWK uses extended regular expression syntax, not Perl syntax. **\\d**, **\\w**, **\\s** etc. are **not** supported.
131 131
132 @REFERENCES@ 132 @REFERENCES@
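
Reviewer illustration (not part of either revision): the help text above describes a column-restricted find and replace, and Example 2 can be sketched in Python to check the expected behaviour. This is only an approximation under stated assumptions. The tool itself runs AWK, the function name `replace_in_column` is hypothetical, and Python's `re` module is used in place of AWK's extended regular expressions. The patterns shown stick to syntax listed in the help text, such as `[0-9]` rather than `\d`.

```python
import re


def replace_in_column(line, column, find_pattern, replacement):
    """Apply a regex substitution to one tab-separated field (1-based column).

    Hypothetical helper for illustration; it is not the tool's own code.
    """
    fields = line.rstrip("\n").split("\t")
    # Only the selected field is rewritten; the others pass through unchanged.
    fields[column - 1] = re.sub(find_pattern, replacement, fields[column - 1])
    return "\t".join(fields)


# Example 2: keep the first four characters of column 1 and append a tab.
# "\g<0>" is Python's spelling of "the whole match", which the tool writes as "&".
row = "chr1_gene\t100\t200"
result = replace_in_column(row, 1, r"^(.{4})", "\\g<0>\t")
print(result.split("\t"))  # ['chr1', '_gene', '100', '200']

# A bracket expression such as [0-9] is written the same way in both syntaxes.
masked = replace_in_column("id42\tx", 1, r"[0-9]+", "N")
print(masked.split("\t"))  # ['idN', 'x']
```

Because the substitution inserts a real tab into the selected field, the row gains one column, which matches the "split into two columns" effect described in Example 2.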