sub, gsub, and ssub verbs (#1361)
* sub, gsub, and ssub verbs

* doc mods

* content for verbs reference page

* test/cases/verb-sub-gsub-ssub/
johnkerl committed Aug 19, 2023
1 parent d4a3bf9 commit 793f52c
Showing 24 changed files with 888 additions and 37 deletions.
39 changes: 33 additions & 6 deletions docs/src/manpage.md
@@ -194,12 +194,13 @@ MILLER(1) MILLER(1)
1mVERB LIST0m
altkv bar bootstrap case cat check clean-whitespace count-distinct count
count-similar cut decimate fill-down fill-empty filter flatten format-values
-fraction gap grep group-by group-like having-fields head histogram json-parse
-json-stringify join label latin1-to-utf8 least-frequent merge-fields
-most-frequent nest nothing put regularize remove-empty-columns rename reorder
-repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
-sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unspace unsparsify
+fraction gap grep group-by group-like gsub having-fields head histogram
+json-parse json-stringify join label latin1-to-utf8 least-frequent
+merge-fields most-frequent nest nothing put regularize remove-empty-columns
+rename reorder repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle
+skip-trivial-records sort sort-within-records split ssub stats1 stats2 step
+sub summary tac tail tee template top utf8-to-latin1 unflatten uniq unspace
+unsparsify

1mFUNCTION LIST0m
abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -1245,6 +1246,15 @@ MILLER(1) MILLER(1)
Options:
-h|--help Show this message.

1mgsub0m
Usage: mlr gsub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and handling multiple matches, like the `gsub` DSL function.
See also the `sub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

1mhaving-fields0m
Usage: mlr having-fields [options]
Conditionally passes through records depending on each record's field names.
@@ -1853,6 +1863,14 @@ MILLER(1) MILLER(1)

See also the "tee" DSL function which lets you do more ad-hoc customization.

1mssub0m
Usage: mlr ssub [options]
Replaces old string with new string in specified field(s), without regex support for
the old string, like the `ssub` DSL function. See also the `gsub` and `sub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

1mstats10m
Usage: mlr stats1 [options]
Computes univariate statistics for one or more given fields, accumulated across
@@ -1990,6 +2008,15 @@ MILLER(1) MILLER(1)
https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
for more information on EWMA.

1msub0m
Usage: mlr sub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and not handling multiple matches, like the `sub` DSL function.
See also the `gsub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

1msummary0m
Usage: mlr summary [options]
Show summary statistics about the input data.
39 changes: 33 additions & 6 deletions docs/src/manpage.txt
@@ -173,12 +173,13 @@ MILLER(1) MILLER(1)
1mVERB LIST0m
altkv bar bootstrap case cat check clean-whitespace count-distinct count
count-similar cut decimate fill-down fill-empty filter flatten format-values
-fraction gap grep group-by group-like having-fields head histogram json-parse
-json-stringify join label latin1-to-utf8 least-frequent merge-fields
-most-frequent nest nothing put regularize remove-empty-columns rename reorder
-repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
-sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unspace unsparsify
+fraction gap grep group-by group-like gsub having-fields head histogram
+json-parse json-stringify join label latin1-to-utf8 least-frequent
+merge-fields most-frequent nest nothing put regularize remove-empty-columns
+rename reorder repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle
+skip-trivial-records sort sort-within-records split ssub stats1 stats2 step
+sub summary tac tail tee template top utf8-to-latin1 unflatten uniq unspace
+unsparsify

1mFUNCTION LIST0m
abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -1224,6 +1225,15 @@ MILLER(1) MILLER(1)
Options:
-h|--help Show this message.

1mgsub0m
Usage: mlr gsub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and handling multiple matches, like the `gsub` DSL function.
See also the `sub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

1mhaving-fields0m
Usage: mlr having-fields [options]
Conditionally passes through records depending on each record's field names.
@@ -1832,6 +1842,14 @@ MILLER(1) MILLER(1)

See also the "tee" DSL function which lets you do more ad-hoc customization.

1mssub0m
Usage: mlr ssub [options]
Replaces old string with new string in specified field(s), without regex support for
the old string, like the `ssub` DSL function. See also the `gsub` and `sub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

1mstats10m
Usage: mlr stats1 [options]
Computes univariate statistics for one or more given fields, accumulated across
@@ -1969,6 +1987,15 @@ MILLER(1) MILLER(1)
https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average
for more information on EWMA.

1msub0m
Usage: mlr sub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and not handling multiple matches, like the `sub` DSL function.
See also the `gsub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.

1msummary0m
Usage: mlr summary [options]
Show summary statistics about the input data.
146 changes: 146 additions & 0 deletions docs/src/reference-verbs.md
@@ -1447,6 +1447,55 @@ record_count resource
150 /path/to/second/file
</pre>

## gsub

<pre class="pre-highlight-in-pair">
<b>mlr gsub -h</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Usage: mlr gsub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and handling multiple matches, like the `gsub` DSL function.
See also the `sub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXlow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXlow circXe true 8 73 63.9785 4.2370
example.csv yeXlow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXXow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXXow circXe true 8 73 63.9785 4.2370
example.csv yeXXow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

## having-fields

<pre class="pre-highlight-in-pair">
@@ -3120,6 +3169,54 @@ then there will be split_yellow_triangle.csv, split_yellow_square.csv, etc.
See also the "tee" DSL function which lets you do more ad-hoc customization.
</pre>

## ssub

<pre class="pre-highlight-in-pair">
<b>mlr ssub -h</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Usage: mlr ssub [options]
Replaces old string with new string in specified field(s), without regex support for
the old string, like the `ssub` DSL function. See also the `gsub` and `sub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then sub -f filename . o</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
oxample.csv yellow triangle true 1 11 43.6498 9.8870
oxample.csv red square true 2 15 79.2778 0.0130
oxample.csv red circle true 3 16 13.8103 2.9010
oxample.csv red square false 4 48 77.5542 7.4670
oxample.csv purple triangle false 5 51 81.2290 8.5910
oxample.csv red square false 6 64 77.1991 9.5310
oxample.csv purple triangle false 7 65 80.1405 5.8240
oxample.csv yellow circle true 8 73 63.9785 4.2370
oxample.csv yellow circle true 9 87 63.5058 8.3350
oxample.csv purple square false 10 91 72.3735 8.2430
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then ssub -f filename . o</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
exampleocsv yellow triangle true 1 11 43.6498 9.8870
exampleocsv red square true 2 15 79.2778 0.0130
exampleocsv red circle true 3 16 13.8103 2.9010
exampleocsv red square false 4 48 77.5542 7.4670
exampleocsv purple triangle false 5 51 81.2290 8.5910
exampleocsv red square false 6 64 77.1991 9.5310
exampleocsv purple triangle false 7 65 80.1405 5.8240
exampleocsv yellow circle true 8 73 63.9785 4.2370
exampleocsv yellow circle true 9 87 63.5058 8.3350
exampleocsv purple square false 10 91 72.3735 8.2430
</pre>
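
In the first output above, `.` is read as a regex that matches any character, so the leading `e` is replaced. In the second, `.` is read as a literal dot. A minimal Go sketch of that difference follows. It is an illustration only, not Miller's code: the helper names are hypothetical, and it assumes the literal form replaces just the first occurrence, as the output above suggests.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// regexSubFirst treats old as a regular expression and replaces its first match.
// Hypothetical helper; the replacement is inserted literally.
func regexSubFirst(s, old, repl string) string {
	re := regexp.MustCompile(old)
	loc := re.FindStringIndex(s)
	if loc == nil {
		return s
	}
	return s[:loc[0]] + repl + s[loc[1]:]
}

// literalSubFirst treats old as a plain string with no special characters
// and replaces its first occurrence. Hypothetical helper.
func literalSubFirst(s, old, repl string) string {
	return strings.Replace(s, old, repl, 1)
}

func main() {
	fmt.Println(regexSubFirst("example.csv", ".", "o"))   // oxample.csv
	fmt.Println(literalSubFirst("example.csv", ".", "o")) // exampleocsv
}
```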

## stats1

<pre class="pre-highlight-in-pair">
@@ -3574,6 +3671,55 @@ $ each 10 uptime | mlr -p step -a delta -f 11

</pre>

## sub

<pre class="pre-highlight-in-pair">
<b>mlr sub -h</b>
</pre>
<pre class="pre-non-highlight-in-pair">
Usage: mlr sub [options]
Replaces old string with new string in specified field(s), with regex support
for the old string and not handling multiple matches, like the `sub` DSL function.
See also the `gsub` and `ssub` verbs.
Options:
-f {a,b,c} Field names to convert.
-h|--help Show this message.
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXlow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXlow circXe true 8 73 63.9785 4.2370
example.csv yeXlow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X</b>
</pre>
<pre class="pre-non-highlight-in-pair">
filename color shape flag k index quantity rate
example.csv yeXXow triangXe true 1 11 43.6498 9.8870
example.csv red square true 2 15 79.2778 0.0130
example.csv red circXe true 3 16 13.8103 2.9010
example.csv red square false 4 48 77.5542 7.4670
example.csv purpXe triangXe false 5 51 81.2290 8.5910
example.csv red square false 6 64 77.1991 9.5310
example.csv purpXe triangXe false 7 65 80.1405 5.8240
example.csv yeXXow circXe true 8 73 63.9785 4.2370
example.csv yeXXow circXe true 9 87 63.5058 8.3350
example.csv purpXe square false 10 91 72.3735 8.2430
</pre>

## summary

<pre class="pre-highlight-in-pair">
42 changes: 42 additions & 0 deletions docs/src/reference-verbs.md.in
@@ -487,6 +487,20 @@ GENMD-RUN-COMMAND
mlr --opprint group-like data/het.dkvp
GENMD-EOF

## gsub

GENMD-RUN-COMMAND
mlr gsub -h
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X
GENMD-EOF

## having-fields

GENMD-RUN-COMMAND
@@ -987,6 +1001,20 @@ GENMD-RUN-COMMAND
mlr split --help
GENMD-EOF

## ssub

GENMD-RUN-COMMAND
mlr ssub -h
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then sub -f filename . o
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then ssub -f filename . o
GENMD-EOF

## stats1

GENMD-RUN-COMMAND
@@ -1095,6 +1123,20 @@ Example deriving uptime-delta from system uptime:

GENMD-INCLUDE-ESCAPED(data/ping-delta-example.txt)

## sub

GENMD-RUN-COMMAND
mlr sub -h
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then sub -f color,shape l X
GENMD-EOF

GENMD-RUN-COMMAND
mlr --icsv --opprint --from example.csv cat --filename then gsub -f color,shape l X
GENMD-EOF

## summary

GENMD-RUN-COMMAND
3 changes: 3 additions & 0 deletions internal/pkg/transformers/aaa_transformer_table.go
@@ -33,6 +33,7 @@ var TRANSFORMER_LOOKUP_TABLE = []TransformerSetup{
GrepSetup,
GroupBySetup,
GroupLikeSetup,
GsubSetup,
HavingFieldsSetup,
HeadSetup,
HistogramSetup,
@@ -62,9 +63,11 @@ var TRANSFORMER_LOOKUP_TABLE = []TransformerSetup{
SortSetup,
SortWithinRecordsSetup,
SplitSetup,
SsubSetup,
Stats1Setup,
Stats2Setup,
StepSetup,
SubSetup,
SummarySetup,
TacSetup,
TailSetup,
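
The hunks above register the three new verbs by adding their setup values to a lookup table, keeping the alphabetical order of the existing entries. The sketch below shows how such a table might be searched by verb name. It is a simplified stand-in under stated assumptions: this diff does not show the real `TransformerSetup` type, so the `Verb` field, the lowercase `lookupTable`, and the `lookUp` function are hypothetical.

```go
package main

import "fmt"

// TransformerSetup is a simplified, hypothetical stand-in for the real type,
// which is not shown in this diff. Only a verb name is modeled here.
type TransformerSetup struct {
	Verb string
}

// Setup values for the three verbs added by this commit (names taken from
// the diff; their contents are assumptions).
var (
	GsubSetup = TransformerSetup{Verb: "gsub"}
	SsubSetup = TransformerSetup{Verb: "ssub"}
	SubSetup  = TransformerSetup{Verb: "sub"}
)

// lookupTable is an abbreviated, hypothetical analogue of TRANSFORMER_LOOKUP_TABLE.
var lookupTable = []TransformerSetup{
	GsubSetup,
	SsubSetup,
	SubSetup,
}

// lookUp returns the setup registered for verb, or nil if there is none.
func lookUp(verb string) *TransformerSetup {
	for i := range lookupTable {
		if lookupTable[i].Verb == verb {
			return &lookupTable[i]
		}
	}
	return nil
}

func main() {
	for _, verb := range []string{"sub", "gsub", "ssub", "nosuch"} {
		fmt.Printf("%s registered=%v\n", verb, lookUp(verb) != nil)
	}
}
```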