6.6.0
johnkerl committed Jan 1, 2023
1 parent 31fdc1c commit 9951f36
Showing 12 changed files with 104 additions and 66 deletions.
26 changes: 13 additions & 13 deletions docs/src/data-diving-examples.md
@@ -271,19 +271,19 @@ The histogram shows the different distribution of 0/1 flags:
<b>mlr --opprint histogram -f flag,u,v --lo -0.1 --hi 1.1 --nbins 12 data/colored-shapes.dkvp</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-bin_lo bin_hi flag_count u_count v_count
--0.010000000000000002 0.09000000000000002 6058 0 36
-0.09000000000000002 0.19000000000000003 0 1062 988
-0.19000000000000003 0.29000000000000004 0 985 1003
-0.29000000000000004 0.39000000000000007 0 1024 1014
-0.39000000000000007 0.4900000000000001 0 1002 991
-0.4900000000000001 0.5900000000000002 0 989 1041
-0.5900000000000002 0.6900000000000002 0 1001 1016
-0.6900000000000002 0.7900000000000001 0 972 962
-0.7900000000000001 0.8900000000000002 0 1035 1070
-0.8900000000000002 0.9900000000000002 0 995 993
-0.9900000000000002 1.0900000000000003 4020 1013 939
-1.0900000000000003 1.1900000000000002 0 0 25
+bin_lo bin_hi flag_count u_count v_count
+-0.1 0.000000000000000013877787807814457 6058 0 36
+0.000000000000000013877787807814457 0.10000000000000003 0 1062 988
+0.10000000000000003 0.20000000000000004 0 985 1003
+0.20000000000000004 0.30000000000000004 0 1024 1014
+0.30000000000000004 0.40000000000000013 0 1002 991
+0.40000000000000013 0.5000000000000001 0 989 1041
+0.5000000000000001 0.6000000000000002 0 1001 1016
+0.6000000000000002 0.7000000000000002 0 972 962
+0.7000000000000002 0.8000000000000002 0 1035 1070
+0.8000000000000002 0.9000000000000002 0 995 993
+0.9000000000000002 1 4020 1013 939
+1 1.1 0 0 25
</pre>

Look at univariate stats by color and shape. In particular, color-dependent flag probabilities pop out, aligning with their original Bernoulli probabilities from the data-generator script:
17 changes: 13 additions & 4 deletions docs/src/manpage.md
@@ -50,7 +50,7 @@ MILLER(1) MILLER(1)
insertion-ordered hash map. This encompasses a variety of data
formats, including but not limited to the familiar CSV, TSV, and JSON.
(Miller can handle positionally-indexed data as a special case.) This
-manpage documents mlr 6.5.0-dev.
+manpage documents mlr 6.6.0.

1mEXAMPLES0m
mlr --icsv --opprint cat example.csv
@@ -197,7 +197,7 @@ MILLER(1) MILLER(1)
most-frequent nest nothing put regularize remove-empty-columns rename reorder
repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unsparsify
+template top utf8-to-latin1 unflatten uniq unspace unsparsify

1mFUNCTION LIST0m
abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2080,6 +2080,15 @@ MILLER(1) MILLER(1)
With -n, produces only one record which is the unique-record count.
With neither -c nor -n, produces unique records.

+1munspace0m
+Usage: mlr unspace [options]
+Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+Options:
+-f {x} Replace spaces with specified filler character.
+-k Unspace only keys, not keys and values.
+-v Unspace only values, not keys and values.
+-h|--help Show this message.
+
1munsparsify0m
Usage: mlr unsparsify [options]
Prints records with the union of field names over all input records.
@@ -3135,7 +3144,7 @@ MILLER(1) MILLER(1)
int: declares an integer local variable in the current curly-braced scope.
Type-checking happens at assignment: 'int x = 0.0' is an error.

-map
+1mmap0m
map: declares a map-valued local variable in the current curly-braced scope.
Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
always OK. map b = a is OK or not depending on whether a is a map.
@@ -3288,5 +3297,5 @@ MILLER(1) MILLER(1)



-2022-12-05 MILLER(1)
+2023-01-01 MILLER(1)
</pre>
17 changes: 13 additions & 4 deletions docs/src/manpage.txt
@@ -29,7 +29,7 @@ MILLER(1) MILLER(1)
insertion-ordered hash map. This encompasses a variety of data
formats, including but not limited to the familiar CSV, TSV, and JSON.
(Miller can handle positionally-indexed data as a special case.) This
-manpage documents mlr 6.5.0-dev.
+manpage documents mlr 6.6.0.

1mEXAMPLES0m
mlr --icsv --opprint cat example.csv
@@ -176,7 +176,7 @@ MILLER(1) MILLER(1)
most-frequent nest nothing put regularize remove-empty-columns rename reorder
repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unsparsify
+template top utf8-to-latin1 unflatten uniq unspace unsparsify

1mFUNCTION LIST0m
abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2059,6 +2059,15 @@ MILLER(1) MILLER(1)
With -n, produces only one record which is the unique-record count.
With neither -c nor -n, produces unique records.

+1munspace0m
+Usage: mlr unspace [options]
+Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+Options:
+-f {x} Replace spaces with specified filler character.
+-k Unspace only keys, not keys and values.
+-v Unspace only values, not keys and values.
+-h|--help Show this message.
+
1munsparsify0m
Usage: mlr unsparsify [options]
Prints records with the union of field names over all input records.
@@ -3114,7 +3123,7 @@ MILLER(1) MILLER(1)
int: declares an integer local variable in the current curly-braced scope.
Type-checking happens at assignment: 'int x = 0.0' is an error.

-map
+1mmap0m
map: declares a map-valued local variable in the current curly-braced scope.
Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
always OK. map b = a is OK or not depending on whether a is a map.
@@ -3267,4 +3276,4 @@ MILLER(1) MILLER(1)



-2022-12-05 MILLER(1)
+2023-01-01 MILLER(1)
28 changes: 12 additions & 16 deletions docs/src/operating-on-all-fields.md
@@ -24,10 +24,9 @@ Suppose you want to replace spaces with underscores in your column names:
<b>cat data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-a b c,def,g h i
-123,4567,890
-2468,1357,3579
-9987,3312,4543
+column 1,column 2,column 3
+apple,ball,cat
+dale egg,fish,gale
</pre>

The simplest way is to use `mlr rename` with `-g` (for global replace, not just first occurrence of space within each field) and `-r` for pattern-matching (rather than explicit single-column renames):
@@ -36,20 +35,18 @@ The simplest way is to use `mlr rename` with `-g` (for global replace, not just
<b>mlr --csv rename -g -r ' ,_' data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-a_b_c,def,g_h_i
-123,4567,890
-2468,1357,3579
-9987,3312,4543
+column_1,column_2,column_3
+apple,ball,cat
+dale egg,fish,gale
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --csv --opprint rename -g -r ' ,_' data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-a_b_c def g_h_i
-123 4567 890
-2468 1357 3579
-9987 3312 4543
+column_1 column_2 column_3
+apple ball cat
+dale egg fish gale
</pre>

You can also do this with a for-loop:
@@ -69,10 +66,9 @@ $* = newrec
<b>mlr --icsv --opprint put -f data/bulk-rename-for-loop.mlr data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-a_b_c def g_h_i
-123 4567 890
-2468 1357 3579
-9987 3312 4543
+column_1 column_2 column_3
+apple ball cat
+dale egg fish gale
</pre>

## Bulk rename of fields with carriage returns
32 changes: 16 additions & 16 deletions docs/src/reference-verbs.md
@@ -4099,7 +4099,7 @@ The primary use-case is for PPRINT output, which is space-delimited. For example
<b>cat data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-column 1, column 2, column 3
+column 1,column 2,column 3
apple,ball,cat
dale egg,fish,gale
</pre>
@@ -4108,40 +4108,40 @@ dale egg,fish,gale
<b>mlr --icsv --opprint cat data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-column 1 column 2 column 3
-apple ball cat
-dale egg fish gale
+column 1 column 2 column 3
+apple ball cat
+dale egg fish gale
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint cat data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-column 1 column 2 column 3
-apple ball cat
-dale egg fish gale
+column 1 column 2 column 3
+apple ball cat
+dale egg fish gale
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint unspace data/spaces.csv</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-column_1 _column_2 _column_3
-apple ball cat
-dale_egg fish gale
+column_1 column_2 column_3
+apple ball cat
+dale_egg fish gale
</pre>

<pre class="pre-highlight-in-pair">
<b>mlr --icsv --opprint unspace data/spaces.csv | mlr --ipprint --oxtab cat</b>
</pre>
<pre class="pre-non-highlight-in-pair">
-column_1 apple
-_column_2 ball
-_column_3 cat
+column_1 apple
+column_2 ball
+column_3 cat

-column_1 dale_egg
-_column_2 fish
-_column_3 gale
+column_1 dale_egg
+column_2 fish
+column_3 gale
</pre>
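
The `unspace` behavior documented in this hunk (replace spaces in keys, values, or both with a filler that defaults to `_`) can be sketched in Go as below. This is a minimal illustration only, not Miller's implementation: the `Field` type, the `unspace` function, and its parameters are hypothetical names chosen for the example, and the `keys`/`values` booleans are an assumed stand-in for the `-k` and `-v` options.

```go
package main

import (
	"fmt"
	"strings"
)

// Field is one key-value pair. A slice of Fields preserves insertion order,
// standing in for Miller's insertion-ordered records (hypothetical type).
type Field struct {
	Key   string
	Value string
}

// unspace returns a copy of rec with spaces replaced by filler in keys,
// values, or both. Hypothetical helper for illustration; it is not the
// code that the mlr unspace verb actually runs.
func unspace(rec []Field, filler string, keys, values bool) []Field {
	out := make([]Field, 0, len(rec))
	for _, f := range rec {
		if keys {
			f.Key = strings.ReplaceAll(f.Key, " ", filler)
		}
		if values {
			f.Value = strings.ReplaceAll(f.Value, " ", filler)
		}
		out = append(out, f)
	}
	return out
}

func main() {
	// Second data row of data/spaces.csv as shown above.
	rec := []Field{{"column 1", "dale egg"}, {"column 2", "fish"}, {"column 3", "gale"}}
	for _, f := range unspace(rec, "_", true, true) {
		fmt.Printf("%s=%s\n", f.Key, f.Value)
	}
}
```

Passing `keys=true, values=false` corresponds to the keys-only mode, and a different `filler` string corresponds to the `-f {x}` option described in the usage text.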

## unsparsify
1 change: 0 additions & 1 deletion docs/src/spaces.csv
@@ -3,4 +3,3 @@ Zone,Total MWh
17,39.8
24,7.4
30,50.5

1 change: 0 additions & 1 deletion internal/pkg/go-csv/csv_reader.go
@@ -473,4 +473,3 @@ parseField:
}
return dst, err
}

1 change: 0 additions & 1 deletion internal/pkg/go-csv/csv_writer.go
@@ -179,4 +179,3 @@ func (w *Writer) fieldNeedsQuotes(field string) bool {
r1, _ := utf8.DecodeRuneInString(field)
return unicode.IsSpace(r1)
}

2 changes: 1 addition & 1 deletion internal/pkg/version/version.go
@@ -4,4 +4,4 @@ package version
// Nominally things like "6.0.0" for a release, then "6.0.0-dev" in between.
// This makes it clear that a given build is on the main dev branch, not a
// particular snapshot tag.
-var STRING string = "6.5.0-dev"
+var STRING string = "6.6.0"
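
The naming convention in the comment above (a plain version for a release, a `-dev` suffix in between) lends itself to a simple suffix check. The sketch below is illustrative only; `isDevBuild` is a hypothetical helper and is not part of the `version` package shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// isDevBuild reports whether a version string carries the "-dev" suffix
// used between releases. Hypothetical helper, not present in Miller.
func isDevBuild(version string) bool {
	return strings.HasSuffix(version, "-dev")
}

func main() {
	// The old and new values of STRING from this commit.
	for _, v := range []string{"6.5.0-dev", "6.6.0"} {
		fmt.Printf("%s dev=%v\n", v, isDevBuild(v))
	}
}
```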
17 changes: 13 additions & 4 deletions man/manpage.txt
@@ -29,7 +29,7 @@ MILLER(1) MILLER(1)
insertion-ordered hash map. This encompasses a variety of data
formats, including but not limited to the familiar CSV, TSV, and JSON.
(Miller can handle positionally-indexed data as a special case.) This
-manpage documents mlr 6.5.0-dev.
+manpage documents mlr 6.6.0.

1mEXAMPLES0m
mlr --icsv --opprint cat example.csv
@@ -176,7 +176,7 @@ MILLER(1) MILLER(1)
most-frequent nest nothing put regularize remove-empty-columns rename reorder
repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unsparsify
+template top utf8-to-latin1 unflatten uniq unspace unsparsify

1mFUNCTION LIST0m
abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2059,6 +2059,15 @@ MILLER(1) MILLER(1)
With -n, produces only one record which is the unique-record count.
With neither -c nor -n, produces unique records.

+1munspace0m
+Usage: mlr unspace [options]
+Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+Options:
+-f {x} Replace spaces with specified filler character.
+-k Unspace only keys, not keys and values.
+-v Unspace only values, not keys and values.
+-h|--help Show this message.
+
1munsparsify0m
Usage: mlr unsparsify [options]
Prints records with the union of field names over all input records.
@@ -3114,7 +3123,7 @@ MILLER(1) MILLER(1)
int: declares an integer local variable in the current curly-braced scope.
Type-checking happens at assignment: 'int x = 0.0' is an error.

-map
+1mmap0m
map: declares a map-valued local variable in the current curly-braced scope.
Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
always OK. map b = a is OK or not depending on whether a is a map.
@@ -3267,4 +3276,4 @@ MILLER(1) MILLER(1)



-2022-12-05 MILLER(1)
+2023-01-01 MILLER(1)
23 changes: 19 additions & 4 deletions man/mlr.1
@@ -2,12 +2,12 @@
.\" Title: mlr
.\" Author: [see the "AUTHOR" section]
.\" Generator: ./mkman.rb
-.\" Date: 2022-12-05
+.\" Date: 2023-01-01
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
-.TH "MILLER" "1" "2022-12-05" "\ \&" "\ \&"
+.TH "MILLER" "1" "2023-01-01" "\ \&" "\ \&"
.\" -----------------------------------------------------------------
.\" * Portability definitions
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -47,7 +47,7 @@ on integer-indexed fields: if the natural data structure for the latter is the
array, then Miller's natural data structure is the insertion-ordered hash map.
This encompasses a variety of data formats, including but not limited to the
familiar CSV, TSV, and JSON. (Miller can handle positionally-indexed data as
-a special case.) This manpage documents mlr 6.5.0-dev.
+a special case.) This manpage documents mlr 6.6.0.
.SH "EXAMPLES"
.sp

@@ -217,7 +217,7 @@ json-stringify join label latin1-to-utf8 least-frequent merge-fields
most-frequent nest nothing put regularize remove-empty-columns rename reorder
repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unsparsify
+template top utf8-to-latin1 unflatten uniq unspace unsparsify
.fi
.if n \{\
.RE
@@ -2604,6 +2604,21 @@ Options:
.fi
.if n \{\
.RE
+.SS "unspace"
+.if n \{\
+.RS 0
+.\}
+.nf
+Usage: mlr unspace [options]
+Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+Options:
+-f {x} Replace spaces with specified filler character.
+-k Unspace only keys, not keys and values.
+-v Unspace only values, not keys and values.
+-h|--help Show this message.
+.fi
+.if n \{\
+.RE
.SS "unsparsify"
.if n \{\
.RS 0
5 changes: 4 additions & 1 deletion miller.spec
@@ -1,6 +1,6 @@
Summary: Name-indexed data processing tool
Name: miller
-Version: 6.5.0
+Version: 6.6.0
Release: 1%{?dist}
License: BSD
Source: https://github.com/johnkerl/miller/releases/download/%{version}/miller-%{version}.tar.gz
@@ -36,6 +36,9 @@ make install
%{_mandir}/man1/mlr.1*

%changelog
+* Sun Jan 1 2023 John Kerl <[email protected]> - 6.6.0-1
+- 6.6.0 release
+
* Sun Nov 27 2022 John Kerl <[email protected]> - 6.5.0-1
- 6.5.0 release
