diff --git a/docs/src/data-diving-examples.md b/docs/src/data-diving-examples.md
index 4a62754030..39738f193d 100644
--- a/docs/src/data-diving-examples.md
+++ b/docs/src/data-diving-examples.md
@@ -271,19 +271,19 @@ The histogram shows the different distribution of 0/1 flags:
 mlr --opprint histogram -f flag,u,v --lo -0.1 --hi 1.1 --nbins 12 data/colored-shapes.dkvp
-bin_lo                bin_hi              flag_count u_count v_count
--0.010000000000000002 0.09000000000000002 6058       0       36
-0.09000000000000002   0.19000000000000003 0          1062    988
-0.19000000000000003   0.29000000000000004 0          985     1003
-0.29000000000000004   0.39000000000000007 0          1024    1014
-0.39000000000000007   0.4900000000000001  0          1002    991
-0.4900000000000001    0.5900000000000002  0          989     1041
-0.5900000000000002    0.6900000000000002  0          1001    1016
-0.6900000000000002    0.7900000000000001  0          972     962
-0.7900000000000001    0.8900000000000002  0          1035    1070
-0.8900000000000002    0.9900000000000002  0          995     993
-0.9900000000000002    1.0900000000000003  4020       1013    939
-1.0900000000000003    1.1900000000000002  0          0       25
+bin_lo                              bin_hi                              flag_count u_count v_count
+-0.1                                0.000000000000000013877787807814457 6058       0       36
+0.000000000000000013877787807814457 0.10000000000000003                 0          1062    988
+0.10000000000000003                 0.20000000000000004                 0          985     1003
+0.20000000000000004                 0.30000000000000004                 0          1024    1014
+0.30000000000000004                 0.40000000000000013                 0          1002    991
+0.40000000000000013                 0.5000000000000001                  0          989     1041
+0.5000000000000001                  0.6000000000000002                  0          1001    1016
+0.6000000000000002                  0.7000000000000002                  0          972     962
+0.7000000000000002                  0.8000000000000002                  0          1035    1070
+0.8000000000000002                  0.9000000000000002                  0          995     993
+0.9000000000000002                  1                                   4020       1013    939
+1                                   1.1                                 0          0       25
 
 Look at univariate stats by color and shape. In particular, color-dependent flag probabilities pop out, aligning with their original Bernoulli probabilities from the data-generator script:
diff --git a/docs/src/manpage.md b/docs/src/manpage.md
index 8b0683d39c..70bb446bca 100644
--- a/docs/src/manpage.md
+++ b/docs/src/manpage.md
@@ -50,7 +50,7 @@ MILLER(1) MILLER(1)
 insertion-ordered hash map. This encompasses a variety of data formats,
 including but not limited to the familiar CSV, TSV, and JSON. (Miller can
 handle positionally-indexed data as a special case.) This
- manpage documents mlr 6.5.0-dev.
+ manpage documents mlr 6.6.0.
 1mEXAMPLES0m
 mlr --icsv --opprint cat example.csv
@@ -197,7 +197,7 @@ MILLER(1) MILLER(1)
 most-frequent nest nothing put regularize remove-empty-columns rename reorder
 repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
 sort sort-within-records split stats1 stats2 step summary tac tail tee
- template top utf8-to-latin1 unflatten uniq unsparsify
+ template top utf8-to-latin1 unflatten uniq unspace unsparsify
 1mFUNCTION LIST0m
 abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2080,6 +2080,15 @@ MILLER(1) MILLER(1)
 With -n, produces only one record which is the unique-record count.
 With neither -c nor -n, produces unique records.
+ 1munspace0m
+ Usage: mlr unspace [options]
+ Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+ Options:
+ -f {x} Replace spaces with specified filler character.
+ -k Unspace only keys, not keys and values.
+ -v Unspace only values, not keys and values.
+ -h|--help Show this message.
+
 1munsparsify0m
 Usage: mlr unsparsify [options]
 Prints records with the union of field names over all input records.
@@ -3135,7 +3144,7 @@ MILLER(1) MILLER(1)
 int: declares an integer local variable in the current curly-braced scope.
 Type-checking happens at assignment: 'int x = 0.0' is an error.
- map
+ 1mmap0m
 map: declares a map-valued local variable in the current curly-braced scope.
 Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
 always OK. map b = a is OK or not depending on whether a is a map.
@@ -3288,5 +3297,5 @@ MILLER(1) MILLER(1)
- 2022-12-05 MILLER(1)
+ 2023-01-01 MILLER(1)
diff --git a/docs/src/manpage.txt b/docs/src/manpage.txt
index e0e99eb7d5..224be25208 100644
--- a/docs/src/manpage.txt
+++ b/docs/src/manpage.txt
@@ -29,7 +29,7 @@ MILLER(1) MILLER(1)
 insertion-ordered hash map. This encompasses a variety of data formats,
 including but not limited to the familiar CSV, TSV, and JSON. (Miller can
 handle positionally-indexed data as a special case.) This
- manpage documents mlr 6.5.0-dev.
+ manpage documents mlr 6.6.0.
 1mEXAMPLES0m
 mlr --icsv --opprint cat example.csv
@@ -176,7 +176,7 @@ MILLER(1) MILLER(1)
 most-frequent nest nothing put regularize remove-empty-columns rename reorder
 repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
 sort sort-within-records split stats1 stats2 step summary tac tail tee
- template top utf8-to-latin1 unflatten uniq unsparsify
+ template top utf8-to-latin1 unflatten uniq unspace unsparsify
 1mFUNCTION LIST0m
 abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2059,6 +2059,15 @@ MILLER(1) MILLER(1)
 With -n, produces only one record which is the unique-record count.
 With neither -c nor -n, produces unique records.
+ 1munspace0m
+ Usage: mlr unspace [options]
+ Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+ Options:
+ -f {x} Replace spaces with specified filler character.
+ -k Unspace only keys, not keys and values.
+ -v Unspace only values, not keys and values.
+ -h|--help Show this message.
+
 1munsparsify0m
 Usage: mlr unsparsify [options]
 Prints records with the union of field names over all input records.
@@ -3114,7 +3123,7 @@ MILLER(1) MILLER(1)
 int: declares an integer local variable in the current curly-braced scope.
 Type-checking happens at assignment: 'int x = 0.0' is an error.
- map
+ 1mmap0m
 map: declares a map-valued local variable in the current curly-braced scope.
 Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
 always OK. map b = a is OK or not depending on whether a is a map.
@@ -3267,4 +3276,4 @@ MILLER(1) MILLER(1)
- 2022-12-05 MILLER(1)
+ 2023-01-01 MILLER(1)
diff --git a/docs/src/operating-on-all-fields.md b/docs/src/operating-on-all-fields.md
index 452f4486d9..476b685dd8 100644
--- a/docs/src/operating-on-all-fields.md
+++ b/docs/src/operating-on-all-fields.md
@@ -24,10 +24,9 @@ Suppose you want to replace spaces with underscores in your column names:
 cat data/spaces.csv
-a b c,def,g h i
-123,4567,890
-2468,1357,3579
-9987,3312,4543
+column 1,column 2,column 3
+apple,ball,cat
+dale egg,fish,gale
 
 The simplest way is to use `mlr rename` with `-g` (for global replace, not just first occurrence of space within each field) and `-r` for pattern-matching (rather than explicit single-column renames):
@@ -36,20 +35,18 @@ The simplest way is to use `mlr rename` with `-g` (for global replace, not just
 mlr --csv rename -g -r ' ,_' data/spaces.csv
-a_b_c,def,g_h_i
-123,4567,890
-2468,1357,3579
-9987,3312,4543
+column_1,column_2,column_3
+apple,ball,cat
+dale egg,fish,gale
 
 mlr --csv --opprint rename -g -r ' ,_'  data/spaces.csv
 
-a_b_c def  g_h_i
-123   4567 890
-2468  1357 3579
-9987  3312 4543
+column_1 column_2 column_3
+apple    ball     cat
+dale egg fish     gale
 
 You can also do this with a for-loop:
@@ -69,10 +66,9 @@ $* = newrec
 mlr --icsv --opprint put -f data/bulk-rename-for-loop.mlr data/spaces.csv
-a_b_c def  g_h_i
-123   4567 890
-2468  1357 3579
-9987  3312 4543
+column_1 column_2 column_3
+apple    ball     cat
+dale egg fish     gale
 
 ## Bulk rename of fields with carriage returns
diff --git a/docs/src/reference-verbs.md b/docs/src/reference-verbs.md
index 1bbeb2e703..5fca68607f 100644
--- a/docs/src/reference-verbs.md
+++ b/docs/src/reference-verbs.md
@@ -4099,7 +4099,7 @@ The primary use-case is for PPRINT output, which is space-delimited. For example
 cat data/spaces.csv
-column 1, column 2, column 3
+column 1,column 2,column 3
 apple,ball,cat
 dale egg,fish,gale
 
@@ -4108,40 +4108,40 @@ dale egg,fish,gale
 mlr --icsv --opprint cat data/spaces.csv
-column 1  column 2  column 3
-apple    ball      cat
-dale egg fish      gale
+column 1 column 2 column 3
+apple    ball     cat
+dale egg fish     gale
 
 mlr --icsv --opprint cat data/spaces.csv
 
-column 1  column 2  column 3
-apple    ball      cat
-dale egg fish      gale
+column 1 column 2 column 3
+apple    ball     cat
+dale egg fish     gale
 
 mlr --icsv --opprint unspace data/spaces.csv
 
-column_1 _column_2 _column_3
-apple    ball      cat
-dale_egg fish      gale
+column_1 column_2 column_3
+apple    ball     cat
+dale_egg fish     gale
 
 mlr --icsv --opprint unspace data/spaces.csv | mlr --ipprint --oxtab cat
 
-column_1  apple
-_column_2 ball
-_column_3 cat
+column_1 apple
+column_2 ball
+column_3 cat
 
-column_1  dale_egg
-_column_2 fish
-_column_3 gale
+column_1 dale_egg
+column_2 fish
+column_3 gale
 
 ## unsparsify
diff --git a/docs/src/spaces.csv b/docs/src/spaces.csv
index 6fc75cea31..50c2f89d06 100644
--- a/docs/src/spaces.csv
+++ b/docs/src/spaces.csv
@@ -3,4 +3,3 @@ Zone,Total MWh
 17,39.8
 24,7.4
 30,50.5
-
diff --git a/internal/pkg/go-csv/csv_reader.go b/internal/pkg/go-csv/csv_reader.go
index 708e62fbde..507e9a94ca 100644
--- a/internal/pkg/go-csv/csv_reader.go
+++ b/internal/pkg/go-csv/csv_reader.go
@@ -473,4 +473,3 @@ parseField:
 	}
 	return dst, err
 }
-
diff --git a/internal/pkg/go-csv/csv_writer.go b/internal/pkg/go-csv/csv_writer.go
index 4f352e68d8..ac64b4d54c 100644
--- a/internal/pkg/go-csv/csv_writer.go
+++ b/internal/pkg/go-csv/csv_writer.go
@@ -179,4 +179,3 @@ func (w *Writer) fieldNeedsQuotes(field string) bool {
 	r1, _ := utf8.DecodeRuneInString(field)
 	return unicode.IsSpace(r1)
 }
-
diff --git a/internal/pkg/version/version.go b/internal/pkg/version/version.go
index 7f08f9bca9..96afe00cc0 100644
--- a/internal/pkg/version/version.go
+++ b/internal/pkg/version/version.go
@@ -4,4 +4,4 @@ package version
 // Nominally things like "6.0.0" for a release, then "6.0.0-dev" in between.
 // This makes it clear that a given build is on the main dev branch, not a
 // particular snapshot tag.
-var STRING string = "6.5.0-dev"
+var STRING string = "6.6.0"
diff --git a/man/manpage.txt b/man/manpage.txt
index e0e99eb7d5..224be25208 100644
--- a/man/manpage.txt
+++ b/man/manpage.txt
@@ -29,7 +29,7 @@ MILLER(1) MILLER(1)
 insertion-ordered hash map. This encompasses a variety of data formats,
 including but not limited to the familiar CSV, TSV, and JSON. (Miller can
 handle positionally-indexed data as a special case.) This
- manpage documents mlr 6.5.0-dev.
+ manpage documents mlr 6.6.0.
 1mEXAMPLES0m
 mlr --icsv --opprint cat example.csv
@@ -176,7 +176,7 @@ MILLER(1) MILLER(1)
 most-frequent nest nothing put regularize remove-empty-columns rename reorder
 repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
 sort sort-within-records split stats1 stats2 step summary tac tail tee
- template top utf8-to-latin1 unflatten uniq unsparsify
+ template top utf8-to-latin1 unflatten uniq unspace unsparsify
 1mFUNCTION LIST0m
 abs acos acosh any append apply arrayify asin asinh asserting_absent
@@ -2059,6 +2059,15 @@ MILLER(1) MILLER(1)
 With -n, produces only one record which is the unique-record count.
 With neither -c nor -n, produces unique records.
+ 1munspace0m
+ Usage: mlr unspace [options]
+ Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+ Options:
+ -f {x} Replace spaces with specified filler character.
+ -k Unspace only keys, not keys and values.
+ -v Unspace only values, not keys and values.
+ -h|--help Show this message.
+
 1munsparsify0m
 Usage: mlr unsparsify [options]
 Prints records with the union of field names over all input records.
@@ -3114,7 +3123,7 @@ MILLER(1) MILLER(1)
 int: declares an integer local variable in the current curly-braced scope.
 Type-checking happens at assignment: 'int x = 0.0' is an error.
- map
+ 1mmap0m
 map: declares a map-valued local variable in the current curly-braced scope.
 Type-checking happens at assignment: 'map b = 0' is an error. map b = {} is
 always OK. map b = a is OK or not depending on whether a is a map.
@@ -3267,4 +3276,4 @@ MILLER(1) MILLER(1)
- 2022-12-05 MILLER(1)
+ 2023-01-01 MILLER(1)
diff --git a/man/mlr.1 b/man/mlr.1
index 4711b829f1..99c5f4aac0 100644
--- a/man/mlr.1
+++ b/man/mlr.1
@@ -2,12 +2,12 @@
 .\" Title: mlr
 .\" Author: [see the "AUTHOR" section]
 .\" Generator: ./mkman.rb
-.\" Date: 2022-12-05
+.\" Date: 2023-01-01
 .\" Manual: \ \&
 .\" Source: \ \&
 .\" Language: English
 .\"
-.TH "MILLER" "1" "2022-12-05" "\ \&" "\ \&"
+.TH "MILLER" "1" "2023-01-01" "\ \&" "\ \&"
 .\" -----------------------------------------------------------------
 .\" * Portability definitions
 .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -47,7 +47,7 @@ on integer-indexed fields: if the natural data structure for the latter is the
 array, then Miller's natural data structure is the insertion-ordered hash map.
 This encompasses a variety of data formats, including but not limited to the
 familiar CSV, TSV, and JSON. (Miller can handle positionally-indexed data as
-a special case.) This manpage documents mlr 6.5.0-dev.
+a special case.) This manpage documents mlr 6.6.0.
 .SH "EXAMPLES"
 .sp
@@ -217,7 +217,7 @@ json-stringify join label latin1-to-utf8 least-frequent merge-fields
 most-frequent nest nothing put regularize remove-empty-columns rename reorder
 repeat reshape sample sec2gmtdate sec2gmt seqgen shuffle skip-trivial-records
 sort sort-within-records split stats1 stats2 step summary tac tail tee
-template top utf8-to-latin1 unflatten uniq unsparsify
+template top utf8-to-latin1 unflatten uniq unspace unsparsify
 .fi
 .if n \{\
 .RE
@@ -2604,6 +2604,21 @@ Options:
 .fi
 .if n \{\
 .RE
+.SS "unspace"
+.if n \{\
+.RS 0
+.\}
+.nf
+Usage: mlr unspace [options]
+Replaces spaces in record keys and/or values with _. This is helpful for PPRINT output.
+Options:
+-f {x} Replace spaces with specified filler character.
+-k Unspace only keys, not keys and values.
+-v Unspace only values, not keys and values.
+-h|--help Show this message.
+.fi
+.if n \{\
+.RE
 .SS "unsparsify"
 .if n \{\
 .RS 0
diff --git a/miller.spec b/miller.spec
index 9dc84b0e9f..c396bcc38a 100644
--- a/miller.spec
+++ b/miller.spec
@@ -1,6 +1,6 @@
 Summary: Name-indexed data processing tool
 Name: miller
-Version: 6.5.0
+Version: 6.6.0
 Release: 1%{?dist}
 License: BSD
 Source: https://github.com/johnkerl/miller/releases/download/%{version}/miller-%{version}.tar.gz
@@ -36,6 +36,9 @@ make install
 %{_mandir}/man1/mlr.1*
 %changelog
+* Sun Jan 1 2023 John Kerl - 6.6.0-1
+- 6.6.0 release
+
 * Sun Nov 27 2022 John Kerl - 6.5.0-1
 - 6.5.0 release