Add a stat DSL function #1560

Merged
merged 3 commits into from
May 9, 2024
16 changes: 14 additions & 2 deletions docs/src/manpage.md
@@ -225,7 +225,7 @@ MILLER(1) MILLER(1)
percentiles pow qnorm reduce regextract regextract_or_else rightpad round
roundm rstrip sec2dhms sec2gmt sec2gmtdate sec2hms sec2localdate sec2localtime
select sgn sha1 sha256 sha512 sin sinh skewness sort sort_collection splita
-splitax splitkv splitkvx splitnv splitnvx sqrt ssub stddev strfntime
+splitax splitkv splitkvx splitnv splitnvx sqrt ssub stat stddev strfntime
strfntime_local strftime strftime_local string strip strlen strmatch strmatchx
strpntime strpntime_local strptime strptime_local sub substr substr0 substr1
sum sum2 sum3 sum4 sysntime system systime systimeint tan tanh tolower toupper
@@ -2990,6 +2990,18 @@ MILLER(1) MILLER(1)
Example:
ssub("abc.def", ".", "X") gives "abcXdef"

1mstat0m
(class=system #args=1) Returns a map containing information about the provided path: "name" with string value, "size" as decimal int value, "mode" as octal int value, "modtime" as int-valued epoch seconds, and "isdir" as boolean value.
Examples:
stat("./mlr") gives {
  "name": "mlr",
  "size": 38391584,
  "mode": 0755,
  "modtime": 1715207874,
  "isdir": false
}
stat("./mlr")["size"] gives 38391584

1mstddev0m
(class=stats #args=1) Returns the sample standard deviation of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
@@ -3720,5 +3732,5 @@ MILLER(1) MILLER(1)



-2024-04-11 MILLER(1)
+2024-05-09 MILLER(1)
</pre>
16 changes: 14 additions & 2 deletions docs/src/manpage.txt
@@ -204,7 +204,7 @@ MILLER(1) MILLER(1)
percentiles pow qnorm reduce regextract regextract_or_else rightpad round
roundm rstrip sec2dhms sec2gmt sec2gmtdate sec2hms sec2localdate sec2localtime
select sgn sha1 sha256 sha512 sin sinh skewness sort sort_collection splita
-splitax splitkv splitkvx splitnv splitnvx sqrt ssub stddev strfntime
+splitax splitkv splitkvx splitnv splitnvx sqrt ssub stat stddev strfntime
strfntime_local strftime strftime_local string strip strlen strmatch strmatchx
strpntime strpntime_local strptime strptime_local sub substr substr0 substr1
sum sum2 sum3 sum4 sysntime system systime systimeint tan tanh tolower toupper
@@ -2969,6 +2969,18 @@ MILLER(1) MILLER(1)
Example:
ssub("abc.def", ".", "X") gives "abcXdef"

1mstat0m
(class=system #args=1) Returns a map containing information about the provided path: "name" with string value, "size" as decimal int value, "mode" as octal int value, "modtime" as int-valued epoch seconds, and "isdir" as boolean value.
Examples:
stat("./mlr") gives {
  "name": "mlr",
  "size": 38391584,
  "mode": 0755,
  "modtime": 1715207874,
  "isdir": false
}
stat("./mlr")["size"] gives 38391584

1mstddev0m
(class=stats #args=1) Returns the sample standard deviation of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
@@ -3699,4 +3711,4 @@ MILLER(1) MILLER(1)



-2024-04-11 MILLER(1)
+2024-05-09 MILLER(1)
17 changes: 16 additions & 1 deletion docs/src/reference-dsl-builtin-functions.md
@@ -76,7 +76,7 @@ is 2. Unary operators such as `!` and `~` show argument-count of 1; the ternary
* [**Math functions**](#math-functions): [abs](#abs), [acos](#acos), [acosh](#acosh), [asin](#asin), [asinh](#asinh), [atan](#atan), [atan2](#atan2), [atanh](#atanh), [cbrt](#cbrt), [ceil](#ceil), [cos](#cos), [cosh](#cosh), [erf](#erf), [erfc](#erfc), [exp](#exp), [expm1](#expm1), [floor](#floor), [invqnorm](#invqnorm), [log](#log), [log10](#log10), [log1p](#log1p), [logifit](#logifit), [max](#max), [min](#min), [qnorm](#qnorm), [round](#round), [roundm](#roundm), [sgn](#sgn), [sin](#sin), [sinh](#sinh), [sqrt](#sqrt), [tan](#tan), [tanh](#tanh), [urand](#urand), [urand32](#urand32), [urandelement](#urandelement), [urandint](#urandint), [urandrange](#urandrange).
* [**Stats functions**](#stats-functions): [antimode](#antimode), [count](#count), [distinct_count](#distinct_count), [kurtosis](#kurtosis), [maxlen](#maxlen), [mean](#mean), [meaneb](#meaneb), [median](#median), [minlen](#minlen), [mode](#mode), [null_count](#null_count), [percentile](#percentile), [percentiles](#percentiles), [skewness](#skewness), [sort_collection](#sort_collection), [stddev](#stddev), [sum](#sum), [sum2](#sum2), [sum3](#sum3), [sum4](#sum4), [variance](#variance).
* [**String functions**](#string-functions): [capitalize](#capitalize), [clean_whitespace](#clean_whitespace), [collapse_whitespace](#collapse_whitespace), [contains](#contains), [format](#format), [gssub](#gssub), [gsub](#gsub), [index](#index), [latin1_to_utf8](#latin1_to_utf8), [leftpad](#leftpad), [lstrip](#lstrip), [regextract](#regextract), [regextract_or_else](#regextract_or_else), [rightpad](#rightpad), [rstrip](#rstrip), [ssub](#ssub), [strip](#strip), [strlen](#strlen), [strmatch](#strmatch), [strmatchx](#strmatchx), [sub](#sub), [substr](#substr), [substr0](#substr0), [substr1](#substr1), [tolower](#tolower), [toupper](#toupper), [truncate](#truncate), [unformat](#unformat), [unformatx](#unformatx), [utf8_to_latin1](#utf8_to_latin1), [\.](#dot).
-* [**System functions**](#system-functions): [exec](#exec), [hostname](#hostname), [os](#os), [system](#system), [version](#version).
+* [**System functions**](#system-functions): [exec](#exec), [hostname](#hostname), [os](#os), [stat](#stat), [system](#system), [version](#version).
* [**Time functions**](#time-functions): [dhms2fsec](#dhms2fsec), [dhms2sec](#dhms2sec), [fsec2dhms](#fsec2dhms), [fsec2hms](#fsec2hms), [gmt2localtime](#gmt2localtime), [gmt2nsec](#gmt2nsec), [gmt2sec](#gmt2sec), [hms2fsec](#hms2fsec), [hms2sec](#hms2sec), [localtime2gmt](#localtime2gmt), [localtime2nsec](#localtime2nsec), [localtime2sec](#localtime2sec), [nsec2gmt](#nsec2gmt), [nsec2gmtdate](#nsec2gmtdate), [nsec2localdate](#nsec2localdate), [nsec2localtime](#nsec2localtime), [sec2dhms](#sec2dhms), [sec2gmt](#sec2gmt), [sec2gmtdate](#sec2gmtdate), [sec2hms](#sec2hms), [sec2localdate](#sec2localdate), [sec2localtime](#sec2localtime), [strfntime](#strfntime), [strfntime_local](#strfntime_local), [strftime](#strftime), [strftime_local](#strftime_local), [strpntime](#strpntime), [strpntime_local](#strpntime_local), [strptime](#strptime), [strptime_local](#strptime_local), [sysntime](#sysntime), [systime](#systime), [systimeint](#systimeint), [upntime](#upntime), [uptime](#uptime).
* [**Typing functions**](#typing-functions): [asserting_absent](#asserting_absent), [asserting_array](#asserting_array), [asserting_bool](#asserting_bool), [asserting_boolean](#asserting_boolean), [asserting_empty](#asserting_empty), [asserting_empty_map](#asserting_empty_map), [asserting_error](#asserting_error), [asserting_float](#asserting_float), [asserting_int](#asserting_int), [asserting_map](#asserting_map), [asserting_nonempty_map](#asserting_nonempty_map), [asserting_not_array](#asserting_not_array), [asserting_not_empty](#asserting_not_empty), [asserting_not_map](#asserting_not_map), [asserting_not_null](#asserting_not_null), [asserting_null](#asserting_null), [asserting_numeric](#asserting_numeric), [asserting_present](#asserting_present), [asserting_string](#asserting_string), [is_absent](#is_absent), [is_array](#is_array), [is_bool](#is_bool), [is_boolean](#is_boolean), [is_empty](#is_empty), [is_empty_map](#is_empty_map), [is_error](#is_error), [is_float](#is_float), [is_int](#is_int), [is_map](#is_map), [is_nan](#is_nan), [is_nonempty_map](#is_nonempty_map), [is_not_array](#is_not_array), [is_not_empty](#is_not_empty), [is_not_map](#is_not_map), [is_not_null](#is_not_null), [is_null](#is_null), [is_numeric](#is_numeric), [is_present](#is_present), [is_string](#is_string), [typeof](#typeof).

@@ -1502,6 +1502,21 @@ os (class=system #args=0) Returns the operating-system name as a string.
</pre>


### stat
<pre class="pre-non-highlight-non-pair">
stat (class=system #args=1) Returns a map containing information about the provided path: "name" with string value, "size" as decimal int value, "mode" as octal int value, "modtime" as int-valued epoch seconds, and "isdir" as boolean value.
Examples:
stat("./mlr") gives {
  "name": "mlr",
  "size": 38391584,
  "mode": 0755,
  "modtime": 1715207874,
  "isdir": false
}
stat("./mlr")["size"] gives 38391584
</pre>


### system
<pre class="pre-non-highlight-non-pair">
system (class=system #args=1) Run command string, yielding its stdout minus final carriage return.
16 changes: 14 additions & 2 deletions man/manpage.txt
@@ -204,7 +204,7 @@ MILLER(1) MILLER(1)
percentiles pow qnorm reduce regextract regextract_or_else rightpad round
roundm rstrip sec2dhms sec2gmt sec2gmtdate sec2hms sec2localdate sec2localtime
select sgn sha1 sha256 sha512 sin sinh skewness sort sort_collection splita
-splitax splitkv splitkvx splitnv splitnvx sqrt ssub stddev strfntime
+splitax splitkv splitkvx splitnv splitnvx sqrt ssub stat stddev strfntime
strfntime_local strftime strftime_local string strip strlen strmatch strmatchx
strpntime strpntime_local strptime strptime_local sub substr substr0 substr1
sum sum2 sum3 sum4 sysntime system systime systimeint tan tanh tolower toupper
@@ -2969,6 +2969,18 @@ MILLER(1) MILLER(1)
Example:
ssub("abc.def", ".", "X") gives "abcXdef"

1mstat0m
(class=system #args=1) Returns a map containing information about the provided path: "name" with string value, "size" as decimal int value, "mode" as octal int value, "modtime" as int-valued epoch seconds, and "isdir" as boolean value.
Examples:
stat("./mlr") gives {
  "name": "mlr",
  "size": 38391584,
  "mode": 0755,
  "modtime": 1715207874,
  "isdir": false
}
stat("./mlr")["size"] gives 38391584

1mstddev0m
(class=stats #args=1) Returns the sample standard deviation of values in an array or map. Returns empty string AKA void for array/map of length less than two; returns error for non-array/non-map types.
Example:
@@ -3699,4 +3711,4 @@ MILLER(1) MILLER(1)



-2024-04-11 MILLER(1)
+2024-05-09 MILLER(1)
24 changes: 21 additions & 3 deletions man/mlr.1
@@ -2,12 +2,12 @@
.\" Title: mlr
.\" Author: [see the "AUTHOR" section]
.\" Generator: ./mkman.rb
-.\" Date: 2024-04-11
+.\" Date: 2024-05-09
.\" Manual: \ \&
.\" Source: \ \&
.\" Language: English
.\"
-.TH "MILLER" "1" "2024-04-11" "\ \&" "\ \&"
+.TH "MILLER" "1" "2024-05-09" "\ \&" "\ \&"
.\" -----------------------------------------------------------------
.\" * Portability definitions
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -251,7 +251,7 @@ nsec2gmt nsec2gmtdate nsec2localdate nsec2localtime null_count os percentile
percentiles pow qnorm reduce regextract regextract_or_else rightpad round
roundm rstrip sec2dhms sec2gmt sec2gmtdate sec2hms sec2localdate sec2localtime
select sgn sha1 sha256 sha512 sin sinh skewness sort sort_collection splita
-splitax splitkv splitkvx splitnv splitnvx sqrt ssub stddev strfntime
+splitax splitkv splitkvx splitnv splitnvx sqrt ssub stat stddev strfntime
strfntime_local strftime strftime_local string strip strlen strmatch strmatchx
strpntime strpntime_local strptime strptime_local sub substr substr0 substr1
sum sum2 sum3 sum4 sysntime system systime systimeint tan tanh tolower toupper
@@ -4602,6 +4602,24 @@ ssub("abc.def", ".", "X") gives "abcXdef"
.fi
.if n \{\
.RE
.SS "stat"
.if n \{\
.RS 0
.\}
.nf
(class=system #args=1) Returns a map containing information about the provided path: "name" with string value, "size" as decimal int value, "mode" as octal int value, "modtime" as int-valued epoch seconds, and "isdir" as boolean value.
Examples:
stat("./mlr") gives {
  "name": "mlr",
  "size": 38391584,
  "mode": 0755,
  "modtime": 1715207874,
  "isdir": false
}
stat("./mlr")["size"] gives 38391584
.fi
.if n \{\
.RE
.SS "stddev"
.if n \{\
.RS 0
22 changes: 22 additions & 0 deletions pkg/bifs/system.go
@@ -102,3 +102,25 @@ func BIF_exec(mlrvals []*mlrval.Mlrval) *mlrval.Mlrval {
outputString := strings.TrimRight(string(outputBytes), "\n")
return mlrval.FromString(outputString)
}

// BIF_stat returns a map describing the file or directory at the given path.
func BIF_stat(input1 *mlrval.Mlrval) *mlrval.Mlrval {
	if !input1.IsStringOrVoid() {
		// Report the type error under this function's own name, not "system".
		return mlrval.FromNotStringError("stat", input1)
	}
	path := input1.AcquireStringValue()

	fileInfo, err := os.Stat(path)
	if err != nil {
		return mlrval.FromError(err)
	}

	// Keys match the help text: name, size, mode, modtime, isdir.
	output := mlrval.NewMlrmap()
	output.PutReference("name", mlrval.FromString(fileInfo.Name()))
	output.PutReference("size", mlrval.FromInt(fileInfo.Size()))
	output.PutReference("mode", mlrval.FromIntShowingOctal(int64(fileInfo.Mode())))
	output.PutReference("modtime", mlrval.FromInt(fileInfo.ModTime().UTC().Unix()))
	output.PutReference("isdir", mlrval.FromBool(fileInfo.IsDir()))

	return mlrval.FromMap(output)
}
17 changes: 17 additions & 0 deletions pkg/dsl/cst/builtin_function_manager.go
@@ -2487,6 +2487,23 @@ Run a command via executable, path, args and environment, yielding its stdout mi
variadicFunc: bifs.BIF_exec,
},

{
	name:      "stat",
	class:     FUNC_CLASS_SYSTEM,
	help:      `Returns a map containing information about the provided path: "name" with string value, "size" as decimal int value, "mode" as octal int value, "modtime" as int-valued epoch seconds, and "isdir" as boolean value.`,
	unaryFunc: bifs.BIF_stat,
	examples: []string{
		`stat("./mlr") gives {`,
		`  "name": "mlr",`,
		`  "size": 38391584,`,
		`  "mode": 0755,`,
		`  "modtime": 1715207874,`,
		`  "isdir": false`,
		`}`,
		`stat("./mlr")["size"] gives 38391584`,
	},
},

{
name: "version",
class: FUNC_CLASS_SYSTEM,
9 changes: 9 additions & 0 deletions pkg/mlrval/mlrval_new.go
@@ -197,6 +197,15 @@ func FromInt(input int64) *Mlrval {
}
}

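// FromIntShowingOctal returns an int-typed mlrval whose print representation
// is the value in octal with a leading "0", e.g. 0755.
func FromIntShowingOctal(input int64) *Mlrval {
	return &Mlrval{
		mvtype:        MT_INT,
		printrepValid: true,
		printrep:      fmt.Sprintf("0%o", input),
		intf:          input,
	}
}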

// TryFromIntString is used by the mlrval Formatter (fmtnum DSL function,
// format-values verb, etc). Each mlrval has printrep and a printrepValid for
// its original string, then a type-code like MT_INT or MT_FLOAT, and
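
The octal print representation used by `FromIntShowingOctal` can be tried in isolation. The sketch below uses only `fmt` and `strconv` from the standard library; `showOctal` and `parseOctal` are hypothetical names introduced for illustration, and the round-trip through `strconv.ParseInt` is an assumption about how a reader might consume the value, not something this PR does.

```go
package main

import (
	"fmt"
	"strconv"
)

// showOctal mirrors the format string in the diff: a literal "0" followed by
// the base-8 digits of n.
func showOctal(n int64) string {
	return fmt.Sprintf("0%o", n)
}

// parseOctal is a hypothetical inverse. With base 0, strconv.ParseInt treats
// a leading "0" as an octal prefix.
func parseOctal(s string) (int64, error) {
	return strconv.ParseInt(s, 0, 64)
}

func main() {
	s := showOctal(0o755)
	n, err := parseOctal(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(s, n) // prints "0755 493"
}
```

Two edge cases follow from the format string itself: zero renders as `00`, and a negative input renders with the sign after the prefix (for example `0-10` for -8). Neither should arise for file modes, which are non-negative.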
1 change: 1 addition & 0 deletions test/cases/dsl-stat/0001/cmd
@@ -0,0 +1 @@
mlr --icsv --ojson put -f ${CASEDIR}/mlr ${CASEDIR}/input.csv
Empty file added test/cases/dsl-stat/0001/experr
Empty file.
12 changes: 12 additions & 0 deletions test/cases/dsl-stat/0001/expout
@@ -0,0 +1,12 @@
[
{
  "path": "test/cases/dsl-stat/0001/input.csv",
  "name": "input.csv",
  "isdir": false
},
{
  "path": "test/cases/dsl-stat/0001/",
  "name": "0001",
  "isdir": true
}
]
3 changes: 3 additions & 0 deletions test/cases/dsl-stat/0001/input.csv
@@ -0,0 +1,3 @@
path
test/cases/dsl-stat/0001/input.csv
test/cases/dsl-stat/0001/
3 changes: 3 additions & 0 deletions test/cases/dsl-stat/0001/mlr
@@ -0,0 +1,3 @@
s = stat($path);
$name = s["name"];
$isdir = s["isdir"];