diff --git a/README.md b/README.md index 9841f24..6f40b97 100644 --- a/README.md +++ b/README.md @@ -146,202 +146,62 @@ local element_count = my_heap:count() The `rmean` module provides functionality for efficient moving average calculations with specified window sizes. -### Usage +The best use of rmean would be when you want to have Average Calls per second during last 5 seconds. -For the most convenient use of the `rmean` module, the `default` instance is provided with a window size of 5 seconds, similar to what is used in Tarantool. Here are some examples demonstrating the usage of the `default` rmean instance: +Easy to set and use: ```lua -local rmean = require('algo.rmean') +local rmean = require 'algo.rmean' --- Create a new collector in the default rmean instance with the same window size (5 seconds) -local my_collector = rmean.collector('my_collector') -``` - -#### Observing Values and Calculating Metrics - -```lua --- Observe values for the collector -my_collector:observe(10) -my_collector:observe(15) -my_collector:observe(20) - --- Calculate the moving average per second for the collector -local avg_per_sec = my_collector:per_second() -``` - -#### Getting All Registered Collectors in the Default rmean Instance +local calls_storage = rmean.collector('rmean_calls_storage', --[[ [window=(default: 5)] ]]) -```lua -local all_collectors = rmean.getall() -``` +local function call_storage() + -- increase by one each time, when method is called + calls_storage:observe(1) +end -#### Freeing a Specific Collector in the Default rmean Instance +calls_storage:per_second() -- => gives Moving Average per Second +calls_storage:max() -- => gives Moving Max within Time Window (default 5 sec) +calls_storage:sum() -- => gives Moving Sum within Time Window (default 5 sec) +calls_storage:mean() -- => gives Moving Mean within Time Window (default 5 sec) +calls_storage:min() -- => gives Moving Min within Time Window (default 5 sec) +calls_storage:hits() -- => gives Moving Count 
within Time Window (default 5 sec) -```lua --- Free a specific collector --- Collector will become unusable, though it's data will be preserved. --- This is a true way to destroy collector -rmean.free(my_collector) +-- rmean can be easily connected to metrics: +rmean.default:set_registry(require 'metrics'.registry) ``` -#### Note - -- The `default` rmean instance is the most preferred way to use `rmean`, as it has a window size of 5 seconds, aligning with the common practice in Tarantool. - -#### Collector methods - -The `rmean` module provides methods to efficiently calculate and access various metrics within the moving average collectors. +Rmean can be used to collect statistics of both discrete and continuous variables. -##### `sum([depth=window_size])` - -- **Usage**: Retrieves the moving sum value within a specified time depth. -- **When to Use**: This method is useful when you need to track the cumulative sum of values observed by the collector over a specific time period. - -##### `min([depth=window_size])` - -- **Usage**: Returns the moving minimum value within a specified time depth. -- **When to Use**: Use this method when you want to find the minimum value observed by the collector within a specific time window. - -##### `max([depth=window_size])` - -- **Usage**: Retrieves the moving maximum value within a specified time depth. -- **When to Use**: Utilize this method to determine the maximum value observed by the collector within a particular time frame. - -##### `count and total fields` - -- **Count Field**: The `count` field represents the monotonic counter of all collected values from the last reset. -- **Total Field**: The `total` field stores the sum of all values collected by the collector from the last reset. -- **When to Use**: You can access these fields to keep track of the total count of observations and the cumulative total sum of values collected by the collector. 
+Collect Running Mean (Moving Average) of Latency of your calls: ```lua --- Obtain the moving sum value for the last 4 seconds -local sum_last_4_sec = my_collector:sum(4) +local latency_rmean = rmean.collector('latency') --- Get the minimum value observed in the last 3 seconds -local min_last_3_sec = my_collector:min(3) +latency_rmean:max() -- Will produce Moving maximum of the latency within specified window +latency_rmean:min() -- Will produce Moving minimum of the latency within specified window +latency_rmean:mean() -- Will produce Moving average of the latency within specified window --- Retrieve the maximum value in the last 2 seconds -local max_last_2_sec = my_collector:max(2) +-- latency_rmean:per_second() does NOT provide any meaningful stats --- Access the total sum and count fields of the collector -local total_sum = my_collector.total -local observation_count = my_collector.count +latency_rmean:hits() -- Will produce Moving count of observed values within specified window ``` -**Note:** Ensure that the `depth` parameter does not exceed the `window size` of the collector. - -### Integrating with tarantool/metrics +Collect Per Second statistics for your calls or bytes ```lua -local metrics = require('metrics') -metrics.registry:register(rmean) -``` - -After registering `rmean` in `tarantool/metrics`, you can seamlessly collect metrics from all registered named rmean collectors. - -### Setting Labels for `rmean` Collectors +local tuple_size_rmean = rmean.collector('tuple_size') -Each collector within the `rmean` module allows you to set custom labels to provide additional context or categorization for the collected metrics. 
+-- Let's assume you measure bsize of tuples you save into Database +tuple_size_rmean:observe(tuple:bsize()) -```lua --- Set custom labels for a collector -my_collector:set_labels({ name = 'example_collector', environment = 'production' }) +tuple_size_rmean:per_second() -- will produce Moving average of bytes per second +tuple_size_rmean:max() +tuple_size_rmean:min() +tuple_size_rmean:hits() ``` -Each collector within the `rmean` module provides metrics suitable for export to Prometheus via the `tarantool/metrics` module. The metrics available for export are as follows: - -- **rmean_per_second**: Represents the running average of the collected values. -- **rmean_sum**: Represents the running sum of the collected values. -- **rmean_min**: Represents the minimum value observed within the collector's window. -- **rmean_max**: Represents the maximum value observed within the collector's window. -- **rmean_count**: Represents the number of observations made by the collector. -- **rmean_total**: Represents the total sum of all collected values. - -### Advanced Usage - -1. **Creating a New `rmean` Instance**: - -```lua -local rmean = require('algo.rmean') - --- Create a new rmean instance with a specified name, resolution, and window size -local my_rmean = rmean.new('my_rmean_instance', 1, 5) -``` - -1. **Creating a New Collector**: - - ```lua - -- Create a new collector within the rmean instance - local new_collector = my_rmean:collector('my_collector', 5) - ``` - -2. **Getting All Collectors**: - - ```lua - -- Get all registered collectors within the rmean instance - local all_collectors = my_rmean:getall() - ``` - -3. **Getting a Specific Collector**: - - ```lua - -- Get a specific collector by name - local specific_collector = my_rmean:get('my_collector') - ``` - -4. 
**Observing Values and Calculating Metrics**: - - ```lua - -- Observe a value for a collector - specific_collector:observe(10) - - -- Calculate the moving average per second for a collector - local avg_per_sec = specific_collector:per_second() - ``` - -5. **Reloading a Collector**: - - ```lua - -- Reload a collector from an existing one - -- specific_collector will be unsusable after executing this call - local reloaded_collector = my_rmean:reload(specific_collector) - ``` - -6. **Starting and Stopping the System**: - - ```lua - -- Stop the system and disable creating new collectors - my_rmean:stop() - - -- Start the system to begin calculating averages - my_rmean:start() - ``` - -7. **Freeing Collectors**: - - ```lua - -- Free a specific collector - my_rmean:free(specific_collector) - ``` - -8. **Metrics Collection**: - - ```lua - -- Collect metrics from all registered collectors - local metrics_data = my_rmean:collect() - ``` - -9. **Setting Metrics Registry**: - - ```lua - -- Set a metrics registry for the rmean instance - my_rmean:set_registry(metrics_registry) - ``` - -### Notes - -- The system is automatically started when the rmean instance is created. Manual starting is only required if it was previously stopped. -- The module efficiently handles moving average calculations even with a large number of parallel running collectors and provides high-performance metrics collection capabilities. 
+Read more at [rmean](./doc/rmean.md) ## Ordered Dictionary (algo.odict) diff --git a/algo-scm-1.rockspec b/algo-scm-1.rockspec index 4273973..3226531 100644 --- a/algo-scm-1.rockspec +++ b/algo-scm-1.rockspec @@ -34,6 +34,8 @@ build = { algo = "algo.lua", ["algo.rlist"] = "algo/rlist.lua", ["algo.rmean"] = "algo/rmean.lua", - ["algo.heap"] = "algo/heap.lua" + ["algo.heap"] = "algo/heap.lua", + ["algo.odict"] = "algo/odict.lua", + ["algo.skiplist"] = "algo/skiplist.lua", } } diff --git a/algo.lua b/algo.lua index 2ac6da8..044a44d 100644 --- a/algo.lua +++ b/algo.lua @@ -3,5 +3,6 @@ return { heap = require 'algo.heap', rmean = require 'algo.rmean', odict = require 'algo.odict', - _VERSION = '0.1.0', + skiplist = require 'algo.skiplist', + _VERSION = '0.2.0', } diff --git a/algo/rlist.lua b/algo/rlist.lua index 096ba90..1ca760f 100644 --- a/algo/rlist.lua +++ b/algo/rlist.lua @@ -29,7 +29,9 @@ end ---@class algo.rlist.item:table ---@field prev? algo.rlist.item ---@field next? algo.rlist.item -local rlist_item_mt = {__serialize = __item_serialize} + +local rlist_item_mt = {} +rlist_item_mt.__serialize = __item_serialize ---@param node table local function to_rlist_item(node) diff --git a/algo/rmean.lua b/algo/rmean.lua index 4ef0999..5814357 100644 --- a/algo/rmean.lua +++ b/algo/rmean.lua @@ -5,25 +5,17 @@ local rlist = require 'algo.rlist' local log = require 'log' local math_floor = math.floor -local math_min = math.min -local math_max = math.max local setmetatable = setmetatable local table_new = table.new local tonumber = tonumber -local type = type ----Appends all k-vs from t2 to t1, if not exists in t1 ----@param t1 table? ----@param t2 table? 
-local function merge(t1, t2) - if type(t1) ~= 'table' or type(t2) ~= 'table' then return end +local merge = require 'algo.utils'.merge +local map_mt = require 'algo.utils'.map_mt +local weak_mt = require 'algo.utils'.weak_mt +local new_list = require 'algo.utils'.new_list +local new_zero_list = require 'algo.utils'.new_zero_list +local make_list_pretty = require 'algo.utils'.make_list_pretty - for k in pairs(t2) do - if t1[k] == nil then - t1[k] = t2[k] - end - end -end ---Class rmean is plain-Lua implementation of Tarantool's rmean collector --- @@ -48,7 +40,7 @@ end ---@field _roller_f? Fiber ---@class algo.rmean.collector.weak:algo.rlist.item ----@field collector algo.rmean.collector weakref to collector +---@field collector? algo.rmean.collector weakref to collector ---rmean.collector is named separate counter ---@class algo.rmean.collector @@ -56,9 +48,10 @@ end ---@field window number window size of collector (default=5s) ---@field size number capacity collector (window/resolution) ---@field label_pairs table specified label pairs for metrics ----@field sum_value number[] list of sum values per second ----@field min_value number[] list of min values per second ----@field max_value number[] list of max values per second +---@field sum_value number[] list of sum values per second (running sum) +---@field hit_value number[] list of hit values per second (running count) +---@field min_value number[] list of min values per second (running min) +---@field max_value number[] list of max values per second (running max) ---@field total number sum of all values from last reset ---@field count number monotonic counter of all collected values from last reset ---@field _resolution number length in seconds of each slot @@ -67,10 +60,6 @@ end ---@field _invalid? 
boolean is set to true when collector was freed from previous rmean local collector = {} -local map_mt = { __serialize = 'map' } -local list_mt = { __serialize = 'seq' } -local weak_mt = { __mode = 'v' } - local collector_mt = { __index = collector, __tostring = function(self) @@ -92,37 +81,12 @@ local collector_mt = { window = self.window, min = self:min(), max = self:max(), + mean = self:mean(), per_second = self:per_second(), }, map_mt) end } ---#region algo.rmean.utils - ----Creates new list with null values ----@param size number ----@private ----@return number[] -local function new_list(size) - size = math_floor(size) - local t = setmetatable(table_new(size, 0), list_mt) - return t -end - ----Creates new list with zero values ----@param size number ----@private ----@return number[] -local function new_zero_list(size) - local t = new_list(size) - -- we start iteration from 0 - -- we abuse how lua stores arrays - for i = 0, size do - t[i] = 0 - end - return t -end - ---@param depth number ---@param max_value number ---@return integer @@ -142,15 +106,6 @@ local function _get_depth(depth, max_value) return depth end -local function _list_serialize(list) - local r = {} - for i = 1, #list do - r[i] = tostring(list[i]) - end - return r -end - ---#endregion ---fiber roller of registered collectors ---@private @@ -231,11 +186,13 @@ function rmean_methods:collector(name, window) window = tonumber(window) or self.window local size = math_floor(window/self._resolution) + local remote = setmetatable({ name = name, window = window, size = size, sum_value = new_zero_list(size), + hit_value = new_zero_list(size), min_value = new_list(size), max_value = new_list(size), label_pairs = { name = name, window = window }, @@ -245,6 +202,9 @@ function rmean_methods:collector(name, window) _resolution = self._resolution, }, collector_mt) + -- cache hot function into object itself + remote.observe = remote.observe + ---@type algo.rmean.collector.weak local _item = setmetatable({ 
collector = remote }, weak_mt) remote._rlist_item = _item @@ -267,8 +227,9 @@ end ---@return algo.rmean.collector function rmean_methods:reload(counter) -- ? check __version - local new = self:collector(counter.name) + local new = self:collector(counter.name, counter.window) new.sum_value = counter.sum_value + new.hit_value = counter.hit_value new.count = counter.count new.total = counter.total new.max_value = counter.max_value @@ -298,7 +259,7 @@ function rmean_methods:getall() local rv = table_new(self._collectors.count, 0) local n = 0 - setmetatable(rv, {__serialize = _list_serialize}) + make_list_pretty(rv) for _, cursor in self._collectors:pairs() do ---@cast cursor algo.rmean.collector.weak @@ -313,7 +274,7 @@ end ---returns registered collector by name ---@param name string ----@return algo.rmean.collector? +---@return algo.rmean.collector|algo.rmean.collector[]|nil function rmean_methods:get(name) if not name then return self:getall() @@ -337,7 +298,7 @@ end ---metrics collect hook function rmean_methods:collect() - local result = table.new(self._collectors.count*6, 0) + local result = table_new(self._collectors.count*6, 0) local label_pairs if self.metrics_registry then label_pairs = self.metrics_registry.label_pairs @@ -363,18 +324,6 @@ end --#region algo.rmean.collector ----Calculates and returns per_second value ----@param depth integer? 
depth in seconds (default=window size, [1,window size]) ----@return number -function collector:per_second(depth) - depth = _get_depth(depth, self.window) - local sum = 0 - for i = 1, depth/self._resolution do - sum = sum + self.sum_value[i] - end - return sum / depth -end - function collector:set_labels(label_pairs) self.label_pairs = table.copy(label_pairs) self.label_pairs.window = tostring(self.window) @@ -387,13 +336,13 @@ end function collector:min(depth) depth = _get_depth(depth, self.window) - local min + local _min for i = 1, depth/self._resolution do - if not min or (self.min_value[i] and min > self.min_value[i]) then - min = self.min_value[i] + if not _min or (self.min_value[i] and _min > self.min_value[i]) then + _min = self.min_value[i] end end - return min + return _min end ---Returns moving max value @@ -402,18 +351,21 @@ end function collector:max(depth) depth = _get_depth(depth, self.window) - local max + local _max for i = 1, depth/self._resolution do - if not max or (self.max_value[i] and max < self.max_value[i]) then - max = self.max_value[i] + if not _max or (self.max_value[i] and _max < self.max_value[i]) then + _max = self.max_value[i] end end - return max + return _max end ---Returns moving sum value +--- +---Equivalent to SUM(VALUE[0:depth]) +--- ---@param depth integer? depth in seconds (default=window size, [1,window size]) ----@return number? sum # can return null if no values were observed +---@return number sum function collector:sum(depth) depth = _get_depth(depth, self.window) @@ -424,6 +376,61 @@ function collector:sum(depth) return sum end +--- Returns moving count value +--- +--- Equivalent to COUNT(VALUE[0:depth]) +--- +---@param depth integer? 
depth in seconds (default=window size, [1,window size]) +---@return number count +function collector:hits(depth) + depth = _get_depth(depth, self.window) + + local count = 0 + for i = 1, depth/self._resolution do + count = count + self.hit_value[i] + end + return count +end + +---Calculates and returns moving average value with given depth +--- +---Equivalent to SUM(VALUE[0:depth]) / COUNT(VALUE[0:depth]) +--- +---@param depth integer? depth in seconds (default=window size, [1,window size]) +---@return number average +function collector:mean(depth) + depth = _get_depth(depth, self.window) + + local sum = 0 + local count = 0 + for i = 1, depth/self._resolution do + sum = sum + self.sum_value[i] + count = count + self.hit_value[i] + end + if count == 0 then + return 0 + end + return sum / count +end + +---Calculates and returns moving sum value divided by depth +--- +---Equivalent to SUM(values[0:depth]) / depth +--- +---It has the same meaning as average 'per second' sum of values +--- +---Useful for calculating average hits per second (such as rps or sizes) +---@param depth integer? 
depth in seconds (default=window size, [1,window size]) +---@return number +function collector:per_second(depth) + depth = _get_depth(depth, self.window) + local sum = 0 + for i = 1, depth/self._resolution do + sum = sum + self.sum_value[i] + end + return sum / depth +end + ---Increments current time bucket with given value ---@param value number|uint64_t|integer64 function collector:observe(value) @@ -432,11 +439,20 @@ function collector:observe(value) if not value then return end self.sum_value[0] = self.sum_value[0] + value + self.hit_value[0] = self.hit_value[0] + 1 self.total = self.total + value self.count = self.count + 1 - self.min_value[0] = math_min(self.min_value[0] or value, value) - self.max_value[0] = math_max(self.max_value[0] or value, value) + if self.min_value[0] then + if value < self.min_value[0] then + self.min_value[0] = value + elseif value > self.max_value[0] then + self.max_value[0] = value + end + else + self.min_value[0] = value + self.max_value[0] = value + end end -- inc is alias for observe @@ -451,6 +467,7 @@ function collector:collect() timestamp = fiber.time64(), }, { metric_name = 'rmean_sum', value = self:sum(), label_pairs = self.label_pairs, timestamp = fiber.time64() }, + { metric_name = 'rmean_mean', value = self:mean(), label_pairs = self.label_pairs, timestamp = fiber.time64() }, { metric_name = 'rmean_min', value = self:min(), label_pairs = self.label_pairs, timestamp = fiber.time64() }, { metric_name = 'rmean_max', value = self:max(), label_pairs = self.label_pairs, timestamp = fiber.time64() }, { metric_name = 'rmean_total', value = self.total, label_pairs = self.label_pairs, timestamp = fiber.time64() }, @@ -459,27 +476,30 @@ function collector:collect() end ---Rerolls statistics ----@param dt number +---@param dt number delta time function collector:roll(dt) if self._invalid then return end if dt < 0 then return end local sum = self.sum_value local min = self.min_value local max = self.max_value + local hit = 
self.hit_value local avg = sum[0] / dt local j = math_floor(self.size) while j > dt+0.1 do if j > 0 then - sum[j], min[j], max[j] = sum[j-1], min[j-1], max[j-1] + sum[j], min[j], max[j], hit[j] = sum[j-1], min[j-1], max[j-1], hit[j-1] else + -- j == 0 sum[j] = avg end j = j - 1 end for i = j, 1, -1 do - sum[i], min[i], max[i] = avg, min[0], max[0] + sum[i], min[i], max[i], hit[i] = avg, min[0], max[0], hit[0] end sum[0] = 0 + hit[0] = 0 min[0] = nil max[0] = nil end @@ -488,9 +508,10 @@ end function collector:reset() for i = 0, self.size do self.sum_value[i] = 0 + self.hit_value[i] = 0 + self.min_value[i] = nil + self.max_value[i] = nil end - table.clear(self.min_value) - table.clear(self.max_value) self.total = 0 self.count = 0 end diff --git a/algo/utils.lua b/algo/utils.lua new file mode 100644 index 0000000..6992337 --- /dev/null +++ b/algo/utils.lua @@ -0,0 +1,70 @@ +local type = type +local math_floor = math.floor +local table_new = require('table.new') + +---Appends all k-vs from t2 to t1, if not exists in t1 +---@param t1 table? +---@param t2 table? +local function merge(t1, t2) + if type(t1) ~= 'table' or type(t2) ~= 'table' then return end + + for k in pairs(t2) do + if t1[k] == nil then + t1[k] = t2[k] + end + end +end + +local map_mt = { __serialize = 'map' } +local list_mt = { __serialize = 'seq' } +local weak_mt = { __mode = 'v' } + + +---Creates new list with null values +---@param size number +---@param init number? 
initial value +---@private +---@return number[] +local function new_list(size, init) + size = math_floor(size) + local t = setmetatable(table_new(size, 0), list_mt) + if not init then return t end + for i = 0, size do + t[i] = init + end + return t +end + +---Creates new list with zero values +---@param size number +---@private +---@return number[] +local function new_zero_list(size) + return new_list(size, 0) +end + +local function _list_serialize(list) + local r = {} + for i = 1, #list do + r[i] = tostring(list[i]) + end + return r +end + +local pretty_list_mt = {__serialize = _list_serialize} + +local function make_list_pretty(rv) + return setmetatable(rv, pretty_list_mt) +end + + +return { + merge = merge, + map_mt = map_mt, + list_mt = list_mt, + weak_mt = weak_mt, + + new_list = new_list, + new_zero_list = new_zero_list, + make_list_pretty = make_list_pretty, +} \ No newline at end of file diff --git a/benchmarks/arr_vs_tbl_bench.lua b/benchmarks/arr_vs_tbl_bench.lua new file mode 100644 index 0000000..1742f80 --- /dev/null +++ b/benchmarks/arr_vs_tbl_bench.lua @@ -0,0 +1,219 @@ +--- Benchmarks for array vs table performance +--- +--- Resolution: Tables are faster than arrays with (x2) and without JIT (x3) +local M = {} + +local math_floor = math.floor +local table_new = table.new +local setmetatable = setmetatable +local list_mt = {__serialize='seq'} + +---Creates new list +---@param size number +---@return number[] +local function new_list(size) + size = math_floor(size) + local t = setmetatable(table_new(size, 0), list_mt) + return t +end + +---Creates new list with zero values +---@param size number +---@private +---@return number[] +local function new_zero_list(size) + local t = new_list(size) + -- we start iteration from 0 + -- we abuse how lua stores arrays + for i = 0, size do + t[i] = 0 + end + return t +end + +local lists = { + [1] = new_zero_list(1), + [5] = new_zero_list(5), + [10] = new_zero_list(10), + [20] = new_zero_list(20), + [50] = 
new_zero_list(50), + [100] = new_zero_list(100), +} + + +-- 30ns/op | 0.7691ns/op +---@param b luabench.B +function M.bench_tbls(b) + + for tb_size, _ in pairs(lists) do + b:run(""..tb_size, function (sb) + local key = tb_size + local self = lists + for _ = 1, sb.N do + self[key][0] = self[key][0] + 1 + end + end) + end +end + +function M.bench_sum_tbls(b) + b:skip("skip for now") + for tb_size, _ in pairs(lists) do + b:run(""..tb_size, function (sb) + local key = tb_size + local self = lists + for _ = 1, sb.N do + local s = 0 + for i = 1, tb_size do + s = s + self[key][i] + end + end + end) + end +end + +local ffi = require 'ffi' + +local function new_array(size) + size = math_floor(size) + local t = ffi.new('double[?]', size) + return t +end + +local function new_zero_array(size) + return new_array(size) +end + +local arrs = { + [1] = new_zero_array(1), + [5] = new_zero_array(5), + [10] = new_zero_array(10), + [20] = new_zero_array(20), + [50] = new_zero_array(50), + [100] = new_zero_array(100), +} + +--- 104ns/op | 1.5ns/op +---@param b luabench.B +function M.bench_arrs(b) + b:skip("skip for now") + for size, _ in pairs(arrs) do + b:run(""..size, function (sb) + local key = size + local self = arrs + for _ = 1, sb.N do + self[key][0] = self[key][0] + 1 + end + end) + end +end + +function M.bench_sum_arrs(b) + for size, _ in pairs(arrs) do + b:run(""..size, function (sb) + local key = size + local self = arrs + for _ = 1, sb.N do + local s = 0 + for i = 0, size - 1 do + s = s + self[key][i] + end + end + end) + end +end + +return M + +--[[ +Tarantool version: Tarantool 3.3.0-0-g5fc82b8 +Tarantool build: Darwin-arm64-RelWithDebInfo (static) +Tarantool build flags: -fexceptions -funwind-tables -fasynchronous-unwind-tables -fno-common -fmacro-prefix-map=/var/folders/8x/1m5v3n6d4mn62g9w_65vvt_r0000gn/T/tarantool_install1980638789=. 
-std=c11 -Wall -Wextra -Wno-gnu-alignof-expression -Wno-cast-function-type -O2 -g -DNDEBUG -ggdb -O2 +CPU: Apple M1 @ 8 +JIT: Disabled +Duration: 5s +Global timeout: 60 + +--- BENCH: arr_vs_tbl_bench::bench_arrs:5 +58489223 105.8 ns/op 9454136 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:100 +56514203 108.7 ns/op 9200601 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:50 +56669531 107.2 ns/op 9327758 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:1 +58135028 104.0 ns/op 9618124 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:10 +58186332 104.1 ns/op 9610380 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:20 +54986345 112.9 ns/op 8855729 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:5 +203393139 28.65 ns/op 34902561 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:100 +195765143 32.05 ns/op 31205992 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:50 +196708412 31.85 ns/op 31399384 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:1 +220661816 28.85 ns/op 34661519 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:10 +221371009 28.78 ns/op 34747648 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:20 +178823591 34.93 ns/op 28631065 op/s 0 B/op +928B + +============================================================================== + +Tarantool version: Tarantool 3.3.0-0-g5fc82b8 +Tarantool build: Darwin-arm64-RelWithDebInfo (static) +Tarantool build flags: -fexceptions -funwind-tables -fasynchronous-unwind-tables -fno-common -fmacro-prefix-map=/var/folders/8x/1m5v3n6d4mn62g9w_65vvt_r0000gn/T/tarantool_install1980638789=. 
-std=c11 -Wall -Wextra -Wno-gnu-alignof-expression -Wno-cast-function-type -O2 -g -DNDEBUG -ggdb -O2 +CPU: Apple M1 @ 8 +JIT: Enabled +JIT: fold cse dce fwd dse narrow loop abc sink fuse +Duration: 5s +Global timeout: 60 + +--- BENCH: arr_vs_tbl_bench::bench_arrs:5 +1000000000 1.598 ns/op 625938125 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:100 +1000000000 1.543 ns/op 648050341 op/s 0 B/op +1.78KB + +--- BENCH: arr_vs_tbl_bench::bench_arrs:50 +1000000000 1.538 ns/op 650200554 op/s 0 B/op +880B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:1 +1000000000 1.573 ns/op 635698407 op/s 0 B/op +880B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:10 +1000000000 1.538 ns/op 649993468 op/s 0 B/op +880B + +--- BENCH: arr_vs_tbl_bench::bench_arrs:20 +1000000000 1.539 ns/op 649850274 op/s 0 B/op +880B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:5 +1000000000 0.7691 ns/op 1300297118 op/s 0 B/op +928B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:100 +1000000000 0.7690 ns/op 1300449305 op/s 0 B/op +1.90KB + +--- BENCH: arr_vs_tbl_bench::bench_tbls:50 +1000000000 0.7690 ns/op 1300368134 op/s 0 B/op +880B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:1 +1000000000 0.7691 ns/op 1300185536 op/s 0 B/op +880B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:10 +1000000000 0.7702 ns/op 1298288207 op/s 0 B/op +880B + +--- BENCH: arr_vs_tbl_bench::bench_tbls:20 +1000000000 0.7694 ns/op 1299778258 op/s 0 B/op +880B +]] diff --git a/benchmarks/lists_bench.lua b/benchmarks/lists_bench.lua index 1d7bc5e..19f2b6a 100644 --- a/benchmarks/lists_bench.lua +++ b/benchmarks/lists_bench.lua @@ -30,7 +30,7 @@ function M.bench_slist(b) for _ = 1, b.N do local x = math.random(100) cont[1] = x - local n = sl:get(cont) + local _ = sl:get(cont) end end diff --git a/benchmarks/rmean_bench.lua b/benchmarks/rmean_bench.lua index 5fc89e2..ae7711b 100644 --- a/benchmarks/rmean_bench.lua +++ b/benchmarks/rmean_bench.lua @@ -2,38 +2,48 @@ local M = {} local rmean = require 'algo.rmean' +local random_value = 
math.random(1e9) + local rm = rmean.default:collector() function M.bench_rmean_observe(b) + local rand = random_value for i = 1, b.N do - rm:observe(i) + rm:observe(rand+i) + end +end + +function M.bench_rmean_per_second(b) + for _ = 1, b.N do + local _ = rm:per_second() end end -function M.bench_rmean_mean(b) +function M.bench_rmean_collect(b) for _ = 1, b.N do - local x = rm:per_second() + local _ = rm:collect() end end function M.bench_rmean_sum(b) for _ = 1, b.N do - local x = rm:sum() + local _ = rm:sum() end end function M.bench_rmean_min(b) for _ = 1, b.N do - local x = rm:min() + local _ = rm:min() end end function M.bench_rmean_max(b) for _ = 1, b.N do - local x = rm:max() + local _ = rm:max() end end -function M.bench_rmean_thousands(b) +function M.bench_rmean_new_collector(b) + b:skip() local cols = table.new(1000, 0) for i = 1, b.N do local col = rmean.default:collector() diff --git a/doc/rmean.md b/doc/rmean.md new file mode 100644 index 0000000..1fa8847 --- /dev/null +++ b/doc/rmean.md @@ -0,0 +1,198 @@ +# Rmean (moving average) + +## Usage + +For the most convenient use of the `rmean` module, the `default` instance is provided with a window size of 5 seconds, similar to what is used in Tarantool. 
Here are some examples demonstrating the usage of the `default` rmean instance: + +```lua +local rmean = require('algo.rmean') + +-- Create a new collector in the default rmean instance with the same window size (5 seconds) +local my_collector = rmean.collector('my_collector') +``` + +### Observing Values and Calculating Metrics + +```lua +-- Observe values for the collector +my_collector:observe(10) +my_collector:observe(15) +my_collector:observe(20) + +-- Calculate the moving average per second for the collector +local avg_per_sec = my_collector:per_second() +``` + +### Getting All Registered Collectors in the Default rmean Instance + +```lua +local all_collectors = rmean.getall() +``` + +### Freeing a Specific Collector in the Default rmean Instance + +```lua +-- Free a specific collector +-- The collector will become unusable, though its data will be preserved. +-- This is the proper way to destroy a collector +rmean.free(my_collector) +``` + +### Note + +- The `default` rmean instance is the most preferred way to use `rmean`, as it has a window size of 5 seconds, aligning with the common practice in Tarantool. + +### Collector methods + +The `rmean` module provides methods to efficiently calculate and access various metrics within the moving average collectors. + +#### `sum([depth=window_size])` + +- **Usage**: Retrieves the moving sum value within a specified time depth. +- **When to Use**: This method is useful when you need to track the cumulative sum of values observed by the collector over a specific time period. + +#### `min([depth=window_size])` + +- **Usage**: Returns the moving minimum value within a specified time depth. +- **When to Use**: Use this method when you want to find the minimum value observed by the collector within a specific time window. + +#### `max([depth=window_size])` + +- **Usage**: Retrieves the moving maximum value within a specified time depth. 
+- **When to Use**: Use this method to determine the maximum value observed by the collector within a particular time frame.
+
+#### `count` and `total` fields
+
+- **Count Field**: The `count` field is a monotonic counter of all values collected since the last reset.
+- **Total Field**: The `total` field stores the sum of all values collected by the collector since the last reset.
+- **When to Use**: You can access these fields to keep track of the total count of observations and the cumulative total sum of values collected by the collector.
+
+```lua
+-- Obtain the moving sum value for the last 4 seconds
+local sum_last_4_sec = my_collector:sum(4)
+
+-- Get the minimum value observed in the last 3 seconds
+local min_last_3_sec = my_collector:min(3)
+
+-- Retrieve the maximum value in the last 2 seconds
+local max_last_2_sec = my_collector:max(2)
+
+-- Access the total sum and count fields of the collector
+local total_sum = my_collector.total
+local observation_count = my_collector.count
+```
+
+**Note:** Ensure that the `depth` parameter does not exceed the window size of the collector.
+
+### Integrating with tarantool/metrics
+
+```lua
+local metrics = require('metrics')
+metrics.registry:register(rmean)
+```
+
+After registering `rmean` in `tarantool/metrics`, you can seamlessly collect metrics from all registered named rmean collectors.
+
+### Setting Labels for `rmean` Collectors
+
+Each collector within the `rmean` module allows you to set custom labels to provide additional context or categorization for the collected metrics.
+
+```lua
+-- Set custom labels for a collector
+my_collector:set_labels({ name = 'example_collector', environment = 'production' })
+```
+
+Each collector within the `rmean` module provides metrics suitable for export to Prometheus via the `tarantool/metrics` module. The metrics available for export are as follows:
+
+- **rmean_per_second**: Represents the running average of the collected values.
+- **rmean_sum**: Represents the running sum of the collected values.
+- **rmean_min**: Represents the minimum value observed within the collector's window.
+- **rmean_max**: Represents the maximum value observed within the collector's window.
+- **rmean_count**: Represents the number of observations made by the collector.
+- **rmean_total**: Represents the total sum of all collected values.
+
+### Advanced Usage
+
+1. **Creating a New `rmean` Instance**:
+
+```lua
+local rmean = require('algo.rmean')
+
+-- Create a new rmean instance with a specified name, resolution, and window size
+local my_rmean = rmean.new('my_rmean_instance', 1, 5)
+```
+
+1. **Creating a New Collector**:
+
+   ```lua
+   -- Create a new collector within the rmean instance
+   local new_collector = my_rmean:collector('my_collector', 5)
+   ```
+
+2. **Getting All Collectors**:
+
+   ```lua
+   -- Get all registered collectors within the rmean instance
+   local all_collectors = my_rmean:getall()
+   ```
+
+3. **Getting a Specific Collector**:
+
+   ```lua
+   -- Get a specific collector by name
+   local specific_collector = my_rmean:get('my_collector')
+   ```
+
+4. **Observing Values and Calculating Metrics**:
+
+   ```lua
+   -- Observe a value for a collector
+   specific_collector:observe(10)
+
+   -- Calculate the moving average per second for a collector
+   local avg_per_sec = specific_collector:per_second()
+   ```
+
+5. **Reloading a Collector**:
+
+   ```lua
+   -- Reload a collector from an existing one
+   -- specific_collector will be unusable after executing this call
+   local reloaded_collector = my_rmean:reload(specific_collector)
+   ```
+
+6. **Starting and Stopping the System**:
+
+   ```lua
+   -- Stop the system and disable creating new collectors
+   my_rmean:stop()
+
+   -- Start the system to begin calculating averages
+   my_rmean:start()
+   ```
+
+7. **Freeing Collectors**:
+
+   ```lua
+   -- Free a specific collector
+   my_rmean:free(specific_collector)
+   ```
+
+8. **Metrics Collection**:
+
+   ```lua
+   -- Collect metrics from all registered collectors
+   local metrics_data = my_rmean:collect()
+   ```
+
+9. **Setting Metrics Registry**:
+
+   ```lua
+   -- Set a metrics registry for the rmean instance
+   my_rmean:set_registry(metrics_registry)
+   ```
+
+### Notes
+
+- The system is automatically started when the rmean instance is created. Manual starting is only required if it was previously stopped.
+- The module efficiently handles moving average calculations even with a large number of parallel running collectors and provides high-performance metrics collection capabilities.
diff --git a/test/003_rmean_test.lua b/test/003_rmean_test.lua
index d9dc2f7..0eb6999 100644
--- a/test/003_rmean_test.lua
+++ b/test/003_rmean_test.lua
@@ -55,6 +55,7 @@ function g.test_rmean_destroyer()
     t.assert_covers(clt, {
         { metric_name = 'rmean_per_second', value = 0, label_pairs = { window = 5, name = 'anon' }, timestamp = fiber.time64() },
         { metric_name = 'rmean_sum', value = 0, label_pairs = { window = 5, name = 'anon' }, timestamp = fiber.time64() },
+        { metric_name = 'rmean_mean', value = 0, label_pairs = { window = 5, name = 'anon' }, timestamp = fiber.time64() },
         { metric_name = 'rmean_min', label_pairs = { window = 5, name = 'anon' }, timestamp = fiber.time64() },
         { metric_name = 'rmean_max', label_pairs = { window = 5, name = 'anon' }, timestamp = fiber.time64() },
         { metric_name = 'rmean_total', value = 100, label_pairs = { window = 5, name = 'anon' }, timestamp = fiber.time64() },