This document provides a concise overview of the Heistogram C API, based on the heistogram.h header file.
- `Heistogram`:
  - The core structure representing a Heistogram. It is an opaque struct, so you interact with it primarily through the provided functions.
  - Internally, it manages:
    - `capacity`: Allocated size for buckets.
    - `min_bucket_id`: Optimization for serialization.
    - `total_count`: Total data points added.
    - `min`, `max`: Minimum and maximum values seen.
    - `buckets`: An array of `Bucket` structs.
- `Bucket`:
  - A simple structure representing a histogram bucket.
  - Contains a single member:
    - `count`: Number of data points falling into this bucket.
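The fields listed above could be laid out roughly as in the sketch below. It is only an illustration: the names `BucketSketch`, `HeistogramSketch`, `sketch_new`, and `sketch_free`, as well as every field width, are assumptions, and the authoritative definitions are the ones in `heistogram.h`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirror of the Bucket described above: a single counter. */
typedef struct {
    uint64_t count;               /* data points that fell into this bucket */
} BucketSketch;

/* Hypothetical mirror of the Heistogram fields; the widths are assumed. */
typedef struct {
    uint16_t      capacity;       /* number of allocated buckets */
    uint16_t      min_bucket_id;  /* lowest bucket in use, kept for serialization */
    uint64_t      total_count;    /* total data points added */
    uint64_t      min;            /* smallest value seen */
    uint64_t      max;            /* largest value seen */
    BucketSketch *buckets;        /* array of `capacity` buckets */
} HeistogramSketch;

/* Allocate an empty sketch with `capacity` zeroed buckets.
 * Returns NULL if capacity is 0 or an allocation fails. */
static HeistogramSketch *sketch_new(uint16_t capacity) {
    if (capacity == 0) return NULL;
    HeistogramSketch *h = calloc(1, sizeof *h);
    if (h == NULL) return NULL;
    h->buckets = calloc(capacity, sizeof *h->buckets);
    if (h->buckets == NULL) {
        free(h);
        return NULL;
    }
    h->capacity = capacity;
    return h;
}

/* Release a sketch created by sketch_new; NULL is accepted. */
static void sketch_free(HeistogramSketch *h) {
    if (h == NULL) return;
    free(h->buckets);
    free(h);
}
```

Because the real struct is opaque, application code should never depend on a layout like this one; it is shown only to make the field list concrete.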
This section summarizes the main functions for interacting with Heistogram.
- `Heistogram* heistogram_create(void)`:
  - Description: Creates a new, empty Heistogram object.
  - Parameters: None.
  - Returns: A pointer to the newly created `Heistogram` on success, or `NULL` on failure (memory allocation error). Always check for a `NULL` return.
- `void heistogram_free(Heistogram* h)`:
  - Description: Frees the memory allocated for a `Heistogram` object.
  - Parameters:
    - `h`: A pointer to the `Heistogram` object to be freed.
  - Returns: `void`. Call this function when you are finished with a Heistogram to avoid memory leaks.
- `uint64_t heistogram_count(const Heistogram* h)`:
  - Description: Returns the total number of data points inserted into the histogram.
  - Parameters:
    - `h`: A pointer to a `Heistogram` object (constant, will not be modified).
  - Returns: The total count of values, or `0` if `h` is `NULL`.
- `uint64_t heistogram_max(const Heistogram* h)`:
  - Description: Returns the maximum value inserted into the histogram.
  - Parameters:
    - `h`: A pointer to a `Heistogram` object (constant).
  - Returns: The maximum value, or `0` if `h` is `NULL` or no data has been inserted.
- `uint64_t heistogram_min(const Heistogram* h)`:
  - Description: Returns the minimum value inserted into the histogram.
  - Parameters:
    - `h`: A pointer to a `Heistogram` object (constant).
  - Returns: The minimum value, or `0` if `h` is `NULL` or no data has been inserted.
- `uint32_t heistogram_memory_size(const Heistogram* h)`:
  - Description: Returns the approximate memory size (in bytes) used by the Heistogram object and its internal bucket storage.
  - Parameters:
    - `h`: A pointer to a `Heistogram` object (constant).
  - Returns: The memory size in bytes, or `0` if `h` is `NULL`.
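- `void heistogram_add(Heistogram* h, uint64_t value)`:
  - Description: Adds a single integer value to the Heistogram.
  - Parameters:
    - `h`: A pointer to the `Heistogram` object.
    - `value`: The `uint64_t` value to insert. Values must be non-negative; because the parameter is unsigned, a negative signed integer passed here is converted to a very large value.
  - Returns: `void`.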
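Since `heistogram_add` takes a `uint64_t`, floating-point measurements need to be converted before insertion. The helper below is a minimal sketch of one way to do that by scaling and rounding; `scale_to_u64` is a hypothetical name and not part of the Heistogram API, and the clamping policy is an assumption chosen for the example.

```c
#include <stdint.h>

/* Hypothetical helper: scale a measurement (for example seconds to
 * microseconds with scale = 1e6) and round to the nearest integer.
 * Negative, zero, and NaN inputs map to 0; values too large for
 * uint64_t clamp to UINT64_MAX. */
static uint64_t scale_to_u64(double value, double scale) {
    double scaled = value * scale;
    if (!(scaled > 0.0)) return 0;                 /* negative, zero, or NaN */
    if (scaled >= 18446744073709551616.0) {        /* 2^64 */
        return UINT64_MAX;
    }
    return (uint64_t)(scaled + 0.5);               /* round half up */
}
```

The result would then be passed as the `value` argument of `heistogram_add`. Any percentile read back from the histogram is in the scaled unit, so divide it by the same `scale` to recover the original unit.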
- `Heistogram* heistogram_merge(const Heistogram* h1, const Heistogram* h2)`:
  - Description: Merges two in-memory Heistograms (`h1` and `h2`) into a new Heistogram object. This is a non-destructive merge; `h1` and `h2` are unchanged.
  - Parameters:
    - `h1`: Pointer to the first `Heistogram` (constant).
    - `h2`: Pointer to the second `Heistogram` (constant).
  - Returns: A pointer to a new `Heistogram` object representing the merged result, or `NULL` on error (e.g., memory allocation failure or if either input is `NULL`). Free the returned histogram when done.
- `int heistogram_merge_in_place(Heistogram* h1, const Heistogram* h2)`:
  - Description: Merges the contents of `h2` into `h1`, modifying `h1` directly. This is an in-place merge. `h2` remains unchanged.
  - Parameters:
    - `h1`: Pointer to the destination `Heistogram` (will be modified).
    - `h2`: Pointer to the `Heistogram` to merge from (constant).
  - Returns: `1` on success, `0` on failure (e.g., if either input is `NULL`, or if `h1` needs to grow and memory reallocation fails).
- `Heistogram* heistogram_merge_serialized(const Heistogram* h, const void* buffer, size_t size)`:
  - Description: Merges a serialized Heistogram (from `buffer` and `size`) with an in-memory Heistogram `h`. The result is returned as a new Heistogram object.
  - Parameters:
    - `h`: Pointer to the in-memory `Heistogram`.
    - `buffer`: Pointer to the serialized Heistogram data.
    - `size`: Size of the serialized Heistogram data in bytes.
  - Returns: A pointer to a new `Heistogram` object representing the merged result, or `NULL` on error (e.g., if inputs are invalid, deserialization fails, or memory allocation fails). Free the returned histogram when done.
- `Heistogram* heistogram_merge_two_serialized(const void* buffer1, size_t size1, const void* buffer2, size_t size2)`:
  - Description: Merges two serialized Heistograms (from `buffer1`/`size1` and `buffer2`/`size2`) into a new Heistogram object.
  - Parameters:
    - `buffer1`, `size1`: Serialized data and size for the first Heistogram.
    - `buffer2`, `size2`: Serialized data and size for the second Heistogram.
  - Returns: A pointer to a new `Heistogram` object representing the merged result, or `NULL` on error (e.g., if inputs are invalid, deserialization fails, or memory allocation fails). Free the returned histogram when done.
- `int heistogram_merge_inplace_serialized(Heistogram* h, const void* buffer, size_t size)`:
  - Description: Merges a serialized Heistogram (from `buffer` and `size`) in place into an existing in-memory Heistogram `h`, modifying `h` directly.
  - Parameters:
    - `h`: Pointer to the destination in-memory `Heistogram` (will be modified).
    - `buffer`: Pointer to the serialized Heistogram data.
    - `size`: Size of the serialized Heistogram data in bytes.
  - Returns: `1` on success, `0` on failure (e.g., if inputs are invalid, deserialization fails, or `h` needs to grow and memory reallocation fails).
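Conceptually, merging two histograms that share a bucket layout amounts to adding their per-bucket counts. The sketch below illustrates that idea and the `1`/`0` return convention of the in-place variants on a simplified stand-in type. `CountsSketch` and `counts_merge_in_place` are hypothetical names, min/max tracking is left out, and nothing here is taken from the library's actual implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a histogram: counts indexed by bucket id. */
typedef struct {
    size_t    capacity;      /* number of allocated buckets */
    uint64_t  total_count;   /* sum of all bucket counts */
    uint64_t *counts;        /* heap-allocated array of `capacity` counters */
} CountsSketch;

/* Merge src into dst in place.
 * Returns 1 on success, 0 on failure (NULL input or failed reallocation).
 * On failure dst is left unchanged; src is never modified. */
static int counts_merge_in_place(CountsSketch *dst, const CountsSketch *src) {
    if (dst == NULL || src == NULL) return 0;

    /* Grow the destination if the source uses more buckets. */
    if (src->capacity > dst->capacity) {
        uint64_t *grown = realloc(dst->counts, src->capacity * sizeof *grown);
        if (grown == NULL) return 0;
        for (size_t i = dst->capacity; i < src->capacity; i++) {
            grown[i] = 0;
        }
        dst->counts = grown;
        dst->capacity = src->capacity;
    }

    /* Bucket-wise addition of counts. */
    for (size_t i = 0; i < src->capacity; i++) {
        dst->counts[i] += src->counts[i];
    }
    dst->total_count += src->total_count;
    return 1;
}
```

A non-destructive merge in the style of `heistogram_merge` could be built on the same idea by copying one input into a fresh object first and merging the other into the copy.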
- `double heistogram_percentile(const Heistogram* h, double p)`:
  - Description: Calculates the approximate p-th percentile from the Heistogram data.
  - Parameters:
    - `h`: A pointer to a `Heistogram` object (constant).
    - `p`: The percentile to calculate (a `double` between 0.0 and 100.0, inclusive).
  - Returns: The approximate p-th percentile value as a `double`. Returns `0` if `h` is `NULL` or `p` is out of range.
- `void heistogram_percentiles(const Heistogram* h, const double* percentiles, size_t num_percentiles, double* results)`:
  - Description: Calculates multiple percentiles in a single pass for efficiency.
  - Parameters:
    - `h`: A pointer to a `Heistogram` object (constant).
    - `percentiles`: An array of `double` percentile values (between 0.0 and 100.0).
    - `num_percentiles`: The number of percentiles in the `percentiles` array.
    - `results`: A pre-allocated array of `double` of size `num_percentiles` where the results will be stored.
  - Returns: `void`. The calculated percentiles are placed in the `results` array, in the same order as the input `percentiles` array.
- `double heistogram_prank(const Heistogram* h, double value)`:
  - Description: Calculates the percentile rank (or p-rank) of a given value within the Heistogram's distribution. This is the approximate percentile below which the given `value` falls.
  - Parameters:
    - `h`: A pointer to a `Heistogram` object (constant).
    - `value`: The value for which to calculate the percentile rank (a `double`).
  - Returns: The percentile rank of the value as a `double` (between 0.0 and 100.0). Returns `0` if `h` is `NULL` or `value` is negative, and `100` if `value` is greater than or equal to the maximum value in the histogram.
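The percentile and percentile-rank functions return approximations, so it can help to have exact reference values to compare them against in tests. The sketch below computes both quantities exactly on a small sorted array. `exact_prank` and `exact_percentile` are hypothetical helpers using the nearest-rank definition, which is an assumption for this example and not necessarily the interpolation Heistogram uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Exact percentile rank: the percentage of samples that are <= value.
 * `sorted` must be in ascending order. Returns 0 for empty input. */
static double exact_prank(const uint64_t *sorted, size_t n, uint64_t value) {
    if (sorted == NULL || n == 0) return 0.0;
    size_t at_or_below = 0;
    while (at_or_below < n && sorted[at_or_below] <= value) {
        at_or_below++;
    }
    return 100.0 * (double)at_or_below / (double)n;
}

/* Exact nearest-rank percentile: the smallest sample such that at least
 * p percent of the samples are <= it. Returns 0 for empty input or when
 * p is outside [0, 100]. */
static uint64_t exact_percentile(const uint64_t *sorted, size_t n, double p) {
    if (sorted == NULL || n == 0 || !(p >= 0.0 && p <= 100.0)) return 0;
    double pos = p * (double)n / 100.0;
    size_t rank = (size_t)pos;            /* truncate toward zero */
    if ((double)rank < pos) rank++;       /* round up without libm */
    if (rank == 0) rank = 1;              /* p == 0 maps to the minimum */
    return sorted[rank - 1];
}
```

Comparing these exact values with the results of `heistogram_percentile` and `heistogram_prank` on the same data gives a direct measure of the approximation error.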
- `void* heistogram_serialize(const Heistogram* h, size_t* size)`:
  - Description: Serializes a Heistogram object into a byte buffer for storage or transmission.
  - Parameters:
    - `h`: A pointer to the `Heistogram` object to serialize (constant).
    - `size`: A pointer to a `size_t` variable where the size (in bytes) of the serialized data will be written.
  - Returns: A pointer to a dynamically allocated byte buffer containing the serialized Heistogram data, or `NULL` on error (e.g., if `h` or `size` is `NULL` or memory allocation fails). You are responsible for freeing this buffer with `free()` when you are finished with it.
- `Heistogram* heistogram_deserialize(const void* buffer, size_t size)`:
  - Description: Deserializes a Heistogram object from a byte buffer.
  - Parameters:
    - `buffer`: A pointer to the byte buffer containing the serialized Heistogram data.
    - `size`: The size of the serialized data in bytes.
  - Returns: A pointer to a newly created `Heistogram` object deserialized from the buffer, or `NULL` on error (e.g., if `buffer` is `NULL`, deserialization fails, or memory allocation fails). Free the returned histogram when done.
- `double heistogram_percentile_serialized(const void* buffer, size_t size, double p)`:
  - Description: Calculates the approximate p-th percentile directly from serialized Heistogram data, without deserializing the entire histogram. This is a key performance feature.
  - Parameters:
    - `buffer`: A pointer to the byte buffer containing the serialized Heistogram data.
    - `size`: The size of the serialized data in bytes.
    - `p`: The percentile to calculate (a `double` between 0.0 and 100.0).
  - Returns: The approximate p-th percentile value as a `double`. Returns `0` if inputs are invalid or deserialization fails.
- `void heistogram_percentiles_serialized(const void* buffer, size_t size, const double* percentiles, size_t num_percentiles, double* results)`:
  - Description: Calculates multiple percentiles directly from serialized Heistogram data in a single pass, without full deserialization. Optimized for bulk percentile queries on serialized data.
  - Parameters:
    - `buffer`: A pointer to the byte buffer containing the serialized Heistogram data.
    - `size`: The size of the serialized data in bytes.
    - `percentiles`: An array of `double` percentile values.
    - `num_percentiles`: The number of percentiles to calculate.
    - `results`: A pre-allocated `double` array to store the percentile results.
  - Returns: `void`. Results are written into the `results` array.
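This document notes that the serialized form uses varint encoding for counts and metadata. The exact byte layout is not specified here, so the sketch below shows only the widely used base-128 (LEB128-style) varint scheme as an illustration of the idea. `varint_encode` and `varint_decode` are hypothetical names, and Heistogram's own encoding may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode `value` as a base-128 varint: 7 payload bits per byte, with the
 * high bit set on every byte except the last. `out` must have room for
 * 10 bytes. Returns the number of bytes written (1 to 10). */
static size_t varint_encode(uint64_t value, uint8_t *out) {
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (uint8_t)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (uint8_t)value;
    return n;
}

/* Decode one varint from `in` (at most `len` bytes available).
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than 10 bytes. */
static size_t varint_decode(const uint8_t *in, size_t len, uint64_t *value) {
    uint64_t result = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        result |= (uint64_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *value = result;
            return i + 1;
        }
    }
    return 0;
}
```

Small numbers, which are typical for bucket counts, occupy a single byte in this scheme, which is why varints tend to keep serialized histograms compact.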
- Error Handling: Many functions return `NULL` or `0` on failure. Always check return values, especially from `heistogram_create`, `heistogram_deserialize`, `heistogram_merge`, `heistogram_merge_serialized`, and `heistogram_serialize`, to handle potential errors (such as memory allocation failures).
- Memory Management: You are responsible for freeing what these functions allocate. Histograms returned by `heistogram_create`, `heistogram_deserialize`, `heistogram_merge`, `heistogram_merge_serialized`, and `heistogram_merge_two_serialized` are released with `heistogram_free()`; the buffer returned by `heistogram_serialize` is released with `free()`.
- Integer Data: Heistogram is optimized for integer data (`uint64_t`). If you have floating-point data, consider scaling and rounding to integers before inserting.
- Growth Factor (`HEIST_GROWTH_FACTOR`): The header defines a `HEIST_GROWTH_FACTOR` (currently `0.02`). This parameter controls the accuracy and size of the histogram. While it is defined as a static global, it is likely intended to be configurable at compile time if you need to adjust the error bound. Consult the documentation for recommended values and the implications of changing it.
- Varint Encoding: Heistogram uses varint encoding for serialization, which is efficient for compressing integer counts and metadata. This is handled internally.
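To give a feel for how a growth factor trades accuracy against size, the sketch below uses geometric buckets in which each bucket's upper bound is `1 + g` times its lower bound. This document does not describe Heistogram's real value-to-bucket mapping, so the scheme and the names `SKETCH_GROWTH_FACTOR`, `sketch_bucket_id`, and `sketch_bucket_lower` are assumptions made purely for illustration.

```c
#include <stdint.h>

/* Hypothetical constant mirroring the documented value of 0.02. */
#define SKETCH_GROWTH_FACTOR 0.02

/* Return the index i of the geometric bucket [(1+g)^i, (1+g)^(i+1))
 * that contains `value`. Values 0 and 1 both land in bucket 0.
 * Repeated multiplication is used instead of log() to avoid libm. */
static uint32_t sketch_bucket_id(uint64_t value) {
    uint32_t id = 0;
    double upper = 1.0 + SKETCH_GROWTH_FACTOR;
    while ((double)value >= upper) {
        upper *= 1.0 + SKETCH_GROWTH_FACTOR;
        id++;
    }
    return id;
}

/* Lower bound of bucket `id`, computed with the same multiplications
 * as sketch_bucket_id so the two functions agree exactly. */
static double sketch_bucket_lower(uint32_t id) {
    double lower = 1.0;
    for (uint32_t i = 0; i < id; i++) {
        lower *= 1.0 + SKETCH_GROWTH_FACTOR;
    }
    return lower;
}
```

Under this scheme every value in a bucket is within a factor of `1 + g` of the bucket's lower bound, so a smaller growth factor tightens the relative error while increasing the number of buckets needed to cover the same range.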
This summary should provide a good starting point for understanding and using the Heistogram C API. For more detailed information, please refer to the complete documentation and examples (when available).