processor_tda: Implement Topological Data Analysis (TDA) plugin for metrics #11250
Open: cosmo0920 wants to merge 19 commits into master from cosmo0920-ripser-for-analysis
4d1c797 build: lib: Bundle Ripser for analysis (cosmo0920)
0ac5133 lib: ripser: Add callbacks to retrieve internal results (cosmo0920)
6e9844a lib: ripser: Provide C wrapper for ripser (cosmo0920)
5a59894 tests: internal: ripser: Add internal test case for TDA library (cosmo0920)
10c0d26 processor_tda: add TDA metrics processor based on ripser (cosmo0920)
b528678 processor_tda: Make groups to construct point cloud for TDA calculations (cosmo0920)
45ee9e0 processor_tda: Make a delay embedded cabability (cosmo0920)
36662c7 processor_tda: Provide parameters for TDA process (cosmo0920)
c667b85 processor_tda: Extract structs into a header (cosmo0920)
1f9a411 processor_tda: Make threshold configurable (cosmo0920)
aeb3f12 lib: ripser: Fix MSVC errors in ripser's explicit template specializa… (cosmo0920)
09f73a5 dockerfiles: Fix CentOS 7 build for disabling ripser support (cosmo0920)
1f2b865 packaging: centos: Handle ripser support properly (cosmo0920)
d7c2c89 tests: internal: ripser: Fix building errors on Windows (cosmo0920)
a38a0e5 processor_tda: Remove metrics suffix from its name (cosmo0920)
f2b2fec build: Fix a typo (cosmo0920)
c8b550d processor_tda: Use precise value for dimension (cosmo0920)
2f588d2 processor_tda: Add a note (cosmo0920)
86b83e5 processor_tda: Plug allocated assignments after releasing (cosmo0920)
@@ -0,0 +1,99 @@

```c
/* -*- Mode: C; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */

/* Fluent Bit
 * ==========
 * Copyright (C) 2025 The Fluent Bit Authors
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

#ifndef FLB_RIPSER_WRAPPER_H
#define FLB_RIPSER_WRAPPER_H

#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

#define FLB_RIPSER_MAX_BETTI_DIM 3
/* Represents a single persistent homology interval [birth, death).
 * death < 0 indicates an infinite interval.
 */
typedef struct flb_ripser_interval {
    int dim;     /* homology dimension (0,1,2,...) */
    float birth; /* birth radius */
    float death; /* death radius; negative means "infinity" */
} flb_ripser_interval;

/* Summary of Betti numbers.
 * Up to 8 dimensions supported for practical purposes.
 */
typedef struct flb_ripser_betti {
    int max_dim;  /* maximum computed dimension */
    int num_dims; /* number of valid dimensions (0..num_dims-1) */
    int betti[8]; /* Betti numbers for each dimension */
} flb_ripser_betti;

/*
 * Compute Betti numbers from a dense distance matrix.
 *
 * Parameters:
 *   dist_matrix: row-major dense matrix [n_points * n_points], diagonal = 0
 *   n_points: number of points
 *   max_dim: maximum homology dimension to compute
 *   threshold: Rips complex cutoff; if <= 0, use "enclosing radius" (Ripser default)
 *   out_betti: filled with Betti number results
 *
 * Returns:
 *   0 on success
 *   <0 on error (e.g., invalid arguments)
 */
int flb_ripser_compute_betti_from_dense_distance(
    const float *dist_matrix,
    size_t n_points,
    int max_dim,
    float threshold,
    flb_ripser_betti *out_betti);

/*
 * Callback type for retrieving each persistent interval.
 *
 * interval_cb is invoked once for every interval [birth, death).
 * `user_data` is passed through unchanged.
 */
typedef void (*flb_ripser_interval_cb)(
    const flb_ripser_interval *interval,
    void *user_data);

/*
 * Compute all persistent intervals from a dense distance matrix,
 * delivering the result through a callback.
 *
 * Returns:
 *   0 on success
 *   <0 on error
 */
int flb_ripser_compute_intervals_from_dense_distance(
    const float *dist_matrix,
    size_t n_points,
    int max_dim,
    float threshold,
    flb_ripser_interval_cb interval_cb,
    void *user_data);

#ifdef __cplusplus
}
#endif

#endif /* FLB_RIPSER_WRAPPER_H */
```
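The header above documents `dist_matrix` as a row-major `n_points * n_points` array with a zero diagonal, and says a `threshold <= 0` selects the enclosing radius. The following is a minimal sketch of that layout, not code from this PR: `fill_abs_distance` and `enclosing_radius` are hypothetical helper names, the absolute difference between scalar samples is a toy metric chosen only to keep the example short, and the enclosing radius is computed here as the smallest row maximum of the matrix, which is an assumption about what the wrapper does internally.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper (not part of the PR): fill a row-major
 * n_points x n_points matrix with |a - b| between scalar samples. */
static void fill_abs_distance(const float *samples, size_t n_points,
                              float *dist_matrix)
{
    size_t i;
    size_t j;
    float diff;

    assert(samples != NULL && dist_matrix != NULL);

    for (i = 0; i < n_points; i++) {
        dist_matrix[i * n_points + i] = 0.0f; /* diagonal = 0 */
        for (j = i + 1; j < n_points; j++) {
            diff = samples[i] - samples[j];
            if (diff < 0.0f) {
                diff = -diff;
            }
            /* store both halves so the matrix is symmetric */
            dist_matrix[i * n_points + j] = diff;
            dist_matrix[j * n_points + i] = diff;
        }
    }
}

/* Hypothetical helper: smallest over all points of the largest distance
 * from that point. Returns FLT_MAX when n_points is 0. */
static float enclosing_radius(const float *dist_matrix, size_t n_points)
{
    size_t i;
    size_t j;
    float radius = FLT_MAX;
    float row_max;

    assert(dist_matrix != NULL);

    for (i = 0; i < n_points; i++) {
        row_max = 0.0f;
        for (j = 0; j < n_points; j++) {
            if (dist_matrix[i * n_points + j] > row_max) {
                row_max = dist_matrix[i * n_points + j];
            }
        }
        if (row_max < radius) {
            radius = row_max;
        }
    }
    return radius;
}
```

A matrix built this way could presumably be passed as `dist_matrix` to `flb_ripser_compute_betti_from_dense_distance`, with `threshold` set to `0` to let the library pick the cutoff itself.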
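The callback interface in the wrapper header delivers one `flb_ripser_interval` per call and marks an infinite interval with a negative `death`. A rough sketch of a consumer is shown below. The struct is a local copy of the one declared in the header so that the example compiles on its own; `interval_tally`, `tally_interval_cb`, and `SKETCH_MAX_DIMS` are hypothetical names introduced for illustration and do not appear in the PR.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the struct from the wrapper header, so the sketch
 * compiles standalone. Real code would include the header instead. */
typedef struct flb_ripser_interval {
    int dim;
    float birth;
    float death; /* negative means "infinity" */
} flb_ripser_interval;

#define SKETCH_MAX_DIMS 8 /* hypothetical; mirrors betti[8] in the header */

/* Hypothetical accumulator passed through user_data. */
struct interval_tally {
    int finite[SKETCH_MAX_DIMS];   /* intervals with a finite death */
    int infinite[SKETCH_MAX_DIMS]; /* intervals that never die */
    float max_persistence;         /* longest finite death - birth */
};

/* Matches the flb_ripser_interval_cb signature from the header. */
static void tally_interval_cb(const flb_ripser_interval *interval,
                              void *user_data)
{
    struct interval_tally *tally = user_data;
    float persistence;

    assert(interval != NULL && tally != NULL);

    if (interval->dim < 0 || interval->dim >= SKETCH_MAX_DIMS) {
        return; /* ignore dimensions the tally cannot hold */
    }
    if (interval->death < 0.0f) {
        tally->infinite[interval->dim]++;
        return;
    }
    tally->finite[interval->dim]++;
    persistence = interval->death - interval->birth;
    if (persistence > tally->max_persistence) {
        tally->max_persistence = persistence;
    }
}
```

With the real header included, `tally_interval_cb` and a pointer to a zero-initialized `struct interval_tally` would be the `interval_cb` and `user_data` arguments of `flb_ripser_compute_intervals_from_dense_distance`.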
@@ -0,0 +1,4 @@

```
DerivedData
ripser.dSYM
ripser
ripser-coeff
```
@@ -0,0 +1,3 @@

```
[submodule "robin-hood-hashing"]
	path = robin-hood-hashing
	url = https://github.com/martinus/robin-hood-hashing.git
```
@@ -0,0 +1,10 @@

```cmake
add_library(ripser-static STATIC
  ripser.cpp # upstream + patched version
)

target_include_directories(ripser-static
  PUBLIC
    ${CMAKE_CURRENT_SOURCE_DIR} # ripser_internal.hpp
)

target_compile_features(ripser-static PUBLIC cxx_std_11)
```
@@ -0,0 +1,9 @@

```text
You are under no obligation whatsoever to provide any bug fixes, patches, or
upgrades to the features, functionality or performance of the source code
("Enhancements") to anyone; however, if you choose to make your Enhancements
available either publicly, or directly to the author of this software, without
imposing a separate written license agreement for such Enhancements, then you
hereby grant the following license: a non-exclusive, royalty-free perpetual
license to install, use, modify, prepare derivative works, incorporate into
other computer software, distribute, and sublicense such enhancements or
derivative works thereof, in binary and source code form.
```
@@ -0,0 +1,21 @@

```text
MIT License

Copyright (c) 2015–2018 Ulrich Bauer

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```
@@ -0,0 +1,18 @@

```make
build: ripser


all: ripser ripser-coeff ripser-debug


ripser: ripser.cpp
	c++ -std=c++11 -Wall ripser.cpp -o ripser -O3 -D NDEBUG

ripser-coeff: ripser.cpp
	c++ -std=c++11 -Wall ripser.cpp -o ripser-coeff -O3 -D NDEBUG -D USE_COEFFICIENTS

ripser-debug: ripser.cpp
	c++ -std=c++11 -Wall ripser.cpp -o ripser-debug -g


clean:
	rm -f ripser ripser-coeff ripser-debug
```
@@ -0,0 +1,131 @@

````markdown
# Ripser

Copyright © 2015–2021 [Ulrich Bauer].


### Description

Ripser is a lean C++ code for the computation of Vietoris–Rips persistence barcodes. It can do just this one thing, but does it extremely well.

To see a live demo of Ripser's capabilities, go to [live.ripser.org]. The computation happens inside the browser (using [Emscripten] to compile Ripser to [WebAssembly], supported on recent browsers).

The main features of Ripser:

- time- and memory-efficient
- only about 1000 lines of code in a single C++ file
- support for coefficients in prime finite fields
- no external dependencies (optional support for Google's [sparsehash])

Currently, Ripser outperforms other codes ([Dionysus], [DIPHA], [GUDHI], [Perseus], [PHAT]) by a factor of more than 40 in computation time and a factor of more than 15 in memory efficiency (for the example linked at [live.ripser.org]). (Note that [PHAT] does not contain code for generating Vietoris–Rips filtrations).

Input formats currently supported by Ripser:

- comma-separated values lower triangular distance matrix
- comma-separated values upper triangular distance matrix (MATLAB output from the function `pdist`)
- comma-separated values full distance matrix
- [DIPHA] distance matrix data
- sparse distance matrix in sparse triplet format
- binary lower triangular distance matrix
- point cloud data

Ripser's efficiency is based on a few important concepts and principles, building on key previous and concurrent developments by other researchers in computational topology:

- Compute persistent *co*homology (as suggested by [Vin de Silva, Dmitriy Morozov, and Mikael Vejdemo-Johansson](https://doi.org/10.1088/0266-5611/27/12/124003))
- Use the chain complex property that boundaries are cycles
  (employ the *clearing* optimization, aka *persistence with a twist*, as suggested by [Chao Chen and Michael Kerber](http://www.geometrie.tugraz.at/kerber/kerber_papers/ck-phcwat-11.pdf))
- If no threshold is specified, choose the *enclosing radius* as the threshold, from which on homology is guaranteed to be trivial (as suggested by [Greg Henselman-Petrusek](https://github.com/Eetion/Eirene.jl))
- Don't store information that can be readily recomputed (in particular, the original and the reduced boundary matrix)
- Take computational shortcuts (*apparent* and *emergent persistence pairs*)


### Version
[Latest release][latest-release]: 1.2.1 (March 2021)


### Building

Ripser requires a C++11 compiler. Here is how to obtain, build, and run Ripser:

```sh
git clone https://github.com/Ripser/ripser.git
cd ripser
make
./ripser examples/sphere_3_192.lower_distance_matrix
```


### Options

Ripser supports several compile-time options. They are switched on by defining the C preprocessor macros listed below, either using `#define` in the code or by passing an argument to the compiler. The following options are supported:

- `USE_COEFFICIENTS`: enable support for coefficients in a prime field
- `INDICATE_PROGRESS`: indicate the current progress in the console
- `PRINT_PERSISTENCE_PAIRS`: output the computed persistence pairs (enabled by default in the code; comment out to disable)
- `USE_ROBINHOOD_HASHMAP`: enable support for Martin Ankerl's [robinhoodhash] data structure; may further reduce memory footprint

For example, to build Ripser with support for Martin Ankerl's robin hood hashmap:

```sh
$ c++ -std=c++11 ripser.cpp -o ripser -O3 -D NDEBUG -D USE_ROBINHOOD_HASHMAP
```

A Makefile is provided with some variants of the above options. Use `make all` to build them. The default `make` builds a binary with the default options.

The input is given either in a file whose name is passed as an argument, or through stdin. The following options are supported at the command line:

- `--format`: use the specified file format for the input. The following formats are supported:
  - `lower-distance`: lower triangular distance matrix; a comma (or whitespace, or other non-numerical character) separated list of the distance matrix entries below the diagonal, sorted lexicographically by row index, then column index.
  - `upper-distance`: upper triangular distance matrix; similar to the previous, but for the entries above the diagonal; suitable for output from the MATLAB functions `pdist` or `seqpdist`, exported to a CSV file.
  - `distance` (default if no format is specified): full distance matrix; similar to the above, but for all entries of the distance matrix. One line per row of the matrix; only the part below the diagonal is actually read.
  - `dipha`: DIPHA distance matrix as described on the [DIPHA] website.
  - `point-cloud`: point cloud; a comma (or whitespace, or other non-numerical character) separated list of coordinates of the points in some Euclidean space, one point per line.
  - `binary`: lower distance matrix in binary file format; a sequence of the distance matrix entries below the diagonal in 32 bit float format (IEEE 754, single, little endian).
  - `sparse`: sparse triplet format; a whitespace separated list of entries of a sparse distance matrix, one entry per line, each of the form *i j d(i,j)* specifying the distance between points *i* and *j*. Each pair of points should appear in the file at most once.
- `--dim k`: compute persistent homology up to dimension *k*.
- `--threshold t`: compute Rips complexes up to diameter *t*.
- `--modulus p`: compute homology with coefficients in the prime field Z/*p*Z (only available when built with the option `USE_COEFFICIENTS`).
- `--ratio r`: only show persistence pairs with death/birth ratio > *r*.



### Experimental features

The following experimental features are currently available in separate branches:

- `representative-cocycles`: output of representative cocycles for persistent cohomology.
- `representative-cycles`: computation and output of representative cycles for persistent homology (in the standard version, only *co*cycles are computed).
- `simple`: a simplified version of Ripser, without support for sparse distance matrices and coefficients. This might be a good starting point for exploring the code.


### Citing

If you use Ripser in your research or if you want to give a reference to Ripser in a paper, you may use the following bibtex entry (will be updated with complete publication data):

```
@misc{1908.02518,
  Author = {Ulrich Bauer},
  Title = {Ripser: efficient computation of Vietoris-Rips persistence barcodes},
  Month = Feb,
  Year = {2021},
  Eprint = {1908.02518v2},
  Note = {Preprint}
}
```


### License

Ripser is licensed under the [MIT] license (`COPYING.txt`), with an extra clause (`CONTRIBUTING.txt`) clarifying the license for modifications released without an explicit written license agreement. Please contact the author if you want to use Ripser in your software under a different license.

[Ulrich Bauer]: <http://ulrich-bauer.org>
[live.ripser.org]: <http://live.ripser.org>
[Emscripten]: <http://emscripten.org>
[WebAssembly]: <https://webassembly.org>
[latest-release]: <https://github.com/Ripser/ripser/releases/latest>
[Dionysus]: <http://www.mrzv.org/software/dionysus/>
[DIPHA]: <http://git.io/dipha>
[PHAT]: <http://git.io/dipha>
[Perseus]: <http://www.sas.upenn.edu/~vnanda/perseus/>
[GUDHI]: <http://gudhi.gforge.inria.fr>
[robinhoodhash]: <https://github.com/martinus/robin-hood-hashing>
[MIT]: <https://opensource.org/licenses/mit-license.php>
````
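The README's `lower-distance` format lists only the entries below the diagonal, ordered by row index and then column index, whereas the C wrapper in this PR takes a full row-major matrix. The sketch below shows one way the two layouts might be bridged. `expand_lower_distance` is a hypothetical helper written for illustration; it is not part of Ripser or of this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expand entries below the diagonal, ordered
 * (1,0), (2,0), (2,1), (3,0), ..., into a symmetric row-major
 * n_points x n_points matrix with a zero diagonal.
 * `lower` must hold n_points * (n_points - 1) / 2 values. */
static void expand_lower_distance(const float *lower, size_t n_points,
                                  float *dist_matrix)
{
    size_t i;
    size_t j;
    size_t k = 0; /* running index into `lower` */

    assert(dist_matrix != NULL);
    assert(lower != NULL || n_points < 2);

    for (i = 0; i < n_points; i++) {
        for (j = 0; j < i; j++) {
            dist_matrix[i * n_points + j] = lower[k];
            dist_matrix[j * n_points + i] = lower[k]; /* mirror */
            k++;
        }
        dist_matrix[i * n_points + i] = 0.0f;
    }
}
```

For three points, the input `{d(1,0), d(2,0), d(2,1)}` fills all nine cells; the mirrored writes keep the matrix symmetric, which the dense-matrix functions in the wrapper header appear to assume.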