This crate provides an efficient implementation of the newly invented DartUniFrac algorithm for large-scale UniFrac computation (weighted and unweighted). We named this new algorithm DartUniFrac because the key step is to use DartMinHash or Efficient Rejection Sampling (or ERS) on branches and the DartMinHash/ERS is about "Among the first r darts thrown, return those hitting
On Linux or MacOS
conda install -c bioconda -c conda-forge dartunifracRun on example data
wget https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.3/GWMC_16S_otutab.biom
wget https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.3/GWMC_rep_seqs_all.tre
dartunifrac -t ./GWMC_rep_seqs_all.tre -b ./GWMC_16S_otutab.biom -m dmh -s 3072 -o unifrac_unweighted.tsv
dartunifrac -t ./GWMC_rep_seqs_all.tre -b ./GWMC_16S_otutab.biom --weighted -m dmh -s 3072 -o unifrac_weighted.tsv
### obtain the truth via striped unifrac algorithm (SIMD supported), extremely slow at the million-sample scale
striped_unifrac -t ./GWMC_rep_seqs_all.tre -m ./GWMC_16S_otutab.biom --weighted -o unifrac_weighted_striped.tsv
UniFrac (unweighted) can be simply described as unique branches that differ two samples over shared branches (see original UniFrac paper here). Here, each sample has some taxa (or features) that are in the phylogenetic tree.
If we reformulate UniFrac in math, it can be descriped below:
Therefore:
since
here,
In summary, unweighted UniFrac distance can be considered as weighted Jaccard distance on branches. A fast PCoA on the resulting DartUniFrac distance can also be computed. We rely on fixed rank subspace iteration style randomized SVD, see below.
Weighted UniFrac measuers the mass accumulate (taxa weight) over branches(see here). Here, Weighted UniFrac is the normalized weighted UniFrac. A similar idea can be used to derive that Weighted UniFrac is a simple transformation from weighted Jaccard
where
We first created a few libraries for the best performance of DartUniFrac implementation.
1.Optimal representation of balanced parenthesis for phylogenetic trees via succparen
2.Implementation of DartMinHash and Efficient Rejection Sampling algorithms can be found here.
3.SIMD-aware Hamming similarity for computing hash collision probability of sketches, anndists
4.A fast, in-memory Principle Coordinate Analysis (PCoA) based on randomized SVD (subspace iteration type SVD), fpcoa
wget https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.3/dartunifrac_Linux_x86-64_v0.2.3.zip
unzip dartunifrac_Linux_x86-64_v0.2.3.zip
chmod a+x ./dartunifrac
./dartunifrac -h## install homebrew first: https://brew.sh
brew tap jianshu93/DartUniFrac
brew install DartUniFrac
dartunifrac -hCMake needs to be installed first, see guidance here
git clone https://github.com/jianshu93/DartUniFrac.git
cd DartUniFrac
#### You must have CMake installed to compile HDF5 from source. This is for BIOM format input. You also need zstd installed and library files in system path for compression.
### Linux
cargo build --release --features intel-mkl-static,stdsimd
### macos
cargo build --release --features macos-accelerate,stdsimdDartUniFrac will use all availble CPU cores/threads via Rayon by default.
$ ./target/release/dartunifrac -h
************** initializing logger *****************
DartUniFrac: Approximate unweighted UniFrac via Weighted MinHash 🎯🎯🎯
Usage: dartunifrac [OPTIONS] --tree <tree> <--input <input>|--biom <biom>>
Options:
-t, --tree <tree> Input tree in Newick format
-i, --input <input> OTU/Feature table in TSV format
-b, --biom <biom> OTU/Feature table in BIOM (HDF5) format
-o, --output <output> Output distance matrix in TSV format [default: unifrac.tsv]
--weighted Weighted UniFrac (normalized)
-s, --sketch <sketch-size> Sketch size for Weighted MinHash (DartMinHash or ERS) [default: 2048]
-m, --method <method> Sketching method: dmh (DartMinHash) or ers (Efficient Rejection Sampling) [default: dmh] [possible values: dmh, ers]
-l, --length <seq-length> Per-hash independent random sequence length for ERS, must be >= 1024 [default: 4096]
-T, --threads <threads> Number of threads, default all logical cores
--seed <seed> Random seed for reproducibility [default: 1337]
--compress Compress output with zstd, .zst suffix will be added to the output file name
--pcoa Fast Principle Coordinate Analysis based on Randomized SVD (subspace iteration), output saved to pcoa.txt/ordination.txt
--streaming Streaming the distance matrix while computing (zstd-compressed)
--block <block> Number of rows per chunk, streaming mode only
-h, --help Print help
-V, --version Print version### DartMinHash, biom format input, unweighted
dartunifrac -t ./ASVs_aligned.tre -b ./ASVs_counts.biom -m dmh -s 2048 -o unifrac_dmh_dist.csv
### DartMinHash, biom format input, weighted
dartunifrac -t ./ASVs_aligned.tre -b ./ASVs_counts.biom --weighted -m dmh -s 2048 -o unifrac_dmh_dist.csv
### tsv/txt tabular input
dartunifrac -t ./data/ASVs_aligned.tre -i ./data/ASVs_counts.txt -m dmh -s 2048 -o unifrac_dmh_dist.csv
### Efficient Rejection Sampling
dartunifrac -t ./data/ASVs_aligned.tre -b ./data/ASVs_counts.biom -m ers -s 2048 -l 4096 -o unifrac_ers_dist.csv
### dartMinHash with pcoa and compressed output distance matrix
dartunifrac -t ./data/ASVs_aligned.tre -b ./data/ASVs_counts.biom -m dmh -s 2048 -o unifrac_dmh_dist.tsv --pcoa --compress
### Streaming mode (reduce memory requirement) for large number of samples. Block size is normally 1/50 to 1/5 of sample size
dartunifrac -t ./data/ASVs_aligned.tre -b ./data/ASVs_counts.biom -m dmh -s 2048 -o unifrac_dmh_dist.tsv --streaming --block 8192
We provide Nvidia GPU support via CUDA (CUDA v12.9.1 or later must be installed and in system library path) on Linux (x86-64 tested). It will fall back to CPU if no GPU device is detected. Only the Hamming distance computation step benefits from GPU. It is optimized for Nvidia A100 but also works for RTX series.
### get the binary
wget https://github.com/jianshu93/DartUniFrac/releases/download/v0.2.6/dartunifrac-cuda_Linux_x86-64_v0.2.6.zip
unzip dartunifrac-cuda_Linux_x86-64_v0.2.3.zip
chmod a+x ./dartunifrac-cuda
RUST_LOG=info ./dartunifrac-cuda -t ./GWMC_rep_seqs_all.tre -b ./GWMC_16S_otutab.biom -m dmh -s 3072 --weighted -o unifrac_weighted.cuda.tsv
A better way is to compile from source.
### Install rust first, see here: https://rustup.rs, after run it, run:
rustup install nightly
rustup default nightly
git clone --branch DartUniFrac-GPU https://github.com/jianshu93/DartUniFrac.git
cd DartUniFrac
cargo build --release --features intel-mkl-static,stdsimd,cuda
./target/release/dartunifrac-cuda -h
Speed benchmark for 50k samples, 4 Nvidia RTX 6000 Pro were available
$ RUST_LOG=info dartunifrac-cuda -t ./ag_emp.tre -b ag_emp_even500.biom --weighted -m dmh -s 2048
************** initializing logger *****************
[2025-12-01T07:03:48Z INFO dartunifrac_cuda] Logger initialized from default environment
[2025-12-01T07:03:48Z INFO dartunifrac_cuda] 48 threads will be used
[2025-12-01T07:03:48Z INFO dartunifrac_cuda] method=dmh k=2048 seed=1337
[2025-12-01T07:03:48Z INFO dartunifrac_cuda] Weighted mode
[2025-12-01T07:03:52Z INFO dartunifrac_cuda] nodes = 614110 leaves = 307055
[2025-12-01T07:03:53Z INFO dartunifrac_cuda] transposing BIOM CSR→CSC …
[2025-12-01T07:03:53Z INFO dartunifrac_cuda] building per-sample weighted sets from BIOM (CSC) …
[2025-12-01T07:03:54Z INFO dartunifrac_cuda] built weighted sets in 288 ms
[2025-12-01T07:03:55Z INFO dartunifrac_cuda] active edges = 545049 (from 614110 total, 307055 leaves)
[2025-12-01T07:03:55Z INFO dartunifrac_cuda] sketching starting...
[2025-12-01T07:03:56Z INFO dartunifrac_cuda] sketching done.
[2025-12-01T07:03:57Z INFO dartunifrac_cuda] CUDA detected (4 devices). Computing pairwise distances on GPUs …
[2025-12-01T07:04:02Z INFO dartunifrac_cuda] pairwise distances (GPU) in 4574 ms
[2025-12-01T07:04:02Z INFO dartunifrac_cuda] Writing uncompressed output → unifrac.tsv
[2025-12-01T07:05:08Z INFO dartunifrac_cuda] Done → unifrac.tsv
2 Nividia A100
$ RUST_LOG=info dartunifrac-cuda -t ./ag_emp.tre -b ./ag_emp_even500.biom --weighted -m dmh -s 3072 -o ag_emp.weighted.dmh.cuda.tsv --compress --pcoa
************** initializing logger *****************
[2025-09-07T05:33:22Z INFO dartunifrac] Logger initialized from default environment
[2025-09-07T05:33:22Z INFO dartunifrac] 8 threads will be used
[2025-09-07T05:33:22Z INFO dartunifrac] method=dmh k=3072 seed=1337
[2025-09-07T05:33:22Z INFO dartunifrac] Weighted mode
[2025-09-07T05:33:26Z INFO dartunifrac] nodes = 614110 leaves = 307055
[2025-09-07T05:33:27Z INFO dartunifrac] transposing BIOM CSR→CSC …
[2025-09-07T05:33:27Z INFO dartunifrac] building per-sample weighted sets from BIOM (CSC) …
[2025-09-07T05:33:28Z INFO dartunifrac] built weighted sets in 936 ms
[2025-09-07T05:33:28Z INFO dartunifrac] active edges = 545049 (from 614110 total, 307055 leaves)
[2025-09-07T05:33:28Z INFO dartunifrac] sketching starting...
[2025-09-07T05:33:43Z INFO dartunifrac] sketching done.
[2025-09-07T05:33:43Z INFO dartunifrac] CUDA detected (2 devices). Computing pairwise distances on GPUs …
[2025-09-07T05:34:03Z INFO dartunifrac] pairwise distances (GPU) in 19668 ms
[2025-09-07T05:34:03Z INFO dartunifrac] Writing compressed (zstd) output → ag_emp.weighted.dmh.cuda.tsv.zst1.A distance matrix (see also benchmark section) and pcoa stype output. Distance matrix:
| Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | 0.0 | 0.40307617187 | 0.36596679687 | 0.36010742187 | 0.35815429687 | 0.530273437 |
| Orwoll_BI0056_BI | 0.40307617187 | 0.0 | 0.3901367187 | 0.40747070312 | 0.3398437 | 0.5209960937 |
| Orwoll_BI0131_BI | 0.36596679687 | 0.3901367187 | 0.0 | 0.41577148437 | 0.40014648437 | 0.56860351562 |
| Orwoll_BI0153_BI | 0.36010742187 | 0.40747070312 | 0.41577148437 | 0.0 | 0.260742187 | 0.422851562 |
| Orwoll_BI0215_BI | 0.35815429687 | 0.3398437 | 0.40014648437 | 0.260742187 | 0.0 | 0.45825195312 |
| Orwoll_BI0353_BI | 0.530273437 | 0.5209960937 | 0.56860351562 | 0.422851562 | 0.45825195312 | 0.0 |
2.pcoa.txt (simple):
| PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | -0.12912805176115577 | -0.13478145930844512 | 0.042380530554922635 | 0.17338355657990664 | -0.003431287906286917 | 0.0 |
| Orwoll_BI0056_BI | -0.09201950670064461 | 0.23750165032060416 | -0.030767869352507412 | 0.05365540164234924 | -0.04135676583746806 | 0.0 |
| Orwoll_BI0131_BI | -0.18672940644255911 | -0.01880872256563759 | 0.15163198033883335 | -0.1408718693633402 | 0.013611393370684486 | 0.0 |
| Orwoll_BI0153_BI | 0.06843838458354137 | -0.09791766405635079 | -0.12381078744019998 | -0.07508375468747691 | -0.10416130668711489 | 0.0 |
| Orwoll_BI0215_BI | -0.003937706919220717 | -0.00814970660379848 | -0.15251179756497754 | -0.028692966865033972 | 0.11819355132475588 | 0.0 |
| Orwoll_BI0353_BI | 0.3433762872400387 | 0.022155902213627746 | 0.11307794346392891 | 0.017609632693595324 | 0.017144415735429568 | 0.0 |
| PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | |
|---|---|---|---|---|---|---|
| proportion_explained | 0.4233339721296298 | 0.19721082825095754 | 0.17875499078496368 | 0.13806192702856432 | 0.0626382818058847 | 0.0 |
3.ordination.txt (scikit-bio/qiime format):
Eigvals 6
0.18261610725439215 0.0850720143777559 0.07711060939718065 0.059556599124712034 0.027020650223156684 0.0
Proportion explained 6
0.42333397212962964 0.19721082825095754 0.17875499078496382 0.13806192702856435 0.06263828180588471 0.0
Species 0 0
Site 6 6
Orwoll_BI0023_BI -0.12912805176115583 -0.1347814593084449 0.04238053055492221 0.1733835565799069 -0.0034312879062869286 0.0
Orwoll_BI0056_BI -0.09201950670064456 0.23750165032060455 -0.030767869352506142 0.05365540164234891 -0.0413567658374681 0.0
Orwoll_BI0131_BI -0.18672940644255914 -0.018808722565638855 0.15163198033883318 -0.14087186936334042 0.013611393370684503 0.0
Orwoll_BI0153_BI 0.06843838458354136 -0.0979176640563503 -0.1238107874402007 -0.07508375468747673 -0.104161306687115 0.0
Orwoll_BI0215_BI -0.003937706919220702 -0.008149706603797708 -0.15251179756497782 -0.028692966865033843 0.11819355132475592 0.0
Orwoll_BI0353_BI 0.3433762872400386 0.02215590221362715 0.11307794346392917 0.017609632693595185 0.017144415735429578 0.0
Biplot 0 0
Site constraints 0 0
the ordination.txt file can be imported into qiime2 like this:
qiime tools import --input-path ordination.txt --type 'PCoAResults' --output-path pcoa.qza
You can then visulize it via qiime emperor plot.
We use Striped UniFrac algorithm as the ground truth, which is an exact and efficient algorithm for large number of samples. A pure Rust implementaion, as a supporting crate for this one, can be found here, also included as a binary in this crate.
For the testing data (ASVs_count.tsv and ASV_aligned.tre), the truth from Striped UniFrac (unweighted) is:
| Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | 0 | 0.403847873210907 | 0.3646169304847717 | 0.366204708814621 | 0.3484474122524261 | 0.5317433476448059 |
| Orwoll_BI0056_BI | 0.403847873210907 | 0 | 0.3883068859577179 | 0.4069649279117584 | 0.3338068425655365 | 0.5172212719917297 |
| Orwoll_BI0131_BI | 0.3646169304847717 | 0.3883068859577179 | 0 | 0.4222687482833862 | 0.4006152451038361 | 0.5693299174308777 |
| Orwoll_BI0153_BI | 0.366204708814621 | 0.4069649279117584 | 0.4222687482833862 | 0 | 0.2608364820480347 | 0.4297698140144348 |
| Orwoll_BI0215_BI | 0.3484474122524261 | 0.3338068425655365 | 0.4006152451038361 | 0.2608364820480347 | 0 | 0.4572696685791016 |
| Orwoll_BI0353_BI | 0.5317433476448059 | 0.5172212719917297 | 0.5693299174308777 | 0.4297698140144348 | 0.4572696685791016 | 0 |
DartUniFrac estimation (DartMinHash, unweighted) is:
| Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | 0.0 | 0.40307617187 | 0.36596679687 | 0.36010742187 | 0.35815429687 | 0.530273437 |
| Orwoll_BI0056_BI | 0.40307617187 | 0.0 | 0.3901367187 | 0.40747070312 | 0.3398437 | 0.5209960937 |
| Orwoll_BI0131_BI | 0.36596679687 | 0.3901367187 | 0.0 | 0.41577148437 | 0.40014648437 | 0.56860351562 |
| Orwoll_BI0153_BI | 0.36010742187 | 0.40747070312 | 0.41577148437 | 0.0 | 0.260742187 | 0.422851562 |
| Orwoll_BI0215_BI | 0.35815429687 | 0.3398437 | 0.40014648437 | 0.260742187 | 0.0 | 0.45825195312 |
| Orwoll_BI0353_BI | 0.530273437 | 0.5209960937 | 0.56860351562 | 0.422851562 | 0.45825195312 | 0.0 |
DartUniFrac estimation (Efficient Rejection Sampling, uniweghted) is:
| Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | 0.0 | 0.4101562 | 0.3657226562 | 0.3618164062 | 0.3569335937 | 0.52539062 |
| Orwoll_BI0056_BI | 0.4101562 | 0.0 | 0.40039062 | 0.4086914062 | 0.340820312 | 0.5190429687 |
| Orwoll_BI0131_BI | 0.3657226562 | 0.40039062 | 0.0 | 0.4360351562 | 0.4145507812 | 0.570312 |
| Orwoll_BI0153_BI | 0.3618164062 | 0.4086914062 | 0.4360351562 | 0.0 | 0.264648437 | 0.4311523437 |
| Orwoll_BI0215_BI | 0.3569335937 | 0.340820312 | 0.4145507812 | 0.264648437 | 0.0 | 0.460937 |
| Orwoll_BI0353_BI | 0.52539062 | 0.5190429687 | 0.570312 | 0.4311523437 | 0.460937 | 0.0 |
For the testing data (ASVs_count.tsv and ASV_aligned.tre), the truth from Striped UniFrac (weighted_normalized) is:
| Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | 0 | 0.3701851367950439 | 0.3719282746315002 | 0.2419339865446091 | 0.235030472278595 | 0.2684231698513031 |
| Orwoll_BI0056_BI | 0.3701851367950439 | 0 | 0.4768573343753815 | 0.3429801762104034 | 0.2965413630008698 | 0.3132076263427734 |
| Orwoll_BI0131_BI | 0.3719282746315002 | 0.4768573343753815 | 0 | 0.3818131387233734 | 0.3768087327480316 | 0.4305019080638885 |
| Orwoll_BI0153_BI | 0.2419339865446091 | 0.3429801762104034 | 0.3818131387233734 | 0 | 0.1378696262836456 | 0.1329282969236374 |
| Orwoll_BI0215_BI | 0.235030472278595 | 0.2965413630008698 | 0.3768087327480316 | 0.1378696262836456 | 0 | 0.139195516705513 |
| Orwoll_BI0353_BI | 0.2684231698513031 | 0.3132076263427734 | 0.4305019080638885 | 0.1329282969236374 | 0.139195516705513 | 0 |
DartUniFrac estimation (DartMinHash, weighted) is:
| Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | 0 | 0.3726541554959786 | 0.3562913907284768 | 0.2412121212121212 | 0.24878048780487805 | 0.26889714993804215 |
| Orwoll_BI0056_BI | 0.3726541554959786 | 0 | 0.4566145092460882 | 0.33159947984395316 | 0.2832080200501253 | 0.30612244897959184 |
| Orwoll_BI0131_BI | 0.3562913907284768 | 0.4566145092460882 | 0 | 0.38565629228687415 | 0.3781965006729475 | 0.4301675977653631 |
| Orwoll_BI0153_BI | 0.2412121212121212 | 0.33159947984395316 | 0.38565629228687415 | 0 | 0.1265126512651265 | 0.1277533039647577 |
| Orwoll_BI0215_BI | 0.24878048780487805 | 0.2832080200501253 | 0.3781965006729475 | 0.1265126512651265 | 0 | 0.13399778516057587 |
| Orwoll_BI0353_BI | 0.26889714993804215 | 0.30612244897959184 | 0.4301675977653631 | 0.1277533039647577 | 0.13399778516057587 | 0 |
DartUniFrac estimation (ERS, weighted) is:
| Orwoll_BI0023_BI | Orwoll_BI0056_BI | Orwoll_BI0131_BI | Orwoll_BI0153_BI | Orwoll_BI0215_BI | Orwoll_BI0353_BI | |
|---|---|---|---|---|---|---|
| Orwoll_BI0023_BI | 0.0 | 0.37403555853740356 | 0.3540495867768595 | 0.24612108305445696 | 0.2322503008423586 | 0.26929036256585065 |
| Orwoll_BI0056_BI | 0.37403555853740356 | 0.0 | 0.46810035842293907 | 0.3671562082777036 | 0.31324142353318374 | 0.3333333333333333 |
| Orwoll_BI0131_BI | 0.3540495867768595 | 0.46810035842293907 | 0.0 | 0.38331644714623436 | 0.3735747820254863 | 0.4301675977653631 |
| Orwoll_BI0153_BI | 0.24612108305445696 | 0.3671562082777036 | 0.38331644714623436 | 0.0 | 0.14381457693381736 | 0.1265126512651265 |
| Orwoll_BI0215_BI | 0.2322503008423586 | 0.31324142353318374 | 0.3735747820254863 | 0.14381457693381736 | 0.0 | 0.13936022253129346 |
| Orwoll_BI0353_BI | 0.26929036256585065 | 0.3333333333333333 | 0.4301675977653631 | 0.1265126512651265 | 0.13936022253129346 | 0.0 |
The best L for achiving a given accuracy is related to the sparsity of the data (see ERS paper here). The author recommended an equation for L:
RUST_LOG=info ./target/release/striped_unifrac -t ./data/ASVs_aligned.tre -i ./data/AS
Vs_counts.txt -o ASVs_striped_unifac_dist.tsvYou will see some log like this:
************** initializing logger *****************
[2025-08-21T20:12:38Z INFO striped_unifrac] logger initialized from default environment
[2025-08-21T20:12:38Z INFO striped_unifrac] Total branches with positive length: 923
[2025-08-21T20:12:38Z INFO striped_unifrac] Start parsing input.
[2025-08-21T20:12:38Z INFO striped_unifrac] phase-1 masks built 0 ms
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 0: relevant branches = 689 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 1: relevant branches = 594 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 2: relevant branches = 646 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 3: relevant branches = 584 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 4: relevant branches = 647 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] sample 5: relevant branches = 468 / 923
[2025-08-21T20:12:38Z INFO striped_unifrac] phase-2 sparse lists built (1 strips)
[2025-08-21T20:12:38Z INFO striped_unifrac] phase-3 block pass 0 ms
[2025-08-21T20:12:38Z INFO striped_unifrac] Start writing output.
This is a dense example where
We want to thank Otmar Ertl and Xiaoyun Li for their helpful comments on DartMinHash and Efficient Rejection Sampling, respectively. We want to thank Yuhan(Sherlyn) Weng for helping with DartUniFrac logo design.
Paper to come


