Infinite loop in Hypergeometric::sample with large (>2^50) parameters #35

@mstoeckl

Summary

With large parameters, Hypergeometric::sample can get stuck in a loop that is either infinite or extremely slow. I'd expect it to either return a sample or fail with an explicit error.

This is probably a consequence of limited-precision floating point: the inputs that trigger this are inching toward the 2^53 boundary, above which an f64 can no longer represent every integer exactly. I haven't looked into the cause in detail yet, and will update when I have time to find out more. (Edit: this may be a fundamental issue with large parameters passed to algorithm "HIN", making it calculate a ratio of binomials incredibly slowly.)
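To make the precision concern concrete, here is a small standalone sketch using only the standard library. It does not reproduce the hang and is not taken from the rand_distr source; the helper names `round_trips` and `spacing` are made up for illustration. It only shows how little headroom an f64 has at the magnitude of the population size used in the code sample.

```rust
/// Every integer in 0..=2^53 is exactly representable as an f64;
/// above that, some integers are not.
const F64_EXACT_LIMIT: u64 = 1 << 53;

/// Returns true if `x` survives a u64 -> f64 -> u64 round trip unchanged.
/// (Hypothetical helper, for illustration only.)
fn round_trips(x: u64) -> bool {
    (x as f64) as u64 == x
}

/// Distance from `x` to the next larger f64 (valid for positive, finite `x`).
/// (Hypothetical helper, for illustration only.)
fn spacing(x: f64) -> f64 {
    f64::from_bits(x.to_bits() + 1) - x
}

fn main() {
    // Population size from the code sample; it lies between 2^52 and 2^53.
    let n: u64 = 5277655813324800;

    println!("N round-trips through f64: {}", round_trips(n));
    println!(
        "2^53 + 1 round-trips through f64: {}",
        round_trips(F64_EXACT_LIMIT + 1)
    );
    // At this magnitude adjacent f64 values are a whole unit apart, so an
    // intermediate result near N cannot carry any fractional part.
    println!("f64 spacing near N: {}", spacing(n as f64));
}
```

The parameters themselves still convert exactly, but any intermediate quantity of similar size is rounded to an integer, which is one plausible way for a floating-point recurrence to stop making progress.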

Code sample

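use rand::SeedableRng;
use rand::rngs::Xoshiro128PlusPlus;
use rand_distr::{Distribution, Hypergeometric};

fn main() {
    // Fixed seed so the behavior is reproducible.
    let mut rng = Xoshiro128PlusPlus::seed_from_u64(0x1);
    // Arguments: total population size, population with the feature, sample size.
    let dist = Hypergeometric::new(5277655813324800, 527765581332480, 52776558133248).unwrap();
    // Construction succeeds; this call to sample() is what hangs.
    println!("{}", dist.sample(&mut rng));
}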

Notes

I ran into this while writing some additional property tests, but it's also possible to look for hang and crash issues of this type systematically with a fuzzer. As testing methods go, fuzzing is somewhat awkward and can require a lot of manual work to explore a complicated system; however, most of the rand_distr sampler code doesn't have overly complicated control flow or internal checks. My current fuzzer script is in https://github.com/mstoeckl/rand_distr/commits/fuzz-params/; if you'd like, I can make a pull request out of it (or start reporting the issues that come from it, or just send PRs to fix them; there are a few more distributions that can crash or hang on unusual inputs).
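As a rough idea of how a test harness can flag a hang instead of blocking forever, here is a minimal sketch using only the standard library. It is not the fuzzer script linked above, and `completes_within` is a hypothetical helper name; in practice the closure would construct a distribution from generated parameters and call sample() on it.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `f` on a worker thread and waits at most `limit` for its result.
/// Returns `Some(value)` if `f` finished in time, and `None` if it timed
/// out or panicked. (Hypothetical helper, for illustration only.)
fn completes_within<T, F>(limit: Duration, f: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already have given up, so a send error is ignored.
        let _ = tx.send(f());
    });
    // Err covers both a timeout and a disconnected sender (worker panicked).
    rx.recv_timeout(limit).ok()
}

fn main() {
    // A call that returns promptly.
    let fast = completes_within(Duration::from_secs(5), || 2 + 2);

    // A stand-in for a sampler that takes far too long.
    let slow = completes_within(Duration::from_millis(50), || {
        thread::sleep(Duration::from_millis(500));
        0
    });

    println!("fast: {:?}, slow: {:?}", fast, slow);
}
```

A worker that truly never returns stays alive until the process exits, so a harness like this is better suited to one-shot test binaries than to long-running processes.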
