Percentile is inaccurate with a small data set #24

Open
scott-david-walker opened this issue Jan 20, 2023 · 1 comment

scott-david-walker commented Jan 20, 2023

When Query is called on a small data set, the underlying percentile calculation is never used unless the stream has been flushed. Instead, Query appears to just return the value at a certain index.

	q := quantile.NewTargeted(0.5, 0.9, 0.95, 0.99)
	q.Insert(float64(5340000000))
	q.Insert(float64(5750000000))
	q.Insert(float64(5930000000))
	q.Insert(float64(6160000000))
	q.Insert(float64(6560000000))
	q.Insert(float64(1156000000))
	// Observed results, one per line:
	fmt.Println(q.Query(0.50)) // 5750000000
	fmt.Println(q.Query(0.90)) // 6160000000
	fmt.Println(q.Query(0.95)) // 6160000000
	fmt.Println(q.Query(0.99)) // 6160000000
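
The outputs above are consistent with a plain index lookup into the sorted samples. The following is a minimal sketch of that behavior, not a copy of the library code: the function name `indexQuery` is hypothetical, and the rule it uses (take index `int(n*q)`, step back by one when it is positive, no interpolation) is an assumption chosen because it reproduces all four reported values.

```go
package main

import (
	"fmt"
	"sort"
)

// indexQuery is a hypothetical stand-in for the unflushed Query path.
// It sorts a copy of the samples and returns the element at int(n*q),
// minus one when that index is positive. No interpolation is performed.
func indexQuery(samples []float64, q float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	i := int(float64(len(sorted)) * q)
	if i > 0 {
		i--
	}
	return sorted[i]
}

func main() {
	data := []float64{
		5340000000, 5750000000, 5930000000,
		6160000000, 6560000000, 1156000000,
	}
	for _, q := range []float64{0.50, 0.90, 0.95, 0.99} {
		fmt.Printf("q=%.2f -> %.0f\n", q, indexQuery(data, q))
	}
}
```

With six samples, 0.90, 0.95 and 0.99 all truncate to the same index, which would explain why the three upper quantiles come back identical.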
@bmizerany
Owner

It's been a while since I've been active with this project and code, but IIRC, this is to be expected. The sketch is for larger datasets. Datasets that are bounded and that fit in memory are probably best off using simpler, more accurate ways of computing quantiles.
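
For a bounded data set that fits in memory, one of the "simpler, more accurate ways" could look like the sketch below: sort the samples and interpolate linearly between the two closest ranks. The function name `exactQuantile` and the choice of interpolation rule are illustrative assumptions, not something this project provides or recommends specifically; other quantile definitions (for example nearest rank) are equally valid.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// exactQuantile returns the q-quantile (0 <= q <= 1) of samples using
// linear interpolation between the two closest ranks of the sorted data.
// It is a hypothetical helper and returns NaN for empty input.
func exactQuantile(samples []float64, q float64) float64 {
	if len(samples) == 0 {
		return math.NaN()
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	if q <= 0 {
		return sorted[0]
	}
	if q >= 1 {
		return sorted[len(sorted)-1]
	}
	// Fractional position of the quantile within the sorted slice.
	pos := q * float64(len(sorted)-1)
	lo := int(math.Floor(pos))
	hi := lo + 1
	if hi >= len(sorted) {
		return sorted[lo]
	}
	frac := pos - float64(lo)
	return sorted[lo] + frac*(sorted[hi]-sorted[lo])
}

func main() {
	data := []float64{
		5340000000, 5750000000, 5930000000,
		6160000000, 6560000000, 1156000000,
	}
	for _, q := range []float64{0.50, 0.90, 0.95, 0.99} {
		fmt.Printf("q=%.2f -> %.0f\n", q, exactQuantile(data, q))
	}
}
```

On the six samples from the report, this gives a median of 5840000000, the midpoint of 5750000000 and 5930000000, and the three upper quantiles no longer collapse to a single value.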
