Implement dask collection protocol on HealpixDataset#1252

Closed
hombit wants to merge 4 commits into main from claude/fix-issue-1028-oIXip

@hombit (Contributor) commented Feb 11, 2026

Add __dask_graph__, __dask_keys__, __dask_tokenize__, __dask_postcompute__,
__dask_postpersist__, __dask_optimize__, and __dask_scheduler__ methods to
HealpixDataset, delegating to the underlying _ddf NestedFrame.

This allows dask.distributed.Client.compute(catalog) to return a Future
instead of requiring users to access the internal _ddf attribute.

Fixes #1028

https://claude.ai/code/session_013NNQnzZHryYqUC1eAWFC5w
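
The delegation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `DaskProtocolDelegate`, `StubCollection`, and `toy_compute` are hypothetical names, and the stub stands in for the real NestedFrame so the sketch runs without dask installed. Only the seven protocol method names come from the description.

```python
class StubCollection:
    """Hypothetical stand-in for the inner dask collection (no dask required)."""

    def __dask_graph__(self):
        # Two "tasks" whose values are already literal results.
        return {("stub", 0): 1, ("stub", 1): 2}

    def __dask_keys__(self):
        return [("stub", 0), ("stub", 1)]

    def __dask_tokenize__(self):
        return "stub-token"

    def __dask_postcompute__(self):
        # (finalize, extra_args): finalize receives the list of key results.
        return sum, ()

    def __dask_postpersist__(self):
        # (rebuild, extra_args): rebuild returns a new collection from a graph.
        return (lambda dsk, *args, **kwargs: StubCollection()), ()

    @staticmethod
    def __dask_optimize__(dsk, keys, **kwargs):
        return dsk  # no optimization in this toy

    @staticmethod
    def __dask_scheduler__(dsk, keys, **kwargs):
        # Toy scheduler: the graph holds literal values, so just look them up.
        return [dsk[key] for key in keys]


class DaskProtocolDelegate:
    """Hypothetical wrapper forwarding the collection protocol to `_ddf`."""

    def __init__(self, ddf):
        self._ddf = ddf  # the inner collection (a NestedFrame in the PR)

    def __dask_graph__(self):
        return self._ddf.__dask_graph__()

    def __dask_keys__(self):
        return self._ddf.__dask_keys__()

    def __dask_tokenize__(self):
        return self._ddf.__dask_tokenize__()

    def __dask_postcompute__(self):
        # Results are finalized the way the inner collection finalizes them.
        return self._ddf.__dask_postcompute__()

    def __dask_postpersist__(self):
        # The rebuild callable yields the inner type, not the wrapper.
        return self._ddf.__dask_postpersist__()

    @property
    def __dask_optimize__(self):
        return self._ddf.__dask_optimize__

    @property
    def __dask_scheduler__(self):
        return self._ddf.__dask_scheduler__


def toy_compute(collection):
    """Drive the protocol by hand, loosely mirroring what a compute call does."""
    keys = collection.__dask_keys__()
    dsk = collection.__dask_optimize__(collection.__dask_graph__(), keys)
    results = collection.__dask_scheduler__(dsk, keys)
    finalize, extra_args = collection.__dask_postcompute__()
    return finalize(results, *extra_args)


wrapped = DaskProtocolDelegate(StubCollection())
print(toy_compute(wrapped))
```

One consequence of pure delegation, visible in the sketch, is that `__dask_postcompute__` and `__dask_postpersist__` produce whatever the inner collection produces, so computing or persisting the wrapper does not return the wrapper type.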

Add __dask_graph__, __dask_keys__, __dask_tokenize__, __dask_postcompute__,
__dask_postpersist__, __dask_optimize__, and __dask_scheduler__ methods to
HealpixDataset, delegating to the underlying _ddf NestedFrame.

This allows dask.distributed.Client.compute(catalog) to return a Future
instead of requiring users to access the internal _ddf attribute.

Fixes #1028

https://claude.ai/code/session_013NNQnzZHryYqUC1eAWFC5w

codecov bot commented Feb 11, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.62%. Comparing base (2372c11) to head (5f43916).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/lsdb/catalog/dataset/healpix_dataset.py | 85.71% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1252      +/-   ##
==========================================
- Coverage   96.67%   96.62%   -0.06%     
==========================================
  Files          48       48              
  Lines        2949     2963      +14     
==========================================
+ Hits         2851     2863      +12     
- Misses         98      100       +2     


Verify that a Catalog is recognized as a dask collection and that
dask.compute() returns the correct result, matching _ddf.compute().

https://claude.ai/code/session_013NNQnzZHryYqUC1eAWFC5w

github-actions bot commented Feb 11, 2026

| Before [2372c11] | After [c96f039] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 8.02±0.03s | 8.20±0.01s | 1.02 | benchmarks.time_lazy_crossmatch_many_columns_all_suffixes |
| 8.03±0.04s | 8.17±0.02s | 1.02 | benchmarks.time_lazy_crossmatch_many_columns_overlapping_suffixes |
| 6.82±0.03s | 6.86±0.02s | 1.01 | benchmarks.time_create_large_catalog |
| 3.72±0.01s | 3.76±0.01s | 1.01 | benchmarks.time_open_many_columns_all |
| 359±4ms | 362±4ms | 1.01 | benchmarks.time_open_many_columns_default |
| 1.03±0s | 1.03±0.01s | 1 | benchmarks.time_create_midsize_catalog |
| 19.6±0.04s | 19.5±0s | 1 | benchmarks.time_save_big_catalog |
| 100±0.7ms | 99.8±0.9ms | 0.99 | benchmarks.time_kdtree_crossmatch |
| 161±0.8ms | 160±0.8ms | 0.99 | benchmarks.time_open_many_columns_list |
| 45.4±1ms | 44.7±0.9ms | 0.98 | benchmarks.time_polygon_search |

Update test to use dask.distributed.Client and assert that the return
value is a Future, which is the core of issue #1028.

https://claude.ai/code/session_013NNQnzZHryYqUC1eAWFC5w

Move Client import to conftest.py as a reusable dask_client fixture.
Remove unused Client import from test_catalog.py, keeping only Future.

https://claude.ai/code/session_013NNQnzZHryYqUC1eAWFC5w

@hombit (Contributor, Author) commented Feb 12, 2026

I don't really understand the circumstances of these changes; they change the behavior of Catalog a lot.

Development

Successfully merging this pull request may close these issues.

Make dask_client.compute(lsdb_catalog) to return a Future
