Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(feat): add async versions of {write,read}_{elem,dispatched} #1902

Draft
wants to merge 40 commits into
base: main
Choose a base branch
from

Conversation

ilan-gold
Copy link
Contributor

Copy link

scverse-benchmark bot commented Mar 7, 2025

Benchmark changes

Change Before [3a472c7] After [8855231] Ratio Benchmark (Parameter)
- 133M 121M 0.91 readwrite.BackedH5ADSuite.peakmem_read_backed('pbmc3k')
! 91478245 failed n/a readwrite.H5ADInMemorySizeSuite.track_actual_in_memory_size('pbmc3k')
! 92155572 failed n/a readwrite.ZarrInMemorySizeSuite.track_actual_in_memory_size('pbmc3k')

Warning

Some benchmarks failed

Comparison: https://github.com/scverse/anndata/compare/3a472c77287154746874a99e5aaa8d99bf63c1b4..8855231ff71cce6fda2de01cc6fb0b3fc8bd51fc
Last changed: Thu, 13 Mar 2025 15:43:13 +0000

More details: https://github.com/scverse/anndata/pull/1902/checks?check_run_id=38719466194

Copy link

codecov bot commented Mar 7, 2025

Codecov Report

Attention: Patch coverage is 84.61538% with 42 lines in your changes missing coverage. Please review.

Project coverage is 80.13%. Comparing base (3a472c7) to head (cd69257).

Files with missing lines Patch % Lines
src/anndata/_io/specs/methods.py 88.77% 11 Missing ⚠️
src/anndata/_core/sparse_dataset.py 63.15% 7 Missing ⚠️
src/anndata/_io/zarr.py 73.07% 7 Missing ⚠️
src/anndata/experimental/backed/_io.py 30.00% 7 Missing ⚠️
src/anndata/_io/utils.py 87.80% 5 Missing ⚠️
src/anndata/_io/specs/registry.py 91.30% 2 Missing ⚠️
src/anndata/_io/h5ad.py 96.96% 1 Missing ⚠️
src/anndata/_types.py 90.00% 1 Missing ⚠️
src/anndata/experimental/merge.py 87.50% 1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (3a472c7) and HEAD (cd69257). Click for more details.

HEAD has 2 uploads less than BASE
Flag BASE (3a472c7) HEAD (cd69257)
3 1
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1902      +/-   ##
==========================================
- Coverage   86.47%   80.13%   -6.35%     
==========================================
  Files          45       45              
  Lines        6774     6739      -35     
==========================================
- Hits         5858     5400     -458     
- Misses        916     1339     +423     
Files with missing lines Coverage Δ
src/anndata/_io/specs/__init__.py 100.00% <ø> (ø)
src/anndata/experimental/_dispatch_io.py 100.00% <100.00%> (ø)
src/anndata/io.py 100.00% <100.00%> (ø)
src/anndata/_io/h5ad.py 92.53% <96.96%> (-0.15%) ⬇️
src/anndata/_types.py 85.36% <90.00%> (-0.75%) ⬇️
src/anndata/experimental/merge.py 86.91% <87.50%> (+0.18%) ⬆️
src/anndata/_io/specs/registry.py 92.69% <91.30%> (-1.46%) ⬇️
src/anndata/_io/utils.py 77.55% <87.80%> (+1.39%) ⬆️
src/anndata/_core/sparse_dataset.py 85.67% <63.15%> (-6.71%) ⬇️
src/anndata/_io/zarr.py 79.38% <73.07%> (-3.77%) ⬇️
... and 2 more

... and 12 files with indirect coverage changes

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@@ -42,6 +43,32 @@ def read_dispatched(
return reader.read_elem(elem)


async def read_dispatched_async(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this doesn’t need to be a separate function.

We could just check if the callback returns something awaitable, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we would have just

def read_dispatched (...) -> Coroutine | AnnData

and then it would be up to users to deal with it? Or you mean that all of the "duplicated" code should be made to just handle coroutines and not (which I kind of like)

@ilan-gold ilan-gold changed the title (feat): add async versions of read_{elem,dispatched} (feat): add async versions of {write,read}_{elem,dispatched} Mar 13, 2025
@ilan-gold ilan-gold changed the base branch from ig/faster_ci to main March 13, 2025 13:26
@ilan-gold ilan-gold added this to the 0.12.0 milestone Mar 13, 2025
@Koncopd
Copy link
Member

Koncopd commented Mar 25, 2025

I haven't checked the code, but i see asyncio.run calls. Please note that asyncio.run doesn't work in jupyter.

@ilan-gold
Copy link
Contributor Author

@Koncopd yes I am aware. This was just a first pass - we are going a different route with this feature, I think. Message me on zulip if you want more information, happy to chat.

@Koncopd
Copy link
Member

Koncopd commented Mar 26, 2025

Yeah, i think the correct way here is to create an event loop in a separate thread and execute all async functions using this event loop, like fsspec does.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

async internally for read_elem and write_elem
3 participants