Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ISS support of Sanger data format #75

Merged
merged 14 commits into from
May 7, 2024

Conversation

BioinfoTongLI
Copy link
Contributor

No description provided.

@codecov-commenter
Copy link

codecov-commenter commented Aug 7, 2023

Codecov Report

Merging #75 (3b3763b) into main (755d475) will increase coverage by 0.07%.
Report is 41 commits behind head on main.
The diff coverage is 36.84%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #75      +/-   ##
==========================================
+ Coverage   41.92%   42.00%   +0.07%     
==========================================
  Files          16       17       +1     
  Lines         854      900      +46     
==========================================
+ Hits          358      378      +20     
- Misses        496      522      +26     
Files Coverage Δ
src/spatialdata_io/__init__.py 100.00% <100.00%> (ø)
src/spatialdata_io/_constants/_constants.py 100.00% <100.00%> (ø)
src/spatialdata_io/readers/experimental/iss.py 53.33% <53.33%> (ø)
src/spatialdata_io/readers/merscope.py 25.00% <8.33%> (-2.03%) ⬇️

... and 1 file with indirect coverage changes

@LucaMarconato
Copy link
Member

Thank for the PR!

@giovp the PR enables the support for ISS data. ISS data doesn't come as a commercial format or as the output of a pipeline. I discussed with Tong about supporting readers for non-standard techs, and I think that even if this reader would not be guaranteed to work out-of-the-box like for the other techs, I can see it being helpful for the users as a starting point to build a customize reader for the data.

Wdyt, do you think it could be merged (maybe after some generalization via extra parameters) or should be keep the repository strict to documented techs and maybe ask to move this as a notebook or example code, like for the docs?

@giovp
Copy link
Member

giovp commented Oct 26, 2023

Yeah looks good to me! I honestly don't have strong opinion, I guess an idea would be to add it to a separate module called something like "experimental" or so but maybe we don't need to do that yet, and just go with this for now?

@BioinfoTongLI
Copy link
Contributor Author

experimental does make more sense to me. And it also correctly refects my understanding of SpatialData :)

@LucaMarconato
Copy link
Member

LucaMarconato commented Feb 9, 2024

Hi @BioinfoTongLI, getting back to this, sorry quite some time passed. I have added my review.

  • Do you have some public data that I could use to test this? In addition also non-public data would work if you could share it with me (please share to me via Zulip).

A few additional quick comments, after this we could merge. If you can share the data I can also quickly cover the following tasks.

  • Can you please remove change the __init__.py files so that we use from spatialdata_io.experimental import iss and not from spatialdata_io import iss.
  • Please update the changelog to reflect the new added function of this PR.
  • Please update api.md with a new section so the docs also show the experimental readers.

Thanks!

@melonora
Copy link
Collaborator

@LucaMarconato What is the status?

@LucaMarconato
Copy link
Member

Checking this today. @BioinfoTongLI is it ok if I rehost the data that you gave me?

The data has a T channel (in the screenshot I show the labels). We currently don't support T, so I need to remove it to be able to test the reader. I would upload the data for T=0.

Also, in the future it will be useful to parse the dims directly from the omero metadata when available. It's something that we want to do and when done this reader would benefit from it by become more general.

image

@LucaMarconato
Copy link
Member

@BioinfoTongLI there is a problem with the data you shared. These is the AnnData table:

Out[15]: 
           celltype               region
61793  PAX3+MyoProg  region_labels_image
61147  PAX7+MyoProg  region_labels_image
59104      TransMes  region_labels_image
58640          Mes1  region_labels_image
62107  PAX7+MyoProg  region_labels_image
61720  PAX7+MyoProg  region_labels_image
61020     InterZone  region_labels_image
59316       ProxMes  region_labels_image

And these are the unique IDs from the labels object.

Out[6]: Index([0, 28222, 28462, 28574, 29455, 29524, 29816, 29852, 30020], dtype='uint32')

Therefore the table is not annotating the labels. I will merge the PR anyway, but please when you have time it would be convenient if you could point to/share some public data (or data that I can upload to our public S3 storage) with correct table <> labels matching, so that in the future we can keep testing the dataset.

@LucaMarconato
Copy link
Member

Tested and ready to merge, @BioinfoTongLI thanks for the PR! (my comment above is still valid, please check it if you have time).

@LucaMarconato LucaMarconato merged commit 7427003 into scverse:main May 7, 2024
5 checks passed
@BioinfoTongLI
Copy link
Contributor Author

Hi @LucaMarconato , sorry for being silent for a while. I've uploaded another similar dataset onto Zenodo. I've checked the label image and the h5ad is matching this time. Hope that's what you're expecting?
please lmk if there's any other issue.
Best,
Tong

@LucaMarconato
Copy link
Member

LucaMarconato commented Jun 24, 2024

Thanks for sharing the new data. I have tried the reader with the new data and noticed that some small fixes were required in the reader (already merged here #164). I think the reader works correctly.

I have used this code to read and visualize the data:

##
from spatialdata_io.experimental import iss

f = '/Users/macbook/Downloads/10887810'

sdata = iss(f, raw_relative_path='raw.ome.tif', labels_relative_path='label.ome.tif', h5ad_relative_path='iss.h5ad')

from napari_spatialdata import Interactive

Interactive(sdata)

This is the plot that I get in napari (I am showing the instance_id column).
image

The black parts of the plot are the background, while the gray areas indicate pixels that have no corresponding annotation in the table (maybe cells that have been filtered out).

Precisely, we have that this code

from spatialdata import get_element_instances

labels_ids = get_element_instances(sdata['region_labels_image'])
table_ids = sdata['table'].obs['instance_id']

assert set(table_ids).issubset(set(labels_ids))
print(set(labels_ids).difference(set(table_ids)))

gives this:

{0, 43026, 38421, 42526, 42530, 37925, 35385, 37955, 37444, 40004, 41030, 40538, 42586, 39002, 36444, 38493, 35427, 38503, 35953, 41074, 36994, 43650, 36997, 43655, 42641, 40084, 40087, 35996, 40608, 39601, 35520, 41670, 35528, 40145, 37588, 40162, 36583, 40679, 39665, 36084, 40700, 41725, 43276, 42777, 43807, 35623, 41267, 39223, 35641, 39737, 40250, 41790, 39236, 43851, 37708, 42835, 43867, 42334, 42846, 37728, 38755, 38253, 35694, 43375, 38264, 38784, 39811, 39317, 37782, 43415, 37271, 42389, 35738, 42906, 38814, 36261, 36265, 38313, 41901, 43454, 39881, 42443, 39894, 43480, 38875, 37343, 43489, 41960, 39402, 38380, 40941, 37362}

Do you expect some cells not to be annotated by the table? (In any case, the reader seems to work correctly).

@BioinfoTongLI
Copy link
Contributor Author

Do you expect some cells not to be annotated by the table? (In any case, the reader seems to work correctly).

Yes, those are the cells been discarded in the single-cell processing steps. I forgot to mention that!

Very nice! Thanks for checking!

lucas-diedrich pushed a commit to lucas-diedrich/spatialdata-io that referenced this pull request Nov 26, 2024
Added ISS support of Sanger data format
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants