Subset using a list of barcodes #273

Open
yoavhadas opened this issue Jun 22, 2023 · 4 comments

Comments

@yoavhadas

Hello,
Is it possible to subset a pegasus h5ad file using a list of barcodes?

Thanks,
Yoav.

@yihming
Member

yihming commented Jul 3, 2023

Hello @yoavhadas ,

Sure, you can do it in a way similar to AnnData's slicing:

import pegasus as pg

pdata = pg.read_input("<your-file-name>.h5ad")
pdata_subset = pdata[barcode_list, :].copy()

Here, copy() creates a new, independent pegasus UnimodalData object. If you drop copy(), you get a View of the original data object instead.
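Before slicing, it can help to check the barcode list against the barcodes the object actually holds. The sketch below is a minimal, pure-Python illustration of that check; `split_barcodes` is a hypothetical helper (not part of Pegasus), the barcodes are made-up values, and reading the available barcodes from `pdata.obs_names` is an assumption carried over from AnnData's interface.

```python
def split_barcodes(requested, available):
    """Split `requested` into barcodes found in `available` and those missing.

    The order of `requested` is preserved and duplicates are dropped.
    """
    available_set = set(available)
    seen = set()
    kept, missing = [], []
    for bc in requested:
        if bc in seen:
            continue  # skip repeated barcodes
        seen.add(bc)
        (kept if bc in available_set else missing).append(bc)
    return kept, missing


# Stand-in for the object's barcodes (hypothetical values); with a real
# object this list would come from something like pdata.obs_names (assumed).
obs_names = ["AAAC-1", "AAAG-1", "AACT-1", "AAGT-1"]
wanted = ["AACT-1", "AAAC-1", "TTTT-1", "AACT-1"]

kept, missing = split_barcodes(wanted, obs_names)
print(kept)     # ['AACT-1', 'AAAC-1']
print(missing)  # ['TTTT-1']

# `kept` would then play the role of `barcode_list` in the snippet above:
#   pdata_subset = pdata[kept, :].copy()
```

Reporting the missing barcodes separately makes it easier to spot a suffix mismatch (for example a trailing "-1") before the slice is attempted.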

@yoavhadas
Author

Hi @yihming, I was able to subset the object using the approach above. Now I need to reanalyze it (rerun PCA, Harmony, clustering, and UMAP) to get an updated object. When I run a script similar to the one I used for the original dataset, I get an error:
Traceback (most recent call last):
  File "pegasus_recluster.py", line 44, in <module>
    pegasus.neighbors(object, rep=pca_key, n_jobs = n_jobs)
  File "/sc/arion/work/tmp/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasus/tools/nearest_neighbors.py", line 308, in neighbors
    W = calculate_affinity_matrix(indices[:, 0 : K - 1], distances[:, 0 : K - 1])
  File "/sc/arion/work/tmp/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasusio/decorators.py", line 12, in wrapper_timer
    result = func(*args, **kwargs)
  File "/sc/arion/work/tmp/miniconda3/envs/pec_pegasusenv/lib/python3.8/site-packages/pegasus/tools/nearest_neighbors.py", line 207, in calculate_affinity_matrix
    numers = 2.0 * sigmas[i] * sigmas[indices[i, :]]
IndexError: index 228336 is out of bounds for axis 0 with size 228134

Would you happen to have any suggestions on how to overcome this issue?

Thanks
Yoav

@yihming
Member

yihming commented Jan 6, 2024

Hi @yoavhadas ,

Sorry for getting back to you late. Have you tried setting use_cache=False in the pegasus.neighbors() function?
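One plausible reading of the traceback, which is an assumption and not something confirmed in this thread, is that neighbor indices cached from the full dataset get reused on the subset. Those indices are row numbers in the original object, so some of them exceed the subset's size. The toy sketch below reproduces that effect in pure Python with made-up 1-D coordinates; the `knn` and `first_bad_index` helpers are hypothetical and do not reflect how Pegasus computes neighbors.

```python
def knn(points, k):
    """Brute-force k nearest neighbours on 1-D points, returned as row numbers."""
    table = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: abs(points[j] - p))
        table.append(others[:k])
    return table


def first_bad_index(table, n_cells):
    """Return the first neighbour index outside 0..n_cells-1, or None."""
    for row in table:
        for j in row:
            if j >= n_cells:
                return j
    return None


coords = [0, 1, 3, 10, 11, 13]   # six toy cells (hypothetical values)
full_knn = knn(coords, 2)        # indices refer to rows of the FULL dataset

keep_rows = [0, 1, 3, 4]         # rows that survive the barcode subset
n_subset = len(keep_rows)

# Slicing the cached table keeps the rows but not the meaning of the values.
stale_knn = [full_knn[r] for r in keep_rows]
sigmas = [1.0] * n_subset        # one per-cell value, as in the traceback

bad = first_bad_index(stale_knn, n_subset)
print(bad)  # 4

try:
    sigmas[bad]
except IndexError as err:
    print("IndexError:", err)

# Recomputing the neighbours on the subset yields indices that are valid again.
fresh_knn = knn([coords[r] for r in keep_rows], 2)
print(first_bad_index(fresh_knn, n_subset))  # None

# In Pegasus, the suggestion above corresponds to a call along these lines:
#   pegasus.neighbors(pdata_subset, rep=pca_key, use_cache=False)
```

Note that stale indices that happen to stay in range do not raise an error but would still point at the wrong cells, which is another reason to recompute the graph after subsetting.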

@x1han

x1han commented Apr 6, 2024

Hi, I have a question about working with MultimodalData. For example, I used three matrices containing the same cells to build one MultimodalData object, as follows:

MultimodalData object with 3 UnimodalData: 'RNA-RNA', 'RNA-integrated', 'RNA-AN'
    It currently binds to UnimodalData object RNA-AN

When I filter the MultimodalData object by barcode, I get a UnimodalData object that contains only one matrix, like this:

UnimodalData object with n_obs x n_vars = 120 x 2215
    UID: RNA-RNA; Genome: RNA; Modality: RNA
    It contains 1 matrix: 'counts'
    It currently binds to matrix 'counts' as X 

Is there a way to subset the entire MultimodalData object, so that I get a MultimodalData object whose three matrices contain only the selected cells?

Additionally, do you plan to update the PegasusIO tutorials? The current content feels somewhat limited. For instance, creating a MultimodalData object from a matrix, analogous to Seurat's CreateSeuratObject, took me a long time to figure out on my own.

Regardless, Pegasus is still a single-cell analysis tool that I really like. I have already used it in my own work and cited it, and I hope it will continue to improve.

3 participants