How do I extract the list of gene names in each tissue? #39

orrl16 · 2022-11-30T21:15:51Z

I am working with the h5ad.

Additional question: what is the difference between the preprocessing of the raw.X and the X datasets?

aopisco · 2022-11-30T21:26:38Z

The gene names are in adata.var
raw.X is normalized, .X is normalized and scaled

orrl16 · 2022-11-30T21:42:10Z

Thanks a lot!
Where can I download the 'adata.var' file?

aopisco · 2022-11-30T22:26:17Z

It's part of the file: just as you access .X or .raw.X, you can also access .var.

orrl16 · 2022-12-01T00:48:03Z

Thanks a lot! I tried to access it but couldn't. I downloaded the h5ad files for all tissues. This is what I got when displaying the structure of /var in the HDF5 file bat_facs.h5ad (I did not find "adata"):

Dataset 'var'
    Size: 22899
    MaxSize: 22899
    Datatype: H5T_COMPOUND
        Member 'index': H5T_STRING
            String Length: 17
            Padding: H5T_STR_NULLPAD
            Character Set: H5T_CSET_ASCII
            Character Type: H5T_C_S1
        Member 'n_cells': H5T_STD_I64LE (int64)
        Member 'means': H5T_IEEE_F32LE (single)
        Member 'dispersions': H5T_IEEE_F32LE (single)
        Member 'dispersions_norm': H5T_IEEE_F32LE (single)
        Member 'highly_variable': H5T_ENUM
            Base Type: H5T_STD_I8LE
            Member 'FALSE': 0
            Member 'TRUE': 1
    ChunkSize: []
    Filters: none
    FillValue: H5T_COMPOUND

If you could send me the list of gene names used for all tissues in both the FACS and DROPLET samples, that would be great! My email is ***@***.***


orrl16 · 2022-12-01T00:48:52Z

my email is orr.levy et Yale.edu

orrl16 · 2022-12-02T18:52:12Z

it's part of the file, like you access .X or .raw.X you also have .var

I tried looking at .var in these files, but there was no information about the gene list:

https://figshare.com/articles/dataset/Processed_files_to_use_with_scanpy_/8273102/2

aopisco · 2022-12-02T22:49:36Z

@orrl16 the h5ad objects follow the anndata (adata in short) structure: https://anndata.readthedocs.io/en/latest/index.html

orrl16 · 2022-12-03T00:55:33Z

Thanks again!
I looked at 'var' and found a list of 22899 gene names.
Dataset 'var'
Size: 22899
MaxSize: 22899
I can easily extract the list of gene names from that structure. However, the relation between X (the gene expression table) and the index list is still not clear to me. In both .X and .raw.X there are 33538 genes.
How do I match the 22899 indices to the 33538 genes in the gene expression table?
Best regards and thanks again!
