Permuting Specific Genes Option #16

Open
gwaybio opened this issue May 10, 2018 · 1 comment
Comments

@gwaybio
Contributor

gwaybio commented May 10, 2018

I am constructing hetnets in https://github.com/greenelab/interpret-compression and am looking to generate network permutations to use as a null distribution. I am running into the issue of potentially inflated z-scores (see example of similar procedure). Could the inflation be because there are many more genes in the network than what I am comparing against?

For example, if cell 6 doesn't include the total population of genes in the hetnet, won't there be an inflation of artificial zeros in the permuted swap? Would this then cause a deflated null distribution in the matrix multiplication in cell 9?

I am wondering if there could be functionality here to permute a hetnet for only certain genes, rather than only certain metaedges. Perhaps adding a variable nodes_to_include that defaults to all nodes would help. Maybe this addition could happen before deciding to loop over nodes here:

https://github.com/dhimmel/hetio/blob/9d9ef1320ee47609e3c61c9f4918531d3c1c8c96/hetio/permute.py#L15-L18

Would it be of interest to add this functionality here?

An alternative, and perhaps an easier one, would be to regenerate the hetnets in my original scripts so that they include only the genes of interest.
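As a rough illustration of that alternative, the filtering could happen on the edge list before the hetnet is built. This is only a sketch: `restrict_edges`, the edge tuples, and the gene symbols are hypothetical names for illustration and are not part of hetio or of the interpret-compression scripts.

```python
def restrict_edges(edges, genes_of_interest):
    """Keep only (source, gene) edges whose gene is in genes_of_interest.

    Hypothetical helper: edges are plain (source, gene) tuples here,
    not hetio edge objects.
    """
    keep = set(genes_of_interest)
    return [(source, gene) for source, gene in edges if gene in keep]


# Toy edge list (assumed data, for illustration only).
edges = [
    ("pathway_A", "TP53"),
    ("pathway_A", "KRAS"),
    ("pathway_B", "KRAS"),
    ("pathway_B", "EGFR"),
]

subset = restrict_edges(edges, ["TP53", "KRAS"])
print(subset)
```

Note that dropping genes changes the degree of the remaining source nodes, so a network built this way is permuted against a different degree distribution than the full one.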

@dhimmel
Member

dhimmel commented May 10, 2018

> Could the inflation be because there are many more genes in the network than what I am comparing against?

Let's take a look at the DWPC, P-DWPC (average permuted DWPC), and Z-DWPC distributions to see if you are experiencing some sort of artefact like you're concerned about.
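For reference, here is a minimal sketch of how the three quantities relate, assuming the Z-DWPC is the observed DWPC standardized against the permuted DWPCs (mean and sample standard deviation). That definition and the function name `z_dwpc` are assumptions for illustration, not necessarily what the notebook computes.

```python
from statistics import mean, stdev


def z_dwpc(dwpc, permuted_dwpcs):
    """Return (P-DWPC, Z-DWPC) for one source-target pair.

    Assumes Z-DWPC = (DWPC - mean(permuted)) / stdev(permuted).
    """
    p_dwpc = mean(permuted_dwpcs)
    spread = stdev(permuted_dwpcs)
    if spread == 0:
        # No variation across permutations: the z-score is undefined.
        return p_dwpc, float("nan")
    return p_dwpc, (dwpc - p_dwpc) / spread


# Toy values (assumed): one observed DWPC and DWPCs from three permuted hetnets.
p, z = z_dwpc(0.5, [0.1, 0.2, 0.3])
print(p, z)

# If the permuted DWPCs cluster near zero with little spread,
# the same observed DWPC gives a much larger z-score.
_, z_sparse = z_dwpc(0.5, [0.0, 0.0, 0.0, 0.01])
print(z_sparse)
```

Plotting the distribution of each quantity separately should show whether large z-scores come from a high DWPC or from a narrow permuted distribution.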

> there are many more genes in the network than what I am comparing against?

I don't think that is a problem. It is correct to permute the entire network, even though you'll only be accessing a small portion of each permuted network. For XSwap to permute properly, it should swap across all relationships of a given type.
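To make the degree-preserving idea concrete, here is a simplified sketch of an XSwap-style permutation over all edges of a single type. It is not hetio's implementation: the function `xswap`, its parameters, and the toy edges are assumptions for illustration.

```python
import random
from collections import Counter


def xswap(edges, n_swaps, seed=0):
    """Degree-preserving swap on a list of (source, target) edges of one type.

    Picks two edges (a, b) and (c, d) and rewires them to (a, d) and (c, b),
    skipping any swap that would create a duplicate edge.
    """
    rng = random.Random(seed)
    edges = list(edges)
    if len(edges) < 2:
        return edges
    edge_set = set(edges)
    done = 0
    attempts = 0
    max_attempts = 10 * n_swaps  # bail out if few valid swaps exist
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new_1, new_2 = (a, d), (c, b)
        # Also rejects swaps where a == c or b == d, since those
        # would just reproduce the existing edges.
        if new_1 in edge_set or new_2 in edge_set:
            continue
        edge_set -= {edges[i], edges[j]}
        edge_set |= {new_1, new_2}
        edges[i], edges[j] = new_1, new_2
        done += 1
    return edges


# Toy edges of one type (assumed data).
original = [("p1", "g1"), ("p1", "g2"), ("p2", "g3"), ("p3", "g4"), ("p3", "g1")]
permuted = xswap(original, n_swaps=10, seed=1)

# Every node keeps its degree, whichever swaps were accepted.
print(Counter(s for s, _ in permuted) == Counter(s for s, _ in original))
print(Counter(t for _, t in permuted) == Counter(t for _, t in original))
```

Because swap partners are drawn from every edge of the type, restricting the swap to a subset of genes would narrow the pool of possible rewirings and yield a different null.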
