Can we order the top N genes within degplot #33

Open
klai001 opened this issue May 30, 2020 · 5 comments

Comments


klai001 commented May 30, 2020

Hi Lorena,
I noticed that degPlot() orders the top N genes alphabetically.
Is there an argument I can use to order the plots of the top N genes (e.g. top 20) by their degree of variation instead of alphabetically by gene name?

@lpantano
Owner

Hi!

Thank you for the comment. Sadly, there is no way to do this right now, but I can make a quick change on Monday to allow it.

Thank you for the idea!

lpantano added a commit that referenced this issue Jun 1, 2020
Thanks to  @klai001.

issue: #33
Owner

lpantano commented Jun 1, 2020

@klai001, you may want to try this new version; it may do what you expect. You will need to install it with BiocManager::install('lpantano/DEGreport'). Hopefully there are no conflicts.

Owner

lpantano commented Jun 1, 2020

I forgot to say that you need to get the gene names first, sort them in the order you wish, and then pass them to the genes= parameter of the function to plot them.
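
As a rough illustration of the "sort the gene names yourself" step, here is a minimal base R sketch. The matrix values and the names `norm_counts`, `gene_var`, and `ordered_genes` are made up for the example; with real data the matrix would come from something like counts(dds, normalized = TRUE).

```r
# Toy normalized-count matrix (hypothetical values): genes in rows, samples in columns.
norm_counts <- matrix(
  c(10, 12, 11, 13,    # geneA: low variance
     5, 50,  8, 60,    # geneB: high variance
    20, 25, 30, 35,    # geneC: moderate variance
     7,  7,  7,  7),   # geneD: zero variance
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("s1", "s2", "s3", "s4"))
)

# Per-gene variance across samples.
gene_var <- apply(norm_counts, 1, var)

# Names of the top N genes, most variable first.
top_n <- 3
ordered_genes <- names(sort(gene_var, decreasing = TRUE))[seq_len(top_n)]

# ordered_genes is the vector that would be given to genes= in degPlot().
```

Any other ranking (fold change, mean expression, a custom score) can be swapped in for the variance, as long as the result is a character vector of gene names in the desired order.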

Author

klai001 commented Jun 4, 2020

Hi Lorena,
thanks for the parameter 👍
I tried to pick out the most variable genes and order them, but I realised I'm getting totally different top genes in the plot. I wonder if my way of computing the variance and ordering the genes is wrong. I'm missing something but I'm not sure what it is.
Plot 1: before adding the genes= parameter
c<-degPlot(dds=dds, n=50, xs = "group",group="group",groupLab = "sampletype",ann = c("gene_id","symbol"),color="Accent")

Plot 2: after adding the genes= parameter
```
normcounts <- counts(dds, normalized = T)
var_genes <- apply(normcounts, 1, var)
select_var <- names(sort(var_genes, decreasing = TRUE))[1:50]
c2 <- degPlot(dds = normcounts, n = 50, xs = "group", group = "group", groupLab = "sampletype", ann = c("gene_id", "symbol"), color = "Accent", genes = select_var)
```

Owner

lpantano commented Jun 5, 2020

Hi,

I cannot see the plots, but you are not going to get the same genes with these two commands. The first plots the top significant genes according to the adjusted p-value (p-adj); the other plots the top variable genes, which is not the same thing.

There is something odd in the first command: did you forget to include res=? This function needs either res or genes, otherwise it shouldn't work.

Anyway, these two commands won't give you the same results. If you could send me reproducible code, I could try to give more tips.

So, with genes you don't need res, and with res you need n. If you want a particular order, you need to do the calculation outside the function and pass the result to genes. Right now, if you use res and n, it will plot and sort by p-adj, provided you installed the latest change.

This will select genes by p-adj and order them by FC:

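res <- res[!is.na(res$padj), ]           # drop genes without an adjusted p-value
res <- res[order(res$padj), ][1:10, ]    # keep the 10 most significant genes
res <- res[order(res$log2FoldChange), ]  # sort those 10 by fold change
genes <- rownames(res)  # -> give this to the function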

I hope this helps.
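
To make the point above concrete (ranking by adjusted p-value and ranking by variance can select different genes), here is a small base R sketch on invented numbers. The objects `res_toy` and `gene_var_toy` are hypothetical stand-ins for a results table and per-gene variances; none of the values come from the data discussed in this thread.

```r
# Hypothetical results table: one row per gene.
res_toy <- data.frame(
  padj           = c(0.001, 0.20, NA, 0.01),
  log2FoldChange = c(0.5, 3.0, -2.0, -1.2),
  row.names      = c("geneA", "geneB", "geneC", "geneD")
)

# Hypothetical per-gene variances of the normalized counts.
gene_var_toy <- c(geneA = 1.7, geneB = 949, geneC = 41.7, geneD = 0)

# Top 2 genes by adjusted p-value (NA values dropped first).
sig <- res_toy[!is.na(res_toy$padj), ]
top_by_padj <- rownames(sig[order(sig$padj), ])[1:2]

# Top 2 genes by variance.
top_by_var <- names(sort(gene_var_toy, decreasing = TRUE))[1:2]

# Genes selected by both criteria (none in this toy example).
shared <- intersect(top_by_padj, top_by_var)
```

In this toy case the two selections do not overlap at all, which is the kind of mismatch described in the comment above: a gene can vary a lot across samples without being significant for the tested contrast, and the reverse.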
