@maltekuehl
Contributor

@maltekuehl maltekuehl commented Sep 13, 2025

Fixes #305 and closes #359. Adds support for length-based normalization factors as provided by pytximport. This is still very much a draft.

Open issues:

  • Is the example dataset okay? The others seem to be synthetic and much smaller. Do you have simple or synthetic source data that could be used to create an AnnData object with pytximport? Once that is settled, we should probably also add correctness tests that compare against a reference, to prevent accidental drift in the future.
  • I have yet to figure out how size factors should be implemented when normalization factors are also being calculated. Should they be equivalent to the size factors in the standard case, or some transform of the counts adjusted by the normalization matrix? The same question applies to logmeans, as I do not fully understand where else this data is used throughout the code. Any help would be appreciated.
  • The statistical testing structure in PyDESeq2 differs quite a bit from DESeq2, and I may not have grasped all the subtleties. I would be thankful for help from the maintainers to ensure that everything has been adjusted.
  • The initial scaffold of this PR was LLM-generated. I provided ample context (including the issues and the relevant code from DESeq2) and clear instructions, and I have already checked and adjusted the output quite a bit, but it would still be best to examine these changes critically.
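
To make the size-factor question above concrete, here is a minimal NumPy sketch of how DESeq2 appears to combine a length matrix with size factors when it is given average transcript lengths from tximport. It is not the code in this PR. The function names `median_of_ratios` and `length_normalization_factors` are hypothetical, and the samples × genes layout is an assumption chosen to match the AnnData convention.

```python
import numpy as np


def median_of_ratios(counts):
    """Median-of-ratios size factors for a samples x genes matrix."""
    with np.errstate(divide="ignore"):
        log_counts = np.log(counts)
    # Per-gene log geometric mean; -inf for any gene with a zero count.
    log_geo_means = log_counts.mean(axis=0)
    usable = np.isfinite(log_geo_means)
    log_ratios = log_counts[:, usable] - log_geo_means[usable]
    return np.exp(np.median(log_ratios, axis=1))


def length_normalization_factors(counts, lengths):
    """Sketch of length-aware normalization factors (samples x genes).

    Assumes `lengths` is strictly positive and has the same shape as `counts`.
    """
    # Scale each gene's lengths so their geometric mean across samples is 1.
    norm_matrix = lengths / np.exp(np.log(lengths).mean(axis=0))
    # Size factors are estimated on the length-adjusted counts.
    size_factors = median_of_ratios(counts / norm_matrix)
    # Combine both effects, then re-center each gene to geometric mean 1.
    factors = norm_matrix * size_factors[:, None]
    return factors / np.exp(np.log(factors).mean(axis=0))


# Toy usage: two samples, three genes, made-up numbers.
counts = np.array([[10.0, 20.0, 30.0], [20.0, 40.0, 60.0]])
lengths = np.array([[1000.0, 1500.0, 800.0], [1200.0, 1500.0, 700.0]])
normalization_factors = length_normalization_factors(counts, lengths)
normalized_counts = counts / normalization_factors
```

Under this reading, the size factors are an intermediate quantity computed from the length-adjusted counts, and the final matrix replaces them wherever counts are normalized. Whether PyDESeq2 should also store those intermediate size factors is left open here.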

Aside: Tests pass locally but fail here on Python 3.10 because of an older AnnData version. Would you be open to this PR also including an update of pre-commit and GitHub Actions, a full move to uv/hatch/ruff like other scverse ecosystem packages, and targeting Python 3.11 to 3.13?

CC @BorisMuzellec

@BorisMuzellec
Collaborator

Hi @maltekuehl, sorry I haven't had the time to review your PR yet.

To answer your last remark: I'm all for switching to uv for package management. I don't see much of an issue with dropping support for Python 3.10 starting from v0.5.3, as many packages (e.g., numpy) have also stopped supporting it in their latest releases.

Please go ahead if you wish to handle those changes! I'd just suggest you make them in a separate PR.

I'll try to have a look at this PR ASAP.

[pre-commit.ci] pre-commit autoupdate (owkin#415)
Development

Successfully merging this pull request may close these issues.

Plans to add DESeqDataSetFromTximport?
Add support for sample-/gene-dependent normalization factors (e.g., length offsets from pytximport)
