chore: add strict type annotations on the entire codebase #169

Open
wants to merge 7 commits into
base: master
from

Conversation

regisb
Contributor

@regisb regisb commented May 20, 2025

Description:

Add strict type annotations to the entire codebase, such that mypy --strict ... can be run on code_annotations/.

This was very much implemented as a proof-of-concept of what we can achieve with Claude Code integration. I spent ~$13.5 and 4 hours of work to annotate all modules and review all changes. Manual review was very much necessary, as some changes proposed by Claude were not functional.

A few minor issues were discovered along the way, mostly due to missing None checking. For instance click.echo(traceback.print_exc()) always printed None because traceback.print_exc returns None.
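The `print_exc` issue mentioned above can be reproduced with a minimal sketch. The `echo` function here is a hypothetical stand-in for `click.echo` (so the snippet has no third-party dependency), and the function names are illustrative, not taken from the codebase:

```python
import traceback


def echo(message: object) -> str:
    """Hypothetical stand-in for click.echo: returns the text it would print."""
    return str(message)


def buggy_report() -> str:
    try:
        raise RuntimeError("boom")
    except RuntimeError:
        # print_exc() writes the traceback to stderr and returns None,
        # so the echoed text is the string "None".
        return echo(traceback.print_exc())


def fixed_report() -> str:
    try:
        raise RuntimeError("boom")
    except RuntimeError:
        # format_exc() returns the traceback as a string instead.
        return echo(traceback.format_exc())
```

A strict type checker flags the call in `buggy_report` because the result of a function annotated as returning `None` is being used as a value.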

Dependencies: dependencies on other outstanding PRs, issues, etc.

Eventually, I would like all Open edX codebases to implement strict type checking.

Testing instructions:

Run make test-quality.

Merge checklist:

  • All reviewers approved
  • CI build is green
  • Version bumped
  • Changelog record added
  • Documentation updated (not only docstrings)
  • Commits are squashed

Post merge:

  • Create a tag
  • Check new version is pushed to PyPi after tag-triggered build is
    finished.
  • Delete working branch (if not needed anymore)

Author concerns:

I understand that the list of changes is large, but unfortunately a full review of all changes is necessary.

@openedx-webhooks added the "open-source-contribution" (PR author is not from Axim or 2U) and "core contributor" (PR author is a Core Contributor, who may or may not have write access to this repo) labels May 20, 2025
@openedx-webhooks

Thanks for the pull request, @regisb!

This repository is currently maintained by @bmtcril.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.


Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions May 20, 2025
@regisb regisb force-pushed the regisb/types branch 2 times, most recently from c718913 to 49eff5c Compare May 20, 2025 15:14
@bmtcril
Contributor

bmtcril commented May 20, 2025

Hey @regisb! I'm very supportive of this effort, and happy to review PRs adding type hints anywhere I can. I don't think Axim is likely to push forward an initiative to add type hints everywhere, but I think that an ADR to OEP-67 indicating that type hints should be added to new Python code would be a welcome first step. Maybe something similar to the one there for TypeScript?

I need to temporarily block this PR due to the use of Claude in creating it while I consult our legal team. We don't yet have specific guidance on adding LLM generated code to the org, but I know work is being done there. The closest thing that I know of is this. I'll update here once I get more information.

Contributor

@bmtcril bmtcril left a comment

I haven't reviewed yet, but need to block on LLM generated code temporarily while I consult Axim legal.

@regisb
Contributor Author

regisb commented May 27, 2025

Hi Ty, any update?

@bmtcril
Contributor

bmtcril commented May 27, 2025

@regisb we received some guidance on Friday, I'm writing up the document for CCs today but will need approval on the language. I hope to have it cleared today or tomorrow.

Contributor

@bmtcril bmtcril left a comment

Thanks for your patience while we got the policy set up and I reviewed. The AI policy is now live here: https://openedx.atlassian.net/wiki/spaces/COMM/pages/5022416899/Open+edX+Policy+for+Generative+AI+Tools . Please review it and make sure the PR is in compliance.

I've gone ahead with the review, and have some comments. From the reviewing side, it was more difficult to review than a fully human PR, especially in areas where the LLM lost context. I would normally be able to assume that if a person is using dict, list, etc. for their typing, they understand that typing.Dict and typing.List are deprecated, and can be trusted to generally follow that. In this case, different files "felt" like they were written by different people who each had different assumptions and knowledge. The end result was that I had to fully commit my attention to every line (where I could normally trust the author and skim), and I also found more issues than I would generally find in a PR of this type.
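For reference, a minimal sketch of the built-in generic style mentioned above, assuming Python 3.9 or later. The function and its names are hypothetical, not taken from the PR:

```python
def group_by_prefix(names: list[str]) -> dict[str, list[str]]:
    """Group dotted names by their first component (illustrative only)."""
    # Built-in generics (list[str], dict[str, ...]) replace the deprecated
    # typing.List and typing.Dict aliases.
    groups: dict[str, list[str]] = {}
    for name in names:
        prefix = name.split(".", 1)[0]
        groups.setdefault(prefix, []).append(name)
    return groups
```

With this style, no `typing` import is needed for the common container types.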

How do you feel about having to scatter asserts around to appease the linter? I have mixed feelings about it and I know that people don't generally run our code with -O but I always worry about it. 😄 I think it could provide more useful errors than having things fall through to other code and fail later or have random Nones showing up, but might also raise inscrutable errors that might be better served with explicit error messages. I feel like this will be a recurring theme adding type hints, and don't have strong opinions on it yet.
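The pattern under discussion can be sketched as follows. The names are hypothetical and only illustrate how an `assert` narrows an optional value for the type checker:

```python
from typing import Optional


def find_config(paths: dict[str, str], key: str) -> Optional[str]:
    """Hypothetical lookup that may return None."""
    return paths.get(key)


def config_length(paths: dict[str, str], key: str) -> int:
    value = find_config(paths, key)
    # The assert narrows Optional[str] to str for the type checker.
    # Under `python -O` assert statements are removed, so a None value
    # would instead fail on the next line with a TypeError from len().
    assert value is not None, f"no config entry for {key!r}"
    return len(value)
```

The assertion message is what keeps the failure readable in normal runs; without it, the error carries no hint about which value was missing.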

@mphilbrick211 mphilbrick211 moved this from Needs Triage to In Eng Review in Contributions May 27, 2025
regisb added 5 commits May 28, 2025 10:17
This target has become a standard across Open edX repositories. It
allows developers to compile requirements without upgrading, which is
what we usually want to do whenever we add a new requirement.
We run mypy in non-strict mode, for now. But we should strive to
annotate the entire codebase.
We annotate the entire codebase to support `--strict` mode in mypy.
Thus, we simplify tox.ini and make it possible to run quality targets
with `make`.
@regisb regisb force-pushed the regisb/types branch 2 times, most recently from 180c6d8 to 451ff7b Compare May 28, 2025 08:24
@regisb
Contributor Author

regisb commented May 28, 2025

Thanks for the thorough review!

typing.List/Dict/Tuple/Set/Type vs list/dict/tuple/set/type

Claude was adding the typing.* annotations, and I manually searched-and-replaced them. I missed a few in the process -- and I wasn't familiar with type[...] or re.Pattern in the first place.
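As a short illustration of the two annotations named above, assuming Python 3.9 or later. The classes, the regular expression, and the function names are hypothetical:

```python
import re


class Annotation:
    """Hypothetical base class."""


class PiiAnnotation(Annotation):
    """Hypothetical subclass."""


def build(cls: type[Annotation]) -> Annotation:
    # type[Annotation] means "the class itself, or a subclass", not an instance.
    return cls()


# re.Pattern[str] is the type of a compiled pattern that matches str input.
TOKEN_RE: re.Pattern[str] = re.compile(r"\.\.\s+(\w+):")


def find_tokens(pattern: re.Pattern[str], text: str) -> list[str]:
    return pattern.findall(text)
```

These replace `typing.Type[...]` and `typing.Pattern[...]` in the same way that `list[...]` replaces `typing.List[...]`.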

assert statements

These used to make me anxious, but that's no longer the case. Now I think of them as instances of AssertionError. Maybe this is not the right exception class in some cases, where they should be replaced by ValueError or TypeError. I added those assert statements myself, because the solution implemented by Claude was not appropriate.

I do think that we need to raise exceptions when type checks don't cover our back. Debugging is much more difficult without them. But I also think that pull requests for new features should not include any new assert statement: they are usually an implementation red flag.
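One way to replace an `assert` with the more specific exceptions mentioned above is sketched here. This is an illustration under assumed names, not code from the PR:

```python
def validate_token(raw: object) -> str:
    """Hypothetical validator that raises explicit errors instead of asserting."""
    if not isinstance(raw, str):
        # Wrong type: TypeError, with a message naming the actual type.
        raise TypeError(f"annotation token must be a str, got {type(raw).__name__}")
    if not raw:
        # Right type, unacceptable value: ValueError.
        raise ValueError("annotation token must not be empty")
    # The isinstance check has narrowed `raw` to str for the type checker.
    return raw
```

Unlike an `assert`, these checks are not removed when Python runs with `-O`.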

Contributor

@bmtcril bmtcril left a comment

It looks like some things were fixed, some partially, and some marked as resolved but not fixed. I'm not sure if that was intentional or not, but here's another pass at review.

@regisb
Contributor Author

regisb commented May 28, 2025

Sorry for the hassle, I had forgotten to save some files in my IDE after several search-and-replace passes.

Contributor

@bmtcril bmtcril left a comment

Looks good, thanks for taking this on!

@regisb
Contributor Author

regisb commented May 29, 2025

Thanks for the review Ty! There were a couple of lessons learned for me here:

  1. Almost all typing.* annotations can be replaced by native ones.
  2. Annotate code with MonkeyType before running an LLM.
  3. Don't forget to save files in a large PR...

Now I'll take a stab at fixing the code coverage issue.

Labels
core contributor PR author is a Core Contributor (who may or may not have write access to this repo). open-source-contribution PR author is not from Axim or 2U
Projects
Status: In Eng Review
3 participants