Paper: Improving Code Quality with Array and DataFrame Type Hints #906

Merged: 19 commits on Sep 25, 2024

Conversation

flexatone
Contributor

@flexatone flexatone commented May 23, 2024

If you are creating this PR in order to submit a draft of your paper, please name your PR with Paper: <title>. An editor will then add a paper label and GitHub Actions will be run to check and build your paper.

See the project readme for more information.

Editor: Charles Lindsey @cdlindsey

Reviewers:

@flexatone flexatone changed the title Paper: Improve Code Quality with Array and DataFrame Type Hints Paper: Improving Code Quality with Array and DataFrame Type Hints May 23, 2024
@hongsupshin hongsupshin added the paper This indicates that the PR in question is a paper label May 23, 2024

github-actions bot commented May 23, 2024

Curvenote Preview

| Directory | Preview | Checks | Updated (UTC) |
| --- | --- | --- | --- |
| papers/christopher_ariza | 🔍 Inspect | 24 checks passed (8 optional) | Aug 7, 2024, 3:27 PM |

@EngineerKhan
Collaborator

I have read this paper three times and cannot find any issue with it. It is well written, easy to follow, and builds its narrative in an excellent manner.

Some Minor Suggestions (can be ignored):

  • "Just like a dictionary, a DataFrame is a complex data structure composed of many component types: the type of the index labels, the type of the column labels, and the types of column values." - the repeated "the type of" is a bit redundant and could be rewritten.
  • The raised errors appear in the code comments, but some of them (at the author's discretion) could also be explained briefly.
  • "The same annotations can be used for runtime validation. While reliance on duck-typing over runtime validation is common in Python, runtime validation is often needed with complex data structures such as arrays and DataFrames." - perhaps an example could be given to support this claim?
  • At some point, the section under "DataFrame Type Annotations" becomes a bit hard to read. Maybe we can add some subheadings there? Again, this is something the author can better determine.

Overall, a well-written article. Not only did I learn new things, but I also enjoyed reading it thanks to its smooth style.

@flexatone
Contributor Author

Many thanks, @EngineerKhan, for your suggestions and feedback.

@vicadesoba

vicadesoba commented Jul 9, 2024

@flexatone Glad to review your paper and nice to meet you! Just a quick intro: I was introduced to the open source community during my time at makepath, a geospatial data science company that maintains several open source packages, and I'm involved with Project Bokeh (as much as time allows). Recently, I started working for a process engineering company that isn't involved in the open source community, so volunteering with SciPy gives me a small way to stay involved!

@cbcunc cbcunc requested a review from EngineerKhan July 14, 2024 22:20
@cdlindsey
Contributor

Thanks for the interesting paper. I think it's worth mentioning how these issues are handled (or not) with TensorFlow and Pandas.

@flexatone
Contributor Author

Many thanks, @cdlindsey, for your feedback. I have added a small section describing how these issues are handled in PyTorch, TensorFlow, and Pandas.

@cbcunc cbcunc merged commit e0971ab into scipy-conference:2024 Sep 25, 2024
4 checks passed
Labels
paper This indicates that the PR in question is a paper ready-for-review

7 participants