Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update standards based on Nov 2019 PyHC meeting discussion #16

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 15 additions & 8 deletions standards.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,20 @@

---

Drafted during the Heliopython Meeting of November 2018.
Drafted during the Python in Heliophysics Community Meeting of November 2018.

Amended during the Python in Heliophysics Community Meeting of November 2019.

---

Agreed on 10-Dec-2018 by (in alpahbetical order)
Agreed on 10-Dec-2018 by (in alphabetical order)

**A. Annex** (JHU), **B. L. Alterman** (Univ. of Michigan), **A. Azari** (Univ. of Michigan), **W. Barnes** (Rice Univ.), **M. Bobra** (Stanford), **B. Cecconi** (Observartoire de Paris), **S. Christe** (NASA GSFC), **J. Coxon** (Univ. of Southampton), **A. DeWolfe** (LASP), **A. Halford** (Aerospace Corporation), **B. Harter** (LASP), **J. Ireland** (NASA GSFC), **J. Jahn** (SwRI), **J. Klenzing** (NASA GSFC), **M. Liu** (SunPy), **J. Mason** (NASA GSFC), **R. McGranaghan** (NASA JPL), **N. Murphy** (CfA), **S. Murray** (Trinity College Dublin), **J. Niehof** (Univ. of New Hampshire), **M.D. Nguyen** (Lomonosov Moscow State Univ.), **R. Panneton** (CU/LASP), **A. Pembroke** (NASA GSFC), **D. Pérez-Suárez** (University College London), **C. Piker** (Univ. of Iowa), **A. Roberts** (NASA GSFC), **D. Ryan** (NASA GSFC), **S. Savage** (NASA GSFC), **J. Smith** (NASA GSFC, Catholic Univ.), **D. Stansby** (Imperial College London), **J. Vandegriff** (JHU/APL), **R. S. Weigel** (George Mason University)

Amended on 05-Nov-2019 by (in alphabetical order)

**N. Murphy** (CfA),

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A. G. Burrell (NRL). Also, A. Halford is now at NASA GSFC.

---

Definitions:
Expand All @@ -29,12 +35,13 @@ Definitions:
5. **License**: Projects must provide a license. Projects should use permissive licenses for open source scientific software (e.g., the BSD 2-clause, BSD 3-clause, or BSD+Patent licenses). Copyleft licenses such as GPL are not recommended and OSI-approved permissive licenses are recommended.
6. **Version control**: All code must use version control. It is strongly recommended that projects make use of a distributed version control system (e.g., git).
7. **Coding Style**: Projects must adopt the basic style recommendations of [PEP 8](https://www.python.org/dev/peps/pep-0008/) and static analysis tools should be used to identify deviations from the basic style recommendations (e.g. pylint, flake8, pycodestyle).
8. **Documentation**: All functions, classes, and modules must have documentation strings (docstrings) provided in a standard [conventions](https://www.python.org/dev/peps/pep-0257/) (e.g. [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html)). Docstrings must describe the code’s purpose, describe all inputs and outputs, and provide examples. High level documentation must also be provided as guides, tutorials, and developer docs. Documentation must be provided in version control with the code and be made available online in a readable form.
9. **Testing**: Stable packages must provide unit tests of individual components (e.g. functions, classes) as well as integration tests that test the interaction between components that covers most of the code. Testing coverage should be measured. Automated testing is recommended, in which tests are run before any code is merged. System[link] and acceptance[link] testing are also recommended.
8. **Documentation**: All functions, classes, and modules in the public-facing application programming interface (API) must have documentation strings (docstrings) provided in a standard [convention](https://www.python.org/dev/peps/pep-0257/) (e.g. [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html)). Docstrings must describe the code’s purpose, describe all inputs and outputs, and provide examples. High level documentation must also be provided as guides, tutorials, and developer docs. Documentation must be provided in version control with the code and be made available online in a readable form.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this change. This suggests that non public-facing functions do not need doc strings. How is a contributor supposed to interact with developer code that is not documented?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm...good point. I've occasionally written short private helper functions/methods with no arguments that have their one-line docstring encoded in the function name, so the docstring doesn't end up adding any additional information. I'm also sometimes lax on docstrings for unstable temporary implementations that I know will not be lasting long. Perhaps we could say that all public facing functions/etc must have docstrings, and that private functions/etc should generally have docstrings?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "all public facing functions/etc must have docstrings, and that private functions/etc should generally have docstrings" is a good compromise. I know there are non-public facing functions in SciPy that use internal comments instead of docstrings when the functions are never supposed to be interacted with externally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe docstrings and comments are always needed if the code is self-explanatory. If the function has a clear "verb" that explains by itself what's the intention then comments and documentation may, not only, be redundant but also it can make the project harder to maintain - as it may be forgotten and never updated even making the documentation misleading! (and that's worse than no documentation at all!!)
So, rather than force people to have docstrings I would prefer to encourage good practices when naming functions, variables, etc. (which we don't have anything on that respect and I believe it's as important, or more, than documentation in itself). Chapter 2 of Clean Code is a good text about why we should encourage it more.

In any case, I agree with @namurphy's must/should rewrite.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly requiring full numpydoc docstrings on all private functions is overkill. +1 on @aburrell 's suggestion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with going with the "should". I don't really like the idea of encouraging "second-class" functions that are more poorly documented than others but I understand the concern here. I do know that for sunpy there are many important non-public facing functions that need good documentation for developers to interact with them!

9. **Testing**: Stable packages must provide unit tests of individual components (e.g. functions, classes) as well as integration tests that test the interaction between components that covers most of the code. Testing coverage should be measured. Automated testing is recommended, in which tests are run before any code is merged. System and acceptance testing are also recommended.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should add links to all the four types of tests that are mentioned here. I've just found the software testing fundamentals website which describe them quite well. Otherwise, people that haven't obtained any software development training wouldn't really understand the differences.

10. **Dependencies**: Projects should import the minimum number of packages necessary. Adding new dependencies should be a __considered__ decision.
11. **Python 3**: All packages must be compatible or work towards being compatible with Python 3. Providing ongoing support for Python 2 is not recommended as the end of life for Python 2 is January 1, 2020 (see [PEP 373](https://www.python.org/dev/peps/pep-0373/)).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would update this to state that Python 2 has reached its end of life and would rephrase to be more forceful in that all packages MUST support Python 3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed completely! This slipped my mind because I was making these edits in 2019.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. python3 may be redundant with 12. deprecation. It may now (2020) finally be sufficient to just omit mention of python2 and just mention NEP29.

12. **Duplication**: Duplication of code and functionality is discouraged. Forking projects into new projects is strongly discouraged.
13. **Collaboration**: Contributions to packages must be encouraged. Packages must provide contribution guidelines and clearly explain when a contribution is not accepted.
14. **Binaries**: Binary files should be added to the package repository only when necessary in order to keep packages as light as possible. Jupyter notebooks can be binary files and should not be committed to the package repository but can be provided in other repositories.
15. **Code of conduct**: Each project must adopt a code of conduct that is compatible with the [Contributor Covenant](https://www.contributor-covenant.org) and make it publicly available.
12. **Deprecation Policy** (in accordance with [NEP 29](https://numpy.org/neps/nep-0029-deprecation_policy.html)): Each project should support (1) all minor versions of Python released 42 months prior to the project, and at minimum the two latest minor versions; and (2) all minor versions of NumPy released in the 24 months prior to the project, and at minimum the last three minor versions. In ``setup.py``, the ``python_requires`` variable should be set to the minimum supported version of Python. All supported minor versions of Python should be in the test matrix and have binary artifacts built for the release. Minimum Python and NumPy version support should be adjusted upward only on major and minor releases, and never on a patch release.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...In setup.py or setup.cfg, the python_requires ...

Although perhaps too technical for the standards document, in my opinion setup.cfg should be used as much as possible to allow better tool introspection without needing to install. For security, best practices, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @scivision. We should encourage to keep that metadata on setup.cfg or pyproject.toml.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this just be "projects should comply with NEP 29"?


13. **Duplication**: Duplication of code and functionality is discouraged. Forking projects into new projects is strongly discouraged.
14. **Collaboration**: Contributions to packages must be encouraged. Packages must provide contribution guidelines, and clearly and constructively explain when a contribution is not accepted.
15. **Binaries**: Binary files should be added to the package repository only when necessary in order to keep packages as light as possible. Jupyter notebooks can be binary files and should not be committed to the package repository but can be provided in other repositories.
16. **Code of conduct**: Each project must adopt a code of conduct that is compatible with the [Contributor Covenant](https://www.contributor-covenant.org) and make it publicly available. Each project should have a reporting procedure for reporting violations of the code of conduct and make it publicly available (for a highly detailed example, refer to the [Code of Conduct Response and Enforcement Manual for NumFOCUS](https://numfocus.org/code-of-conduct/response-and-enforcement-events-meetups)).