- Three topics in particular that we'll cover are modularity, documentation, and testing.
- Modularity implies dividing code into shorter functional units, which are more readable, maintainable and portable
- We can write modular code in Python by leveraging packages, classes, and methods
- Documentation includes using comments, docstrings, and self-documenting code to document your data science Python projects
- Testing can be both manual and automated:
- It's definitely worthwhile to perform manual tests
- But tools like the `pytest` package can automatically run and re-run your tests to ensure your code works as intended, even after you add new functionality
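As a rough sketch of what an automated test can look like, the snippet below pairs a small function with two tests written in the style `pytest` expects. The function `celsius_to_fahrenheit` and the test names are made up for illustration.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# pytest collects functions whose names start with `test_` and treats a
# failing `assert` as a failing test.
def test_freezing_point():
    assert celsius_to_fahrenheit(0) == 32


def test_boiling_point():
    assert celsius_to_fahrenheit(100) == 212
```

Saved in a file such as `test_temperature.py`, these tests can be re-run at any time by calling `pytest` from the same directory.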
- 'Python Package Index' (PyPI) gives us an easy platform to leverage published packages
- Thanks to packages being modular, we can easily install them from PyPI using a tool called `pip`
- `pip` is a recursive acronym that stands for 'Pip Installs Packages', and it does just that
- To read the documentation of packages or data types, use `help(object_name)`
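`help()` works on your own code as well, provided it has docstrings. A minimal sketch, using a made-up function `count_words`:

```python
def count_words(text):
    """Return the number of whitespace-separated words in `text`."""
    # str.split() with no arguments splits on any run of whitespace.
    return len(text.split())


# help() prints the function signature followed by its docstring.
help(count_words)

# The raw docstring is also available as an attribute.
docstring = count_words.__doc__
```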
- PEP 8 is the de facto style guide for Python code
- It lets us know how to format our code to be as readable as possible, and to quote PEP 8, 'code is read much more often than it is written'
- To ensure your code keeps up with PEP 8, you can use the `pycodestyle` package:
- Using `pycodestyle` in an editor:
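Outside an editor, the same check can be run from Python. The sketch below uses `StyleGuide` from `pycodestyle`, which has to be installed separately with `pip install pycodestyle`. The helper name `count_pep8_violations` and the temporary file are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def count_pep8_violations(source):
    """Return how many PEP 8 violations pycodestyle reports for `source`."""
    # Imported inside the function so this module still loads when the
    # third-party pycodestyle package is not installed.
    from pycodestyle import StyleGuide

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "snippet.py"
        path.write_text(source)
        # quiet=True keeps pycodestyle from printing each violation.
        report = StyleGuide(quiet=True).check_files([str(path)])
    return report.total_errors
```

For example, `count_pep8_violations("x=1\n")` should report at least one violation, because PEP 8 asks for spaces around the assignment operator.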
- A minimal Python package consists of two elements: a directory and a Python file
- The name of the directory will be the name of the package, but how should you name it?
- PEP 8 states that packages should have short, all-lowercase names
- The use of underscores in a package name is discouraged, but you can and should use them if it improves readability
- It's ideal to pick a name that conveys the functionality of the package
- The file in our newly branded directory doesn't have any flexibility in naming
- We must name it `__init__.py`
- This file lets Python know that the directory we created is a package
- With this structure we've created a package that we can import just like we would import numpy or any other package
- To import a local package, we need to establish its path:
- To import it (if the package and your script are in the same directory):
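A minimal sketch of the layout described above, built in a temporary directory so it can be run anywhere. Adding the directory to `sys.path` by hand is an assumption that stands in for "the script sits next to the package".

```python
import sys
import tempfile
from pathlib import Path

# Build the minimal layout in a throwaway directory:
#   <project_dir>/my_package/__init__.py
project_dir = Path(tempfile.mkdtemp())
package_dir = project_dir / "my_package"
package_dir.mkdir()
(package_dir / "__init__.py").touch()  # an empty file is enough

# A script sitting next to my_package/ finds it automatically, because the
# script's own directory is on sys.path. Here we add the path by hand.
sys.path.insert(0, str(project_dir))

import my_package

print(my_package.__name__)
```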
- To add functionality to the package, we start by adding a `.py` file in the package directory:
- To add the functionality and access it:
- Alternative to access the functionality:
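The two ways of reaching the new functionality can be sketched as follows. The module name `utils.py` and the function `shout` are hypothetical, and the package is again built in a temporary directory.

```python
import sys
import tempfile
from pathlib import Path

project_dir = Path(tempfile.mkdtemp())
package_dir = project_dir / "my_package"
package_dir.mkdir()

# my_package/utils.py holds the new functionality.
(package_dir / "utils.py").write_text(
    'def shout(text):\n'
    '    """Return text in upper case."""\n'
    '    return text.upper()\n'
)

# Importing the function in __init__.py exposes it at the package level.
(package_dir / "__init__.py").write_text("from .utils import shout\n")

sys.path.insert(0, str(project_dir))
import my_package.utils

# Access through the module...
via_module = my_package.utils.shout("hello")
# ...or directly on the package, thanks to the import in __init__.py.
via_package = my_package.shout("hello")
```

With an empty `__init__.py`, only the first form (`my_package.utils.shout`) would work.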
- To extend the package structure:
- You can also extend the package structure by building packages inside your package (subpackages):
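A subpackage is simply a directory with its own `__init__.py` inside the package. In the sketch below, the names `preprocessing`, `cleaning.py`, and `strip_all` are hypothetical.

```python
import sys
import tempfile
from pathlib import Path

project_dir = Path(tempfile.mkdtemp())

# my_package/
#     __init__.py
#     preprocessing/          <- subpackage
#         __init__.py
#         cleaning.py
subpackage_dir = project_dir / "my_package" / "preprocessing"
subpackage_dir.mkdir(parents=True)
(project_dir / "my_package" / "__init__.py").touch()
(subpackage_dir / "__init__.py").touch()
(subpackage_dir / "cleaning.py").write_text(
    "def strip_all(items):\n"
    "    return [item.strip() for item in items]\n"
)

sys.path.insert(0, str(project_dir))
from my_package.preprocessing import cleaning

cleaned = cleaning.strip_all(["  a", "b  "])
```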
- Now that you have a functional package you might want to share it with your colleagues
- The two main steps to sharing a Python package are creating `setup.py` and `requirements.txt`
- These two pieces provide information on how to install your package and recreate its required environment
- These files list information about what dependencies you've used as well as allowing you to describe your package with additional metadata
- `requirements.txt` lists the packages your project depends on, optionally with version constraints
- Running `pip install -r requirements.txt` installs all the listed packages at the specified versions
- Note that we didn't actually install our package, we just recreated its environment
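For illustration, a `requirements.txt` might look like the following. The package names and version numbers are placeholders, not necessarily the ones used by this project. A bare name accepts any version, `==` pins an exact version, and `>=` sets a minimum.

```
matplotlib
numpy==1.15.4
pycodestyle>=2.4.0
```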
- `setup.py` describes the package through a set of arguments. Some of the less obvious ones are `install_requires` and `packages`
- `packages` in essence lists the location of all the `__init__.py` files in our package. Our package has a single `__init__.py` file, and it's in the directory `my_package`
- More complex packages might include subpackages with their own `__init__.py` files; if that were the case, we would also list their locations here
- Until you start writing more complex packages, the contents of the `packages` list will likely be the same as the `name` argument
- `install_requires` might look familiar: in the case of our package, the contents are the same as our requirements file
- There are cases where `install_requires` may differ from `requirements.txt`, for example when `requirements.txt` pins exact versions to recreate an environment while `install_requires` allows a range of compatible versions
- Now that we've completed our `setup.py`, we can install our package using `pip install .` from inside the directory that contains `setup.py` and our package
- This will install our package at an environment level, so we can import it into any Python script using the same environment
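Putting the arguments discussed above together, a `setup.py` for `my_package` could look roughly like this. It assumes `setuptools` is used; the version, description, author, and dependency list are placeholders.

```python
# setup.py, placed next to the my_package/ directory
from setuptools import setup

setup(
    name="my_package",
    version="0.0.1",                      # placeholder
    description="An example package",     # placeholder
    author="Your Name",                   # placeholder
    packages=["my_package"],              # directories containing __init__.py
    install_requires=[                    # placeholder dependencies
        "matplotlib",
        "numpy==1.15.4",
        "pycodestyle>=2.4.0",
    ],
)
```

This file is read by `pip` during `pip install .`; it is not meant to be run on its own.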