generated from v-goncharenko/data-science-template
Commit 0601109 (1 parent: f59aa0a)
Showing 2 changed files with 24 additions and 65 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,73 +1,24 @@ | ||
# Template for Data Science Project | ||
# Some useful tools for Python [in context of Data Science] | ||
|
||
This repo aims to give a robust starting point to any Data Science related | ||
project. | ||
Here I gather functions that I need. | ||
|
||
It contains a ready-made tool setup so you can start adding dependencies and coding. | ||
Hopefully the documentation will be published one day, but not yet. | ||
|
||
To familiarize yourself with the tools used here, watch | ||
[my talk on Data Science project setup (in Russian)](https://youtu.be/jLIAiDMyseQ) | ||
## Installation | ||
|
||
**If you use this repo as a template - leave a star please** because template | ||
usages don't count in Forks. | ||
It's already published on PyPI, so simply run: | ||
|
||
## Workflow | ||
`pip install somepytools` | ||
|
||
Experiments and technology discovery are usually performed in Jupyter Notebooks. | ||
The `notebooks` directory is reserved for them. More info on working with Notebooks | ||
can be found in `notebooks/README.md`. | ||
## Reference | ||
|
||
More mature parts of the pipeline (functions, classes, etc.) are stored in `.py` files | ||
in the main package directory (by default `ds_project`). | ||
Modules include: | ||
|
||
## What to change? | ||
- extended typing module | ||
- common read-write operations for configs | ||
- utils to work with filesystem | ||
- functions to handle videos in opencv | ||
- torch utilities (infer and count parameters) | ||
- and more (e.g. a wrapper to convert string inputs to `pathlib` paths) | ||
|
||
- project name (default: `ds_project`) | ||
- in `pyproject.toml` - tool.poetry.name | ||
- main project directory (`ds_project`) | ||
- test in `tests` directory | ||
- line length (default: `90`) [Why 90?](https://youtu.be/esZLCuWs_2Y?t=1287) | ||
- in `pyproject.toml` in blocks | ||
- black | ||
- isort | ||
- in `setup.cfg` for `flake8` | ||
- in `.pre-commit-config.yaml` for `prettier` | ||
|
||
## How to set up an environment? | ||
|
||
This template uses `poetry` to manage dependencies of your project. They | ||
|
||
First you need to | ||
[install poetry](https://python-poetry.org/docs/#installation). | ||
|
||
Then, if you use `conda` (recommended) to manage environments (to use a regular | ||
virtualenv just skip this step): | ||
|
||
- tell `poetry` not to create a new virtualenv for you | ||
|
||
(instead `poetry` will use the currently activated virtualenv): | ||
|
||
`poetry config virtualenvs.create false` | ||
|
||
- create a new `conda` environment for your project (change the env name to the | ||
one you want): | ||
|
||
`conda create -n ds_project python=3.9` | ||
|
||
- activate the environment: | ||
|
||
`conda activate ds_project` | ||
|
||
Now you are ready to add dependencies to your project. For this, use the | ||
[`add` command](https://python-poetry.org/docs/cli/#add): | ||
|
||
`poetry add scikit-learn torch <any_package_you_need>` | ||
|
||
Next run `poetry install` to check that your final state matches the configs. | ||
|
||
After that, add the changes to git and commit them: | ||
`git add pyproject.toml poetry.lock` | ||
|
||
Finally, add `pre-commit` hooks to git: `pre-commit install` | ||
|
||
At this point you are ready to write clean, reproducible code! | ||
For now, it's better to go through the files and look at their contents |
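
The new README lists a wrapper that converts string inputs to `pathlib`, but the diff does not show how it is implemented. The following is only a minimal sketch of one way such a wrapper could work, assuming a decorator that looks at type annotations; the names `str_to_path` and `file_stem` are hypothetical and are not taken from `somepytools`.

```python
import functools
import inspect
from pathlib import Path


def str_to_path(func):
    """Hypothetical decorator: turn str arguments annotated as Path into Path objects."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Map the call's positional and keyword arguments onto parameter names.
        bound = signature.bind(*args, **kwargs)
        for name, value in list(bound.arguments.items()):
            annotation = signature.parameters[name].annotation
            # Only touch parameters explicitly annotated as Path.
            if annotation is Path and isinstance(value, str):
                bound.arguments[name] = Path(value)
        return func(*bound.args, **bound.kwargs)

    return wrapper


@str_to_path
def file_stem(path: Path) -> str:
    # Inside the function, `path` is always a Path, even if a str was passed.
    return path.stem
```

With this sketch, `file_stem("data/config.toml")` and `file_stem(Path("data/config.toml"))` behave the same way. Note that the annotation check relies on real annotation objects, so it would not work unchanged in a module that uses `from __future__ import annotations`.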
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,13 @@ | ||
from pathlib import Path | ||
|
||
import somepytools | ||
from somepytools.io import read_toml | ||
|
||
|
||
curr_dir = Path(__file__).resolve().parent | ||
|
||
|
||
def test_version(): | ||
    assert somepytools.__version__ == "0.1.0" | ||
    pyproject = read_toml(curr_dir / "../pyproject.toml") | ||
|
||
    assert somepytools.__version__ == pyproject["tool"]["poetry"]["version"]