Added readme and fixed test suite
v-goncharenko committed Apr 3, 2022
1 parent f59aa0a commit 0601109
Showing 2 changed files with 24 additions and 65 deletions.
79 changes: 15 additions & 64 deletions README.md
@@ -1,73 +1,24 @@
-# Template for Data Science Project
+# Some useful tools for Python [in context of Data Science]

-This repo aims to give a robust starting point to any Data Science related
-project.
+Here I gather functions that I need.

-It contains a ready-made tool setup so you can start adding dependencies and coding.
+Hopefully it will have published documentation one day, but not for now.

-To get yourself familiar with the tools used here, watch
-[my talk on Data Science project setup (in Russian)](https://youtu.be/jLIAiDMyseQ)
+## Installation

-**If you use this repo as a template - leave a star please** because template
-usages don't count in Forks.
+It's already published on PyPI, so simply

-## Workflow
+`pip install somepytools`

-Experiments and technology discovery are usually performed in Jupyter Notebooks.
-The `notebooks` directory is reserved for them. More info on working with Notebooks
-can be found in `notebooks/README.md`.
+## Reference

-More mature parts of the pipeline (functions, classes, etc.) are stored in `.py` files
-in the main package directory (by default `ds_project`).
+Modules include:

-## What to change?
+- extended typing module
+- common read-write operations for configs
+- utils to work with the filesystem
+- functions to handle videos in OpenCV
+- torch utilities (infer and count parameters)
+- even more (e.g. wrapper to convert string inputs to `pathlib`)

-- project name (default: `ds_project`)
-  - in `pyproject.toml` - tool.poetry.name
-  - main project directory (`ds_project`)
-  - test in `tests` directory
-- line length (default: `90`) [Why 90?](https://youtu.be/esZLCuWs_2Y?t=1287)
-  - in `pyproject.toml` in blocks
-    - black
-    - isort
-  - in `setup.cfg` for `flake8`
-  - in `.pre-commit-config.yaml` for `prettier`
-
-## How to setup an environment?
-
-This template uses `poetry` to manage dependencies of your project. They
-
-First you need to
-[install poetry](https://python-poetry.org/docs/#installation).
-
-Then if you use `conda` (recommended) to manage environments (to use regular
-virtualenv just skip this step):
-
-- tell `poetry` not to create new virtualenv for you
-
-  (instead `poetry` will use currently activated virtualenv):
-
-  `poetry config virtualenvs.create false`
-
-- create new `conda` environment for your project (change env name for your
-  desired one):
-
-  `conda create -n ds_project python=3.9`
-
-- activate environment:
-
-  `conda activate ds_project`
-
-Now you are ready to add dependencies to your project. For this use
-[`add` command](https://python-poetry.org/docs/cli/#add):
-
-`poetry add scikit-learn torch <any_package_you_need>`
-
-Next run `poetry install` to check that your final state matches the configs.
-
-After that add changes to git and commit them:
-`git add pyproject.toml poetry.lock`
-
-Finally add `pre-commit` hooks to git: `pre-commit install`
-
-At this step you are ready to write clean reproducible code!
+For now it's better to go through the files and look at their contents
10 changes: 9 additions & 1 deletion tests/test_version.py
@@ -1,5 +1,13 @@
+from pathlib import Path
+
 import somepytools
+from somepytools.io import read_toml
+
+
+curr_dir = Path(__file__).resolve().parent


 def test_version():
-    assert somepytools.__version__ == "0.1.0"
+    pyproject = read_toml(curr_dir / "../pyproject.toml")
+
+    assert somepytools.__version__ == pyproject["tool"]["poetry"]["version"]
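
The updated test stops hard-coding the version string and compares the package version with the one declared in `pyproject.toml`. The commit does not show how `somepytools.io.read_toml` is implemented, so the following is only a minimal sketch of the same idea using the standard library. `read_toml_sketch` and `poetry_version` are hypothetical names introduced for illustration, and the `pyproject.toml` here is a throwaway file rather than the repository's own.

```python
import tempfile
from pathlib import Path

try:
    import tomllib  # standard library since Python 3.11
except ModuleNotFoundError:
    # Assumption: on older interpreters the `tomli` backport is installed.
    import tomli as tomllib


def read_toml_sketch(path):
    """Hypothetical stand-in for `somepytools.io.read_toml`: parse a TOML file into a dict."""
    with open(path, "rb") as file:  # tomllib only accepts binary file handles
        return tomllib.load(file)


def poetry_version(pyproject_path):
    """Return the version declared under [tool.poetry] in a pyproject.toml."""
    return read_toml_sketch(pyproject_path)["tool"]["poetry"]["version"]


# Demonstrate on a temporary file so the sketch runs outside the repository.
with tempfile.TemporaryDirectory() as tmp:
    pyproject = Path(tmp) / "pyproject.toml"
    pyproject.write_text('[tool.poetry]\nname = "demo"\nversion = "0.1.0"\n')
    demo_version = poetry_version(pyproject)

print(demo_version)
```

Reading the version from a single source means the test keeps passing after a version bump, as long as the package's `__version__` and `pyproject.toml` are updated together.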
