Skip to content

dwslab/pybpmn-parser

Repository files navigation

run_tests workflow PyPI-Server

pybpmn-parser

Starter code for using the hdBPMN dataset for diagram recognition research.

The dump_coco.py script can be used to convert the images and BPMN XMLs into a COCO dataset. COCO is a common format used in computer vision research to annotate the objects and keypoints in images.

python scripts/dump_coco.py path/to/hdBPMN path/to/target/coco/directory/hdbpmn

Moreover, the demo.ipynb Jupyter notebook can be used to visualize (1) the extracted bounding boxes, keypoints, and relations, and (2) the annotated BPMN diagram overlayed over the hand-drawn image. Note that the latter requires the bpmn-to-image tool, which in turn requires a nodejs installation.

Installation

pip install pybpmn-parser

Development

In order to set up the necessary environment:

  1. create an environment pybpmn-parser with the help of conda:
    conda env create -f environment.yml
    
  2. activate the new environment with:
    conda activate pybpmn-parser
    

NOTE: The conda environment will have pybpmn-parser installed in editable mode. Some changes, e.g. in setup.cfg, might require you to run pip install -e . again.

Optional and needed only once after git clone:

  1. install JupyterLab kernel

    python -m ipykernel install --user --name "${CONDA_DEFAULT_ENV}" --display-name "$(python -V) (${CONDA_DEFAULT_ENV})"
    
  2. install several pre-commit git hooks with:

    pre-commit install
    # You might also want to run `pre-commit autoupdate`

    and checkout the configuration under .pre-commit-config.yaml. The -n, --no-verify flag of git commit can be used to deactivate pre-commit hooks temporarily.

Project Organization

├── LICENSE.txt             <- License as chosen on the command-line.
├── README.md               <- The top-level README for developers.
├── data
│   ├── external            <- Data from third party sources.
│   ├── interim             <- Intermediate data that has been transformed.
│   ├── processed           <- The final, canonical data sets for modeling.
│   └── raw                 <- The original, immutable data dump.
├── docs                    <- Directory for Sphinx documentation in rst or md.
├── environment.yml         <- The conda environment file for reproducibility.
├── notebooks               <- Jupyter notebooks. Naming convention is a number (for
│                              ordering), the creator's initials and a description,
│                              e.g. `1.0-fw-initial-data-exploration`.
├── pyproject.toml          <- Build system configuration. Do not change!
├── scripts                 <- Analysis and production scripts which import the
│                              actual Python package, e.g. train_model.py.
├── setup.cfg               <- Declarative configuration of your project.
├── setup.py                <- Use `pip install -e .` to install for development or
│                              or create a distribution with `tox -e build`.
├── src
│   └── pybpmn              <- Actual Python package where the main functionality goes.
├── tests                   <- Unit tests which can be run with `py.test`.
├── .coveragerc             <- Configuration for coverage reports of unit tests.
├── .isort.cfg              <- Configuration for git hook that sorts imports.
└── .pre-commit-config.yaml <- Configuration of pre-commit git hooks.

Note

This project has been set up using PyScaffold 4.0.1 and the dsproject extension 0.6.1.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Packages

No packages published