Deep computer-vision algorithms for Processing.
The idea behind this library is to provide a simple way to run machine learning algorithms (inference only) for computer vision tasks inside Processing. Portability and ease of use are the primary goals of this library. Because of that (and JavaCV), there is no GPU support at the moment.
Caution: The API is still in development and can change at any time.
Lightweight OpenPose Example
Download the latest prebuilt version from the releases section and install it into your Processing library folder.
Because the library is still under development, it is not yet published in the Processing contribution manager.
The base of the library is the `DeepVision` class. It is used to download the pre-trained models and to create new networks.
```java
import ch.bildspur.vision.*;
import ch.bildspur.vision.network.*;
import ch.bildspur.vision.result.*;

DeepVision vision = new DeepVision(this);
```
Usually it makes sense to declare the network globally for your sketch and create it in setup(). The create method downloads the pre-trained weights if they do not already exist. The network first has to be created and then set up.
```java
YOLONetwork network;

void setup() {
  // create the network & download the pre-trained models
  network = vision.createYOLOv3();

  // load the model
  network.setup();

  // set network settings (optional)
  network.setConfidenceThreshold(0.2f);
  ...
}
```
By default, the weights are stored in the library folder of Processing. If you want to download them to the sketch folder, use the following command:
```java
// download to library folder
vision.storeNetworksGlobal();

// download to sketch/networks
vision.storeNetworksInSketch();
```
Each network has a run() method, which takes an image as a parameter and returns a result. You can pass in any PImage and the library starts processing it.
```java
PImage myImg = loadImage("hello.jpg");
ResultList<ObjectDetectionResult> detections = network.run(myImg);
```
Please have a look at the specific networks for further information or at the examples.
Here is a list of the implemented networks:
- Object Detection ✨
- YOLOv3-tiny
- YOLOv3-tiny-prn
- EfficientNetB0-YOLOv3
- YOLOv3 OpenImages Dataset
- YOLOv3-spp (spatial pyramid pooling)
- YOLOv3
- YOLOv4
- YOLOv4-tiny
- SSDMobileNetV2
- Handtracking based on SSDMobileNetV2
- TextBoxes
- Ultra-Light-Fast-Generic-Face-Detector-1MB RFB (~30 FPS on CPU)
- Ultra-Light-Fast-Generic-Face-Detector-1MB Slim (~40 FPS on CPU)
- Cascade Classifier
- Object Segmentation
- Mask R-CNN
- Object Recognition 🚙
- Tesseract LSTM
- Keypoint Detection 🤾🏻♀️
- Facial Landmark Detection
- Single Human Pose Detection based on lightweight openpose
- Classification 🐈
- MNIST CNN
- FER+ Emotion
- Age Net
- Gender Net
- Depth Estimation 🕶
- MidasNet
- Image Processing
- Style Transfer
- Multiple Networks for x2 x3 x4 Superresolution
The following list shows the networks that are on the list to be implemented (⚡️ already in progress):
- YOLO 9K (not supported by OpenCV)
- Multi Human Pose Detection ⚡️ (currently struggling with the partial affinity fields 🤷🏻♂️ help?)
- TextBoxes++ ⚡️
- CRNN ⚡️
- PixelLink
Locating one or multiple predefined objects in an image is the task of the object detection networks.
YOLO Example
The result of these networks is usually a list of `ObjectDetectionResult`.
```java
ObjectDetectionNetwork net = vision.createYOLOv3();
net.setup();

// detect new objects
ResultList<ObjectDetectionResult> detections = net.run(image);

for (ObjectDetectionResult detection : detections) {
  println(detection.getClassName() + "\t[" + detection.getConfidence() + "]");
}
```
Every object detection result contains the following fields:
- `getClassId()` - id of the class the object belongs to
- `getClassName()` - name of the class the object belongs to
- `getConfidence()` - how confident the network is about this detection
- `getX()` - x position of the bounding box
- `getY()` - y position of the bounding box
- `getWidth()` - width of the bounding box
- `getHeight()` - height of the bounding box
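Taken together, these getters are all you need to draw the detections onto the canvas. A minimal sketch of a draw loop, assuming `detections` was obtained from run() as in the example above:

```java
for (ObjectDetectionResult detection : detections) {
  // draw the bounding box
  noFill();
  stroke(255, 0, 0);
  rect(detection.getX(), detection.getY(), detection.getWidth(), detection.getHeight());

  // label it with class name and confidence
  fill(255, 0, 0);
  text(detection.getClassName() + " " + nf(detection.getConfidence(), 0, 2),
       detection.getX(), detection.getY() - 5);
}
```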
YOLOv3 [Paper]
YOLOv3 is the third version of the very fast and accurate single-shot detection network. The pre-trained model is trained on the 80-class COCO dataset. There are several different weights & models available in the repository:
- YOLOv3-tiny (very fast, but trading performance for accuracy)
- YOLOv3-spp (original model using spatial pyramid pooling)
- YOLOv3 (608)
- YOLOv4 (608) (most accurate network)
- YOLOv4-tiny (416)
```java
// setup the network (choose one)
YOLONetwork net = vision.createYOLOv4();
YOLONetwork net = vision.createYOLOv4Tiny();
YOLONetwork net = vision.createYOLOv3();
YOLONetwork net = vision.createYOLOv3SPP();
YOLONetwork net = vision.createYOLOv3Tiny();

// set confidence threshold
net.setConfidenceThreshold(0.2f);
```
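The pieces above can be tied together into a live-camera sketch. A sketch under the assumption that the Processing video library is installed (any PImage source works the same way, since Capture is a PImage):

```java
import processing.video.*;
import ch.bildspur.vision.*;
import ch.bildspur.vision.result.*;

DeepVision vision = new DeepVision(this);
YOLONetwork network;
Capture cam;

void setup() {
  size(640, 480);

  // create, download and load the network
  network = vision.createYOLOv3();
  network.setup();
  network.setConfidenceThreshold(0.2f);

  // start the webcam
  cam = new Capture(this, width, height);
  cam.start();
}

void draw() {
  if (cam.available()) {
    cam.read();
  }
  image(cam, 0, 0);

  // run inference on the current frame and draw the results
  ResultList<ObjectDetectionResult> detections = network.run(cam);
  noFill();
  stroke(0, 255, 0);
  for (ObjectDetectionResult detection : detections) {
    rect(detection.getX(), detection.getY(), detection.getWidth(), detection.getHeight());
  }
}
```

Running the network every frame is CPU-intensive; for smoother sketches it can help to run inference only every few frames and reuse the last result in between.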
SSDMobileNetV2 [Paper]
This network is a single-shot detector based on the MobileNetV2 architecture. It is pre-trained on the 90-class COCO dataset and is really fast.
```java
SSDMobileNetwork net = vision.createMobileNetV2();
```
Handtracking [Project]
This is a pre-trained SSD MobilenetV2 network to detect hands.
```java
SSDMobileNetwork net = vision.createHandDetector();
```
TextBoxes [Paper]
TextBoxes is a detector for scene text in the wild, based on SSD MobileNet. It is able to detect text in a scene and return its location.
```java
TextBoxesNetwork net = vision.createTextBoxesDetector();
```
Ultra-Light-Fast-Generic-Face-Detector [Project]
The ULFG face detector is a very fast CNN-based face detector which reaches up to 40 FPS on a MacBook Pro. It comes with four different pre-trained weights:
- RFB640 & RFB320 - More accurate but slower detector
- Slim640 & Slim320 - Less accurate but faster detector
```java
// choose one of the four variants
ULFGFaceDetectionNetwork net = vision.createULFGFaceDetectorRFB640();
ULFGFaceDetectionNetwork net = vision.createULFGFaceDetectorRFB320();
ULFGFaceDetectionNetwork net = vision.createULFGFaceDetectorSlim640();
ULFGFaceDetectionNetwork net = vision.createULFGFaceDetectorSlim320();
```
Note that the detector detects only the frontal part of the face, not the complete head. Most algorithms that run on face detection results expect exactly such a rectangular detection shape.
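If a follow-up network (for example an emotion or age classifier) expects a cropped face as input, the rectangular detection can be cut out with PImage's get() method. A minimal sketch, assuming the detections were returned by run() as in the object detection example:

```java
ResultList<ObjectDetectionResult> faces = net.run(myImg);

for (ObjectDetectionResult face : faces) {
  // cut out the detected face rectangle
  PImage crop = myImg.get(face.getX(), face.getY(), face.getWidth(), face.getHeight());

  // crop can now be passed on to another network
}
```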
Cascade Classifier [Paper]
The cascade classifier detector is based on boosting and is commonly used as a pre-processor for many classifiers.
```java
CascadeClassifierNetwork net = vision.createCascadeFrontalFace();
```
tbd
tbd
tbd
Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer
MidasNet
tbd
- Install JDK 8 (because of Processing)
Download the Maven snapshots for JavaCV (this step is no longer necessary):

```bash
mvn -U compile
```
Run Gradle to build a fat jar:

```bash
# windows
gradlew.bat fatjar

# mac / unix
./gradlew fatjar
```
Create a new release:

```bash
./release.sh version
```
Why is the xy network not implemented?
Please open an issue if you have a cool network that could be implemented or just contribute a PR.
Why is it not possible to train my own network?
The idea is to give artists and makers a simple tool to run networks inside Processing. Training a network requires a lot of specific knowledge about neural networks (CNNs in particular).
Of course it is possible to train your own YOLO or SSDMobileNet network and use the weights with this library.
Is it compatible with Greg Borenstein's OpenCV for Processing?
No, OpenCV for Processing uses the direct OpenCV Java bindings instead of JavaCV. Please include only one of the two libraries, because Processing gets confused if two OpenCV packages are imported.
Maintained by cansik with the help of the following dependencies:
Stock images from the following people have been used:
- yoga.jpg by Yogendra Singh from Pexels
- office.jpg by fauxels from Pexels
- faces.png by shvetsa from Pexels
- hand.jpg by Thought Catalog on Unsplash
- sport.jpg by John Torcasio on Unsplash
- sticker.jpg by 🇨🇭 Claudio Schwarz | @purzlbaum on Unsplash
- children.jpg by Sandeep Kr Yadav