
Dynamic Vision System

Tom Tzook edited this page Sep 13, 2017 · 13 revisions

The Dynamic Vision System provides easy-to-use, automatic image processing that requires minimal code to process an image. Examples for the vision system are found here: Example->vision->filters.

Image processing is done by running filters over an image to remove unwanted data. The remaining data is usually the wanted data, which is then used in different ways. The Dynamic Vision System works in a similar way, but spares the need to manually create filters, run them on an image and read the result.

(Diagram: Vision System)

The Vision Process

The vision process is done by:

  • Running defined filters over an image
  • Attempting to extract an analysis of objects from the image

This process is done by a VisionProcessing object. This object contains a collection of filters in the form of VisionFilter objects, and one AnalysisCreator object, which is responsible for returning a post-vision analysis of the image in the form of an Analysis object.

(Diagram: the vision process)

It is possible to load and save data of a VisionProcessing to XML files or byte streams, allowing transfer and easy loading of processing parameters.

Since there are many libraries providing algorithms and logic for performing vision processing, FlashLib allows users to use any library they want. To use a library, it is necessary to implement the VisionSource interface, which provides abstract logic for performing vision algorithms. That object is then passed to a VisionProcessing object, which uses it when processing the vision filters.

FlashLib provides a built-in OpenCV implementation.

VisionFilter

The VisionFilter class provides a base for vision filters used in the vision system. It contains one method to be implemented: process. This method receives a VisionSource object and is responsible for executing the filter logic.

When relying on XML or byte stream processing creation, it is possible to provide each filter with parameters, allowing for more complex control of its operation. Parameters are defined as properties from FlashLib's beans package. There are 4 possible property types:

  • IntegerProperty
  • StringProperty
  • DoubleProperty
  • BooleanProperty

To define a parameter for your filter, simply create a property to hold its data in your class and create a method which returns the property object. The method must be named as follows: "parameternameProperty". For example, an "amount" parameter with an integer value will have the following method signature:

public IntegerProperty amountProperty()

When loading the filter through XML or byte streams, the parameters assigned in the data are searched for as methods by their name. If the property method is found and the parameter data type matches the property type, the data is set. A similar operation occurs when saving: all methods whose names end with Property are called, and their property values are retrieved and saved.
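The lookup described above can be sketched with plain reflection. Note that `IntegerProperty` below is a minimal stand-in for FlashLib's beans class, written only so the sketch is self-contained; the real loader's internals may differ:

```java
import java.lang.reflect.Method;

// Minimal stand-in for FlashLib's IntegerProperty bean (illustration only).
class IntegerProperty {
    private int value;
    public int get() { return value; }
    public void set(int value) { this.value = value; }
}

// A filter exposing an "amount" parameter via the naming convention.
class LargestFilterSketch {
    private final IntegerProperty amount = new IntegerProperty();

    // Must be named "<parameter name>Property" for the loader to find it.
    public IntegerProperty amountProperty() { return amount; }
}

class ParameterLoader {
    // Mimics the loader: find the "<name>Property" method by reflection and set its value.
    static void setIntParameter(Object filter, String name, int value) {
        try {
            Method m = filter.getClass().getMethod(name + "Property");
            IntegerProperty prop = (IntegerProperty) m.invoke(filter);
            prop.set(value);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Calling `ParameterLoader.setIntParameter(filter, "amount", 10)` resolves `amountProperty()` and sets the value, mirroring how a parameter named "amount" in the XML data would reach the filter.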

In addition, filters should have at least two constructors:

  • An empty constructor, which will be used by the automatic creation process for XML and byte streams
  • A full constructor, which will be used manually by developers

To allow for easy definition of filters for serialization, it is possible to use a FilterCreator object. Such objects convert between a filter class and a string name representing that class, meaning that when defining a filter in XML it is possible to name it color instead of edu.flash3388.flashlib.vision.ColorFilter.

Existing built-in implementations are:

  • CircleFilter: filters out non-circle contours
  • ClosestToCenterFilter: filters out the contours furthest from the center of the frame and leaves a maximum desired amount
  • ClosestToLeftFilter: filters out the contours furthest from the left side of the frame and leaves a maximum desired amount
  • ClosestToRightFilter: filters out the contours furthest from the right side of the frame and leaves a maximum desired amount
  • CoordinateFilter: filters out the contours furthest from a given coordinate on the frame and leaves a maximum desired amount
  • ColorFilter: filters out contours which are outside a given color range. Works with RGB or HSV (make sure to convert the image format first).
  • GrayFilter: filters out contours which are outside a given color range. Works with grayscale (make sure to convert the image format first).
  • HighestFilter: filters out the lowest contours on the frame and leaves a maximum desired amount
  • LargestFilter: filters out the smallest contours on the frame and leaves a maximum desired amount
  • LowestFilter: filters out the highest contours on the frame and leaves a maximum desired amount
  • ShapeFilter: filters out contours which do not match the wanted shape. Approximates contours' shapes
  • RatioFilter: leaves only the 2 contours which best match given size and position ratios between them

XML example of LargestFilter with a FilterCreator providing a name:

<filter name="largest">
   <param name="amount" type="int">10</param>
</filter>
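To see how such a snippet breaks down, it can be parsed with the JDK's built-in DOM parser. The element and attribute names are taken directly from the example above; nothing here is FlashLib code:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class FilterXmlSketch {
    // Parses a <filter> element and returns "name=value" for its first <param>.
    static String readFirstParam(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element filter = doc.getDocumentElement();
            Element param = (Element) filter.getElementsByTagName("param").item(0);
            return param.getAttribute("name") + "=" + param.getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

For the snippet above, this yields `amount=10`: the filter's short name sits in the `name` attribute, while each parameter carries its own name, a type hint, and the value as element text.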

FilterCreator

A simple class for convenience, responsible for converting VisionFilter classes to String names for serialization and String names back to classes for deserialization. Only one instance of this class can be used throughout FlashLib at any given time. If the creator object cannot provide a conversion for a class, the fully qualified class name (class name and package) is used instead.

DefaultFilterCreator is the default implementation for this class and accommodates built-in filters. It is possible to manually add filters to it since it uses a map.
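The two-way conversion a FilterCreator performs can be sketched as a pair of map lookups, with the fully qualified class name as the fallback when no short name is registered. The class names below are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the FilterCreator idea: short names <-> class names, with a fallback.
class FilterCreatorSketch {
    private final Map<String, String> nameToClass = new HashMap<>();
    private final Map<String, String> classToName = new HashMap<>();

    void register(String name, String className) {
        nameToClass.put(name, className);
        classToName.put(className, name);
    }

    // Serialization direction: class -> short name, falling back to the class name itself.
    String nameFor(String className) {
        return classToName.getOrDefault(className, className);
    }

    // Deserialization direction: short name -> class, falling back to treating it as a class name.
    String classFor(String name) {
        return nameToClass.getOrDefault(name, name);
    }
}
```

Registering `"largest"` against the LargestFilter class is what lets the XML above say `name="largest"` instead of spelling out the full package path.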

Analysis

Analysis is a simple class containing a map of values. It is used to store data from a vision process for later use by users. It can contain values of the types int, double, boolean and String. This class can be serialized and deserialized from byte streams. Since a map is used, each value is identified by a string name.

To allow easy compatibility with built-in FlashLib algorithms, there are several default values which should be considered:

  • Center x: X coordinate of the center of the desired object to be located
  • Center y: Y coordinate of the center of the desired object to be located
  • Horizontal distance: The x-axis offset in pixels of the desired object from the center of the frame, including direction: positive - right, negative - left.
  • Vertical distance: The y-axis offset in pixels of the desired object from the center of the frame, including direction: positive - down, negative - up.
  • Target distance: Distance from the camera to the object in real life.
  • Angle offset: Horizontal angle offset between the object and the center of the frame.

Those properties are not mandatory, but they are recommended. Note that some FlashLib vision motion algorithms depend on them, so make sure to set them when those algorithms are used. To set the values, use the constant String variables in the class as the property names and treat the values as double.
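For example, the horizontal pixel offset can be derived from the object's center and the frame width. The map here is a plain `HashMap` stand-in for Analysis, and the key names are illustrative, not FlashLib's actual String constants:

```java
import java.util.HashMap;
import java.util.Map;

class AnalysisSketch {
    // Illustrative key names; FlashLib defines its own String constants for these.
    static final String CENTER_X = "center-x";
    static final String HOR_DISTANCE = "hor-distance";

    // Positive offset means the object is right of the frame center, negative means left.
    static Map<String, Double> analyze(double centerX, int frameWidth) {
        Map<String, Double> values = new HashMap<>();
        values.put(CENTER_X, centerX);
        values.put(HOR_DISTANCE, centerX - frameWidth / 2.0);
        return values;
    }
}
```

With a 640-pixel-wide frame and an object centered at x = 400, the horizontal distance comes out as +80, telling a motion algorithm to turn right.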

AnalysisCreator

To allow users to manually save post-vision results, we introduced the AnalysisCreator object: an interface with one method, createAnalysis. This object is used by VisionProcessing after all the filters have executed to retrieve an Analysis object for the user.

createAnalysis receives a VisionSource and should return an Analysis object or null.
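A custom creator is therefore a small class implementing that one method. In this sketch, VisionSource and Analysis are reduced to minimal stand-ins so the example is self-contained; the real interfaces expose far more:

```java
// Minimal stand-in for VisionSource (illustration only).
interface VisionSourceSketch {
    int contourCount();
}

// Minimal stand-in for Analysis (illustration only).
class AnalysisStub {
    final double centerX;
    AnalysisStub(double centerX) { this.centerX = centerX; }
}

// Mirrors AnalysisCreator: one method, returning an analysis or null.
interface AnalysisCreatorSketch {
    AnalysisStub createAnalysis(VisionSourceSketch source);
}

class NullWhenEmptyCreator implements AnalysisCreatorSketch {
    @Override
    public AnalysisStub createAnalysis(VisionSourceSketch source) {
        // Returning null signals that the filters left nothing usable behind.
        return source.contourCount() == 0 ? null : new AnalysisStub(320.0);
    }
}
```

Returning null is the creator's way of saying "no valid result this frame", which callers of the vision process must be prepared to handle.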

It is possible to serialize and deserialize AnalysisCreator objects from XML files and byte streams, meaning that, like filters, they can be saved and transferred. AnalysisCreators also feature a parameter option, like vision filters, which works exactly the same way. Because of that, please follow the same rules put in place for creating a custom VisionFilter object, to make sure the serialization functions correctly.

Unlike VisionFilter, this class does not have a conversion to a String name for serialization and deserialization. Because of that, the class's fully qualified name (name and package) will be used.

Existing built-in implementations are:

  • GroupedAnalysisCreator: treats all remaining contours as one and gets the default Analysis data for that.
  • TemplateAnalysisCreator: uses template matching to find the wanted objects and gets the default Analysis data.

VisionProcessing

As stated before, the VisionProcessing object is responsible for storing the processing data and performing the vision process. It can hold a collection of VisionFilter and one AnalysisCreator. All of which can be serialized and deserialized through XML and byte streams.

To perform the vision process, simply call processAndGet and pass it a VisionSource object. This method returns an Analysis object if one was created, null otherwise. If no AnalysisCreator object was set on the VisionProcessing object, the Analysis object will be retrieved from the VisionSource by calling getResult. This, however, returns an object whose implementation depends on the VisionSource implementation, so this approach is not recommended.
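The control flow just described can be sketched end to end: run every filter in order, then ask the creator for a result. The types below are simplified stand-ins (generic functions instead of VisionFilter and AnalysisCreator), not the FlashLib classes themselves:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Simplified stand-in for VisionProcessing: filters run first, then the creator.
class VisionProcessingSketch<S, A> {
    private final List<Consumer<S>> filters = new ArrayList<>();
    private Function<S, A> analysisCreator;

    void addFilter(Consumer<S> filter) { filters.add(filter); }
    void setAnalysisCreator(Function<S, A> creator) { analysisCreator = creator; }

    // Mirrors processAndGet: apply every filter in order, then build the analysis (or null).
    A processAndGet(S source) {
        for (Consumer<S> filter : filters) {
            filter.accept(source);
        }
        return analysisCreator == null ? null : analysisCreator.apply(source);
    }
}
```

The key point the sketch preserves is ordering: filters mutate the source one after another, and only then does the creator see the filtered result.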

XML example for VisionProcessing using the DefaultFilterCreator, a ColorFilter, a LargestFilter and a GroupedAnalysisCreator:

<vision name="test">
  <filter name="color">
     <param name="hsv" type="boolean">true</param>
     <param name="min1" type="int">0</param>
     <param name="max1" type="int">180</param>
     <param name="min2" type="int">0</param>
     <param name="max2" type="int">255</param>
     <param name="min3" type="int">230</param>
     <param name="max3" type="int">255</param>
  </filter>
  <filter name="largest">
     <param name="amount" type="int">5</param>
  </filter>
  <analysis-creator name="edu.flash3388.flashlib.vision.GroupedAnalysisCreator">
     <param name="targetWidth" type="double">20.0</param>
     <param name="targetHeight" type="double">10.0</param>
     <param name="camFov" type="double">32.0</param>
     <param name="distanceHeight" type="boolean">false</param>
  </analysis-creator>
</vision>

VisionSource

VisionSource is an interface which allows the vision logic to be abstracted and implemented differently. Basically, its only purpose is to allow user flexibility when wanted.
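The abstraction can be illustrated with a trimmed-down source interface and an in-memory implementation. The real VisionSource exposes many more operations, so the single method here is purely illustrative:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Trimmed-down illustration of the VisionSource idea: filters call abstract
// operations, and each backing library supplies its own implementation.
interface ColorFilteringSource {
    // Keep only values within the given range (semantics heavily simplified).
    void filterColorRange(int min, int max);
    int remaining();
}

// An in-memory implementation standing in for a real backend such as CvSource.
class ListBackedSource implements ColorFilteringSource {
    private final List<Integer> values;

    ListBackedSource(List<Integer> values) {
        this.values = new ArrayList<>(values);
    }

    @Override
    public void filterColorRange(int min, int max) {
        values.removeIf(v -> v < min || v > max);
    }

    @Override
    public int remaining() { return values.size(); }
}
```

Because filters only ever talk to the interface, an OpenCV-backed source and this list-backed one are interchangeable, which is exactly what lets FlashLib stay library-agnostic.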

Existing built-in implementations are:

  • CvSource: using the OpenCV library

Running and Controlling Vision

Although this can be done manually, FlashLib provides logic for easily controlling and running vision processing, including remote control.

A vision control object is represented by the Vision interface. The interface provides abstract logic for retrieving Analysis objects and controlling the execution of a vision process, and accounts for the existence of multiple VisionProcessing objects.

The Vision interface considers two types of Analysis objects:

  • New: an Analysis object is considered new if the time since it was received from the vision process has not exceeded a timeout which can be changed by the user.
  • Normal: corresponds to any Analysis object.

The idea of a "new" Analysis object is to allow us to make sure the data is up to date. If too much time has passed, the analysis data most likely no longer reflects current real-time events. This concept is crucial when working with robots.
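The freshness check boils down to a timestamp comparison. The method names and the idea of passing the current time explicitly are assumptions made for this sketch; the real Vision implementations track time internally:

```java
// Sketch of the "new analysis" check: data is new while the timeout has not elapsed.
class AnalysisFreshness {
    private final long timeoutMillis;
    private long receivedAtMillis = -1; // -1 means no analysis was ever received

    AnalysisFreshness(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void onAnalysisReceived(long nowMillis) { receivedAtMillis = nowMillis; }

    // "New" means an analysis exists and was received within the timeout window.
    boolean hasNewAnalysis(long nowMillis) {
        return receivedAtMillis >= 0 && (nowMillis - receivedAtMillis) <= timeoutMillis;
    }
}
```

With a 1-second timeout, an analysis received at t = 5000 ms is still "new" at t = 5500 ms but no longer at t = 7000 ms, at which point a robot should stop steering by it.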

There are 2 implementations of Vision:

  • VisionRunner: for locally running vision processing
  • RemoteVision: for remotely controlling a vision process

VisionRunner

VisionRunner is an abstract class, providing implementation for most of Vision. It performs the vision process locally and holds a collection of VisionProcessing objects with the ability to switch between them.

This class is abstract to allow multiple ways of running the vision process locally. There are 2 implementations:

  • SimpleVisionRunner: for synchronous vision process running
  • ThreadedVisionRunner: for asynchronous vision process running

This class also extends Sendable from the communication system. This allows users to control this class remotely. However, the remote control is already implemented through RemoteVision, so manual implementation is unnecessary.

RemoteVision

RemoteVision allows users to control a VisionRunner object remotely using the FlashLib communications system. This class, too, implements Vision, but provides different logic to allow for remote control. It is possible to attach VisionProcessing objects locally; they will be sent to the remote VisionRunner sendable.