Introduction
I am opening this proof-of-concept PR to facilitate discussion around the current state of depthai and, more specifically, how we handle the many tasks related to neural network models.
I have had this idea in mind for quite a while now but only got around to writing up a proof of concept last week.
Start with the what
There are a couple of nodes in depthai-core that directly interact with neural network models, e.g. NeuralNetwork, DetectionNetwork, SpatialDetectionNetwork and a few others.
Don't get me wrong, I think the way they work is great: there are easy ways to load models into memory, set the configuration parameters, or do both at the same time thanks to the NNArchive class, which packages both of these in a single zip file.
However, despite all the different functionalities and ways of working with models, I propose we introduce a new, unified approach to working with neural networks.
I have had the opportunity to work with all these nodes and to implement some of their methods myself. While implementing some of their functionalities, I came to the conclusion that each of the aforementioned nodes is doing more work than it really should.
Take setModelPath as an example. I do not think a neural network node should be responsible for loading models into memory, let alone interacting with the filesystem.
In my view, there should be just one method, setModel, whose single argument is some kind of model wrapper around both the underlying neural network model and its pertinent settings config.
Loading models into memory should be the job of a different module, one that abstracts away anything model-type specific (i.e. Blob vs. Dlc) and ensures proper filesystem interaction and model-specific config initialization.
What’s more, there have been some ideas for the future where we would be doing config optimization (e.g. determining the ideal number of shaves for an OpenVINO model), and I would really love to see another standalone module responsible for just that, rather than yet another copy-pasted method in each of the neural network nodes, each doing pretty much the same thing.
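To illustrate the single-method idea, here is a minimal sketch of a node that only accepts an already loaded model. It is not code from this PR or from depthai-core: `ModelSettings`, `Model` and `NeuralNetworkNode` are hypothetical names chosen for illustration, and the `numShaves` field is only an assumed example of a setting.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical settings that travel together with a model.
struct ModelSettings {
    std::string name;
    int numShaves = 0;  // assumed example of a model-specific setting
};

// Hypothetical wrapper: the raw model bytes plus their settings, already loaded.
struct Model {
    std::vector<std::uint8_t> data;
    ModelSettings settings;
};

// Hypothetical node: it stores what it is given and never touches the filesystem.
class NeuralNetworkNode {
   public:
    void setModel(Model model) {
        model_ = std::move(model);
    }

    const Model& model() const {
        return model_;
    }

   private:
    Model model_;
};
```

With this shape, loading from disk and tuning the settings would live in separate modules, and the node's public surface for models shrinks to one call.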
Finally, when it comes to sending the model onto a device: in the case of the Blob model, we send the blob’s bytes directly to the device, while in the case of a Dlc model, we send the bytes as well as the model’s path, store the bytes at that path on the camera and finally load that stored file into memory again (correct me if I am wrong here). Why not unify the way we handle model loading onto a device and create a module responsible for serialization as well as deserialization of each model type?
In spite of this going against the common programmers’ precept “If it ain’t broke, don’t fix it”, I think we should improve on this. What follows is an attempt, an idea, a mind dump of how I think we could better structure one of the many parts of depthai-core and how we work with neural network models.
Finish with the how
One way we could abstract the type of a network away is by introducing a variant, a C++17 feature, encapsulating all the different model types we would like to support. Let me give an example of what a header could look like. Notice the std::variant at the end.
The code snippet above is what I think could be part of the Models.hpp header file. What follows is the ModelLoader.hpp header.
With that, provided that all the neural network nodes handle each of the model variants explicitly, one can load and set a model very easily.
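As a rough illustration of the variant-based approach described above, the following sketch shows how model types, a loader step and a consumer could fit together. It is not the author's Models.hpp or ModelLoader.hpp: `BlobModel`, `DlcModel`, `makeModel`, `Describe` and `describe` are hypothetical names, the file-extension dispatch is an assumption, and reading bytes from disk is left out to keep the example short.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical model types; real ones would carry more than raw bytes.
struct BlobModel {
    std::vector<std::uint8_t> data;
};
struct DlcModel {
    std::vector<std::uint8_t> data;
};

// The variant lists every supported model type in one place.
using ModelVariant = std::variant<BlobModel, DlcModel>;

// Hypothetical loader step: pick the model type from a file extension.
inline ModelVariant makeModel(const std::string& extension, std::vector<std::uint8_t> bytes) {
    if(extension == ".blob") return BlobModel{std::move(bytes)};
    if(extension == ".dlc") return DlcModel{std::move(bytes)};
    throw std::invalid_argument("Unsupported model type: " + extension);
}

// A visitor with one overload per alternative. If a new type is added to
// ModelVariant without a matching overload, std::visit fails to compile.
struct Describe {
    std::string operator()(const BlobModel& m) const {
        return "blob, " + std::to_string(m.data.size()) + " bytes";
    }
    std::string operator()(const DlcModel& m) const {
        return "dlc, " + std::to_string(m.data.size()) + " bytes";
    }
};

inline std::string describe(const ModelVariant& model) {
    return std::visit(Describe{}, model);
}
```

A node's setModel could accept a `ModelVariant` and dispatch on it with `std::visit` in the same way, which is what handling each variant explicitly would amount to in practice.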
A model zoo can be built on top of the model loader, again returning a ModelVariant, all declared in ModelZoo.hpp.
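One possible shape for such a zoo, sketched under assumptions: the zoo only maps model names to locations and delegates the actual loading to whatever loader it is given. The class layout, `registerModel`, `get` and the `Loader` alias are all hypothetical and are not taken from ModelZoo.hpp in this PR.

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical model types, repeated here so the sketch is self-contained.
struct BlobModel {
    std::vector<std::uint8_t> data;
};
struct DlcModel {
    std::vector<std::uint8_t> data;
};
using ModelVariant = std::variant<BlobModel, DlcModel>;

// Any callable that turns a path into a loaded model can act as the loader.
using Loader = std::function<ModelVariant(const std::string& path)>;

// Hypothetical zoo: resolves a model name to a path and forwards to the loader.
class ModelZoo {
   public:
    explicit ModelZoo(Loader loader) : loader_(std::move(loader)) {}

    void registerModel(const std::string& name, const std::string& path) {
        paths_[name] = path;
    }

    ModelVariant get(const std::string& name) const {
        const auto it = paths_.find(name);
        if(it == paths_.end()) throw std::out_of_range("Unknown model: " + name);
        return loader_(it->second);
    }

   private:
    Loader loader_;
    std::map<std::string, std::string> paths_;
};
```

Injecting the loader keeps the zoo free of filesystem or download logic, so it can be exercised with a fake loader in tests.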
There are more modules, such as ModelSerializer with two methods, serialize and deserialize. There’s also the ModelOptimizer with the optimize method, which optimizes the parameters most critical for performance.
I think the brief explanation above, together with the code snippets, should convey what I am trying to propose here.
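To make the serializer idea a little more concrete, here is a minimal sketch of a symmetric serialize/deserialize pair over a model variant. The wire format (one tag byte holding the variant index, followed by the raw payload) is purely an assumption for illustration, as are the free-function signatures; it does not describe how depthai currently transfers models to a device.

```cpp
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical model types, repeated here so the sketch is self-contained.
struct BlobModel {
    std::vector<std::uint8_t> data;
};
struct DlcModel {
    std::vector<std::uint8_t> data;
};
using ModelVariant = std::variant<BlobModel, DlcModel>;

// Assumed format: first byte is the variant index, the rest is the payload.
inline std::vector<std::uint8_t> serialize(const ModelVariant& model) {
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(model.index()));
    std::visit([&out](const auto& m) { out.insert(out.end(), m.data.begin(), m.data.end()); }, model);
    return out;
}

// Reverses serialize(): reads the tag byte and rebuilds the matching type.
inline ModelVariant deserialize(const std::vector<std::uint8_t>& bytes) {
    if(bytes.empty()) throw std::invalid_argument("Empty buffer");
    std::vector<std::uint8_t> payload(bytes.begin() + 1, bytes.end());
    switch(bytes.front()) {
        case 0:
            return BlobModel{std::move(payload)};
        case 1:
            return DlcModel{std::move(payload)};
        default:
            throw std::invalid_argument("Unknown model tag");
    }
}
```

Because every model type goes through the same pair of functions, the host and the device would share one code path regardless of whether the model is a Blob or a Dlc.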
To make this not just an idea proposal but something I or anyone else can play with, I’ve made some changes to our codebase itself.
By no means do I mean to say that my proposed changes are better than what we have right now. It is simply my idea of how I would structure the codebase with the benefit of hindsight and from my personal experience working with depthai.
If anything, I hope this short write-up will make us contemplate what the user and developer experience has been so far and how we could improve upon it.
Any feedback, comments or other suggestions would be much appreciated.