
Conversation

@ChirayuRai (Contributor) commented Jul 11, 2025

Proposed change

This PR introduces the DeGirum detector, enabling Frigate users to run any model from DeGirum's AI Hub without complex setup. Provide the model name, input width and height, and the pixel format from the model's JSON. Users won't need to locally manage or configure models; the AI Hub manages models and their configurations.

The DeGirum detector matches native performance while giving access to a wide range of accelerators. While Frigate already supports Hailo, TFLite, and Rockchip, DeGirum adds support for hardware not yet integrated into Frigate, such as DEEPX, MemryX, and BrainChip. DeGirum also continuously develops support for other accelerators, such as those from NVIDIA, AMD, NXP, and Axelera AI, with more to come.

Here are some tests with Hailo and OpenVINO showing that performance matches:

  • DeGirum detector using Yolo6n on Hailo8 (screenshots: 2025-04-23 12-36-30, 12-36-40)
  • Hailo detector using Yolo6n on Hailo8 (screenshots: 2025-04-23 14-08-19, 14-08-29)
  • DeGirum detector using ssdlite mobilenet on i3-1115G4 (screenshots: 2025-05-08 11-33-41, 11-33-48)
  • OpenVINO detector using ssdlite mobilenet on i3-1115G4 (screenshots: 2025-05-08 11-18-09, 11-18-24)

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code
  • Documentation Update

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • UI changes including text have used i18n keys and have been added to the en locale.
  • The code has been formatted using Ruff (ruff format frigate)

netlify bot commented Jul 11, 2025

Deploy Preview for frigate-docs canceled.

🔨 Latest commit: ef68645
🔍 Latest deploy log: https://app.netlify.com/projects/frigate-docs/deploys/68ae33feb253f90007b76dce

@NickM-27 (Collaborator)

Hi, this PR seems to have the same concerns as your original PR #17159

namely that there is no guarantee that the detection results returned from the detector correspond to the detection request, which will lead to invalid results

@ChirayuRai (Contributor, Author)

In our generator, we're ensuring that blocking is happening in the queue.get() call, meaning we always wait for a frame to be passed into our detect_raw(). That should ensure that frames are never de-synced. When testing as well, I never saw the queue size exceed 1. I can add a maxsize of 1 to the queue to help make that more clear?
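
For illustration only, here's a minimal standalone sketch (not the detector code itself) of how a blocking, maxsize-1 queue keeps a producer and consumer in lockstep:

import queue
import threading

# Illustrative sketch: a maxsize-1 queue forces the producer to wait until
# the consumer has taken the previous frame, so frames and results stay
# paired one-to-one.
frames = queue.Queue(maxsize=1)

def consumer():
    for _ in range(3):
        frame = frames.get(block=True)  # blocks until a frame is available
        print("processed", frame)

t = threading.Thread(target=consumer)
t.start()
for i in range(3):
    frames.put(i, block=True)  # blocks if the consumer hasn't caught up yet
t.join()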

@NickM-27 (Collaborator)

When testing as well, I never saw the queue size exceed 1.

As far as I can tell you are only testing with one camera, so this behavior is expected. When you test with multiple cameras, this will become a problem.

@ChirayuRai (Contributor, Author)

Alright, I'll go back and test with multiple cameras and report back!

@ChirayuRai (Contributor, Author)

When running tests with multiple cameras, I didn't seem to be able to desync anything, as long as the accelerator I used was able to properly handle the amount of frames thrown at it. If I was running a CPU detector trying to run some yolov11@30fps for multiple cameras, I was skipping frames and they were out of sync. But when I ran everything on a hailo accelerator with 4 cameras @ 15fps, it was doing fine.

Also, is the logic not set up to avoid these kinds of pitfalls from occurring? From what I understand, within the ObjectDetectProcess, the shared memory manager will ensure that we're at least processing the current frame for the camera synchronously? Like we grab the frame with the camera name first, then while we have that frame for the current camera, we just pass it into detect_raw. And in that case, our detect_raw is not returning prematurely. It's only returning once the inference is completed. So then nothing should cause syncing issues? Am I misunderstanding something?
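
For illustration, here's a simplified sketch of the loop as I understand it (names are made up; this is not Frigate's actual code):

# Simplified mental model of the detection loop; all names here are
# illustrative, not Frigate's real implementation.
def detection_loop(detection_queue, frames, detector, results):
    while True:
        camera_name = detection_queue.get()  # blocks until a camera needs detection
        if camera_name is None:              # sentinel to stop the loop
            break
        frame = frames[camera_name]          # current frame for that camera

        # detect_raw() only returns once inference has completed, so the
        # result stored here always corresponds to this frame.
        results[camera_name] = detector.detect_raw(frame)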

@NickM-27 (Collaborator)

And in that case, our detect_raw is not returning prematurely. It's only returning once the inference is completed. So then nothing should cause syncing issues? Am I misunderstanding something?

It depends how the detector works. If the detector itself is async like this one is, there is always the possibility that the frames in the detection queue don't come back in the order of the requests. We had similar issues that needed to be worked around in the memryx detector.
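
To illustrate the ordering concern with a toy example (unrelated to any specific detector): when work is dispatched asynchronously, completion order is not submission order, so results have to be matched back to their requests.

# Toy illustration: async completions come back out of submission order.
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_inference(frame_id):
    time.sleep(random.uniform(0.01, 0.05))  # variable per-frame latency
    return frame_id

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(fake_inference, i) for i in range(8)]
    completion_order = [f.result() for f in as_completed(futures)]

print(completion_order)  # frequently not [0, 1, 2, ..., 7]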

I'm also unsure in general if this is something we want to support, as it is generally advantageous for us to have direct control of the inferences, especially as we start to implement full UI config and will have more ability to optimize object detection for the way that Frigate works.

As an example of this, I'm unsure if we would be able to make the changes to the label map that are made to the included label maps. And if it is possible, it is additional config that doesn't match what we have already implemented.

@ChirayuRai (Contributor, Author)

We revamped and updated the code; it now follows a synchronous flow.

Async or Sync Detector

Let's start at detect_raw.

truncated_input = tensor_input.reshape(tensor_input.shape[1:])
self._queue.put(truncated_input)

We're only doing two things here. First, we truncate the input, removing the batch ("n") dimension from the tensor so we can pass it into our prediction handler. After the tensor has been truncated, it goes into our prediction queue.

detections = np.zeros((20, 6), np.float32)
res = next(self.prediction)

Once the truncated tensor has been placed in our queue, we just create an empty detection array and then ask for the next item from our prediction generator. Calling next() invokes the generator, so control flow remains completely synchronous.

def prediction_generator(self):
    with self.dg_model as model:
        while 1:
            data = self._queue.get(block=True)
            result = model.predict(data)
            yield result

After the generator is invoked, all we're doing is popping from the queue, running inference on the data, and yielding the result. The self._queue.get() call is also blocking, meaning the res = next(self.prediction) call MUST yield something for detect_raw. If the queue is for some reason not populated, this call simply blocks until the current frame times out and is skipped.

The reason we have this generator plus context management of self.dg_model is so that each predict call isn't constantly opening and closing a WebSocket connection to the inference server (the AI server in the Docker Compose setup).

if len(res.results) == 0 or len(res.results[0]) == 0:
    return detections

i = 0
for result in res.results:
    if i >= 20:
        break

    # Map DeGirum's [x_min, y_min, x_max, y_max] pixel bbox into Frigate's
    # normalized [y_min, x_min, y_max, x_max] detection layout.
    detections[i] = [
        result["category_id"],
        float(result["score"]),
        result["bbox"][1] / self.model_height,
        result["bbox"][0] / self.model_width,
        result["bbox"][3] / self.model_height,
        result["bbox"][2] / self.model_width,
    ]
    i += 1

return detections

Once we run through that generator and it yields an inference result, we know we have an InferenceResults object (a DeGirum-specific object). Then we just map this inference result to the proper format for the detections, and the method returns up to 20 of the detections found in the current frame! So as you can see, the approach is completely synchronous. It's similar to Hailo, where we use async-looking code because that's what our code is optimized for, but ultimately we're just using blocking calls to make everything occur synchronously.
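
To make that control flow concrete, here is a stripped-down, self-contained sketch of the same pattern with a dummy model standing in for the DeGirum model (illustration only, not the detector code):

import queue

class DummyModel:
    # Stands in for the DeGirum model; predict() is synchronous here.
    def predict(self, data):
        return f"result for {data}"

class SketchDetector:
    def __init__(self):
        self.dg_model = DummyModel()
        self._queue = queue.Queue()
        self.prediction = self.prediction_generator()

    def prediction_generator(self):
        # The generator body only advances when next() is called, and get()
        # blocks until detect_raw() has queued a frame, so one detect_raw()
        # call maps to exactly one predict() call.
        while True:
            data = self._queue.get(block=True)
            yield self.dg_model.predict(data)

    def detect_raw(self, tensor_input):
        self._queue.put(tensor_input)
        return next(self.prediction)  # blocks until inference completes

detector = SketchDetector()
print(detector.detect_raw("frame-1"))  # -> result for frame-1
print(detector.detect_raw("frame-2"))  # -> result for frame-2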

Label Mapping

We do allow label mappings to be changed! All you need to do is point dg_model._postprocessor._label_dictionary to a valid dict mapping category_id (as a string) to label (as a string). And as long as the inputs for the label map are reachable through each plugin, we'd be more than happy to create some kind of midlayer that converts your label format into one compliant with ours.
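
For example, a minimal sketch (assuming an already-loaded model handle like the self.dg_model used above, and assuming a plain one-label-per-line file just for the example):

# Sketch: build a {"category_id as string": "label"} dict from a plain
# one-label-per-line file and hand it to the DeGirum postprocessor.
def apply_label_map(dg_model, labelmap_path):
    with open(labelmap_path) as f:
        labels = [line.strip() for line in f if line.strip()]
    dg_model._postprocessor._label_dictionary = {
        str(i): name for i, name in enumerate(labels)
    }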

Direct Inference Result Usage

Technically, you can access direct inference results with any DeGirum model. Just set the pre- and post-processor to None after the model has been initialized (model._postprocessor = None), and you'll receive raw inference results. In fact, our preprocessor is already removed, because I noticed you already handle preprocessing before putting the tensor into detect_raw. The reason we still use our postprocessor in detect_raw is that it converts the outputs of all these different accelerators into our standard InferenceResults object, making it possible to have one code base that handles the outputs of all our supported accelerators.
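
As a small sketch of that (again assuming an already-loaded model handle):

# Sketch: drop DeGirum's output postprocessing so predict() hands back
# raw inference results instead of postprocessed InferenceResults.
def use_raw_outputs(dg_model):
    dg_model._postprocessor = None
    return dg_model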

Regarding future development, we're more than happy to continue working with you to implement features! We think there's a big value add in the ease of use that comes with our detector. Users don't have to worry about compiling models that work with their specific hardware, ensuring architectures are correct, figuring out the pixel layout, or small details like that. All of that information is available in the model.json file on the DeGirum AI Hub. And we have tons of compiled models in the public zoo that can be accessed and used for free with no token (as long as you're running inferences on your own local hardware using @local or a local AI server). It's a single sign-up, and then users can just browse the zoo for any model and hardware combination they want. If users find that certain models don't exist in the public zoo, they can also compile their own models using our Cloud Compiler.

@NickM-27 (Collaborator) left a comment

Thanks for the explanation, a few things:

  1. This will be considered a community supported detector, which means if a user creates a support discussion with a problem with this detector, we will ping y'all to provide support as we won't directly provide support for the detector itself.
  2. Are the files added in model necessary? They are currently not added to the docker image so they seem unnecessary and I think they should be removed.

@ChirayuRai (Contributor, Author)

  1. Alright, that's perfect!
  2. No, sorry that slipped through when making modifications. Will be removing that in the next PR, along with the changes for the documentation!

…eaders to be more consistent with the rest of the page, and removed uneeded 'models' folder
@ChirayuRai (Contributor, Author)

Pushed the changes!

@NickM-27 (Collaborator) left a comment

Sorry, one other thing to note here: 0.16 is close to release, so we are not merging any new features. There are two options:

  1. rebase this onto 0.17 branch, and I can change the base branch to 0.17
  2. Leave this pointing to dev and once 0.16 is released we can merge it

@ChirayuRai (Contributor, Author)

Is it easier for yall if we just rebase?

@NickM-27 (Collaborator)

sure

@ChirayuRai (Contributor, Author)

If I go ahead and rebase, I get merge conflicts for every commit, and it doesn't feel productive to go through and fix each conflict commit by commit. So I think it makes more sense to just leave it pointing at dev for now, and then merge after release :)

@NickM-27 (Collaborator)

That is fine too, but just for clarity: you need to rebase -i 0.17 and remove all of the commits from the list except yours. In that case there should be no conflicts.

@ChirayuRai (Contributor, Author)

That's what I did originally to try and rebase, but I kept on getting merge conflicts. So I think it makes more sense to just wait.

…e, updated requirements with degirum_headless
…eaders to be more consistent with the rest of the page, and removed uneeded 'models' folder
…e, updated requirements with degirum_headless
…eaders to be more consistent with the rest of the page, and removed uneeded 'models' folder
@ChirayuRai (Contributor, Author)

Sorry, I seem to have bogged up the commit history while trying to rebase. Would you guys happen to know how I can clean it back up?

@NickM-27 changed the base branch from 0.17 to dev on August 26, 2025 19:49
@NickM-27 (Collaborator)

What you have looks fine now, you just need to clean up the code with Ruff.

@NickM-27 (Collaborator)

Never mind, it looks like classification and audio files were changed when they should not have been. You likely just need to reset those files to the way they are in frigate:dev.

@NickM-27 merged commit 0febc4d into blakeblackshear:dev on Aug 26, 2025
9 checks passed