# Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild
This repository extends the Omni3D model by integrating a custom object tracking mechanism, enhancing 3D detection with continuous tracking across frames.
- Overview
- Installation
- Running the Demo
- Training
- Inference
- Tracker Implementation
- Chat with Phi 3 Vision
- Citing Omni3D
- License
- Contributing
## Overview

Omni3D, originally developed by Garrick Brazil et al., is a state-of-the-art model for 3D object detection. This project adds a custom tracking mechanism on top of its detections, enabling real-time object tracking across a variety of environments.

For more details on the Omni3D project, refer to the original repository.
## Installation

Follow the steps below to set up the environment:

```bash
# Create and activate a new conda environment
conda create -n cubercnn python=3.8
conda activate cubercnn

# Install main dependencies
conda install -c fvcore -c iopath -c conda-forge -c pytorch3d -c pytorch fvcore iopath pytorch3d pytorch=1.8 torchvision=0.9.1 cudatoolkit=10.1

# Install additional dependencies
pip install cython opencv-python
pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.8/index.html
conda install -c conda-forge scipy seaborn
```
## Running the Demo

```bash
# Download sample images
sh demo/download_demo_COCO_images.sh

# Run the detection demo
python demo/demo_detection.py \
  --config-file cubercnn://omni3d/cubercnn_DLA34_FPN.yaml \
  --input-folder "datasets/coco_examples" \
  --threshold 0.25 --display \
  MODEL.WEIGHTS cubercnn://omni3d/cubercnn_DLA34_FPN.pth \
  OUTPUT_DIR output/demo_with_tracking
```
To run the tracking demo on a video:

```bash
# Run the demo with the custom tracker
python demo/demo_tracker.py \
  --config-file cubercnn://omni3d/cubercnn_DLA34_FPN.yaml \
  --input-video "demo/video_indoor2.mp4" \
  --threshold 0.40 --display
```
## Training

To train the Omni3D model with tracking:

```bash
python tools/train_net.py \
  --config-file configs/Base_Omni3D.yaml \
  OUTPUT_DIR output/omni3d_with_tracking
```
## Inference

To evaluate a pretrained model:

```bash
python tools/train_net.py \
  --eval-only --config-file cubercnn://omni3d/cubercnn_DLA34_FPN.yaml \
  MODEL.WEIGHTS cubercnn://omni3d/cubercnn_DLA34_FPN.pth \
  OUTPUT_DIR output/evaluation
```
## Tracker Implementation

This project adds an object tracker to the original Omni3D model. The tracker matches detected objects across frames using a custom algorithm based on:

- 3D bounding box information
- GIoU and 3D IoU computation
- Object centers
- Category types
- Chamfer distance

Key features:

- Custom matching logic for continuous tracking across frames (see the sketch below)
- High-cost match handling for challenging detections
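The matching step can be pictured as a cost-matrix assignment between live tracks and incoming detections. The following is a minimal sketch, not the repository's actual code: it assumes dict-based tracks and detections, uses an axis-aligned 3D IoU in place of the tracker's GIoU and Chamfer distance terms, and the weights and threshold are made up.

```python
# Minimal matching sketch -- NOT the repository's implementation. Tracks and
# detections are assumed to be dicts with "category", "center" (np.ndarray of
# shape (3,)), and "box" ((xmin, ymin, zmin, xmax, ymax, zmax) array) keys.
import numpy as np
from scipy.optimize import linear_sum_assignment

COST_THRESHOLD = 1.5  # assumed gate: pairs above this count as "high-cost"

def iou_3d(a, b):
    """Axis-aligned 3D IoU; stands in for the tracker's GIoU/Chamfer terms."""
    lo, hi = np.maximum(a[:3], b[:3]), np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    union = np.prod(a[3:] - a[:3]) + np.prod(b[3:] - b[:3]) - inter
    return inter / (union + 1e-9)

def match_detections(tracks, detections):
    """Assign detections to tracks by minimizing a combined 3D cost."""
    cost = np.full((len(tracks), len(detections)), 1e6)
    for i, trk in enumerate(tracks):
        for j, det in enumerate(detections):
            if trk["category"] != det["category"]:
                continue  # category gating: never match across classes
            center_dist = np.linalg.norm(trk["center"] - det["center"])
            overlap = 1.0 - iou_3d(trk["box"], det["box"])  # lower is better
            cost[i, j] = 0.5 * center_dist + 0.5 * overlap  # assumed weights
    rows, cols = linear_sum_assignment(cost)  # Hungarian assignment
    matches, matched = [], set()
    for r, c in zip(rows, cols):
        if cost[r, c] < COST_THRESHOLD:  # reject high-cost pairs
            matches.append((r, c))
            matched.add(c)
    new_tracks = [j for j in range(len(detections)) if j not in matched]
    return matches, new_tracks  # unmatched detections start new tracks
```

Pairs whose cost exceeds the gate are treated as high-cost matches: rather than continuing an old track on weak evidence, the detection starts a new one.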
## Citing Omni3D

Please cite the original Omni3D paper:

```bibtex
@inproceedings{brazil2023omni3d,
  author       = {Garrick Brazil and Abhinav Kumar and Julian Straub and Nikhila Ravi and Justin Johnson and Georgia Gkioxari},
  title        = {{Omni3D}: A Large Benchmark and Model for {3D} Object Detection in the Wild},
  booktitle    = {CVPR},
  address      = {Vancouver, Canada},
  month        = {June},
  year         = {2023},
  organization = {IEEE},
}
```
## Chat with Phi 3 Vision

This folder is part of the Monocular 3D Object Detection and Tracking project. It contains a Streamlit-based application that lets users interact with a vision model, Phi-3 Vision, to analyze images and return detailed descriptions.

The application is designed to:

- Receive images from a server via a socket connection (see the client sketch below)
- Allow users to submit specific queries about the images
- Leverage Phi-3 Vision to provide detailed descriptions of image content

Key features:

- Real-time communication with a server for image acquisition
- Text-based interaction for specific image details
- Automatic frame fetching upon query submission
- Customizable chat interface that resets after each response
- Responsive design for a smooth user experience
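To make the flow concrete, here is a minimal client sketch. It is illustrative only and assumes a hypothetical framing protocol (a 4-byte big-endian length prefix followed by one JPEG frame) and a made-up server address; the real app's protocol and model call may differ.

```python
# Minimal client sketch: fetch a frame over a socket on each chat query.
import socket
import struct
import streamlit as st

HOST, PORT = "127.0.0.1", 9999  # assumed server address

def recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("server closed the stream early")
        buf += chunk
    return buf

def fetch_frame(host=HOST, port=PORT):
    """Fetch one JPEG-encoded frame from the image server."""
    with socket.create_connection((host, port)) as sock:
        size = struct.unpack(">I", recv_exact(sock, 4))[0]
        return recv_exact(sock, size)

st.title("Chat with Phi-3 Vision")
query = st.chat_input("Ask something about the current frame")
if query:
    frame = fetch_frame()  # a fresh frame is fetched on every query
    st.image(frame, caption="Current frame")
    # The real app sends `frame` and `query` to Phi-3 Vision here and renders
    # the model's description; a placeholder echo stands in for that call.
    st.write(f"(Phi-3 Vision response to: {query})")
```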
To run the application:

1. Clone the repository:

   ```bash
   git clone https://github.com/ramonatarantino/mocular-3d-object-detection-tracking.git
   cd mocular-3d-object-detection-tracking/chat-with-phi-3-vision
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the application:

   ```bash
   streamlit run app.py
   ```

4. Start the server: ensure the server providing the image stream is running.
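For local testing without the real server, a toy stand-in that matches the framing assumed in the client sketch above could look like this (illustrative only; the actual server code is the repository credited under License):

```python
# Toy frame server: sends one length-prefixed JPEG to each connecting client.
import socket
import struct

def serve_frames(jpeg_path, host="127.0.0.1", port=9999):
    """Serve a single test image using the assumed 4-byte length prefix."""
    with socket.create_server((host, port)) as srv:
        while True:
            conn, _ = srv.accept()
            with conn:
                with open(jpeg_path, "rb") as f:
                    data = f.read()
                conn.sendall(struct.pack(">I", len(data)) + data)

if __name__ == "__main__":
    serve_frames("sample_frame.jpg")  # hypothetical test image
```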
## License

This project is licensed under the MIT License. The server code is based on the work from this repository.

## Contributing

Contributions are welcome! Feel free to open issues or submit pull requests.