[BUG] Generic ROS2 driver output for spatial yolo is incorrect #548

kikass13 · 2024-06-18T18:39:33Z

Hello,

I am trying to use the Tiny YOLOv4 model with spatial information via the camera.cpp ROS node (launched through camera.launch.py). The model runs and inference produces correct classifications, but the spatial information is way off. I get values from -3.0 to 3.0 meters on all axes (x, y, z) for pose.position while detecting a human (myself) sitting directly in front of the camera.

Position log while I'm sitting ~50 cm in front of the camera:

[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.74098539352417, y=3.4557785987854004, z=8.550938606262207)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.1852463483810425, y=0.8710846900939941, z=2.1377346515655518)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.7184200286865234, y=2.7328147888183594, z=6.706618309020996)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.9341843128204346, y=0.6809415817260742, z=1.684913992881775)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.1088430881500244, y=2.2848124504089355, z=5.607172966003418)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.0101494789123535, y=2.2122786045074463, z=5.429166793823242)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.1424061059951782, y=0.8395997285842896, z=2.060467004776001)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.122596263885498, y=3.054694890975952, z=7.435598850250244)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.7084633111953735, y=1.2556174993515015, z=3.0814192295074463)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.873324155807495, y=2.111720561981201, z=5.182386875152588)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.9825876355171204, y=0.72214275598526, z=1.7722152471542358)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.370492696762085, y=1.7564494609832764, z=4.2754693031311035)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-5.3529887199401855, y=3.9821012020111084, z=9.772500991821289)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.4880242347717285, y=3.318418264389038, z=8.14375114440918)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.341932535171509, y=1.7278892993927002, z=4.2754693031311035)

The resulting pose is also very noisy, so I suspect that something is wrong with it.
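One way to look at the noise in the log above is to separate direction from range. The following is a minimal sketch, not part of the driver: it parses a few of the logged points (the regular expression and the `parse_points` helper are illustrative names, not from depthai-ros) and computes x/z and y/z. For these samples the ratios stay nearly constant while z varies several-fold, which may indicate that the detection's position in the image is stable and the depth value is the part that jumps.

```python
import re

# A few lines copied from the log above (node-name prefix removed).
LOG = """\
geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
geometry_msgs.msg.Point(x=-4.74098539352417, y=3.4557785987854004, z=8.550938606262207)
geometry_msgs.msg.Point(x=-1.1852463483810425, y=0.8710846900939941, z=2.1377346515655518)
geometry_msgs.msg.Point(x=-3.7184200286865234, y=2.7328147888183594, z=6.706618309020996)
geometry_msgs.msg.Point(x=-0.9341843128204346, y=0.6809415817260742, z=1.684913992881775)
geometry_msgs.msg.Point(x=-5.3529887199401855, y=3.9821012020111084, z=9.772500991821289)
"""

# Matches the repr of geometry_msgs.msg.Point as it appears in the log.
POINT_RE = re.compile(r"Point\(x=(-?[\d.]+), y=(-?[\d.]+), z=(-?[\d.]+)\)")


def parse_points(text):
    """Return (x, y, z) tuples for every logged Point, skipping z == 0 entries."""
    points = []
    for match in POINT_RE.finditer(text):
        x, y, z = (float(value) for value in match.groups())
        if z != 0.0:
            points.append((x, y, z))
    return points


points = parse_points(LOG)
x_over_z = [x / z for x, _, z in points]
y_over_z = [y / z for _, y, z in points]
z_values = [z for _, _, z in points]

print(f"x/z range: {min(x_over_z):.3f} .. {max(x_over_z):.3f}")
print(f"y/z range: {min(y_over_z):.3f} .. {max(y_over_z):.3f}")
print(f"z range:   {min(z_values):.2f} .. {max(z_values):.2f} m")
```

The all-zero points are skipped because they carry no direction information; whether they mean "no valid depth" is an assumption here.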

I have a working solution with this example:

https://github.com/luxonis/depthai-python/blob/main/examples/SpatialDetection/spatial_tiny_yolo.py

Here, the pipeline is created manually (rather than by the generic ROS driver pipeline), and it works rather well. The same model outputs reasonable xyz coordinates (in mm, because they are taken directly from the output).

Minimal Reproducible Example

Start

ros2 launch depthai_ros_driver camera.launch.py camera_model:=OAK-D params_file:=$HOME/duckbrain_umbrella/ros2_ws/src/depthai-ros/depthai_ros_driver/config/rgbd.yaml

and watch the output of ros2 topic echo /oak/nn/spatial_detections while detecting something with the camera

Expected behavior

I would expect outputs like in this example

I ran the example like this:

python3 spatial_tiny_yolo.py

Position log (x y z) while I'm sitting ~50 cm in front of the camera:

0.1027177734375 0.04964692306518555 0.4928494567871094
0.09987322998046876 -0.17121124267578125 0.3734034118652344
-0.10212914276123047 0.017021522521972657 0.4900251159667969
0.1694300994873047 -0.2740165710449219 0.6021787719726562
-0.09702268981933594 0.015319369316101073 0.4900251159667969
0.09479540252685546 -0.16431202697753905 0.36386972045898436
-0.09282048797607421 0.028689970016479494 0.4858487548828125
0.10459590911865234 -0.16762165832519532 0.386046875
-0.04489921951293945 0.01726892852783203 0.4971475830078125
-0.07535162353515625 0.06658980560302734 0.5044801330566406
-0.043298191070556644 0.022515058517456055 0.49859698486328125
-0.04516178512573242 0.017369916915893555 0.5000548706054687
-0.04006796646118164 0.020905023574829103 0.5015213012695312
-0.02803781509399414 0.02979017448425293 0.5044801330566406
-0.03023613929748535 0.037350521087646485 0.5120322265625
-0.0316358642578125 0.03515095520019531 0.5059726867675781
-0.032600372314453126 0.03260036849975586 0.521398681640625
-0.03023613929748535 0.030236135482788085 0.5120322265625
-0.03032693862915039 0.026759061813354492 0.5135698852539062
-0.02767319107055664 0.025828311920166016 0.5311141967773437
-0.025748346328735353 0.025748346328735353 0.5294698486328125

Can someone tell me why there is such a difference between the output quality? It seems like a bug to me.
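To put a number on that difference, the sketch below compares the depth (z) values from the first six non-zero entries of the ROS driver log with the first six entries of the example's log, against the roughly 0.5 m sitting distance stated above. The `summarize` helper and the choice of median and max-minus-min as metrics are assumptions made for illustration; any robust statistic would do.

```python
from statistics import median

# Depth (z) samples in meters, copied from the two logs in this issue.
ROS_DRIVER_Z = [8.550938606262207, 2.1377346515655518, 6.706618309020996,
                1.684913992881775, 5.607172966003418, 5.429166793823242]
EXAMPLE_Z = [0.4928494567871094, 0.3734034118652344, 0.4900251159667969,
             0.6021787719726562, 0.4900251159667969, 0.36386972045898436]
EXPECTED_Z = 0.5  # approximate sitting distance from the report


def summarize(z_values, expected):
    """Return the median depth, its error against `expected`, and max - min."""
    mid = median(z_values)
    return {
        "median": mid,
        "error": mid - expected,
        "spread": max(z_values) - min(z_values),
    }


ros_stats = summarize(ROS_DRIVER_Z, EXPECTED_Z)
example_stats = summarize(EXAMPLE_Z, EXPECTED_Z)

for name, stats in (("ros driver", ros_stats), ("example", example_stats)):
    print(f"{name}: median={stats['median']:.2f} m, "
          f"error={stats['error']:+.2f} m, spread={stats['spread']:.2f} m")
```

On these samples the example's median lands within a few centimeters of 0.5 m, while the driver's median is several meters off and its spread is more than an order of magnitude larger.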

I also tried setting the following parameters in the .yaml config (to make the pipeline more similar to the example):

stereo:
      i_height: 416
      i_width: 416
      i_align_depth: true

left:
      i_resolution: 400P

right:
      i_resolution: 400P


kikass13 · 2024-06-19T08:27:03Z

I tested this:

ros2 launch depthai_examples tracker_yolov4_spatial_node.launch.py
ros2 launch depthai_examples yolov4_publisher.launch.py spatial_camera:=true

and the results look fine as well.

It's only with the camera.cpp generic pipeline that the results are bad.

Serafadam · 2024-06-21T07:59:26Z

Hi, thanks for the report. Could you try testing with the following parameters:

    nn:
      i_disable_resize: false
    rgb:
      i_preview_size: 416

kikass13 · 2024-06-21T11:31:29Z

@Serafadam

Thanks for your reply.

I'm using this config:

/oak:
  ros__parameters:
    ### will be added via launchfile due to me not wanting to fix the path here
    nn:
      i_nn_config_path: PLACEHOLDER_PATH_TO_CONFIG_JSON_WHICH_WILL_BE_REPLACED_BY_LAUNCH_CONFIG
      i_enable_passthrough: true
      i_enable_passthrough_depth: true
      i_disable_resize: false

    camera:
      i_nn_type: spatial
      i_pipeline_dump: true
      i_enable_ir: true
    
    rgb:
      i_fps: 10.0
      i_resolution: 720P
      i_preview_size: 416

      
    stereo:
      i_align_depth: true
      
      i_height: 320
      i_width: 320  

      # i_stereo_conf_threshold: 40
      i_stereo_conf_threshold: 200
      i_subpixel: true
      i_depth_preset: HIGH_DENSITY ###Prefers density over accuracy. Less invalid depth values, but more outliers.
      i_lr_check: true ### Left-Right Check or LR-Check is used to remove incorrectly calculated disparity pixels due to occlusions at object borders (Left and Right camera views are slightly different).
      i_lrc_threshold: 10
      i_fps: 10.0
      i_align_depth: true

      ### added filter
      i_enable_decimation_filter: true
      i_decimation_filter_decimation_mode: NON_ZERO_MEDIAN ### "PIXEL_SKIPPING", "NON_ZERO_MEDIAN", "NON_ZERO_MEAN"
      i_decimation_filter_decimation_factor: 4 ### default 1, max 4
      
      i_enable_spatial_filter: true
      i_spatial_filter_hole_filling_radius: 2
      i_spatial_filter_alpha: 0.5
      i_spatial_filter_delta: 20
      i_spatial_filter_iterations: 1

      i_enable_threshold_filter: true
      i_threshold_filter_min_range: 400
      i_threshold_filter_max_range: 10000

      i_enable_speckle_filter: true
      i_speckle_filter_speckle_range: 50

    left:
      i_publish_topic: false
      i_fps: 10.0

    right:
      i_publish_topic: false
      i_fps: 10.0

Setting i_disable_resize: false and i_preview_size: 416 did not work; the resulting spatial info is still bad.

I have rewritten one of the examples (depthai_examples/yolov4_spatial_publisher.cpp) and hardcoded nearly all of the YAML parameters (from my config above) 1:1 into the C++ pipeline. The resulting node works fine (the spatial information is correct). That's why I assume that either I have configured something wrong, or something in the pipeline is not created correctly (inside the camera.cpp driver).
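A side observation on the config above: i_align_depth appears twice under stereo. Both entries are true, so this is unlikely to explain the bad spatial output, but how a given parser treats duplicate keys varies, so it can be worth ruling out. The sketch below is a small standard-library check for repeated keys within the same mapping; `find_duplicate_keys` is a hypothetical helper written for this purpose, it handles only plain `key: value` block mappings, and it is not a general YAML parser.

```python
def find_duplicate_keys(yaml_text):
    """Return (dotted_path, line_number) for keys repeated in the same mapping.

    Only plain block mappings of 'key: value' lines are handled.
    """
    stack = []       # (indent, key) of each enclosing mapping entry
    seen = {}        # parent path -> keys already seen under that parent
    duplicates = []
    for line_number, raw in enumerate(yaml_text.splitlines(), start=1):
        stripped = raw.strip()
        # Skip blank lines, full-line comments, and lines without a key.
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        key = stripped.split(":", 1)[0].strip()
        # Leave any mappings that are at the same or deeper indentation.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        parent = tuple(k for _, k in stack)
        keys_here = seen.setdefault(parent, set())
        if key in keys_here:
            duplicates.append((".".join(parent + (key,)), line_number))
        keys_here.add(key)
        stack.append((indent, key))
    return duplicates


# Shortened fragment of the config above, keeping the repeated key.
CONFIG = """\
/oak:
  ros__parameters:
    stereo:
      i_align_depth: true
      i_height: 320
      i_width: 320
      i_subpixel: true
      i_fps: 10.0
      i_align_depth: true
    left:
      i_fps: 10.0
"""

duplicates = find_duplicate_keys(CONFIG)
print(duplicates)
```

For this fragment the check reports the second i_align_depth (line 9 of the fragment) and does not flag i_fps, since that key appears under two different parents.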

kikass13 added the bug label on Jun 18, 2024
