[BUG] Generic ROS2 driver output for spatial yolo is incorrect #548

Open
kikass13 opened this issue Jun 18, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@kikass13

Hello,

I'm trying to use the tiny YOLOv4 model with spatial information via the camera.cpp ROS node (launched through camera.launch.py). The model runs and inference classifies correctly, but the spatial information is way off: I get roughly -3.0 to 3.0 meters on all axes (x, y, z) for pose.position while detecting a human (myself) sitting directly in front of the camera.

Position log while I'm sitting ~50 cm in front of the camera:
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.74098539352417, y=3.4557785987854004, z=8.550938606262207)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.1852463483810425, y=0.8710846900939941, z=2.1377346515655518)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.7184200286865234, y=2.7328147888183594, z=6.706618309020996)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.9341843128204346, y=0.6809415817260742, z=1.684913992881775)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.1088430881500244, y=2.2848124504089355, z=5.607172966003418)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.0101494789123535, y=2.2122786045074463, z=5.429166793823242)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.1424061059951782, y=0.8395997285842896, z=2.060467004776001)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.122596263885498, y=3.054694890975952, z=7.435598850250244)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.7084633111953735, y=1.2556174993515015, z=3.0814192295074463)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.873324155807495, y=2.111720561981201, z=5.182386875152588)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.9825876355171204, y=0.72214275598526, z=1.7722152471542358)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.370492696762085, y=1.7564494609832764, z=4.2754693031311035)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-5.3529887199401855, y=3.9821012020111084, z=9.772500991821289)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.4880242347717285, y=3.318418264389038, z=8.14375114440918)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.341932535171509, y=1.7278892993927002, z=4.2754693031311035)

The resulting pose is also very noisy, so I suspect that something is wrong with it.
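
One way to look at the log above is to compare the viewing direction (x/z and y/z) with the depth (z) for each sample. The sketch below parses a few lines copied verbatim from that log and prints both. It is only an observation aid, not a diagnosis: the regular expression assumes the `Point(x=..., y=..., z=...)` text format shown in the log, and the helper names are made up for this example.

```python
import re

# Matches the Point(...) repr printed in the log above (format assumed from the log).
NUM = r"(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
POINT_RE = re.compile(rf"Point\(x={NUM}, y={NUM}, z={NUM}\)")

# A few lines copied verbatim from the position log above.
LOG = """\
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.74098539352417, y=3.4557785987854004, z=8.550938606262207)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.1852463483810425, y=0.8710846900939941, z=2.1377346515655518)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.9341843128204346, y=0.6809415817260742, z=1.684913992881775)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-5.3529887199401855, y=3.9821012020111084, z=9.772500991821289)
"""


def parse_points(log_text):
    """Extract (x, y, z) float tuples from every Point(...) in the log text."""
    return [tuple(float(v) for v in m.groups()) for m in POINT_RE.finditer(log_text)]


def direction_ratios(points):
    """Return (x/z, y/z, z) per point, skipping samples without a valid depth."""
    return [(x / z, y / z, z) for x, y, z in points if z > 0.0]


points = parse_points(LOG)
ratios = direction_ratios(points)
for rx, ry, z in ratios:
    print(f"x/z={rx:+.3f}  y/z={ry:+.3f}  z={z:.2f} m")
```

For the non-zero samples in the log, x/z stays near -0.55 and y/z near 0.40 to 0.41, while z ranges from about 1.7 m to 9.8 m. That would be consistent with a stable direction to the detection and an unstable depth value, although the log alone cannot show where the depth error comes from.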

I have a working solution with this example:

Here the pipeline is created manually (rather than by the generic ROS driver), and it works rather well. The same model outputs reasonable xyz coordinates (in mm, because they are taken directly from the output) for

Minimal Reproducible Example

Start

ros2 launch depthai_ros_driver camera.launch.py camera_model:=OAK-D params_file:=$HOME/duckbrain_umbrella/ros2_ws/src/depthai-ros/depthai_ros_driver/config/rgbd.yaml

and watch the output of ros2 topic echo /oak/nn/spatial_detections while detecting something with the camera

Expected behavior

I would expect outputs like in this example

I ran the example like this:

python3 spatial_tiny_yolo.py

Position log (x y z) while I'm sitting ~50 cm in front of the camera:
0.1027177734375 0.04964692306518555 0.4928494567871094
0.09987322998046876 -0.17121124267578125 0.3734034118652344
-0.10212914276123047 0.017021522521972657 0.4900251159667969
0.1694300994873047 -0.2740165710449219 0.6021787719726562
-0.09702268981933594 0.015319369316101073 0.4900251159667969
0.09479540252685546 -0.16431202697753905 0.36386972045898436
-0.09282048797607421 0.028689970016479494 0.4858487548828125
0.10459590911865234 -0.16762165832519532 0.386046875
-0.04489921951293945 0.01726892852783203 0.4971475830078125
-0.07535162353515625 0.06658980560302734 0.5044801330566406
-0.043298191070556644 0.022515058517456055 0.49859698486328125
-0.04516178512573242 0.017369916915893555 0.5000548706054687
-0.04006796646118164 0.020905023574829103 0.5015213012695312
-0.02803781509399414 0.02979017448425293 0.5044801330566406
-0.03023613929748535 0.037350521087646485 0.5120322265625
-0.0316358642578125 0.03515095520019531 0.5059726867675781
-0.032600372314453126 0.03260036849975586 0.521398681640625
-0.03023613929748535 0.030236135482788085 0.5120322265625
-0.03032693862915039 0.026759061813354492 0.5135698852539062
-0.02767319107055664 0.025828311920166016 0.5311141967773437
-0.025748346328735353 0.025748346328735353 0.5294698486328125
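
As a quick sanity check on the example's log, the sketch below takes the z column exactly as printed above and computes its median and how many samples fall close to it. It assumes the printed values are in meters, which fits a subject sitting ~50 cm away; the 5 cm window is an arbitrary choice for illustration.

```python
from statistics import median

# Third column (z) of the example's position log above, copied as printed.
z_values = [
    0.4928494567871094, 0.3734034118652344, 0.4900251159667969,
    0.6021787719726562, 0.4900251159667969, 0.36386972045898436,
    0.4858487548828125, 0.386046875, 0.4971475830078125,
    0.5044801330566406, 0.49859698486328125, 0.5000548706054687,
    0.5015213012695312, 0.5044801330566406, 0.5120322265625,
    0.5059726867675781, 0.521398681640625, 0.5120322265625,
    0.5135698852539062, 0.5311141967773437, 0.5294698486328125,
]

WINDOW_M = 0.05  # arbitrary tolerance around the median, in meters (assumed unit)

z_median = median(z_values)
close = sum(abs(z - z_median) <= WINDOW_M for z in z_values)

print(f"median z = {z_median:.3f} m")
print(f"{close} of {len(z_values)} samples within {WINDOW_M} m of the median")
```

The median comes out at about 0.50 m, and most samples sit within a few centimeters of it, which matches the real distance far better than the driver output.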

Can someone tell me why there is such a difference between the output quality? It seems like a bug to me.

I also tried setting the following parameters in the .yaml config (to make the pipeline more similar to the example):

stereo:
      i_height: 416
      i_width: 416
      i_align_depth: true

left:
      i_resolution: 400P

right:
      i_resolution: 400P

kikass13 added the bug label on Jun 18, 2024

@kikass13
Author

kikass13 commented Jun 19, 2024

I tested these:

  • ros2 launch depthai_examples tracker_yolov4_spatial_node.launch.py
  • ros2 launch depthai_examples yolov4_publisher.launch.py spatial_camera:=true

The results look fine with both of these as well.

It's only with the generic camera.cpp pipeline that the results are bad.

@Serafadam
Collaborator

Hi, thanks for the report. Could you try testing with the following parameters:

    nn:
      i_disable_resize: false
    rgb:
      i_preview_size: 416

@kikass13
Author

kikass13 commented Jun 21, 2024

@Serafadam

Thanks for your reply.

I'm using this config:

/oak:
  ros__parameters:
    ### will be added via launchfile due to me not wanting to fix the path here
    nn:
      i_nn_config_path: PLACEHOLDER_PATH_TO_CONFIG_JSON_WHICH_WILL_BE_REPLACED_BY_LAUNCH_CONFIG
      i_enable_passthrough: true
      i_enable_passthrough_depth: true
      i_disable_resize: false

    camera:
      i_nn_type: spatial
      i_pipeline_dump: true
      i_enable_ir: true
    
    rgb:
      i_fps: 10.0
      i_resolution: 720P
      i_preview_size: 416

      
    stereo:
      i_align_depth: true
      
      i_height: 320
      i_width: 320  

      # i_stereo_conf_threshold: 40
      i_stereo_conf_threshold: 200
      i_subpixel: true
      i_depth_preset: HIGH_DENSITY ###Prefers density over accuracy. Less invalid depth values, but more outliers.
      i_lr_check: true ### Left-Right Check or LR-Check is used to remove incorrectly calculated disparity pixels due to occlusions at object borders (Left and Right camera views are slightly different).
      i_lrc_threshold: 10
      i_fps: 10.0
      i_align_depth: true

      ### added filter
      i_enable_decimation_filter: true
      i_decimation_filter_decimation_mode: NON_ZERO_MEDIAN ### "PIXEL_SKIPPING", "NON_ZERO_MEDIAN", "NON_ZERO_MEAN"
      i_decimation_filter_decimation_factor: 4 ### default 1, max 4
      
      i_enable_spatial_filter: true
      i_spatial_filter_hole_filling_radius: 2
      i_spatial_filter_alpha: 0.5
      i_spatial_filter_delta: 20
      i_spatial_filter_iterations: 1

      i_enable_threshold_filter: true
      i_threshold_filter_min_range: 400
      i_threshold_filter_max_range: 10000

      i_enable_speckle_filter: true
      i_speckle_filter_speckle_range: 50

    left:
      i_publish_topic: false
      i_fps: 10.0

    right:
      i_publish_topic: false
      i_fps: 10.0

Setting i_disable_resize: false and i_preview_size: 416 did not help; the resulting spatial info is still bad.

I have rewritten one of the examples (depthai_examples/yolov4_spatial_publisher.cpp) and hardcoded nearly all of the YAML parameters from my config above 1:1 into the C++ pipeline. The resulting node works fine (the spatial information is correct). That's why I assume that either I have configured something wrong or something in the pipeline is not created correctly inside the camera.cpp driver.
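
Since the config above sets i_pipeline_dump: true, one possible way to narrow this down is to compare the driver's pipeline dump with an equivalent dump from the hand-written node. The sketch below is a generic nested-JSON diff and nothing more: it assumes both dumps are available as JSON, it knows nothing about the actual dump schema, and the dictionaries in the demo are toy stand-ins rather than real dump contents.

```python
MISSING = "<missing>"  # placeholder for a path present in only one dump


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into {"a.b[0].c": leaf_value} pairs."""
    if isinstance(obj, dict) and obj:
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in obj.items()]
    elif isinstance(obj, list) and obj:
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(obj)]
    else:
        return {prefix: obj}  # scalar or empty container
    flat = {}
    for path, value in items:
        flat.update(flatten(value, path))
    return flat


def diff_dumps(left, right):
    """Return {path: (left_value, right_value)} for every leaf that differs."""
    a, b = flatten(left), flatten(right)
    return {
        path: (a.get(path, MISSING), b.get(path, MISSING))
        for path in sorted(set(a) | set(b))
        if a.get(path, MISSING) != b.get(path, MISSING)
    }


if __name__ == "__main__":
    # Toy stand-ins (hypothetical keys). With real files, load each dump with
    # json.load() and pass the two resulting objects to diff_dumps().
    driver_dump = {"stereo": {"width": 320, "height": 320}, "nn": {"input": [416, 416]}}
    manual_dump = {"stereo": {"width": 416, "height": 416}, "nn": {"input": [416, 416]}}
    for path, (drv, man) in diff_dumps(driver_dump, manual_dump).items():
        print(f"{path}: driver={drv!r} manual={man!r}")
```

Any path that shows up in the output is a setting that differs between the two pipelines, which gives a short list of candidates to check against the spatial results.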
