
Rendering point clouds with perspective camera and specified intrinsics with Pulsar backend #1811

TheNeeloy opened this issue Jun 9, 2024 · 2 comments


TheNeeloy commented Jun 9, 2024

Hi there!

Goal:
I am attempting to use PyTorch3D with the Pulsar backend to render images from point clouds of an outdoor scene.

Setup:
The ground truth image from the camera looks like this (720x1280 resolution):
[ground truth image]
The RGB image is loaded from a numpy file named "rgb_img_save.npy" in the attached zip folder.

I collected the depth image from the same camera and saved it as "depth_img_save.npy" in the attached zip folder.

I also have the camera extrinsics (position and orientation) in a 4x4 transformation matrix saved as "mat_pose_save.npy" in the attached zip folder (at the bottom of this issue).

My camera intrinsics are defined as follows:

    WIDTH = 1_280
    HEIGHT = 720
    FOCAL_LENGTH = 529.
    PRINCIPAL_PT_X = 631.0499
    PRINCIPAL_PT_Y = 348.0125
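
For reference (my own framing, not something stated by the camera driver), these are plain pixel-space pinhole intrinsics; written as the conventional 3x3 K matrix, purely as a sketch, they would be:

    # Sketch: the same intrinsics as a pixel-space pinhole matrix (square pixels assumed).
    import numpy as np
    K = np.array([
        [FOCAL_LENGTH, 0.0,          PRINCIPAL_PT_X],
        [0.0,          FOCAL_LENGTH, PRINCIPAL_PT_Y],
        [0.0,          0.0,          1.0],
    ])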

Sanity Check with PyTorch3D
As a sanity check, I project the RGB image into world coordinates using the pytorch3d.implicitron.tools.point_cloud_utils.get_rgbd_point_cloud function and render it back to 2D with pytorch3d.renderer.PointsRenderer:

    # Import statements
    import numpy as np
    import torch
    import matplotlib.pyplot as plt
    from pytorch3d.renderer import PerspectiveCameras, PointsRasterizationSettings, PointsRasterizer, PointsRenderer, AlphaCompositor
    from pytorch3d.implicitron.tools.point_cloud_utils import get_rgbd_point_cloud
    from pytorch3d.structures import Pointclouds

    # Constants
    WIDTH = 1_280
    HEIGHT = 720
    FOCAL_LENGTH = 529.
    PRINCIPAL_PT_X = 631.0499
    PRINCIPAL_PT_Y = 348.0125
    device = torch.device("cuda")

    # Load our data
    rgb_img_load = np.load('rgb_img_save.npy')      # (1, 3, 720, 1280), uint8
    depth_img_load = np.load('depth_img_save.npy')  # (1, 1, 720, 1280), float32
    mat_pose_load = np.load('mat_pose_save.npy')    # (4, 4), float32

    rgb_img_torch = torch.from_numpy(rgb_img_load).to(device)
    depth_img_torch = torch.from_numpy(depth_img_load).to(device)
    mat_pose_torch = torch.from_numpy(mat_pose_load).to(device)

    # Define camera
    focal_length_torch = torch.tensor([[FOCAL_LENGTH]]).to(device)                      # (1, 1)
    principal_point_torch = torch.tensor([[PRINCIPAL_PT_X, PRINCIPAL_PT_Y]]).to(device) # (1, 2)
    rot_mat_torch = mat_pose_torch[:3,:3].unsqueeze(0)                                  # (1, 3, 3)
    trans_mat_torch = mat_pose_torch[:3,3].unsqueeze(0)                                 # (1, 3)
    img_size_torch = torch.tensor([[HEIGHT, WIDTH]]).to(device)                         # (1, 2)
    cameras = PerspectiveCameras(focal_length=focal_length_torch, principal_point=principal_point_torch, R=rot_mat_torch, T=trans_mat_torch, in_ndc=False, image_size=img_size_torch, device=device)

    # Create point cloud
    point_cloud_torch = get_rgbd_point_cloud(camera=cameras, image_rgb=rgb_img_torch, depth_map=depth_img_torch, euclidean=False)

    # Get features from point cloud
    pc_points = point_cloud_torch.points_packed()                                   # (720*1280, 3)
    pc_colors = point_cloud_torch.features_packed().type(torch.float32) / 255.      # (720*1280, 3)
    pc_rad = torch.ones(WIDTH*HEIGHT, dtype=torch.float32, device=device)           # (720*1280)

    pc_colors_alpha = torch.cat([pc_colors, torch.ones((WIDTH*HEIGHT, 1), device=device)], dim=1)
    pc_combined_alpha = Pointclouds(points=[pc_points], features=[pc_colors_alpha])

    # Render
    raster_settings = PointsRasterizationSettings(
        image_size=(HEIGHT,WIDTH), 
        radius = 0.003,
        points_per_pixel = 10,
        bin_size = 0
    )
    rasterizer = PointsRasterizer(cameras=cameras, raster_settings=raster_settings)
    renderer = PointsRenderer(
        rasterizer=rasterizer,
        compositor=AlphaCompositor()
    )
    images = renderer(pc_combined_alpha)
    plt.figure(figsize=(10, 10))
    plt.imshow(images[0, ..., :3].cpu().numpy())
    plt.axis("off")
    plt.show()

[rendered image: correct reprojection, red/blue channels swapped]
The rendered image looks correct! (The red and blue channels are flipped, but that's fine.)
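
(As an aside, since the source array appears to be in BGR order, a one-line fix for the swapped channels, just as a sketch, would be:)

    # Sketch: reorder the (1, 3, H, W) image tensor from BGR to RGB.
    rgb_img_torch = rgb_img_torch[:, [2, 1, 0], :, :]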

Attempting to use Pulsar backend to render image
I then tried to perform the same reprojection and achieve the same result using Pulsar to speed up the rendering.
I originally used the same camera definition.

    from pytorch3d.renderer import PulsarPointsRenderer

    raster_settings = PointsRasterizationSettings(
        image_size=(HEIGHT,WIDTH), 
        radius = 0.0003,
        points_per_pixel = 1,
        bin_size = 0
    )
    renderer = PulsarPointsRenderer(
        rasterizer=PointsRasterizer(cameras=cameras, raster_settings=raster_settings),
        n_channels=4
    ).to(device)
    images = renderer(pc_combined_alpha, gamma=(1e-5,),
                    bg_col=torch.tensor([0.0, 1.0, 0.0, 1.0], dtype=torch.float32, device=device),
                        znear=[.1], zfar=[100.0])
    plt.figure(figsize=(10, 10))
    plt.imshow(images[0, ..., :3].cpu().numpy())
    plt.axis("off")
    plt.show()

[rendered image: solid green background]
The rendered picture is just the green background pixels.
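
As a sanity check (an untested sketch on my part, using the standard PyTorch3D world-to-view transform), one could verify that the points actually land in front of this camera:

    # Untested sketch: inspect the point cloud in camera/view coordinates.
    # If the z values are negative or enormous, the pose or convention is off.
    points_view = cameras.get_world_to_view_transform().transform_points(pc_points)
    print(points_view[:, 2].min().item(), points_view[:, 2].max().item())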

I then tried redefining the camera from scratch, starting by ignoring the principal point and only specifying the focal length.

    cameras = PerspectiveCameras(device=device, R=rot_mat_torch, T=trans_mat_torch, focal_length=focal_length_torch)
    raster_settings = PointsRasterizationSettings(
        image_size=(HEIGHT,WIDTH), 
        radius = 0.0003,
        points_per_pixel = 1,
        bin_size = 0
    )
    renderer = PulsarPointsRenderer(
        rasterizer=PointsRasterizer(cameras=cameras, raster_settings=raster_settings),
        n_channels=4
    ).to(device)
    images = renderer(pc_combined_alpha, gamma=(1e-5,),
                    bg_col=torch.tensor([0.0, 1.0, 0.0, 1.0], dtype=torch.float32, device=device),
                        znear=[.1], zfar=[100.0])
    plt.figure(figsize=(10, 10))
    plt.imshow(images[0, ..., :3].cpu().numpy())
    plt.axis("off")
    plt.show()
    quit()

[rendered image: a few very large spheres]
It looks like the spheres are there, just far too big, so I kept scaling the focal length down.

Setting focal_length=focal_length_torch * 0.0015 in the above code:
[rendered image: closer, but the edges are still background]
Now we're getting closer! But I'm still not sure what focal length and principal point Pulsar actually expects, because the edges of the frame are still showing the background color.
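
For reference, my current reading of PyTorch3D's screen-to-NDC convention (an assumption on my part, not verified against the Pulsar conversion code) is that pixel-space intrinsics are divided by half the shorter image side:

    # Assumed screen -> NDC conversion (half the shorter image side as scale);
    # treat this as a sketch of the convention, not an authoritative formula.
    scale = min(HEIGHT, WIDTH) / 2.0                    # 360.0 for a 720x1280 image
    focal_ndc = FOCAL_LENGTH / scale                    # ~1.47
    px_ndc = -(PRINCIPAL_PT_X - WIDTH / 2.0) / scale    # NDC +X points left
    py_ndc = -(PRINCIPAL_PT_Y - HEIGHT / 2.0) / scale   # NDC +Y points up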

I also tried using the basic Pulsar backend (without the unified framework). To do that, I first converted the camera parameters to the native Pulsar format using pytorch3d.utils.pulsar_from_cameras_projection on the original perspective camera definition above:

    from pytorch3d.utils import pulsar_from_cameras_projection
    from pytorch3d.renderer.points.pulsar import Renderer

    pulsar_cam_params = pulsar_from_cameras_projection(cameras, img_size_torch)
    renderer = Renderer(WIDTH, HEIGHT, WIDTH*HEIGHT, right_handed_system=False).to(device)
    # Render.
    image = renderer(
        pc_points,
        pc_colors,
        pc_rad*.1,
        pulsar_cam_params[0],
        1.0e-5,  # Renderer blending parameter gamma, in [1., 1e-5].
        100.0,  # Maximum depth.
    )
    plt.figure(figsize=(10, 10))
    plt.imshow(image.cpu().numpy())
    plt.axis("off")
    plt.show()
    quit()

But the output is a completely blank image (white).

Would anyone have suggestions for how the Pulsar camera should be defined from the original camera intrinsics so that it reproduces the expected ground-truth image? I truly appreciate your time and help. Thanks!

github_issue_files.zip

TheNeeloy (Author) commented:

I found out from this other issue #1352 (comment) that if I use the code from PR #1369, point clouds align between Pulsar and PyTorch3D. I'll try that version on the data presented above and close the issue if it works as expected.

TheNeeloy (Author) commented:

With the pull request installed, this code produces the expected output:

    # Imports as in the snippets above (numpy, torch, matplotlib, and the PyTorch3D camera/renderer/point cloud utilities).
    WIDTH = 1_280
    HEIGHT = 720
    FOCAL_LENGTH = 529.
    PRINCIPAL_PT_X = 631.0499
    PRINCIPAL_PT_Y = 348.0125
    device = torch.device("cuda")
    
    # Load our data
    rgb_img_load = np.load('rgb_img_save.npy')      # (1, 3, 720, 1280), uint8
    depth_img_load = np.load('depth_img_save.npy')  # (1, 1, 720, 1280), float32
    mat_pose_load = np.load('mat_pose_save.npy')    # (4, 4), float32

    rgb_img_torch = torch.from_numpy(rgb_img_load).to(device)
    depth_img_torch = torch.from_numpy(depth_img_load).to(device)
    mat_pose_torch = torch.from_numpy(mat_pose_load).to(device)

    # Define camera
    focal_length_torch = torch.tensor([[FOCAL_LENGTH]]).to(device)                      # (1, 1)
    principal_point_torch = torch.tensor([[PRINCIPAL_PT_X, PRINCIPAL_PT_Y]]).to(device) # (1, 2)
    rot_mat_torch = mat_pose_torch[:3,:3].unsqueeze(0)                                  # (1, 3, 3)
    trans_mat_torch = mat_pose_torch[:3,3].unsqueeze(0)                                 # (1, 3)
    img_size_torch = torch.tensor([[HEIGHT, WIDTH]]).to(device)                         # (1, 2)
    cameras = PerspectiveCameras(focal_length=focal_length_torch, principal_point=principal_point_torch, R=rot_mat_torch, T=trans_mat_torch, in_ndc=False, image_size=img_size_torch, device=device)

    # Create point cloud
    point_cloud_torch = get_rgbd_point_cloud(camera=cameras, image_rgb=rgb_img_torch, depth_map=depth_img_torch)

    # Get features from point cloud
    pc_points = point_cloud_torch.points_packed()                                   # (720*1280, 3)
    pc_colors = point_cloud_torch.features_packed().type(torch.float32) / 255.      # (720*1280, 3)
    pc_rad = torch.ones(WIDTH*HEIGHT, dtype=torch.float32, device=device)           # (720*1280)

    pc_colors_alpha = torch.cat([pc_colors, torch.ones((WIDTH*HEIGHT, 1), device=device)], dim=1)
    pc_combined_alpha = Pointclouds(points=[pc_points], features=[pc_colors_alpha])

    # Render settings
    raster_settings = PointsRasterizationSettings(
        image_size=(HEIGHT,WIDTH), 
        radius = 0.0003,
        points_per_pixel = 1,
        bin_size = 0
    )

    renderer = PulsarPointsRenderer(
        rasterizer=PointsRasterizer(cameras=cameras, raster_settings=raster_settings),
        n_channels=4
    ).to(device)

    # Render image
    images = renderer(pc_combined_alpha, gamma=(1e-5,),
                    bg_col=torch.tensor([0.0, 1.0, 0.0, 1.0], dtype=torch.float32, device=device),
                        znear=[.1], zfar=[100.0])
    plt.figure(figsize=(10, 10))
    plt.imshow(images[0, ..., :3].cpu().numpy())
    plt.axis("off")
    plt.show()

    quit()

[rendered image: matches the ground truth]

Now I'll try rendering with the basic Pulsar interface rather than the unified interface, since it was mentioned in #1352 (comment) that it should run faster with fewer conversions.
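
Roughly, I expect it to mirror the earlier native-interface snippet; as an untested sketch (with the PR installed), something like:

    # Untested sketch of the plan: render with the native Pulsar Renderer,
    # reusing the camera conversion from the unified PerspectiveCameras.
    from pytorch3d.utils import pulsar_from_cameras_projection
    from pytorch3d.renderer.points.pulsar import Renderer

    pulsar_cam_params = pulsar_from_cameras_projection(cameras, img_size_torch)
    renderer = Renderer(WIDTH, HEIGHT, WIDTH * HEIGHT, right_handed_system=False).to(device)
    image = renderer(
        pc_points,              # (N, 3) world-space positions
        pc_colors,              # (N, 3) RGB features in [0, 1]
        pc_rad,                 # (N,) per-point radii in world units
        pulsar_cam_params[0],   # native Pulsar camera parameters
        1.0e-5,                 # gamma (blending parameter), in [1.0, 1e-5]
        100.0,                  # maximum depth
    )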
