LiDAR-based UAV Pose Estimation in Ocean Environment with Deep Point Transformer Network

This repository contains the code for my thesis on LiDAR-based UAV pose estimation in ocean environments. The work presents a deep point transformer network to predict UAV poses from LiDAR point cloud data.

For more details, refer to the full thesis:

Simon, Karl Philip. 2025. Ship-relative UAV Pose Estimation with 3D LiDAR. M.S. thesis, The George Washington University, ProQuest.

Requirements

  1. Python 3.8 or higher
  2. PyTorch
  3. For additional dependencies, see the original PointTransformer repository.

Installation

  1. Clone the repository:

    git clone https://github.com/fdcl-gwu/point-transformer.git
    cd point-transformer
  2. To train the model, run:

    python train_pose.py
  3. Training output is written to /log/pose_estimation, including the trained model's .pth checkpoint.

  4. To run inference, execute:

    python test_pose.py

    Predictions are saved as a JSON file named results.json with the following structure:

{
    "file": "000001.txt",  // Name of the input file
    "gt_kp": [[x1, y1, z1], [x2, y2, z2], ...],  // Ground truth keypoints (list of 3D coordinates)
    "pred_kp": [[x1, y1, z1], [x2, y2, z2], ...],  // Predicted keypoints (list of 3D coordinates)
    "gt_rotation": [[r11, r12, r13], [r21, r22, r23], [r31, r32, r33]],  // Ground truth rotation matrix
    "gt_translation": [tx, ty, tz],  // Ground truth translation vector
    "pred_rotation": [[r11, r12, r13], [r21, r22, r23], [r31, r32, r33]],  // Predicted rotation matrix
    "pred_translation": [tx, ty, tz],  // Predicted translation vector
    "scale": s,  // Scale factor used to normalize the input point cloud
    "centroid": [cx, cy, cz],  // Centroid of the input point cloud
    "points": [[x1, y1, z1], [x2, y2, z2], ...],  // Original input point cloud (list of 3D points)
    "refined_rotation": [[r11, r12, r13], [r21, r22, r23], [r31, r32, r33]],  // Refined rotation matrix after GICP
    "refined_translation": [tx, ty, tz],  // Refined translation vector after GICP
    "gicp_converged": true,  // Boolean indicating if GICP refinement converged
    "gicp_iterations": 3  // Number of iterations performed by GICP
}
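
Once results.json is available, the per-frame fields can be compared directly. Below is a minimal sketch of consuming the file and computing rotation and translation errors; the field names match the structure above, but the error metrics themselves are illustrative and not part of the repository's evaluation code.

```python
import json
import math
import os

def rotation_error_deg(R_gt, R_pred):
    """Geodesic distance between two 3x3 rotation matrices, in degrees."""
    # trace(R_gt^T @ R_pred) expanded element-wise
    trace = sum(R_gt[i][j] * R_pred[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against floating-point drift outside [-1, 1]
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error(t_gt, t_pred):
    """Euclidean distance between two translation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(t_gt, t_pred)))

if os.path.exists("results.json"):
    with open("results.json") as f:
        results = json.load(f)
    # results.json may hold a single record or a list of records
    for rec in results if isinstance(results, list) else [results]:
        print(rec["file"],
              rotation_error_deg(rec["gt_rotation"], rec["pred_rotation"]),
              translation_error(rec["gt_translation"], rec["pred_translation"]))
```

The same comparison can be repeated against the refined_rotation and refined_translation fields to quantify how much the GICP refinement improves the network's raw prediction.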

Dataset

Format Requirements

The dataloader expects the dataset to follow this structure:

Dataset/
└── gazebo-metadata/
    ├── clouds/       # Point cloud .txt files (1024 points each, normalized)
    ├── poses/        # Ground-truth poses in quaternion format
    └── keypoints/    # Ship keypoints (x, y, z) for alignment

Note: gazebo-metadata is just an intermediate directory and can be renamed or left as a placeholder.
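
For the point clouds in clouds/, a minimal sketch of the unit-sphere normalization is shown below (see SimNetDataLoader.py in the repository for the actual implementation): points are centered at their centroid and scaled so the farthest point lies on the unit sphere.

```python
import numpy as np

def normalize_to_unit_sphere(points):
    """points: (N, 3) array -> (normalized points, centroid, scale)."""
    centroid = points.mean(axis=0)          # center of the raw cloud
    centered = points - centroid
    scale = np.linalg.norm(centered, axis=1).max()  # farthest-point radius
    return centered / scale, centroid, scale
```

If this matches the dataloader's convention, the scale and centroid fields written to results.json are exactly what is needed to map predictions back into the original sensor frame.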

  1. All files must be text files named sequentially as 000000.txt, 000001.txt, etc.

  2. Point clouds must contain 1024 points, scaled to the unit sphere (see SimNetDataLoader.py for details).

  3. Poses are represented as quaternions and resolved in the ship frame. Each pose has the following structure:

    {
         "x": -3.0525446766862956,
         "y": 1.8613828372730112,
         "z": 2.7186098109594017,
         "qx": 0.025889202460656503,
         "qy": 0.1006027945007082,
         "qz": -0.14496384882175387,
         "qw": 0.9839686526863461
    }
    
  4. The code uses 40 keypoints, each containing x, y, and z coordinates.
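
The quaternion poses above can be converted to a rotation matrix for comparison with the matrices in results.json. The sketch below uses the (qx, qy, qz, qw) field names from the pose structure shown and the standard unit-quaternion-to-matrix formula; it is illustrative, not the repository's own conversion code.

```python
import math

def quat_to_rotation_matrix(qx, qy, qz, qw):
    """Convert a quaternion (x, y, z, w order) to a 3x3 rotation matrix."""
    # Normalize first in case the stored quaternion drifted off unit length
    n = math.sqrt(qx*qx + qy*qy + qz*qz + qw*qw)
    qx, qy, qz, qw = qx/n, qy/n, qz/n, qw/n
    return [
        [1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)],
        [2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)],
        [2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)],
    ]
```

For example, the identity quaternion (0, 0, 0, 1) yields the identity matrix, and a quaternion with qz = sin(θ/2), qw = cos(θ/2) yields a rotation of θ about the z-axis.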

