
Assignment 4 - Implement Simplified 3D Gaussian Splatting

This assignment covers a complete pipeline for reconstructing a 3D scene, represented by 3DGS, from multi-view images. The following steps use the chair folder; you can use any other folder by placing an images/ subdirectory in it.

Step 1. Structure-from-Motion

First, we use COLMAP to recover camera poses and a sparse set of 3D points. Please refer to 11-3D_from_Multiview.pptx to review the technical details.

```
python mvs_with_colmap.py --data_dir data/chair
```

Debug the reconstruction by running:

```
python debug_mvs_by_projecting_pts.py --data_dir data/chair
```
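The debug script essentially re-projects the recovered sparse points into each image using the recovered poses; if the reconstruction is good, the projections should land on the corresponding image features. A minimal sketch of such a projection, assuming COLMAP's world-to-camera convention (variable names here are hypothetical, not the script's actual API):

```python
import numpy as np

def project_points(pts_world, R, t, K):
    """pts_world: (N, 3) points; R: (3, 3) and t: (3,) world-to-camera pose;
    K: (3, 3) intrinsics. Returns (N, 2) pixel coordinates."""
    pts_cam = pts_world @ R.T + t             # world -> camera frame
    pts_img = pts_cam @ K.T                   # camera -> homogeneous pixels
    return pts_img[:, :2] / pts_img[:, 2:3]   # perspective divide
```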

Step 2. A Simplified 3D Gaussian Splatting (Your Main Part)

From the debug output of Step 1, you can see that the 3D points are too sparse to render the whole image. We will expand each point into a 3D Gaussian so that it covers more 3D space.

2.1 3D Gaussians Initialization

Refer to the original paper. To convert the 3D points into 3D Gaussians, we need to define a covariance matrix for each point; the initial Gaussians' centers are simply the points themselves. Following equation (6), the covariance is factored into a scaling matrix S and a rotation matrix R. Since the 3D Gaussians will be used for volume rendering, each Gaussian also needs an opacity attribute and a color attribute. The volume rendering process is formulated by equations (1), (2), and (3). The code here contains functions that initialize these attributes as optimizable parameters. You need to fill in the code here to compute the 3D covariance matrix from the quaternion (for rotation) and the scaling parameters.
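As a starting point, equation (6) builds the covariance as $\boldsymbol{\Sigma} = R S S^T R^T$. A minimal PyTorch sketch of this computation, assuming quaternions are stored as (w, x, y, z) (the starter code's parameter layout may differ):

```python
import torch
import torch.nn.functional as F

def build_covariance_3d(quaternions, scales):
    """quaternions: (N, 4) as (w, x, y, z); scales: (N, 3) per-axis scales."""
    q = F.normalize(quaternions, dim=-1)       # unit quaternions
    w, x, y, z = q.unbind(-1)
    # Rotation matrix R from the unit quaternion, rows flattened then reshaped.
    R = torch.stack([
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y),
        2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y),
    ], dim=-1).reshape(-1, 3, 3)
    S = torch.diag_embed(scales)               # scaling matrix S as diag(s)
    M = R @ S
    return M @ M.transpose(-1, -2)             # equation (6): R S S^T R^T
```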

2.2 Project 3D Gaussians to Obtain 2D Gaussians

According to equation (5), we project the 3D Gaussians to image space by transforming them with the world-to-camera transformation W and the Jacobian matrix J of the projective transformation, giving the 2D covariance $\boldsymbol{\Sigma}' = J W \boldsymbol{\Sigma} W^T J^T$. You need to fill in the code here to compute this projection.
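A sketch of this projection in PyTorch, assuming the Gaussian centers have already been transformed into camera space and the focal lengths are in pixel units (names and shapes are assumptions, not the starter code's layout):

```python
import torch

def project_covariance(cov3d, means_cam, W, fx, fy):
    """cov3d: (N, 3, 3) 3D covariances; means_cam: (N, 3) Gaussian centers in
    camera space; W: (3, 3) world-to-camera rotation; fx, fy: focal lengths."""
    tx, ty, tz = means_cam.unbind(-1)
    zeros = torch.zeros_like(tz)
    # Jacobian of the perspective projection at each center: (N, 2, 3).
    J = torch.stack([
        fx / tz, zeros,   -fx * tx / tz ** 2,
        zeros,   fy / tz, -fy * ty / tz ** 2,
    ], dim=-1).reshape(-1, 2, 3)
    T = J @ W                                   # combined transform J W
    return T @ cov3d @ T.transpose(-1, -2)      # equation (5): J W Sigma W^T J^T
```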

2.3 Compute the Gaussian Values

We need to evaluate the 2D Gaussians at pixel locations for volume rendering. A 2D Gaussian is represented by:

$$ f(\mathbf{x}; \boldsymbol{\mu}_{i}, \boldsymbol{\Sigma}_{i}) = \frac{1}{2 \pi \sqrt{ | \boldsymbol{\Sigma}_{i} |}} \exp \left ( {-\frac{1}{2}} (\mathbf{x} - \boldsymbol{\mu}_{i})^T \boldsymbol{\Sigma}_{i}^{-1} (\mathbf{x} - \boldsymbol{\mu}_{i}) \right ) = \frac{1}{2 \pi \sqrt{ | \boldsymbol{\Sigma}_{i} |}} \exp \left ( P_{(\mathbf{x}, i)} \right ) $$

Here, $\mathbf{x}$ is a 2D vector representing the pixel location, $\boldsymbol{\mu}_{i}$ is the 2D mean of the $i$-th 2D Gaussian, and $\boldsymbol{\Sigma}_{i}$ is its $2 \times 2$ covariance. The exponent $P_{(\mathbf{x}, i)}$ is:

$$ P_{(\mathbf{x}, i)} = {-\frac{1}{2}} (\mathbf{x} - \boldsymbol{\mu}_{i})^T \boldsymbol{\Sigma}_{i}^{-1} (\mathbf{x} - \boldsymbol{\mu}_{i}) $$

You need to fill in the code here to compute the Gaussian values.
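A batched PyTorch sketch of this evaluation (the tensor shapes are assumptions, not the starter code's actual layout):

```python
import math
import torch

def gaussian_values(pixels, means2d, cov2d):
    """pixels: (M, 2) pixel coordinates; means2d: (N, 2) 2D means;
    cov2d: (N, 2, 2) 2D covariances. Returns f values with shape (N, M)."""
    d = pixels[None, :, :] - means2d[:, None, :]          # (N, M, 2): x - mu
    cov_inv = torch.linalg.inv(cov2d)                     # (N, 2, 2)
    # Exponent P_(x, i) = -1/2 (x - mu)^T Sigma^{-1} (x - mu).
    P = -0.5 * torch.einsum('nmi,nij,nmj->nm', d, cov_inv, d)
    det = torch.linalg.det(cov2d)                         # (N,)
    norm = 1.0 / (2.0 * math.pi * torch.sqrt(det))        # 1 / (2 pi sqrt|Sigma|)
    return norm[:, None] * torch.exp(P)
```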

2.4 Volume Rendering (α-blending)

According to equations (1)-(3), given these N depth-ordered 2D Gaussians, we can compute their alpha and transmittance values at each pixel location in an image.

The alpha value of a 2D Gaussian $i$ at a single pixel location $\mathbf{x}$ can be calculated using:

$$ \alpha_{(\mathbf{x}, i)} = o_i \cdot f(\mathbf{x}; \boldsymbol{\mu}_{i}, \boldsymbol{\Sigma}_{i}) $$

Here, $o_i$ is the opacity of the $i$-th Gaussian, which is a learnable parameter.

Given N ordered 2D Gaussians, the transmittance value of a 2D Gaussian $i$ at a single pixel location $\mathbf{x}$ can be calculated using:

$$ T_{(\mathbf{x}, i)} = \prod_{j \lt i} (1 - \alpha_{(\mathbf{x}, j)}) $$

You need to fill in the code here for the final rendering computation.
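Putting equations (1)-(3) together, the rendered color at pixel $\mathbf{x}$ is the front-to-back alpha-composited sum

$$ C(\mathbf{x}) = \sum_{i=1}^{N} T_{(\mathbf{x}, i)} \, \alpha_{(\mathbf{x}, i)} \, \mathbf{c}_i $$

where $\mathbf{c}_i$ is the learnable color of the $i$-th Gaussian. A minimal sketch of this blending, assuming the per-pixel alphas are already sorted front to back along the first dimension:

```python
import torch

def alpha_blend(alphas, colors):
    """alphas: (N, M) per-Gaussian, per-pixel alphas, sorted front to back;
    colors: (N, 3) per-Gaussian RGB. Returns the rendered (M, 3) pixels."""
    # T_(x, i) = prod_{j < i} (1 - alpha_(x, j)); shift the cumulative
    # product by one so the frontmost Gaussian sees full transmittance.
    T = torch.cumprod(1.0 - alphas, dim=0)
    T = torch.cat([torch.ones_like(T[:1]), T[:-1]], dim=0)
    weights = T * alphas                                  # (N, M)
    return torch.einsum('nm,nc->mc', weights, colors)     # sum_i T_i alpha_i c_i
```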

After implementation, build your 3DGS model:

```
python train.py --colmap_dir data/chair --checkpoint_dir data/chair/checkpoints
```

Compare with the original 3DGS Implementation

Since we use a pure PyTorch implementation, the training speed and GPU memory usage are far from satisfactory. We also do not implement some crucial components, such as the adaptive Gaussian densification scheme. Run the original 3DGS implementation on the same dataset to compare the results.