stereo_matching

This is a TensorFlow re-implementation of Luo, W., Schwing, A. G., & Urtasun, R. (2016). Efficient Deep Learning for Stereo Matching. CVPR 2016. (https://www.cs.toronto.edu/~urtasun/publications/luo_etal_cvpr16.pdf)

To run

Setup data folders

data
└── kitti_2015
    ├── training
    │   ├── image_2
    │   │   ├── 000000_10.png
    │   │   ├── 000001_10.png
    │   │   └── ...
    │   ├── image_3
    │   ├── disp_noc_0
    │   └── ...
    └── testing
        ├── image_2
        └── image_3

Start training

python main.py --dataset kitti_2015 --patch-size 37 --disparity-range 201
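
As a rough illustration of what the two numeric flags could control, the sketch below cuts one training pair in the style described in the paper: a square left patch and a wider right strip that covers every candidate disparity. The function name `extract_patch_pair`, its arguments, and the choice to centre the right strip on the ground-truth match are assumptions made for illustration; they are not taken from this repository's code.

```python
import numpy as np

def extract_patch_pair(left, right, row, col, disparity,
                       patch_size=37, disparity_range=201):
    """Cut a left patch and a wider right strip around one labelled pixel.

    Hypothetical helper: assumes (row, col) is far enough from the image
    border that both slices stay inside the images.
    """
    half = patch_size // 2            # 18 for a 37 x 37 patch
    half_range = disparity_range // 2  # 100 for 201 candidate disparities

    # Square patch centred on the labelled pixel in the left image.
    left_patch = left[row - half: row + half + 1,
                      col - half: col + half + 1]

    # Assumption: the right strip is centred on the true match, col - disparity,
    # and extended by half_range columns on each side.
    right_col = col - disparity
    right_patch = right[row - half: row + half + 1,
                        right_col - half - half_range:
                        right_col + half + half_range + 1]
    return left_patch, right_patch
```

With the defaults above, the left patch is 37 x 37 and the right strip is 37 x 237 (37 + 201 - 1 columns), so the strip holds one candidate position per disparity class.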

Results

  • The model was trained for 40k iterations.
  • Qualitative results are shown on the validation set.
  • The 3-pixel error is evaluated on the validation set.

KITTI 2015 Stereo

Example input images

Disparity Ground-truth

Example input patches

Qualitative results

Post-processing
  • Cost-aggregation
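
A minimal sketch of cost aggregation, assuming the simple scheme described in the paper: average pooling of the cost volume over a small spatial window (5 x 5 there). The function name `aggregate_costs`, the (H, W, D) layout, and the edge padding are assumptions for illustration and may differ from what this repository does.

```python
import numpy as np

def aggregate_costs(cost_volume, window=5):
    """Average each disparity slice of an (H, W, D) volume over a
    window x window spatial neighbourhood (box filter)."""
    pad = window // 2
    # Replicate border values so the output keeps the input's H and W.
    padded = np.pad(cost_volume, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w, _ = cost_volume.shape
    out = np.zeros(cost_volume.shape, dtype=np.float64)
    # Sum the shifted copies of the volume, one per window offset.
    for dy in range(window):
        for dx in range(window):
            out += padded[dy:dy + h, dx:dx + w, :]
    return out / (window * window)
```

Each pixel's score for a given disparity becomes the mean score of its neighbours at that disparity, which is what smooths isolated wrong predictions. In TensorFlow the same effect could be obtained with an average-pooling op of stride 1.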

Without cost-aggregation

With cost-aggregation

A closer look at the smoothing of predictions, without and with cost aggregation respectively:

Quantitative results

  • To compare with the results reported in the paper, see Table 5, column Ours(37).

    Method                              3-pixel error (%)
    baseline (paper)                    7.13
    baseline (re-implementation)        7.271
    baseline + CA (paper)               6.58
    baseline + CA (re-implementation)   6.527
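
For reference, the 3-pixel error can be sketched as the percentage of labelled pixels whose predicted disparity is more than 3 pixels from the ground truth. The function name `three_pixel_error` is hypothetical, and the sketch assumes that a ground-truth value of 0 marks an unlabelled pixel, as in KITTI disparity maps; the evaluation code in this repository may differ in detail.

```python
import numpy as np

def three_pixel_error(pred, gt, threshold=3.0):
    """Percentage of valid pixels with |pred - gt| > threshold."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0  # assumption: 0 means "no ground truth" for this pixel
    if not valid.any():
        return float("nan")
    wrong = np.abs(pred - gt)[valid] > threshold
    return 100.0 * wrong.mean()
```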

KITTI 2012 Stereo

Qualitative results

Possible next steps

  • Implement post-processing to smooth the output.
  • Look into error metrics and do quantitative analysis.
  • Run inference on test video sequences.
  • Instead of the batch matrix multiplication used during inference, which constructs a B x H x W x W tensor, use a loop to compute the cost volume over the disparity range. The TensorFlow runtime might be able to parallelise the operations inside the loop.
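
The last item can be sketched in NumPy as follows. Both functions take per-pixel feature maps of shape (H, W, C) for a single image pair (the batch dimension is dropped for brevity) and score matches by inner product. The function names and the choice of `-inf` for out-of-image candidates are assumptions for illustration, not this repository's inference code.

```python
import numpy as np

def cost_volume_matmul(left_feat, right_feat):
    """All-pairs scores per row: result[y, i, j] = <left[y, i], right[y, j]>.
    Memory grows as H x W x W."""
    return np.einsum("hic,hjc->hij", left_feat, right_feat)

def cost_volume_loop(left_feat, right_feat, disparity_range):
    """Scores only for disparities 0..disparity_range-1.
    Memory grows as H x W x D. volume[y, x, d] compares left column x
    with right column x - d."""
    h, w, _ = left_feat.shape
    # Candidates that fall outside the right image keep a score of -inf.
    volume = np.full((h, w, disparity_range), -np.inf)
    for d in range(min(disparity_range, w)):
        volume[:, d:, d] = np.sum(
            left_feat[:, d:, :] * right_feat[:, :w - d, :], axis=-1)
    return volume
```

The loop version never materialises scores for column pairs outside the disparity range, which is where the saving comes from when the range is much smaller than the image width. Taking `argmax` over the last axis of the loop volume gives the predicted disparity directly.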