In this project we look at how to generate videos from stereo cards.
These cards are photographed so that, when each image is viewed by a different eye, the pair gives an impression of 3D. We ask whether the two images can be used to both interpolate between and extrapolate beyond the original viewpoints. An example is shown below.
Original stereo card: The Pima Indian of Arizona, and the generated video.
We assume that the stereo card has been pre-processed into left and right images. This can be done either automatically (as shown below in the Data Download section) or manually. The quality of this split has a big impact on the downstream results.
The image pairs themselves are assumed to be arranged as follows:
- BASE_PATH
  - imL
    - im_0
    - ...
    - im_N
  - imR
    - im_0
    - ...
    - im_N
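For reference, here is a small illustrative helper (not part of the repository) that walks this layout and yields matching left/right image pairs. The helper name and the assumption that files in imL and imR share the same names are mine:

```python
import os

def iter_stereo_pairs(base_path):
    """Yield (left_path, right_path) for every image present in both imL and imR."""
    left_dir = os.path.join(base_path, "imL")
    right_dir = os.path.join(base_path, "imR")
    for name in sorted(os.listdir(left_dir)):
        left_path = os.path.join(left_dir, name)
        right_path = os.path.join(right_dir, name)
        if os.path.isfile(right_path):
            yield left_path, right_path
```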
The method proceeds as follows:
- Compute a dense correspondence between the two images of each pair using DTW (optionally pre-warping the images first using SIFT correspondences). This happens in ./dynamic_time_warping.py. A simplified sketch of this step and the next is shown after this list.
- Use the resulting warp to generate a video that interpolates between the two images and extrapolates beyond them. The projection code is based on that of SynSin. This happens in ./generate_video.py.
- Finally, fill in the holes using a generative model based on the Boundless model. This is done separately for each pair. The code is in ./train_boundless.py.
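To make the first two steps concrete, here is a heavily simplified sketch (not the repository code): a per-scanline DTW that produces a dense horizontal correspondence for a rectified pair, followed by a naive forward-splat interpolation. The function names, the rectification assumption, and the simple pixel-shift projection are all simplifications of what ./dynamic_time_warping.py and the SynSin-based ./generate_video.py actually do:

```python
import numpy as np

def dtw_scanline(left_row, right_row):
    """Align one grayscale scanline of the left image to the right image.

    Returns, for each left pixel, the column of its matched right pixel.
    """
    left_row = left_row.astype(np.float32)
    right_row = right_row.astype(np.float32)
    n, m = len(left_row), len(right_row)
    cost = np.abs(left_row[:, None] - right_row[None, :])  # per-pixel matching cost
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best_prev = min(
                acc[i - 1, j] if i > 0 else np.inf,                # skip a left pixel
                acc[i, j - 1] if j > 0 else np.inf,                # skip a right pixel
                acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf,  # match
            )
            acc[i, j] = cost[i, j] + best_prev
    # Backtrack the cheapest path, recording one right column per left column.
    match = np.zeros(n, dtype=int)
    i, j = n - 1, m - 1
    while i > 0 or j > 0:
        match[i] = j
        steps = []
        if i > 0 and j > 0:
            steps.append((acc[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((acc[i - 1, j], i - 1, j))
        if j > 0:
            steps.append((acc[i, j - 1], i, j - 1))
        _, i, j = min(steps)
    match[0] = j
    return match

def dense_correspondence(left_gray, right_gray):
    """Stack per-scanline DTW matches into a dense horizontal flow field."""
    flow = np.zeros(left_gray.shape, dtype=np.float32)
    for y in range(left_gray.shape[0]):
        match = dtw_scanline(left_gray[y], right_gray[y])
        flow[y] = match - np.arange(left_gray.shape[1])
    return flow

def interpolate_view(left_img, flow, alpha):
    """Shift each left pixel by a fraction alpha of its horizontal flow.

    alpha=0 reproduces the left image, alpha=1 approximates the right one,
    and values outside [0, 1] extrapolate beyond the original pair.
    Unfilled pixels are the holes that the inpainting step later fills.
    """
    h, w = flow.shape
    out = np.zeros_like(left_img)
    for y in range(h):
        for x in range(w):
            nx = int(round(x + alpha * flow[y, x]))
            if 0 <= nx < w:
                out[y, nx] = left_img[y, x]  # naive forward splat
    return out
```

Sweeping alpha over a range of values (for example -0.5 to 1.5) and writing each frame out gives a crude version of the interpolated and extrapolated video.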
To run the code on a single sample, look at ./single_sample.sh. This also shows how to run the individual steps with the right command line arguments. Note that you need to update BASE_PATH and im_name in the shell script.
To download and preprocess the data, look at ./data/nyplstereo.py. (You first need to obtain a token from the New York Public Library API and fill in the string marked XXXXXXX.) This script does the following:
- Downloads the high-resolution stereo cards
- Downloads the titles of the stereo cards
- Splits each stereo card into left and right images (a rough sketch of this split is shown below)
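As a rough illustration of the splitting step, the sketch below cuts the scanned card at its horizontal midpoint; the real preprocessing in ./data/nyplstereo.py may handle borders and misalignment more carefully, and the function name here is illustrative:

```python
from PIL import Image

def split_stereo_card(card_path, left_out, right_out):
    """Cut a scanned stereo card at its horizontal midpoint into two views."""
    card = Image.open(card_path)
    w, h = card.size
    card.crop((0, 0, w // 2, h)).save(left_out)    # left half -> left view
    card.crop((w // 2, 0, w, h)).save(right_out)   # right half -> right view
```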
If you want to contribute, here are some interesting future points of work:
- The method is quite slow, especially the inpainting step, since a separate model is trained for each pair. A faster approach would be to train one model across all pairs, although this may not generalise as well.