This repository attempts to implement the idea from the IM2CAD paper. The main goal of the paper is to reconstruct a scene that resembles a given photo of a room.
- LSUN is needed for the pixel-level labeling task that estimates the room geometry.
- The imagenet2012 dataset is used for detecting the objects in the room (the paper uses a pre-trained model).
- ShapeNet 3D models are the objects that appear in the reconstructed scene (an account may be needed to download the data).
The LSUN indoor dataset can be downloaded from the link above, or you can fork the official GitHub repository lsun and follow the instructions there.
The FCN is adapted from the FCN.tensorflow repository. Note: the format of the LSUN indoor dataset differs slightly from that of the ADEChallengeData2016 dataset used in the original repository.
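For readers new to pixel-level labeling: a segmentation network such as an FCN produces one score per class for every pixel, and the label map is obtained by picking the highest-scoring class at each pixel. The snippet below is only a minimal pure-Python sketch of that final step. The function name `label_map`, the three-class setup, and the score values are invented for illustration and are not taken from this repository or from FCN.tensorflow.

```python
def label_map(scores):
    """Return the index of the highest-scoring class for every pixel.

    scores[y][x] is a list holding one score per class (hypothetical layout).
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# Hypothetical 2x2 image with 3 classes; the numbers are made up.
scores = [
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]],
]

labels = label_map(scores)
print(labels)  # [[1, 0], [2, 1]]
```

In practice this step is usually done with a single array operation (an argmax over the class axis) rather than nested lists.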
To train the network, run the following command:
python FCN.py --mode=train
To visualize some of the results instead, replace "train" with "visualize".
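As a rough illustration of how a single `--mode` flag can switch a script between training and visualization, here is a small self-contained sketch. It does not reproduce FCN.py: it uses the standard-library `argparse` module as a stand-in, and the functions `train`, `visualize`, and `main` are hypothetical placeholders that only return a message.

```python
import argparse


def train():
    # Placeholder: a real script would build the network and run the training loop.
    return "running training"


def visualize():
    # Placeholder: a real script would load a checkpoint and save some predictions.
    return "running visualization"


# Map each accepted --mode value to the function that handles it.
MODES = {"train": train, "visualize": visualize}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Mode dispatch sketch")
    parser.add_argument("--mode", choices=sorted(MODES), default="train")
    args = parser.parse_args(argv)
    return MODES[args.mode]()


print(main(["--mode=train"]))
print(main(["--mode=visualize"]))
```

Restricting the flag with `choices` makes a mistyped mode fail immediately with a usage message instead of silently falling back to a default.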
According to the paper, Faster R-CNN is used to detect the objects that occur in the indoor scene.