The code is based on the paper LipNet: End-to-End Sentence-level Lipreading (Assael et al., 2016). LipNet uses 3D convolutions followed by recurrent units to make sentence-level predictions by extracting features from the lip movement in the input frames. This implementation provides a 3DConv-Bi-LSTM model in place of the paper's 3DConv-GRU, along with a few other models of varying complexity. CTC loss is used to handle the variable-length alignments (spoken sentences). The model weights are initialized with the same He initialization proposed in the paper.
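As a reference for the architecture described above, here is a minimal sketch of the 3DConv + Bi-LSTM + CTC pipeline, assuming a PyTorch implementation. The layer sizes, the adaptive pooling step, and the dummy shapes in the usage lines are illustrative assumptions, not the exact configuration used in this repository.

import torch
import torch.nn as nn

class LipNetLSTM(nn.Module):
    """Illustrative 3D-conv frontend + bidirectional LSTM + per-frame character logits."""
    def __init__(self, vocab_size=27, hidden_size=256):
        super().__init__()
        # Spatiotemporal feature extractor: stacked 3D convolutions over (T, H, W).
        # Adaptive pooling fixes the spatial size so the sketch runs for any crop size.
        self.frontend = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=(3, 5, 5), padding=(1, 2, 2)), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(64, 96, kernel_size=(3, 3, 3), padding=(1, 1, 1)), nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 4, 2)),
        )
        # Two bidirectional LSTM layers over the time dimension.
        self.rnn = nn.LSTM(input_size=96 * 4 * 2, hidden_size=hidden_size,
                           num_layers=2, bidirectional=True, batch_first=True)
        # Per-timestep character logits (+1 for the CTC blank symbol).
        self.classifier = nn.Linear(2 * hidden_size, vocab_size + 1)
        # He (Kaiming) initialization for the convolutional layers.
        for m in self.modules():
            if isinstance(m, nn.Conv3d):
                nn.init.kaiming_normal_(m.weight, nonlinearity="relu")

    def forward(self, x):                      # x: (B, 3, T, H, W)
        feats = self.frontend(x)               # (B, 96, T, 4, 2)
        b, c, t, h, w = feats.shape
        feats = feats.permute(0, 2, 1, 3, 4).reshape(b, t, c * h * w)
        out, _ = self.rnn(feats)               # (B, T, 2 * hidden_size)
        logits = self.classifier(out)          # (B, T, vocab_size + 1)
        return logits.permute(1, 0, 2).log_softmax(dim=-1)   # (T, B, classes) for CTC

# Usage with dummy data: 2 clips of 75 frames of 64x128 mouth crops (shapes are made up).
model = LipNetLSTM(vocab_size=27, hidden_size=256)
log_probs = model(torch.randn(2, 3, 75, 64, 128))           # (75, 2, 28)
targets = torch.randint(1, 28, (2, 30))                     # dummy character labels
loss = nn.CTCLoss(blank=0)(log_probs, targets,
                           torch.full((2,), 75), torch.full((2,), 30))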
A virtual environment (venv) is suggested:
python -m venv lipenv
source lipenv/bin/activate
Install gdown to download the dataset from Google Drive:
pip install gdown
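gdown can then fetch the dataset archive by its Google Drive file ID; the ID below is a placeholder, substitute the actual ID or share link for this dataset:

gdown "https://drive.google.com/uc?id=<FILE_ID>"

With the dataset in place, training is launched with a command along the lines of: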
python main.py --epoch 300 \
--lr 0.001 \
--hidden_size 256 \
--model lipnet-lstm \
--batch 16 \
--workers 4
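The flags above imply that main.py exposes a command-line interface; the following is a minimal argparse sketch of that assumed interface (flag names mirror the command above, but the actual parser in this repository may differ):

import argparse

def get_args():
    # Hypothetical parser mirroring the flags shown in the training command above.
    parser = argparse.ArgumentParser(description="Train a sentence-level lipreading model")
    parser.add_argument("--epoch", type=int, default=300, help="number of training epochs")
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--hidden_size", type=int, default=256, help="hidden size of the recurrent layers")
    parser.add_argument("--model", type=str, default="lipnet-lstm", help="which model variant to train")
    parser.add_argument("--batch", type=int, default=16, help="batch size")
    parser.add_argument("--workers", type=int, default=4, help="number of DataLoader workers")
    return parser.parse_args()

if __name__ == "__main__":
    print(get_args())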
@article{assael2016lipnet,
title={LipNet: End-to-End Sentence-level Lipreading},
author={Assael, Yannis M. and Shillingford, Brendan and Whiteson, Shimon and de Freitas, Nando},
journal={arXiv preprint arXiv:1611.01599},
year={2016}
}