tf-char-cnn-lstm

A TensorFlow version of Yoon Kim's char-cnn-lstm Torch7 code.

This is a faster, corrected version of mkroutikov's work that fixes some mistakes and improves speed. The structure of the model is unchanged. Training the large model now takes under 4 hours, much faster than the original 20+ hours, and the results are slightly better.

A simplified, tidier version called models.py is also provided for beginners, to make the whole process easier to understand. It is equivalent to model.py but written in a more straightforward style, inspired by sherjilozair.

Requirements

TensorFlow 0.10. CUDA and cuDNN must be installed to run on a GPU.

Usage

Train:

python train_gpu.py
or
python train_sim.py

train_sim.py uses models.py and produces the same result.

Evaluate:

python evaluate_gpu.py

For CPU:

Train:

python train_gpu.py --gpuid -1

Evaluate:

python evaluate_gpu.py --gpuid -1
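As a rough illustration of how a flag like `--gpuid -1` can select the CPU, the sketch below parses the flag and maps it to a TensorFlow device string. It is not taken from train_gpu.py: the helper names `device_for` and `parse_args` and the default value of 0 are assumptions made for this example, and the actual scripts may parse the flag differently.

```python
import argparse


def device_for(gpuid):
    """Map a --gpuid value to a TensorFlow device string (-1 means CPU)."""
    return "/cpu:0" if gpuid < 0 else "/gpu:%d" % gpuid


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # The default of 0 is an assumption for this sketch, not read from the repo.
    parser.add_argument("--gpuid", type=int, default=0)
    return parser.parse_args(argv)


# Simulate running a script with "--gpuid -1".
args = parse_args(["--gpuid", "-1"])
print(device_for(args.gpuid))
```

The resulting string could then be passed to `tf.device(...)` when the graph is built, so that the same script runs on either device.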

The large model from Yoon Kim's paper is trained on PTB and then evaluated. The log is printed to the screen in the same format as the log in the repo.

Time

Training the large model takes about 3.5 to 3.8 hours on a GPU (K20) at 0.4 sec/batch, while Yoon Kim's Lua code takes about 5 hours on a GPU.
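The two timing figures can be cross-checked with a small calculation: at 0.4 sec/batch, a run of 3.5 to 3.8 hours corresponds to roughly 31,500 to 34,200 batches. The sketch below only restates that arithmetic; the function name `implied_batches` is made up for this example, and it assumes the time per batch stays constant over the whole run.

```python
def implied_batches(hours, sec_per_batch):
    """Number of batches that fit in `hours` at a constant `sec_per_batch`."""
    return int(round(hours * 3600.0 / sec_per_batch))


low = implied_batches(3.5, 0.4)
high = implied_batches(3.8, 0.4)
print("about %d to %d batches" % (low, high))
```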

Previous Results

Learning rate	Train/Valid/Test loss	Train/Valid/Test perplexity
1.0	4.057 / 4.503 / 4.463	57.77 / 90.25 / 86.79
0.5	3.984 / 4.432 / 4.391	53.71 / 84.06 / 80.73
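The loss and perplexity columns are related by perplexity = exp(loss), assuming the loss is the mean per-word cross-entropy measured in nats (the usual convention, though this README does not state it). A minimal sketch that recomputes the test perplexities from the test losses in the table:

```python
import math


def perplexity(loss):
    """Convert a mean per-word cross-entropy (in nats) to perplexity."""
    return math.exp(loss)


# (learning rate, test loss, reported test perplexity), copied from the table.
rows = [(1.0, 4.463, 86.79), (0.5, 4.391, 80.73)]
for lr, loss, reported in rows:
    print("lr=%.1f  exp(%.3f) = %.2f  (reported %.2f)"
          % (lr, loss, perplexity(loss), reported))
```

The recomputed values differ from the reported ones only in the second decimal place, which is consistent with the losses having been rounded to three decimals.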

Models trained with the new code should now give results identical to Yoon Kim's, as listed in mkroutikov's work.

hejunqing/tf-char-cnn-lstm
