GitHub - gordonstevens/AI-Deep-Learning-RNN-LSTM-IMDB-Dataset

Notebook and dataset backgrounder

I use non-client/customer specific applications of my projects due to privacy, but I would still like to showcase what I have built, so I use public datasets. This project is cross-published on Github and Kaggle.

The Internet Movie Database also known as IMDB is the best place on the Internet to find information about Hollywood productions, TV shows and movies.

A dataset with 25000 movie reviews from IMDB was compiled by the Keras team. It is great for demonstrating how Long Short-Term Memory (LSTM), a type of Recurrent Neural Network (RNN) can be used.

Theory backgrounder

This uses a neural network, a network of processing nodes (perceptrons/neurons) which activate or not dependent upon the mathematical activation equation. These mathematical neurons mimic how we think as humans with our biological neurons. The defacto type of RNN, LSTM is used in this project. As noted by a very good WikiPedia article, early RNNs had a vanishing gradient problem, so there is nothing to train against.

Unlike feedforward neural networks, which process data in a single pass, RNNs process data across multiple time steps, making them well-adapted for modelling and processing text, speech, and time series.

Tech backgrounder

During 2022 and 2023, when I originally learned how to work with Tensorflow and Keras, I learned about various ways to use Keras, and this project is to highlight my use of an RNN, LSTM type. Now in 2024, Keras can be run on top of JAX, Tensorflow, or PyTorch.

Here I will use Tensorflow and Keras, to supplt the dataset and use the LSTM.

Misc

While the dataset as provided has 25000 reviews in total, I will use 20000 samples, limit each review to a maximum of 150 words, training the model with 15000 samples, and validating the model with 5000 samples. After the model is run, I will be able to determine the number of trainiable parameters are in the model with this build.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
AI-Deep-Learning-RNN-LSTM-IMDB-Dataset.ipynb		AI-Deep-Learning-RNN-LSTM-IMDB-Dataset.ipynb
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Notebook and dataset backgrounder

Theory backgrounder

Tech backgrounder

Misc

About

Releases

Packages

Languages

License

gordonstevens/AI-Deep-Learning-RNN-LSTM-IMDB-Dataset

Folders and files

Latest commit

History

Repository files navigation

Notebook and dataset backgrounder

Theory backgrounder

Tech backgrounder

Misc

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages