Neural architecture search is a popular technique for automatically discovering effective deep neural network architectures across a wide variety of domains. It has been implemented with many different search algorithms, including Q-learning and evolutionary strategies. We implemented neural architecture search using a simple (mu + lambda) evolutionary strategy to optimize accuracy on the popular Reuters and MNIST datasets. The evolved architectures perform significantly better than the baseline architecture found here and comparably to hand-optimized architectures of similar size. In future work, we hope to evolve the layers of the networks in addition to tuning the hyperparameters. A full explanation of the approach and results can be found here, and slides can be found here.
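As a rough illustration of the (mu + lambda) strategy mentioned above, here is a minimal sketch in plain Python. It is not the code from this repository: the search space, the `toy_fitness` function, and the default values of `mu`, `lam`, and `generations` are all hypothetical stand-ins, and a real run would replace `toy_fitness` with training a network and returning its validation accuracy.

```python
import random

# Hypothetical search space; the project's actual hyperparameters may differ.
SEARCH_SPACE = {
    "units": [16, 32, 64, 128, 256, 512],
    "dropout": [0.0, 0.1, 0.25, 0.5],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def random_genome(rng):
    """Sample one value per hyperparameter."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def mutate(genome, rng):
    """Return a copy of the genome with one randomly chosen hyperparameter resampled."""
    child = dict(genome)
    name = rng.choice(sorted(SEARCH_SPACE))
    child[name] = rng.choice(SEARCH_SPACE[name])
    return child


def toy_fitness(genome):
    """Stand-in for validation accuracy (assumption: a real run trains a network here)."""
    return (
        -abs(genome["units"] - 128) / 512
        - abs(genome["dropout"] - 0.25)
        - abs(genome["learning_rate"] - 1e-3)
    )


def mu_plus_lambda(fitness, mu=4, lam=8, generations=10, seed=0):
    """Evolve genomes, keeping the best mu of parents plus offspring each generation."""
    rng = random.Random(seed)
    parents = [random_genome(rng) for _ in range(mu)]
    history = []
    for _ in range(generations):
        offspring = [mutate(rng.choice(parents), rng) for _ in range(lam)]
        # "Plus" selection: parents compete with their offspring,
        # so the best fitness never decreases between generations.
        pool = parents + offspring
        pool.sort(key=fitness, reverse=True)
        parents = pool[:mu]
        history.append(fitness(parents[0]))
    return parents[0], history


best, history = mu_plus_lambda(toy_fitness)
print(best, history[-1])
```

Because the parents stay in the selection pool, a good architecture is never lost to an unlucky mutation. The trade-off is that the population can converge early.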
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
What you need to install the software, and how to install it:
- Navigate to the directory where you want the copy of the project
- Install a virtual environment for the project (optional, but highly recommended)
pip3 install virtualenv
virtualenv venv
source venv/bin/activate (to activate the virtual environment)
deactivate (to exit the virtual environment; do not deactivate until you have completed the install steps below)
- Install Python (we have been developing with 3.6.7)
- Install Keras
pip3 install keras
- Install TensorFlow
pip3 install tensorflow
- Install MatPlotLib
pip3 install matplotlib
- In a terminal, navigate to the directory where you wish to clone the project, or to the directory that holds your virtual environment
- Activate the virtual environment if you set one up (instructions above)
- Clone the repository using the URL given on this page
git clone URL
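Before running a search, it can help to confirm that the packages installed above are visible to the active interpreter. The following is a small sketch using only the standard library. The `check_environment` helper is hypothetical and not part of this project, and the minimum Python version of 3.6 is an assumption based on the 3.6.7 version mentioned above.

```python
import importlib.util
import sys

# Packages named in the install steps above.
REQUIRED = ["keras", "tensorflow", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment(names=REQUIRED, min_python=(3, 6)):
    """Collect human-readable problems; an empty list means the setup looks fine."""
    problems = []
    # Assumption: anything older than min_python is treated as unsupported.
    if sys.version_info < min_python:
        problems.append("Python %d.%d or newer expected" % min_python)
    problems.extend("missing package: " + name for name in missing_packages(names))
    return problems


for problem in check_environment():
    print(problem)
```

If this prints nothing, the packages were found. If it lists a missing package, check that the virtual environment is activated and rerun the matching `pip3 install` command.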
The Reuters dataset is a collection of text newswires (feeds of news and magazine articles) from Reuters in 1987. The dataset has over 46 topics, or categories, under which the articles can be classified. More information about the Reuters dataset can be found here.
To run the search on the Reuters dataset, use the following command: python3 search_reuters.py
We drew inspiration for our network model from TensorFlow and Keras tutorials.
The MNIST dataset is a collection of 70,000 grayscale images of handwritten digits. It is a popular dataset because it is the first dataset on which a deep neural network, LeNet, reached human-level accuracy. More information regarding MNIST can be found here.
To run the search on the MNIST dataset, use the following command: python3 search_mnist.py
This project is licensed under the MIT License. See our LICENSE.md for details.
Thanks to @markdtw for providing the base code we used to implement the neural networks, found here. Thanks also to @mgorkove for the lovely tutorial on how to write data to a CSV with Python, found here. We would also like to thank our professor, Dr. Schrum, for his assistance with the project details. Finally, we'd like to thank @kwg for letting us use (and occasionally break) his Linux setup.