MSc thesis project repository, submitted to graduate in Artificial Intelligence at The University of Edinburgh.

goncalomcorreia/vqa_human_attention

Making Machines More Human: A Multitask Learning Approach to VQA and Human Attention Prediction

The goal of my MSc dissertation project was to develop a deep learning model that improves the VQA performance of a state-of-the-art architecture while mimicking human attention, using the VQA-HAT dataset. The project was successful, and some results are shown below.
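To make the multitask idea concrete, here is a minimal NumPy sketch of a joint objective that adds an answer-classification loss to an attention-matching loss. Everything in it is an assumption made for illustration: the function names, the `weight` parameter, and the choice of cross-entropy plus KL divergence are hypothetical and are not taken from this repository. The losses actually used are described in the dissertation.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(x - np.max(x))
    return e / e.sum()


def multitask_loss(answer_logits, answer_index,
                   attention_logits, human_attention,
                   weight=0.5, eps=1e-12):
    """Hypothetical joint loss: answer cross-entropy + weight * attention KL.

    answer_logits    -- unnormalised scores over candidate answers
    answer_index     -- index of the ground-truth answer
    attention_logits -- unnormalised model attention over image regions
    human_attention  -- non-negative human attention map over the same regions
    weight           -- assumed trade-off between the two tasks
    """
    # Task 1: cross-entropy on the answer distribution.
    answer_probs = softmax(answer_logits)
    answer_loss = -np.log(answer_probs[answer_index] + eps)

    # Task 2: KL(human || model) between the two attention distributions.
    predicted = softmax(attention_logits)
    target = np.asarray(human_attention, dtype=np.float64)
    target = target / target.sum()
    attention_loss = np.sum(
        target * (np.log(target + eps) - np.log(predicted + eps)))

    total = answer_loss + weight * attention_loss
    return total, answer_loss, attention_loss
```

With `weight=0` the sketch reduces to a plain answer-classification loss, so the second term can be read as the extra supervision that pulls the model's attention toward the human maps.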

The dissertation PDF is included in this repository as msc-dissertation.pdf.

The code is adapted from Stacked Attention Networks for Image Question Answering.

Dependencies

The code is written in Python and uses the Theano package.

  • Python 2.7
  • Theano
  • NumPy
  • h5py
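Before training, it can help to confirm that the packages listed above are importable in the active environment. The helper below is a small sketch and not part of this repository; `check_dependencies` is a hypothetical name, and the code uses only the standard library so it runs the same way under Python 2.7.

```python
import importlib


def check_dependencies(names):
    """Return a dict mapping each module name to True if it imports cleanly."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # A missing package raises ImportError; a broken install can
            # raise other errors at import time, so treat any failure as
            # "not usable".
            status[name] = False
    return status
```

Calling `check_dependencies(["theano", "numpy", "h5py"])` returns one boolean per package, which makes it easy to see which dependency still needs to be installed.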

Usage

To train a model, run:

cd src/scripts; python mtl_san_deepfix.py

A separate README.md inside src describes the files in that directory.

Results

Some results are shown below. "Human Attention and Answer" is the ground truth. "SAN" is our main baseline, the Stacked Attention Network. Our main model, "MTL SAN+DeepFix", improves on the VQA accuracy of the baseline SAN while mimicking human attention. The remaining models are additional baselines. Thorough explanations can be found in msc-dissertation.pdf.

[Figure: results]

[Figure: results]
