Part of my work for my Bachelor's Thesis Project on improving spatio-temporal understanding of videos.