Beyond the spotlight: analysing key drivers of actors' long-term career success

Applied Data Analysis - CS-401

Authors

Élise Boyer (@elboyer228)
Pol Fuentes (@SpaceMercury)
Mathieu Sanchez (@matsanch)
Mael Studer (@maelstuder)
Aiden Tschammer-Osten (@Hoodie031)

Abstract

In this project, we aim to analyse the factors contributing to the long-term career success of actors in the film industry and to explore what sets successful actors apart. Using a bottom-up approach, we will first establish a "success index" for movies based on weighted factors such as ratings, revenue, awards, and popularity. Once we have identified the most successful movies, we will trace the actors involved and evaluate their career paths, looking for trends that may contribute to their success. We will then explore specific actor attributes (such as genre specialization, age at career start, and frequency of successful roles) to determine correlations and potential predictors of sustained success. Ultimately, our goal is to offer a data-driven understanding of what makes certain actors thrive in the competitive film industry.

Research Questions

How can we define and calculate a "success index" for movies, and what factors should it include?
Does an actor’s age at career start, choice of genres, or frequency of appearances in high-grossing movies correlate with their career success?
Can we use these findings to predict the likelihood of success for actors based on early-career indicators?

Additional datasets

To enrich our analysis, we will use the following additional datasets:

The Oscar Award Dataset
Source: Kaggle - The Oscar Award
This dataset provides information on Oscar nominations and wins. It includes details such as categories, winners, and nominees across multiple years. This will help us assess the impact of awards on career success.
TMDb Movie Data
Source: Kaggle - TMDb Data 09/20
This dataset includes information on movies, such as popularity, revenue, budget, genre, release dates, and audience ratings. We will use this data to supplement our success index, particularly for metrics like revenue, budget, popularity, and ratings.

Methods

Movie Success Index: We will construct a weighted index for movie success using factors such as IMDb rating, review count, number of nominations, revenue, budget and genre. Each factor will be scaled from 0 to 10, with the weights summing to 1.
Actor Success Analysis: Starting from the movies identified as successful, we will trace the actors involved and compute each actor's "success index" as the average success score of their films.
Predictive Analysis: Using regression models, we will analyse the correlation of various actor attributes with their career success.
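
The first two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the weights, the choice of min-max scaling, the function names and the toy data are all hypothetical, and only three of the listed factors are used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical weights (assumption: the README only states that they sum to 1).
WEIGHTS = {"rating": 0.4, "revenue": 0.3, "nominations": 0.3}


def scale_0_10(values):
    """Min-max scale a list of numbers to the range [0, 10]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant factor: no spread to scale, map everything to 0
        return [0.0 for _ in values]
    return [10 * (v - lo) / (hi - lo) for v in values]


def movie_success_index(movies, weights=WEIGHTS):
    """Return {title: index}, the weighted sum of the scaled factors."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    scaled = {f: scale_0_10([m[f] for m in movies]) for f in weights}
    return {
        m["title"]: sum(weights[f] * scaled[f][i] for f in weights)
        for i, m in enumerate(movies)
    }


def actor_success_index(cast, movie_index):
    """Average the success index of the films each actor appears in."""
    scores = defaultdict(list)
    for actor, title in cast:
        if title in movie_index:  # skip films without a computed index
            scores[actor].append(movie_index[title])
    return {actor: mean(vals) for actor, vals in scores.items()}


# Made-up toy data, for illustration only.
movies = [
    {"title": "A", "rating": 8.0, "revenue": 300.0, "nominations": 4},
    {"title": "B", "rating": 6.0, "revenue": 100.0, "nominations": 0},
    {"title": "C", "rating": 7.0, "revenue": 200.0, "nominations": 2},
]
cast = [("Actor X", "A"), ("Actor X", "B"), ("Actor Y", "C")]

movie_index = movie_success_index(movies)
actor_index = actor_success_index(cast, movie_index)
```

Because every factor is scaled to the range 0 to 10 and the (non-negative) weights sum to 1, the movie index also lies between 0 and 10, which keeps movie and actor scores directly comparable.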

Timeline

Organization within the team

Questions for TAs

How do we handle missing data effectively when calculating the success index?
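
To make the question concrete, one option we could consider (an assumption on our side, not a settled choice) is to compute the index only over the factors that are available for a given movie and to renormalise the weights accordingly. The function name, weights and scores below are hypothetical.

```python
def index_with_missing(scores, weights):
    """Weighted average over the factors that are present (not None).

    The weights of the available factors are renormalised so that they
    sum to 1 again; returns None when every factor is missing.
    """
    available = {f: w for f, w in weights.items() if scores.get(f) is not None}
    total = sum(available.values())
    if total == 0:
        return None
    return sum(w * scores[f] for f, w in available.items()) / total


# Hypothetical weights and already-scaled (0 to 10) factor scores.
weights = {"rating": 0.5, "revenue": 0.3, "nominations": 0.2}
complete = {"rating": 8.0, "revenue": 6.0, "nominations": 4.0}
partial = {"rating": 8.0, "revenue": None, "nominations": 4.0}

full_score = index_with_missing(complete, weights)
partial_score = index_with_missing(partial, weights)
```

A drawback of this option is that movies scored on different subsets of factors are not strictly comparable; dropping incomplete rows or imputing values are alternatives with their own trade-offs.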

Project Structure

The directory structure of the project looks like this:

├── data                        <- Project data files
│   ├── character.metadata.tsv          <- Metadata for characters
│   ├── movie_data_tmbd.csv             <- Movie data from TMDB
│   ├── movie.metadata.tsv              <- Metadata for movies
│   ├── the_oscar_award.csv             <- Data on Oscar awards
│
├── src                         <- Source code
│   ├── data                            <- Data directory
│   ├── models                          <- Model directory
│   ├── utils                           <- Utility directory
│   ├── scripts                         <- Shell scripts
│
├── tests                       <- Tests of any kind
│
├── results.ipynb               <- a well-structured notebook showing the results
│
├── .gitignore                  <- List of files ignored by git
├── pip_requirements.txt        <- File for installing python dependencies
└── README.md

Acknowledgements

We would like to thank our professor and teaching assistants for their guidance and support throughout this first part of our project. 😊

epfl-ada/ada-2024-project-sigma-squad
