SpongeGuyParkFeld

(We didn't use Family Guy, but the name stuck.)

Abstract

We will clean and analyze online transcripts of several popular American sitcoms (Family Guy, South Park, Seinfeld, Spongebob) and present our findings. One of our primary goals is to gain insight into how humor changes from more mature to less mature sitcoms. We also want to compare the complexity of vocabulary across the shows. The desired outcome of the project is a text generator that can create new content in the style of each individual show.
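As a rough illustration of what a style-based text generator could look like, here is a minimal word-level Markov chain sketch. This is an assumption for illustration only: the README does not say which generation technique the project uses, and the function names `build_chain` and `generate` are hypothetical.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Walk the chain from a random starting state, up to `length` words."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))  # sorted so a fixed seed is repeatable
    output = list(state)
    for _ in range(length - len(state)):
        followers = chain.get(state)
        if not followers:  # dead end: this state was never followed by a word
            break
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


if __name__ == "__main__":
    # Placeholder text, not a real transcript.
    sample = "the cat sat on the mat and the cat ate the fish on the mat"
    print(generate(build_chain(sample), length=12, seed=0))
```

Building one chain per show from that show's cleaned transcript would give each generator its own vocabulary and phrasing. A larger `order` tends to copy longer stretches of the source text verbatim.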

This project was originally presented at the Data Science at UCSB 2018 Project Showcase. Check out Data Science at UCSB here

Contributors

  • Jay Singh
  • Lauren Shin
  • Evan Azevedo
  • Liam Abrams
  • Mikaela Guerrero
  • Jerry Liu
  • Stevyn Fessler
  • Michelle Su

Special thanks to Jason Freeberg and Timothy Nguyen for guiding us through the early and later stages of the project, respectively.

Packages Required

Beautiful Soup
pandas
scikit-learn

See requirements.txt

Sources of Data

Spongebob
South Park
Seinfeld

Conclusion

We hope to learn something about each show's target audience from its transcripts, based on the vocabulary used and the general tone of the dialogue.
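One simple way to put a number on "vocabulary used" is the type-token ratio (unique words divided by total words) together with mean word length. The sketch below is an assumption about how such a comparison could be done, not the project's actual metric, and `tokenize` and `vocabulary_stats` are hypothetical helper names.

```python
import re


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_stats(text):
    """Return basic vocabulary measures for one transcript."""
    tokens = tokenize(text)
    if not tokens:
        return {"tokens": 0, "unique": 0,
                "type_token_ratio": 0.0, "mean_word_length": 0.0}
    unique = set(tokens)
    return {
        "tokens": len(tokens),
        "unique": len(unique),
        # Share of words that are distinct; higher suggests a more varied vocabulary.
        "type_token_ratio": len(unique) / len(tokens),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
    }


if __name__ == "__main__":
    # Placeholder sentence, not a real transcript.
    print(vocabulary_stats("The cat saw the other cat."))
```

The type-token ratio drops as a text gets longer, so a fair comparison between shows would use transcript samples of equal word count.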