SpongeGuyParkFeld

(We didn't use Family Guy but the name stuck)

We clean and analyze online transcripts of several popular American sitcoms (South Park, Seinfeld, and SpongeBob; Family Guy was part of the original plan but was not used) and present our findings. One of our primary goals is to gain insight into how humor changes from more mature to less mature sitcoms. We also compare the complexity of vocabulary across the shows. The desired outcome of this project is a text generator that can create new content in the style of each individual show.
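To make the text-generator goal concrete, here is a minimal sketch of one common approach, a word-level Markov chain. It is an illustration only and not necessarily the method used in this project. The function names `build_chain` and `generate`, the `order` parameter, and the sample sentence are all made up for this example.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=20, seed=None):
    """Walk the chain at random and return at most `length` words."""
    rng = random.Random(seed)
    # Sort the states so that a fixed seed always picks the same start.
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state was never followed by anything
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)


# Placeholder text standing in for a cleaned transcript (not real show dialogue).
sample = (
    "the crab walks to the shop and the crab walks to the sea "
    "and the fish swims to the shop and the fish swims to the sea"
)
chain = build_chain(sample, order=2)
print(generate(chain, length=15, seed=0))
```

Training one chain per show on that show's cleaned transcripts would give one generator per style. A larger `order` copies longer phrases from the source, while a smaller one produces looser, less coherent text.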

This project was originally presented at the Data Science at UCSB 2018 Project Showcase. Check out Data Science at UCSB here

Contributors

Jay Singh
Lauren Shin
Evan Azevedo
Liam Abrams
Mikaela Guerrero
Jerry Liu
Stevyn Fessler
Michelle Su

Special thanks to Jason Freeberg and Timothy Nguyen for guiding us through the early and later stages of the project, respectively.

Packages Required

Beautiful Soup
pandas
scikit-learn

See requirements.txt

Sources of Data

Spongebob
South Park
Seinfeld

Conclusion

We hope to learn something about each show's target audience from its transcripts, based on the vocabulary used and the general tone of the dialogue.
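One simple way to put a number on "vocabulary used" is the type-token ratio: the count of distinct words divided by the total word count. The sketch below is an assumption about how such a comparison could be done, not the project's actual analysis. The helper names `tokenize` and `type_token_ratio` and the two sample strings are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Return distinct words divided by total words, or 0.0 for empty text."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Placeholder strings standing in for cleaned transcripts (not real show dialogue).
samples = {
    "show_a": "the cat and the dog and the cat",
    "show_b": "a curious observation about ordinary restaurant etiquette",
}
for name, text in samples.items():
    print(name, round(type_token_ratio(text), 2))
```

The ratio drops as a text gets longer, so a fair comparison between shows would use samples with the same number of words.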
