# Data Science for Social Good Curriculum

Our number one priority is to train fellows to do data science for social good work. To this end, we've developed a Data Science for Social Good curriculum, which covers many things you'd find in a data science course or bootcamp, but adds an emphasis on social science, ethics, privacy, and social issues.
Our guiding teaching philosophy is as follows (see meringue-making template for an example):
- You get out what you put in. Fellows are encouraged to take an active role in teaching and shaping the curriculum, as well as learning from it. Learning also takes initiative and participation on the student side.
- Clearly motivate topics and tools. For more technical topics: What real task in a data scientist's work requires this tool? What other options are there for similar tasks? What are the pitfalls, and what will it look like when something goes wrong? What are common mistakes or bugs? For conceptual topics: Why do we feel the need to communicate this topic? What are some concrete examples of where it's been done well or poorly in the past?
- Lessons should be user friendly. Lectures should be concise - 45 minutes at the outside - and materials should be practical. Slides should be accompanied by a worksheet or exercises so fellows can follow along and learn by doing, and a cheat sheet with relevant commands or code snippets should be included where possible.
This guide is for DSSG summer fellows, for those who want to learn more about the program, for universities or companies hoping to start a similar program, and for anyone who wants to do data science for social good.
We expect that every incoming fellow has experience programming in Python, basic knowledge of statistics and social science, and an interest in doing social good. However, we understand that everyone comes from a different background, so to ensure that everyone is able to contribute as a productive member of the team and the fellowship, we start the first few weeks off with an intensive orientation, getting everyone "up to speed" with the basic skills and tools they'll need.
- Week One
- Week Two
- Week Three
Training continues throughout the summer in the form of "lunch and learns" - less formal lessons over lunch - and teachouts by staff or fellows who have relevant specializations. Sometimes we ask for volunteers to do a teachout on a topic we think is important, like data visualization or inference with observational data, and a few fellows will work together to put together a lesson. Sometimes a DSSGer will suggest a topic that they have a pet interest in, or that they think will be relevant to one or more of the summer projects. We have lunch and learns scheduled twice a week through the summer, and some fellows choose to offer optional teachouts at the end of the workday.
Although we don't expect all twelve teams to be working in unison, there is a general structure to the summer that guides how we pace the remaining curriculum. We try to schedule topics early enough that fellows have time to incorporate them into their projects, but not so early that they've forgotten what they learned by the time the knowledge would be useful. As we get nearer to the end of the summer, there are fewer required topics, so there are more open time slots for fellows to do teachouts.
- The Rest of the Summer
  - Educational Data and Testing (Kevin Wilson)
  - Social Good Business Models (Allison Weil and Paul van der Boor)
  - Basic Web Scraping (Matt Bauman)
  - Pipelines and Evaluation
  - Feature Generation Workshop
  - Test, Test, Test (Benedict Kuester)
  - Beyond the Deep Learning Hype (Reza Borhani)
  - Causal Inference with Observational Data (Dean Magee, Monica Alexander, Zhe Zhang, and Jackie Gutman)
  - Model Evaluation
  - Spatial Analysis Tools
  - Operations Research (Jan Vlachy)
  - Theory and Theorizing in the Social Sciences (Tom Davidson)
  - Web Classification (Yaeli Cohen)
  - Presentation Skills (Allison Weil)
  - Data Visualization (Jon Keane, Monica Alexander, Diego Olano, Ned Yoxall)
  - Natural Language Processing (Garren Gaut)
  - Julia (Matt Bauman)
  - Open and Closed Data (Jen Helsby)