This is a path for those of you who want to complete the Data Science undergraduate curriculum on your own time, for free, with courses from the best universities in the world.
In our curriculum, we give preference to MOOC (Massive Open Online Course) style courses because these courses were created with our style of learning in mind.
Here are two interesting links that can make all the difference in your journey.
The first one is a motivational video about someone who completed the "MIT Challenge", which consists of learning the entire 4-year MIT curriculum for Computer Science in 1 year.
The second link is a MOOC that teaches learning techniques used by experts in art, music, literature, math, science, sports, and many other disciplines. These skills are fundamental to succeeding on this journey.
Are you ready to get started?
The Data Science curriculum assumes the student has taken high school math and statistics.
OSSU Data Science uses the report Curriculum Guidelines for Undergraduate Programs in Data Science as its guide for course recommendations.
Courses | Duration | Effort |
---|---|---|
Introduction to Data Science | 8 weeks | 10-12 hours/week |
Data Science - CS109 from Harvard | 12 weeks | 5-6 hours/week |
The Analytics Edge | 12 weeks | 10-15 hours/week |
Students who already know basic programming in any language can skip this first course.
Introduction to Computational Thinking and Data Science
Introduction to Computer Science
This course will introduce you to the world of computer science. Students who have been introduced to programming, either from the courses above or through study elsewhere, should take this course for a flavor of the material to come. If you finish the course wanting more, Computer Science is likely for you!
Topics covered:
computation
imperative programming
basic data structures and algorithms
and more
Courses | Duration | Effort | Prerequisites | Discussion |
---|---|---|---|---|
Introduction to Computer Science and Programming using Python (alternative) | 9 weeks | 15 hours/week | high school algebra | chat |
The Algorithms courses are taught in Java. Students who need to learn Java should take this course first.
Database Management Essentials
Data Warehouse Concepts, Design, and Data Integration
Relational Database Support for Data Warehouses
Business Intelligence Concepts, Tools, and Applications
Design and Build a Data Warehouse for Business Intelligence Implementation
MongoDB for Developers Learning Path
Courses | Duration | Effort |
---|---|---|
Stanford's Database course | - weeks | 8-12 hours/week |
Topics covered:
Agile methodology
REST
software specifications
refactoring
relational databases
transaction processing
data modeling
neural networks
supervised learning
unsupervised learning
OpenGL
ray tracing
and more
Courses | Duration | Effort | Prerequisites | Discussion |
---|---|---|---|---|
Databases: Modeling and Theory | 2 weeks | 10 hours/week | core programming | chat |
Databases: Relational Databases and SQL | 2 weeks | 10 hours/week | core programming | chat |
Databases: Semistructured Data | 2 weeks | 10 hours/week | core programming | chat |
Machine Learning | 11 weeks | 9 hours/week | Basic coding | chat |
Code | Course | Duration | Effort |
---|---|---|---|
COMP 2312 | Databases | 10 Weeks | 8-12 Hours/Week |
Calculus 1C: Coordinate Systems & Infinite Series
Courses | Duration | Effort | Prerequisites |
---|---|---|---|
Multivariable Calculus | 12 weeks | 6 hours/week | Calculus 1C |
Topics covered:
Vector and matrix calculations
Linear transformations
Vector spaces
Eigenvalues and Eigenvectors
Courses | Duration | Effort |
---|---|---|
Linear Algebra - Foundations to Frontiers | 15 weeks | 8 hours/week |
Applications of Linear Algebra Part 1 | 5 weeks | 4 hours/week |
Applications of Linear Algebra Part 2 | 4 weeks | 5 hours/week |
Code | Course | Duration | Effort |
---|---|---|---|
MATH 1311 | College Algebra and Problem Solving | 4 Weeks | 6 Hours/Week |
Courses | Duration | Effort |
---|---|---|
Introduction to Computer Science and Programming Using Python | 9 weeks | 15 hours/week |
Introduction to Computational Thinking and Data Science | 10 weeks | 15 hours/week |
Introduction to Python for Data Science | 6 weeks | 2-4 hours/week |
Programming with Python for Data Science | 6 weeks | 3-4 hours/week |
Courses | Duration | Effort |
---|---|---|
Statistical Reasoning | - weeks | - hours/week |
Intro to Descriptive Statistics | - weeks | - hours/week |
Intro to Inferential Statistics | - weeks | - hours/week |
Introduction to Statistics: Probability | 5 weeks | - hours/week |
Introduction to Statistics: Inference | 5 weeks | - hours/week |
Statistical Learning with Python by Stanford University on edX, or Statistical Learning with R by Stanford University on edX
Probability is the mathematics of uncertainty. Statistics is the mathematical framework for quantifying uncertainty in real-world data. These two related but distinct fields of study help us describe variation and uncertainty in the world around us. These courses make heavy use of discrete mathematics, linear algebra, and calculus, and serve as a first opportunity to apply what you've learned in the other core courses.
Topics covered:
Random variables
Expectation and Variance
Probability Distributions
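As a small illustration of the topics above, the following sketch treats a fair six-sided die as a discrete random variable and computes its expectation and variance exactly. The die example and the helper names `expectation` and `variance` are illustrative assumptions, not taken from any of the listed courses.

```python
from fractions import Fraction


def expectation(pmf):
    """E[X]: sum of x * P(X = x) over all outcomes x."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(X): the expected squared distance from the mean."""
    mean = expectation(pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf.items())


# Probability distribution of a fair die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expectation(die)
var = variance(die)
print(mean, var)  # prints "7/2 35/12"
```

Exact fractions are used instead of floats so the results can be checked by hand against the definitions.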
Courses | Duration | Effort | Prerequisites |
---|---|---|---|
Probability | 14 weeks | 12-16 hours/week | Multivariable Calculus, Math for Computer Science, Linear Algebra |
Statistics for Applications | 14 weeks | 12-16 hours/week | Probability |
Analysis is the mathematics of sequences and limits. Intro to Analysis is a course that builds on the concepts of Calculus and provides a rigorous and formalized study of the foundations of Calculus. This course will use formal proofs to establish mathematical results, starting by proving the existence of real numbers and building the foundation of single-variable Calculus from scratch.
Topics covered:
Proofs
Real analysis
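To give a flavor of the formal style such a course uses, here is the standard definition of the limit of a sequence, written as a sketch in conventional notation (the symbols $a_n$, $L$, $\varepsilon$, and $N$ are the usual textbook choices, not taken from the course materials):

$$
\lim_{n \to \infty} a_n = L
\iff
\forall \varepsilon > 0 \;\; \exists N \in \mathbb{N} \;\; \forall n \ge N : \; |a_n - L| < \varepsilon
$$

For example, $a_n = 1/n$ converges to $0$: given any $\varepsilon > 0$, choosing $N > 1/\varepsilon$ gives $|1/n - 0| = 1/n \le 1/N < \varepsilon$ for all $n \ge N$.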
Courses | Duration | Effort | Prerequisites |
---|---|---|---|
Introduction to Analysis | 14 weeks | 8-10 hours/week | Multivariable Calculus |
Supplemental Lecture Videos | 16 weeks | 8-10 hours/week | Multivariable Calculus |
Code | Course | Duration | Effort |
---|---|---|---|
MATH 1315 | Introduction to Probability and Data (with R) | 5 Weeks | 6 Hours/Week |
MATH 2314 | Inferential Statistics (with R) | 5 Weeks | 6 Hours/Week |
MATH 3311 | Linear Regression and Modeling (with R) | 4 Weeks | 6 Hours/Week |
MATH 3312 | Bayesian Statistics (with R) | 5 Weeks | 6 Hours/Week |
Courses | Duration | Effort |
---|---|---|
Learning From Data (Introductory Machine Learning) [Caltech] | 10 weeks | 10-20 hours/week |
Statistical Learning | - weeks | 3 hours/week |
Stanford's Machine Learning Course | - weeks | 8-12 hours/week |
Code | Course | Duration | Effort |
---|---|---|---|
COMP 2312 | Databases | 10 Weeks | 8-12 Hours/Week |
COMP 4311 | Data Science | 13 Weeks | 10 Hours/Week |
COMP 5312 | Deep Learning | 8 Weeks | 6 Hours/Week |
Extension | Genomic Data Science Specialization | 32 Weeks | 6 Hours/Week |
OSS University is project-focused. The assignments and exams for each course are to prepare you to use your knowledge to solve real-world problems.
After you've gotten through all of Core CS and the parts of Advanced CS relevant to you, you should think about a problem that you can solve using the knowledge you've acquired. Not only does real project work look great on a resume, but the project will also validate and consolidate your knowledge. You can create something entirely new, or you can find an existing project that needs help via websites like CodeTriage or First Timers Only.
Students who would like more guidance in creating a project may choose to use a series of project-oriented courses. Here is a sample of options (many more are available; at this point you should be capable of identifying a series that is interesting and relevant to you):
Complete Kaggle's Getting Started and Playground Competitions
Courses | Duration | Effort |
---|---|---|
Convex Optimization | 9 weeks | 10 hours/week |
Courses | Duration | Effort |
---|---|---|
Data Wrangling with MongoDB | 8 weeks | 10 hours/week |
Courses | Duration | Effort |
---|---|---|
Intro to Hadoop and MapReduce | 4 weeks | 6 hours/week |
Deploying a Hadoop Cluster | 3 weeks | 6 hours/week |
Courses | Duration | Effort |
---|---|---|
Stanford's Database course | - weeks | 8-12 hours/week |
Courses | Duration | Effort |
---|---|---|
Deep Learning for Natural Language Processing | - weeks | - hours/week |
Courses | Duration | Effort |
---|---|---|
Deep Learning | 12 weeks | 8-12 hours/week |
- Participate in a Kaggle competition
- List other project ideas
After finishing the courses above, start your specializations in the topics that interest you most. You can view a list of available specializations here.
Courses | Duration | Effort | Prerequisites |
---|---|---|---|
Data Mining (Specialization) | 30 weeks | 2-5 hours/week | machine learning |
Big Data (Specialization) | 30 weeks | 3-5 hours/week | none |
Internet of Things (Specialization) | 30 weeks | 1-5 hours/week | strong programming |
Cloud Computing (Specialization) | 30 weeks | 2-6 hours/week | C++ programming |
Data Science (Specialization) | 43 weeks | 1-6 hours/week | none |
Functional Programming in Scala (Specialization) | 29 weeks | 4-5 hours/week | One year programming experience |
Game Design and Development with Unity 2020 (Specialization) | 6 months | 5 hours/week | programming, interactive design |
It is possible to finish within about 2 years if you plan carefully and devote roughly 20 hours/week to your studies. Learners can use this spreadsheet to estimate their end date. Make a copy and input your start date and expected hours per week in the Timeline sheet. As you work through courses you can enter your actual course completion dates in the Curriculum Data sheet and get updated completion estimates.
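The arithmetic behind such an estimate can be sketched in a few lines of Python. This is not the spreadsheet's actual formula; it is a minimal sketch that assumes courses are taken one after another, that total workload is duration times effort, and that the start date and the function name `estimate_end_date` are hypothetical. The three course rows are copied from the tables in this curriculum.

```python
import math
from datetime import date, timedelta

# (course, duration in weeks, effort in hours/week), as listed in the curriculum tables.
courses = [
    ("Python for Everybody", 10, 10),
    ("College Algebra and Problem Solving", 4, 6),
    ("Pre-calculus", 4, 6),
]


def estimate_end_date(courses, start, hours_per_week):
    """Estimate a finish date from total course hours and weekly study time."""
    total_hours = sum(weeks * effort for _, weeks, effort in courses)
    # Round up: a partially used week still pushes the end date out by a week.
    weeks_needed = math.ceil(total_hours / hours_per_week)
    return start + timedelta(weeks=weeks_needed)


# Hypothetical start date; 148 total hours at 20 hours/week takes 8 weeks.
end = estimate_end_date(courses, date(2024, 1, 1), 20)
print(end)  # prints "2024-02-26"
```

Changing `hours_per_week` shows how sensitive the end date is to the time you can commit each week.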
Some courses can be taken in parallel, while others must be taken sequentially. All of the courses within a topic should be taken in the order listed in the curriculum. The graph below demonstrates how topics should be ordered.
Python and R are heavily used in the data science community, and our courses teach you both. Remember, the important thing for each course is to internalize the core concepts and to be able to use them with whatever tool (programming language) you wish.
Code | Course | Duration | Effort |
---|---|---|---|
Py4E | Python for Everybody | 10 weeks | 10 hours/week |
6.00.1x | Introduction to Computer Science and Programming using Python (alt) | 9 weeks | 15 hours/week |
MATH 1311 | College Algebra and Problem Solving | 4 Weeks | 6 Hours/Week |
MATH 1312 | Pre-calculus | 4 Weeks | 6 Hours/Week |
18.01.1x | Calculus 1A: Differentiation | 13 weeks | 6-10 hours/week |
18.01.2x | Calculus 1B: Integration | 13 weeks | 5-10 hours/week |
MATH 1315 | Introduction to Probability and Data (with R) | 5 Weeks | 6 Hours/Week |
Code | Course | Duration | Effort |
---|---|---|---|
18.01.3x | Calculus 1C: Coordinate Systems & Infinite Series | 6 weeks | 5-10 hours/week |
6.042J | Mathematics for Computer Science (Solutions) | 13 weeks | 5 hours/week |
COMP 2312 | Databases | 10 Weeks | 8-12 Hours/Week |
18.06 | Linear Algebra and Essence of Linear Algebra | 14 weeks | 12 hours/week |
COMP 2313 | Introduction to Linux | 8 Weeks | 5-7 Hours/Week |
MATH 2314 | Inferential Statistics (with R) | 5 Weeks | 6 Hours/Week |
Code | Course | Duration | Effort |
---|---|---|---|
COMP 3311a | Algorithmic Thinking 1 | 4 Weeks | 6 Hours/Week |
COMP 3311b | Algorithmic Thinking 2 | 4 Weeks | 6 Hours/Week |
MATH 3311 | Linear Regression and Modeling (with R) | 4 Weeks | 6 Hours/Week |
MATH 3312 | Bayesian Statistics (with R) | 5 Weeks | 6 Hours/Week |
MATH 3313 | Differential Equations | 7 Weeks | 8-10 Hours/Week |
Code | Course | Duration | Effort |
---|---|---|---|
COMP 4311 | Data Science | 13 Weeks | 10 Hours/Week |
Code | Course | Duration | Effort |
---|---|---|---|
COMP 5311 | Introduction to Machine Learning | 10 Weeks | 6 Hours/Week |
COMP 5312 | Deep Learning | 8 Weeks | 6 Hours/Week |
Extension | Genomic Data Science Specialization | 32 Weeks | 6 Hours/Week |
We also have labels to give you more control over the process. The meaning of each label is:
- Main Curriculum: cards with this label represent courses that are listed in our curriculum.
- Extra Courses: cards with this label represent courses that were added by the student.
- Doing: cards with this label represent courses the student is currently taking.
- Done: cards with this label represent courses finished by the student. These cards should also have a link to at least one project or article built with the knowledge acquired in the course.
- Section: cards with this label represent the sections in our curriculum. Cards with the Section label exist only to help organize the Done column. Put each course's card below its respective Section card.
- Extra Sections: cards with this label represent sections that were added by the student.
The intention of this board is to give our students a way to track their progress and to show that progress through a public page to friends, family, employers, etc. You can set your board to be public or private.
Yes! The intention is to complete all the courses listed here! We also strongly encourage you to go further by reading papers and joining research projects after your coursework is done.
List of skills:
- C/C++
- Unix System
- Python/Perl
- R
- Algorithms
The skills listed above are the essential tool set that bioinformaticians and computational biologists depend on.
The important thing for each course is to internalize the core concepts and to be able to use them with whatever tool (programming language) you wish.
The curriculum is separated into two parts:
Upon finishing all the core mathematics courses, students can choose to take elective courses in advanced topics of their choice. It is not necessary to take every course within a subcategory, but it is recommended to take courses relevant to the intended field of study.
To complete your study of Advanced Topics, meet both the Breadth and Depth requirements.
- Breadth Requirement: For each of the 6 Advanced Topics below, select one course to take as an elective.
- Depth Requirement: Select one Advanced Topic below and take 3 additional courses from that topic.
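The two requirements above can be expressed as a simple check. The following is a minimal sketch under two assumptions: the topic names are hypothetical placeholders for the 6 Advanced Topics, and "3 additional courses" is read as 3 on top of the one breadth course, so 4 courses in at least one topic.

```python
# Hypothetical placeholder names for the 6 Advanced Topics.
TOPICS = ["Topic A", "Topic B", "Topic C", "Topic D", "Topic E", "Topic F"]


def meets_requirements(completed, topics=TOPICS):
    """Check breadth and depth; `completed` maps topic name to electives finished."""
    # Breadth: at least one elective in every topic.
    breadth = all(completed.get(topic, 0) >= 1 for topic in topics)
    # Depth (assumed reading): one breadth course plus 3 additional in some topic.
    depth = any(completed.get(topic, 0) >= 4 for topic in topics)
    return breadth and depth


# One course in every topic, plus 3 additional courses in "Topic C".
plan = {topic: 1 for topic in TOPICS}
plan["Topic C"] = 4
print(meets_requirements(plan))  # prints "True"
```

Under this reading, the minimum plan is 9 electives: 6 for breadth plus 3 more for depth.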
Courses | Duration | Effort | Prerequisites |
---|---|---|---|
Introduction to Formal Logic | 15 weeks | 9 hours/week | - |
Combinatorics, probability, statistics, game theory, applied stats
Real analysis, numerical analysis, complex analysis, optimization theory
Abstract algebra, category theory, algebraic geometry and topology