Detailed Project Guides #4

Open: wants to merge 9 commits into base: main
Binary file added DATA SCIENCE INTERVIEW QUESTIONS.pdf
Binary file not shown.
Binary file added Data Science Cheat Sheet.pdf
Binary file not shown.
Binary file added Data Science Road Map.pdf
Binary file not shown.
Binary file added Data-Science-Roadmap.png
38 changes: 38 additions & 0 deletions Detailed Project Guides.md
@@ -0,0 +1,38 @@

# Detailed Project Guides

_Aside from dedication to discovery and exploration, succeeding in a Data Science project requires understanding the process and optimizing it so that the results are reliable and the project is easy to follow, maintain, and modify where necessary._

_One of the fastest ways to achieve this is to structure your project using a template._

## Steps in a Data Science Workflow
_There are no fixed frameworks or defined templates for approaching Data Science projects. Each new dataset and each new problem leads to a different roadmap. However, similar high-level steps apply to many different Data Science problems, irrespective of the dataset._

_So let’s look at a clean workflow that can be used as a basis for data science projects._

_Note, however, that the steps outlined below are not strictly linear. Most Data Science projects are iterative, with several steps repeated and revisited._

- Acquisition
- Inspection
- Preparation
- Modeling
- Evaluation
- Deployment

### Step 1: Acquisition
_Training a machine learning algorithm is a little like teaching a toddler an object’s name for the first time, then letting them identify it on their own the next time they see it. But whereas human beings need only a few examples to recognize a new object, a machine typically needs hundreds or thousands of similar examples to become familiar with it. These training examples come in the form of data, so the first step is to acquire that data._

### Step 2: Inspection
_After you have acquired the data, the next step is to get a first impression of its quality by inspecting it. The primary goal at this stage is to sanity-check the data, and the best way to do this is to look for values that are either impossible or highly unlikely. Check for outliers and missing values, check that the data types are correct, and check the most extreme cases: do they make sense? A good practice is to compute some simple summary statistics and visualize the data to get a quick overview of its statistical properties and to detect possible outliers._
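
_The checks described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the toy records, the helper `inspect_column`, and the plausibility bounds of 0 to 120 for age are assumptions made for the example, and a real project would more likely use a data-frame library for the same checks:_

```python
# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000.0},
    {"age": 29, "income": 48000.0},
    {"age": None, "income": 51000.0},
    {"age": 41, "income": 50500.0},
    {"age": 230, "income": 49500.0},  # impossible value that inspection should catch
]


def inspect_column(rows, column, low, high):
    """Summarize one column: missing values, data types, extremes, implausible values."""
    values = [row[column] for row in rows]
    present = [v for v in values if v is not None]
    return {
        "missing": len(values) - len(present),
        "types": sorted({type(v).__name__ for v in present}),
        "min": min(present),
        "max": max(present),
        # Anything outside the assumed plausible range [low, high] is flagged.
        "implausible": [v for v in present if not low <= v <= high],
    }


age_report = inspect_column(rows, "age", low=0, high=120)
print(age_report)
```

_Here the report for `age` shows one missing value and flags 230 as implausible, which is the kind of finding that sends you back to the acquisition step or forward to preparation._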

### Step 3: Preparation
_Once you are confident that your data is in order, you need to prepare it by putting it in a format that is amenable to modeling. This stage encompasses several processes, such as filtering, aggregating, imputing, and transforming. The actions you need to take depend heavily on the type of data you’re working with, as well as on the libraries and algorithms you will be using._
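
_As a rough sketch of three of these processes (filtering, imputing, and transforming), the example below uses only the standard library. The records, the 0 to 120 age bounds, median imputation, and z-score scaling are all assumptions chosen for illustration; the right choices depend on your data and model:_

```python
import statistics

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000.0},
    {"age": 29, "income": 48000.0},
    {"age": None, "income": 51000.0},
    {"age": 41, "income": 50500.0},
    {"age": 230, "income": 49500.0},  # implausible age
]

# Filter: drop rows whose age is present but outside the assumed plausible range.
filtered = [r for r in rows if r["age"] is None or 0 <= r["age"] <= 120]

# Impute: replace missing ages with the median of the observed ages.
observed_ages = [r["age"] for r in filtered if r["age"] is not None]
median_age = statistics.median(observed_ages)
imputed = [
    {**r, "age": median_age if r["age"] is None else r["age"]} for r in filtered
]

# Transform: standardize income to zero mean and unit variance (z-score).
incomes = [r["income"] for r in imputed]
mean_income = statistics.mean(incomes)
std_income = statistics.pstdev(incomes)
prepared = [
    {**r, "income_z": (r["income"] - mean_income) / std_income} for r in imputed
]
```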

### Step 4: Modeling
_Once data preparation is complete, the next phase is modeling. Selecting an appropriate algorithm depends on the type of data, and in particular on the type of the target variable. For example, if the target is continuous you will apply regression modeling; if it is categorical you will apply a classification model such as logistic regression. As a data scientist, you will typically try several models to find the one that fits best._

### Step 5: Evaluation
_After building the model, you need to measure its performance. There are several ways to do that, and the choice again depends largely on the type of data you are working with and the type of model used. On the whole, this step seeks to answer one question: how close are the model’s predictions to the actual values?_
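
_A few common metrics illustrate what "how close" can mean. The sketch below computes mean absolute error and root mean squared error for a regression model and accuracy for a classifier; the actual and predicted values are made up for the example, and these three metrics are only a small sample of what is available:_

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference; penalizes large errors more."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels exactly."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical regression results.
actual_values = [3.0, 5.0, 7.0, 9.0]
predicted_values = [2.0, 5.0, 8.0, 11.0]
mae = mean_absolute_error(actual_values, predicted_values)
rmse = root_mean_squared_error(actual_values, predicted_values)

# Hypothetical classification results.
actual_labels = ["pass", "fail", "pass", "pass"]
predicted_labels = ["pass", "fail", "fail", "pass"]
acc = accuracy(actual_labels, predicted_labels)
```

_Metrics like these should be computed on data the model did not see during training; otherwise they overstate how well the model will perform on new data._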

### Step 6: Deployment
_Working with data is one thing; deploying a machine learning model to production is another. Once you are comfortable with the performance of your model, you’ll want to deploy it so it can reach its intended audience. Deployment can take several forms depending on the use case, but a common scenario is for the model to serve as a feature within a larger application._
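
_One simple way to picture the "feature within a larger application" scenario is to save the trained model's parameters to a file and load them wherever predictions are needed. The sketch below assumes a linear model with a slope and an intercept; the file name `model.json` and the helpers `save_model` and `load_predictor` are hypothetical, and real deployments usually add versioning, input validation, and monitoring:_

```python
import json
import tempfile
from pathlib import Path

# Hypothetical parameters of an already-trained linear model.
model = {"slope": 2.0, "intercept": 50.0}


def save_model(params, path):
    """Persist the model parameters as JSON so another process can load them."""
    path.write_text(json.dumps(params))


def load_predictor(path):
    """Load saved parameters and return a function the application can call."""
    params = json.loads(path.read_text())

    def predict(x):
        return params["slope"] * x + params["intercept"]

    return predict


# A temporary directory stands in for wherever the application stores models.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.json"
    save_model(model, model_path)
    predict = load_predictor(model_path)

prediction = predict(6.0)
```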
Binary file not shown.
Binary file added Learn Python The Hard Way 3rd Edition V413HAV.pdf
Binary file not shown.
Binary file added Machine_Learning_Mastery_Jason_Brownlee.pdf
Binary file not shown.
Binary file added OOPs Concept Technical Round CheatBook.pdf
Binary file not shown.
Binary file added Python Cheat Sheet.pdf
Binary file not shown.
Binary file added Python Data Science Handbook ( PDFDrive ).pdf
Binary file not shown.
Binary file added Python Handbook (1).pdf
Binary file not shown.
Binary file added Python Tutorial @jobs_city .pdf
Binary file not shown.
40 changes: 17 additions & 23 deletions README.md
@@ -1,23 +1,3 @@
<br />
<div align="center">
<a href="https://zenml.io">
<img src="assets/anternsvg.svg" alt="Logo" width="400">
</a>

<h3 align="center">One-stop solution for all your Data Science learning needs.</h3>

<p align="center">
All in one place, the best resources to learn Data Science with comprehensive and detailed roadmaps.
<br />
<a href="https://antern.co/"><strong>Go to website</strong></a>
<br />
<div align="center">
Join our <a href="https://discord.gg/t9uKG2m9" target="_blank">
<img width="25" src="assets/4373196_discord_logo_logos_icon.png" alt="discord"/>
<b>Antern Community</b> </a> and ask your questions there.
</div>
</p>
</div>

# Data Science Roadmap 🤖

@@ -37,6 +17,7 @@ I will divide the resources into different levels of learning and will also prov
- **Advanced Data Science**
- **Data Science Projects**
- **Guide to Data Science Interviews**
- **Data Science Professional Certification**

## Let's get ready to learn data science 🚀

@@ -135,6 +116,14 @@ Taking part in competitions is also a great way to learn and build your portfoli
- Kaggle
- Analytics Vidhya
- Zindi
- CrowdANALYTIX
- Innocentive
- Codalab
- DATASOURCE.AI
- Bitgrit
- Numerai
- DataScienceChallenge
- Machine Learning Contests

### Guide to Data Science Interviews

@@ -148,6 +137,14 @@ We will be publishing Interviews guide for every topic, but till then you can go
- [Data Scientist Interviews](https://applyingml.com/)
- [Guides by Applying ML](https://applyingml.com/resources/)

### Data Science Professional Certification

- IBM Data Science Professional Certification
- Microsoft Certified Azure Data Scientist Associate Certification
- Google Professional Data Engineer Certification
- Amazon AWS Big Data Certification
- SAS Certified Data Scientist

### Upcoming Topics 📚

This repository is a work in progress. We will be adding more topics in the future, including the following:
@@ -159,6 +156,3 @@ This repository is a work in progress, we will be adding more topics in the futu
- Detailed Guide to Data Science Cover Letter
- Other ways to get spotted by recruiters

### Contributions 🤝

We are open to contributions. If you want to contribute to this repository, you can check out the [contributing guidelines](#). You can also contribute by sharing this repository with your friends and colleagues.
Binary file added fundamentals_of_data_engineering.pdf
Binary file not shown.
Binary file added road map DS.pdf
Binary file not shown.