Welcome to the Data Science Learning Repository! This collection is a curriculum that runs from basic programming through machine learning, MLOps, and statistical analysis.
The repository is organized as a step-by-step learning path through the following areas:
- 🐍 Python Programming Fundamentals
- 📊 Data Analysis & Visualization
- 🤖 Machine Learning Algorithms
- 📈 Statistical Analysis
- 🔧 Feature Engineering
- 🎯 Model Evaluation & Optimization
- 🔄 MLOps Practices
.
├── Python/ # Python Programming Fundamentals
│ ├── Introduction/
│ ├── Data Structures/
│ ├── OOP/
│ └── Advanced Topics/
│
├── Data Analysis/ # Data Analysis Fundamentals
├── Data Visualization/ # Visualization Libraries
│ ├── Matplotlib/
│ ├── Seaborn/
│ └── Plotly/
│
├── Machine Learning/ # ML Algorithms & Techniques
│ ├── Linear Models/
│ ├── Tree-Based Models/
│ ├── Clustering/
│ └── Dimensionality Reduction/
│
├── Feature Engineering/ # Feature Processing
│ ├── Feature Encoding/
│ ├── Feature Scaling/
│ ├── Feature Selection/
│ └── Missing Values/
│
├── Model Evaluation/ # Model Assessment
├── Model Explainability/ # Model Interpretation
├── Hyperparameter Tuning/ # Model Optimization
│
├── Statistics/ # Statistical Analysis
├── Linear Algebra/ # Mathematical Foundations
├── SQL/ # Database Operations
└── MLOps/ # ML Operations
- Python/: Fundamentals including OOP, functions, data structures, and more
- Numpy/: Array operations, mathematical functions, and data manipulation
- Pandas/: Data analysis, cleaning, and time series manipulation
- Data Visualization/:
  - Matplotlib: Static visualizations
  - Seaborn: Statistical data visualization
  - Plotly: Interactive plots and dashboards
- Machine Learning/:
  - Classical algorithms (Linear Regression, SVM, KNN)
  - Ensemble methods (Random Forest, XGBoost, LightGBM)
  - Clustering techniques (KMeans, DBSCAN, Hierarchical)
  - Dimensionality reduction (PCA, t-SNE)
- Feature Engineering/: Feature scaling, encoding, transformation techniques
- Statistics/: Statistical analysis and probability
- SQL/: Database operations and query optimization
- MLOps/: Machine learning operations and deployment
- Model Evaluation/: Metrics, validation techniques
- Model Explainability/: Model interpretation and analysis
- Clone the repository:
    git clone https://github.com/Shriram-Vibhute/CampusX-DSMP2.0
    cd CampusX-DSMP2.0
- Install dependencies:
    pip install -r requirements.txt
- Explore the notebooks and exercises in each module.
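To get an overview before opening individual modules, a short script can count the notebooks under each top-level folder. This is a minimal sketch, not part of the repository: the function name `count_notebooks` is hypothetical, and it assumes the exercises are stored as Jupyter `.ipynb` files.

```python
from collections import Counter
from pathlib import Path


def count_notebooks(repo_root):
    """Count .ipynb files under each top-level directory of repo_root.

    Hypothetical helper; assumes the modules store their material as
    Jupyter notebooks with the .ipynb extension.
    """
    root = Path(repo_root)
    counts = Counter()
    for notebook in root.rglob("*.ipynb"):
        relative = notebook.relative_to(root)
        # Skip Jupyter autosave copies and notebooks sitting directly in the root.
        if ".ipynb_checkpoints" in relative.parts or len(relative.parts) < 2:
            continue
        # The first path component is the module folder, e.g. "Python" or "SQL".
        counts[relative.parts[0]] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Run from inside the cloned repository.
    for module, total in count_notebooks(".").items():
        print(f"{module}: {total} notebook(s)")
```

Run it from the repository root after cloning; each line of output names a module folder and the number of notebooks found beneath it.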
Contributions are welcome! Please open an issue or submit a pull request for improvements or suggestions.
This project is licensed under the MIT License.
Happy Learning! 🚀