The repository to showcase the best framework for tabular data - the Awesome CatBoost

valeman/Awesome_CatBoost

Awesome_CatBoost

🔥 CatBoost: The Unrivaled King of Tabular Data 🔥

TL;DR: CatBoost isn't just another machine learning framework; it's THE framework dominating the tabular data landscape. And we've got the proof. 🚀

💡 Why CatBoost Rules:

  • According to "A Closer Look at Deep Learning on Tabular Data", the largest-ever study (300 datasets!) on tabular data, CatBoost consistently outperforms both deep learning and traditional tree-based models.
  • Handles categorical features natively, so no more preprocessing headaches.
  • Built for efficiency, accuracy, and real-world practicality.
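
To give a feel for what native categorical handling means, here is a minimal pure-Python sketch of an ordered target statistic, the idea described in the "unbiased boosting with categorical features" paper listed under Papers. It is a simplified illustration, not CatBoost's actual implementation: the function name, the `prior` and `prior_weight` parameters, and the toy data are all assumptions made for this example, and the real library uses several permutations and further refinements.

```python
import random
from collections import defaultdict


def ordered_target_statistics(categories, targets, prior=0.5, prior_weight=1.0, seed=0):
    """Encode each row's category using only the targets of rows that come
    before it in a random permutation (hypothetical helper, not a CatBoost API)."""
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)  # a single random permutation of the rows

    target_sum = defaultdict(float)  # running sum of targets seen so far, per category
    count = defaultdict(int)         # running number of rows seen so far, per category
    encoded = [0.0] * len(categories)

    for idx in order:
        cat = categories[idx]
        # Smoothed mean of the targets of *earlier* rows with the same category.
        # The current row's own target is excluded, which limits target leakage.
        encoded[idx] = (target_sum[cat] + prior_weight * prior) / (count[cat] + prior_weight)
        target_sum[cat] += targets[idx]
        count[cat] += 1
    return encoded


# Toy data, invented for illustration only.
colors = ["red", "blue", "red", "green", "blue", "red"]
labels = [1, 0, 1, 0, 1, 0]
encoded = ordered_target_statistics(colors, labels)
```

The first row of each category in the permutation falls back to the prior, and later rows get progressively better-informed estimates. With the library itself you would normally just pass the categorical column indices or names through the `cat_features` argument and let CatBoost do the encoding.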

🌟 Explore the Ultimate Resource
The Awesome CatBoost Repository is your one-stop shop to master this powerhouse. Tutorials, best practices, cutting-edge papers, and more, all in one place.

🌐 Tabular data is everywhere, and CatBoost proves it's the ultimate solution. Time to level up your machine learning game. 💪

👉 Dive in now: Awesome CatBoost Repo
📜 Read the groundbreaking paper: arXiv Study

#MachineLearning #CatBoost #DataScience #AI #TabularData #GradientBoosting #AwesomeRepo #Domination

Videos

  1. Anna Veronika Dorogush - CatBoost - the new generation of Gradient Boosting (EuroPython Conference, 2018)
  2. XGBoost ❌ LightGBM ❌ CatBoost ❌ Scikit-Learn GRADIENT BOOSTING Performance Compared
  3. Yandex Catboost: Open-source Gradient Boosting Library (2018)
  4. CatBoost Part 1: Ordered Target Encoding by Josh Starmer (2023)

Papers

  1. A Closer Look at Deep Learning on Tabular Data: large-scale study (300 datasets) showing CatBoost dominates on tabular data 🔥🔥🔥🔥🔥🚀🚀🚀🚀🚀 (code)
  2. CatBoost: gradient boosting with categorical features support by Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin (NeurIPS, 2017) 🔥🔥🔥🔥🔥
  3. CatBoost: unbiased boosting with categorical features by Liudmila Prokhorenkova, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, Andrey Gulin (NeurIPS, 2018) 🔥🔥🔥🔥🔥
  4. [KiDS-SQuaD. Machine learning selection of bright extragalactic objects to search for new gravitationally lensed quasars](https://www.aanda.org/articles/aa/full_html/2019/12/aa36006-19/aa36006-19.html) (2019)
  5. When Do Neural Nets Outperform Boosted Trees on Tabular Data? (2023): large-scale study showing CatBoost outperformed XGBoost by 6% in terms of accuracy 🔥🔥🔥🔥🔥
  6. CatBoost for big data: an interdisciplinary review (2020) 🔥🔥🔥🔥🔥
  7. A Comprehensive Benchmark of Machine and Deep Learning Across Diverse Tabular Datasets (2024): large-scale study showing CatBoost dominates on tabular data 🔥🔥🔥🔥🔥

Articles

  1. The Gradient Boosters V: CatBoost by Manu Joseph (2020)
  2. CatBoost Hyperparameter Tuning Guide with Optuna by Kaggle Grandmaster Mario Filho (2023) 🔥🔥🔥🔥🔥
  3. XGBoost? CatBoost? LightGBM? (2023)
  4. Stop Using XGBoost… by Matt Przybyla (2021)
  5. When to Choose CatBoost Over XGBoost or LightGBM - Practical Guide (2023)
  6. CatBoost vs. Light GBM vs. XGBoost
  7. CatBoost: Gradient Tree Boosting for Recommender Systems, Classification and Regression
  8. [Is CatBoost faster than LightGBM and XGBoost?](https://tech.deliveryhero.com/is-catboost-faster-than-lightgbm-and-xgboost/)
  9. 5 Cute Features of CatBoost - Other boosting algorithms don't have these features by Rukshan Pramoditha (2021)
  10. What Is CatBoost? by Artem Oppermann (2023)
  11. CatBoost Secrets: How It Handles Categorical Columns and Tree Growth by Gneya Pandya (2024)
  12. Why you should learn CatBoost now by Felix Revert (2020)
  13. How CatBoost encodes categorical variables? by Adrian Biarnes (2021)

Kaggle

  1. CatBoost uncertainty by 4x Kaggle Grandmaster Darius Barušauskas (Kaggle 'raddar') (2024) 🔥🔥🔥🔥🔥
