AutoRegressX is an automated machine learning (AutoML) desktop application designed to simplify regression model training for datasets in CSV format. The system identifies feature and target variables, preprocesses data (including automatic encoding of categorical features), evaluates multiple regression algorithms, selects the best-performing model using standard metrics (R², MAE, and RMSE), and exports both the trained model and evaluation artifacts. The tool targets IoT developers and students who need quick, interpretable regression solutions without writing code.
Building a regression model typically requires expertise in:
- Data preprocessing
- Feature engineering
- Model selection
- Evaluation and comparison
- Serialization of trained models
For IoT developers and students, these steps are often barriers to applying machine learning in real use cases, especially when deploying models to cloud backends for inference.
AutoRegressX automates the regression workflow by allowing the user to provide a dataset in CSV format and select the target variable. The system then performs automatic preprocessing by detecting numeric and categorical features, encoding categorical variables using One-Hot Encoding, imputing missing values, and scaling features when required, such as for Support Vector Regression (SVR).
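The preprocessing steps described above could be sketched with scikit-learn as follows. This is a minimal illustration, not AutoRegressX's actual implementation: the helper name `build_preprocessor`, the median and most-frequent imputation strategies, and the sample column names are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(features: pd.DataFrame, scale_numeric: bool = True) -> ColumnTransformer:
    """Hypothetical helper: derive preprocessing steps from column dtypes."""
    # Detect numeric and categorical (non-numeric) feature columns.
    numeric_cols = features.select_dtypes(include="number").columns.tolist()
    categorical_cols = features.select_dtypes(exclude="number").columns.tolist()

    # Numeric columns: impute missing values, then optionally scale
    # (scaling matters for distance- and kernel-based models such as SVR).
    numeric_steps = [("impute", SimpleImputer(strategy="median"))]
    if scale_numeric:
        numeric_steps.append(("scale", StandardScaler()))

    # Categorical columns: impute with the most frequent value, then one-hot encode.
    categorical_steps = [
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]

    return ColumnTransformer(
        transformers=[
            ("num", Pipeline(numeric_steps), numeric_cols),
            ("cat", Pipeline(categorical_steps), categorical_cols),
        ],
        sparse_threshold=0.0,  # always return a dense array
    )


# Small in-memory table standing in for a user's CSV (made-up values).
sample = pd.DataFrame(
    {
        "temperature": [21.5, 22.0, np.nan, 19.8],
        "sensor_type": ["A", "B", "A", "B"],
        "humidity": [40.0, 42.5, 39.0, 41.0],  # assumed target column
    }
)
X = sample.drop(columns=["humidity"])
preprocessor = build_preprocessor(X)
X_ready = preprocessor.fit_transform(X)
```

Here `X_ready` holds one scaled numeric column plus one indicator column per `sensor_type` category, with the missing temperature filled in. In practice the table would come from `pd.read_csv` on the user's file.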
AutoRegressX then trains multiple regression models:
- Linear Regression
- Ridge Regression
- Random Forest Regression
- Support Vector Regression
- k-Nearest Neighbors (KNN) Regression
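As a rough sketch, the candidate models listed above map onto scikit-learn estimators like this. The function name `make_candidates`, the hyperparameter values, and the synthetic data are assumptions for illustration. Scaling is also applied to KNN here because it is distance-based; the source only names SVR as requiring it.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_candidates(random_state: int = 0) -> dict:
    """Hypothetical helper: return the candidate regressors keyed by display name."""
    return {
        "Linear Regression": LinearRegression(),
        "Ridge Regression": Ridge(alpha=1.0),
        "Random Forest Regression": RandomForestRegressor(
            n_estimators=100, random_state=random_state
        ),
        # SVR and KNN are sensitive to feature scale, so each gets its own scaler.
        "Support Vector Regression": make_pipeline(StandardScaler(), SVR()),
        "KNN Regression": make_pipeline(
            StandardScaler(), KNeighborsRegressor(n_neighbors=5)
        ),
    }


# Synthetic numeric data standing in for a preprocessed dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)

# Fit every candidate on the same data; fit() returns the fitted estimator.
fitted = {name: model.fit(X, y) for name, model in make_candidates().items()}
```

Keeping the candidates in a dictionary makes it straightforward to loop over them later for evaluation and comparison.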
Each model is then evaluated using:
- R² (Coefficient of Determination)
- MAE (Mean Absolute Error)
- RMSE (Root Mean Squared Error)
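The three metrics above can be computed with `sklearn.metrics`, as in the sketch below. The helper name `regression_report`, the 75/25 train/test split, and the choice to rank models by lowest RMSE are assumptions for illustration; the source does not say which metric decides the winner.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def regression_report(y_true, y_pred) -> dict:
    """Hypothetical helper: compute R², MAE, and RMSE for one set of predictions."""
    return {
        "r2": float(r2_score(y_true, y_pred)),
        "mae": float(mean_absolute_error(y_true, y_pred)),
        # RMSE is the square root of the mean squared error.
        "rmse": float(np.sqrt(mean_squared_error(y_true, y_pred))),
    }


# Synthetic data standing in for a preprocessed dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two candidates keep the example short; the same loop works for all five.
candidates = {
    "Linear Regression": LinearRegression(),
    "KNN Regression": KNeighborsRegressor(n_neighbors=5),
}

# Score each model on the held-out test split.
scores = {
    name: regression_report(y_test, model.fit(X_train, y_train).predict(X_test))
    for name, model in candidates.items()
}

# Assumed selection rule: the model with the lowest test RMSE wins.
best_name = min(scores, key=lambda name: scores[name]["rmse"])
```

Scoring on a held-out split rather than on the training data gives a fairer comparison, since flexible models such as Random Forest can fit the training set almost perfectly.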