Recommendation for adding models present in TabZilla to AutoGluon #101

Open
Innixma opened this issue Apr 6, 2024 · 0 comments

Innixma commented Apr 6, 2024

Hello, first of all, great work getting so many model families implemented in one place!

I am wondering whether the authors have a recommendation for how best to add certain model families, such as ResNet, to AutoGluon (example of a custom model implementation in AutoGluon: https://auto.gluon.ai/stable/tutorials/tabular/advanced/tabular-custom-model.html).

While I could potentially do it from scratch, for example by adapting rtdl (https://github.com/yandex-research/rtdl/tree/main), this would duplicate a lot of work that TabZilla has already done. I notice that your code base has a lot of specialized logic, such as RTDL_ResNet_Model, but calling these classes requires specifying many arguments that are benchmark-specific.

At the end of the day, I'm looking for something akin to an sklearn interface, where any required data preprocessing happens inside the model implementation:

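import pandas as pd

target_column = "label"  # placeholder: name of the column to predict

train_data = pd.read_csv("some_messy_data_with_categoricals_and_missing.csv")
X_test = pd.read_csv("some_messy_data_with_categoricals_and_missing_test.csv")
X = train_data.drop(target_column, axis=1)
y = train_data[target_column]

# all_models_implemented_in_tabzilla and params are hypothetical:
# they stand for the interface I would like to have, not an existing API.
for model_class in all_models_implemented_in_tabzilla:
    model = model_class(**params)
    model.fit(X, y)  # raw DataFrame in, preprocessing handled internally
    y_pred = model.predict(X_test)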
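
To make the shape of that interface concrete, here is a minimal sketch built only from scikit-learn pieces. The class name `SelfPreprocessingClassifier` and its `base_model` parameter are hypothetical, and `LogisticRegression` is just a stand-in for whatever TabZilla model would sit at the end of the pipeline. It is not TabZilla or AutoGluon code; it only shows a model that accepts a raw DataFrame with categoricals and missing values and does its own imputation and encoding inside `fit` and `predict`.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.compose import ColumnTransformer, make_column_selector
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler


class SelfPreprocessingClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: takes a raw DataFrame and preprocesses internally."""

    def __init__(self, base_model=None):
        # base_model is a stand-in slot for any estimator with fit/predict.
        self.base_model = base_model

    def fit(self, X, y):
        # Numeric columns: fill missing values with the median, then standardize.
        numeric = Pipeline([
            ("impute", SimpleImputer(strategy="median")),
            ("scale", StandardScaler()),
        ])
        # Non-numeric columns: fill with the most frequent value, then map to
        # integer codes; categories unseen during fit are encoded as -1.
        categorical = Pipeline([
            ("impute", SimpleImputer(strategy="most_frequent")),
            ("encode", OrdinalEncoder(handle_unknown="use_encoded_value",
                                      unknown_value=-1)),
        ])
        preprocess = ColumnTransformer([
            ("num", numeric, make_column_selector(dtype_include=np.number)),
            ("cat", categorical, make_column_selector(dtype_exclude=np.number)),
        ])
        if self.base_model is None:
            model = LogisticRegression(max_iter=1000)  # placeholder model
        else:
            model = clone(self.base_model)
        self.pipeline_ = Pipeline([("preprocess", preprocess), ("model", model)])
        self.pipeline_.fit(X, y)
        self.classes_ = self.pipeline_.classes_
        return self

    def predict(self, X):
        return self.pipeline_.predict(X)

    def predict_proba(self, X):
        return self.pipeline_.predict_proba(X)


# Tiny made-up dataset with a missing numeric and a missing categorical value.
train = pd.DataFrame({
    "age": [25, np.nan, 47, 33, 51, 29],
    "city": ["a", "b", np.nan, "a", "b", "a"],
    "label": [0, 1, 1, 0, 1, 0],
})
X = train.drop("label", axis=1)
y = train["label"]

model = SelfPreprocessingClassifier().fit(X, y)
preds = model.predict(X)
```

With a wrapper of this shape, the loop above would not need to know anything about how each model family wants its inputs encoded, which is the property I am after for the TabZilla models.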

The final goal would be to support all TabZilla models in TabRepo to improve the strength of the learned portfolios and find model families that synergize when ensembled.

Any guidance would be appreciated!
