Extend predictors to zero-cost case #132

Open
jr2021 opened this issue Sep 13, 2022 · 4 comments
Labels: zero cost merge (Merge of zerocost with Develop into Develop_copy)

Comments

jr2021 (Collaborator) commented Sep 13, 2022

In the zerocost branch, the Ensemble class has been extended to the zero-cost case and contains a single option for its base predictor, XGBoost. The XGBoost predictor has been adapted to the zero-cost case by implementing a set_pre_computations function and by modifying the BaseTree class. Currently, the Ensemble class supports only this single predictor:

trainable_predictors = {
    "xgb": XGBoost(
        ss_type=self.ss_type, zc=self.zc, encoding_type="adjacency_one_hot", zc_only=self.zc_only
    )
}

Should all other predictors be available in the merged Ensemble class and extended to the zero-cost case?
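If they were, the trainable_predictors dict would presumably grow along these lines (a sketch only; the constructor signatures of the other predictors are assumed to mirror the XGBoost case):

trainable_predictors = {
    "xgb": XGBoost(
        ss_type=self.ss_type, zc=self.zc, encoding_type="adjacency_one_hot", zc_only=self.zc_only
    ),
    "lgb": LGBoost(
        ss_type=self.ss_type, zc=self.zc, encoding_type="adjacency_one_hot", zc_only=self.zc_only
    ),
    # ... and so on for the remaining predictors
}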

jr2021 added the zero cost merge label Sep 13, 2022
jr2021 changed the title from "Extend predictors to Zero Cost case" to "Extend predictors to zero-cost case" Sep 17, 2022
Neonkraft (Collaborator) commented

Ideally, yes. But there are about 19 predictors in the original ensemble. Let's focus for now on extending the ZC case to the tree-based predictors: LGBoost, NGBoost, and RandomForestPredictor.

The other predictors must be available, of course, but without the option of using ZC features. This also means that the get_ensemble method must be modified to return the appropriate set of predictors based on self.zc, roughly along the lines sketched below.
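Something like this, perhaps (a sketch; apart from get_ensemble and self.zc, the names and structure here are assumed, not actual repo code):

def get_ensemble(self):
    if self.zc:
        # only the predictors that have been extended to consume zero-cost features
        return {
            "xgb": XGBoost(
                ss_type=self.ss_type, zc=True, encoding_type="adjacency_one_hot", zc_only=self.zc_only
            ),
            # ... lgb, ngb, rf once they are extended
        }
    # otherwise: the full set of predictors, constructed without any ZC arguments
    return {
        "xgb": XGBoost(ss_type=self.ss_type, zc=False, encoding_type="adjacency_one_hot"),
        # ... the remaining ~18 predictors from the original ensemble
    }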

jr2021 (Collaborator, Author) commented Sep 21, 2022

Sounds good.

We found that in the zerocost branch, the XGBoost class contains three functions specific to the zero-cost case (set_pre_computations, _verify_zc_info, and _set_zc_names), which are also applicable to the other tree-based predictors.

To avoid duplicating these functions, we placed them in the BaseTree class, which is the parent class of all tree-based predictors.
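The resulting layout looks roughly like this (method bodies elided; the three function names come from the branch, while the signatures shown here are assumed for illustration):

class BaseTree(Predictor):
    # zero-cost hooks shared by all tree-based predictors
    # (XGBoost, LGBoost, NGBoost, RandomForestPredictor)

    def set_pre_computations(self, unlabeled=None, xtrain_zc_info=None,
                             xtest_zc_info=None, unlabeled_zc_info=None):
        # store pre-computed zero-cost proxy scores for later use in fit/query
        ...

    def _verify_zc_info(self, xtrain_zc_info):
        # check that the pre-computed scores contain the expected ZC proxies
        ...

    def _set_zc_names(self, zc_names):
        # record which zero-cost proxies serve as input features
        ...

class XGBoost(BaseTree):
    ...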

jr2021 (Collaborator, Author) commented Sep 21, 2022

One small remaining issue is a discrepancy between the zerocost and Develop implementations of fit in XGBoost.

In the Develop branch, it is possible for the user to load custom hyperparameters from a config file:

def fit(self, xtrain, ytrain, train_info=None, params=None, **kwargs):
    if self.hparams_from_file and self.hparams_from_file not in ['False', 'None'] \
            and os.path.exists(self.hparams_from_file):
        self.hyperparams = json.load(open(self.hparams_from_file, 'rb'))['xgb']
        print('loaded hyperparams from', self.hparams_from_file)
    elif self.hyperparams is None:
        self.hyperparams = self.default_hyperparams.copy()
    return super(XGBoost, self).fit(xtrain, ytrain, train_info, params, **kwargs)

while in the zerocost branch this is not an option:

def fit(self, xtrain, ytrain, train_info=None, params=None, **kwargs):
    if self.hyperparams is None:
        self.hyperparams = self.default_hyperparams.copy()
    return super(XGBoost, self).fit(xtrain, ytrain, train_info, params, **kwargs)

Which functionality should be adopted in the Develop_copy branch? Is this a case where the code in the zerocost branch should be taken as the more updated version?

Neonkraft (Collaborator) commented

Best to be able to read from a config file, too.
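For reference, the Develop-branch fit above would then expect a JSON file keyed by predictor name, along these lines (hyperparameter values purely illustrative):

{
    "xgb": {
        "max_depth": 6,
        "learning_rate": 0.1,
        "n_estimators": 500
    }
}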
