Boosting method implementation (LightGBM) #1264

RomanKharkovskoy · 2024-03-05T17:46:37Z

Plan

Implement convert_to_dataset()
Update default_operation_params.json
Update search_space.py

How it works

The fit/predict interface is implemented in the parent class FedotLightGBMImplementation.

Code

class FedotLightGBMImplementation(ModelImplementation):
    __operation_params = ['n_jobs', 'use_eval_set']

    def __init__(self, params: Optional[OperationParameters] = None):
        super().__init__(params)

        self.model_params = {k: v for k, v in self.params.to_dict().items() if k not in self.__operation_params}
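        # Placeholder: fit() and predict() call self.model, so a concrete estimator must be assigned before use
        self.model = None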

    def fit(self, input_data: InputData):
        input_data = input_data.get_not_encoded_data()

        if self.params.get('use_eval_set'):
            train_input, eval_input = train_test_data_setup(input_data)

            train_input = self.convert_to_dataframe(train_input)
            eval_input = self.convert_to_dataframe(eval_input)

            train_x, train_y = train_input.drop(columns=['target']), train_input['target']
            eval_x, eval_y = eval_input.drop(columns=['target']), eval_input['target']

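            # Pick the eval metric by task: no classes -> regression, fewer than 3 classes -> binary,
            # otherwise multiclass. Note that self.classes_ is not defined in this snippet.
            if self.classes_ is None: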
                eval_metric = 'rmse'
            elif len(self.classes_) < 3:
                eval_metric = 'auc'
            else:
                eval_metric = 'multi_logloss'

            self.model.fit(X=train_x, y=train_y,
                           eval_set=[(eval_x, eval_y)], eval_metric=eval_metric)

        else:

            train_data = self.convert_to_dataframe(input_data)
            train_x, train_y = train_data.drop(columns=['target']), train_data['target']
            self.model.fit(X=train_x, y=train_y)

        return self.model

    def predict(self, input_data: InputData):
        input_data = self.convert_to_dataframe(input_data.get_not_encoded_data())
        # Drop the target column so that only feature columns reach the model
        features = input_data.drop(columns=['target'])
        prediction = self.model.predict(features)

        return prediction
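
Two small pieces of logic in the class above can be tried out in isolation: separating wrapper-level parameters from the parameters passed to the model, and choosing the evaluation metric from the known classes. The following is a minimal sketch, not the PR's code: the function names `split_model_params` and `choose_eval_metric` are hypothetical, and the sample parameter values are illustrative only.

```python
from typing import Any, Dict, Optional, Sequence

# Keys consumed by the wrapper itself (taken from __operation_params in the class above)
OPERATION_PARAMS = ('n_jobs', 'use_eval_set')


def split_model_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the parameters that are not wrapper-level keys (hypothetical helper)."""
    return {k: v for k, v in params.items() if k not in OPERATION_PARAMS}


def choose_eval_metric(classes: Optional[Sequence[Any]]) -> str:
    """Mirror the metric branching in fit() (hypothetical helper)."""
    if classes is None:
        return 'rmse'           # no classes: treated as regression
    if len(classes) < 3:
        return 'auc'            # fewer than three classes: treated as binary
    return 'multi_logloss'      # three or more classes: multiclass


# Illustrative values, not the defaults from default_operation_params.json
sample_params = {'n_estimators': 100, 'learning_rate': 0.03, 'use_eval_set': True, 'n_jobs': 1}
model_params = split_model_params(sample_params)
print(model_params)
print(choose_eval_metric(None), choose_eval_metric([0, 1]), choose_eval_metric([0, 1, 2]))
```

Keeping the branching in a pure function like this makes it easy to cover with unit tests without training a model.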

The fit/predict interface does not support LightGBM's internal data type, lightgbm.Dataset, so a workaround was needed. Here, pandas.DataFrame is used instead.

Inside the interface, InputData is converted to a pandas.DataFrame (columns at categorical_idx become category, and columns at numerical_idx become float).

Code

@staticmethod
def convert_to_dataframe(data: Optional[InputData]):
    # Build a frame from the feature matrix and append the target as an extra column
    dataframe = pd.DataFrame(data=data.features, columns=data.features_names)
    dataframe['target'] = data.target

    # Categorical columns are cast to the pandas 'category' dtype
    if data.categorical_idx is not None:
        for col in dataframe.columns[data.categorical_idx]:
            dataframe[col] = dataframe[col].astype('category')

    # Numerical columns are cast to float
    if data.numerical_idx is not None:
        for col in dataframe.columns[data.numerical_idx]:
            dataframe[col] = dataframe[col].astype('float')

    return dataframe
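
To see what the conversion produces, the same steps can be run on a small toy input. This is a sketch under stated assumptions: `ToyInputData` is a hypothetical stand-in that carries only the fields the function reads (it is not FEDOT's InputData), and the sample values are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

import numpy as np
import pandas as pd


@dataclass
class ToyInputData:
    """Hypothetical stand-in for InputData with only the fields used below."""
    features: np.ndarray
    target: np.ndarray
    features_names: List[str]
    categorical_idx: Optional[List[int]] = None
    numerical_idx: Optional[List[int]] = None


def convert_to_dataframe(data: ToyInputData) -> pd.DataFrame:
    # Same steps as the snippet above, applied to the toy container
    dataframe = pd.DataFrame(data=data.features, columns=data.features_names)
    dataframe['target'] = data.target
    if data.categorical_idx is not None:
        for col in dataframe.columns[data.categorical_idx]:
            dataframe[col] = dataframe[col].astype('category')
    if data.numerical_idx is not None:
        for col in dataframe.columns[data.numerical_idx]:
            dataframe[col] = dataframe[col].astype('float')
    return dataframe


# Mixed-type feature matrix: one categorical column and one numerical column
toy = ToyInputData(
    features=np.array([['red', 1.5], ['blue', 2.0], ['red', 3.25]], dtype=object),
    target=np.array([0, 1, 0]),
    features_names=['color', 'size'],
    categorical_idx=[0],
    numerical_idx=[1],
)
frame = convert_to_dataframe(toy)
dtypes = {name: str(dtype) for name, dtype in frame.dtypes.items()}
print(dtypes)
```

Because the indices are positional and 'target' is appended last, categorical_idx and numerical_idx still line up with the feature columns after the target column is added.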

docu-mentor · 2024-03-05T17:46:39Z

👋 Hi, I'm @docu-mentor, an LLM-powered GitHub app
powered by Anyscale Endpoints
that gives you actionable feedback on your writing.

Simply create a new comment in this PR that says:

@docu-mentor run

and I will start my analysis. I only look at what you changed
in this PR. If you only want me to look at specific files or folders,
you can specify them like this:

@docu-mentor run doc/ README.md

In this example, I'll have a look at all files contained in the "doc/"
folder and the file "README.md". All good? Let's get started!

github-actions · 2024-03-05T17:47:31Z

All PEP8 errors have been fixed, thanks ❤️


valer1435 · 2024-03-07T14:29:31Z

@open-code-helper run

open-code-helper · 2024-03-07T14:30:03Z

🚀 Open code helper finished analysing your PR! 🚀

Take a look at your results:
[Bot output not reproduced: degenerate, repetitive text with no recoverable content]

This bot is powered by NVIDIA AI Foundation Models and Endpoints.

valer1435 · 2024-03-07T14:44:29Z

@open-code-helper run

codecov · 2024-03-10T07:25:03Z

Codecov Report

Attention: Patch coverage is 66.10169%, with 20 lines in your changes missing coverage. Please review.

Project coverage is 79.77%. Comparing base (c53881a) to head (c52937c).
Report is 4 commits behind head on master.

Files	Patch %	Lines
...mplementations/models/boostings_implementations.py	66.10%	20 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1264      +/-   ##
==========================================
- Coverage   79.82%   79.77%   -0.06%     
==========================================
  Files         150      146       -4     
  Lines       10322    10089     -233     
==========================================
- Hits         8240     8048     -192     
+ Misses       2082     2041      -41


RomanKharkovskoy added 3 commits March 5, 2024 20:41

implemented lightgbm into boostings.py

3360c68

changed sklearn to boosting

787aacd

changed default params

cc2fef9

aimclub deleted a comment from open-code-helper bot Mar 7, 2024

updated default_params

c52937c
