Random Forest and Ensemble Learning #199

@jk1015

I've recently been looking into adding Random Forest to linfa. Since Ensemble Learning is on the roadmap anyway, I think the best way to do this would be to add Bootstrap Aggregation for any classifier rather than specialising the implementation to Decision Trees. I'm not sure what the design should look like, though, especially since there don't seem to be any fixed conventions for implementing classifiers in linfa.

Would general bootstrap aggregation be a useful addition? If so, I'm interested in others' opinions on how it should interface with existing and future classifiers in linfa, along with any other design considerations.

YuhanLiin commented on Feb 18, 2022

@YuhanLiin
Collaborator

In impl_dataset.rs we already have bootstrap aggregation code that produces sub-samples from a dataset. We just need a generalized way of fitting classifiers over those sub-samples. The trait linfa::traits::Fit represents training a model from a set of hyperparameters, and linfa::traits::PredictInplace represents prediction with a trained model. You can define a new ensemble classifier that's generic over these traits, similar to how cross_validate is defined. Its Fit impl fits its "inner" classifier/regressor once per sub-sample, and its prediction impl combines the predictions of the inner models: averaging for regression, voting for classification.
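The design described above could be sketched roughly as follows. This is a dependency-free illustration, not linfa code: the `Fit` and `Predict` traits below are simplified stand-ins for linfa::traits::Fit and linfa::traits::PredictInplace (the real traits work on linfa datasets and have different signatures), and `Bagging`, `BaggedModel`, `Stump`, `StumpModel` and `next_u64` are hypothetical names made up for the example. The bootstrap sampling is also hand-rolled here rather than taken from impl_dataset.rs.

```rust
use std::collections::HashMap;

/// Stand-in for the training trait: hyperparameters in, trained model out.
pub trait Fit {
    type Model: Predict;
    fn fit(&self, xs: &[f64], ys: &[usize]) -> Self::Model;
}

/// Stand-in for the prediction trait of a trained model.
pub trait Predict {
    fn predict(&self, x: f64) -> usize;
}

/// Hypothetical bagging hyperparameters, generic over any inner learner.
pub struct Bagging<P> {
    pub inner: P,
    pub n_models: usize,
    pub seed: u64,
}

/// The trained ensemble: one inner model per bootstrap sample.
pub struct BaggedModel<M> {
    pub models: Vec<M>,
}

/// Minimal xorshift64 step, used only to keep the sketch dependency-free.
fn next_u64(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

impl<P: Fit> Fit for Bagging<P> {
    type Model = BaggedModel<P::Model>;

    fn fit(&self, xs: &[f64], ys: &[usize]) -> Self::Model {
        assert!(!xs.is_empty() && xs.len() == ys.len());
        let n = xs.len();
        let mut state = self.seed.max(1); // xorshift must not start at zero
        let models = (0..self.n_models)
            .map(|_| {
                // Bootstrap: draw n indices with replacement.
                let idx: Vec<usize> = (0..n)
                    .map(|_| (next_u64(&mut state) % n as u64) as usize)
                    .collect();
                let bx: Vec<f64> = idx.iter().map(|&i| xs[i]).collect();
                let by: Vec<usize> = idx.iter().map(|&i| ys[i]).collect();
                self.inner.fit(&bx, &by)
            })
            .collect();
        BaggedModel { models }
    }
}

impl<M: Predict> Predict for BaggedModel<M> {
    fn predict(&self, x: f64) -> usize {
        // Majority vote across inner models; ties go to the smallest label.
        let mut votes: HashMap<usize, usize> = HashMap::new();
        for m in &self.models {
            *votes.entry(m.predict(x)).or_insert(0) += 1;
        }
        votes
            .into_iter()
            .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(&a.0)))
            .map(|(label, _)| label)
            .expect("ensemble has no models")
    }
}

/// Toy binary learner (labels must be 0 or 1): threshold halfway between
/// the two class means. It assumes class 1 lies above class 0.
pub struct Stump;
pub struct StumpModel {
    pub threshold: f64,
}

impl Fit for Stump {
    type Model = StumpModel;

    fn fit(&self, xs: &[f64], ys: &[usize]) -> StumpModel {
        let (mut sum, mut cnt) = ([0.0f64; 2], [0usize; 2]);
        for (&x, &y) in xs.iter().zip(ys) {
            sum[y] += x;
            cnt[y] += 1;
        }
        let threshold = match (cnt[0], cnt[1]) {
            (0, _) => f64::NEG_INFINITY, // sample held only class 1
            (_, 0) => f64::INFINITY,     // sample held only class 0
            _ => (sum[0] / cnt[0] as f64 + sum[1] / cnt[1] as f64) / 2.0,
        };
        StumpModel { threshold }
    }
}

impl Predict for StumpModel {
    fn predict(&self, x: f64) -> usize {
        if x > self.threshold { 1 } else { 0 }
    }
}
```

With this shape, `Bagging { inner: Stump, n_models: 25, seed: 42 }.fit(&xs, &ys)` returns a `BaggedModel<StumpModel>`, and swapping `Stump` for a decision tree learner would give a random-forest-like ensemble without the ensemble code knowing anything about trees. A regression variant would replace the vote with a mean over the inner predictions.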

EricTulowetzke commented on Aug 3, 2022

Here is a WIP PR for RF

#43

YuhanLiin commented on Aug 7, 2022

The ensemble learning work from that PR ended up in #66, which didn't pan out for some reason. The current work is in #229.

          Random Forest and Ensemble Learning · Issue #199 · rust-ml/linfa