Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Major refactor based on Julia Discourse feedback #25

Merged
merged 6 commits into from
Nov 28, 2023
Merged

Conversation

ablaom
Copy link
Member

@ablaom ablaom commented Nov 28, 2023

This PR is a major refactor, based on the feedback at this Julia Discourse thread as of November 2023.

There are really too many changes to itemize and I recommend interested parties peruse the new documentation, which is less bloated and substantially rationalised. These changes will not be considered "final", even after merge.

The main change is a move away from methods with return multiple values (such as separate report or state) and towards the design suggested by @CameronBieganek. A basic user workflow looks like this:

X = <some training features>
y = <some training target>
Xnew = <some test or production features>

# Train:
model = fit(forest, X, y)

# Predict probability distributions:
predict(model, Distribution(), Xnew)

# Generate point predictions:
ŷ = predict(model, LiteralTarget(), Xnew) # or `predict(model, Xnew)`

# Apply an "accessor function" to inspect byproducts of training:
LearnAPI.feature_importances(model)

# Slim down and otherwise prepare model for serialization:
small_model = minimize(model)
serialize("my_random_forest.jls", small_model)

# Recover saved model and algorithm configuration:
recovered_model = deserialize("my_random_forest.jls")
@assert LearnAPI.algorithm(recovered_model) == forest
@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ

I am also handling the previously "optional data interface" in a different way. See the obs method referred to in the docs.

I have not renamed algorithms to strategies in this PR. This was simply going to be too much cognitive dissonance

For later PR's

add SpellCheck GH action

work in progress

work in progress

work in progress

work in progress

WIP tests passing, dumped ext for MLUtils as redundant

work in progress

work in progress

work in progress

wip

wip

big refactor, based on Julia Discourse feedback

add components accessor function

rename is_wrapper -> is_composite

add predict shortcut

tweak

tweaks

tweaks
Copy link

codecov bot commented Nov 28, 2023

Codecov Report

Attention: 15 lines in your changes are missing coverage. Please review.

Comparison is base (7d9dae0) 55.29% compared to head (71c57f4) 64.70%.

Files Patch % Lines
src/traits.jl 53.33% 14 Missing ⚠️
src/minimize.jl 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev      #25      +/-   ##
==========================================
+ Coverage   55.29%   64.70%   +9.41%     
==========================================
  Files           6        9       +3     
  Lines          85      119      +34     
==========================================
+ Hits           47       77      +30     
- Misses         38       42       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ablaom ablaom merged commit 986192e into dev Nov 28, 2023
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant