Major refactor based on Julia Discourse feedback #25

ablaom · 2023-11-28T00:49:50Z

This PR is a major refactor, based on the feedback at this Julia Discourse thread as of November 2023.

There are really too many changes to itemize and I recommend interested parties peruse the new documentation, which is less bloated and substantially rationalised. These changes will not be considered "final", even after merge.

The main change is a move away from methods that return multiple values (such as a separate report or state) and towards the design suggested by @CameronBieganek. A basic user workflow looks like this:

X = <some training features>
y = <some training target>
Xnew = <some test or production features>

# Train:
model = fit(forest, X, y)

# Predict probability distributions:
predict(model, Distribution(), Xnew)

# Generate point predictions:
ŷ = predict(model, LiteralTarget(), Xnew) # or `predict(model, Xnew)`

# Apply an "accessor function" to inspect byproducts of training:
LearnAPI.feature_importances(model)

# Slim down and otherwise prepare model for serialization:
small_model = minimize(model)
serialize("my_random_forest.jls", small_model)

# Recover saved model and algorithm configuration:
recovered_model = deserialize("my_random_forest.jls")
@assert LearnAPI.algorithm(recovered_model) == forest
@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ
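To illustrate the single-return-value design from the implementer's side, here is a minimal, self-contained sketch. It is not the LearnAPI.jl implementation: the types `MeanRegressor` and `MeanRegressorFitted`, the `ntrain` accessor, and the toy definitions of `fit`, `predict`, `minimize`, `algorithm` and `LiteralTarget` are all hypothetical stand-ins defined locally, assuming only that `fit` returns one object and that everything else is read off that object.

```julia
# Toy sketch only: every name here is defined locally for illustration
# and is an assumption, not the actual LearnAPI.jl API.

# Marker type selecting the kind of prediction (point predictions here).
struct LiteralTarget end

# The "algorithm": hyperparameters only, no learned state.
struct MeanRegressor
    shrinkage::Float64
end

# The "model": the single object returned by `fit`. It bundles the learned
# parameter, a byproduct of training, and the algorithm that produced it.
struct MeanRegressorFitted
    algorithm::MeanRegressor
    mean::Float64
    ntrain::Int  # byproduct of training, exposed via an accessor function
end

# `fit` returns exactly one value, rather than a (fitted, state, report) tuple.
fit(alg::MeanRegressor, X, y) =
    MeanRegressorFitted(alg, alg.shrinkage * sum(y) / length(y), length(y))

# Dispatch on the marker type chooses the kind of prediction.
predict(model::MeanRegressorFitted, ::LiteralTarget, Xnew) =
    fill(model.mean, size(Xnew, 1))

# Shortcut for the default kind of prediction.
predict(model::MeanRegressorFitted, Xnew) = predict(model, LiteralTarget(), Xnew)

# Accessor functions replace separate "report" and "state" return values.
algorithm(model::MeanRegressorFitted) = model.algorithm
ntrain(model::MeanRegressorFitted) = model.ntrain

# Drop training byproducts, keeping only what `predict` needs.
minimize(model::MeanRegressorFitted) =
    MeanRegressorFitted(model.algorithm, model.mean, 0)

# Usage:
X = [1.0 2.0; 3.0 4.0; 5.0 6.0]
y = [1.0, 2.0, 3.0]
alg = MeanRegressor(1.0)

model = fit(alg, X, y)
ŷ = predict(model, LiteralTarget(), X)
small = minimize(model)

@assert predict(model, X) == ŷ
@assert algorithm(model) == alg
@assert predict(small, X) == ŷ
```

Because the fitted object carries its own algorithm and byproducts, user code never has to thread a separate state or report through later calls, and a minimized copy still predicts identically.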

I am also handling what was previously the "optional data interface" differently. See the obs method described in the docs.

I have not renamed algorithms to strategies in this PR. This was simply going to be too much cognitive dissonance.

For later PRs:

- Refactor the "update" and "ingest" apparatus (just dumped them for now).
- Address "Accounting for data objects that only iterate" #15.
- Address some other outstanding issues.
- There are too many traits about input/output types/scitypes. Dump some of these?

add SpellCheck GH action
work in progress
work in progress
work in progress
work in progress
WIP
tests passing, dumped ext for MLUtils as redundant
work in progress
work in progress
work in progress
wip
wip
big refactor, based on Julia Discourse feedback
add components accessor function
rename is_wrapper -> is_composite
add predict shortcut
tweak
tweaks
tweaks

codecov · 2023-11-28T00:52:31Z

Codecov Report

Attention: 15 lines in your changes are missing coverage. Please review.

Comparison is base (7d9dae0) 55.29% compared to head (71c57f4) 64.70%.

Files	Patch %	Lines
src/traits.jl	53.33%	14 Missing ⚠️
src/minimize.jl	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##              dev      #25      +/-   ##
==========================================
+ Coverage   55.29%   64.70%   +9.41%     
==========================================
  Files           6        9       +3     
  Lines          85      119      +34     
==========================================
+ Hits           47       77      +30     
- Misses         38       42       +4


ablaom added 2 commits November 28, 2023 13:10

rm redundant line from doc example

2507568

ablaom added 4 commits November 28, 2023 14:04

spelling

da2b607

spelling again

417bd03

make codecov less fussy

c532f96

bump Documenter version to 1.0

71c57f4

ablaom merged commit 986192e into dev Nov 28, 2023
4 of 6 checks passed
