Merge pull request #1035 from alan-turing-institute/dev
For a 0.19.3 release
ablaom authored Aug 24, 2023
2 parents ed416ae + 585a120 commit 0d58ec1
Showing 10 changed files with 51 additions and 11 deletions.
6 changes: 5 additions & 1 deletion Project.toml
@@ -1,7 +1,7 @@
name = "MLJ"
uuid = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
authors = ["Anthony D. Blaom <[email protected]>"]
-version = "0.19.2"
+version = "0.19.3"

[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
@@ -11,13 +11,15 @@ Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
MLJEnsembles = "50ed68f4-41fd-4504-931a-ed422449fee0"
+MLJFlow = "7b7b8358-b45c-48ea-a8ef-7ca328ad328f"
MLJIteration = "614be32b-d00c-4edb-bd02-1eb411ab5e55"
MLJModels = "d491faf4-2d78-11e9-2867-c94bc002c0b7"
MLJTuning = "03970b2e-30c4-11ea-3135-d1576263f10f"
OpenML = "8b6db2d4-7670-4922-a472-f9537c81ab66"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
+Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
ScientificTypes = "321657f4-b219-11e9-178b-2701a2544e81"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
@@ -29,11 +31,13 @@ ComputationalResources = "0.3"
Distributions = "0.21,0.22,0.23, 0.24, 0.25"
MLJBase = "0.21.3"
MLJEnsembles = "0.3"
+MLJFlow = "0.1"
MLJIteration = "0.5"
MLJModels = "0.16"
MLJTuning = "0.7"
OpenML = "0.2,0.3"
ProgressMeter = "1.1"
+Reexport = "1.2"
ScientificTypes = "3"
StatsBase = "0.32,0.33, 0.34"
Tables = "0.2,1.0"
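
Reviewer note: the hunks above add `MLJFlow` and `Reexport` under both `[deps]` and `[compat]`. A minimal sketch of how one might check that pairing with Julia's `TOML` standard library is given below. It parses only a fragment copied from this diff, not the full `Project.toml`, and the variable name `missing_compat` is purely illustrative.

```julia
using TOML

# Fragment reproducing only the entries this commit adds
# (UUIDs and bounds copied from the diff above).
project = TOML.parse("""
[deps]
MLJFlow = "7b7b8358-b45c-48ea-a8ef-7ca328ad328f"
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"

[compat]
MLJFlow = "0.1"
Reexport = "1.2"
""")

# Collect dependency names that have no corresponding [compat] bound.
missing_compat = sort([name for name in keys(project["deps"])
                       if !haskey(project["compat"], name)])

println(isempty(missing_compat) ? "every listed dep has a compat bound" :
                                  "missing compat bounds: $missing_compat")
```

Run against the complete file, a check like this would also flag standard libraries such as `LinearAlgebra` or `Random`, which appear under `[deps]` without a `[compat]` entry in this diff, so those would need to be excluded.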
14 changes: 7 additions & 7 deletions ROADMAP.md
@@ -1,27 +1,27 @@
# Road map

-February 2020; updated, May 2021
+February 2020; updated, July 2023

Please visit [contributing guidelines](CONTRIBUTING.md) if interested
in contributing to MLJ.

-### Guiding goals
+### Goals

- **Usability, interoperability, extensibility, reproducibility,**
and **code transparency**.

- Offer state-of-art tools for model **composition** and model
**optimization** (hyper-parameter tuning)

-- Avoid common **pain-points** of other frameworks:
+- Avoid common **pain-points** of other frameworks with MLJ:

-- identifying all models that solve a given task
+- identify and list all models that solve a given task

-- routine operations requiring a lot of code
+- easily perform routine operations requiring a lot of code

-- passage from data source to algorithm-specific data format
+- easily transform data, from source to algorithm-specific data format

-- probabilistic predictions: inconsistent representations, lack
+- make use of probabilistic predictions: no more inconsistent representations / lack
of options for performance evaluation

- Add some focus to julia machine learning software development more
6 changes: 6 additions & 0 deletions docs/ModelDescriptors.toml
@@ -141,6 +141,7 @@ MultitargetLinearRegressor_MultivariateStats = ["regression"]
MultitargetNeuralNetworkRegressor_BetaML = ["regression"]
MultitargetNeuralNetworkRegressor_MLJFlux = ["regression", "iterative_models"]
MultitargetRidgeRegressor_MultivariateStats = ["regression"]
+MultitargetSRRegressor_SymbolicRegression = ["regression"]
NeuralNetworkClassifier_BetaML = ["classification"]
NeuralNetworkClassifier_MLJFlux = ["classification", "iterative_models"]
NeuralNetworkRegressor_BetaML = ["regression"]
@@ -187,6 +188,11 @@ SGDClassifier_MLJScikitLearnInterface = ["classification"]
SGDRegressor_MLJScikitLearnInterface = ["regression"]
SODDetector_OutlierDetectionPython = ["outlier_detection", "outlier_detection"]
SOSDetector_OutlierDetectionPython = ["outlier_detection"]
+SRRegressor_SymbolicRegression = ["regression"]
+StableForestClassifier_SIRUS = ["classification"]
+StableForestRegressor_SIRUS = ["regression"]
+StableRulesClassifier_SIRUS = ["classification"]
+StableRulesRegressor_SIRUS = ["regression"]
SVC_LIBSVM = ["classification"]
SVMClassifier_MLJScikitLearnInterface = ["classification"]
SVMLinearClassifier_MLJScikitLearnInterface = ["classification"]
1 change: 1 addition & 0 deletions docs/make.jl
@@ -74,6 +74,7 @@ pages = [
"Learning Networks" => "learning_networks.md",
"Controlling Iterative Models" => "controlling_iterative_models.md",
"Generating Synthetic Data" => "generating_synthetic_data.md",
+"Logging Workflows" => "logging_workflows.md",
"OpenML Integration" => "openml_integration.md",
"Acceleration and Parallelism" => "acceleration_and_parallelism.md",
"Simple User Defined Models" => "simple_user_defined_models.md",
7 changes: 5 additions & 2 deletions docs/src/evaluating_model_performance.md
@@ -9,8 +9,11 @@ In addition to hold-out and cross-validation, the user can specify
an explicit list of train/test pairs of row indices for resampling, or
define new resampling strategies.

-For simultaneously evaluating *multiple* models and/or data
-sets, see [Benchmarking](benchmarking.md).
+For simultaneously evaluating *multiple* models, see [Comparing models of different type
+and nested cross-validation](@ref).
+
+For externally logging the outcomes of performance evaluation experiments, see [Logging
+Workflows](@ref)

## Evaluating against a single measure

4 changes: 4 additions & 0 deletions docs/src/index.md
@@ -75,6 +75,10 @@ To support MLJ development, please cite these works or star the repo:
[Model Stacking](@ref) |
[Learning Networks](@ref)

+### Integration
+[Logging Workflows](@ref) |
+[OpenML Integration](@ref)
+
### Customization and Extension
[Simple User Defined Models](@ref) |
[Quick-Start Guide to Adding Models](@ref) |
2 changes: 2 additions & 0 deletions docs/src/list_of_supported_models.md
@@ -50,6 +50,8 @@ independent assessment.
[ParallelKMeans.jl](https://github.com/PyDataBlog/ParallelKMeans.jl) | - | KMeans | experimental |
[PartialLeastSquaresRegressor.jl](https://github.com/lalvim/PartialLeastSquaresRegressor.jl) | - | PLSRegressor, KPLSRegressor | experimental |
[ScikitLearn.jl](https://github.com/cstjean/ScikitLearn.jl) | [MLJScikitLearnInterface.jl](https://github.com/JuliaAI/MLJScikitLearnInterface.jl) | ARDRegressor, AdaBoostClassifier, AdaBoostRegressor, AffinityPropagation, AgglomerativeClustering, BaggingClassifier, BaggingRegressor, BayesianLDA, BayesianQDA, BayesianRidgeRegressor, BernoulliNBClassifier, Birch, ComplementNBClassifier, DBSCAN, DummyClassifier, DummyRegressor, ElasticNetCVRegressor, ElasticNetRegressor, ExtraTreesClassifier, ExtraTreesRegressor, FeatureAgglomeration, GaussianNBClassifier, GaussianProcessClassifier, GaussianProcessRegressor, GradientBoostingClassifier, GradientBoostingRegressor, HuberRegressor, KMeans, KNeighborsClassifier, KNeighborsRegressor, LarsCVRegressor, LarsRegressor, LassoCVRegressor, LassoLarsCVRegressor, LassoLarsICRegressor, LassoLarsRegressor, LassoRegressor, LinearRegressor, LogisticCVClassifier, LogisticClassifier, MeanShift, MiniBatchKMeans, MultiTaskElasticNetCVRegressor, MultiTaskElasticNetRegressor, MultiTaskLassoCVRegressor, MultiTaskLassoRegressor, MultinomialNBClassifier, OPTICS, OrthogonalMatchingPursuitCVRegressor, OrthogonalMatchingPursuitRegressor, PassiveAggressiveClassifier, PassiveAggressiveRegressor, PerceptronClassifier, ProbabilisticSGDClassifier, RANSACRegressor, RandomForestClassifier, RandomForestRegressor, RidgeCVClassifier, RidgeCVRegressor, RidgeClassifier, RidgeRegressor, SGDClassifier, SGDRegressor, SVMClassifier, SVMLClassifier, SVMLRegressor, SVMNuClassifier, SVMNuRegressor, SVMRegressor, SpectralClustering, TheilSenRegressor | high² |
+[SIRUS.jl](https://github.com/rikhuijzer/SIRUS.jl) | - | StableForestClassifier, StableForestRegressor, StableRulesClassifier, StableRulesRegressor | low |
+[SymbolicRegression.jl](https://github.com/MilesCranmer/SymbolicRegression.jl) | - | MultitargetSRRegressor, SRRegressor | experimental |
[TSVD.jl](https://github.com/JuliaLinearAlgebra/TSVD.jl) | [MLJTSVDInterface.jl](https://github.com/JuliaAI/MLJTSVDInterface.jl) | TSVDTransformer | high |
[XGBoost.jl](https://github.com/dmlc/XGBoost.jl) | [MLJXGBoostInterface.jl](https://github.com/JuliaAI/MLJXGBoostInterface.jl) | XGBoostRegressor, XGBoostClassifier, XGBoostCount | high |

12 changes: 12 additions & 0 deletions docs/src/logging_workflows.md
@@ -0,0 +1,12 @@
+# Logging Workflows
+
+## MLflow integration
+
+[MLflow](https://mlflow.org) is a popular, language-agnostic, tool for externally logging
+the outcomes of machine learning experiments, including those carried out using MLJ.
+
+This functionality is provided by the [MLJFlow.jl](https://github.com/JuliaAI/MLJFlow.jl)
+package whose methods are automatically available to MLJ users. Refer to the package's
+documentation for examples.
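
Reviewer note: the new page defers all examples to MLJFlow.jl. For orientation, here is a hedged sketch of what logging a performance evaluation might look like. It assumes that `MLFlowLogger` accepts a base URI plus the `experiment_name` and `artifact_location` keywords, and that the installed MLJBase version supports a `logger` keyword in `evaluate`; neither is confirmed by this diff, so check the MLJFlow.jl documentation for the versions you have installed. The helper name `log_baseline_evaluation`, the experiment name, and the artifact path are all hypothetical.

```julia
using MLJ  # assumes MLJ >= 0.19.3, which re-exports MLJFlow's names

# Hypothetical helper (not part of MLJ or MLJFlow): evaluate a baseline
# classifier and ask MLJ to log the run to the MLflow server at `uri`.
function log_baseline_evaluation(uri::AbstractString)
    # Assumed constructor signature; the keyword values are arbitrary.
    logger = MLFlowLogger(uri;
                          experiment_name="mlj-demo",
                          artifact_location="./mlj-demo-artifacts")

    X, y = make_moons(100)        # small synthetic binary classification task
    model = ConstantClassifier()  # probabilistic baseline shipped with MLJ

    # Assumes `evaluate` accepts `logger`, which depends on the MLJBase version.
    return evaluate(model, X, y;
                    resampling=CV(nfolds=3),
                    measure=log_loss,
                    logger=logger)
end
```

With an MLflow tracking server already running locally, a call such as `log_baseline_evaluation("http://127.0.0.1:5000")` would be the intended usage. The address is an assumption about a default local setup, not something taken from this commit.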


5 changes: 4 additions & 1 deletion src/MLJ.jl
@@ -8,13 +8,16 @@ import Distributed: @distributed, nworkers, pmap
import Pkg
import Pkg.TOML

+using Reexport
+
# from the MLJ universe:
using MLJBase
import MLJBase.save
using MLJEnsembles
using MLJTuning
using MLJModels
using OpenML
+@reexport using MLJFlow
using MLJIteration
import MLJIteration.IterationControl

@@ -89,7 +92,7 @@ export nrows, color_off, color_on,
@load_boston, @load_ames, @load_iris, @load_reduced_ames, @load_crabs,
load_boston, load_ames, load_iris, load_reduced_ames, load_crabs,
Machine, machine, AbstractNode, @node,
source, node, fit!, freeze!, thaw!, Node, sources, origins,

machines, sources, anonymize!, @from_network, fitresults,
@pipeline, Stack, Pipeline, TransformedTargetModel,
ResamplingStrategy, Holdout, CV, TimeSeriesCV,
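
Reviewer note: `@reexport using MLJFlow` is what makes MLJFlow's exported names (such as `MLFlowLogger`, checked in the test file below) visible after a plain `using MLJ`. The toy sketch below illustrates the mechanism with two stand-in modules. `Inner`, `Outer`, and `greet` are hypothetical names, and the sketch assumes Reexport.jl is installed in the active environment.

```julia
module Inner   # hypothetical stand-in for MLJFlow
export greet
greet() = "hello from Inner"
end

module Outer   # hypothetical stand-in for MLJ
using Reexport
# Load Inner and re-export every name that Inner itself exports.
@reexport using ..Inner
end

using .Outer

# `greet` is now reachable without ever writing `using .Inner`.
println(greet())
println(:greet in names(Outer))
```

Without the `@reexport`, `Outer` could still call `greet` internally, but code that only does `using .Outer` would have to qualify it as `Outer.Inner.greet`.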
5 changes: 5 additions & 0 deletions test/exported_names.jl
@@ -22,4 +22,9 @@ Save()

@test OpenML.load isa Function


+# MLJFlow
+
+MLFlowLogger
+
true
