From 3f05fde4c9368a2e6718eaf01a11e986c00ea689 Mon Sep 17 00:00:00 2001
From: "Documenter.jl"
Date: Fri, 10 May 2024 03:13:37 +0000
Subject: [PATCH] build based on b53d2fd

---
 dev/.documenter-siteinfo.json | 2 +-
 dev/accessor_functions/index.html | 4 +-
 dev/anatomy_of_an_implementation/index.html | 18 +-
 dev/assets/documenter.js | 923 ++++++++++--------
 dev/assets/themes/documenter-dark.css | 2 +-
 dev/assets/themes/documenter-light.css | 2 +-
 dev/common_implementation_patterns/index.html | 2 +-
 dev/fit/index.html | 4 +-
 dev/index.html | 2 +-
 dev/kinds_of_target_proxy/index.html | 2 +-
 dev/minimize/index.html | 2 +-
 dev/objects.inv | Bin 0 -> 1931 bytes
 dev/obs/index.html | 2 +-
 dev/patterns/classification/index.html | 2 +-
 dev/patterns/clusterering/index.html | 2 +-
 dev/patterns/dimension_reduction/index.html | 2 +-
 .../incremental_algorithms/index.html | 2 +-
 dev/patterns/incremental_models/index.html | 2 +-
 dev/patterns/iterative_algorithms/index.html | 2 +-
 .../index.html | 2 +-
 .../missing_value_imputation/index.html | 2 +-
 dev/patterns/outlier_detection/index.html | 2 +-
 dev/patterns/regression/index.html | 2 +-
 dev/patterns/static_algorithms/index.html | 2 +-
 .../supervised_bayesian_algorithms/index.html | 2 +-
 .../supervised_bayesian_models/index.html | 2 +-
 dev/patterns/survival_analysis/index.html | 2 +-
 .../time_series_classification/index.html | 2 +-
 .../time_series_forecasting/index.html | 2 +-
 dev/predict_transform/index.html | 10 +-
 dev/reference/index.html | 2 +-
 dev/search_index.js | 2 +-
 dev/testing_an_implementation/index.html | 2 +-
 dev/traits/index.html | 24 +-
 34 files changed, 599 insertions(+), 438 deletions(-)
 create mode 100644 dev/objects.inv

diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index fd6d21e5..12bd8af1 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.9.4","generation_timestamp":"2023-12-06T21:23:55","documenter_version":"1.2.1"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.3","generation_timestamp":"2024-05-10T03:13:33","documenter_version":"1.4.1"}}
\ No newline at end of file
diff --git a/dev/accessor_functions/index.html b/dev/accessor_functions/index.html
index 9102ad17..97f5f071 100644
--- a/dev/accessor_functions/index.html
+++ b/dev/accessor_functions/index.html
@@ -1,3 +1,3 @@
-Accessor Functions · LearnAPI.jl

Accessor Functions

The sole argument of an accessor function is the output, model, of fit or obsfit.

Implementation guide

All new implementations must implement LearnAPI.algorithm. All others are optional. All implemented accessor functions must be added to the list returned by LearnAPI.functions.

Reference

LearnAPI.algorithmFunction
LearnAPI.algorithm(model)
-LearnAPI.algorithm(minimized_model)

Recover the algorithm used to train model or the output of minimize(model).

In other words, if model = fit(algorithm, data...), for some algorithm and data, then

LearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(minimize(model))

is true.

New implementations

Implementation is compulsory for new algorithm types. The behaviour described above is the only contract. If implemented, you must include algorithm in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.extrasFunction
LearnAPI.extras(model)

Return miscellaneous byproducts of an algorithm's computation, from the object model returned by a call of the form fit(algorithm, data).

For "static" algorithms (those without training data) it may be necessary to first call transform or predict on model.

See also fit.

New implementations

Implementation is discouraged for byproducts already covered by other LearnAPI.jl accessor functions: LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.

If implemented, you must include extras in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.coefficientsFunction
LearnAPI.coefficients(model)

For a linear model, return the learned coefficients. The value returned has the form of an abstract vector of feature_or_class::Symbol => coefficient::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]) or, in the case of multiple targets, feature::Symbol => coefficients::AbstractVector{<:Real} pairs.

The model reports coefficients if LearnAPI.coefficients in LearnAPI.functions(LearnAPI.algorithm(model)).

See also LearnAPI.intercept.

New implementations

Implementation is optional.

If implemented, you must include coefficients in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.interceptFunction
LearnAPI.intercept(model)

For a linear model, return the learned intercept. The value returned is Real (single target) or an AbstractVector{<:Real} (multi-target).

The model reports intercept if LearnAPI.intercept in LearnAPI.functions(LearnAPI.algorithm(model)).

See also LearnAPI.coefficients.

New implementations

Implementation is optional.

If implemented, you must include intercept in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.treeFunction
LearnAPI.tree(model)

Return a user-friendly tree, in the form of a root object implementing the following interface defined in AbstractTrees.jl:

  • subtypes AbstractTrees.AbstractNode{T}
  • implements AbstractTrees.children()
  • implements AbstractTrees.printnode()

Such a tree can be visualized using the TreeRecipe.jl package, for example.

See also LearnAPI.trees.

New implementations

Implementation is optional.

If implemented, you must include tree in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.feature_importancesFunction
LearnAPI.feature_importances(model)

Return the algorithm-specific feature importances of a model output by fit(algorithm, ...) for some algorithm. The value returned has the form of an abstract vector of feature::Symbol => importance::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]).

The algorithm supports feature importances if LearnAPI.feature_importances in LearnAPI.functions(algorithm).

If an algorithm is sometimes unable to report feature importances then LearnAPI.feature_importances will return all importances as 0.0, as in [:gender => 0.0, :height => 0.0, :weight => 0.0].

New implementations

Implementation is optional.

If implemented, you must include feature_importances in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.training_lossesFunction
LearnAPI.training_losses(model)

Return the training losses obtained when running model = fit(algorithm, ...) for some algorithm.

See also fit.

New implementations

Implement for iterative algorithms that compute and record training losses as part of training (e.g. neural networks).

If implemented, you must include training_losses in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.training_scoresFunction
LearnAPI.training_scores(model)

Return the training scores obtained when running model = fit(algorithm, ...) for some algorithm.

See also fit.

New implementations

Implement for algorithms, such as outlier detection algorithms, which associate a score with each observation during training, where these scores are of interest in later processes (e.g., in defining normalized scores for new data).

If implemented, you must include training_scores in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.training_labelsFunction
LearnAPI.training_labels(model)

Return the training labels obtained when running model = fit(algorithm, ...) for some algorithm.

See also fit.

New implementations

If implemented, you must include training_labels in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.componentsFunction
LearnAPI.components(model)

For a composite model, return the component models (fit outputs). These will be in the form of a vector of named pairs, property_name::Symbol => component_model. Here property_name is the name of some algorithm-valued property (hyper-parameter) of algorithm = LearnAPI.algorithm(model).

A composite model is one for which the corresponding algorithm includes one or more algorithm-valued properties, and for which LearnAPI.is_composite(algorithm) is true.

See also is_composite.

New implementations

Implement if and only if model is a composite model.

If implemented, you must include components in the tuple returned by the LearnAPI.functions trait.

source
+Accessor Functions · LearnAPI.jl

Accessor Functions

The sole argument of an accessor function is the output, model, of fit or obsfit.

Implementation guide

All new implementations must implement LearnAPI.algorithm. While all others are optional, any implemented accessor functions must be added to the list returned by LearnAPI.functions.

Reference

LearnAPI.algorithmFunction
LearnAPI.algorithm(model)
+LearnAPI.algorithm(minimized_model)

Recover the algorithm used to train model or the output of minimize(model).

In other words, if model = fit(algorithm, data...), for some algorithm and data, then

LearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(minimize(model))

is true.

New implementations

Implementation is compulsory for new algorithm types. The behaviour described above is the only contract. If implemented, you must include algorithm in the tuple returned by the LearnAPI.functions trait.
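
The contract can usually be met with a one-line method. Below is a minimal sketch, assuming a hypothetical Ridge algorithm whose fit output is a RidgeFitted struct; all type and field names here are illustrative, not part of LearnAPI.jl:

```julia
import LearnAPI

# Hypothetical algorithm and fitted-model types (illustrative names only):
struct Ridge
    lambda::Float64
end

struct RidgeFitted{A}
    algorithm::A                                # the algorithm used in training
    coefficients::Vector{Pair{Symbol,Float64}}  # learned parameters
end

# The compulsory accessor simply recovers the stored algorithm:
LearnAPI.algorithm(model::RidgeFitted) = model.algorithm

# ... and must be listed in the `functions` trait:
LearnAPI.functions(::Ridge) = (LearnAPI.fit, LearnAPI.algorithm)
```

Provided minimize preserves whatever field stores the algorithm, the identity LearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(minimize(model)) then holds automatically.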

source
LearnAPI.extrasFunction
LearnAPI.extras(model)

Return miscellaneous byproducts of an algorithm's computation, from the object model returned by a call of the form fit(algorithm, data).

For "static" algorithms (those without training data) it may be necessary to first call transform or predict on model.

See also fit.

New implementations

Implementation is discouraged for byproducts already covered by other LearnAPI.jl accessor functions: LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.

If implemented, you must include extras in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.coefficientsFunction
LearnAPI.coefficients(model)

For a linear model, return the learned coefficients. The value returned has the form of an abstract vector of feature_or_class::Symbol => coefficient::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]) or, in the case of multiple targets, feature::Symbol => coefficients::AbstractVector{<:Real} pairs.

The model reports coefficients if LearnAPI.coefficients in LearnAPI.functions(LearnAPI.algorithm(model)).

See also LearnAPI.intercept.

New implementations

Implementation is optional.

If implemented, you must include coefficients in the tuple returned by the LearnAPI.functions trait.
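
A sketch of an implementation, assuming a hypothetical fitted-model struct that stores its learned parameters (names are illustrative, not part of LearnAPI.jl):

```julia
import LearnAPI

# Hypothetical fit output for a linear model:
struct LinearFitted{A}
    algorithm::A
    coefficients::Vector{Pair{Symbol,Float64}}  # e.g. [:height => 0.7, :weight => 0.1]
    intercept::Float64
end

# The accessor just exposes the stored feature => coefficient pairs:
LearnAPI.coefficients(model::LinearFitted) = model.coefficients
```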

source
LearnAPI.interceptFunction
LearnAPI.intercept(model)

For a linear model, return the learned intercept. The value returned is Real (single target) or an AbstractVector{<:Real} (multi-target).

The model reports intercept if LearnAPI.intercept in LearnAPI.functions(LearnAPI.algorithm(model)).

See also LearnAPI.coefficients.

New implementations

Implementation is optional.

If implemented, you must include intercept in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.treeFunction
LearnAPI.tree(model)

Return a user-friendly tree, in the form of a root object implementing the following interface defined in AbstractTrees.jl:

  • subtypes AbstractTrees.AbstractNode{T}
  • implements AbstractTrees.children()
  • implements AbstractTrees.printnode()

Such a tree can be visualized using the TreeRecipe.jl package, for example.

See also LearnAPI.trees.

New implementations

Implementation is optional.

If implemented, you must include tree in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.feature_importancesFunction
LearnAPI.feature_importances(model)

Return the algorithm-specific feature importances of a model output by fit(algorithm, ...) for some algorithm. The value returned has the form of an abstract vector of feature::Symbol => importance::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]).

The algorithm supports feature importances if LearnAPI.feature_importances in LearnAPI.functions(algorithm).

If an algorithm is sometimes unable to report feature importances then LearnAPI.feature_importances will return all importances as 0.0, as in [:gender => 0.0, :height => 0.0, :weight => 0.0].

New implementations

Implementation is optional.

If implemented, you must include feature_importances in the tuple returned by the LearnAPI.functions trait.
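
On the consuming side, support can be checked before the accessor is called. A sketch, assuming algorithm and model come from an earlier call to fit:

```julia
import LearnAPI

# Only call the accessor if the algorithm declares support for it:
if LearnAPI.feature_importances in LearnAPI.functions(algorithm)
    importances = LearnAPI.feature_importances(model)  # e.g. [:gender => 0.23, ...]
    best = argmax(last, importances)                   # pair with the largest importance
    @info "Most important feature: $(first(best))"
end
```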

source
LearnAPI.training_lossesFunction
LearnAPI.training_losses(model)

Return the training losses obtained when running model = fit(algorithm, ...) for some algorithm.

See also fit.

New implementations

Implement for iterative algorithms that compute and record training losses as part of training (e.g. neural networks).

If implemented, you must include training_losses in the tuple returned by the LearnAPI.functions trait.
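
For instance, an iterative algorithm might record one loss per epoch during training, to be exposed later by the accessor (struct name hypothetical):

```julia
import LearnAPI

# Hypothetical fit output for an iterative algorithm:
struct NeuralNetworkFitted{A}
    algorithm::A
    losses::Vector{Float64}   # one entry per epoch, recorded during `fit`
end

LearnAPI.training_losses(model::NeuralNetworkFitted) = model.losses
```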

source
LearnAPI.training_scoresFunction
LearnAPI.training_scores(model)

Return the training scores obtained when running model = fit(algorithm, ...) for some algorithm.

See also fit.

New implementations

Implement for algorithms, such as outlier detection algorithms, which associate a score with each observation during training, where these scores are of interest in later processes (e.g., in defining normalized scores for new data).

If implemented, you must include training_scores in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.training_labelsFunction
LearnAPI.training_labels(model)

Return the training labels obtained when running model = fit(algorithm, ...) for some algorithm.

See also fit.

New implementations

If implemented, you must include training_labels in the tuple returned by the LearnAPI.functions trait.

source
LearnAPI.componentsFunction
LearnAPI.components(model)

For a composite model, return the component models (fit outputs). These will be in the form of a vector of named pairs, property_name::Symbol => component_model. Here property_name is the name of some algorithm-valued property (hyper-parameter) of algorithm = LearnAPI.algorithm(model).

A composite model is one for which the corresponding algorithm includes one or more algorithm-valued properties, and for which LearnAPI.is_composite(algorithm) is true.

See also is_composite.

New implementations

Implement if and only if model is a composite model.

If implemented, you must include components in the tuple returned by the LearnAPI.functions trait.
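
A sketch for a hypothetical two-stage pipeline; the struct and property names below are illustrative, not part of LearnAPI.jl:

```julia
import LearnAPI

# Composite algorithm with two algorithm-valued properties:
struct Pipeline{S,P}
    standardizer::S
    predictor::P
end
LearnAPI.is_composite(::Pipeline) = true

# Hypothetical fit output holding the component fit outputs:
struct PipelineFitted{A,SF,PF}
    algorithm::A
    standardizer_fit::SF
    predictor_fit::PF
end

# Pair each algorithm-valued property name with its component model:
LearnAPI.components(model::PipelineFitted) =
    [:standardizer => model.standardizer_fit, :predictor => model.predictor_fit]
```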

source
diff --git a/dev/anatomy_of_an_implementation/index.html b/dev/anatomy_of_an_implementation/index.html
index bb231b55..99bf2228 100644
--- a/dev/anatomy_of_an_implementation/index.html
+++ b/dev/anatomy_of_an_implementation/index.html
@@ -66,21 +66,21 @@ LearnAPI.functions(algorithm)
(LearnAPI.fit, LearnAPI.obsfit, LearnAPI.minimize, LearnAPI.predict, LearnAPI.obspredict, LearnAPI.obs, LearnAPI.algorithm, LearnAPI.coefficients)

Naive user workflow

Training and predicting with external resampling:

using Tables
 model = fit(algorithm, Tables.subset(X, train), y[train])
 ŷ = predict(model, LiteralTarget(), Tables.subset(X, test))
4-element Vector{Float64}:
- 1.3923271715113514
- 0.9897274455080671
- 1.0833712608796564
- 2.284815067968779

Advanced workflow

We now train and predict using internal data representations, resampled using the generic MLUtils.jl interface.

import MLUtils
+ 2.035030476492935
+ 3.1348335720184357
+ 1.1359846628809618
+ 2.8785342355493695

Advanced workflow

We now train and predict using internal data representations, resampled using the generic MLUtils.jl interface.

import MLUtils
 fit_data = obs(fit, algorithm, X, y)
 predict_data = obs(predict, algorithm, X)
 model = obsfit(algorithm, MLUtils.getobs(fit_data, train))
 ẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predict_data, test))
-@assert ẑ == ŷ
[ Info: Coefficients: [:a => 1.9764593532693593, :b => -0.44874600614288557, :c => 0.9467477933434958]

Applying an accessor function and serialization

Extracting coefficients:

LearnAPI.coefficients(model)
3-element Vector{Pair{Symbol, Float64}}:
- :a => 1.9764593532693593
- :b => -0.44874600614288557
- :c => 0.9467477933434958

Serialization/deserialization:

using Serialization
+@assert ẑ == ŷ
[ Info: Coefficients: [:a => 1.9054811886298182, :b => 0.3178672991278192, :c => 1.7868493089298811]

Applying an accessor function and serialization

Extracting coefficients:

LearnAPI.coefficients(model)
3-element Vector{Pair{Symbol, Float64}}:
+ :a => 1.9054811886298182
+ :b => 0.3178672991278192
+ :c => 1.7868493089298811

Serialization/deserialization:

using Serialization
 small_model = minimize(model)
 serialize("my_ridge.jls", small_model)
 
 recovered_model = deserialize("my_ridge.jls")
 @assert LearnAPI.algorithm(recovered_model) == algorithm
-predict(recovered_model, LiteralTarget(), X) == predict(model, LiteralTarget(), X)

¹ The definition of this and other structs above is not an explicit requirement of LearnAPI.jl, whose constructs are purely functional.

² An implementation can provide further accessor functions, if necessary, but like the native ones, they must be included in the LearnAPI.functions declaration.

+predict(recovered_model, LiteralTarget(), X) == predict(model, LiteralTarget(), X)

¹ The definition of this and other structs above is not an explicit requirement of LearnAPI.jl, whose constructs are purely functional.

² An implementation can provide further accessor functions, if necessary, but like the native ones, they must be included in the LearnAPI.functions declaration.

diff --git a/dev/assets/documenter.js b/dev/assets/documenter.js index f5311607..c6562b55 100644 --- a/dev/assets/documenter.js +++ b/dev/assets/documenter.js @@ -4,7 +4,6 @@ requirejs.config({ 'highlight-julia': 'https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/languages/julia.min', 'headroom': 'https://cdnjs.cloudflare.com/ajax/libs/headroom/0.12.0/headroom.min', 'jqueryui': 'https://cdnjs.cloudflare.com/ajax/libs/jqueryui/1.13.2/jquery-ui.min', - 'minisearch': 'https://cdn.jsdelivr.net/npm/minisearch@6.1.0/dist/umd/index.min', 'katex-auto-render': 'https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/contrib/auto-render.min', 'jquery': 'https://cdnjs.cloudflare.com/ajax/libs/jquery/3.7.0/jquery.min', 'headroom-jquery': 'https://cdnjs.cloudflare.com/ajax/libs/headroom/0.12.0/jQuery.headroom.min', @@ -103,9 +102,10 @@ $(document).on("click", ".docstring header", function () { }); }); -$(document).on("click", ".docs-article-toggle-button", function () { +$(document).on("click", ".docs-article-toggle-button", function (event) { let articleToggleTitle = "Expand docstring"; let navArticleToggleTitle = "Expand all docstrings"; + let animationSpeed = event.noToggleAnimation ? 
0 : 400; debounce(() => { if (isExpanded) { @@ -116,7 +116,7 @@ $(document).on("click", ".docs-article-toggle-button", function () { isExpanded = false; - $(".docstring section").slideUp(); + $(".docstring section").slideUp(animationSpeed); } else { $(this).removeClass("fa-chevron-down").addClass("fa-chevron-up"); $(".docstring-article-toggle-button") @@ -127,7 +127,7 @@ $(document).on("click", ".docs-article-toggle-button", function () { articleToggleTitle = "Collapse docstring"; navArticleToggleTitle = "Collapse all docstrings"; - $(".docstring section").slideDown(); + $(".docstring section").slideDown(animationSpeed); } $(this).prop("title", navArticleToggleTitle); @@ -224,224 +224,465 @@ $(document).ready(function () { }) //////////////////////////////////////////////////////////////////////////////// -require(['jquery', 'minisearch'], function($, minisearch) { - -// In general, most search related things will have "search" as a prefix. -// To get an in-depth about the thought process you can refer: https://hetarth02.hashnode.dev/series/gsoc +require(['jquery'], function($) { -let results = []; -let timer = undefined; +$(document).ready(function () { + let meta = $("div[data-docstringscollapsed]").data(); -let data = documenterSearchIndex["docs"].map((x, key) => { - x["id"] = key; // minisearch requires a unique for each object - return x; + if (meta?.docstringscollapsed) { + $("#documenter-article-toggle-button").trigger({ + type: "click", + noToggleAnimation: true, + }); + } }); -// list below is the lunr 2.1.3 list minus the intersect with names(Base) -// (all, any, get, in, is, only, which) and (do, else, for, let, where, while, with) -// ideally we'd just filter the original list but it's not available as a variable -const stopWords = new Set([ - "a", - "able", - "about", - "across", - "after", - "almost", - "also", - "am", - "among", - "an", - "and", - "are", - "as", - "at", - "be", - "because", - "been", - "but", - "by", - "can", - "cannot", - "could", - 
"dear", - "did", - "does", - "either", - "ever", - "every", - "from", - "got", - "had", - "has", - "have", - "he", - "her", - "hers", - "him", - "his", - "how", - "however", - "i", - "if", - "into", - "it", - "its", - "just", - "least", - "like", - "likely", - "may", - "me", - "might", - "most", - "must", - "my", - "neither", - "no", - "nor", - "not", - "of", - "off", - "often", - "on", - "or", - "other", - "our", - "own", - "rather", - "said", - "say", - "says", - "she", - "should", - "since", - "so", - "some", - "than", - "that", - "the", - "their", - "them", - "then", - "there", - "these", - "they", - "this", - "tis", - "to", - "too", - "twas", - "us", - "wants", - "was", - "we", - "were", - "what", - "when", - "who", - "whom", - "why", - "will", - "would", - "yet", - "you", - "your", -]); - -let index = new minisearch({ - fields: ["title", "text"], // fields to index for full-text search - storeFields: ["location", "title", "text", "category", "page"], // fields to return with search results - processTerm: (term) => { - let word = stopWords.has(term) ? null : term; - if (word) { - // custom trimmer that doesn't strip @ and !, which are used in julia macro and function names - word = word - .replace(/^[^a-zA-Z0-9@!]+/, "") - .replace(/[^a-zA-Z0-9@!]+$/, ""); - } +}) +//////////////////////////////////////////////////////////////////////////////// +require(['jquery'], function($) { - return word ?? null; - }, - // add . as a separator, because otherwise "title": "Documenter.Anchors.add!", would not find anything if searching for "add!", only for the entire qualification - tokenize: (string) => string.split(/[\s\-\.]+/), - // options which will be applied during the search - searchOptions: { - boost: { title: 100 }, - fuzzy: 2, +/* +To get an in-depth about the thought process you can refer: https://hetarth02.hashnode.dev/series/gsoc + +PSEUDOCODE: + +Searching happens automatically as the user types or adjusts the selected filters. 
+To preserve responsiveness, as much as possible of the slow parts of the search are done +in a web worker. Searching and result generation are done in the worker, and filtering and +DOM updates are done in the main thread. The filters are in the main thread as they should +be very quick to apply. This lets filters be changed without re-searching with minisearch +(which is possible even if filtering is on the worker thread) and also lets filters be +changed _while_ the worker is searching and without message passing (neither of which are +possible if filtering is on the worker thread) + +SEARCH WORKER: + +Import minisearch + +Build index + +On message from main thread + run search + find the first 200 unique results from each category, and compute their divs for display + note that this is necessary and sufficient information for the main thread to find the + first 200 unique results from any given filter set + post results to main thread + +MAIN: + +Launch worker + +Declare nonconstant globals (worker_is_running, last_search_text, unfiltered_results) + +On text update + if worker is not running, launch_search() + +launch_search + set worker_is_running to true, set last_search_text to the search text + post the search query to worker + +on message from worker + if last_search_text is not the same as the text in the search field, + the latest search result is not reflective of the latest search query, so update again + launch_search() + otherwise + set worker_is_running to false + + regardless, display the new search results to the user + save the unfiltered_results as a global + update_search() + +on filter click + adjust the filter selection + update_search() + +update_search + apply search filters by looping through the unfiltered_results and finding the first 200 + unique results that match the filters + + Update the DOM +*/ + +/////// SEARCH WORKER /////// + +function worker_function(documenterSearchIndex, documenterBaseURL, filters) { + importScripts( + 
"https://cdn.jsdelivr.net/npm/minisearch@6.1.0/dist/umd/index.min.js" + ); + + let data = documenterSearchIndex.map((x, key) => { + x["id"] = key; // minisearch requires a unique for each object + return x; + }); + + // list below is the lunr 2.1.3 list minus the intersect with names(Base) + // (all, any, get, in, is, only, which) and (do, else, for, let, where, while, with) + // ideally we'd just filter the original list but it's not available as a variable + const stopWords = new Set([ + "a", + "able", + "about", + "across", + "after", + "almost", + "also", + "am", + "among", + "an", + "and", + "are", + "as", + "at", + "be", + "because", + "been", + "but", + "by", + "can", + "cannot", + "could", + "dear", + "did", + "does", + "either", + "ever", + "every", + "from", + "got", + "had", + "has", + "have", + "he", + "her", + "hers", + "him", + "his", + "how", + "however", + "i", + "if", + "into", + "it", + "its", + "just", + "least", + "like", + "likely", + "may", + "me", + "might", + "most", + "must", + "my", + "neither", + "no", + "nor", + "not", + "of", + "off", + "often", + "on", + "or", + "other", + "our", + "own", + "rather", + "said", + "say", + "says", + "she", + "should", + "since", + "so", + "some", + "than", + "that", + "the", + "their", + "them", + "then", + "there", + "these", + "they", + "this", + "tis", + "to", + "too", + "twas", + "us", + "wants", + "was", + "we", + "were", + "what", + "when", + "who", + "whom", + "why", + "will", + "would", + "yet", + "you", + "your", + ]); + + let index = new MiniSearch({ + fields: ["title", "text"], // fields to index for full-text search + storeFields: ["location", "title", "text", "category", "page"], // fields to return with results processTerm: (term) => { let word = stopWords.has(term) ? 
null : term; if (word) { + // custom trimmer that doesn't strip @ and !, which are used in julia macro and function names word = word .replace(/^[^a-zA-Z0-9@!]+/, "") .replace(/[^a-zA-Z0-9@!]+$/, ""); + + word = word.toLowerCase(); } return word ?? null; }, + // add . as a separator, because otherwise "title": "Documenter.Anchors.add!", would not + // find anything if searching for "add!", only for the entire qualification tokenize: (string) => string.split(/[\s\-\.]+/), - }, -}); + // options which will be applied during the search + searchOptions: { + prefix: true, + boost: { title: 100 }, + fuzzy: 2, + }, + }); -index.addAll(data); + index.addAll(data); + + /** + * Used to map characters to HTML entities. + * Refer: https://github.com/lodash/lodash/blob/main/src/escape.ts + */ + const htmlEscapes = { + "&": "&", + "<": "<", + ">": ">", + '"': """, + "'": "'", + }; + + /** + * Used to match HTML entities and HTML characters. + * Refer: https://github.com/lodash/lodash/blob/main/src/escape.ts + */ + const reUnescapedHtml = /[&<>"']/g; + const reHasUnescapedHtml = RegExp(reUnescapedHtml.source); + + /** + * Escape function from lodash + * Refer: https://github.com/lodash/lodash/blob/main/src/escape.ts + */ + function escape(string) { + return string && reHasUnescapedHtml.test(string) + ? string.replace(reUnescapedHtml, (chr) => htmlEscapes[chr]) + : string || ""; + } -let filters = [...new Set(data.map((x) => x.category))]; -var modal_filters = make_modal_body_filters(filters); -var filter_results = []; + /** + * Make the result component given a minisearch result data object and the value + * of the search input as queryString. To view the result object structure, refer: + * https://lucaong.github.io/minisearch/modules/_minisearch_.html#searchresult + * + * @param {object} result + * @param {string} querystring + * @returns string + */ + function make_search_result(result, querystring) { + let search_divider = `
`; + let display_link = + result.location.slice(Math.max(0), Math.min(50, result.location.length)) + + (result.location.length > 30 ? "..." : ""); // To cut-off the link because it messes with the overflow of the whole div + + if (result.page !== "") { + display_link += ` (${result.page})`; + } -$(document).on("keyup", ".documenter-search-input", function (event) { - // Adding a debounce to prevent disruptions from super-speed typing! - debounce(() => update_search(filter_results), 300); + let textindex = new RegExp(`${querystring}`, "i").exec(result.text); + let text = + textindex !== null + ? result.text.slice( + Math.max(textindex.index - 100, 0), + Math.min( + textindex.index + querystring.length + 100, + result.text.length + ) + ) + : ""; // cut-off text before and after from the match + + text = text.length ? escape(text) : ""; + + let display_result = text.length + ? "..." + + text.replace( + new RegExp(`${escape(querystring)}`, "i"), // For first occurrence + '$&' + ) + + "..." + : ""; // highlights the match + + let in_code = false; + if (!["page", "section"].includes(result.category.toLowerCase())) { + in_code = true; + } + + // We encode the full url to escape some special characters which can lead to broken links + let result_div = ` + +
+
${escape(result.title)}
+
${result.category}
+
+

+ ${display_result} +

+
+ ${display_link} +
+
+ ${search_divider} + `; + + return result_div; + } + + self.onmessage = function (e) { + let query = e.data; + let results = index.search(query, { + filter: (result) => { + // Only return relevant results + return result.score >= 1; + }, + }); + + // Pre-filter to deduplicate and limit to 200 per category to the extent + // possible without knowing what the filters are. + let filtered_results = []; + let counts = {}; + for (let filter of filters) { + counts[filter] = 0; + } + let present = {}; + + for (let result of results) { + cat = result.category; + cnt = counts[cat]; + if (cnt < 200) { + id = cat + "---" + result.location; + if (present[id]) { + continue; + } + present[id] = true; + filtered_results.push({ + location: result.location, + category: cat, + div: make_search_result(result, query), + }); + } + } + + postMessage(filtered_results); + }; +} + +// `worker = Threads.@spawn worker_function(documenterSearchIndex)`, but in JavaScript! +const filters = [ + ...new Set(documenterSearchIndex["docs"].map((x) => x.category)), +]; +const worker_str = + "(" + + worker_function.toString() + + ")(" + + JSON.stringify(documenterSearchIndex["docs"]) + + "," + + JSON.stringify(documenterBaseURL) + + "," + + JSON.stringify(filters) + + ")"; +const worker_blob = new Blob([worker_str], { type: "text/javascript" }); +const worker = new Worker(URL.createObjectURL(worker_blob)); + +/////// SEARCH MAIN /////// + +// Whether the worker is currently handling a search. This is a boolean +// as the worker only ever handles 1 or 0 searches at a time. +var worker_is_running = false; + +// The last search text that was sent to the worker. This is used to determine +// if the worker should be launched again when it reports back results. +var last_search_text = ""; + +// The results of the last search. This, in combination with the state of the filters +// in the DOM, is used compute the results to display on calls to update_search. 
+var unfiltered_results = []; + +// Which filter is currently selected +var selected_filter = ""; + +$(document).on("input", ".documenter-search-input", function (event) { + if (!worker_is_running) { + launch_search(); + } }); +function launch_search() { + worker_is_running = true; + last_search_text = $(".documenter-search-input").val(); + worker.postMessage(last_search_text); +} + +worker.onmessage = function (e) { + if (last_search_text !== $(".documenter-search-input").val()) { + launch_search(); + } else { + worker_is_running = false; + } + + unfiltered_results = e.data; + update_search(); +}; + $(document).on("click", ".search-filter", function () { if ($(this).hasClass("search-filter-selected")) { - $(this).removeClass("search-filter-selected"); + selected_filter = ""; } else { - $(this).addClass("search-filter-selected"); + selected_filter = $(this).text().toLowerCase(); } - // Adding a debounce to prevent disruptions from crazy clicking! - debounce(() => get_filters(), 300); + // This updates search results and toggles classes for UI: + update_search(); }); -/** - * A debounce function, takes a function and an optional timeout in milliseconds - * - * @function callback - * @param {number} timeout - */ -function debounce(callback, timeout = 300) { - clearTimeout(timer); - timer = setTimeout(callback, timeout); -} - /** * Make/Update the search component - * - * @param {string[]} selected_filters */ -function update_search(selected_filters = []) { - let initial_search_body = ` -
Type something to get started!
- `; - +function update_search() { let querystring = $(".documenter-search-input").val(); if (querystring.trim()) { - results = index.search(querystring, { - filter: (result) => { - // Filtering results - if (selected_filters.length === 0) { - return result.score >= 1; - } else { - return ( - result.score >= 1 && selected_filters.includes(result.category) - ); - } - }, - }); + if (selected_filter == "") { + results = unfiltered_results; + } else { + results = unfiltered_results.filter((result) => { + return selected_filter == result.category.toLowerCase(); + }); + } let search_result_container = ``; + let modal_filters = make_modal_body_filters(); let search_divider = `
`; if (results.length) { @@ -449,19 +690,23 @@ function update_search(selected_filters = []) { let count = 0; let search_results = ""; - results.forEach(function (result) { - if (result.location) { - // Checking for duplication of results for the same page - if (!links.includes(result.location)) { - search_results += make_search_result(result, querystring); - count++; - } - + for (var i = 0, n = results.length; i < n && count < 200; ++i) { + let result = results[i]; + if (result.location && !links.includes(result.location)) { + search_results += result.div; + count++; links.push(result.location); } - }); + } - let result_count = `
${count} result(s)
`; + if (count == 1) { + count_str = "1 result"; + } else if (count == 200) { + count_str = "200+ results"; + } else { + count_str = count + " results"; + } + let result_count = `
${count_str}
`; search_result_container = `
@@ -490,125 +735,37 @@ function update_search(selected_filters = []) { $(".search-modal-card-body").html(search_result_container); } else { - filter_results = []; - modal_filters = make_modal_body_filters(filters, filter_results); - if (!$(".search-modal-card-body").hasClass("is-justify-content-center")) { $(".search-modal-card-body").addClass("is-justify-content-center"); } - $(".search-modal-card-body").html(initial_search_body); + $(".search-modal-card-body").html(` +
Type something to get started!
+ `); } } /** * Make the modal filter html * - * @param {string[]} filters - * @param {string[]} selected_filters * @returns string */ -function make_modal_body_filters(filters, selected_filters = []) { - let str = ``; - - filters.forEach((val) => { - if (selected_filters.includes(val)) { - str += `${val}`; - } else { - str += `${val}`; - } - }); +function make_modal_body_filters() { + let str = filters + .map((val) => { + if (selected_filter == val.toLowerCase()) { + return `${val}`; + } else { + return `${val}`; + } + }) + .join(""); - let filter_html = ` + return `
Filters: ${str} -
- `; - - return filter_html; -} - -/** - * Make the result component given a minisearch result data object and the value of the search input as queryString. - * To view the result object structure, refer: https://lucaong.github.io/minisearch/modules/_minisearch_.html#searchresult - * - * @param {object} result - * @param {string} querystring - * @returns string - */ -function make_search_result(result, querystring) { - let search_divider = `
`; - let display_link = - result.location.slice(Math.max(0), Math.min(50, result.location.length)) + - (result.location.length > 30 ? "..." : ""); // To cut-off the link because it messes with the overflow of the whole div - - if (result.page !== "") { - display_link += ` (${result.page})`; - } - - let textindex = new RegExp(`\\b${querystring}\\b`, "i").exec(result.text); - let text = - textindex !== null - ? result.text.slice( - Math.max(textindex.index - 100, 0), - Math.min( - textindex.index + querystring.length + 100, - result.text.length - ) - ) - : ""; // cut-off text before and after from the match - - let display_result = text.length - ? "..." + - text.replace( - new RegExp(`\\b${querystring}\\b`, "i"), // For first occurrence - '$&' - ) + - "..." - : ""; // highlights the match - - let in_code = false; - if (!["page", "section"].includes(result.category.toLowerCase())) { - in_code = true; - } - - // We encode the full url to escape some special characters which can lead to broken links - let result_div = ` - -
-
${result.title}
-
${result.category}
-
-

- ${display_result} -

-
- ${display_link} -
-
- ${search_divider} - `; - - return result_div; -} - -/** - * Get selected filters, remake the filter html and lastly update the search modal - */ -function get_filters() { - let ele = $(".search-filters .search-filter-selected").get(); - filter_results = ele.map((x) => $(x).text().toLowerCase()); - modal_filters = make_modal_body_filters(filters, filter_results); - update_search(filter_results); +
`; } }) @@ -635,103 +792,107 @@ $(document).ready(function () { //////////////////////////////////////////////////////////////////////////////// require(['jquery'], function($) { -let search_modal_header = ` - -`; - -let initial_search_body = ` -
Type something to get started!
-`; - -let search_modal_footer = ` - -`; - -$(document.body).append( - ` - + obsfit(algorithm, Obs(), obs(fit, algorithm, data...); verbosity)

Rather, new algorithms should overload obsfit. See also obs.

source
LearnAPI.obsfitFunction
obsfit(algorithm, obsdata; verbosity=1)

A lower-level alternative to fit, this method consumes a pre-processed form of user data. Specifically, the following two code snippets are equivalent:

model = fit(algorithm, data...)

and

obsdata = obs(fit, algorithm, data...)
+model = obsfit(algorithm, obsdata)

Here obsdata is algorithm-specific, "observation-accessible" data, meaning it implements the MLUtils.jl getobs/numobs interface for observation resampling (even if data does not). Moreover, resampled versions of obsdata may be passed to obsfit in its place.

The use of obsfit may offer performance advantages. See more at obs.

See also fit, obs.

Extended help

New implementations

Implementation of the following method signature is compulsory for all new algorithms:

LearnAPI.obsfit(algorithm, obsdata, verbosity)

Here obsdata has the form explained above. If obs(fit, ...) is not being overloaded, then a fallback gives obsdata = data (always a tuple!). Note that verbosity is a positional argument, not a keyword argument in the overloaded signature.

New implementations must also implement LearnAPI.algorithm.

If overloaded, then the functions LearnAPI.obsfit and LearnAPI.fit must be included in the tuple returned by the LearnAPI.functions(algorithm) trait.

Non-generalizing algorithms

If the algorithm does not generalize to new data (e.g., DBSCAN clustering) then data = () and obsfit carries out no computation, as this happens entirely in a transform and/or predict call. In such cases, obsfit(algorithm, ...) may return algorithm, but another possibility is allowed: to provide a mechanism for transform/predict to report byproducts of the computation (e.g., a list of boundary points in DBSCAN clustering), they are allowed to mutate the model object returned by obsfit, which is then arranged to be a mutable struct wrapping algorithm and fields to store the byproducts. In that case, LearnAPI.predict_or_transform_mutates(algorithm) must be overloaded to return true.

source
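The fit/obsfit equivalence documented above can be sketched concretely. In the following, MyRidge and the data names X, y are hypothetical placeholders; only the fit/obs/obsfit call pattern comes from the docstrings:

```julia
using LearnAPI
import MLUtils

# Hypothetical LearnAPI.jl-compliant algorithm:
algorithm = MyRidge(lambda=0.1)

# One-step training on user-provided data:
model = fit(algorithm, X, y)

# Equivalent two-step training via the observation-accessible representation:
obsdata = obs(fit, algorithm, X, y)
model = obsfit(algorithm, obsdata)

# Because `obsdata` supports the MLUtils.jl getobs/numobs interface, a
# subsample can be trained on without repeating any data pre-processing:
model50 = obsfit(algorithm, MLUtils.getobs(obsdata, 1:50))
```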
diff --git a/dev/index.html b/dev/index.html index 9259c6bf..67801b03 100644 --- a/dev/index.html +++ b/dev/index.html @@ -29,4 +29,4 @@ # Recover saved model and algorithm configuration: recovered_model = deserialize("my_random_forest.jls") @assert LearnAPI.algorithm(recovered_model) == forest -@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ

Distribution and LiteralTarget are singleton types owned by LearnAPI.jl. They allow dispatch based on the kind of target proxy, a key LearnAPI.jl concept. LearnAPI.jl places more emphasis on the notion of target variables and target proxies than on the usual supervised/unsupervised learning dichotomy. From this point of view, a supervised algorithm is simply one in which a target variable exists, and happens to appear as an input to training but not to prediction.

In LearnAPI.jl, a method called obs gives users access to an "internal", algorithm-specific, representation of input data, which is always "observation-accessible", in the sense that it can be resampled using the MLUtils.jl getobs/numobs interface. The implementation can arrange for this resampling to be efficient, and workflows based on obs can have performance benefits.

Learning more

+@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ

Distribution and LiteralTarget are singleton types owned by LearnAPI.jl. They allow dispatch based on the kind of target proxy, a key LearnAPI.jl concept. LearnAPI.jl places more emphasis on the notion of target variables and target proxies than on the usual supervised/unsupervised learning dichotomy. From this point of view, a supervised algorithm is simply one in which a target variable exists, and happens to appear as an input to training but not to prediction.

In LearnAPI.jl, a method called obs gives users access to an "internal", algorithm-specific, representation of input data, which is always "observation-accessible", in the sense that it can be resampled using the MLUtils.jl getobs/numobs interface. The implementation can arrange for this resampling to be efficient, and workflows based on obs can have performance benefits.
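As a sketch of the kind of workflow this enables (all names here are illustrative placeholders, not part of the API):

```julia
import MLUtils

# Pre-process the data once:
obsdata = obs(fit, algorithm, X, y)
n = MLUtils.numobs(obsdata)

# Resample the internal representation cheaply, e.g., for a holdout split,
# instead of re-processing the raw user data each time:
train = 1:floor(Int, 0.7n)
model = obsfit(algorithm, MLUtils.getobs(obsdata, train))
```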

Learning more

diff --git a/dev/kinds_of_target_proxy/index.html b/dev/kinds_of_target_proxy/index.html index 8c6fe06b..ead429d3 100644 --- a/dev/kinds_of_target_proxy/index.html +++ b/dev/kinds_of_target_proxy/index.html @@ -1,2 +1,2 @@ -Kinds of Target Proxy · LearnAPI.jl

Kinds of Target Proxy

The available kinds of target proxy are classified by subtypes of LearnAPI.KindOfProxy. These types are intended for dispatch only and have no fields.

LearnAPI.KindOfProxyType
LearnAPI.KindOfProxy

Abstract type whose concrete subtypes T each represent a different kind of proxy for some target variable, associated with some algorithm. Instances T() are used to request the form of target predictions in predict calls.

See LearnAPI.jl documentation for an explanation of "targets" and "target proxies".

For example, Distribution is a concrete subtype of LearnAPI.KindOfProxy and a call like predict(model, Distribution(), Xnew) returns a data object whose observations are probability density/mass functions, assuming algorithm supports predictions of that form.

Run LearnAPI.CONCRETE_TARGET_PROXY_TYPES to list all options.

source
LearnAPI.IIDType
LearnAPI.IID <: LearnAPI.KindOfProxy

Abstract subtype of LearnAPI.KindOfProxy. If kind_of_proxy is an instance of LearnAPI.IID then, given data consisting of $n$ observations, the following must hold:

  • ŷ = LearnAPI.predict(model, kind_of_proxy, data...) is data also consisting of $n$ observations.

  • The $j$th observation of ŷ, for any $j$, depends only on the $j$th observation of the provided data (no correlation between observations).

See also LearnAPI.KindOfProxy.

source

Simple target proxies (subtypes of LearnAPI.IID)

type | form of an observation
LearnAPI.LiteralTarget | same as target observations
LearnAPI.Sampleable | object that can be sampled to obtain object of the same form as target observation
LearnAPI.Distribution | explicit probability density/mass function whose sample space is all possible target observations
LearnAPI.LogDistribution | explicit log-probability density/mass function whose sample space is possible target observations
LearnAPI.Probability | numerical probability or probability vector
LearnAPI.LogProbability | log-probability or log-probability vector
LearnAPI.Parametric | a list of parameters (e.g., mean and variance) describing some distribution
LearnAPI.LabelAmbiguous | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering
LearnAPI.LabelAmbiguousSampleable | sampleable version of LabelAmbiguous; see Sampleable above
LearnAPI.LabelAmbiguousDistribution | pdf/pmf version of LabelAmbiguous; see Distribution above
LearnAPI.ConfidenceInterval | confidence interval
LearnAPI.Set | finite but possibly varying number of target observations
LearnAPI.ProbabilisticSet | as for Set but labeled with probabilities (not necessarily summing to one)
LearnAPI.SurvivalFunction | survival function
LearnAPI.SurvivalDistribution | probability distribution for survival time
LearnAPI.OutlierScore | numerical score reflecting degree of outlierness (not necessarily normalized)
LearnAPI.Continuous | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls)

† Provided for completeness but discouraged to avoid ambiguities in representation.

Table of concrete subtypes of LearnAPI.IID <: LearnAPI.KindOfProxy.

When the proxy for the target is a single object

In the following table of subtypes T <: LearnAPI.KindOfProxy not falling under the IID umbrella, it is understood that predict(model, ::T, ...) is not divided into individual observations, but represents a single probability distribution for the sample space $Y^n$, where $Y$ is the space in which the target variable takes its values, and $n$ is the number of observations in data.

type T | form of output of predict(model, ::T, data...)
LearnAPI.JointSampleable | object that can be sampled to obtain a vector whose elements have the form of target observations; the vector length matches the number of observations in data.
LearnAPI.JointDistribution | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data
LearnAPI.JointLogDistribution | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data

Table of LearnAPI.KindOfProxy subtypes not subtyping LearnAPI.IID

+Kinds of Target Proxy · LearnAPI.jl

Kinds of Target Proxy

The available kinds of target proxy are classified by subtypes of LearnAPI.KindOfProxy. These types are intended for dispatch only and have no fields.

LearnAPI.KindOfProxyType
LearnAPI.KindOfProxy

Abstract type whose concrete subtypes T each represent a different kind of proxy for some target variable, associated with some algorithm. Instances T() are used to request the form of target predictions in predict calls.

See LearnAPI.jl documentation for an explanation of "targets" and "target proxies".

For example, Distribution is a concrete subtype of LearnAPI.KindOfProxy and a call like predict(model, Distribution(), Xnew) returns a data object whose observations are probability density/mass functions, assuming algorithm supports predictions of that form.

Run LearnAPI.CONCRETE_TARGET_PROXY_TYPES to list all options.

source
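For instance, assuming algorithm is some probabilistic classifier supporting both of the proxies below (always check LearnAPI.kinds_of_proxy(algorithm)), prediction requests might look like this sketch, in which algorithm, X, y, and Xnew are placeholders:

```julia
model = fit(algorithm, X, y)

predict(model, LiteralTarget(), Xnew)   # predictions with the form of target observations
predict(model, Distribution(), Xnew)    # probability mass/density functions
```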
LearnAPI.IIDType
LearnAPI.IID <: LearnAPI.KindOfProxy

Abstract subtype of LearnAPI.KindOfProxy. If kind_of_proxy is an instance of LearnAPI.IID then, given data consisting of $n$ observations, the following must hold:

  • ŷ = LearnAPI.predict(model, kind_of_proxy, data...) is data also consisting of $n$ observations.

  • The $j$th observation of ŷ, for any $j$, depends only on the $j$th observation of the provided data (no correlation between observations).

See also LearnAPI.KindOfProxy.

source

Simple target proxies (subtypes of LearnAPI.IID)

type | form of an observation
LearnAPI.LiteralTarget | same as target observations
LearnAPI.Sampleable | object that can be sampled to obtain object of the same form as target observation
LearnAPI.Distribution | explicit probability density/mass function whose sample space is all possible target observations
LearnAPI.LogDistribution | explicit log-probability density/mass function whose sample space is possible target observations
LearnAPI.Probability | numerical probability or probability vector
LearnAPI.LogProbability | log-probability or log-probability vector
LearnAPI.Parametric | a list of parameters (e.g., mean and variance) describing some distribution
LearnAPI.LabelAmbiguous | collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., clustering
LearnAPI.LabelAmbiguousSampleable | sampleable version of LabelAmbiguous; see Sampleable above
LearnAPI.LabelAmbiguousDistribution | pdf/pmf version of LabelAmbiguous; see Distribution above
LearnAPI.ConfidenceInterval | confidence interval
LearnAPI.Set | finite but possibly varying number of target observations
LearnAPI.ProbabilisticSet | as for Set but labeled with probabilities (not necessarily summing to one)
LearnAPI.SurvivalFunction | survival function
LearnAPI.SurvivalDistribution | probability distribution for survival time
LearnAPI.OutlierScore | numerical score reflecting degree of outlierness (not necessarily normalized)
LearnAPI.Continuous | real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls)

† Provided for completeness but discouraged to avoid ambiguities in representation.

Table of concrete subtypes of LearnAPI.IID <: LearnAPI.KindOfProxy.

When the proxy for the target is a single object

In the following table of subtypes T <: LearnAPI.KindOfProxy not falling under the IID umbrella, it is understood that predict(model, ::T, ...) is not divided into individual observations, but represents a single probability distribution for the sample space $Y^n$, where $Y$ is the space in which the target variable takes its values, and $n$ is the number of observations in data.

type T | form of output of predict(model, ::T, data...)
LearnAPI.JointSampleable | object that can be sampled to obtain a vector whose elements have the form of target observations; the vector length matches the number of observations in data.
LearnAPI.JointDistribution | explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data
LearnAPI.JointLogDistribution | explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data

Table of LearnAPI.KindOfProxy subtypes not subtyping LearnAPI.IID

diff --git a/dev/minimize/index.html b/dev/minimize/index.html index f0b3a7ec..b94358ca 100644 --- a/dev/minimize/index.html +++ b/dev/minimize/index.html @@ -15,4 +15,4 @@ transform(minimize(model; options...), args...; kwargs...) == transform(model, args...; kwargs...) inverse_transform(minimize(model; options), args...; kwargs...) == - inverse_transform(model, args...; kwargs...)

Additionally:

minimize(minimize(model)) == minimize(model)
source + inverse_transform(model, args...; kwargs...)

Additionally:

minimize(minimize(model)) == minimize(model)
source diff --git a/dev/objects.inv b/dev/objects.inv new file mode 100644 index 0000000000000000000000000000000000000000..7dbfe04196d37c39846582860810a205fcbbf8b4 GIT binary patch literal 1931 zcmV;62Xy!&AX9K?X>NERX>N99Zgg*Qc_4OWa&u{KZXhxWBOp+6Z)#;@bUGkRWnpq| zK~PC9YHSK4AXa5^b7^mGIv_AEF)lC)BOp|0Wgv28ZDDC{WMy(7Z)PBLXlZjGW@&6? zAZc?TV{dJ6a%FRKWn>_Ab7^j8AbMx$WM_-BU(f}KV&Rtg{GQ|BBUOpTj6WBi$w~Ac+=WNlBGW}CPFA`-`XexJ1M+YgMP9aw>!3zIjmW}cUyNvJ5GD^7)dEx8I{6PoF5+3VFOIoN@ z$ULMv+g#O4udrWx1y&|tLIwsNnL;@)c+S7J46N5y{u2^g6;i<0mHC5c95bbajMB1* zjmuhO=eywVY7M`y*AN_z>lUM10pe^YB-i^KKiCx%i#!+IAVvqOHIr@KR|FL1=kbwW}SwZxk5i5g0j@oaIif{#XDbSuVBDSBw=C+ppTLl`8SPX)& zI~s_!-zw!Cj)I+7?8Ksn(}Bk{BacFUNi*@N+KNoZ8$%ow_Ea5+=r_NCi8a-ffEU0R z#pIDTSEirS&|9yC^k{!NFnOSo=8y(*jQ>NM5^aiDx?IeNzg%h{L3l$Pm~E5; zi1M5jNCEj7Obx)IXYBz-Bjy|$y=2mNg3-#>I+x={*9D=;k%DUypXf8(ab4h=&1XKk zr$*m^7!y6a-_Q#PW?#PdYMd@y_FhLC@>|V>g#J?d0PyDE(`l}x}GD-#=;7J^zb(Va7r7|5&5{B_bfeOy;RzQ@%%Hc02Cq5A`iK_$_Tf`Hkj zIcIM!HQ8LT;E<(GAQeLxM{XO`Iw3VR3{2(@@M8g+2VmZ4n*m2Q1A1!GhjD=n7#NoC$^Z?a(mqFprT;33fCwz#969_o4<8X`E zb0F7sk#K1%<5{2BhS$7VJGH8^BJoWL& zWF}}rw}6z@6}v2j*9<%X9hr-nw>)a~_gALv?Ygpw$)9^P_*k)&ITx4Pt6S_HA398Ju2Ry?F=!uimV$UjKac)AtqS z)$3nul){c^^nQokPzRKPj;OlNH<0n{Nssm=_a%VA(KKQO%}$CB1B;^rmpD3yC2kSd z%c%rNxRQYG&3-(im9unR#IuXx9Sd>my^t)XZcds8t+b);B32*>RfCsF;?4Qs z*4)9t(aqNHq2cX%ynC9=-?J|2CMy-#%H*&U;()|uP6AUw#393EO1pzcHK>^3d{$jJ zuLT6H!MBB}HRKaZ%CR9esVO5_Q(6e9m$4Ust`CRoRKrUb2kcDrjE<`oZZW3|61EFz zw_f*FL_K9zvnY9~%B}4viAaf^d#~>{-)%PQ4y+8cnEgTlb%w+m4024JsTli{n0~c& z5_HS-UL@lq&l-$9LY6*^HhL9RJ+%D(DEc*Kb2T9GoUOKIJ1CyV&ZYyHD?TlM7eLT6QXamL}AfQnL^ z@XWQUdBANG_YT7WA0U9?uxds|-GIZ>xQDRC2^Yp}LGqn>W&y8FzXC4!C6QiqL%!^w zJ&&PO&9%ukwx9y=ZSmpuWe(_8(0umU=F-N3rBK27B`#|M^{jZCDaOoZ`#HqW8B4l6 zxPE4MRua#($o%BdCaYSjfW|y2eN6A_o3-FtcHO?f)o_fuyG+#v#%x9)Bgol zE=Wm(y9+8qWt&iHIf{0}Iah7MR4n;H3poovv}DWA#)~79iUmKv0(rJ literal 0 HcmV?d00001 diff --git a/dev/obs/index.html b/dev/obs/index.html index cfb1ec61..aff3f7d1 100644 --- a/dev/obs/index.html +++ b/dev/obs/index.html @@ -71,4 
+71,4 @@ data.verbosity > 0 && @info "Training using these features: names." <construct final `model` using `coremodel`> return model -end

When is overloading obs optional?

Overloading obs is optional, for a given typeof(algorithm) and typeof(fun), if the components of data in the standard call fun(algorithm_or_model, data...) are already expected to separately implement the getobs/numobs interface. This is true for arrays whose last dimension is the observation dimension, and for suitable tables.

source +end

When is overloading obs optional?

Overloading obs is optional, for a given typeof(algorithm) and typeof(fun), if the components of data in the standard call fun(algorithm_or_model, data...) are already expected to separately implement the getobs/numobs interface. This is true for arrays whose last dimension is the observation dimension, and for suitable tables.
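For example (an illustrative sketch), a matrix whose last dimension is the observation dimension already implements the interface, so the no-op obs fallback suffices:

```julia
import MLUtils

X = rand(3, 100)               # 3 features, 100 observations
MLUtils.numobs(X)              # 100 (counts along the last dimension)
MLUtils.getobs(X, 1:10)        # 3×10 array: the first ten observations
```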

source diff --git a/dev/patterns/classification/index.html b/dev/patterns/classification/index.html index 2cdbe1fe..04fb8d2a 100644 --- a/dev/patterns/classification/index.html +++ b/dev/patterns/classification/index.html @@ -1,2 +1,2 @@ -Classification · LearnAPI.jl
+Classification · LearnAPI.jl diff --git a/dev/patterns/clusterering/index.html b/dev/patterns/clusterering/index.html index cf34580e..0cbbf8bb 100644 --- a/dev/patterns/clusterering/index.html +++ b/dev/patterns/clusterering/index.html @@ -1,2 +1,2 @@ -Clusterering · LearnAPI.jl
+Clusterering · LearnAPI.jl
diff --git a/dev/patterns/dimension_reduction/index.html b/dev/patterns/dimension_reduction/index.html index 0e45a53b..447de453 100644 --- a/dev/patterns/dimension_reduction/index.html +++ b/dev/patterns/dimension_reduction/index.html @@ -1,2 +1,2 @@ -Dimension Reduction · LearnAPI.jl +Dimension Reduction · LearnAPI.jl diff --git a/dev/patterns/incremental_algorithms/index.html b/dev/patterns/incremental_algorithms/index.html index 9b260462..b0371739 100644 --- a/dev/patterns/incremental_algorithms/index.html +++ b/dev/patterns/incremental_algorithms/index.html @@ -1,2 +1,2 @@ -Incremental Models · LearnAPI.jl +Incremental Models · LearnAPI.jl diff --git a/dev/patterns/incremental_models/index.html b/dev/patterns/incremental_models/index.html index b0838852..14de81e2 100644 --- a/dev/patterns/incremental_models/index.html +++ b/dev/patterns/incremental_models/index.html @@ -1,2 +1,2 @@ -Incremental Algorithms · LearnAPI.jl +Incremental Algorithms · LearnAPI.jl diff --git a/dev/patterns/iterative_algorithms/index.html b/dev/patterns/iterative_algorithms/index.html index 655c52a0..3d9337fb 100644 --- a/dev/patterns/iterative_algorithms/index.html +++ b/dev/patterns/iterative_algorithms/index.html @@ -1,2 +1,2 @@ -Iterative Algorithms · LearnAPI.jl +Iterative Algorithms · LearnAPI.jl diff --git a/dev/patterns/learning_a_probability_distribution/index.html b/dev/patterns/learning_a_probability_distribution/index.html index b4547ef2..71f0210a 100644 --- a/dev/patterns/learning_a_probability_distribution/index.html +++ b/dev/patterns/learning_a_probability_distribution/index.html @@ -1,2 +1,2 @@ -Learning a Probability Distribution · LearnAPI.jl +Learning a Probability Distribution · LearnAPI.jl diff --git a/dev/patterns/missing_value_imputation/index.html b/dev/patterns/missing_value_imputation/index.html index a10a500b..08a51d6a 100644 --- a/dev/patterns/missing_value_imputation/index.html +++ b/dev/patterns/missing_value_imputation/index.html @@ -1,2 +1,2 @@ -Missing 
Value Imputation · LearnAPI.jl +Missing Value Imputation · LearnAPI.jl diff --git a/dev/patterns/outlier_detection/index.html b/dev/patterns/outlier_detection/index.html index c626bf0a..5047d2bf 100644 --- a/dev/patterns/outlier_detection/index.html +++ b/dev/patterns/outlier_detection/index.html @@ -1,2 +1,2 @@ -Outlier Detection · LearnAPI.jl +Outlier Detection · LearnAPI.jl diff --git a/dev/patterns/regression/index.html b/dev/patterns/regression/index.html index 7bfbb6fd..f682425e 100644 --- a/dev/patterns/regression/index.html +++ b/dev/patterns/regression/index.html @@ -1,2 +1,2 @@ -Regression · LearnAPI.jl
+Regression · LearnAPI.jl
diff --git a/dev/patterns/static_algorithms/index.html b/dev/patterns/static_algorithms/index.html index 85a65627..abb9629b 100644 --- a/dev/patterns/static_algorithms/index.html +++ b/dev/patterns/static_algorithms/index.html @@ -1,2 +1,2 @@ -Static Algorithms · LearnAPI.jl
+Static Algorithms · LearnAPI.jl diff --git a/dev/patterns/supervised_bayesian_algorithms/index.html b/dev/patterns/supervised_bayesian_algorithms/index.html index b53e8f6b..7c30c8d4 100644 --- a/dev/patterns/supervised_bayesian_algorithms/index.html +++ b/dev/patterns/supervised_bayesian_algorithms/index.html @@ -1,2 +1,2 @@ -Supervised Bayesian Models · LearnAPI.jl +Supervised Bayesian Models · LearnAPI.jl diff --git a/dev/patterns/supervised_bayesian_models/index.html b/dev/patterns/supervised_bayesian_models/index.html index 4e61bf7b..c614c99e 100644 --- a/dev/patterns/supervised_bayesian_models/index.html +++ b/dev/patterns/supervised_bayesian_models/index.html @@ -1,2 +1,2 @@ -Supervised Bayesian Algorithms · LearnAPI.jl +Supervised Bayesian Algorithms · LearnAPI.jl diff --git a/dev/patterns/survival_analysis/index.html b/dev/patterns/survival_analysis/index.html index c1c63cf6..aa09431b 100644 --- a/dev/patterns/survival_analysis/index.html +++ b/dev/patterns/survival_analysis/index.html @@ -1,2 +1,2 @@ -Survival Analysis · LearnAPI.jl +Survival Analysis · LearnAPI.jl diff --git a/dev/patterns/time_series_classification/index.html b/dev/patterns/time_series_classification/index.html index 002324c0..ba12f8ac 100644 --- a/dev/patterns/time_series_classification/index.html +++ b/dev/patterns/time_series_classification/index.html @@ -1,2 +1,2 @@ -Time Series Classification · LearnAPI.jl +Time Series Classification · LearnAPI.jl diff --git a/dev/patterns/time_series_forecasting/index.html b/dev/patterns/time_series_forecasting/index.html index 1df98bae..996daebf 100644 --- a/dev/patterns/time_series_forecasting/index.html +++ b/dev/patterns/time_series_forecasting/index.html @@ -1,2 +1,2 @@ -Time Series Forecasting · LearnAPI.jl +Time Series Forecasting · LearnAPI.jl diff --git a/dev/predict_transform/index.html b/dev/predict_transform/index.html index 7cfac2f3..75802ebe 100644 --- a/dev/predict_transform/index.html +++ b/dev/predict_transform/index.html @@ 
-20,19 +20,19 @@ ŷ = obspredict(model, LiteralTarget(), predictdata)

Implementation guide

The methods predict and transform are not directly overloaded. Implement obspredict and obstransform instead:

method | compulsory? | fallback | requires
obspredict | no | none | fit
obstransform | no | none | fit
inverse_transform | no | none | fit, obstransform

Predict or transform?

If the algorithm has a notion of target variable, then arrange for obspredict to output each supported kind of target proxy (LiteralTarget(), Distribution(), etc).

For output not associated with a target variable, implement obstransform instead, which does not dispatch on LearnAPI.KindOfProxy, but can be optionally paired with an implementation of inverse_transform for returning (approximate) right inverses to transform.

Reference

LearnAPI.predictFunction
predict(model, kind_of_proxy::LearnAPI.KindOfProxy, data...)
 predict(model, data...)

The first signature returns target or target proxy predictions for input features data, according to some model returned by fit or obsfit. Where supported, these are literally target predictions if kind_of_proxy = LiteralTarget(), and probability density/mass functions if kind_of_proxy = Distribution(). List all options with LearnAPI.kinds_of_proxy(algorithm), where algorithm = LearnAPI.algorithm(model).

The shortcut predict(model, data...) = predict(model, LiteralTarget(), data...) is also provided.

Arguments

  • model is anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.

  • data: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.

Example

In the following, algorithm is some supervised learning algorithm with training features X, training target y, and test features Xnew:

model = fit(algorithm, X, y; verbosity=0)
 predict(model, LiteralTarget(), Xnew)

Note predict does not mutate any argument, except in the special case LearnAPI.predict_or_transform_mutates(algorithm) = true.

See also obspredict, fit, transform, inverse_transform.

Extended help

New implementations

LearnAPI.jl provides the following definition of predict which is never to be directly overloaded:

predict(model, kop::LearnAPI.KindOfProxy, data...) =
-    obspredict(model, kop, obs(predict, LearnAPI.algorithm(model), data...))

Rather, new algorithms overload obspredict.

source
LearnAPI.obspredictFunction
obspredict(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)

Similar to predict but consumes algorithm-specific representations of input data, obsdata, as returned by obs(predict, algorithm, data...). Here data... is the form of data expected in the main predict method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).

For some algorithms and workflows, obspredict will have a performance benefit over predict. See more at obs.

Example

In the following, algorithm is some supervised learning algorithm with training features X, training target y, and test features Xnew:

model = fit(algorithm, X, y)
+    obspredict(model, kop, obs(predict, LearnAPI.algorithm(model), data...))

Rather, new algorithms overload obspredict.

source
LearnAPI.obspredictFunction
obspredict(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)

Similar to predict but consumes algorithm-specific representations of input data, obsdata, as returned by obs(predict, algorithm, data...). Here data... is the form of data expected in the main predict method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).

For some algorithms and workflows, obspredict will have a performance benefit over predict. See more at obs.

Example

In the following, algorithm is some supervised learning algorithm with training features X, training target y, and test features Xnew:

model = fit(algorithm, X, y)
 obsdata = obs(predict, algorithm, Xnew)
 ŷ = obspredict(model, LiteralTarget(), obsdata)
-@assert ŷ == predict(model, LiteralTarget(), Xnew)
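The doc string notes that obsdata always supports resampling via MLUtils.getobs. The following sketch is a hypothetical continuation of the example above (it assumes MLUtils is available and that Xnew has at least ten observations) and predicts on a subset without re-running the data front end:

```julia
using MLUtils

obsdata = obs(predict, algorithm, Xnew)

# Resample the algorithm-specific representation directly; `getobs` on
# `obsdata` is always supported, per the obspredict contract:
subset = MLUtils.getobs(obsdata, 1:10)

# Predictions for the first ten observations only:
ŷ₁₀ = obspredict(model, LiteralTarget(), subset)
```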

See also predict, fit, transform, inverse_transform, obs.

Extended help

New implementations

Implementation of obspredict is optional, but required to enable predict. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard predict call expects, as in the call predict(model, kind_of_proxy, data...). Note data is always a tuple, even if predict has only one data argument. See more at obs.

If LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obspredict may mutate its first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform, or inverse_transform. This is necessary for some non-generalizing algorithms but is otherwise discouraged. See more at fit.

If overloaded, you must include both LearnAPI.obspredict and LearnAPI.predict in the list of methods returned by the LearnAPI.functions trait.

An implementation is provided for each kind of target proxy you wish to support. See the LearnAPI.jl documentation for options. Each supported kind_of_proxy instance should be listed in the return value of the LearnAPI.kinds_of_proxy(algorithm) trait.

If, additionally, minimize(model) is overloaded, then the following identity must hold:

obspredict(minimize(model), args...) = obspredict(model, args...)
source
LearnAPI.transformFunction
transform(model, data...)

Return a transformation of some data, using some model, as returned by fit.

Arguments

  • model is anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.

  • data: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.

Example

Here X and Xnew are data of the same form:

# For an algorithm that generalizes to new data ("learns"):
+@assert ŷ == predict(model, LiteralTarget(), Xnew)

See also predict, fit, transform, inverse_transform, obs.

Extended help

New implementations

Implementation of obspredict is optional, but required to enable predict. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard predict call expects, as in the call predict(model, kind_of_proxy, data...). Note data is always a tuple, even if predict has only one data argument. See more at obs.

If LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obspredict may mutate its first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform, or inverse_transform. This is necessary for some non-generalizing algorithms but is otherwise discouraged. See more at fit.

If overloaded, you must include both LearnAPI.obspredict and LearnAPI.predict in the list of methods returned by the LearnAPI.functions trait.

An implementation is provided for each kind of target proxy you wish to support. See the LearnAPI.jl documentation for options. Each supported kind_of_proxy instance should be listed in the return value of the LearnAPI.kinds_of_proxy(algorithm) trait.

If, additionally, minimize(model) is overloaded, then the following identity must hold:

obspredict(minimize(model), args...) = obspredict(model, args...)
source
LearnAPI.transformFunction
transform(model, data...)

Return a transformation of some data, using some model, as returned by fit.

Arguments

  • model is anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.

  • data: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.

Example

Here X and Xnew are data of the same form:

# For an algorithm that generalizes to new data ("learns"):
 model = fit(algorithm, X; verbosity=0)
 transform(model, Xnew)
 
 # For a static (non-generalizing) transformer:
 model = fit(algorithm)
 transform(model, X)

Note transform does not mutate any argument, except in the special case LearnAPI.predict_or_transform_mutates(algorithm) = true.

See also obstransform, fit, predict, inverse_transform.

Extended help

New implementations

LearnAPI.jl provides the following definition of transform which is never to be directly overloaded:

transform(model, data...) =
-    obstransform(model, obs(transform, LearnAPI.algorithm(model), data...))

Rather, new algorithms overload obstransform.

source
LearnAPI.obstransformFunction
obstransform(model, obsdata)

Similar to transform but consumes algorithm-specific representations of input data, obsdata, as returned by obs(transform, algorithm, data...). Here data... is the form of data expected in the main transform method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).

For some algorithms and workflows, obstransform will have a performance benefit over transform. See more at obs.

Example

In the following, algorithm is some unsupervised learning algorithm with training features X and test features Xnew:

model = fit(algorithm, X)
+    obstransform(model, obs(transform, LearnAPI.algorithm(model), data...))

Rather, new algorithms overload obstransform.

source
LearnAPI.obstransformFunction
obstransform(model, obsdata)

Similar to transform but consumes algorithm-specific representations of input data, obsdata, as returned by obs(transform, algorithm, data...). Here data... is the form of data expected in the main transform method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).

For some algorithms and workflows, obstransform will have a performance benefit over transform. See more at obs.

Example

In the following, algorithm is some unsupervised learning algorithm with training features X and test features Xnew:

model = fit(algorithm, X)
 obsdata = obs(transform, algorithm, Xnew)
 W = obstransform(model, obsdata)
-@assert W == transform(model, Xnew)

See also transform, fit, predict, inverse_transform, obs.

Extended help

New implementations

Implementation of obstransform is optional, but required to enable transform. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard transform call expects, as in the call transform(model, data...). Note data is always a tuple, even if transform has only one data argument. See more at obs.

If LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obstransform may mutate its first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform, or inverse_transform. This is necessary for some non-generalizing algorithms but is otherwise discouraged. See more at fit.

If overloaded, you must include both LearnAPI.obstransform and LearnAPI.transform in the list of methods returned by the LearnAPI.functions trait.

Each supported kind_of_proxy should be listed in the return value of the LearnAPI.kinds_of_proxy(algorithm) trait.

If, additionally, minimize(model) is overloaded, then the following identity must hold:

obstransform(minimize(model), args...) = obstransform(model, args...)
source
LearnAPI.inverse_transformFunction
inverse_transform(model, data)

Inverse transform data according to some model returned by fit. Here "inverse" is to be understood broadly, e.g., as an approximate right inverse for transform.

Arguments

  • model: anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.

  • data: something having the same form as the output of transform(model, inputs...)

Example

In the following, algorithm is some dimension-reducing algorithm that generalizes to new data (such as PCA); Xtrain is the training input and Xnew the input to be reduced:

model = fit(algorithm, Xtrain; verbosity=0)
+@assert W == transform(model, Xnew)

See also transform, fit, predict, inverse_transform, obs.

Extended help

New implementations

Implementation of obstransform is optional, but required to enable transform. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard transform call expects, as in the call transform(model, data...). Note data is always a tuple, even if transform has only one data argument. See more at obs.

If LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obstransform may mutate its first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform, or inverse_transform. This is necessary for some non-generalizing algorithms but is otherwise discouraged. See more at fit.

If overloaded, you must include both LearnAPI.obstransform and LearnAPI.transform in the list of methods returned by the LearnAPI.functions trait.

Each supported kind_of_proxy should be listed in the return value of the LearnAPI.kinds_of_proxy(algorithm) trait.

If, additionally, minimize(model) is overloaded, then the following identity must hold:

obstransform(minimize(model), args...) = obstransform(model, args...)
source
LearnAPI.inverse_transformFunction
inverse_transform(model, data)

Inverse transform data according to some model returned by fit. Here "inverse" is to be understood broadly, e.g., as an approximate right inverse for transform.

Arguments

  • model: anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.

  • data: something having the same form as the output of transform(model, inputs...)

Example

In the following, algorithm is some dimension-reducing algorithm that generalizes to new data (such as PCA); Xtrain is the training input and Xnew the input to be reduced:

model = fit(algorithm, Xtrain; verbosity=0)
 W = transform(model, Xnew)       # reduced version of `Xnew`
-Ŵ = inverse_transform(model, W)  # embedding of `W` in original space

See also fit, transform, predict.

Extended help

New implementations

Implementation is optional. If implemented, you must include inverse_transform in the tuple returned by the LearnAPI.functions trait.

If, additionally, minimize(model) is overloaded, then the following identity must hold:

inverse_transform(minimize(model), args...) = inverse_transform(model, args...)
source
+Ŵ = inverse_transform(model, W) # embedding of `W` in original space

See also fit, transform, predict.

Extended help

New implementations

Implementation is optional. If implemented, you must include inverse_transform in the tuple returned by the LearnAPI.functions trait.

If, additionally, minimize(model) is overloaded, then the following identity must hold:

inverse_transform(minimize(model), args...) = inverse_transform(model, args...)
source diff --git a/dev/reference/index.html b/dev/reference/index.html index 95ef73e5..dd573ed1 100644 --- a/dev/reference/index.html +++ b/dev/reference/index.html @@ -5,4 +5,4 @@ l2_regularization::T end GradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) = - GradientRidgeRegressor(learning_rate, epochs, l2_regularization)

The same is not true if we make this a mutable struct. In that case we will need to appropriately overload Base.== for GradientRidgeRegressor.

Methods

Only these method names are exported: fit, obsfit, predict, obspredict, transform, obstransform, inverse_transform, minimize, and obs. All new implementations must implement obsfit, the accessor function LearnAPI.algorithm, and the trait LearnAPI.functions.

  • fit/obsfit: for training algorithms that generalize to new data

  • predict/obspredict: for outputting targets or target proxies (such as probability density functions)

  • transform/obstransform: similar to predict, but for arbitrary kinds of output, and which can be paired with an inverse_transform method

  • inverse_transform: for inverting the output of transform ("inverting" broadly understood)

  • minimize: for stripping the model returned by fit of inessential content, for purposes of serialization.

  • obs: a method for exposing to the user "optimized", algorithm-specific representations of data, which can be passed to obsfit, obspredict or obstransform, but which can also be efficiently resampled using the getobs/numobs interface provided by MLUtils.jl.

  • Accessor functions: include things like feature_importances and training_losses, for extracting, from training outcomes, information common to many algorithms.

  • Algorithm traits: special methods that promise specific algorithm behavior or for recording general information about the algorithm. The only universally compulsory trait is LearnAPI.functions(algorithm), which returns a list of the explicitly overloaded non-trait methods.


¹ We acknowledge users may not like this terminology, and may know "algorithm" by some other name, such as "strategy", "options", "hyperparameter set", "configuration", or "model". Consensus on this point is difficult; see, e.g., this Julia Discourse discussion.

+ GradientRidgeRegressor(learning_rate, epochs, l2_regularization)

The same is not true if we make this a mutable struct. In that case we will need to appropriately overload Base.== for GradientRidgeRegressor.
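For example, a mutable variant of the struct (fields assumed to match the keyword constructor shown above) might be paired with a hand-written Base.== along the following lines — a sketch, not part of the LearnAPI.jl specification:

```julia
mutable struct GradientRidgeRegressor{T<:Real}
    learning_rate::T
    epochs::Int
    l2_regularization::T
end

# For mutable structs the `==` fallback is `===`, i.e. object identity,
# so two instances with identical hyperparameters would otherwise
# compare as unequal. Define field-wise equality explicitly:
Base.:(==)(a::GradientRidgeRegressor, b::GradientRidgeRegressor) =
    a.learning_rate == b.learning_rate &&
    a.epochs == b.epochs &&
    a.l2_regularization == b.l2_regularization
```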

Methods

Only these method names are exported: fit, obsfit, predict, obspredict, transform, obstransform, inverse_transform, minimize, and obs. All new implementations must implement obsfit, the accessor function LearnAPI.algorithm, and the trait LearnAPI.functions.

  • fit/obsfit: for training algorithms that generalize to new data

  • predict/obspredict: for outputting targets or target proxies (such as probability density functions)

  • transform/obstransform: similar to predict, but for arbitrary kinds of output, and which can be paired with an inverse_transform method

  • inverse_transform: for inverting the output of transform ("inverting" broadly understood)

  • minimize: for stripping the model returned by fit of inessential content, for purposes of serialization.

  • obs: a method for exposing to the user "optimized", algorithm-specific representations of data, which can be passed to obsfit, obspredict or obstransform, but which can also be efficiently resampled using the getobs/numobs interface provided by MLUtils.jl.

  • Accessor functions: include things like feature_importances and training_losses, for extracting, from training outcomes, information common to many algorithms.

  • Algorithm traits: special methods that promise specific algorithm behavior or for recording general information about the algorithm. The only universally compulsory trait is LearnAPI.functions(algorithm), which returns a list of the explicitly overloaded non-trait methods.
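Putting the exported methods above together, a typical user-facing workflow might look like the following sketch, which assumes some LearnAPI-compliant supervised algorithm with training data X, y and new features Xnew:

```julia
model = fit(algorithm, X, y)               # train
ŷ = predict(model, LiteralTarget(), Xnew)  # point predictions

# Strip inessential content before serialization; by the minimize
# contract, predictions are unchanged:
small = minimize(model)
@assert predict(small, LiteralTarget(), Xnew) == ŷ
```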


¹ We acknowledge users may not like this terminology, and may know "algorithm" by some other name, such as "strategy", "options", "hyperparameter set", "configuration", or "model". Consensus on this point is difficult; see, e.g., this Julia Discourse discussion.

diff --git a/dev/search_index.js b/dev/search_index.js index f6e73ae1..66a21e7b 100644 --- a/dev/search_index.js +++ b/dev/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"patterns/regression/#Regression","page":"Regression","title":"Regression","text":"","category":"section"},{"location":"patterns/regression/","page":"Regression","title":"Regression","text":"See these examples from tests.","category":"page"},{"location":"patterns/missing_value_imputation/#Missing-Value-Imputation","page":"Missing Value Imputation","title":"Missing Value Imputation","text":"","category":"section"},{"location":"patterns/iterative_algorithms/#Iterative-Algorithms","page":"Iterative Algorithms","title":"Iterative Algorithms","text":"","category":"section"},{"location":"patterns/survival_analysis/#Survival-Analysis","page":"Survival Analysis","title":"Survival Analysis","text":"","category":"section"},{"location":"predict_transform/#operations","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"Standard methods:","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"predict(model, kind_of_proxy, data...) -> prediction\ntransform(model, data...) -> transformed_data\ninverse_transform(model, data...) 
-> inverted_data","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"Methods consuming output, obsdata, of data-preprocessor obs:","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"obspredict(model, kind_of_proxy, obsdata) -> prediction\nobstransform(model, obsdata) -> transformed_data","category":"page"},{"location":"predict_transform/#Typical-worklows","page":"predict, transform, and relatives","title":"Typical worklows","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"# Train some supervised `algorithm`:\nmodel = fit(algorithm, X, y)\n\n# Predict probability distributions:\nŷ = predict(model, Distribution(), Xnew)\n\n# Generate point predictions:\nŷ = predict(model, LiteralTarget(), Xnew)","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"# Training a dimension-reducing `algorithm`:\nmodel = fit(algorithm, X)\nXnew_reduced = transform(model, Xnew)\n\n# Apply an approximate right inverse:\ninverse_transform(model, Xnew_reduced)","category":"page"},{"location":"predict_transform/#An-advanced-workflow","page":"predict, transform, and relatives","title":"An advanced workflow","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"fitdata = obs(fit, algorithm, X, y)\npredictdata = obs(predict, algorithm, Xnew)\nmodel = obsfit(algorithm, obsdata)\nŷ = obspredict(model, LiteralTarget(), predictdata)","category":"page"},{"location":"predict_transform/#Implementation-guide","page":"predict, transform, and relatives","title":"Implementation 
guide","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"The methods predict and transform are not directly overloaded. Implement obspredict and obstransform instead:","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"method compulsory? fallback requires\nobspredict no none fit\nobstransform no none fit\ninverse_transform no none fit, obstransform","category":"page"},{"location":"predict_transform/#Predict-or-transform?","page":"predict, transform, and relatives","title":"Predict or transform?","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"If the algorithm has a notion of target variable, then arrange for obspredict to output each supported kind of target proxy (LiteralTarget(), Distribution(), etc).","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"For output not associated with a target variable, implement obstransform instead, which does not dispatch on LearnAPI.KindOfProxy, but can be optionally paired with an implementation of inverse_transform for returning (approximate) right inverses to transform.","category":"page"},{"location":"predict_transform/#Reference","page":"predict, transform, and relatives","title":"Reference","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"predict\nobspredict\ntransform\nobstransform\ninverse_transform","category":"page"},{"location":"predict_transform/#LearnAPI.predict","page":"predict, transform, and relatives","title":"LearnAPI.predict","text":"predict(model, kind_of_proxy::LearnAPI.KindOfProxy, 
data...)\npredict(model, data...)\n\nThe first signature returns target or target proxy predictions for input features data, according to some model returned by fit or obsfit. Where supported, these are literally target predictions if kind_of_proxy = LiteralTarget(), and probability density/mass functions if kind_of_proxy = Distribution(). List all options with LearnAPI.kinds_of_proxy(algorithm), where algorithm = LearnAPI.algorithm(model).\n\nThe shortcut predict(model, data...) = predict(model, LiteralTarget(), data...) is also provided.\n\nArguments\n\nmodel is anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-complaint algorithm.\ndata: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.\n\nExample\n\nIn the following, algorithm is some supervised learning algorithm with training features X, training target y, and test features Xnew:\n\nmodel = fit(algorithm, X, y; verbosity=0)\npredict(model, LiteralTarget(), Xnew)\n\nNote predict does not mutate any argument, except in the special case LearnAPI.predict_or_transform_mutates(algorithm) = true.\n\nSee also obspredict, fit, transform, inverse_transform.\n\nExtended help\n\nNew implementations\n\nLearnAPI.jl provides the following definition of predict which is never to be directly overloaded:\n\npredict(model, kop::LearnAPI.KindOfProxy, data...) 
=\n obspredict(model, kop, obs(predict, LearnAPI.algorithm(model), data...))\n\nRather, new algorithms overload obspredict.\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.obspredict","page":"predict, transform, and relatives","title":"LearnAPI.obspredict","text":"obspredict(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)\n\nSimilar to predict but consumes algorithm-specific representations of input data, obsdata, as returned by obs(predict, algorithm, data...). Here data... is the form of data expected in the main predict method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).\n\nFor some algorithms and workflows, obspredict will have a performance benefit over predict. See more at obs.\n\nExample\n\nIn the following, algorithm is some supervised learning algorithm with training features X, training target y, and test features Xnew:\n\nmodel = fit(algorithm, X, y)\nobsdata = obs(predict, algorithm, Xnew)\nŷ = obspredict(model, LiteralTarget(), obsdata)\n@assert ŷ == predict(model, LiteralTarget(), Xnew)\n\nSee also predict, fit, transform, inverse_transform, obs.\n\nExtended help\n\nNew implementations\n\nImplementation of obspredict is optional, but required to enable predict. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard predict call expects, as in the call predict(model, kind_of_proxy, data...). Note data is always a tuple, even if predict has only one data argument. See more at obs.\n\nIf LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obspredict may mutate it's first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform or inverse_transform. 
This is necessary for some non-generalizing algorithms but is otherwise discouraged. See more at fit.\n\nIf overloaded, you must include both LearnAPI.obspredict and LearnAPI.predict in the list of methods returned by the LearnAPI.functions trait.\n\nAn implementation is provided for each kind of target proxy you wish to support. See the LearnAPI.jl documentation for options. Each supported kind_of_proxy instance should be listed in the return value of the LearnAPI.kinds_of_proxy(algorithm) trait.\n\nIf, additionally, minimize(model) is overloaded, then the following identity must hold:\n\nobspredict(minimize(model), args...) = obspredict(model, args...)\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.transform","page":"predict, transform, and relatives","title":"LearnAPI.transform","text":"transform(model, data...)\n\nReturn a transformation of some data, using some model, as returned by fit.\n\nArguments\n\nmodel is anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-complaint algorithm.\ndata: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.\n\nExample\n\nHere X and Xnew are data of the same form:\n\n# For an algorithm that generalizes to new data (\"learns\"):\nmodel = fit(algorithm, X; verbosity=0)\ntransform(model, Xnew)\n\n# For a static (non-generalizing) transformer:\nmodel = fit(algorithm)\ntransform(model, X)\n\nNote transform does not mutate any argument, except in the special case LearnAPI.predict_or_transform_mutates(algorithm) = true.\n\nSee also obstransform, fit, predict, inverse_transform.\n\nExtended help\n\nNew implementations\n\nLearnAPI.jl provides the following definition of transform which is never to be directly overloaded:\n\ntransform(model, data...) 
=\n obstransform(model, obs(predict, LearnAPI.algorithm(model), data...))\n\nRather, new algorithms overload obstransform.\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.obstransform","page":"predict, transform, and relatives","title":"LearnAPI.obstransform","text":"obstransform(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)\n\nSimilar to transform but consumes algorithm-specific representations of input data, obsdata, as returned by obs(transform, algorithm, data...). Here data... is the form of data expected in the main transform method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).\n\nFor some algorithms and workflows, obstransform will have a performance benefit over transform. See more at obs.\n\nExample\n\nIn the following, algorithm is some unsupervised learning algorithm with training features X, and test features Xnew:\n\nmodel = fit(algorithm, X, y)\nobsdata = obs(transform, algorithm, Xnew)\nW = obstransform(model, obsdata)\n@assert W == transform(model, Xnew)\n\nSee also transform, fit, predict, inverse_transform, obs.\n\nExtended help\n\nNew implementations\n\nImplementation of obstransform is optional, but required to enable transform. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard transform call expects, as in the call transform(model, data...). Note data is always a tuple, even if transform has only one data argument. See more at obs.\n\nIf LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obstransform may mutate it's first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform or inverse_transform. This is necessary for some non-generalizing algorithms but is otherwise discouraged. 
See more at fit.\n\nIf overloaded, you must include both LearnAPI.obstransform and LearnAPI.transform in the list of methods returned by the LearnAPI.functions trait.\n\nEach supported kind_of_proxy should be listed in the return value of the LearnAPI.kinds_of_proxy(algorithm) trait.\n\nIf, additionally, minimize(model) is overloaded, then the following identity must hold:\n\nobstransform(minimize(model), args...) = obstransform(model, args...)\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.inverse_transform","page":"predict, transform, and relatives","title":"LearnAPI.inverse_transform","text":"inverse_transform(model, data)\n\nInverse transform data according to some model returned by fit. Here \"inverse\" is to be understood broadly, e.g, an approximate right inverse for transform.\n\nArguments\n\nmodel: anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-complaint algorithm.\ndata: something having the same form as the output of transform(model, inputs...)\n\nExample\n\nIn the following, algorithm is some dimension-reducing algorithm that generalizes to new data (such as PCA); Xtrain is the training input and Xnew the input to be reduced:\n\nmodel = fit(algorithm, Xtrain; verbosity=0)\nW = transform(model, Xnew) # reduced version of `Xnew`\nŴ = inverse_transform(model, W) # embedding of `W` in original space\n\nSee also fit, transform, predict.\n\nExtended help\n\nNew implementations\n\nImplementation is optional. If implemented, you must include inverse_transform in the tuple returned by the LearnAPI.functions trait. \n\nIf, additionally, minimize(model) is overloaded, then the following identity must hold:\n\ninverse_transform(minimize(model), args...) 
= inverse_transform(model, args...)\n\n\n\n\n\n","category":"function"},{"location":"patterns/supervised_bayesian_algorithms/#Supervised-Bayesian-Models","page":"Supervised Bayesian Models","title":"Supervised Bayesian Models","text":"","category":"section"},{"location":"patterns/classification/#Classification","page":"Classification","title":"Classification","text":"","category":"section"},{"location":"common_implementation_patterns/#Common-Implementation-Patterns","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"","category":"section"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"🚧","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"warning: Warning\nUnder construction","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"warning: Warning\nThis section is only an implementation guide. 
The definitive specification of the LearnAPI is given in Reference.","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"This guide is intended to be consulted after reading Anatomy of an Implementation, which introduces the main interface objects and terminology.","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"Although an implementation is defined purely by the methods and traits it implements, most implementations fall into one (or more) of the following informally understood patterns or \"tasks\":","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"Classification: Supervised learners for categorical targets \nRegression: Supervised learners for continuous targets\nIterative Algorithms\nIncremental Algorithms\nStatic Algorithms: Algorithms that do not learn, in the sense that they must be re-executed for each new data set (do not generalize), but which have hyperparameters and/or deliver ancillary information about the computation.\nDimension Reduction: Transformers that learn to reduce feature space dimension\nMissing Value Imputation: Transformers that replace missing values.\nClustering: Algorithms that group data into clusters for classification and possibly dimension reduction. 
May be true learners (generalize to new data) or static.\nOutlier Detection: Supervised, unsupervised, or semi-supervised learners for anomaly detection.\nLearning a Probability Distribution: Algorithms that fit a distribution or distribution-like object to data\nTime Series Forecasting\nTime Series Classification\nSupervised Bayesian Algorithms\nSurvival Analysis","category":"page"},{"location":"traits/#traits","page":"Algorithm Traits","title":"Algorithm Traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Traits generally promise specific algorithm behavior, such as: This algorithm supports per-observation weights, which must appear as the third argument of fit, or This algorithm's transform method predicts Real vectors. They also record more mundane information, such as a package license.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Algorithm traits are functions whose first (and usually only) argument is an algorithm.","category":"page"},{"location":"traits/#Special-two-argument-traits","page":"Algorithm Traits","title":"Special two-argument traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"The two-argument versions of LearnAPI.predict_output_scitype and LearnAPI.predict_output_type are the only overloadable traits with more than one argument.","category":"page"},{"location":"traits/#trait_summary","page":"Algorithm Traits","title":"Trait summary","text":"","category":"section"},{"location":"traits/#traits_list","page":"Algorithm Traits","title":"Overloadable traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"In the examples column of the table below, Table, Continuous, Sampleable are names owned by the package 
ScientificTypesBase.jl.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"trait return value fallback value example\nLearnAPI.functions(algorithm) functions you can apply to algorithm or associated model (traits excluded) () (LearnAPI.fit, LearnAPI.predict, LearnAPI.algorithm)\nLearnAPI.kinds_of_proxy(algorithm) instances kop of KindOfProxy for which an implementation of LearnAPI.predict(algorithm, kop, ...) is guaranteed. () (Distribution(), Interval())\nLearnAPI.position_of_target(algorithm) the positional index¹ of the target in data in fit(algorithm, data...) calls 0 2\nLearnAPI.position_of_weights(algorithm) the positional index¹ of per-observation weights in data in fit(algorithm, data...) 0 3\nLearnAPI.descriptors(algorithm) lists one or more suggestive algorithm descriptors from LearnAPI.descriptors() () (:regression, :probabilistic)\nLearnAPI.is_pure_julia(algorithm) true if implementation is 100% Julia code false true\nLearnAPI.pkg_name(algorithm) name of package providing core code (may be different from package providing LearnAPI.jl implementation) \"unknown\" \"DecisionTree\"\nLearnAPI.pkg_license(algorithm) name of license of package providing core code \"unknown\" \"MIT\"\nLearnAPI.doc_url(algorithm) url providing documentation of the core code \"unknown\" \"https://en.wikipedia.org/wiki/Decision_tree_learning\"\nLearnAPI.load_path(algorithm) a string indicating where the struct for typeof(algorithm) is defined, beginning with name of package providing implementation \"unknown\" FastTrees.LearnAPI.DecisionTreeClassifier\nLearnAPI.is_composite(algorithm) true if one or more properties (fields) of algorithm may be an algorithm false true\nLearnAPI.human_name(algorithm) human name for the algorithm; should be a noun type name with spaces \"elastic net regressor\"\nLearnAPI.iteration_parameter(algorithm) symbolic name of an iteration parameter nothing :epochs\nLearnAPI.fit_scitype(algorithm) upper 
bound on scitype(data) ensuring fit(algorithm, data...) works Union{} Tuple{Table(Continuous), AbstractVector{Continuous}}\nLearnAPI.fit_observation_scitype(algorithm) upper bound on scitype(observation) for observation in data ensuring fit(algorithm, data...) works Union{} Tuple{AbstractVector{Continuous}, Continuous}\nLearnAPI.fit_type(algorithm) upper bound on typeof(data) ensuring fit(algorithm, data...) works Union{} Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}\nLearnAPI.fit_observation_type(algorithm) upper bound on typeof(observation) for observation in data ensuring fit(algorithm, data...) works Union{} Tuple{AbstractVector{<:Real}, Real}\nLearnAPI.predict_input_scitype(algorithm) upper bound on scitype(data) ensuring predict(model, kop, data...) works Union{} Table(Continuous)\nLearnAPI.predict_input_observation_scitype(algorithm) upper bound on scitype(observation) for observation in data ensuring predict(model, kop, data...) works Union{} Vector{Continuous}\nLearnAPI.predict_input_type(algorithm) upper bound on typeof(data) ensuring predict(model, kop, data...) works Union{} AbstractMatrix{<:Real}\nLearnAPI.predict_input_observation_type(algorithm) upper bound on typeof(observation) for observation in data ensuring predict(model, kop, data...) works Union{} Vector{<:Real}\nLearnAPI.predict_output_scitype(algorithm, kind_of_proxy) upper bound on scitype(predict(model, ...)) Any AbstractVector{Continuous}\nLearnAPI.predict_output_type(algorithm, kind_of_proxy) upper bound on typeof(predict(model, ...)) Any AbstractVector{<:Real}\nLearnAPI.transform_input_scitype(algorithm) upper bound on scitype(data) ensuring transform(model, data...) works Union{} Table(Continuous)\nLearnAPI.transform_input_observation_scitype(algorithm) upper bound on scitype(observation) for observation in data ensuring transform(model, data...) 
works Union{} Vector{Continuous}\nLearnAPI.transform_input_type(algorithm) upper bound on typeof(data) ensuring transform(model, data...) works Union{} AbstractMatrix{<:Real}\nLearnAPI.transform_input_observation_type(algorithm) upper bound on typeof(observation) for observation in data ensuring transform(model, data...) works Union{} Vector{<:Real}\nLearnAPI.transform_output_scitype(algorithm) upper bound on scitype(transform(model, ...)) Any Table(Continuous)\nLearnAPI.transform_output_type(algorithm) upper bound on typeof(transform(model, ...)) Any AbstractMatrix{<:Real}\nLearnAPI.predict_or_transform_mutates(algorithm) true if predict or transform mutates first argument false true","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"¹ If the value is 0, then the variable in boldface type is not supported and not expected to appear in data. If length(data) is less than the trait value, then data is understood to exclude the variable, but note that fit can have multiple signatures of varying lengths, as in fit(algorithm, X, y) and fit(algorithm, X, y, w). 
A non-zero value is a promise that fit includes a signature of sufficient length to include the variable.","category":"page"},{"location":"traits/#Derived-Traits","page":"Algorithm Traits","title":"Derived Traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"The following convenience methods are provided but not overloadable by new implementations.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"trait return value example\nLearnAPI.name(algorithm) algorithm type name as string \"PCA\"\nLearnAPI.is_algorithm(algorithm) true if LearnAPI.functions(algorithm) is not empty true\nLearnAPI.predict_output_scitype(algorithm) dictionary of upper bounds on the scitype of predictions, keyed on subtypes of LearnAPI.KindOfProxy \nLearnAPI.predict_output_type(algorithm) dictionary of upper bounds on the type of predictions, keyed on subtypes of LearnAPI.KindOfProxy ","category":"page"},{"location":"traits/#Implementation-guide","page":"Algorithm Traits","title":"Implementation guide","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"A single-argument trait is declared following this pattern:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"LearnAPI.is_pure_julia(algorithm::MyAlgorithmType) = true","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"A shorthand for single-argument traits is available:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"@trait MyAlgorithmType is_pure_julia=true","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Multiple traits can be declared like this:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm 
Traits","text":"@trait(\n MyAlgorithmType,\n is_pure_julia = true,\n pkg_name = \"MyPackage\",\n)","category":"page"},{"location":"traits/#The-global-trait-contracts","page":"Algorithm Traits","title":"The global trait contracts","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"To ensure that trait metadata can be stored in an external algorithm registry, LearnAPI.jl requires:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Finiteness: The value of a trait is the same for all algorithms with the same underlying UnionAll type. That is, even if the type parameters are different, the trait should be the same. There is an exception if is_composite(algorithm) = true.\nSerializability: The value of any trait can be evaluated without installing any third-party package; using LearnAPI should suffice.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Because of 1, combining a lot of functionality into one algorithm (e.g. 
the algorithm can perform both classification and regression) can mean traits are necessarily less informative (as in LearnAPI.predict_output_type(algorithm) = Any).","category":"page"},{"location":"traits/#Reference","page":"Algorithm Traits","title":"Reference","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"LearnAPI.functions\nLearnAPI.kinds_of_proxy\nLearnAPI.position_of_target\nLearnAPI.position_of_weights\nLearnAPI.descriptors\nLearnAPI.is_pure_julia\nLearnAPI.pkg_name\nLearnAPI.pkg_license\nLearnAPI.doc_url\nLearnAPI.load_path\nLearnAPI.is_composite\nLearnAPI.human_name\nLearnAPI.iteration_parameter\nLearnAPI.fit_scitype\nLearnAPI.fit_type\nLearnAPI.fit_observation_scitype\nLearnAPI.fit_observation_type\nLearnAPI.predict_input_scitype\nLearnAPI.predict_input_observation_scitype\nLearnAPI.predict_input_type\nLearnAPI.predict_input_observation_type\nLearnAPI.predict_output_scitype\nLearnAPI.predict_output_type\nLearnAPI.transform_input_scitype\nLearnAPI.transform_input_observation_scitype\nLearnAPI.transform_input_type\nLearnAPI.transform_input_observation_type\nLearnAPI.predict_or_transform_mutates\nLearnAPI.transform_output_scitype\nLearnAPI.transform_output_type","category":"page"},{"location":"traits/#LearnAPI.functions","page":"Algorithm Traits","title":"LearnAPI.functions","text":"LearnAPI.functions(algorithm)\n\nReturn a tuple of functions that can be sensibly applied to algorithm, or to objects having the same type as algorithm, or to associated models (objects returned by fit(algorithm, ...)). Algorithm traits are excluded.\n\nIn addition to functions, the returned tuple may include expressions, like :(DecisionTree.print_tree), which reference functions not owned by LearnAPI.jl.\n\nThe understanding is that algorithm is a LearnAPI-compliant object whenever this is non-empty.\n\nExtended help\n\nNew implementations\n\nAll new implementations must overload this trait. 
Here's a checklist for elements in the return value:\n\nfunction needs explicit implementation? include in returned tuple?\nfit no yes\nobsfit yes yes\nminimize optional yes\npredict no if obspredict is implemented\nobspredict optional if implemented\ntransform no if obstransform is implemented\nobstransform optional if implemented\nobs optional yes\ninverse_transform optional if implemented\nLearnAPI.algorithm yes yes\n\nAlso include any implemented accessor functions. The LearnAPI.jl accessor functions are: LearnAPI.extras, LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.kinds_of_proxy","page":"Algorithm Traits","title":"LearnAPI.kinds_of_proxy","text":"LearnAPI.kinds_of_proxy(algorithm)\n\nReturns a tuple of all instances, kind, for which predict(algorithm, kind, data...) has a guaranteed implementation. Each such kind subtypes LearnAPI.KindOfProxy. 
Examples are LiteralTarget() (for predicting actual target values) and Distribution() (for predicting probability mass/density functions).\n\nSee also LearnAPI.predict, LearnAPI.KindOfProxy.\n\nExtended help\n\nNew implementations\n\nImplementation is optional but recommended whenever predict is overloaded.\n\nElements of the returned tuple must be one of these: ConfidenceInterval, Continuous, Distribution, LabelAmbiguous, LabelAmbiguousDistribution, LabelAmbiguousSampleable, LiteralTarget, LogDistribution, LogProbability, OutlierScore, Parametric, ProbabilisticSet, Probability, Sampleable, Set, SurvivalDistribution, SurvivalFunction, IID, JointDistribution, JointLogDistribution and JointSampleable.\n\nSuppose, for example, we have the following implementation of a supervised learner returning only probabilistic predictions:\n\nLearnAPI.predict(algorithm::MyNewAlgorithmType, LearnAPI.Distribution(), Xnew) = ...\n\nThen we can declare\n\n@trait MyNewAlgorithmType kinds_of_proxy = (LearnAPI.Distribution(),)\n\nFor more on target variables and target proxies, refer to the LearnAPI documentation.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.position_of_target","page":"Algorithm Traits","title":"LearnAPI.position_of_target","text":"LearnAPI.position_of_target(algorithm)\n\nReturn the expected position of the target variable within data in calls of the form LearnAPI.fit(algorithm, verbosity, data...).\n\nIf this number is 0, then no target is expected. If this number exceeds length(data), then data is understood to exclude the target variable.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.position_of_weights","page":"Algorithm Traits","title":"LearnAPI.position_of_weights","text":"LearnAPI.position_of_weights(algorithm)\n\nReturn the expected position of per-observation weights within data in calls of the form LearnAPI.fit(algorithm, data...).\n\nIf this number is 0, then no weights are expected. 
If this number exceeds length(data), then data is understood to exclude weights, which are assumed to be uniform.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.descriptors","page":"Algorithm Traits","title":"LearnAPI.descriptors","text":"LearnAPI.descriptors(algorithm)\n\nLists one or more suggestive algorithm descriptors from this list: :regression, :classification, :clustering, :gradient_descent, :iterative_algorithms, :incremental_algorithms, :dimension_reduction, :encoders, :static_algorithms, :missing_value_imputation, :ensemble_algorithms, :wrappers, :time_series_forecasting, :time_series_classification, :survival_analysis, :distribution_fitters, :Bayesian_algorithms, :outlier_detection, :collaborative_filtering, :text_analysis, :audio_analysis, :natural_language_processing, :image_processing (do LearnAPI.descriptors() to reproduce).\n\nwarning: Warning\nThe value of this trait guarantees no particular behavior. The trait is intended for informal classification purposes only.\n\nNew implementations\n\nThis trait should return a tuple of symbols, as in (:classifier, :text_analysis).\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.is_pure_julia","page":"Algorithm Traits","title":"LearnAPI.is_pure_julia","text":"LearnAPI.is_pure_julia(algorithm)\n\nReturns true if training the algorithm requires evaluation of pure Julia code only.\n\nNew implementations\n\nThe fallback is false.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.pkg_name","page":"Algorithm Traits","title":"LearnAPI.pkg_name","text":"LearnAPI.pkg_name(algorithm)\n\nReturn the name of the package module which supplies the core training algorithm for algorithm. This is not necessarily the package providing the LearnAPI interface.\n\nReturns \"unknown\" if the algorithm implementation has failed to overload the trait. 
\n\nNew implementations\n\nMust return a string, as in \"DecisionTree\".\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.pkg_license","page":"Algorithm Traits","title":"LearnAPI.pkg_license","text":"LearnAPI.pkg_license(algorithm)\n\nReturn the name of the software license, such as \"MIT\", applying to the package where the core algorithm for algorithm is implemented.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.doc_url","page":"Algorithm Traits","title":"LearnAPI.doc_url","text":"LearnAPI.doc_url(algorithm)\n\nReturn a url where the core algorithm for algorithm is documented.\n\nReturns \"unknown\" if the algorithm implementation has failed to overload the trait. \n\nNew implementations\n\nMust return a string, such as \"https://en.wikipedia.org/wiki/Decision_tree_learning\".\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.load_path","page":"Algorithm Traits","title":"LearnAPI.load_path","text":"LearnAPI.load_path(algorithm)\n\nReturn a string indicating where the struct for typeof(algorithm) can be found, beginning with the name of the package module defining it. For example, a return value of \"FastTrees.LearnAPI.DecisionTreeClassifier\" means the following Julia code will return the algorithm type:\n\nimport FastTrees\nFastTrees.LearnAPI.DecisionTreeClassifier\n\nReturns \"unknown\" if the algorithm implementation has failed to overload the trait. \n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.is_composite","page":"Algorithm Traits","title":"LearnAPI.is_composite","text":"LearnAPI.is_composite(algorithm)\n\nReturns true if one or more properties (fields) of algorithm may themselves be algorithms, and false otherwise.\n\nSee also LearnAPI.components.\n\nNew implementations\n\nThis trait should be overloaded if one or more properties (fields) of algorithm may take algorithm values. Fallback return value is false. 
The keyword constructor for such an algorithm need not prescribe defaults for algorithm-valued properties. Implementation of the accessor function LearnAPI.components is recommended.\n\nThe value of the trait must depend only on the type of algorithm. \n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.human_name","page":"Algorithm Traits","title":"LearnAPI.human_name","text":"LearnAPI.human_name(algorithm)\n\nA human-readable string representation of typeof(algorithm). Primarily intended for auto-generation of documentation.\n\nNew implementations\n\nOptional. A fallback takes the type name, inserts spaces and removes capitalization. For example, KNNRegressor becomes \"knn regressor\". Better would be to overload the trait to return \"K-nearest neighbors regressor\". Ideally, this is a \"concrete\" noun like \"ridge regressor\" rather than an \"abstract\" noun like \"ridge regression\".\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.iteration_parameter","page":"Algorithm Traits","title":"LearnAPI.iteration_parameter","text":"LearnAPI.iteration_parameter(algorithm)\n\nThe name of the iteration parameter of algorithm, or nothing if the algorithm is not iterative.\n\nNew implementations\n\nImplement if algorithm is iterative. Returns a symbol or nothing.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_scitype","page":"Algorithm Traits","title":"LearnAPI.fit_scitype","text":"LearnAPI.fit_scitype(algorithm)\n\nReturn an upper bound on the scitype of data guaranteed to work when calling fit(algorithm, data...).\n\nSpecifically, if the return value is S and ScientificTypes.scitype(data) <: S, then all the following calls are guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_type","page":"Algorithm Traits","title":"LearnAPI.fit_type","text":"LearnAPI.fit_type(algorithm)\n\nReturn an upper bound on the type of data guaranteed to work when calling fit(algorithm, data...).\n\nSpecifically, if the return value is T and typeof(data) <: T, then all the following calls are guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_scitype, LearnAPI.fit_observation_type, LearnAPI.fit_observation_scitype.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_observation_scitype","page":"Algorithm Traits","title":"LearnAPI.fit_observation_scitype","text":"LearnAPI.fit_observation_scitype(algorithm)\n\nReturn an upper bound on the scitype of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. 
Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by S, supposing S != Union{}, and that the user supplies data satisfying\n\nScientificTypes.scitype(MLUtils.getobs(data, i)) <: S\n\nfor any valid index i, then all the following are guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_observation_type","page":"Algorithm Traits","title":"LearnAPI.fit_observation_type","text":"LearnAPI.fit_observation_type(algorithm)\n\nReturn an upper bound on the type of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by T, supposing T != Union{}, and that the user supplies data satisfying\n\ntypeof(MLUtils.getobs(data, i)) <: T\n\nfor any valid index i, then the following is guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_scitype.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_scitype","page":"Algorithm Traits","title":"LearnAPI.predict_input_scitype","text":" LearnAPI.predict_input_scitype(algorithm)\n\nReturn an upper bound on the scitype of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).\n\nSpecifically, if S is the value returned and ScientificTypes.scitype(data) <: S, then the following is guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, algorithm, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.predict_input_type.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_observation_scitype","page":"Algorithm Traits","title":"LearnAPI.predict_input_observation_scitype","text":"LearnAPI.predict_observation_scitype(algorithm)\n\nReturn an upper bound on the scitype of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. 
Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by S, supposing S != Union{}, and that the user supplies data satisfying\n\nScientificTypes.scitype(MLUtils.getobs(data, i)) <: S\n\nfor any valid index i, then all the following are guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, algorithm, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_type","page":"Algorithm Traits","title":"LearnAPI.predict_input_type","text":"LearnAPI.predict_input_type(algorithm)\n\nReturn an upper bound on the type of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).\n\nSpecifically, if T is the value returned and typeof(data) <: T, then the following is guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, model, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nSee also LearnAPI.predict_input_scitype.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. 
Should not be overloaded if LearnAPI.predict_input_scitype is overloaded.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_observation_type","page":"Algorithm Traits","title":"LearnAPI.predict_input_observation_type","text":"LearnAPI.predict_observation_type(algorithm)\n\nReturn an upper bound on the type of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by T, supposing T != Union{}, and that the user supplies data satisfying\n\ntypeof(MLUtils.getobs(data, i)) <: T\n\nfor any valid index i, then all the following are guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, algorithm, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_output_scitype","page":"Algorithm Traits","title":"LearnAPI.predict_output_scitype","text":"LearnAPI.predict_output_scitype(algorithm, kind_of_proxy::KindOfProxy)\n\nReturn an upper bound for the scitypes of predictions of the specified form where supported, and otherwise return Any. 
For example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions), then the following is guaranteed to hold:\n\nscitype(ŷ) <: LearnAPI.predict_output_scitype(algorithm, LearnAPI.Distribution())\n\nNote. This trait has a single-argument \"convenience\" version LearnAPI.predict_output_scitype(algorithm) derived from this one, which returns a dictionary keyed on target proxy types.\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_scitype.\n\nNew implementations\n\nOverloading the trait is optional. Here's a sample implementation for a supervised regressor type MyRgs that only predicts actual values of the target:\n\n@trait MyRgs predict_output_scitype = AbstractVector{ScientificTypesBase.Continuous}\n\nThe fallback method returns Any.\n\n\n\n\n\nLearnAPI.predict_output_scitype(algorithm)\n\nReturn a dictionary of upper bounds on the scitype of predictions, keyed on concrete subtypes of LearnAPI.KindOfProxy. Each of these subtypes represents a different form of target prediction (LiteralTarget, Distribution, SurvivalFunction, etc.) possibly supported by algorithm, but the existence of a key does not guarantee that form is supported.\n\nAs an example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions), then the following is guaranteed to hold:\n\nscitype(ŷ) <: LearnAPI.predict_output_scitype(algorithm)[LearnAPI.Distribution]\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_scitype.\n\nNew implementations\n\nThis single-argument trait should not be overloaded. 
Instead, overload LearnAPI.predict_output_scitype(algorithm, kind_of_proxy).\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_output_type","page":"Algorithm Traits","title":"LearnAPI.predict_output_type","text":"LearnAPI.predict_output_type(algorithm, kind_of_proxy::KindOfProxy)\n\nReturn an upper bound for the types of predictions of the specified form where supported, and otherwise return Any. For example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:\n\ntypeof(ŷ) <: LearnAPI.predict_output_type(algorithm, LearnAPI.Distribution())\n\nNote. This trait has a single-argument \"convenience\" version LearnAPI.predict_output_type(algorithm) derived from this one, which returns a dictionary keyed on target proxy types.\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_type.\n\nNew implementations\n\nOverloading the trait is optional. Here's a sample implementation for a supervised regressor type MyRgs that only predicts actual values of the target:\n\n@trait MyRgs predict_output_type = AbstractVector{Float64}\n\nThe fallback method returns Any.\n\n\n\n\n\nLearnAPI.predict_output_type(algorithm)\n\nReturn a dictionary of upper bounds on the type of predictions, keyed on concrete subtypes of LearnAPI.KindOfProxy. 
Each of these subtypes represents a different form of target prediction (LiteralTarget, Distribution, SurvivalFunction, etc.) possibly supported by algorithm, but the existence of a key does not guarantee that form is supported.\n\nAs an example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:\n\ntypeof(ŷ) <: LearnAPI.predict_output_type(algorithm)[LearnAPI.Distribution]\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_type.\n\nNew implementations\n\nThis single-argument trait should not be overloaded. Instead, overload LearnAPI.predict_output_type(algorithm, kind_of_proxy).\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_scitype","page":"Algorithm Traits","title":"LearnAPI.transform_input_scitype","text":" LearnAPI.transform_input_scitype(algorithm)\n\nReturn an upper bound on the scitype of data guaranteed to work in the call transform(algorithm, data...).\n\nSpecifically, if S is the value returned and ScientificTypes.scitype(data) <: S, then the following is guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, algorithm, data...)\ntransform(model, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.transform_input_type.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_scitype, LearnAPI.transform_type, LearnAPI.transform_observation_scitype, LearnAPI.transform_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_observation_scitype","page":"Algorithm Traits","title":"LearnAPI.transform_input_observation_scitype","text":"LearnAPI.transform_observation_scitype(algorithm)\n\nReturn an upper bound on the scitype of observations guaranteed to work when calling transform(model, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by S, supposing S != Union{}, and that the user supplies data satisfying\n\nScientificTypes.scitype(MLUtils.getobs(data, i)) <: S\n\nfor any valid index i, then all the following are guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, algorithm, data...)\ntransform(model, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_scitype, LearnAPI.transform_type, LearnAPI.transform_observation_scitype, LearnAPI.transform_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_type","page":"Algorithm Traits","title":"LearnAPI.transform_input_type","text":"LearnAPI.transform_input_type(algorithm)\n\nReturn an upper bound on the type of data guaranteed to work in the call transform(algorithm, data...).\n\nSpecifically, if T is the value returned and typeof(data) <: T, then the following is guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, algorithm, data...)\ntransform(model, Obs(), obsdata)\n\nSee also LearnAPI.transform_input_scitype.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. Should not be overloaded if LearnAPI.transform_input_scitype is overloaded.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_observation_type","page":"Algorithm Traits","title":"LearnAPI.transform_input_observation_type","text":"LearnAPI.transform_observation_type(algorithm)\n\nReturn an upper bound on the type of observations guaranteed to work when calling transform(model, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. 
Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by T, supposing T != Union{}, and that the user supplies data satisfying\n\ntypeof(MLUtils.getobs(data, i)) <: T\n\nfor any valid index i, then all the following are guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, algorithm, data...)\ntransform(model, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_scitype, LearnAPI.transform_type, LearnAPI.transform_observation_scitype, LearnAPI.transform_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_or_transform_mutates","page":"Algorithm Traits","title":"LearnAPI.predict_or_transform_mutates","text":"LearnAPI.predict_or_transform_mutates(algorithm)\n\nReturn true if predict or transform possibly mutate their first argument, model, when LearnAPI.algorithm(model) == algorithm. If false, no arguments are ever mutated.\n\nNew implementations\n\nThis trait, falling back to false, may only be overloaded when fit has no data arguments (algorithm does not generalize to new data). See more at fit.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_output_scitype","page":"Algorithm Traits","title":"LearnAPI.transform_output_scitype","text":"LearnAPI.transform_output_scitype(algorithm)\n\nReturn an upper bound on the scitype of the output of the transform operation.\n\nSee also LearnAPI.transform_input_scitype.\n\nNew implementations\n\nImplementation is optional. 
The fallback return value is Any.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_output_type","page":"Algorithm Traits","title":"LearnAPI.transform_output_type","text":"LearnAPI.transform_output_type(algorithm)\n\nReturn an upper bound on the type of the output of the transform operation.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Any.\n\n\n\n\n\n","category":"function"},{"location":"kinds_of_target_proxy/#proxy_types","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"","category":"section"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"The available kinds of target proxy are classified by subtypes of LearnAPI.KindOfProxy. These types are intended for dispatch only and have no fields.","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"LearnAPI.KindOfProxy","category":"page"},{"location":"kinds_of_target_proxy/#LearnAPI.KindOfProxy","page":"Kinds of Target Proxy","title":"LearnAPI.KindOfProxy","text":"LearnAPI.KindOfProxy\n\nAbstract type whose concrete subtypes T each represent a different kind of proxy for some target variable, associated with some algorithm. Instances T() are used to request the form of target predictions in predict calls.\n\nSee LearnAPI.jl documentation for an explanation of \"targets\" and \"target proxies\".\n\nFor example, Distribution is a concrete subtype of LearnAPI.KindOfProxy and a call like predict(model, Distribution(), Xnew) returns a data object whose observations are probability density/mass functions, assuming algorithm supports predictions of that form.\n\nRun LearnAPI.CONCRETE_TARGET_PROXY_TYPES to list all options. 
\n\n\n\n\n\n","category":"type"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"LearnAPI.IID","category":"page"},{"location":"kinds_of_target_proxy/#LearnAPI.IID","page":"Kinds of Target Proxy","title":"LearnAPI.IID","text":"LearnAPI.IID <: LearnAPI.KindOfProxy\n\nAbstract subtype of LearnAPI.KindOfProxy. If kind_of_proxy is an instance of LearnAPI.IID then, given data consisting of n observations, the following must hold:\n\nŷ = LearnAPI.predict(model, kind_of_proxy, data...) is data also consisting of n observations.\nThe jth observation of ŷ, for any j, depends only on the jth observation of the provided data (no correlation between observations).\n\nSee also LearnAPI.KindOfProxy.\n\n\n\n\n\n","category":"type"},{"location":"kinds_of_target_proxy/#Simple-target-proxies-(subtypes-of-LearnAPI.IID)","page":"Kinds of Target Proxy","title":"Simple target proxies (subtypes of LearnAPI.IID)","text":"","category":"section"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"type form of an observation\nLearnAPI.LiteralTarget same as target observations\nLearnAPI.Sampleable object that can be sampled to obtain object of the same form as target observation\nLearnAPI.Distribution explicit probability density/mass function whose sample space is all possible target observations\nLearnAPI.LogDistribution explicit log-probability density/mass function whose sample space is possible target observations\n† LearnAPI.Probability numerical probability or probability vector\n† LearnAPI.LogProbability log-probability or log-probability vector\n† LearnAPI.Parametric a list of parameters (e.g., mean and variance) describing some distribution\nLearnAPI.LabelAmbiguous collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., 
clustering\nLearnAPI.LabelAmbiguousSampleable sampleable version of LabelAmbiguous; see Sampleable above\nLearnAPI.LabelAmbiguousDistribution pdf/pmf version of LabelAmbiguous; see Distribution above\nLearnAPI.ConfidenceInterval confidence interval\nLearnAPI.Set finite but possibly varying number of target observations\nLearnAPI.ProbabilisticSet as for Set but labeled with probabilities (not necessarily summing to one)\nLearnAPI.SurvivalFunction survival function\nLearnAPI.SurvivalDistribution probability distribution for survival time\nLearnAPI.OutlierScore numerical score reflecting degree of outlierness (not necessarily normalized)\nLearnAPI.Continuous real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls)","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"† Provided for completeness but discouraged to avoid ambiguities in representation.","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"Table of concrete subtypes of LearnAPI.IID <: LearnAPI.KindOfProxy.","category":"page"},{"location":"kinds_of_target_proxy/#When-the-proxy-for-the-target-is-a-single-object","page":"Kinds of Target Proxy","title":"When the proxy for the target is a single object","text":"","category":"section"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"In the following table of subtypes T <: LearnAPI.KindOfProxy not falling under the IID umbrella, it is understood that predict(model, ::T, ...) 
is not divided into individual observations, but represents a single probability distribution for the sample space Y^n, where Y is the space in which the target variable takes its values, and n is the number of observations in data.","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"type T form of output of predict(model, ::T, data...)\nLearnAPI.JointSampleable object that can be sampled to obtain a vector whose elements have the form of target observations; the vector length matches the number of observations in data.\nLearnAPI.JointDistribution explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data\nLearnAPI.JointLogDistribution explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"Table of LearnAPI.KindOfProxy subtypes not subtyping LearnAPI.IID","category":"page"},{"location":"patterns/supervised_bayesian_models/#Supervised-Bayesian-Algorithms","page":"Supervised Bayesian Algorithms","title":"Supervised Bayesian Algorithms","text":"","category":"section"},{"location":"testing_an_implementation/#Testing-an-Implementation","page":"Testing an Implementation","title":"Testing an Implementation","text":"","category":"section"},{"location":"testing_an_implementation/","page":"Testing an Implementation","title":"Testing an Implementation","text":"🚧","category":"page"},{"location":"testing_an_implementation/","page":"Testing an Implementation","title":"Testing an Implementation","text":"warning: Warning\nUnder construction","category":"page"},{"location":"patterns/time_series_classification/#Time-Series-Classification","page":"Time Series Classification","title":"Time 
Series Classification","text":"","category":"section"},{"location":"anatomy_of_an_implementation/#Anatomy-of-an-Implementation","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"This section explains a detailed implementation of the LearnAPI for naive ridge regression. Most readers will want to scan the demonstration of the implementation before studying the implementation itself.","category":"page"},{"location":"anatomy_of_an_implementation/#Defining-an-algorithm-type","page":"Anatomy of an Implementation","title":"Defining an algorithm type","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The first line below imports the lightweight package LearnAPI.jl whose methods we will be extending. The second imports libraries needed for the core algorithm.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"using LearnAPI\nusing LinearAlgebra, Tables\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"A struct stores the regularization hyperparameter lambda of our ridge regressor:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"struct Ridge\n lambda::Float64\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Instances of Ridge are algorithms, in LearnAPI.jl parlance.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an 
Implementation","title":"Anatomy of an Implementation","text":"A keyword argument constructor provides defaults for all hyperparameters:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Ridge(; lambda=0.1) = Ridge(lambda)\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/#Implementing-fit","page":"Anatomy of an Implementation","title":"Implementing fit","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"A ridge regressor requires two types of data for training: input features X, which here we suppose are tabular, and a target y, which we suppose is a vector. Users will accordingly call fit like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"algorithm = Ridge(lambda=0.05)\nfit(algorithm, X, y; verbosity=1)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"However, a new implementation does not overload fit. Rather it implements","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"obsfit(algorithm::Ridge, obsdata; verbosity=1)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"for each obsdata returned by a data-preprocessing call obs(fit, algorithm, X, y). You can read \"obs\" as \"observation-accessible\", for reasons explained shortly. 
The LearnAPI.jl definition","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"fit(algorithm, data...; verbosity=1) =\n obsfit(algorithm, obs(fit, algorithm, data...), verbosity)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"then takes care of fit.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The obs and obsfit methods are public, and the user can call them like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"obsdata = obs(fit, algorithm, X, y)\nmodel = obsfit(algorithm, obsdata)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We begin by defining a struct¹ for the output of our data-preprocessing operation, obs, which will store y and the matrix representation of X, together with its column names (needed for recording named coefficients for user inspection):","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"struct RidgeFitData{T}\n A::Matrix{T} # p x n\n names::Vector{Symbol}\n y::Vector{T}\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"And we overload obs like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"function LearnAPI.obs(::typeof(fit), ::Ridge, X, y)\n table = Tables.columntable(X)\n names = Tables.columnnames(table) |> collect\n 
return RidgeFitData(Tables.matrix(table, transpose=true), names, y)\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"so that obs(fit, Ridge(), X, y) returns a combined RidgeFitData object with everything the core algorithm will need.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Since obs is public, the user will have access to this object, but to make it useful to her (and to fulfill the obs contract) this object must implement the MLUtils.jl getobs/numobs interface, to enable observation-resampling (which will be efficient, because observations are now columns). It usually suffices to overload Base.getindex and Base.length (which are the getobs/numobs fallbacks) so we won't actually need to depend on MLUtils.jl:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Base.getindex(data::RidgeFitData, I) =\n RidgeFitData(data.A[:,I], data.names, data.y[I])\nBase.length(data::RidgeFitData) = length(data.y)\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Next, we define a second struct for storing the outcomes of training, including named versions of the learned coefficients:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"struct RidgeFitted{T,F}\n algorithm::Ridge\n coefficients::Vector{T}\n named_coefficients::F\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We include algorithm, which must be recoverable from the output 
of fit/obsfit (see Accessor functions below).","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We are now ready to implement a suitable obsfit method to execute the core training:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"function LearnAPI.obsfit(algorithm::Ridge, obsdata::RidgeFitData, verbosity)\n\n lambda = algorithm.lambda\n A = obsdata.A\n names = obsdata.names\n y = obsdata.y\n\n # apply core algorithm:\n coefficients = (A*A' + lambda*I)\\(A*y) # p-element vector\n\n # determine named coefficients:\n named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]\n\n # make some noise, if allowed:\n verbosity > 0 && @info \"Coefficients: $named_coefficients\"\n\n return RidgeFitted(algorithm, coefficients, named_coefficients)\n\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Users set verbosity=0 for warnings only, and verbosity=-1 for silence.","category":"page"},{"location":"anatomy_of_an_implementation/#Implementing-predict","page":"Anatomy of an Implementation","title":"Implementing predict","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The primary predict call will look like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"predict(model, LiteralTarget(), Xnew)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"where Xnew is a table (of the same form as X above). 
The argument LiteralTarget() signals that we want literal predictions of the target variable, as opposed to a proxy for the target, such as probability density functions. LiteralTarget is an example of a LearnAPI.KindOfProxy type. Targets and target proxies are defined here.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Rather than overload the primary signature above, however, we overload for \"observation-accessible\" input, as we did for fit,","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.obspredict(model::RidgeFitted, ::LiteralTarget, Anew::Matrix) =\n ((model.coefficients)'*Anew)'\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"and overload obs to make the table-to-matrix conversion:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.obs(::typeof(predict), ::Ridge, Xnew) = Tables.matrix(Xnew, transpose=true)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"As matrices (with observations as columns) already implement the MLUtils.jl getobs/numobs interface, we already satisfy the obs contract, and there was no need to create a wrapper for obs output.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The primary predict method, handling tabular input, is provided by a LearnAPI.jl fallback similar to the fit fallback.","category":"page"},{"location":"anatomy_of_an_implementation/#Accessor-functions","page":"Anatomy of an 
Implementation","title":"Accessor functions","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"An accessor function has the output of fit (a \"model\") as its sole argument. Every new implementation must implement the accessor function LearnAPI.algorithm for recovering an algorithm from a fitted object:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.algorithm(model::RidgeFitted) = model.algorithm","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Other accessor functions extract learned parameters or some standard byproducts of training, such as feature importances or training losses.² Implementing the LearnAPI.coefficients accessor function is straightforward:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/#Tearing-a-model-down-for-serialization","page":"Anatomy of an Implementation","title":"Tearing a model down for serialization","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The minimize method falls back to the identity. 
Here, for the sake of illustration, we overload it to drop the named version of the coefficients:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.minimize(model::RidgeFitted) =\n RidgeFitted(model.algorithm, model.coefficients, nothing)","category":"page"},{"location":"anatomy_of_an_implementation/#Algorithm-traits","page":"Anatomy of an Implementation","title":"Algorithm traits","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Algorithm traits record extra generic information about an algorithm, or make specific promises of behavior. They usually have an algorithm as the single argument.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"In LearnAPI.jl predict always outputs a target or target proxy, where \"target\" is understood very broadly. 
We overload a trait to record the fact that the target variable explicitly appears in training (i.e., the algorithm is supervised) and where exactly it appears:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.position_of_target(::Ridge) = 2","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Or, you can use the shorthand","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"@trait Ridge position_of_target = 2","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The macro can also be used to specify multiple traits simultaneously:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"@trait(\n Ridge,\n position_of_target = 2,\n kinds_of_proxy = (LiteralTarget(),),\n descriptors = (:regression,),\n functions = (\n fit,\n obsfit,\n minimize,\n predict,\n obspredict,\n obs,\n LearnAPI.algorithm,\n LearnAPI.coefficients,\n )\n)\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Implementing the last trait, LearnAPI.functions, which must include all non-trait functions overloaded for Ridge, is compulsory. This is the only universally compulsory trait. 
It is worthwhile studying the list of all traits to see which might apply to a new implementation, to enable maximum buy-in to functionality provided by third party packages, and to assist third party packages that match machine learning algorithms to user-defined tasks.","category":"page"},{"location":"anatomy_of_an_implementation/#workflow","page":"Anatomy of an Implementation","title":"Demonstration","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We now illustrate how to interact directly with Ridge instances using the methods just implemented.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"# synthesize some data:\nn = 10 # number of observations\ntrain = 1:6\ntest = 7:10\na, b, c = rand(n), rand(n), rand(n)\nX = (; a, b, c)\ny = 2a - b + 3c + 0.05*rand(n)\n\nalgorithm = Ridge(lambda=0.5)\nLearnAPI.functions(algorithm)","category":"page"},{"location":"anatomy_of_an_implementation/#Naive-user-workflow","page":"Anatomy of an Implementation","title":"Naive user workflow","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Training and predicting with external resampling:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"using Tables\nmodel = fit(algorithm, Tables.subset(X, train), y[train])\nŷ = predict(model, LiteralTarget(), Tables.subset(X, test))","category":"page"},{"location":"anatomy_of_an_implementation/#Advanced-workflow","page":"Anatomy of an Implementation","title":"Advanced workflow","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an 
Implementation","text":"We now train and predict using internal data representations, resampled using the generic MLUtils.jl interface.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"import MLUtils\nfit_data = obs(fit, algorithm, X, y)\npredict_data = obs(predict, algorithm, X)\nmodel = obsfit(algorithm, MLUtils.getobs(fit_data, train))\nẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predict_data, test))\n@assert ẑ == ŷ\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/#Applying-an-accessor-function-and-serialization","page":"Anatomy of an Implementation","title":"Applying an accessor function and serialization","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Extracting coefficients:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.coefficients(model)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Serialization/deserialization:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"using Serialization\nsmall_model = minimize(model)\nserialize(\"my_ridge.jls\", small_model)\n\nrecovered_model = deserialize(\"my_ridge.jls\")\n@assert LearnAPI.algorithm(recovered_model) == algorithm\npredict(recovered_model, LiteralTarget(), X) == predict(model, LiteralTarget(), X)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an 
Implementation","title":"Anatomy of an Implementation","text":"¹ The definition of this and other structs above is not an explicit requirement of LearnAPI.jl, whose constructs are purely functional. ","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"² An implementation can provide further accessor functions, if necessary, but like the native ones, they must be included in the LearnAPI.functions declaration.","category":"page"},{"location":"patterns/static_algorithms/#Static-Algorithms","page":"Static Algorithms","title":"Static Algorithms","text":"","category":"section"},{"location":"patterns/static_algorithms/","page":"Static Algorithms","title":"Static Algorithms","text":"See these examples from tests.","category":"page"},{"location":"patterns/clusterering/#Clusterering","page":"Clusterering","title":"Clusterering","text":"","category":"section"},{"location":"fit/#[fit](@ref-fit)","page":"fit","title":"fit","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"fit(algorithm, data...; verbosity=1) -> model\nfit(model, data...; verbosity=1) -> updated_model","category":"page"},{"location":"fit/#Typical-workflow","page":"fit","title":"Typical workflow","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"# Train some supervised `algorithm`:\nmodel = fit(algorithm, X, y)\n\n# Predict probability distributions:\nŷ = predict(model, Distribution(), Xnew)\n\n# Inspect some byproducts of training:\nLearnAPI.feature_importances(model)","category":"page"},{"location":"fit/#Implementation-guide","page":"fit","title":"Implementation guide","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"The fit method is not implemented directly. Instead, implement obsfit.","category":"page"},{"location":"fit/","page":"fit","title":"fit","text":"method fallback compulsory? 
requires\nobsfit(alg, ...) none yes obs in some cases\n ","category":"page"},{"location":"fit/#Reference","page":"fit","title":"Reference","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"LearnAPI.fit\nLearnAPI.obsfit","category":"page"},{"location":"fit/#LearnAPI.fit","page":"fit","title":"LearnAPI.fit","text":"LearnAPI.fit(algorithm, data...; verbosity=1)\n\nExecute the algorithm with configuration algorithm using the provided training data, returning an object, model, on which other methods, such as predict or transform, can be dispatched. LearnAPI.functions(algorithm) returns a list of methods that can be applied to either algorithm or model.\n\nArguments\n\nalgorithm: property-accessible object whose properties are the hyperparameters of some ML/statistical algorithm\ndata: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.\n\nverbosity=1: logging level; set to 0 for warnings only, and -1 for silent training\n\nSee also obsfit, predict, transform, inverse_transform, LearnAPI.functions, obs.\n\nExtended help\n\nNew implementations\n\nLearnAPI.jl provides the following definition of fit, which is never directly overloaded:\n\nfit(algorithm, data...; verbosity=1) =\n obsfit(algorithm, obs(fit, algorithm, data...); verbosity)\n\nRather, new algorithms should overload obsfit. See also obs.\n\n\n\n\n\n","category":"function"},{"location":"fit/#LearnAPI.obsfit","page":"fit","title":"LearnAPI.obsfit","text":"obsfit(algorithm, obsdata; verbosity=1)\n\nA lower-level alternative to fit, this method consumes a pre-processed form of user data. 
Specifically, the following two code snippets are equivalent:\n\nmodel = fit(algorithm, data...)\n\nand\n\nobsdata = obs(fit, algorithm, data...)\nmodel = obsfit(algorithm, obsdata)\n\nHere obsdata is algorithm-specific, \"observation-accessible\" data, meaning it implements the MLUtils.jl getobs/numobs interface for observation resampling (even if data does not). Moreover, resampled versions of obsdata may be passed to obsfit in its place.\n\nThe use of obsfit may offer performance advantages. See more at obs.\n\nSee also fit, obs.\n\nExtended help\n\nNew implementations\n\nImplementation of the following method signature is compulsory for all new algorithms:\n\nLearnAPI.obsfit(algorithm, obsdata, verbosity)\n\nHere obsdata has the form explained above. If obs(fit, ...) is not being overloaded, then a fallback gives obsdata = data (always a tuple!). Note that verbosity is a positional argument, not a keyword argument in the overloaded signature.\n\nNew implementations must also implement LearnAPI.algorithm.\n\nIf overloaded, then the functions LearnAPI.obsfit and LearnAPI.fit must be included in the tuple returned by the LearnAPI.functions(algorithm) trait.\n\nNon-generalizing algorithms\n\nIf the algorithm does not generalize to new data (e.g., DBSCAN clustering) then data = () and obsfit carries out no computation, as this happens entirely in a transform and/or predict call. In such cases, obsfit(algorithm, ...) may return algorithm, but another possibility is allowed: To provide a mechanism for transform/predict to report byproducts of the computation (e.g., a list of boundary points in DBSCAN clustering) they are allowed to mutate the model object returned by obsfit, which is then arranged to be a mutable struct wrapping algorithm and fields to store the byproducts. 
In that case, LearnAPI.predict_or_transform_mutates(algorithm) must be overloaded to return true.\n\n\n\n\n\n","category":"function"},{"location":"reference/#reference","page":"Reference","title":"Reference","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Here we give the definitive specification of the LearnAPI.jl interface. For informal guides see Anatomy of an Implementation and Common Implementation Patterns.","category":"page"},{"location":"reference/#scope","page":"Reference","title":"Important terms and concepts","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"The LearnAPI.jl specification is predicated on a few basic, informally defined notions:","category":"page"},{"location":"reference/#Data-and-observations","page":"Reference","title":"Data and observations","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"ML/statistical algorithms are typically applied in conjunction with resampling of observations, as in cross-validation. In this document data will always refer to objects encapsulating an ordered sequence of individual observations. If an algorithm is trained using multiple data objects, it is understood that individual objects share the same number of observations, and that resampling of one component implies synchronized resampling of the others.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"A DataFrame instance, from DataFrames.jl, is an example of data, the observations being the rows. LearnAPI.jl makes no assumptions about how observations can be accessed, except in the case of the output of obs, which must implement the MLUtils.jl getobs/numobs interface. 
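In base-Julia terms, the observation-access contract can be mimicked as follows. This is a sketch only: the real interface is MLUtils.getobs/MLUtils.numobs, and the function names below are local stand-ins, not MLUtils.jl itself.

```julia
# Stand-ins for MLUtils.numobs/MLUtils.getobs, illustrating the contract only:
# for arrays, observations are slices along the *last* dimension.
numobs(A::AbstractArray) = size(A, ndims(A))
getobs(A::AbstractArray, I) = A[ntuple(_ -> Colon(), ndims(A) - 1)..., I]

X = reshape(1:12, 3, 4)  # 3 features × 4 observations
numobs(X)                # 4 observations
getobs(X, 2:3)           # observations (columns) 2 and 3
```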
For example, it is generally ambiguous whether the rows or columns of a matrix are considered observations, but if a matrix is returned by obs the observations must be the columns.","category":"page"},{"location":"reference/#hyperparameters","page":"Reference","title":"Hyperparameters","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Besides the data it consumes, a machine learning algorithm's behavior is governed by a number of user-specified hyperparameters, such as the number of trees in a random forest. In LearnAPI.jl, one is allowed to have hyperparameters that are not data-generic. For example, a class weight dictionary will only make sense for a target taking values in the set of dictionary keys. ","category":"page"},{"location":"reference/#proxy","page":"Reference","title":"Targets and target proxies","text":"","category":"section"},{"location":"reference/#Context","page":"Reference","title":"Context","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"After training, a supervised classifier predicts labels on some input which are then compared with ground truth labels using some accuracy measure, to assess the performance of the classifier. Alternatively, the classifier predicts class probabilities, which are instead paired with ground truth labels using a proper scoring rule, say. In outlier detection, \"outlier\"/\"inlier\" predictions, or probability-like scores, are similarly compared with ground truth labels. In clustering, integer labels assigned to observations by the clustering algorithm can be paired with human labels using, say, the Rand index. 
In survival analysis, predicted survival functions or probability distributions are compared with censored ground truth survival times.","category":"page"},{"location":"reference/#Definitions","page":"Reference","title":"Definitions","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"More generally, whenever we have a variable (e.g., a class label) that can (in principle) be paired with a predicted value, or some predicted \"proxy\" for that variable (such as a class probability), then we call the variable a target variable, and the predicted output a target proxy. In this definition, it is immaterial whether or not the target appears in training (is supervised) or whether or not the model generalizes to new observations (\"learns\").","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"LearnAPI.jl provides singleton target proxy types for prediction dispatch. These are also used to distinguish performance metrics provided by the package StatisticalMeasures.jl.","category":"page"},{"location":"reference/#algorithms","page":"Reference","title":"Algorithms","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"An object implementing the LearnAPI.jl interface is called an algorithm, although it is more accurately \"the configuration of some algorithm\".¹ It will have a type name reflecting the name of some ML/statistics algorithm (e.g., RandomForestRegressor) and it will encapsulate a particular set of user-specified hyperparameters.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"Additionally, for alg::Alg to be a LearnAPI algorithm, we require:","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"Base.propertynames(alg) returns the hyperparameter names; values can be accessed using Base.getproperty\nIf alg is an algorithm, 
then so are all instances of the same type.\nIf _alg is another algorithm, then alg == _alg if and only if typeof(alg) == typeof(_alg) and corresponding properties are ==. This includes properties that are random number generators (which should be copied in training to avoid mutation).\nIf an algorithm has other algorithms as hyperparameters, then LearnAPI.is_composite(alg) must be true (fallback is false).\nA keyword constructor for Alg exists, providing default values for all non-algorithm hyperparameters.\nAt least one non-trait LearnAPI.jl function must be overloaded for instances of Alg, and accordingly LearnAPI.functions(algorithm) must be non-empty.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"Any object alg for which LearnAPI.functions(alg) is non-empty is understood to have a valid implementation of the LearnAPI.jl interface.","category":"page"},{"location":"reference/#Example","page":"Reference","title":"Example","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Any instance of GradientRidgeRegressor defined below meets all but the last criterion above:","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"struct GradientRidgeRegressor{T<:Real}\n\tlearning_rate::T\n\tepochs::Int\n\tl2_regularization::T\nend\nGradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) =\n GradientRidgeRegressor(learning_rate, epochs, l2_regularization)","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"The same is not true if we make this a mutable struct. 
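For a mutable variant, the required field-by-field equality can be supplied explicitly, for example as below. This is a sketch, not part of LearnAPI.jl, and the mutable type is hypothetical.

```julia
# Hypothetical mutable variant of the struct above, with field-wise `==`:
mutable struct MutableGradientRidgeRegressor{T<:Real}
    learning_rate::T
    epochs::Int
    l2_regularization::T
end
MutableGradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) =
    MutableGradientRidgeRegressor(learning_rate, epochs, l2_regularization)

# Without this, `==` for a mutable struct falls back to `===` (object identity):
Base.:(==)(a::MutableGradientRidgeRegressor, b::MutableGradientRidgeRegressor) =
    all(getfield(a, name) == getfield(b, name)
        for name in fieldnames(MutableGradientRidgeRegressor))
```

With this definition, two independently constructed instances with equal hyperparameters compare equal, as the criteria above require.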
In that case we will need to appropriately overload Base.== for GradientRidgeRegressor.","category":"page"},{"location":"reference/#Methods","page":"Reference","title":"Methods","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Only these method names are exported: fit, obsfit, predict, obspredict, transform, obstransform, inverse_transform, minimize, and obs. All new implementations must implement obsfit, the accessor function LearnAPI.algorithm and the trait LearnAPI.functions.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"fit/obsfit: for training algorithms that generalize to new data\npredict/obspredict: for outputting targets or target proxies (such as probability density functions)\ntransform/obstransform: similar to predict, but for arbitrary kinds of output, and which can be paired with an inverse_transform method\ninverse_transform: for inverting the output of transform (\"inverting\" broadly understood)\nminimize: for stripping the model output by fit of inessential content, for purposes of serialization.\nobs: a method for exposing to the user \"optimized\", algorithm-specific representations of data, which can be passed to obsfit, obspredict or obstransform, but which can also be efficiently resampled using the getobs/numobs interface provided by MLUtils.jl.\nAccessor functions: include things like feature_importances and training_losses, for extracting, from training outcomes, information common to many algorithms. \nAlgorithm traits: special methods that promise specific algorithm behavior or for recording general information about the algorithm. 
The only universally compulsory trait is LearnAPI.functions(algorithm), which returns a list of the explicitly overloaded non-trait methods.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"¹ We acknowledge users may not like this terminology, and may know \"algorithm\" by some other name, such as \"strategy\", \"options\", \"hyperparameter set\", \"configuration\", or \"model\". Consensus on this point is difficult; see, e.g., this Julia Discourse discussion.","category":"page"},{"location":"accessor_functions/#accessor_functions","page":"Accessor Functions","title":"Accessor Functions","text":"","category":"section"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"The sole argument of an accessor function is the output, model, of fit or obsfit.","category":"page"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"LearnAPI.algorithm(model)\nLearnAPI.extras(model)\nLearnAPI.coefficients(model)\nLearnAPI.intercept(model)\nLearnAPI.tree(model)\nLearnAPI.trees(model)\nLearnAPI.feature_importances(model)\nLearnAPI.training_labels(model)\nLearnAPI.training_losses(model)\nLearnAPI.training_scores(model)\nLearnAPI.components(model)","category":"page"},{"location":"accessor_functions/#Implementation-guide","page":"Accessor Functions","title":"Implementation guide","text":"","category":"section"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"All new implementations must implement LearnAPI.algorithm. All others are optional. 
All implemented accessor functions must be added to the list returned by LearnAPI.functions.","category":"page"},{"location":"accessor_functions/#Reference","page":"Accessor Functions","title":"Reference","text":"","category":"section"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"LearnAPI.algorithm\nLearnAPI.extras\nLearnAPI.coefficients\nLearnAPI.intercept\nLearnAPI.tree\nLearnAPI.trees\nLearnAPI.feature_importances\nLearnAPI.training_losses\nLearnAPI.training_scores\nLearnAPI.training_labels\nLearnAPI.components","category":"page"},{"location":"accessor_functions/#LearnAPI.algorithm","page":"Accessor Functions","title":"LearnAPI.algorithm","text":"LearnAPI.algorithm(model)\nLearnAPI.algorithm(minimized_model)\n\nRecover the algorithm used to train model or the output of minimize(model).\n\nIn other words, if model = fit(algorithm, data...), for some algorithm and data, then\n\nLearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(minimize(model))\n\nis true.\n\nNew implementations\n\nImplementation is compulsory for new algorithm types. The behaviour described above is the only contract. If implemented, you must include algorithm in the tuple returned by the LearnAPI.functions trait. 
\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.extras","page":"Accessor Functions","title":"LearnAPI.extras","text":"LearnAPI.extras(model)\n\nReturn miscellaneous byproducts of an algorithm's computation, from the object model returned by a call of the form fit(algorithm, data).\n\nFor \"static\" algorithms (those without training data) it may be necessary to first call transform or predict on model.\n\nSee also fit.\n\nNew implementations\n\nImplementation is discouraged for byproducts already covered by other LearnAPI.jl accessor functions: LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.\n\nIf implemented, you must include extras in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.coefficients","page":"Accessor Functions","title":"LearnAPI.coefficients","text":"LearnAPI.coefficients(model)\n\nFor a linear model, return the learned coefficients. The value returned has the form of an abstract vector of feature_or_class::Symbol => coefficient::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]) or, in the case of multi-targets, feature::Symbol => coefficients::AbstractVector{<:Real} pairs.\n\nThe model reports coefficients if LearnAPI.coefficients in LearnAPI.functions(LearnAPI.algorithm(model)).\n\nSee also LearnAPI.intercept.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include coefficients in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.intercept","page":"Accessor Functions","title":"LearnAPI.intercept","text":"LearnAPI.intercept(model)\n\nFor a linear model, return the learned intercept. 
The value returned is Real (single target) or an AbstractVector{<:Real} (multi-target).\n\nThe model reports intercept if LearnAPI.intercept in LearnAPI.functions(LearnAPI.algorithm(model)).\n\nSee also LearnAPI.coefficients.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include intercept in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.tree","page":"Accessor Functions","title":"LearnAPI.tree","text":"LearnAPI.tree(model)\n\nReturn a user-friendly tree, in the form of a root object implementing the following interface defined in AbstractTrees.jl:\n\nsubtypes AbstractTrees.AbstractNode{T}\nimplements AbstractTrees.children()\nimplements AbstractTrees.printnode()\n\nSuch a tree can be visualized using the TreeRecipe.jl package, for example.\n\nSee also LearnAPI.trees.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include tree in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.trees","page":"Accessor Functions","title":"LearnAPI.trees","text":"LearnAPI.trees(model)\n\nFor some ensemble model, return a vector of trees. See LearnAPI.tree for the form of such trees.\n\nSee also LearnAPI.tree.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include trees in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.feature_importances","page":"Accessor Functions","title":"LearnAPI.feature_importances","text":"LearnAPI.feature_importances(model)\n\nReturn the algorithm-specific feature importances of a model output by fit(algorithm, ...) for some algorithm. 
The value returned has the form of an abstract vector of feature::Symbol => importance::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]).\n\nThe algorithm supports feature importances if LearnAPI.feature_importances in LearnAPI.functions(algorithm).\n\nIf an algorithm is sometimes unable to report feature importances then LearnAPI.feature_importances will return all importances as 0.0, as in [:gender => 0.0, :height => 0.0, :weight => 0.0].\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include feature_importances in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.training_losses","page":"Accessor Functions","title":"LearnAPI.training_losses","text":"LearnAPI.training_losses(model)\n\nReturn the training losses obtained when running model = fit(algorithm, ...) for some algorithm.\n\nSee also fit.\n\nNew implementations\n\nImplement for iterative algorithms that compute and record training losses as part of training (e.g. neural networks).\n\nIf implemented, you must include training_losses in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.training_scores","page":"Accessor Functions","title":"LearnAPI.training_scores","text":"LearnAPI.training_scores(model)\n\nReturn the training scores obtained when running model = fit(algorithm, ...) for some algorithm.\n\nSee also fit.\n\nNew implementations\n\nImplement for algorithms, such as outlier detection algorithms, which associate a score with each observation during training, where these scores are of interest in later processes (e.g., in defining normalized scores for new data).\n\nIf implemented, you must include training_scores in the tuple returned by the LearnAPI.functions trait. 
\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.training_labels","page":"Accessor Functions","title":"LearnAPI.training_labels","text":"LearnAPI.training_labels(model)\n\nReturn the training labels obtained when running model = fit(algorithm, ...) for some algorithm.\n\nSee also fit.\n\nNew implementations\n\nIf implemented, you must include training_labels in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.components","page":"Accessor Functions","title":"LearnAPI.components","text":"LearnAPI.components(model)\n\nFor a composite model, return the component models (fit outputs). These will be in the form of a vector of named pairs, property_name::Symbol => component_model. Here property_name is the name of some algorithm-valued property (hyper-parameter) of algorithm = LearnAPI.algorithm(model).\n\nA composite model is one for which the corresponding algorithm includes one or more algorithm-valued properties, and for which LearnAPI.is_composite(algorithm) is true.\n\nSee also is_composite.\n\nNew implementations\n\nImplement if and only if model is a composite model. \n\nIf implemented, you must include components in the tuple returned by the LearnAPI.functions trait. 
\n\n\n\n\n\n","category":"function"},{"location":"patterns/incremental_models/#Incremental-Algorithms","page":"Incremental Algorithms","title":"Incremental Algorithms","text":"","category":"section"},{"location":"patterns/learning_a_probability_distribution/#Learning-a-Probability-Distribution","page":"Learning a Probability Distribution","title":"Learning a Probability Distribution","text":"","category":"section"},{"location":"patterns/dimension_reduction/#Dimension-Reduction","page":"Dimension Reduction","title":"Dimension Reduction","text":"","category":"section"},{"location":"patterns/time_series_forecasting/#Time-Series-Forecasting","page":"Time Series Forecasting","title":"Time Series Forecasting","text":"","category":"section"},{"location":"minimize/#algorithm_minimize","page":"minimize","title":"minimize","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"minimize(model) -> ","category":"page"},{"location":"minimize/#Typical-workflow","page":"minimize","title":"Typical workflow","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"model = fit(algorithm, X, y)\nŷ = predict(model, LiteralTarget(), Xnew)\nLearnAPI.feature_importances(model)\n\nsmall_model = minimize(model)\nserialize(\"my_ridge.jls\", small_model)\n\nrecovered_model = deserialize(\"my_ridge.jls\")\n@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ\n\n# throws MethodError:\nLearnAPI.feature_importances(recovered_model)","category":"page"},{"location":"minimize/#Implementation-guide","page":"minimize","title":"Implementation guide","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"method compulsory? 
fallback requires\nminimize no identity fit","category":"page"},{"location":"minimize/#Reference","page":"minimize","title":"Reference","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"minimize","category":"page"},{"location":"minimize/#LearnAPI.minimize","page":"minimize","title":"LearnAPI.minimize","text":"minimize(model; options...)\n\nReturn a version of model that will generally have a smaller memory allocation than model, suitable for serialization. Here model is any object returned by fit. Accessor functions that can be called on model may not work on minimize(model), but predict, transform and inverse_transform will work, if implemented for model. Check LearnAPI.functions(LearnAPI.algorithm(model)) to see what the original model implements.\n\nSpecific algorithms may provide keyword options to control how much of the original functionality is preserved by minimize.\n\nExtended help\n\nNew implementations\n\nOverloading minimize for new algorithms is optional. The fallback is the identity. If overloaded, you must include minimize in the tuple returned by the LearnAPI.functions trait. \n\nNew implementations must enforce the following identities, whenever the right-hand side is defined:\n\npredict(minimize(model; options...), args...; kwargs...) ==\n predict(model, args...; kwargs...)\ntransform(minimize(model; options...), args...; kwargs...) ==\n transform(model, args...; kwargs...)\ninverse_transform(minimize(model; options...), args...; kwargs...) ==\n inverse_transform(model, args...; kwargs...)\n\nAdditionally:\n\nminimize(minimize(model)) == minimize(model)\n\n\n\n\n\n","category":"function"},{"location":"obs/#data_interface","page":"obs","title":"obs","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"The MLUtils.jl package provides two methods getobs and numobs for resampling data divided into multiple observations, including arrays and tables. 
The data objects returned below are guaranteed to implement this interface and can be passed to the relevant method (obsfit, obspredict or obstransform) possibly after resampling using MLUtils.getobs. This may provide performance advantages over naive workflows.","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"obs(fit, algorithm, data...) -> \nobs(predict, algorithm, data...) -> \nobs(transform, algorithm, data...) -> ","category":"page"},{"location":"obs/#Typical-workflows","page":"obs","title":"Typical workflows","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"LearnAPI.jl makes no assumptions about the form of data X and y in a call like fit(algorithm, X, y). The particular algorithm is free to articulate its own requirements. However, in this example, the definition","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"obsdata = obs(fit, algorithm, X, y)","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"combines X and y in a single object guaranteed to implement the MLUtils.jl getobs/numobs interface, which can be passed to obsfit instead of fit, as is, or after resampling using MLUtils.getobs:","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"# equivalent to `model = fit(algorithm, X, y)`:\nmodel = obsfit(algorithm, obsdata)\n\n# with resampling:\nresampled_obsdata = MLUtils.getobs(obsdata, 1:100)\nmodel = obsfit(algorithm, resampled_obsdata)","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"In some implementations, the alternative pattern above can be used to avoid repeating unnecessary internal data preprocessing, or inefficient resampling. 
For example, here's how a user might call obs and MLUtils.getobs to perform efficient cross-validation:","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"using LearnAPI\nimport MLUtils\n\nX = \ny = \nalgorithm = \n\ntrain_test_folds = map([1:10, 11:20, 21:30]) do test\n (setdiff(1:30, test), test)\nend \n\n# create fixed model-specific representations of the whole data set:\nfit_data = obs(fit, algorithm, X, y)\npredict_data = obs(predict, algorithm, X)\n\nscores = map(train_test_folds) do (train_indices, test_indices)\n \n\t# train using model-specific representation of data:\n\ttrain_data = MLUtils.getobs(fit_data, train_indices)\n\tmodel = obsfit(algorithm, train_data)\n\t\n\t# predict on the held-out fold:\n\ttest_data = MLUtils.getobs(predict_data, test_indices)\n\tŷ = obspredict(model, LiteralTarget(), test_data)\n\n return \n\t\nend ","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"Note here that the output of obspredict will match the representation of y, i.e., there is no concept of an algorithm-specific representation of outputs, only inputs.","category":"page"},{"location":"obs/#Implementation-guide","page":"obs","title":"Implementation guide","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"method compulsory? fallback\nobs depends slurps data argument\n ","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"If the data consumed by fit, predict or transform consists only of tables and arrays (with last dimension the observation dimension) then overloading obs is optional. 
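When obs is overloaded, a typical return value is a thin wrapper around preprocessed data that resamples cheaply via Base.getindex and Base.length. A minimal sketch follows; all names are illustrative, not part of LearnAPI.jl.

```julia
# Illustrative wrapper an `obs(fit, ...)` overload might return: preprocessed
# training data that resamples cheaply. All names here are hypothetical.
struct RidgeFitObs{T}
    A::Matrix{T}             # p × n matrix, observations as columns
    names::Vector{Symbol}    # feature names, one per row of A
    y::Vector{T}             # target values, one per observation
end

# Cheap observation resampling, as suggested by the getobs/numobs contract:
Base.getindex(data::RidgeFitObs, I) =
    RidgeFitObs(data.A[:, I], data.names, data.y[I])
Base.length(data::RidgeFitObs) = length(data.y)

data = RidgeFitObs(rand(3, 10), [:a, :b, :c], rand(10))
train = data[1:6]   # first six observations, without reprocessing the table
```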
However, if an implementation overloads obs to return a (thinly wrapped) representation of user data that is closer to what the core algorithm actually uses, and overloads MLUtils.getobs (or, more typically Base.getindex) to make resampling of that representation efficient, then those optimizations become available to the user, without the user concerning herself with the details of the representation.","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"A sample implementation is given in the obs document-string below.","category":"page"},{"location":"obs/#Reference","page":"obs","title":"Reference","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"obs","category":"page"},{"location":"obs/#LearnAPI.obs","page":"obs","title":"LearnAPI.obs","text":"obs(func, algorithm, data...)\n\nWhere func is fit, predict or transform, return a combined, algorithm-specific, representation of data..., which can be passed directly to obsfit, obspredict or obstransform, as shown in the example below.\n\nThe returned object implements the getobs/numobs observation-resampling interface provided by MLUtils.jl, even if data does not.\n\nCalling func on the returned object may be cheaper than calling func directly on data.... 
And resampling the returned object using MLUtils.getobs may be cheaper than directly resampling the components of data (an operation not provided by the LearnAPI.jl interface).\n\nExample\n\nUsual workflow, using data-specific resampling methods:\n\nX = \ny = \n\nXtrain = Tables.select(X, 1:100)\nytrain = y[1:100]\nmodel = fit(algorithm, Xtrain, ytrain)\nŷ = predict(model, LiteralTarget(), Tables.select(X, 101:150))\n\nAlternative workflow using obs:\n\nimport MLUtils\n\nfitdata = obs(fit, algorithm, X, y)\npredictdata = obs(predict, algorithm, X)\n\nmodel = obsfit(algorithm, MLUtils.getobs(fitdata, 1:100))\nẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, 101:150))\n@assert ẑ == ŷ\n\nSee also obsfit, obspredict, obstransform.\n\nExtended help\n\nNew implementations\n\nIf the data to be consumed in standard user calls to fit, predict or transform consists only of tables and arrays (with last dimension the observation dimension) then overloading obs is optional, but the user will get no performance benefits by using it. The implementation of obs is optional under more general circumstances stated at the end.\n\nThe fallback for obs just slurps the provided data:\n\nobs(func, alg, data...) = data\n\nThe only contractual obligation of obs is to return an object implementing the getobs/numobs interface. Generally it suffices to overload Base.getindex and Base.length. However, note that implementations of obsfit, obspredict, and obstransform depend on the form of output of obs.\n\nIf overloaded, you must include obs in the tuple returned by the LearnAPI.functions trait. \n\nSample implementation\n\nSuppose that fit, for an algorithm of type Alg, is to have the primary signature\n\nfit(algorithm::Alg, X, y)\n\nwhere X is a table, y a vector. Internally, the algorithm is to call a lower level function\n\ntrain(A, names, y)\n\nwhere A = Tables.matrix(X)' and names are the column names of X. 
Then relevant parts of an implementation might look like this:\n\n# thin wrapper for algorithm-specific representation of data:\nstruct ObsData{T}\n A::Matrix{T}\n names::Vector{Symbol}\n y::Vector{T}\nend\n\n# (indirect) implementation of `getobs/numobs`:\nBase.getindex(data::ObsData, I) =\n ObsData(data.A[:,I], data.names, data.y[I])\nBase.length(data::ObsData) = length(data.y)\n\n# implementation of `obs`:\nfunction LearnAPI.obs(::typeof(fit), ::Alg, X, y)\n table = Tables.columntable(X)\n names = Tables.columnnames(table) |> collect\n return ObsData(Tables.matrix(table)', names, y)\nend\n\n# implementation of `obsfit`:\nfunction LearnAPI.obsfit(algorithm::Alg, data::ObsData; verbosity=1)\n coremodel = train(data.A, data.names, data.y)\n verbosity > 0 && @info \"Training using these features: $(data.names).\"\n \n return coremodel\nend\n\nWhen is overloading obs optional?\n\nOverloading obs is optional, for a given typeof(algorithm) and typeof(func), if the components of data in the standard call func(algorithm_or_model, data...) are already expected to separately implement the getobs/numobs interface. This is true for arrays whose last dimension is the observation dimension, and for suitable tables.\n\n\n\n\n\n","category":"function"},{"location":"","page":"Home","title":"Home","text":"\n\nLearnAPI.jl\n
\n\nA base Julia interface for machine learning and statistics \n
\n
","category":"page"},{"location":"","page":"Home","title":"Home","text":"LearnAPI.jl is a lightweight, functional-style interface, providing a collection of methods, such as fit and predict, to be implemented by algorithms from machine learning and statistics. Through such implementations, these algorithms buy into functionality, such as hyperparameter optimization, as provided by ML/statistics toolboxes and other packages. LearnAPI.jl also provides a number of Julia traits for promising specific behavior.","category":"page"},{"location":"","page":"Home","title":"Home","text":"🚧","category":"page"},{"location":"","page":"Home","title":"Home","text":"warning: Warning\nThe API described here is under active development and not ready for adoption. Join an ongoing design discussion at this Julia Discourse thread.","category":"page"},{"location":"#Sample-workflow","page":"Home","title":"Sample workflow","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Suppose forest is some object encapsulating the hyperparameters of the random forest algorithm (the number of trees, etc.). 
Then, a LearnAPI.jl interface can be implemented, for objects with the type of forest, to enable the following basic workflow:","category":"page"},{"location":"","page":"Home","title":"Home","text":"X = \ny = \nXnew = \n\n# Train:\nmodel = fit(forest, X, y)\n\n# Predict probability distributions:\npredict(model, Distribution(), Xnew)\n\n# Generate point predictions:\nŷ = predict(model, LiteralTarget(), Xnew) # or `predict(model, Xnew)`\n\n# Apply an \"accessor function\" to inspect byproducts of training:\nLearnAPI.feature_importances(model)\n\n# Slim down and otherwise prepare model for serialization:\nsmall_model = minimize(model)\nserialize(\"my_random_forest.jls\", small_model)\n\n# Recover saved model and algorithm configuration:\nrecovered_model = deserialize(\"my_random_forest.jls\")\n@assert LearnAPI.algorithm(recovered_model) == forest\n@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ","category":"page"},{"location":"","page":"Home","title":"Home","text":"Distribution and LiteralTarget are singleton types owned by LearnAPI.jl. They allow dispatch based on the kind of target proxy, a key LearnAPI.jl concept. LearnAPI.jl places more emphasis on the notion of target variables and target proxies than on the usual supervised/unsupervised learning dichotomy. From this point of view, a supervised algorithm is simply one in which a target variable exists, and happens to appear as an input to training but not to prediction.","category":"page"},{"location":"","page":"Home","title":"Home","text":"In LearnAPI.jl, a method called obs gives users access to an \"internal\", algorithm-specific, representation of input data, which is always \"observation-accessible\", in the sense that it can be resampled using MLUtils.jl getobs/numobs interface. 
The implementation can arrange for this resampling to be efficient, and workflows based on obs can have performance benefits.","category":"page"},{"location":"#Learning-more","page":"Home","title":"Learning more","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Anatomy of an Implementation: informal introduction to the main actors in a new LearnAPI.jl implementation\nReference: official specification\nCommon Implementation Patterns: implementation suggestions for common, informally defined, algorithm types\nTesting an Implementation","category":"page"},{"location":"patterns/outlier_detection/#Outlier-Detection","page":"Outlier Detection","title":"Outlier Detection","text":"","category":"section"},{"location":"patterns/incremental_algorithms/#Incremental-Models","page":"Incremental Models","title":"Incremental Models","text":"","category":"section"}] +[{"location":"patterns/regression/#Regression","page":"Regression","title":"Regression","text":"","category":"section"},{"location":"patterns/regression/","page":"Regression","title":"Regression","text":"See these examples from tests.","category":"page"},{"location":"patterns/missing_value_imputation/#Missing-Value-Imputation","page":"Missing Value Imputation","title":"Missing Value Imputation","text":"","category":"section"},{"location":"patterns/iterative_algorithms/#Iterative-Algorithms","page":"Iterative Algorithms","title":"Iterative Algorithms","text":"","category":"section"},{"location":"patterns/survival_analysis/#Survival-Analysis","page":"Survival Analysis","title":"Survival Analysis","text":"","category":"section"},{"location":"predict_transform/#operations","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"Standard 
methods:","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"predict(model, kind_of_proxy, data...) -> prediction\ntransform(model, data...) -> transformed_data\ninverse_transform(model, data...) -> inverted_data","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"Methods consuming output, obsdata, of data-preprocessor obs:","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"obspredict(model, kind_of_proxy, obsdata) -> prediction\nobstransform(model, obsdata) -> transformed_data","category":"page"},{"location":"predict_transform/#Typical-worklows","page":"predict, transform, and relatives","title":"Typical workflows","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"# Train some supervised `algorithm`:\nmodel = fit(algorithm, X, y)\n\n# Predict probability distributions:\nŷ = predict(model, Distribution(), Xnew)\n\n# Generate point predictions:\nŷ = predict(model, LiteralTarget(), Xnew)","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"# Training a dimension-reducing `algorithm`:\nmodel = fit(algorithm, X)\nXnew_reduced = transform(model, Xnew)\n\n# Apply an approximate right inverse:\ninverse_transform(model, Xnew_reduced)","category":"page"},{"location":"predict_transform/#An-advanced-workflow","page":"predict, transform, and relatives","title":"An advanced workflow","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"fitdata = obs(fit, algorithm, X, y)\npredictdata = 
obs(predict, algorithm, Xnew)\nmodel = obsfit(algorithm, fitdata)\nŷ = obspredict(model, LiteralTarget(), predictdata)","category":"page"},{"location":"predict_transform/#Implementation-guide","page":"predict, transform, and relatives","title":"Implementation guide","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"The methods predict and transform are not directly overloaded. Implement obspredict and obstransform instead:","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"method compulsory? fallback requires\nobspredict no none fit\nobstransform no none fit\ninverse_transform no none fit, obstransform","category":"page"},{"location":"predict_transform/#Predict-or-transform?","page":"predict, transform, and relatives","title":"Predict or transform?","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"If the algorithm has a notion of target variable, then arrange for obspredict to output each supported kind of target proxy (LiteralTarget(), Distribution(), etc.).","category":"page"},{"location":"predict_transform/","page":"predict, transform, and relatives","title":"predict, transform, and relatives","text":"For output not associated with a target variable, implement obstransform instead, which does not dispatch on LearnAPI.KindOfProxy, but can be optionally paired with an implementation of inverse_transform for returning (approximate) right inverses to transform.","category":"page"},{"location":"predict_transform/#Reference","page":"predict, transform, and relatives","title":"Reference","text":"","category":"section"},{"location":"predict_transform/","page":"predict, transform, and 
relatives","text":"predict\nobspredict\ntransform\nobstransform\ninverse_transform","category":"page"},{"location":"predict_transform/#LearnAPI.predict","page":"predict, transform, and relatives","title":"LearnAPI.predict","text":"predict(model, kind_of_proxy::LearnAPI.KindOfProxy, data...)\npredict(model, data...)\n\nThe first signature returns target or target proxy predictions for input features data, according to some model returned by fit or obsfit. Where supported, these are literally target predictions if kind_of_proxy = LiteralTarget(), and probability density/mass functions if kind_of_proxy = Distribution(). List all options with LearnAPI.kinds_of_proxy(algorithm), where algorithm = LearnAPI.algorithm(model).\n\nThe shortcut predict(model, data...) = predict(model, LiteralTarget(), data...) is also provided.\n\nArguments\n\nmodel is anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.\ndata: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.\n\nExample\n\nIn the following, algorithm is some supervised learning algorithm with training features X, training target y, and test features Xnew:\n\nmodel = fit(algorithm, X, y; verbosity=0)\npredict(model, LiteralTarget(), Xnew)\n\nNote predict does not mutate any argument, except in the special case LearnAPI.predict_or_transform_mutates(algorithm) = true.\n\nSee also obspredict, fit, transform, inverse_transform.\n\nExtended help\n\nNew implementations\n\nLearnAPI.jl provides the following definition of predict which is never to be directly overloaded:\n\npredict(model, kop::LearnAPI.KindOfProxy, data...) 
=\n obspredict(model, kop, obs(predict, LearnAPI.algorithm(model), data...))\n\nRather, new algorithms overload obspredict.\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.obspredict","page":"predict, transform, and relatives","title":"LearnAPI.obspredict","text":"obspredict(model, kind_of_proxy::LearnAPI.KindOfProxy, obsdata)\n\nSimilar to predict but consumes algorithm-specific representations of input data, obsdata, as returned by obs(predict, algorithm, data...). Here data... is the form of data expected in the main predict method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).\n\nFor some algorithms and workflows, obspredict will have a performance benefit over predict. See more at obs.\n\nExample\n\nIn the following, algorithm is some supervised learning algorithm with training features X, training target y, and test features Xnew:\n\nmodel = fit(algorithm, X, y)\nobsdata = obs(predict, algorithm, Xnew)\nŷ = obspredict(model, LiteralTarget(), obsdata)\n@assert ŷ == predict(model, LiteralTarget(), Xnew)\n\nSee also predict, fit, transform, inverse_transform, obs.\n\nExtended help\n\nNew implementations\n\nImplementation of obspredict is optional, but required to enable predict. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard predict call expects, as in the call predict(model, kind_of_proxy, data...). Note data is always a tuple, even if predict has only one data argument. See more at obs.\n\nIf LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obspredict may mutate its first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform or inverse_transform. 
This is necessary for some non-generalizing algorithms but is otherwise discouraged. See more at fit.\n\nIf overloaded, you must include both LearnAPI.obspredict and LearnAPI.predict in the list of methods returned by the LearnAPI.functions trait.\n\nAn implementation is provided for each kind of target proxy you wish to support. See the LearnAPI.jl documentation for options. Each supported kind_of_proxy instance should be listed in the return value of the LearnAPI.kinds_of_proxy(algorithm) trait.\n\nIf, additionally, minimize(model) is overloaded, then the following identity must hold:\n\nobspredict(minimize(model), args...) = obspredict(model, args...)\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.transform","page":"predict, transform, and relatives","title":"LearnAPI.transform","text":"transform(model, data...)\n\nReturn a transformation of some data, using some model, as returned by fit.\n\nArguments\n\nmodel is anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.\ndata: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.\n\nExample\n\nHere X and Xnew are data of the same form:\n\n# For an algorithm that generalizes to new data (\"learns\"):\nmodel = fit(algorithm, X; verbosity=0)\ntransform(model, Xnew)\n\n# For a static (non-generalizing) transformer:\nmodel = fit(algorithm)\ntransform(model, X)\n\nNote transform does not mutate any argument, except in the special case LearnAPI.predict_or_transform_mutates(algorithm) = true.\n\nSee also obstransform, fit, predict, inverse_transform.\n\nExtended help\n\nNew implementations\n\nLearnAPI.jl provides the following definition of transform which is never to be directly overloaded:\n\ntransform(model, data...) 
=\n obstransform(model, obs(transform, LearnAPI.algorithm(model), data...))\n\nRather, new algorithms overload obstransform.\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.obstransform","page":"predict, transform, and relatives","title":"LearnAPI.obstransform","text":"obstransform(model, obsdata)\n\nSimilar to transform but consumes algorithm-specific representations of input data, obsdata, as returned by obs(transform, algorithm, data...). Here data... is the form of data expected in the main transform method. Alternatively, such obsdata may be replaced by a resampled version, where resampling is performed using MLUtils.getobs (always supported).\n\nFor some algorithms and workflows, obstransform will have a performance benefit over transform. See more at obs.\n\nExample\n\nIn the following, algorithm is some unsupervised learning algorithm with training features X, and test features Xnew:\n\nmodel = fit(algorithm, X)\nobsdata = obs(transform, algorithm, Xnew)\nW = obstransform(model, obsdata)\n@assert W == transform(model, Xnew)\n\nSee also transform, fit, predict, inverse_transform, obs.\n\nExtended help\n\nNew implementations\n\nImplementation of obstransform is optional, but required to enable transform. The method must also handle obsdata in the case it is replaced by MLUtils.getobs(obsdata, I) for some collection I of indices. If obs is not overloaded, then obsdata = data, where data... is what the standard transform call expects, as in the call transform(model, data...). Note data is always a tuple, even if transform has only one data argument. See more at obs.\n\nIf LearnAPI.predict_or_transform_mutates(algorithm) is overloaded to return true, then obstransform may mutate its first argument, but not in a way that alters the result of a subsequent call to obspredict, obstransform or inverse_transform. This is necessary for some non-generalizing algorithms but is otherwise discouraged. 
See more at fit.\n\nIf overloaded, you must include both LearnAPI.obstransform and LearnAPI.transform in the list of methods returned by the LearnAPI.functions trait.\n\nIf, additionally, minimize(model) is overloaded, then the following identity must hold:\n\nobstransform(minimize(model), args...) = obstransform(model, args...)\n\n\n\n\n\n","category":"function"},{"location":"predict_transform/#LearnAPI.inverse_transform","page":"predict, transform, and relatives","title":"LearnAPI.inverse_transform","text":"inverse_transform(model, data)\n\nInverse transform data according to some model returned by fit. Here \"inverse\" is to be understood broadly, e.g., an approximate right inverse for transform.\n\nArguments\n\nmodel: anything returned by a call of the form fit(algorithm, ...), for some LearnAPI-compliant algorithm.\ndata: something having the same form as the output of transform(model, inputs...)\n\nExample\n\nIn the following, algorithm is some dimension-reducing algorithm that generalizes to new data (such as PCA); Xtrain is the training input and Xnew the input to be reduced:\n\nmodel = fit(algorithm, Xtrain; verbosity=0)\nW = transform(model, Xnew) # reduced version of `Xnew`\nŴ = inverse_transform(model, W) # embedding of `W` in original space\n\nSee also fit, transform, predict.\n\nExtended help\n\nNew implementations\n\nImplementation is optional. If implemented, you must include inverse_transform in the tuple returned by the LearnAPI.functions trait. \n\nIf, additionally, minimize(model) is overloaded, then the following identity must hold:\n\ninverse_transform(minimize(model), args...) 
= inverse_transform(model, args...)\n\n\n\n\n\n","category":"function"},{"location":"patterns/supervised_bayesian_algorithms/#Supervised-Bayesian-Models","page":"Supervised Bayesian Models","title":"Supervised Bayesian Models","text":"","category":"section"},{"location":"patterns/classification/#Classification","page":"Classification","title":"Classification","text":"","category":"section"},{"location":"common_implementation_patterns/#Common-Implementation-Patterns","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"","category":"section"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"🚧","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"warning: Warning\nUnder construction","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"warning: Warning\nThis section is only an implementation guide. 
The definitive specification of the Learn API is given in Reference.","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"This guide is intended to be consulted after reading Anatomy of an Implementation, which introduces the main interface objects and terminology.","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"Although an implementation is defined purely by the methods and traits it implements, most implementations fall into one (or more) of the following informally understood patterns or \"tasks\":","category":"page"},{"location":"common_implementation_patterns/","page":"Common Implementation Patterns","title":"Common Implementation Patterns","text":"Classification: Supervised learners for categorical targets \nRegression: Supervised learners for continuous targets\nIterative Algorithms\nIncremental Algorithms\nStatic Algorithms: Algorithms that do not learn, in the sense that they must be re-executed for each new data set (do not generalize), but which have hyperparameters and/or deliver ancillary information about the computation.\nDimension Reduction: Transformers that learn to reduce feature space dimension\nMissing Value Imputation: Transformers that replace missing values.\nClustering: Algorithms that group data into clusters for classification and possibly dimension reduction. 
May be true learners (generalize to new data) or static.\nOutlier Detection: Supervised, unsupervised, or semi-supervised learners for anomaly detection.\nLearning a Probability Distribution: Algorithms that fit a distribution or distribution-like object to data\nTime Series Forecasting\nTime Series Classification\nSupervised Bayesian Algorithms\nSurvival Analysis","category":"page"},{"location":"traits/#traits","page":"Algorithm Traits","title":"Algorithm Traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Traits generally promise specific algorithm behavior, such as: This algorithm supports per-observation weights, which must appear as the third argument of fit, or This algorithm's transform method predicts Real vectors. They also record more mundane information, such as a package license.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Algorithm traits are functions whose first (and usually only) argument is an algorithm.","category":"page"},{"location":"traits/#Special-two-argument-traits","page":"Algorithm Traits","title":"Special two-argument traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"The two-argument versions of LearnAPI.predict_output_scitype and LearnAPI.predict_output_type are the only overloadable traits with more than one argument.","category":"page"},{"location":"traits/#trait_summary","page":"Algorithm Traits","title":"Trait summary","text":"","category":"section"},{"location":"traits/#traits_list","page":"Algorithm Traits","title":"Overloadable traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"In the examples column of the table below, Table, Continuous, Sampleable are names owned by the package 
ScientificTypesBase.jl.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"trait return value fallback value example\nLearnAPI.functions(algorithm) functions you can apply to algorithm or associated model (traits excluded) () (LearnAPI.fit, LearnAPI.predict, LearnAPI.algorithm)\nLearnAPI.kinds_of_proxy(algorithm) instances kop of KindOfProxy for which an implementation of LearnAPI.predict(algorithm, kop, ...) is guaranteed. () (Distribution(), Interval())\nLearnAPI.position_of_target(algorithm) the positional index¹ of the target in data in fit(algorithm, data...) calls 0 2\nLearnAPI.position_of_weights(algorithm) the positional index¹ of per-observation weights in data in fit(algorithm, data...) 0 3\nLearnAPI.descriptors(algorithm) lists one or more suggestive algorithm descriptors from LearnAPI.descriptors() () (:regression, :probabilistic)\nLearnAPI.is_pure_julia(algorithm) true if implementation is 100% Julia code false true\nLearnAPI.pkg_name(algorithm) name of package providing core code (may be different from package providing LearnAPI.jl implementation) \"unknown\" \"DecisionTree\"\nLearnAPI.pkg_license(algorithm) name of license of package providing core code \"unknown\" \"MIT\"\nLearnAPI.doc_url(algorithm) url providing documentation of the core code \"unknown\" \"https://en.wikipedia.org/wiki/Decision_tree_learning\"\nLearnAPI.load_path(algorithm) a string indicating where the struct for typeof(algorithm) is defined, beginning with name of package providing implementation \"unknown\" FastTrees.LearnAPI.DecisionTreeClassifier\nLearnAPI.is_composite(algorithm) true if one or more properties (fields) of algorithm may be an algorithm false true\nLearnAPI.human_name(algorithm) human name for the algorithm; should be a noun type name with spaces \"elastic net regressor\"\nLearnAPI.iteration_parameter(algorithm) symbolic name of an iteration parameter nothing :epochs\nLearnAPI.fit_scitype(algorithm) upper 
bound on scitype(data) ensuring fit(algorithm, data...) works Union{} Tuple{Table(Continuous), AbstractVector{Continuous}}\nLearnAPI.fit_observation_scitype(algorithm) upper bound on scitype(observation) for observation in data ensuring fit(algorithm, data...) works Union{} Tuple{AbstractVector{Continuous}, Continuous}\nLearnAPI.fit_type(algorithm) upper bound on typeof(data) ensuring fit(algorithm, data...) works Union{} Tuple{AbstractMatrix{<:Real}, AbstractVector{<:Real}}\nLearnAPI.fit_observation_type(algorithm) upper bound on typeof(observation) for observation in data ensuring fit(algorithm, data...) works Union{} Tuple{AbstractVector{<:Real}, Real}\nLearnAPI.predict_input_scitype(algorithm) upper bound on scitype(data) ensuring predict(model, kop, data...) works Union{} Table(Continuous)\nLearnAPI.predict_input_observation_scitype(algorithm) upper bound on scitype(observation) for observation in data ensuring predict(model, kop, data...) works Union{} Vector{Continuous}\nLearnAPI.predict_input_type(algorithm) upper bound on typeof(data) ensuring predict(model, kop, data...) works Union{} AbstractMatrix{<:Real}\nLearnAPI.predict_input_observation_type(algorithm) upper bound on typeof(observation) for observation in data ensuring predict(model, kop, data...) works Union{} Vector{<:Real}\nLearnAPI.predict_output_scitype(algorithm, kind_of_proxy) upper bound on scitype(predict(model, ...)) Any AbstractVector{Continuous}\nLearnAPI.predict_output_type(algorithm, kind_of_proxy) upper bound on typeof(predict(model, ...)) Any AbstractVector{<:Real}\nLearnAPI.transform_input_scitype(algorithm) upper bound on scitype(data) ensuring transform(model, data...) works Union{} Table(Continuous)\nLearnAPI.transform_input_observation_scitype(algorithm) upper bound on scitype(observation) for observation in data ensuring transform(model, data...) 
works Union{} Vector{Continuous}\nLearnAPI.transform_input_type(algorithm) upper bound on typeof(data) ensuring transform(model, data...) works Union{} AbstractMatrix{<:Real}\nLearnAPI.transform_input_observation_type(algorithm) upper bound on typeof(observation) for observation in data ensuring transform(model, data...) works Union{} Vector{<:Real}\nLearnAPI.transform_output_scitype(algorithm) upper bound on scitype(transform(model, ...)) Any Table(Continuous)\nLearnAPI.transform_output_type(algorithm) upper bound on typeof(transform(model, ...)) Any AbstractMatrix{<:Real}\nLearnAPI.predict_or_transform_mutates(algorithm) true if predict or transform mutates first argument false true","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"¹ If the value is 0, then the variable in boldface type is not supported and not expected to appear in data. If length(data) is less than the trait value, then data is understood to exclude the variable, but note that fit can have multiple signatures of varying lengths, as in fit(algorithm, X, y) and fit(algorithm, X, y, w). 
A non-zero value is a promise that fit includes a signature of sufficient length to include the variable.","category":"page"},{"location":"traits/#Derived-Traits","page":"Algorithm Traits","title":"Derived Traits","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"The following convenience methods are provided but not overloadable by new implementations.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"trait return value example\nLearnAPI.name(algorithm) algorithm type name as string \"PCA\"\nLearnAPI.is_algorithm(algorithm) true if LearnAPI.functions(algorithm) is not empty true\nLearnAPI.predict_output_scitype(algorithm) dictionary of upper bounds on the scitype of predictions, keyed on subtypes of LearnAPI.KindOfProxy \nLearnAPI.predict_output_type(algorithm) dictionary of upper bounds on the type of predictions, keyed on subtypes of LearnAPI.KindOfProxy ","category":"page"},{"location":"traits/#Implementation-guide","page":"Algorithm Traits","title":"Implementation guide","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"A single-argument trait is declared following this pattern:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"LearnAPI.is_pure_julia(algorithm::MyAlgorithmType) = true","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"A shorthand for single-argument traits is available:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"@trait MyAlgorithmType is_pure_julia=true","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Multiple traits can be declared like this:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm 
Traits","text":"@trait(\n MyAlgorithmType,\n is_pure_julia = true,\n pkg_name = \"MyPackage\",\n)","category":"page"},{"location":"traits/#The-global-trait-contracts","page":"Algorithm Traits","title":"The global trait contracts","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"To ensure that trait metadata can be stored in an external algorithm registry, LearnAPI.jl requires:","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Finiteness: The value of a trait is the same for all algorithms with the same underlying UnionAll type. That is, even if the type parameters are different, the trait should be the same. There is an exception if is_composite(algorithm) = true.\nSerializability: The value of any trait can be evaluated without installing any third-party package; using LearnAPI should suffice.","category":"page"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"Because of 1, combining a lot of functionality into one algorithm (e.g. 
the algorithm can perform both classification and regression) can mean traits are necessarily less informative (as in LearnAPI.predict_output_type(algorithm) = Any).","category":"page"},{"location":"traits/#Reference","page":"Algorithm Traits","title":"Reference","text":"","category":"section"},{"location":"traits/","page":"Algorithm Traits","title":"Algorithm Traits","text":"LearnAPI.functions\nLearnAPI.kinds_of_proxy\nLearnAPI.position_of_target\nLearnAPI.position_of_weights\nLearnAPI.descriptors\nLearnAPI.is_pure_julia\nLearnAPI.pkg_name\nLearnAPI.pkg_license\nLearnAPI.doc_url\nLearnAPI.load_path\nLearnAPI.is_composite\nLearnAPI.human_name\nLearnAPI.iteration_parameter\nLearnAPI.fit_scitype\nLearnAPI.fit_type\nLearnAPI.fit_observation_scitype\nLearnAPI.fit_observation_type\nLearnAPI.predict_input_scitype\nLearnAPI.predict_input_observation_scitype\nLearnAPI.predict_input_type\nLearnAPI.predict_input_observation_type\nLearnAPI.predict_output_scitype\nLearnAPI.predict_output_type\nLearnAPI.transform_input_scitype\nLearnAPI.transform_input_observation_scitype\nLearnAPI.transform_input_type\nLearnAPI.transform_input_observation_type\nLearnAPI.predict_or_transform_mutates\nLearnAPI.transform_output_scitype\nLearnAPI.transform_output_type","category":"page"},{"location":"traits/#LearnAPI.functions","page":"Algorithm Traits","title":"LearnAPI.functions","text":"LearnAPI.functions(algorithm)\n\nReturn a tuple of functions that can be sensibly applied to algorithm, or to objects having the same type as algorithm, or to associated models (objects returned by fit(algorithm, ...)). Algorithm traits are excluded.\n\nIn addition to functions, the returned tuple may include expressions, like :(DecisionTree.print_tree), which reference functions not owned by LearnAPI.jl.\n\nThe understanding is that algorithm is a LearnAPI-compliant object whenever this is non-empty.\n\nExtended help\n\nNew implementations\n\nAll new implementations must overload this trait. 
Here's a checklist for elements in the return value:\n\nfunction needs explicit implementation? include in returned tuple?\nfit no yes\nobsfit yes yes\nminimize optional yes\npredict no if obspredict is implemented\nobspredict optional if implemented\ntransform no if obstransform is implemented\nobstransform optional if implemented\nobs optional yes\ninverse_transform optional if implemented\nLearnAPI.algorithm yes yes\n\nAlso include any implemented accessor functions. The LearnAPI.jl accessor functions are: LearnAPI.extras, LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.kinds_of_proxy","page":"Algorithm Traits","title":"LearnAPI.kinds_of_proxy","text":"LearnAPI.kinds_of_proxy(algorithm)\n\nReturns a tuple of all instances, kind, for which predict(algorithm, kind, data...) has a guaranteed implementation. Each such kind subtypes LearnAPI.KindOfProxy. 
Examples are LiteralTarget() (for predicting actual target values) and Distribution() (for predicting probability mass/density functions).\n\nSee also LearnAPI.predict, LearnAPI.KindOfProxy.\n\nExtended help\n\nNew implementations\n\nImplementation is optional but recommended whenever predict is overloaded.\n\nElements of the returned tuple must be one of these: ConfidenceInterval, Continuous, Distribution, LabelAmbiguous, LabelAmbiguousDistribution, LabelAmbiguousSampleable, LiteralTarget, LogDistribution, LogProbability, OutlierScore, Parametric, ProbabilisticSet, Probability, Sampleable, Set, SurvivalDistribution, SurvivalFunction, IID, JointDistribution, JointLogDistribution and JointSampleable.\n\nSuppose, for example, we have the following implementation of a supervised learner returning only probabilistic predictions:\n\nLearnAPI.predict(algorithm::MyNewAlgorithmType, ::LearnAPI.Distribution, Xnew) = ...\n\nThen we can declare\n\n@trait MyNewAlgorithmType kinds_of_proxy = (LearnAPI.Distribution(),)\n\nFor more on target variables and target proxies, refer to the LearnAPI documentation.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.position_of_target","page":"Algorithm Traits","title":"LearnAPI.position_of_target","text":"LearnAPI.position_of_target(algorithm)\n\nReturn the expected position of the target variable within data in calls of the form LearnAPI.fit(algorithm, verbosity, data...).\n\nIf this number is 0, then no target is expected. If this number exceeds length(data), then data is understood to exclude the target variable.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.position_of_weights","page":"Algorithm Traits","title":"LearnAPI.position_of_weights","text":"LearnAPI.position_of_weights(algorithm)\n\nReturn the expected position of per-observation weights within data in calls of the form LearnAPI.fit(algorithm, data...).\n\nIf this number is 0, then no weights are expected. 
If this number exceeds length(data), then data is understood to exclude weights, which are assumed to be uniform.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.descriptors","page":"Algorithm Traits","title":"LearnAPI.descriptors","text":"LearnAPI.descriptors(algorithm)\n\nLists one or more suggestive algorithm descriptors from this list: :regression, :classification, :clustering, :gradient_descent, :iterative_algorithms, :incremental_algorithms, :dimension_reduction, :encoders, :static_algorithms, :missing_value_imputation, :ensemble_algorithms, :wrappers, :time_series_forecasting, :time_series_classification, :survival_analysis, :distribution_fitters, :Bayesian_algorithms, :outlier_detection, :collaborative_filtering, :text_analysis, :audio_analysis, :natural_language_processing, :image_processing (do LearnAPI.descriptors() to reproduce).\n\nwarning: Warning\nThe value of this trait guarantees no particular behavior. The trait is intended for informal classification purposes only.\n\nNew implementations\n\nThis trait should return a tuple of symbols, as in (:classification, :text_analysis).\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.is_pure_julia","page":"Algorithm Traits","title":"LearnAPI.is_pure_julia","text":"LearnAPI.is_pure_julia(algorithm)\n\nReturns true if training the algorithm requires evaluation of pure Julia code only.\n\nNew implementations\n\nThe fallback is false.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.pkg_name","page":"Algorithm Traits","title":"LearnAPI.pkg_name","text":"LearnAPI.pkg_name(algorithm)\n\nReturn the name of the package module which supplies the core training algorithm for algorithm. This is not necessarily the package providing the LearnAPI interface.\n\nReturns \"unknown\" if the algorithm implementation has failed to overload the trait. 
\n\nNew implementations\n\nMust return a string, as in \"DecisionTree\".\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.pkg_license","page":"Algorithm Traits","title":"LearnAPI.pkg_license","text":"LearnAPI.pkg_license(algorithm)\n\nReturn the name of the software license, such as \"MIT\", applying to the package where the core algorithm for algorithm is implemented.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.doc_url","page":"Algorithm Traits","title":"LearnAPI.doc_url","text":"LearnAPI.doc_url(algorithm)\n\nReturn a URL where the core algorithm for algorithm is documented.\n\nReturns \"unknown\" if the algorithm implementation has failed to overload the trait. \n\nNew implementations\n\nMust return a string, such as \"https://en.wikipedia.org/wiki/Decision_tree_learning\".\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.load_path","page":"Algorithm Traits","title":"LearnAPI.load_path","text":"LearnAPI.load_path(algorithm)\n\nReturn a string indicating where the struct for typeof(algorithm) can be found, beginning with the name of the package module defining it. For example, a return value of \"FastTrees.LearnAPI.DecisionTreeClassifier\" means the following Julia code will return the algorithm type:\n\nimport FastTrees\nFastTrees.LearnAPI.DecisionTreeClassifier\n\nReturns \"unknown\" if the algorithm implementation has failed to overload the trait. \n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.is_composite","page":"Algorithm Traits","title":"LearnAPI.is_composite","text":"LearnAPI.is_composite(algorithm)\n\nReturns true if one or more properties (fields) of algorithm may themselves be algorithms, and false otherwise.\n\nSee also [LearnAPI.components](@ref).\n\nNew implementations\n\nThis trait should be overloaded if one or more properties (fields) of algorithm may take algorithm values. Fallback return value is false. 
The keyword constructor for such an algorithm need not prescribe defaults for algorithm-valued properties. Implementation of the accessor function LearnAPI.components is recommended.\n\nThe value of the trait must depend only on the type of algorithm. \n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.human_name","page":"Algorithm Traits","title":"LearnAPI.human_name","text":"LearnAPI.human_name(algorithm)\n\nA human-readable string representation of typeof(algorithm). Primarily intended for auto-generation of documentation.\n\nNew implementations\n\nOptional. A fallback takes the type name, inserts spaces and removes capitalization. For example, KNNRegressor becomes \"knn regressor\". Better would be to overload the trait to return \"K-nearest neighbors regressor\". Ideally, this is a \"concrete\" noun like \"ridge regressor\" rather than an \"abstract\" noun like \"ridge regression\".\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.iteration_parameter","page":"Algorithm Traits","title":"LearnAPI.iteration_parameter","text":"LearnAPI.iteration_parameter(algorithm)\n\nThe name of the iteration parameter of algorithm, or nothing if the algorithm is not iterative.\n\nNew implementations\n\nImplement if algorithm is iterative. Returns a symbol or nothing.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_scitype","page":"Algorithm Traits","title":"LearnAPI.fit_scitype","text":"LearnAPI.fit_scitype(algorithm)\n\nReturn an upper bound on the scitype of data guaranteed to work when calling fit(algorithm, data...).\n\nSpecifically, if the return value is S and ScientificTypes.scitype(data) <: S, then all the following calls are guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_type","page":"Algorithm Traits","title":"LearnAPI.fit_type","text":"LearnAPI.fit_type(algorithm)\n\nReturn an upper bound on the type of data guaranteed to work when calling fit(algorithm, data...).\n\nSpecifically, if the return value is T and typeof(data) <: T, then all the following calls are guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_scitype, LearnAPI.fit_observation_type, LearnAPI.fit_observation_scitype.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_observation_scitype","page":"Algorithm Traits","title":"LearnAPI.fit_observation_scitype","text":"LearnAPI.fit_observation_scitype(algorithm)\n\nReturn an upper bound on the scitype of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. 
Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by S, supposing S != Union{}, and that the user supplies data satisfying\n\nScientificTypes.scitype(MLUtils.getobs(data, i)) <: S\n\nfor any valid index i, then all the following are guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.fit_observation_type","page":"Algorithm Traits","title":"LearnAPI.fit_observation_type","text":"LearnAPI.fit_observation_type(algorithm)\n\nReturn an upper bound on the type of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by T, supposing T != Union{}, and that the user supplies data satisfying\n\ntypeof(MLUtils.getobs(data, i)) <: T\n\nfor any valid index i, then the following is guaranteed to work:\n\nfit(algorithm, data...)\nobsdata = obs(fit, algorithm, data...)\nfit(algorithm, Obs(), obsdata)\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_scitype.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_scitype","page":"Algorithm Traits","title":"LearnAPI.predict_input_scitype","text":" LearnAPI.predict_input_scitype(algorithm)\n\nReturn an upper bound on the scitype of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).\n\nSpecifically, if S is the value returned and ScientificTypes.scitype(data) <: S, then the following is guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, algorithm, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.predict_input_type.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_input_scitype, LearnAPI.predict_input_type, LearnAPI.predict_input_observation_scitype, LearnAPI.predict_input_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_observation_scitype","page":"Algorithm Traits","title":"LearnAPI.predict_input_observation_scitype","text":"LearnAPI.predict_input_observation_scitype(algorithm)\n\nReturn an upper bound on the scitype of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. 
Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by S, supposing S != Union{}, and that the user supplies data satisfying\n\nScientificTypes.scitype(MLUtils.getobs(data, i)) <: S\n\nfor any valid index i, then all the following are guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, algorithm, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_input_scitype, LearnAPI.predict_input_type, LearnAPI.predict_input_observation_scitype, LearnAPI.predict_input_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_type","page":"Algorithm Traits","title":"LearnAPI.predict_input_type","text":"LearnAPI.predict_input_type(algorithm)\n\nReturn an upper bound on the type of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).\n\nSpecifically, if T is the value returned and typeof(data) <: T, then the following is guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, model, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nSee also LearnAPI.predict_input_scitype.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. 
Should not be overloaded if LearnAPI.predict_input_scitype is overloaded.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_input_observation_type","page":"Algorithm Traits","title":"LearnAPI.predict_input_observation_type","text":"LearnAPI.predict_input_observation_type(algorithm)\n\nReturn an upper bound on the type of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by T, supposing T != Union{}, and that the user supplies data satisfying\n\ntypeof(MLUtils.getobs(data, i)) <: T\n\nfor any valid index i, then all the following are guaranteed to work:\n\npredict(model, kind_of_proxy, data...)\nobsdata = obs(predict, algorithm, data...)\npredict(model, kind_of_proxy, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_input_scitype, LearnAPI.predict_input_type, LearnAPI.predict_input_observation_scitype, LearnAPI.predict_input_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_output_scitype","page":"Algorithm Traits","title":"LearnAPI.predict_output_scitype","text":"LearnAPI.predict_output_scitype(algorithm, kind_of_proxy::KindOfProxy)\n\nReturn an upper bound for the scitypes of predictions of the specified form where supported, and otherwise return Any. 
For example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:\n\nscitype(ŷ) <: LearnAPI.predict_output_scitype(algorithm, LearnAPI.Distribution())\n\nNote. This trait has a single-argument \"convenience\" version LearnAPI.predict_output_scitype(algorithm) derived from this one, which returns a dictionary keyed on target proxy types.\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_scitype.\n\nNew implementations\n\nOverloading the trait is optional. Here's a sample implementation for a supervised regressor type MyRgs that only predicts actual values of the target:\n\n@trait MyRgs predict_output_scitype = AbstractVector{ScientificTypesBase.Continuous}\n\nThe fallback method returns Any.\n\n\n\n\n\nLearnAPI.predict_output_scitype(algorithm)\n\nReturn a dictionary of upper bounds on the scitype of predictions, keyed on concrete subtypes of LearnAPI.KindOfProxy. Each of these subtypes represents a different form of target prediction (LiteralTarget, Distribution, SurvivalFunction, etc) possibly supported by algorithm, but the existence of a key does not guarantee that form is supported.\n\nAs an example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:\n\nscitype(ŷ) <: LearnAPI.predict_output_scitype(algorithm)[LearnAPI.Distribution]\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_scitype.\n\nNew implementations\n\nThis single-argument trait should not be overloaded. 
Instead, overload LearnAPI.predict_output_scitype(algorithm, kind_of_proxy).\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_output_type","page":"Algorithm Traits","title":"LearnAPI.predict_output_type","text":"LearnAPI.predict_output_type(algorithm, kind_of_proxy::KindOfProxy)\n\nReturn an upper bound for the types of predictions of the specified form where supported, and otherwise return Any. For example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:\n\ntypeof(ŷ) <: LearnAPI.predict_output_type(algorithm, LearnAPI.Distribution())\n\nNote. This trait has a single-argument \"convenience\" version LearnAPI.predict_output_type(algorithm) derived from this one, which returns a dictionary keyed on target proxy types.\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_type.\n\nNew implementations\n\nOverloading the trait is optional. Here's a sample implementation for a supervised regressor type MyRgs that only predicts actual values of the target:\n\n@trait MyRgs predict_output_type = AbstractVector{<:Real}\n\nThe fallback method returns Any.\n\n\n\n\n\nLearnAPI.predict_output_type(algorithm)\n\nReturn a dictionary of upper bounds on the type of predictions, keyed on concrete subtypes of LearnAPI.KindOfProxy. 
Each of these subtypes represents a different form of target prediction (LiteralTarget, Distribution, SurvivalFunction, etc) possibly supported by algorithm, but the existence of a key does not guarantee that form is supported.\n\nAs an example, if\n\nŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)\n\nsuccessfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:\n\ntypeof(ŷ) <: LearnAPI.predict_output_type(algorithm)[LearnAPI.Distribution]\n\nSee also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_type.\n\nNew implementations\n\nThis single-argument trait should not be overloaded. Instead, overload LearnAPI.predict_output_type(algorithm, kind_of_proxy).\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_scitype","page":"Algorithm Traits","title":"LearnAPI.transform_input_scitype","text":" LearnAPI.transform_input_scitype(algorithm)\n\nReturn an upper bound on the scitype of data guaranteed to work in the call transform(algorithm, data...).\n\nSpecifically, if S is the value returned and ScientificTypes.scitype(data) <: S, then the following is guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, algorithm, data...)\ntransform(model, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.transform_input_type.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_input_scitype, LearnAPI.transform_input_type, LearnAPI.transform_input_observation_scitype, LearnAPI.transform_input_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_observation_scitype","page":"Algorithm Traits","title":"LearnAPI.transform_input_observation_scitype","text":"LearnAPI.transform_input_observation_scitype(algorithm)\n\nReturn an upper bound on the scitype of observations guaranteed to work when calling transform(model, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by S, supposing S != Union{}, and that the user supplies data satisfying\n\nScientificTypes.scitype(MLUtils.getobs(data, i)) <: S\n\nfor any valid index i, then all the following are guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, algorithm, data...)\ntransform(model, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. 
Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_input_scitype, LearnAPI.transform_input_type, LearnAPI.transform_input_observation_scitype, LearnAPI.transform_input_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_type","page":"Algorithm Traits","title":"LearnAPI.transform_input_type","text":"LearnAPI.transform_input_type(algorithm)\n\nReturn an upper bound on the type of data guaranteed to work in the call transform(algorithm, data...).\n\nSpecifically, if T is the value returned and typeof(data) <: T, then the following is guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, model, data...)\ntransform(model, Obs(), obsdata)\n\nSee also LearnAPI.transform_input_scitype.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Union{}. Should not be overloaded if LearnAPI.transform_input_scitype is overloaded.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_input_observation_type","page":"Algorithm Traits","title":"LearnAPI.transform_input_observation_type","text":"LearnAPI.transform_input_observation_type(algorithm)\n\nReturn an upper bound on the type of observations guaranteed to work when calling transform(model, data...), independent of the type/scitype of the data container itself. Here \"observations\" is in the sense of MLUtils.jl. 
Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.\n\nSpecifically, denoting the type returned above by T, supposing T != Union{}, and that the user supplies data satisfying\n\ntypeof(MLUtils.getobs(data, i)) <: T\n\nfor any valid index i, then all the following are guaranteed to work:\n\ntransform(model, data...)\nobsdata = obs(transform, algorithm, data...)\ntransform(model, Obs(), obsdata)\n\nwhenever algorithm = LearnAPI.algorithm(model).\n\nSee also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.\n\nNew implementations\n\nOptional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_input_scitype, LearnAPI.transform_input_type, LearnAPI.transform_input_observation_scitype, LearnAPI.transform_input_observation_type.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.predict_or_transform_mutates","page":"Algorithm Traits","title":"LearnAPI.predict_or_transform_mutates","text":"LearnAPI.predict_or_transform_mutates(algorithm)\n\nReturns true if predict or transform possibly mutate their first argument, model, when LearnAPI.algorithm(model) == algorithm. If false, no arguments are ever mutated.\n\nNew implementations\n\nThis trait, falling back to false, may only be overloaded when fit has no data arguments (algorithm does not generalize to new data). See more at fit.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_output_scitype","page":"Algorithm Traits","title":"LearnAPI.transform_output_scitype","text":"LearnAPI.transform_output_scitype(algorithm)\n\nReturn an upper bound on the scitype of the output of the transform operation.\n\nSee also LearnAPI.transform_input_scitype.\n\nNew implementations\n\nImplementation is optional. 
The fallback return value is Any.\n\n\n\n\n\n","category":"function"},{"location":"traits/#LearnAPI.transform_output_type","page":"Algorithm Traits","title":"LearnAPI.transform_output_type","text":"LearnAPI.transform_output_type(algorithm)\n\nReturn an upper bound on the type of the output of the transform operation.\n\nNew implementations\n\nImplementation is optional. The fallback return value is Any.\n\n\n\n\n\n","category":"function"},{"location":"kinds_of_target_proxy/#proxy_types","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"","category":"section"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"The available kinds of target proxy are classified by subtypes of LearnAPI.KindOfProxy. These types are intended for dispatch only and have no fields.","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"LearnAPI.KindOfProxy","category":"page"},{"location":"kinds_of_target_proxy/#LearnAPI.KindOfProxy","page":"Kinds of Target Proxy","title":"LearnAPI.KindOfProxy","text":"LearnAPI.KindOfProxy\n\nAbstract type whose concrete subtypes T each represent a different kind of proxy for some target variable, associated with some algorithm. Instances T() are used to request the form of target predictions in predict calls.\n\nSee LearnAPI.jl documentation for an explanation of \"targets\" and \"target proxies\".\n\nFor example, Distribution is a concrete subtype of LearnAPI.KindOfProxy and a call like predict(model, Distribution(), Xnew) returns a data object whose observations are probability density/mass functions, assuming algorithm supports predictions of that form.\n\nRun LearnAPI.CONCRETE_TARGET_PROXY_TYPES to list all options. 
\n\n\n\n\n\n","category":"type"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"LearnAPI.IID","category":"page"},{"location":"kinds_of_target_proxy/#LearnAPI.IID","page":"Kinds of Target Proxy","title":"LearnAPI.IID","text":"LearnAPI.IID <: LearnAPI.KindOfProxy\n\nAbstract subtype of LearnAPI.KindOfProxy. If kind_of_proxy is an instance of LearnAPI.IID then, given data consisting of n observations, the following must hold:\n\nŷ = LearnAPI.predict(model, kind_of_proxy, data...) is data also consisting of n observations.\nThe jth observation of ŷ, for any j, depends only on the jth observation of the provided data (no correlation between observations).\n\nSee also LearnAPI.KindOfProxy.\n\n\n\n\n\n","category":"type"},{"location":"kinds_of_target_proxy/#Simple-target-proxies-(subtypes-of-LearnAPI.IID)","page":"Kinds of Target Proxy","title":"Simple target proxies (subtypes of LearnAPI.IID)","text":"","category":"section"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"type form of an observation\nLearnAPI.LiteralTarget same as target observations\nLearnAPI.Sampleable object that can be sampled to obtain object of the same form as target observation\nLearnAPI.Distribution explicit probability density/mass function whose sample space is all possible target observations\nLearnAPI.LogDistribution explicit log-probability density/mass function whose sample space is possible target observations\n† LearnAPI.Probability numerical probability or probability vector\n† LearnAPI.LogProbability log-probability or log-probability vector\n† LearnAPI.Parametric a list of parameters (e.g., mean and variance) describing some distribution\nLearnAPI.LabelAmbiguous collections of labels (in case of multi-class target) but without a known correspondence to the original target labels (and of possibly different number) as in, e.g., 
clustering\nLearnAPI.LabelAmbiguousSampleable sampleable version of LabelAmbiguous; see Sampleable above\nLearnAPI.LabelAmbiguousDistribution pdf/pmf version of LabelAmbiguous; see Distribution above\nLearnAPI.ConfidenceInterval confidence interval\nLearnAPI.Set finite but possibly varying number of target observations\nLearnAPI.ProbabilisticSet as for Set but labeled with probabilities (not necessarily summing to one)\nLearnAPI.SurvivalFunction survival function\nLearnAPI.SurvivalDistribution probability distribution for survival time\nLearnAPI.OutlierScore numerical score reflecting degree of outlierness (not necessarily normalized)\nLearnAPI.Continuous real-valued approximation/interpolation of a discrete-valued target, such as a count (e.g., number of phone calls)","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"† Provided for completeness but discouraged to avoid ambiguities in representation.","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"Table of concrete subtypes of LearnAPI.IID <: LearnAPI.KindOfProxy.","category":"page"},{"location":"kinds_of_target_proxy/#When-the-proxy-for-the-target-is-a-single-object","page":"Kinds of Target Proxy","title":"When the proxy for the target is a single object","text":"","category":"section"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"In the following table of subtypes T <: LearnAPI.KindOfProxy not falling under the IID umbrella, it is understood that predict(model, ::T, ...) 
is not divided into individual observations, but represents a single probability distribution for the sample space Y^n, where Y is the space in which the target variable takes its values, and n is the number of observations in data.","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"type T form of output of predict(model, ::T, data...)\nLearnAPI.JointSampleable object that can be sampled to obtain a vector whose elements have the form of target observations; the vector length matches the number of observations in data.\nLearnAPI.JointDistribution explicit probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data\nLearnAPI.JointLogDistribution explicit log-probability density/mass function whose sample space is vectors of target observations; the vector length matches the number of observations in data","category":"page"},{"location":"kinds_of_target_proxy/","page":"Kinds of Target Proxy","title":"Kinds of Target Proxy","text":"Table of LearnAPI.KindOfProxy subtypes not subtyping LearnAPI.IID","category":"page"},{"location":"patterns/supervised_bayesian_models/#Supervised-Bayesian-Algorithms","page":"Supervised Bayesian Algorithms","title":"Supervised Bayesian Algorithms","text":"","category":"section"},{"location":"testing_an_implementation/#Testing-an-Implementation","page":"Testing an Implementation","title":"Testing an Implementation","text":"","category":"section"},{"location":"testing_an_implementation/","page":"Testing an Implementation","title":"Testing an Implementation","text":"🚧","category":"page"},{"location":"testing_an_implementation/","page":"Testing an Implementation","title":"Testing an Implementation","text":"warning: Warning\nUnder construction","category":"page"},{"location":"patterns/time_series_classification/#Time-Series-Classification","page":"Time Series Classification","title":"Time 
Series Classification","text":"","category":"section"},{"location":"anatomy_of_an_implementation/#Anatomy-of-an-Implementation","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"This section explains a detailed implementation of the LearnAPI for naive ridge regression. Most readers will want to scan the demonstration of the implementation before studying the implementation itself.","category":"page"},{"location":"anatomy_of_an_implementation/#Defining-an-algorithm-type","page":"Anatomy of an Implementation","title":"Defining an algorithm type","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The first line below imports the lightweight package LearnAPI.jl whose methods we will be extending. The second imports libraries needed for the core algorithm.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"using LearnAPI\nusing LinearAlgebra, Tables\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"A struct stores the regularization hyperparameter lambda of our ridge regressor:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"struct Ridge\n lambda::Float64\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Instances of Ridge are algorithms, in LearnAPI.jl parlance.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an 
Implementation","title":"Anatomy of an Implementation","text":"A keyword argument constructor provides defaults for all hyperparameters:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Ridge(; lambda=0.1) = Ridge(lambda)\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/#Implementing-fit","page":"Anatomy of an Implementation","title":"Implementing fit","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"A ridge regressor requires two types of data for training: input features X, which here we suppose are tabular, and a target y, which we suppose is a vector. Users will accordingly call fit like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"algorithm = Ridge(lambda=0.05)\nfit(algorithm, X, y; verbosity=1)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"However, a new implementation does not overload fit. Rather it implements","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"obsfit(algorithm::Ridge, obsdata; verbosity=1)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"for each obsdata returned by a data-preprocessing call obs(fit, algorithm, X, y). You can read \"obs\" as \"observation-accessible\", for reasons explained shortly. 
The LearnAPI.jl definition","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"fit(algorithm, data...; verbosity=1) =\n obsfit(algorithm, obs(fit, algorithm, data...), verbosity)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"then takes care of fit.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The obs and obsfit methods are public, and the user can call them like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"obsdata = obs(fit, algorithm, X, y)\nmodel = obsfit(algorithm, obsdata)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We begin by defining a struct¹ for the output of our data-preprocessing operation, obs, which will store y and the matrix representation of X, together with its column names (needed for recording named coefficients for user inspection):","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"struct RidgeFitData{T}\n A::Matrix{T} # p x n\n names::Vector{Symbol}\n y::Vector{T}\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"And we overload obs like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"function LearnAPI.obs(::typeof(fit), ::Ridge, X, y)\n table = Tables.columntable(X)\n names = Tables.columnnames(table) |> collect\n 
return RidgeFitData(Tables.matrix(table, transpose=true), names, y)\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"so that obs(fit, Ridge(), X, y) returns a combined RidgeFitData object with everything the core algorithm will need.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Since obs is public, the user will have access to this object, but to make it useful to her (and to fulfill the obs contract) this object must implement the MLUtils.jl getobs/numobs interface, to enable observation-resampling (which will be efficient, because observations are now columns). It usually suffices to overload Base.getindex and Base.length (which are the getobs/numobs fallbacks), so we won't actually need to depend on MLUtils.jl:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Base.getindex(data::RidgeFitData, I) =\n RidgeFitData(data.A[:,I], data.names, data.y[I])\nBase.length(data::RidgeFitData) = length(data.y)\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Next, we define a second struct for storing the outcomes of training, including named versions of the learned coefficients:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"struct RidgeFitted{T,F}\n algorithm::Ridge\n coefficients::Vector{T}\n named_coefficients::F\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We include algorithm, which must be recoverable from the output 
of fit/obsfit (see Accessor functions below).","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We are now ready to implement a suitable obsfit method to execute the core training:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"function LearnAPI.obsfit(algorithm::Ridge, obsdata::RidgeFitData, verbosity)\n\n lambda = algorithm.lambda\n A = obsdata.A\n names = obsdata.names\n y = obsdata.y\n\n # apply core algorithm:\n coefficients = (A*A' + lambda*I)\\(A*y) # p-element vector\n\n # determine named coefficients:\n named_coefficients = [names[j] => coefficients[j] for j in eachindex(names)]\n\n # make some noise, if allowed:\n verbosity > 0 && @info \"Coefficients: $named_coefficients\"\n\n return RidgeFitted(algorithm, coefficients, named_coefficients)\n\nend\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Users set verbosity=0 for warnings only, and verbosity=-1 for silence.","category":"page"},{"location":"anatomy_of_an_implementation/#Implementing-predict","page":"Anatomy of an Implementation","title":"Implementing predict","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The primary predict call will look like this:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"predict(model, LiteralTarget(), Xnew)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"where Xnew is a table (of the same form as X above). 
The argument LiteralTarget() signals that we want literal predictions of the target variable, as opposed to a proxy for the target, such as probability density functions. LiteralTarget is an example of a LearnAPI.KindOfProxy type. Targets and target proxies are defined here.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Rather than overload the primary signature above, however, we overload for \"observation-accessible\" input, as we did for fit,","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.obspredict(model::RidgeFitted, ::LiteralTarget, Anew::Matrix) =\n ((model.coefficients)'*Anew)'\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"and overload obs to make the table-to-matrix conversion:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.obs(::typeof(predict), ::Ridge, Xnew) = Tables.matrix(Xnew, transpose=true)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"As matrices (with observations as columns) already implement the MLUtils.jl getobs/numobs interface, we already satisfy the obs contract, and there was no need to create a wrapper for obs output.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The primary predict method, handling tabular input, is provided by a LearnAPI.jl fallback similar to the fit fallback.","category":"page"},{"location":"anatomy_of_an_implementation/#Accessor-functions","page":"Anatomy of an 
Implementation","title":"Accessor functions","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"An accessor function has the output of fit (a \"model\") as its sole argument. Every new implementation must implement the accessor function LearnAPI.algorithm for recovering an algorithm from a fitted object:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.algorithm(model::RidgeFitted) = model.algorithm","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Other accessor functions extract learned parameters or some standard byproducts of training, such as feature importances or training losses.² Implementing the LearnAPI.coefficients accessor function is straightforward:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.coefficients(model::RidgeFitted) = model.named_coefficients\nnothing #hide","category":"page"},{"location":"anatomy_of_an_implementation/#Tearing-a-model-down-for-serialization","page":"Anatomy of an Implementation","title":"Tearing a model down for serialization","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The minimize method falls back to the identity. 
Here, for the sake of illustration, we overload it to dump the named version of the coefficients:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.minimize(model::RidgeFitted) =\n RidgeFitted(model.algorithm, model.coefficients, nothing)","category":"page"},{"location":"anatomy_of_an_implementation/#Algorithm-traits","page":"Anatomy of an Implementation","title":"Algorithm traits","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Algorithm traits record extra generic information about an algorithm, or make specific promises of behavior. They usually have an algorithm as the single argument.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"In LearnAPI.jl predict always outputs a target or target proxy, where \"target\" is understood very broadly. 
We overload a trait to record the fact that the target variable explicitly appears in training (i.e., the algorithm is supervised) and where exactly it appears:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.position_of_target(::Ridge) = 2","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Or, you can use the shorthand","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"@trait Ridge position_of_target = 2","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"The macro can also be used to specify multiple traits simultaneously:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"@trait(\n Ridge,\n position_of_target = 2,\n kinds_of_proxy=(LiteralTarget(),),\n descriptors = (:regression,),\n functions = (\n fit,\n obsfit,\n minimize,\n predict,\n obspredict,\n obs,\n LearnAPI.algorithm,\n LearnAPI.coefficients,\n )\n)\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Implementing the last trait, LearnAPI.functions, which must include all non-trait functions overloaded for Ridge, is compulsory. This is the only universally compulsory trait. 
It is worthwhile studying the list of all traits to see which might apply to a new implementation, to enable maximum buy-in to functionality provided by third party packages, and to assist third party algorithms that match machine learning algorithms to user-defined tasks.","category":"page"},{"location":"anatomy_of_an_implementation/#workflow","page":"Anatomy of an Implementation","title":"Demonstration","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"We now illustrate how to interact directly with Ridge instances using the methods just implemented.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"# synthesize some data:\nn = 10 # number of observations\ntrain = 1:6\ntest = 7:10\na, b, c = rand(n), rand(n), rand(n)\nX = (; a, b, c)\ny = 2a - b + 3c + 0.05*rand(n)\n\nalgorithm = Ridge(lambda=0.5)\nLearnAPI.functions(algorithm)","category":"page"},{"location":"anatomy_of_an_implementation/#Naive-user-workflow","page":"Anatomy of an Implementation","title":"Naive user workflow","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Training and predicting with external resampling:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"using Tables\nmodel = fit(algorithm, Tables.subset(X, train), y[train])\nŷ = predict(model, LiteralTarget(), Tables.subset(X, test))","category":"page"},{"location":"anatomy_of_an_implementation/#Advanced-workflow","page":"Anatomy of an Implementation","title":"Advanced workflow","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an 
Implementation","text":"We now train and predict using internal data representations, resampled using the generic MLUtils.jl interface.","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"import MLUtils\nfit_data = obs(fit, algorithm, X, y)\npredict_data = obs(predict, algorithm, X)\nmodel = obsfit(algorithm, MLUtils.getobs(fit_data, train))\nẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predict_data, test))\n@assert ẑ == ŷ\nnothing # hide","category":"page"},{"location":"anatomy_of_an_implementation/#Applying-an-accessor-function-and-serialization","page":"Anatomy of an Implementation","title":"Applying an accessor function and serialization","text":"","category":"section"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Extracting coefficients:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"LearnAPI.coefficients(model)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"Serialization/deserialization:","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"using Serialization\nsmall_model = minimize(model)\nserialize(\"my_ridge.jls\", small_model)\n\nrecovered_model = deserialize(\"my_ridge.jls\")\n@assert LearnAPI.algorithm(recovered_model) == algorithm\npredict(recovered_model, LiteralTarget(), X) == predict(model, LiteralTarget(), X)","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an 
Implementation","title":"Anatomy of an Implementation","text":"¹ The definition of this and other structs above is not an explicit requirement of LearnAPI.jl, whose constructs are purely functional. ","category":"page"},{"location":"anatomy_of_an_implementation/","page":"Anatomy of an Implementation","title":"Anatomy of an Implementation","text":"² An implementation can provide further accessor functions, if necessary, but like the native ones, they must be included in the LearnAPI.functions declaration.","category":"page"},{"location":"patterns/static_algorithms/#Static-Algorithms","page":"Static Algorithms","title":"Static Algorithms","text":"","category":"section"},{"location":"patterns/static_algorithms/","page":"Static Algorithms","title":"Static Algorithms","text":"See these examples from tests.","category":"page"},{"location":"patterns/clusterering/#Clusterering","page":"Clusterering","title":"Clusterering","text":"","category":"section"},{"location":"fit/#[fit](@ref-fit)","page":"fit","title":"fit","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"fit(algorithm, data...; verbosity=1) -> model\nfit(model, data...; verbosity=1) -> updated_model","category":"page"},{"location":"fit/#Typical-workflow","page":"fit","title":"Typical workflow","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"# Train some supervised `algorithm`:\nmodel = fit(algorithm, X, y)\n\n# Predict probability distributions:\nŷ = predict(model, Distribution(), Xnew)\n\n# Inspect some byproducts of training:\nLearnAPI.feature_importances(model)","category":"page"},{"location":"fit/#Implementation-guide","page":"fit","title":"Implementation guide","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"The fit method is not implemented directly. Instead, implement obsfit.","category":"page"},{"location":"fit/","page":"fit","title":"fit","text":"method fallback compulsory? 
requires\nobsfit(alg, ...) none yes obs in some cases\n ","category":"page"},{"location":"fit/#Reference","page":"fit","title":"Reference","text":"","category":"section"},{"location":"fit/","page":"fit","title":"fit","text":"LearnAPI.fit\nLearnAPI.obsfit","category":"page"},{"location":"fit/#LearnAPI.fit","page":"fit","title":"LearnAPI.fit","text":"LearnAPI.fit(algorithm, data...; verbosity=1)\n\nExecute the algorithm with configuration algorithm using the provided training data, returning an object, model, on which other methods, such as predict or transform, can be dispatched. LearnAPI.functions(algorithm) returns a list of methods that can be applied to either algorithm or model.\n\nArguments\n\nalgorithm: property-accessible object whose properties are the hyperparameters of some ML/statistical algorithm\ndata: tuple of data objects with a common number of observations, for example, data = (X, y, w) where X is a table of features, y is a target vector with the same number of rows, and w a vector of per-observation weights.\n\nverbosity=1: logging level; set to 0 for warnings only, and -1 for silent training\n\nSee also obsfit, predict, transform, inverse_transform, LearnAPI.functions, obs.\n\nExtended help\n\nNew implementations\n\nLearnAPI.jl provides the following definition of fit, which is never directly overloaded:\n\nfit(algorithm, data...; verbosity=1) =\n obsfit(algorithm, Obs(), obs(fit, algorithm, data...); verbosity)\n\nRather, new algorithms should overload obsfit. See also obs.\n\n\n\n\n\n","category":"function"},{"location":"fit/#LearnAPI.obsfit","page":"fit","title":"LearnAPI.obsfit","text":"obsfit(algorithm, obsdata; verbosity=1)\n\nA lower-level alternative to fit, this method consumes a pre-processed form of user data. 
Specifically, the following two code snippets are equivalent:\n\nmodel = fit(algorithm, data...)\n\nand\n\nobsdata = obs(fit, algorithm, data...)\nmodel = obsfit(algorithm, obsdata)\n\nHere obsdata is algorithm-specific, \"observation-accessible\" data, meaning it implements the MLUtils.jl getobs/numobs interface for observation resampling (even if data does not). Moreover, resampled versions of obsdata may be passed to obsfit in its place.\n\nThe use of obsfit may offer performance advantages. See more at obs.\n\nSee also fit, obs.\n\nExtended help\n\nNew implementations\n\nImplementation of the following method signature is compulsory for all new algorithms:\n\nLearnAPI.obsfit(algorithm, obsdata, verbosity)\n\nHere obsdata has the form explained above. If obs(fit, ...) is not being overloaded, then a fallback gives obsdata = data (always a tuple!). Note that verbosity is a positional argument, not a keyword argument, in the overloaded signature.\n\nNew implementations must also implement LearnAPI.algorithm.\n\nIf overloaded, then the functions LearnAPI.obsfit and LearnAPI.fit must be included in the tuple returned by the LearnAPI.functions(algorithm) trait.\n\nNon-generalizing algorithms\n\nIf the algorithm does not generalize to new data (e.g., DBSCAN clustering) then data = () and obsfit carries out no computation, as this happens entirely in a transform and/or predict call. In such cases, obsfit(algorithm, ...) may return algorithm, but another possibility is allowed: to provide a mechanism for transform/predict to report byproducts of the computation (e.g., a list of boundary points in DBSCAN clustering), they are allowed to mutate the model object returned by obsfit, which is then arranged to be a mutable struct wrapping algorithm and fields to store the byproducts. 
In that case, LearnAPI.predict_or_transform_mutates(algorithm) must be overloaded to return true.\n\n\n\n\n\n","category":"function"},{"location":"reference/#reference","page":"Reference","title":"Reference","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Here we give the definitive specification of the LearnAPI.jl interface. For informal guides see Anatomy of an Implementation and Common Implementation Patterns.","category":"page"},{"location":"reference/#scope","page":"Reference","title":"Important terms and concepts","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"The LearnAPI.jl specification is predicated on a few basic, informally defined notions:","category":"page"},{"location":"reference/#Data-and-observations","page":"Reference","title":"Data and observations","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"ML/statistical algorithms are typically applied in conjunction with resampling of observations, as in cross-validation. In this document data will always refer to objects encapsulating an ordered sequence of individual observations. If an algorithm is trained using multiple data objects, it is understood that individual objects share the same number of observations, and that resampling of one component implies synchronized resampling of the others.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"A DataFrame instance, from DataFrames.jl, is an example of data, the observations being the rows. LearnAPI.jl makes no assumptions about how observations can be accessed, except in the case of the output of obs, which must implement the MLUtils.jl getobs/numobs interface. 
For example, it is generally ambiguous whether the rows or columns of a matrix are considered observations, but if a matrix is returned by obs the observations must be the columns.","category":"page"},{"location":"reference/#hyperparameters","page":"Reference","title":"Hyperparameters","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Besides the data it consumes, a machine learning algorithm's behavior is governed by a number of user-specified hyperparameters, such as the number of trees in a random forest. In LearnAPI.jl, one is allowed to have hyperparameters that are not data-generic. For example, a class weight dictionary will only make sense for a target taking values in the set of dictionary keys. ","category":"page"},{"location":"reference/#proxy","page":"Reference","title":"Targets and target proxies","text":"","category":"section"},{"location":"reference/#Context","page":"Reference","title":"Context","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"After training, a supervised classifier predicts labels on some input which are then compared with ground truth labels using some accuracy measure, to assess the performance of the classifier. Alternatively, the classifier predicts class probabilities, which are instead paired with ground truth labels using a proper scoring rule, say. In outlier detection, "outlier"/"inlier" predictions, or probability-like scores, are similarly compared with ground truth labels. In clustering, integer labels assigned to observations by the clustering algorithm can be paired with human labels using, say, the Rand index. 
In survival analysis, predicted survival functions or probability distributions are compared with censored ground truth survival times.","category":"page"},{"location":"reference/#Definitions","page":"Reference","title":"Definitions","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"More generally, whenever we have a variable (e.g., a class label) that can (in principle) be paired with a predicted value, or some predicted "proxy" for that variable (such as a class probability), then we call the variable a target variable, and the predicted output a target proxy. In this definition, it is immaterial whether or not the target appears in training (is supervised) or whether or not the model generalizes to new observations ("learns").","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"LearnAPI.jl provides singleton target proxy types for prediction dispatch. These are also used to distinguish performance metrics provided by the package StatisticalMeasures.jl.","category":"page"},{"location":"reference/#algorithms","page":"Reference","title":"Algorithms","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"An object implementing the LearnAPI.jl interface is called an algorithm, although it is more accurately "the configuration of some algorithm".¹ It will have a type name reflecting the name of some ML/statistics algorithm (e.g., RandomForestRegressor) and it will encapsulate a particular set of user-specified hyperparameters.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"Additionally, for alg::Alg to be a LearnAPI algorithm, we require:","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"Base.propertynames(alg) returns the hyperparameter names; values can be accessed using Base.getproperty\nIf alg is an algorithm, 
then so are all instances of the same type.\nIf _alg is another algorithm, then alg == _alg if and only if typeof(alg) == typeof(_alg) and corresponding properties are ==. This includes properties that are random number generators (which should be copied in training to avoid mutation).\nIf an algorithm has other algorithms as hyperparameters, then LearnAPI.is_composite(alg) must be true (fallback is false).\nA keyword constructor for Alg exists, providing default values for all non-algorithm hyperparameters.\nAt least one non-trait LearnAPI.jl function must be overloaded for instances of Alg, and accordingly LearnAPI.functions(algorithm) must be non-empty.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"Any object alg for which LearnAPI.functions(alg) is non-empty is understood to have a valid implementation of the LearnAPI.jl interface.","category":"page"},{"location":"reference/#Example","page":"Reference","title":"Example","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Any instance of GradientRidgeRegressor defined below meets all but the last criterion above:","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"struct GradientRidgeRegressor{T<:Real}\n\tlearning_rate::T\n\tepochs::Int\n\tl2_regularization::T\nend\nGradientRidgeRegressor(; learning_rate=0.01, epochs=10, l2_regularization=0.01) =\n    GradientRidgeRegressor(learning_rate, epochs, l2_regularization)","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"The same is not true if we make this a mutable struct. 
In that case we will need to appropriately overload Base.== for GradientRidgeRegressor.","category":"page"},{"location":"reference/#Methods","page":"Reference","title":"Methods","text":"","category":"section"},{"location":"reference/","page":"Reference","title":"Reference","text":"Only these method names are exported: fit, obsfit, predict, obspredict, transform, obstransform, inverse_transform, minimize, and obs. All new implementations must implement obsfit, the accessor function LearnAPI.algorithm and the trait LearnAPI.functions.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"fit/obsfit: for training algorithms that generalize to new data\npredict/obspredict: for outputting targets or target proxies (such as probability density functions)\ntransform/obstransform: similar to predict, but for arbitrary kinds of output, and which can be paired with an inverse_transform method\ninverse_transform: for inverting the output of transform ("inverting" broadly understood)\nminimize: for stripping the model output by fit of inessential content, for purposes of serialization.\nobs: a method for exposing to the user "optimized", algorithm-specific representations of data, which can be passed to obsfit, obspredict or obstransform, but which can also be efficiently resampled using the getobs/numobs interface provided by MLUtils.jl.\nAccessor functions: include things like feature_importances and training_losses, for extracting, from training outcomes, information common to many algorithms. \nAlgorithm traits: special methods that promise specific algorithm behavior or record general information about the algorithm. 
The only universally compulsory trait is LearnAPI.functions(algorithm), which returns a list of the explicitly overloaded non-trait methods.","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"","category":"page"},{"location":"reference/","page":"Reference","title":"Reference","text":"¹ We acknowledge users may not like this terminology, and may know \"algorithm\" by some other name, such as \"strategy\", \"options\", \"hyperparameter set\", \"configuration\", or \"model\". Consensus on this point is difficult; see, e.g., this Julia Discourse discussion.","category":"page"},{"location":"accessor_functions/#accessor_functions","page":"Accessor Functions","title":"Accessor Functions","text":"","category":"section"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"The sole argument of an accessor function is the output, model, of fit or obsfit.","category":"page"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"LearnAPI.algorithm(model)\nLearnAPI.extras(model)\nLearnAPI.coefficients(model)\nLearnAPI.intercept(model)\nLearnAPI.tree(model)\nLearnAPI.trees(model)\nLearnAPI.feature_importances(model)\nLearnAPI.training_labels(model)\nLearnAPI.training_losses(model)\nLearnAPI.training_scores(model)\nLearnAPI.components(model)","category":"page"},{"location":"accessor_functions/#Implementation-guide","page":"Accessor Functions","title":"Implementation guide","text":"","category":"section"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"All new implementations must implement LearnAPI.algorithm. 
All others are optional, but any implemented accessor functions must be added to the list returned by LearnAPI.functions.","category":"page"},{"location":"accessor_functions/#Reference","page":"Accessor Functions","title":"Reference","text":"","category":"section"},{"location":"accessor_functions/","page":"Accessor Functions","title":"Accessor Functions","text":"LearnAPI.algorithm\nLearnAPI.extras\nLearnAPI.coefficients\nLearnAPI.intercept\nLearnAPI.tree\nLearnAPI.trees\nLearnAPI.feature_importances\nLearnAPI.training_losses\nLearnAPI.training_scores\nLearnAPI.training_labels\nLearnAPI.components","category":"page"},{"location":"accessor_functions/#LearnAPI.algorithm","page":"Accessor Functions","title":"LearnAPI.algorithm","text":"LearnAPI.algorithm(model)\nLearnAPI.algorithm(minimized_model)\n\nRecover the algorithm used to train model or the output of minimize(model).\n\nIn other words, if model = fit(algorithm, data...), for some algorithm and data, then\n\nLearnAPI.algorithm(model) == algorithm == LearnAPI.algorithm(minimize(model))\n\nis true.\n\nNew implementations\n\nImplementation is compulsory for new algorithm types. The behaviour described above is the only contract. If implemented, you must include algorithm in the tuple returned by the LearnAPI.functions trait. 
\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.extras","page":"Accessor Functions","title":"LearnAPI.extras","text":"LearnAPI.extras(model)\n\nReturn miscellaneous byproducts of an algorithm's computation, from the object model returned by a call of the form fit(algorithm, data).\n\nFor "static" algorithms (those without training data) it may be necessary to first call transform or predict on model.\n\nSee also fit.\n\nNew implementations\n\nImplementation is discouraged for byproducts already covered by other LearnAPI.jl accessor functions: LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.\n\nIf implemented, you must include extras in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.coefficients","page":"Accessor Functions","title":"LearnAPI.coefficients","text":"LearnAPI.coefficients(model)\n\nFor a linear model, return the learned coefficients. The value returned has the form of an abstract vector of feature_or_class::Symbol => coefficient::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]) or, in the case of multi-targets, feature::Symbol => coefficients::AbstractVector{<:Real} pairs.\n\nThe model reports coefficients if LearnAPI.coefficients in LearnAPI.functions(LearnAPI.algorithm(model)).\n\nSee also LearnAPI.intercept.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include coefficients in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.intercept","page":"Accessor Functions","title":"LearnAPI.intercept","text":"LearnAPI.intercept(model)\n\nFor a linear model, return the learned intercept. 
The value returned is Real (single target) or an AbstractVector{<:Real} (multi-target).\n\nThe model reports intercept if LearnAPI.intercept in LearnAPI.functions(LearnAPI.algorithm(model)).\n\nSee also LearnAPI.coefficients.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include intercept in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.tree","page":"Accessor Functions","title":"LearnAPI.tree","text":"LearnAPI.tree(model)\n\nReturn a user-friendly tree, in the form of a root object implementing the following interface defined in AbstractTrees.jl:\n\nsubtypes AbstractTrees.AbstractNode{T}\nimplements AbstractTrees.children()\nimplements AbstractTrees.printnode()\n\nSuch a tree can be visualized using the TreeRecipe.jl package, for example.\n\nSee also LearnAPI.trees.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include tree in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.trees","page":"Accessor Functions","title":"LearnAPI.trees","text":"LearnAPI.trees(model)\n\nFor some ensemble model, return a vector of trees. See LearnAPI.tree for the form of such trees.\n\nSee also LearnAPI.tree.\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include trees in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.feature_importances","page":"Accessor Functions","title":"LearnAPI.feature_importances","text":"LearnAPI.feature_importances(model)\n\nReturn the algorithm-specific feature importances of a model output by fit(algorithm, ...) for some algorithm. 
The value returned has the form of an abstract vector of feature::Symbol => importance::Real pairs (e.g. [:gender => 0.23, :height => 0.7, :weight => 0.1]).\n\nThe algorithm supports feature importances if LearnAPI.feature_importances in LearnAPI.functions(algorithm).\n\nIf an algorithm is sometimes unable to report feature importances then LearnAPI.feature_importances will return all importances as 0.0, as in [:gender => 0.0, :height => 0.0, :weight => 0.0].\n\nNew implementations\n\nImplementation is optional.\n\nIf implemented, you must include feature_importances in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.training_losses","page":"Accessor Functions","title":"LearnAPI.training_losses","text":"LearnAPI.training_losses(model)\n\nReturn the training losses obtained when running model = fit(algorithm, ...) for some algorithm.\n\nSee also fit.\n\nNew implementations\n\nImplement for iterative algorithms that compute and record training losses as part of training (e.g. neural networks).\n\nIf implemented, you must include training_losses in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.training_scores","page":"Accessor Functions","title":"LearnAPI.training_scores","text":"LearnAPI.training_scores(model)\n\nReturn the training scores obtained when running model = fit(algorithm, ...) for some algorithm.\n\nSee also fit.\n\nNew implementations\n\nImplement for algorithms, such as outlier detection algorithms, which associate a score with each observation during training, where these scores are of interest in later processes (e.g., in defining normalized scores for new data).\n\nIf implemented, you must include training_scores in the tuple returned by the LearnAPI.functions trait. 
\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.training_labels","page":"Accessor Functions","title":"LearnAPI.training_labels","text":"LearnAPI.training_labels(model)\n\nReturn the training labels obtained when running model = fit(algorithm, ...) for some algorithm.\n\nSee also fit.\n\nNew implementations\n\nIf implemented, you must include training_labels in the tuple returned by the LearnAPI.functions trait.\n\n\n\n\n\n","category":"function"},{"location":"accessor_functions/#LearnAPI.components","page":"Accessor Functions","title":"LearnAPI.components","text":"LearnAPI.components(model)\n\nFor a composite model, return the component models (fit outputs). These will be in the form of a vector of named pairs, property_name::Symbol => component_model. Here property_name is the name of some algorithm-valued property (hyper-parameter) of algorithm = LearnAPI.algorithm(model).\n\nA composite model is one for which the corresponding algorithm includes one or more algorithm-valued properties, and for which LearnAPI.is_composite(algorithm) is true.\n\nSee also is_composite.\n\nNew implementations\n\nImplement if and only if model is a composite model. \n\nIf implemented, you must include components in the tuple returned by the LearnAPI.functions trait. 
\n\n\n\n\n\n","category":"function"},{"location":"patterns/incremental_models/#Incremental-Algorithms","page":"Incremental Algorithms","title":"Incremental Algorithms","text":"","category":"section"},{"location":"patterns/learning_a_probability_distribution/#Learning-a-Probability-Distribution","page":"Learning a Probability Distribution","title":"Learning a Probability Distribution","text":"","category":"section"},{"location":"patterns/dimension_reduction/#Dimension-Reduction","page":"Dimension Reduction","title":"Dimension Reduction","text":"","category":"section"},{"location":"patterns/time_series_forecasting/#Time-Series-Forecasting","page":"Time Series Forecasting","title":"Time Series Forecasting","text":"","category":"section"},{"location":"minimize/#algorithm_minimize","page":"minimize","title":"minimize","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"minimize(model) -> ","category":"page"},{"location":"minimize/#Typical-workflow","page":"minimize","title":"Typical workflow","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"model = fit(algorithm, X, y)\nŷ = predict(model, LiteralTarget(), Xnew)\nLearnAPI.feature_importances(model)\n\nsmall_model = minimize(model)\nserialize(\"my_random_forest.jls\", small_model)\n\nrecovered_model = deserialize(\"my_random_forest.jls\")\n@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ\n\n# throws MethodError:\nLearnAPI.feature_importances(recovered_model)","category":"page"},{"location":"minimize/#Implementation-guide","page":"minimize","title":"Implementation guide","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"method compulsory? 
fallback requires\nminimize no identity fit","category":"page"},{"location":"minimize/#Reference","page":"minimize","title":"Reference","text":"","category":"section"},{"location":"minimize/","page":"minimize","title":"minimize","text":"minimize","category":"page"},{"location":"minimize/#LearnAPI.minimize","page":"minimize","title":"LearnAPI.minimize","text":"minimize(model; options...)\n\nReturn a version of model that will generally have a smaller memory allocation than model, suitable for serialization. Here model is any object returned by fit. Accessor functions that can be called on model may not work on minimize(model), but predict, transform and inverse_transform will work, if implemented for model. Check LearnAPI.functions(LearnAPI.algorithm(model)) to see what the original model implements.\n\nSpecific algorithms may provide keyword options to control how much of the original functionality is preserved by minimize.\n\nExtended help\n\nNew implementations\n\nOverloading minimize for new algorithms is optional. The fallback is the identity. If overloaded, you must include minimize in the tuple returned by the LearnAPI.functions trait. \n\nNew implementations must enforce the following identities, whenever the right-hand side is defined:\n\npredict(minimize(model; options...), args...; kwargs...) ==\n    predict(model, args...; kwargs...)\ntransform(minimize(model; options...), args...; kwargs...) ==\n    transform(model, args...; kwargs...)\ninverse_transform(minimize(model; options...), args...; kwargs...) ==\n    inverse_transform(model, args...; kwargs...)\n\nAdditionally:\n\nminimize(minimize(model)) == minimize(model)\n\n\n\n\n\n","category":"function"},{"location":"obs/#data_interface","page":"obs","title":"obs","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"The MLUtils.jl package provides two methods getobs and numobs for resampling data divided into multiple observations, including arrays and tables. 
The data objects returned below are guaranteed to implement this interface and can be passed to the relevant method (obsfit, obspredict or obstransform) possibly after resampling using MLUtils.getobs. This may provide performance advantages over naive workflows.","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"obs(fit, algorithm, data...) -> \nobs(predict, algorithm, data...) -> \nobs(transform, algorithm, data...) -> ","category":"page"},{"location":"obs/#Typical-workflows","page":"obs","title":"Typical workflows","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"LearnAPI.jl makes no assumptions about the form of data X and y in a call like fit(algorithm, X, y). The particular algorithm is free to articulate its own requirements. However, in this example, the definition","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"obsdata = obs(fit, algorithm, X, y)","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"combines X and y in a single object guaranteed to implement the MLUtils.jl getobs/numobs interface, which can be passed to obsfit instead of fit, as is, or after resampling using MLUtils.getobs:","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"# equivalent to `model = fit(algorithm, X, y)`:\nmodel = obsfit(algorithm, obsdata)\n\n# with resampling:\nresampled_obsdata = MLUtils.getobs(obsdata, 1:100)\nmodel = obsfit(algorithm, resampled_obsdata)","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"In some implementations, the alternative pattern above can be used to avoid repeating unnecessary internal data preprocessing, or inefficient resampling. 
For example, here's how a user might call obs and MLUtils.getobs to perform efficient cross-validation:","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"using LearnAPI\nimport MLUtils\n\nX = \ny = \nalgorithm = \n\ntest_train_folds = map([1:10, 11:20, 21:30]) do test\n    (test, setdiff(1:30, test))\nend \n\n# create fixed model-specific representations of the whole data set:\nfit_data = obs(fit, algorithm, X, y)\npredict_data = obs(predict, algorithm, X)\n\nscores = map(test_train_folds) do (test_indices, train_indices)\n \n\t# train using model-specific representation of data:\n\ttrain_data = MLUtils.getobs(fit_data, train_indices)\n\tmodel = obsfit(algorithm, train_data)\n\t\n\t# predict on the fold complement:\n\ttest_data = MLUtils.getobs(predict_data, test_indices)\n\tŷ = obspredict(model, LiteralTarget(), test_data)\n\n    return \n\t\nend ","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"Note here that the output of obspredict will match the representation of y, i.e., there is no concept of an algorithm-specific representation of outputs, only inputs.","category":"page"},{"location":"obs/#Implementation-guide","page":"obs","title":"Implementation guide","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"method compulsory? fallback\nobs depends slurps data argument\n ","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"If the data consumed by fit, predict or transform consists only of tables and arrays (with last dimension the observation dimension) then overloading obs is optional. 
However, if an implementation overloads obs to return a (thinly wrapped) representation of user data that is closer to what the core algorithm actually uses, and overloads MLUtils.getobs (or, more typically Base.getindex) to make resampling of that representation efficient, then those optimizations become available to the user, without the user concerning herself with the details of the representation.","category":"page"},{"location":"obs/","page":"obs","title":"obs","text":"A sample implementation is given in the obs document-string below.","category":"page"},{"location":"obs/#Reference","page":"obs","title":"Reference","text":"","category":"section"},{"location":"obs/","page":"obs","title":"obs","text":"obs","category":"page"},{"location":"obs/#LearnAPI.obs","page":"obs","title":"LearnAPI.obs","text":"obs(func, algorithm, data...)\n\nWhere func is fit, predict or transform, return a combined, algorithm-specific, representation of data..., which can be passed directly to obsfit, obspredict or obstransform, as shown in the example below.\n\nThe returned object implements the getobs/numobs observation-resampling interface provided by MLUtils.jl, even if data does not.\n\nCalling func on the returned object may be cheaper than calling func directly on data.... 
And resampling the returned object using MLUtils.getobs may be cheaper than directly resampling the components of data (an operation not provided by the LearnAPI.jl interface).\n\nExample\n\nUsual workflow, using data-specific resampling methods:\n\nX = \ny = \n\nXtrain = Tables.select(X, 1:100)\nytrain = y[1:100]\nmodel = fit(algorithm, Xtrain, ytrain)\nŷ = predict(model, LiteralTarget(), Tables.select(X, 101:150))\n\nAlternative workflow using obs:\n\nimport MLUtils\n\nfitdata = obs(fit, algorithm, X, y)\npredictdata = obs(predict, algorithm, X)\n\nmodel = obsfit(algorithm, MLUtils.getobs(fitdata, 1:100))\nẑ = obspredict(model, LiteralTarget(), MLUtils.getobs(predictdata, 101:150))\n@assert ẑ == ŷ\n\nSee also obsfit, obspredict, obstransform.\n\nExtended help\n\nNew implementations\n\nIf the data to be consumed in standard user calls to fit, predict or transform consists only of tables and arrays (with last dimension the observation dimension) then overloading obs is optional, but the user will get no performance benefits by using it. The implementation of obs is optional under more general circumstances stated at the end.\n\nThe fallback for obs just slurps the provided data:\n\nobs(func, alg, data...) = data\n\nThe only contractual obligation of obs is to return an object implementing the getobs/numobs interface. Generally it suffices to overload Base.getindex and Base.length. However, note that implementations of obsfit, obspredict, and obstransform depend on the form of output of obs.\n\nIf overloaded, you must include obs in the tuple returned by the LearnAPI.functions trait. \n\nSample implementation\n\nSuppose that fit, for an algorithm of type Alg, is to have the primary signature\n\nfit(algorithm::Alg, X, y)\n\nwhere X is a table, y a vector. Internally, the algorithm is to call a lower level function\n\ntrain(A, names, y)\n\nwhere A = Tables.matrix(X)' and names are the column names of X. 
Then relevant parts of an implementation might look like this:\n\n# thin wrapper for algorithm-specific representation of data:\nstruct ObsData{T}\n    A::Matrix{T}\n    names::Vector{Symbol}\n    y::Vector{T}\nend\n\n# (indirect) implementation of `getobs/numobs`:\nBase.getindex(data::ObsData, I) =\n    ObsData(data.A[:,I], data.names, data.y[I])\nBase.length(data::ObsData) = length(data.y)\n\n# implementation of `obs`:\nfunction LearnAPI.obs(::typeof(fit), ::Alg, X, y)\n    table = Tables.columntable(X)\n    names = Tables.columnnames(table) |> collect\n    return ObsData(Tables.matrix(table)', names, y)\nend\n\n# implementation of `obsfit`:\nfunction LearnAPI.obsfit(algorithm::Alg, data::ObsData, verbosity)\n    model = train(data.A, data.names, data.y)\n    verbosity > 0 && @info \"Training using these features: $(data.names).\"\n \n    return model\nend\n\nWhen is overloading obs optional?\n\nOverloading obs is optional, for a given typeof(algorithm) and typeof(func), if the components of data in the standard call func(algorithm_or_model, data...) are already expected to separately implement the getobs/numobs interface. This is true for arrays whose last dimension is the observation dimension, and for suitable tables.\n\n\n\n\n\n","category":"function"},{"location":"","page":"Home","title":"Home","text":"\n\nLearnAPI.jl\n
\n\nA base Julia interface for machine learning and statistics \n
\n
","category":"page"},{"location":"","page":"Home","title":"Home","text":"LearnAPI.jl is a lightweight, functional-style interface, providing a collection of methods, such as fit and predict, to be implemented by algorithms from machine learning and statistics. Through such implementations, these algorithms buy into functionality, such as hyperparameter optimization, as provided by ML/statistics toolboxes and other packages. LearnAPI.jl also provides a number of Julia traits for promising specific behavior.","category":"page"},{"location":"","page":"Home","title":"Home","text":"🚧","category":"page"},{"location":"","page":"Home","title":"Home","text":"warning: Warning\nThe API described here is under active development and not ready for adoption. Join an ongoing design discussion at this Julia Discourse thread.","category":"page"},{"location":"#Sample-workflow","page":"Home","title":"Sample workflow","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Suppose forest is some object encapsulating the hyperparameters of the random forest algorithm (the number of trees, etc.). 
Then, a LearnAPI.jl interface can be implemented, for objects with the type of forest, to enable the following basic workflow:","category":"page"},{"location":"","page":"Home","title":"Home","text":"X = \ny = \nXnew = \n\n# Train:\nmodel = fit(forest, X, y)\n\n# Predict probability distributions:\npredict(model, Distribution(), Xnew)\n\n# Generate point predictions:\nŷ = predict(model, LiteralTarget(), Xnew) # or `predict(model, Xnew)`\n\n# Apply an \"accessor function\" to inspect byproducts of training:\nLearnAPI.feature_importances(model)\n\n# Slim down and otherwise prepare model for serialization:\nsmall_model = minimize(model)\nserialize(\"my_random_forest.jls\", small_model)\n\n# Recover saved model and algorithm configuration:\nrecovered_model = deserialize(\"my_random_forest.jls\")\n@assert LearnAPI.algorithm(recovered_model) == forest\n@assert predict(recovered_model, LiteralTarget(), Xnew) == ŷ","category":"page"},{"location":"","page":"Home","title":"Home","text":"Distribution and LiteralTarget are singleton types owned by LearnAPI.jl. They allow dispatch based on the kind of target proxy, a key LearnAPI.jl concept. LearnAPI.jl places more emphasis on the notion of target variables and target proxies than on the usual supervised/unsupervised learning dichotomy. From this point of view, a supervised algorithm is simply one in which a target variable exists, and happens to appear as an input to training but not to prediction.","category":"page"},{"location":"","page":"Home","title":"Home","text":"In LearnAPI.jl, a method called obs gives users access to an \"internal\", algorithm-specific, representation of input data, which is always \"observation-accessible\", in the sense that it can be resampled using MLUtils.jl getobs/numobs interface. 
The implementation can arrange for this resampling to be efficient, and workflows based on obs can have performance benefits.","category":"page"},{"location":"#Learning-more","page":"Home","title":"Learning more","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Anatomy of an Implementation: informal introduction to the main actors in a new LearnAPI.jl implementation\nReference: official specification\nCommon Implementation Patterns: implementation suggestions for common, informally defined, algorithm types\nTesting an Implementation","category":"page"},{"location":"patterns/outlier_detection/#Outlier-Detection","page":"Outlier Detection","title":"Outlier Detection","text":"","category":"section"},{"location":"patterns/incremental_algorithms/#Incremental-Models","page":"Incremental Models","title":"Incremental Models","text":"","category":"section"}] } diff --git a/dev/testing_an_implementation/index.html b/dev/testing_an_implementation/index.html index f54f47a8..0722c34c 100644 --- a/dev/testing_an_implementation/index.html +++ b/dev/testing_an_implementation/index.html @@ -1,2 +1,2 @@ -Testing an Implementation · LearnAPI.jl +Testing an Implementation · LearnAPI.jl diff --git a/dev/traits/index.html b/dev/traits/index.html index 955f6182..f1a8c5a7 100644 --- a/dev/traits/index.html +++ b/dev/traits/index.html @@ -3,25 +3,25 @@ MyAlgorithmType, is_pure_julia = true, pkg_name = "MyPackage", -)

The global trait contracts

To ensure that trait metadata can be stored in an external algorithm registry, LearnAPI.jl requires:

  1. Finiteness: The value of a trait is the same for all algorithms with the same underlying UnionAll type. That is, even if the type parameters are different, the trait should be the same. There is an exception if is_composite(algorithm) = true.

  2. Serializability: The value of any trait can be evaluated without installing any third party package; using LearnAPI should suffice.

Because of 1, combining a lot of functionality into one algorithm (e.g. the algorithm can perform both classification and regression) can mean traits are necessarily less informative (as in LearnAPI.predict_type(algorithm) = Any).

Reference

LearnAPI.functionsFunction
LearnAPI.functions(algorithm)

Return a tuple of functions that can be sensibly applied to algorithm, or to objects having the same type as algorithm, or to associated models (objects returned by fit(algorithm, ...)). Algorithm traits are excluded.

In addition to functions, the returned tuple may include expressions, like :(DecisionTree.print_tree), which reference functions not owned by LearnAPI.jl.

The understanding is that algorithm is a LearnAPI-compliant object whenever this is non-empty.

Extended help

New implementations

All new implementations must overload this trait. Here's a checklist for elements in the return value:

functionneeds explicit implementation?include in returned tuple?
fitnoyes
obsfityesyes
minimizeoptionalyes
predictnoif obspredict is implemented
obspredictoptionalif implemented
transformnoif obstransform is implemented
obstransformoptionalif implemented
obsoptionalyes
inverse_transformoptionalif implemented
LearnAPI.algorithmyesyes

Also include any implemented accessor functions. The LearnAPI.jl accessor functions are: LearnAPI.extras, LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.

source
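As an illustration only, suppose a hypothetical algorithm type MyRidge (the name and its accessor functions are assumptions, not part of the specification) implements fit/obsfit, predict/obspredict, minimize, obs and the LearnAPI.coefficients accessor. Following the checklist above, its declaration might be sketched as:

```julia
# Hypothetical declaration, following the checklist above:
@trait MyRidge functions = (
    fit,
    obsfit,
    minimize,
    predict,
    obspredict,
    obs,
    LearnAPI.algorithm,     # always included
    LearnAPI.coefficients,  # an implemented accessor function
)
```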
LearnAPI.kinds_of_proxyFunction
LearnAPI.kinds_of_proxy(algorithm)

Returns a tuple of all instances, kind, for which predict(algorithm, kind, data...) has a guaranteed implementation. Each such kind subtypes LearnAPI.KindOfProxy. Examples are LiteralTarget() (for predicting actual target values) and Distribution() (for predicting probability mass/density functions).

See also LearnAPI.predict, LearnAPI.KindOfProxy.

Extended help

New implementations

Implementation is optional but recommended whenever predict is overloaded.

Elements of the returned tuple must be one of these: ConfidenceInterval, Continuous, Distribution, LabelAmbiguous, LabelAmbiguousDistribution, LabelAmbiguousSampleable, LiteralTarget, LogDistribution, LogProbability, OutlierScore, Parametric, ProbabilisticSet, Probability, Sampleable, Set, SurvivalDistribution, SurvivalFunction, IID, JointDistribution, JointLogDistribution and JointSampleable.

Suppose, for example, we have the following implementation of a supervised learner returning only probabilistic predictions:

LearnAPI.predict(algorithm::MyNewAlgorithmType, LearnAPI.Distribution(), Xnew) = ...

Then we can declare

@trait MyNewAlgorithmType kinds_of_proxy = (LearnAPI.Distribution(),)

For more on target variables and target proxies, refer to the LearnAPI documentation.

source
LearnAPI.position_of_targetFunction
LearnAPI.position_of_target(algorithm)

Return the expected position of the target variable within data in calls of the form LearnAPI.fit(algorithm, verbosity, data...).

If this number is 0, then no target is expected. If this number exceeds length(data), then data is understood to exclude the target variable.

source
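For example, a supervised algorithm trained with calls of the form fit(algorithm, verbosity, X, y), with the target y appearing second in data, might declare (MyRidge being a hypothetical algorithm type):

```julia
# Hypothetical: `y` is the second element of `data` in fit calls:
@trait MyRidge position_of_target = 2
```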
LearnAPI.position_of_weightsFunction
LearnAPI.position_of_weights(algorithm)

Return the expected position of per-observation weights within data in calls of the form LearnAPI.fit(algorithm, data...).

If this number is 0, then no weights are expected. If this number exceeds length(data), then data is understood to exclude weights, which are assumed to be uniform.

source
LearnAPI.descriptorsFunction
LearnAPI.descriptors(algorithm)

Lists one or more suggestive algorithm descriptors from this list: :regression, :classification, :clustering, :gradient_descent, :iterative_algorithms, :incremental_algorithms, :dimension_reduction, :encoders, :static_algorithms, :missing_value_imputation, :ensemble_algorithms, :wrappers, :time_series_forecasting, :time_series_classification, :survival_analysis, :distribution_fitters, :Bayesian_algorithms, :outlier_detection, :collaborative_filtering, :text_analysis, :audio_analysis, :natural_language_processing, :image_processing (do LearnAPI.descriptors() to reproduce).

Warning

The value of this trait guarantees no particular behavior. The trait is intended for informal classification purposes only.

New implementations

This trait should return a tuple of symbols, as in (:classification, :text_analysis).

source
LearnAPI.is_pure_juliaFunction
LearnAPI.is_pure_julia(algorithm)

Returns true if training the algorithm requires evaluation of pure Julia code only.

New implementations

The fallback is false.

source
LearnAPI.pkg_nameFunction
LearnAPI.pkg_name(algorithm)

Return the name of the package module which supplies the core training algorithm for algorithm. This is not necessarily the package providing the LearnAPI interface.

Returns "unknown" if the algorithm implementation has failed to overload the trait.

New implementations

Must return a string, as in "DecisionTree".

source
LearnAPI.pkg_licenseFunction
LearnAPI.pkg_license(algorithm)

Return the name of the software license, such as "MIT", applying to the package where the core algorithm for algorithm is implemented.

source
LearnAPI.doc_urlFunction
LearnAPI.doc_url(algorithm)

Return a url where the core algorithm for algorithm is documented.

Returns "unknown" if the algorithm implementation has failed to overload the trait.

New implementations

Must return a string, such as "https://en.wikipedia.org/wiki/Decision_tree_learning".

source
LearnAPI.load_pathFunction
LearnAPI.load_path(algorithm)

Return a string indicating where the struct for typeof(algorithm) can be found, beginning with the name of the package module defining it. For example, a return value of "FastTrees.LearnAPI.DecisionTreeClassifier" means the following Julia code will return the algorithm type:

import FastTrees
-FastTrees.LearnAPI.DecisionTreeClassifier

Returns "unknown" if the algorithm implementation has failed to overload the trait.

source
LearnAPI.is_compositeFunction
LearnAPI.is_composite(algorithm)

Returns true if one or more properties (fields) of algorithm may themselves be algorithms, and false otherwise.

See also [LearnAPI.components](@ref).

New implementations

This trait should be overloaded if one or more properties (fields) of algorithm may take algorithm values. Fallback return value is false. The keyword constructor for such an algorithm need not prescribe defaults for algorithm-valued properties. Implementation of the accessor function LearnAPI.components is recommended.

The value of the trait must depend only on the type of algorithm.

source
LearnAPI.human_nameFunction
LearnAPI.human_name(algorithm)

A human-readable string representation of typeof(algorithm). Primarily intended for auto-generation of documentation.

New implementations

Optional. A fallback takes the type name, inserts spaces and removes capitalization. For example, KNNRegressor becomes "knn regressor". Better would be to overload the trait to return "K-nearest neighbors regressor". Ideally, this is a "concrete" noun like "ridge regressor" rather than an "abstract" noun like "ridge regression".

source
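A hypothetical overload, using the @trait convenience macro shown earlier, might read:

```julia
# Replace the fallback "knn regressor" with something more idiomatic:
@trait KNNRegressor human_name = "K-nearest neighbors regressor"
```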
LearnAPI.iteration_parameterFunction
LearnAPI.iteration_parameter(algorithm)

The name of the iteration parameter of algorithm, or nothing if the algorithm is not iterative.

New implementations

Implement if algorithm is iterative. Returns a symbol or nothing.

source
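For instance, a hypothetical boosting algorithm whose number of rounds is controlled by an n_iterations hyperparameter would declare:

```julia
@trait MyBooster iteration_parameter = :n_iterations
```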
LearnAPI.fit_scitypeFunction
LearnAPI.fit_scitype(algorithm)

Return an upper bound on the scitype of data guaranteed to work when calling fit(algorithm, data...).

Specifically, if the return value is S and ScientificTypes.scitype(data) <: S, then all the following calls are guaranteed to work:

fit(algorithm, data...)
+)

The global trait contracts

To ensure that trait metadata can be stored in an external algorithm registry, LearnAPI.jl requires:

  1. Finiteness: The value of a trait is the same for all algorithms with the same underlying UnionAll type. That is, even if the type parameters are different, the trait should be the same. There is an exception if is_composite(algorithm) = true.

  2. Serializability: The value of any trait can be evaluated without installing any third party package; using LearnAPI should suffice.

Because of 1, combining a lot of functionality into one algorithm (e.g. the algorithm can perform both classification and regression) can mean traits are necessarily less informative (as in LearnAPI.predict_type(algorithm) = Any).

Reference

LearnAPI.functionsFunction
LearnAPI.functions(algorithm)

Return a tuple of functions that can be sensibly applied to algorithm, or to objects having the same type as algorithm, or to associated models (objects returned by fit(algorithm, ...)). Algorithm traits are excluded.

In addition to functions, the returned tuple may include expressions, like :(DecisionTree.print_tree), which reference functions not owned by LearnAPI.jl.

The understanding is that algorithm is a LearnAPI-compliant object whenever this is non-empty.

Extended help

New implementations

All new implementations must overload this trait. Here's a checklist for elements in the return value:

functionneeds explicit implementation?include in returned tuple?
fitnoyes
obsfityesyes
minimizeoptionalyes
predictnoif obspredict is implemented
obspredictoptionalif implemented
transformnoif obstransform is implemented
obstransformoptionalif implemented
obsoptionalyes
inverse_transformoptionalif implemented
LearnAPI.algorithmyesyes

Also include any implemented accessor functions. The LearnAPI.jl accessor functions are: LearnAPI.extras, LearnAPI.algorithm, LearnAPI.coefficients, LearnAPI.intercept, LearnAPI.tree, LearnAPI.trees, LearnAPI.feature_importances, LearnAPI.training_labels, LearnAPI.training_losses, LearnAPI.training_scores and LearnAPI.components.

source
LearnAPI.kinds_of_proxyFunction
LearnAPI.kinds_of_proxy(algorithm)

Returns a tuple of all instances, kind, for which predict(algorithm, kind, data...) has a guaranteed implementation. Each such kind subtypes LearnAPI.KindOfProxy. Examples are LiteralTarget() (for predicting actual target values) and Distribution() (for predicting probability mass/density functions).

See also LearnAPI.predict, LearnAPI.KindOfProxy.

Extended help

New implementations

Implementation is optional but recommended whenever predict is overloaded.

Elements of the returned tuple must be one of these: ConfidenceInterval, Continuous, Distribution, LabelAmbiguous, LabelAmbiguousDistribution, LabelAmbiguousSampleable, LiteralTarget, LogDistribution, LogProbability, OutlierScore, Parametric, ProbabilisticSet, Probability, Sampleable, Set, SurvivalDistribution, SurvivalFunction, IID, JointDistribution, JointLogDistribution and JointSampleable.

Suppose, for example, we have the following implementation of a supervised learner returning only probabilistic predictions:

LearnAPI.predict(algorithm::MyNewAlgorithmType, LearnAPI.Distribution(), Xnew) = ...

Then we can declare

@trait MyNewAlgorithmType kinds_of_proxy = (LearnAPI.Distribution(),)

For more on target variables and target proxies, refer to the LearnAPI documentation.

source
LearnAPI.position_of_targetFunction
LearnAPI.position_of_target(algorithm)

Return the expected position of the target variable within data in calls of the form LearnAPI.fit(algorithm, verbosity, data...).

If this number is 0, then no target is expected. If this number exceeds length(data), then data is understood to exclude the target variable.

source
LearnAPI.position_of_weightsFunction
LearnAPI.position_of_weights(algorithm)

Return the expected position of per-observation weights within data in calls of the form LearnAPI.fit(algorithm, data...).

If this number is 0, then no weights are expected. If this number exceeds length(data), then data is understood to exclude weights, which are assumed to be uniform.

source
LearnAPI.descriptorsFunction
LearnAPI.descriptors(algorithm)

Lists one or more suggestive algorithm descriptors from this list: :regression, :classification, :clustering, :gradient_descent, :iterative_algorithms, :incremental_algorithms, :dimension_reduction, :encoders, :static_algorithms, :missing_value_imputation, :ensemble_algorithms, :wrappers, :time_series_forecasting, :time_series_classification, :survival_analysis, :distribution_fitters, :Bayesian_algorithms, :outlier_detection, :collaborative_filtering, :text_analysis, :audio_analysis, :natural_language_processing, :image_processing (do LearnAPI.descriptors() to reproduce).

Warning

The value of this trait guarantees no particular behavior. The trait is intended for informal classification purposes only.

New implementations

This trait should return a tuple of symbols, as in (:classification, :text_analysis).

source
LearnAPI.is_pure_juliaFunction
LearnAPI.is_pure_julia(algorithm)

Returns true if training the algorithm requires evaluation of pure Julia code only.

New implementations

The fallback is false.

source
LearnAPI.pkg_nameFunction
LearnAPI.pkg_name(algorithm)

Return the name of the package module which supplies the core training algorithm for algorithm. This is not necessarily the package providing the LearnAPI interface.

Returns "unknown" if the algorithm implementation has failed to overload the trait.

New implementations

Must return a string, as in "DecisionTree".

source
LearnAPI.pkg_licenseFunction
LearnAPI.pkg_license(algorithm)

Return the name of the software license, such as "MIT", applying to the package where the core algorithm for algorithm is implemented.

source
LearnAPI.doc_urlFunction
LearnAPI.doc_url(algorithm)

Return a url where the core algorithm for algorithm is documented.

Returns "unknown" if the algorithm implementation has failed to overload the trait.

New implementations

Must return a string, such as "https://en.wikipedia.org/wiki/Decision_tree_learning".

source
LearnAPI.load_pathFunction
LearnAPI.load_path(algorithm)

Return a string indicating where the struct for typeof(algorithm) can be found, beginning with the name of the package module defining it. For example, a return value of "FastTrees.LearnAPI.DecisionTreeClassifier" means the following Julia code will return the algorithm type:

import FastTrees
+FastTrees.LearnAPI.DecisionTreeClassifier

Returns "unknown" if the algorithm implementation has failed to overload the trait.

source
LearnAPI.is_compositeFunction
LearnAPI.is_composite(algorithm)

Returns true if one or more properties (fields) of algorithm may themselves be algorithms, and false otherwise.

See also [LearnAPI.components](@ref).

New implementations

This trait should be overloaded if one or more properties (fields) of algorithm may take algorithm values. Fallback return value is false. The keyword constructor for such an algorithm need not prescribe defaults for algorithm-valued properties. Implementation of the accessor function LearnAPI.components is recommended.

The value of the trait must depend only on the type of algorithm.

source
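As a sketch only, a hypothetical ensemble wrapper whose atom property holds another algorithm might look like this (all names are assumptions):

```julia
struct MyEnsemble
    atom      # any LearnAPI-compliant algorithm
    n::Int
end

# Keyword constructor; no default is prescribed for the algorithm-valued property:
MyEnsemble(; atom, n=100) = MyEnsemble(atom, n)

@trait MyEnsemble is_composite = true
```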
LearnAPI.human_nameFunction
LearnAPI.human_name(algorithm)

A human-readable string representation of typeof(algorithm). Primarily intended for auto-generation of documentation.

New implementations

Optional. A fallback takes the type name, inserts spaces and removes capitalization. For example, KNNRegressor becomes "knn regressor". Better would be to overload the trait to return "K-nearest neighbors regressor". Ideally, this is a "concrete" noun like "ridge regressor" rather than an "abstract" noun like "ridge regression".

source
LearnAPI.iteration_parameterFunction
LearnAPI.iteration_parameter(algorithm)

The name of the iteration parameter of algorithm, or nothing if the algorithm is not iterative.

New implementations

Implement if algorithm is iterative. Returns a symbol or nothing.

source
LearnAPI.fit_scitypeFunction
LearnAPI.fit_scitype(algorithm)

Return an upper bound on the scitype of data guaranteed to work when calling fit(algorithm, data...).

Specifically, if the return value is S and ScientificTypes.scitype(data) <: S, then all the following calls are guaranteed to work:

fit(algorithm, data...)
 obsdata = obs(fit, algorithm, data...)
-fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.fit_typeFunction
LearnAPI.fit_type(algorithm)

Return an upper bound on the type of data guaranteed to work when calling fit(algorithm, data...).

Specifically, if the return value is T and typeof(data) <: T, then all the following calls are guaranteed to work:

fit(algorithm, data...)
+fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.fit_typeFunction
LearnAPI.fit_type(algorithm)

Return an upper bound on the type of data guaranteed to work when calling fit(algorithm, data...).

Specifically, if the return value is T and typeof(data) <: T, then all the following calls are guaranteed to work:

fit(algorithm, data...)
 obsdata = obs(fit, algorithm, data...)
-fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_scitype, LearnAPI.fit_observation_type, LearnAPI.fit_observation_scitype.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.fit_observation_scitypeFunction
LearnAPI.fit_observation_scitype(algorithm)

Return an upper bound on the scitype of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by S, supposing S != Union{}, and that user supplies data satisfying

ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S

for any valid index i, then all the following are guaranteed to work:

fit(algorithm, data...)
+fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_scitype, LearnAPI.fit_observation_type, LearnAPI.fit_observation_scitype.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.fit_observation_scitypeFunction
LearnAPI.fit_observation_scitype(algorithm)

Return an upper bound on the scitype of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by S, supposing S != Union{}, and that user supplies data satisfying

ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S

for any valid index i, then all the following are guaranteed to work:

fit(algorithm, data...)
 obsdata = obs(fit, algorithm, data...)
-fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.fit_observation_typeFunction
LearnAPI.fit_observation_type(algorithm)

Return an upper bound on the type of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by T, supposing T != Union{}, and that user supplies data satisfying

typeof(MLUtils.getobs(data, i)) <: T

for any valid index i, then the following is guaranteed to work:

fit(algorithm, data...)
+fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.fit_observation_typeFunction
LearnAPI.fit_observation_type(algorithm)

Return an upper bound on the type of observations guaranteed to work when calling fit(algorithm, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by T, supposing T != Union{}, and that user supplies data satisfying

typeof(MLUtils.getobs(data, i)) <: T

for any valid index i, then the following is guaranteed to work:

fit(algorithm, data...)
 obsdata = obs(fit, algorithm, data...)
-fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_scitype.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.predict_input_scitypeFunction
 LearnAPI.predict_input_scitype(algorithm)

Return an upper bound on the scitype of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).

Specifically, if S is the value returned and ScientificTypes.scitype(data) <: S, then the following is guaranteed to work:

predict(model, kind_of_proxy, data...)
obsdata = obs(predict, algorithm, data...)
predict(model, kind_of_proxy, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.predict_input_type.

New implementations

Implementation is optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.

source
LearnAPI.predict_input_observation_scitypeFunction
LearnAPI.predict_observation_scitype(algorithm)

Return an upper bound on the scitype of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by S, supposing S != Union{}, and that user supplies data satisfying

ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S

for any valid index i, then all the following are guaranteed to work:

predict(model, kind_of_proxy, data...)
+fit(algorithm, Obs(), obsdata)

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_scitype.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.fit_scitype, LearnAPI.fit_type, LearnAPI.fit_observation_scitype, LearnAPI.fit_observation_type.

source
LearnAPI.predict_input_scitypeFunction
 LearnAPI.predict_input_scitype(algorithm)

Return an upper bound on the scitype of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).

Specifically, if S is the value returned and ScientificTypes.scitype(data) <: S, then the following is guaranteed to work:

predict(model, kind_of_proxy, data...)
obsdata = obs(predict, algorithm, data...)
predict(model, kind_of_proxy, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.predict_input_type.

New implementations

Implementation is optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.

source
LearnAPI.predict_input_observation_scitypeFunction
LearnAPI.predict_observation_scitype(algorithm)

Return an upper bound on the scitype of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by S, supposing S != Union{}, and that user supplies data satisfying

ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S

for any valid index i, then all the following are guaranteed to work:

predict(model, kind_of_proxy, data...)
 obsdata = obs(predict, algorithm, data...)
-predict(model, kind_of_proxy, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.

source
LearnAPI.predict_input_typeFunction
LearnAPI.predict_input_type(algorithm)

Return an upper bound on the type of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).

Specifically, if T is the value returned and typeof(data) <: T, then the following is guaranteed to work:

predict(model, kind_of_proxy, data...)
+predict(model, kind_of_proxy, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.

source
LearnAPI.predict_input_typeFunction
LearnAPI.predict_input_type(algorithm)

Return an upper bound on the type of data guaranteed to work in the call predict(algorithm, kind_of_proxy, data...).

Specifically, if T is the value returned and typeof(data) <: T, then the following is guaranteed to work:

predict(model, kind_of_proxy, data...)
 obsdata = obs(predict, model, data...)
-predict(model, kind_of_proxy, Obs(), obsdata)

See also LearnAPI.predict_input_scitype.

New implementations

Implementation is optional. The fallback return value is Union{}. Should not be overloaded if LearnAPI.predict_input_scitype is overloaded.

source
LearnAPI.predict_input_observation_typeFunction
LearnAPI.predict_observation_type(algorithm)

Return an upper bound on the type of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by T, supposing T != Union{}, and that user supplies data satisfying

typeof(MLUtils.getobs(data, i)) <: T

for any valid index i, then all the following are guaranteed to work:

predict(model, kind_of_proxy, data...)
+predict(model, kind_of_proxy, Obs(), obsdata)

See also LearnAPI.predict_input_scitype.

New implementations

Implementation is optional. The fallback return value is Union{}. Should not be overloaded if LearnAPI.predict_input_scitype is overloaded.

source
LearnAPI.predict_input_observation_typeFunction
LearnAPI.predict_observation_type(algorithm)

Return an upper bound on the type of observations guaranteed to work when calling predict(model, kind_of_proxy, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by T, supposing T != Union{}, and that user supplies data satisfying

typeof(MLUtils.getobs(data, i)) <: T

for any valid index i, then all the following are guaranteed to work:

predict(model, kind_of_proxy, data...)
 obsdata = obs(predict, algorithm, data...)
-predict(model, kind_of_proxy, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.predict_scitype, LearnAPI.predict_type, LearnAPI.predict_observation_scitype, LearnAPI.predict_observation_type.

source
LearnAPI.predict_output_scitypeFunction
LearnAPI.predict_output_scitype(algorithm, kind_of_proxy::KindOfProxy)

Return an upper bound for the scitypes of predictions of the specified form where supported, and otherwise return Any. For example, if

ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)

successfully returns (i.e., algorithm supports predictions of target probability distributions), then the following is guaranteed to hold:

scitype(ŷ) <: LearnAPI.predict_output_scitype(algorithm, LearnAPI.Distribution())

Note. This trait has a single-argument "convenience" version LearnAPI.predict_output_scitype(algorithm) derived from this one, which returns a dictionary keyed on target proxy types.

See also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_scitype.

New implementations

Overloading the trait is optional. Here's a sample implementation for a supervised regressor type MyRgs that only predicts actual values of the target:

@trait MyRgs predict_output_scitype = AbstractVector{ScientificTypesBase.Continuous}

The fallback method returns Any.

source
LearnAPI.predict_output_scitype(algorithm)

Return a dictionary of upper bounds on the scitype of predictions, keyed on concrete subtypes of LearnAPI.KindOfProxy. Each of these subtypes represents a different form of target prediction (LiteralTarget, Distribution, SurvivalFunction, etc.) possibly supported by algorithm, but the existence of a key does not guarantee that form is supported.

As an example, if

ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)

successfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:

scitype(ŷ) <: LearnAPI.predict_output_scitype(algorithm)[LearnAPI.Distribution]

See also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_scitype.

New implementations

This single-argument trait should not be overloaded. Instead, overload LearnAPI.predict_output_scitype(algorithm, kind_of_proxy).

source
LearnAPI.predict_output_typeFunction
LearnAPI.predict_output_type(algorithm, kind_of_proxy::KindOfProxy)

Return an upper bound for the types of predictions of the specified form where supported, and otherwise return Any. For example, if

ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)

successfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:

typeof(ŷ) <: LearnAPI.predict_output_type(algorithm, LearnAPI.Distribution())

Note. This trait has a single-argument "convenience" version LearnAPI.predict_output_type(algorithm) derived from this one, which returns a dictionary keyed on target proxy types.

See also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_type.

New implementations

Overloading the trait is optional. Here's a sample implementation for a supervised regressor type MyRgs that only predicts actual values of the target:

@trait MyRgs predict_output_type = AbstractVector{Float64}

The fallback method returns Any.

source
LearnAPI.predict_output_type(algorithm)

Return a dictionary of upper bounds on the type of predictions, keyed on concrete subtypes of LearnAPI.KindOfProxy. Each of these subtypes represents a different form of target prediction (LiteralTarget, Distribution, SurvivalFunction, etc.) possibly supported by algorithm, but the existence of a key does not guarantee that form is supported.

As an example, if

ŷ = LearnAPI.predict(model, LearnAPI.Distribution(), data...)

successfully returns (i.e., algorithm supports predictions of target probability distributions) then the following is guaranteed to hold:

typeof(ŷ) <: LearnAPI.predict_output_type(algorithm)[LearnAPI.Distribution]

See also LearnAPI.KindOfProxy, LearnAPI.predict, LearnAPI.predict_input_type.

New implementations

This single-argument trait should not be overloaded. Instead, overload LearnAPI.predict_output_type(algorithm, kind_of_proxy).

source
LearnAPI.transform_input_scitypeFunction
LearnAPI.transform_input_scitype(algorithm)

Return an upper bound on the scitype of data guaranteed to work in the call transform(algorithm, data...).

Specifically, if S is the value returned and ScientificTypes.scitype(data) <: S, then the following is guaranteed to work:

transform(model, data...)
obsdata = obs(transform, algorithm, data...)
transform(model, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.transform_input_type.

New implementations

Implementation is optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_scitype, LearnAPI.transform_type, LearnAPI.transform_observation_scitype, LearnAPI.transform_observation_type.

source
LearnAPI.transform_input_observation_scitypeFunction
LearnAPI.transform_observation_scitype(algorithm)

Return an upper bound on the scitype of observations guaranteed to work when calling transform(model, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by S, supposing S != Union{}, and that the user supplies data satisfying

ScientificTypes.scitype(MLUtils.getobs(data, i)) <: S

for any valid index i, then all the following are guaranteed to work:

transform(model, data...)
obsdata = obs(transform, algorithm, data...)
transform(model, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_scitype, LearnAPI.transform_type, LearnAPI.transform_observation_scitype, LearnAPI.transform_observation_type.

source
LearnAPI.transform_input_typeFunction
LearnAPI.transform_input_type(algorithm)

Return an upper bound on the type of data guaranteed to work in the call transform(algorithm, data...).

Specifically, if T is the value returned and typeof(data) <: T, then the following is guaranteed to work:

transform(model, data...)
obsdata = obs(transform, model, data...)
transform(model, Obs(), obsdata)

See also LearnAPI.transform_input_scitype.

New implementations

Implementation is optional. The fallback return value is Union{}. Should not be overloaded if LearnAPI.transform_input_scitype is overloaded.
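As a hedged sketch of an overload, following the @trait pattern used in the examples above (MyTransformer is a hypothetical type, and the bound shown is only illustrative):

```julia
# Hypothetical sketch: guarantee that a single real-matrix argument works
# in `transform(algorithm, data...)`. The bound is on `typeof(data)`, where
# `data` is the tuple of arguments.
@trait MyTransformer transform_input_type = Tuple{AbstractMatrix{<:Real}}
```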

source
LearnAPI.transform_input_observation_typeFunction
LearnAPI.transform_observation_type(algorithm)

Return an upper bound on the type of observations guaranteed to work when calling transform(model, data...), independent of the type/scitype of the data container itself. Here "observations" is in the sense of MLUtils.jl. Assuming this trait has a value different from Union{}, the understanding is that data implements the MLUtils.jl getobs/numobs interface.

Specifically, denoting the type returned above by T, supposing T != Union{}, and that the user supplies data satisfying

typeof(MLUtils.getobs(data, i)) <: T

for any valid index i, then all the following are guaranteed to work:

transform(model, data...)
obsdata = obs(transform, algorithm, data...)
transform(model, Obs(), obsdata)

whenever algorithm = LearnAPI.algorithm(model).

See also LearnAPI.fit_type, LearnAPI.fit_scitype, LearnAPI.fit_observation_type.

New implementations

Optional. The fallback return value is Union{}. Ordinarily, at most one of the following should be overloaded for a given algorithm: LearnAPI.transform_scitype, LearnAPI.transform_type, LearnAPI.transform_observation_scitype, LearnAPI.transform_observation_type.

source
LearnAPI.predict_or_transform_mutatesFunction
LearnAPI.predict_or_transform_mutates(algorithm)

Return true if predict or transform possibly mutate their first argument, model, whenever LearnAPI.algorithm(model) == algorithm. If false, no arguments are ever mutated.

New implementations

This trait, falling back to false, may only be overloaded when fit has no data arguments (algorithm does not generalize to new data). See more at fit.
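For a static (non-generalizing) algorithm whose transform updates state stored in model, the declaration might look as follows, in the @trait style of the other examples on this page (MyStaticTransformer is hypothetical):

```julia
# Hypothetical sketch: flag that `predict`/`transform` may mutate `model`.
# Per the docstring above, this is only permitted when `fit` consumes no
# data arguments, i.e., the algorithm does not generalize to new data.
@trait MyStaticTransformer predict_or_transform_mutates = true
```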

source
LearnAPI.transform_output_typeFunction
LearnAPI.transform_output_type(algorithm)

Return an upper bound on the type of the output of the transform operation.

New implementations

Implementation is optional. The fallback return value is Any.
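Following the pattern of the other trait examples above, an overload might look like this sketch, where MyPCA is a hypothetical dimension-reduction algorithm type whose transform returns a reduced feature matrix:

```julia
# Hypothetical sketch: declare that the output of `transform` is always
# a `Float64` matrix for this illustrative algorithm type.
@trait MyPCA transform_output_type = AbstractMatrix{Float64}
```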

source