diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index fe514dc3..0a09d3ca 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.11.0","generation_timestamp":"2024-10-11T09:01:46","documenter_version":"1.7.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.11.1","generation_timestamp":"2024-11-30T09:18:47","documenter_version":"1.8.0"}}
\ No newline at end of file
diff --git a/dev/api/index.html b/dev/api/index.html
index 46d43c70..6bcd8579 100644
--- a/dev/api/index.html
+++ b/dev/api/index.html
@@ -1,462 +1,2 @@
-
struct to hold information regarding user-specified custom priors.
Usage
The CustomPrior struct has 3 fields:
predictors: the β coefficients.
intercept: the α intercept.
auxiliary: an auxiliary parameter.
Robust models, e.g. linear regression with a Student-t likelihood or count regression with a Negative Binomial likelihood, often need an extra auxiliary parameter to parametrize the model so that it can handle under- or over-dispersion. If you are specifying a custom prior for one of these types of models, you should also specify a prior for the auxiliary parameter.
Non-robust models do not need an auxiliary parameter and you can pass nothing as the auxiliary argument.
Converts a vector v to a vector of indices, i.e. a vector where all the entries are integers. Returns a tuple with the first element as the converted vector and the second element a Dict specifying which string is which integer.
This function is especially useful for random-effects varying-intercept hierarchical models. Normally v would be a vector of group membership with values such as "group_1", "group_2" etc. For random-effect models with varying-intercepts, Turing needs the group membership values to be passed as Ints.
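The string-to-index mapping described above can be sketched in plain Julia. This is a minimal illustration and not TuringGLM's own implementation: the helper name to_indices is hypothetical, and the order in which the real function assigns integers to strings may differ.

```julia
# Hypothetical helper for illustration only (not part of TuringGLM.jl).
# Maps each distinct value in `v` to an integer, in order of first appearance.
function to_indices(v::AbstractVector)
    levels = unique(v)  # distinct values, first-appearance order
    lookup = Dict(level => i for (i, level) in enumerate(levels))
    return [lookup[x] for x in v], lookup
end

groups = ["group_1", "group_2", "group_1", "group_3"]
idx, lookup = to_indices(groups)
# idx is a Vector{Int} of group memberships; lookup maps each string to its integer
```

The returned tuple mirrors the shape described above: the converted vector first, then the Dict recording which string became which integer.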
Constructs the vector(s) or matrix(ces) Z of random-effects (a.k.a. group-level) slope predictors.
Returns a Dict{String, AbstractArray} whose keys are the random-effects slope variables in the formula that are present in data and whose values are the corresponding Vectors or Matrices.
Returns a tuple with the first element as the ID vector of Ints that represent group membership for a specific random-effect intercept group t of observations present in data. The second element of the tuple is a Dict specifying which string is which integer in the ID vector.
Create a Turing model using formula syntax and a data source.
formula
formula uses the same friendly interface for specifying statistical models as brms, rstanarm, bambi, StatsModels.jl and MixedModels.jl. A formula is written with the @formula macro: the dependent variable, followed by a tilde ~, then the independent variables separated by plus signs +.
Example: @formula(y ~ x1 + x2 + x3).
Moderations/interactions can be specified with the asterisk sign *, e.g. x1 * x2. This is expanded to x1 + x2 + x1:x2 because, following the principle of hierarchy, the main effects must be included along with the interaction effect. Here x1:x2 means that the values of x1 are multiplied (interacted) with the values of x2.
Random-effects (a.k.a. group-level effects) can be specified with (term | group) inside the @formula, where term is the independent variable and group is the categorical variable (i.e., either a column of Strings or a CategoricalArray in data). You can specify a random intercept with (1 | group).
Example: @formula(y ~ (1 | group) + x1).
Note: random-effects are currently only implemented for a single group-level intercept. Future versions of TuringGLM.jl will support slope random-effects and multiple group-level effects.
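The three formula shapes described above can be written out as a short sketch, assuming only that StatsModels.jl is installed (it provides the @formula macro). The variable names y, x1, x2, x3 and group are placeholders for columns in your data. Note that StatsModels.jl itself writes the interaction term as x1 & x2, which plays the role of x1:x2 in the text above.

```julia
using StatsModels  # provides @formula and FormulaTerm

# Main effects only
f_main = @formula(y ~ x1 + x2 + x3)

# Interaction: x1 * x2 stands for both main effects plus their interaction
f_inter = @formula(y ~ x1 * x2)

# Random (group-level) intercept for each level of `group`
f_ranef = @formula(y ~ (1 | group) + x1)
```

Each of these is a FormulaTerm that can be passed as the first argument of turing_model.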
data
data can be any Tables.jl-compatible data interface. The most popular ones are DataFrames and NamedTuples.
model
model represents the likelihood function which you want to condition your data on. It has to be a subtype of Distributions.UnivariateDistribution. Currently, TuringGLM.jl supports:
Normal (the default if not specified): linear regression
TDist: robust linear regression
Bernoulli: logistic regression
Poisson: Poisson count data regression
NegativeBinomial: negative binomial robust count data regression
priors
TuringGLM.jl comes with state-of-the-art default priors, based on the literature and the Stan community. By default, turing_model will use DefaultPrior, but you can specify your own with priors=CustomPrior(predictors, intercept, auxiliary). All models take predictors and intercept priors.
Robust models, e.g. linear regression with a Student-t likelihood or count regression with a Negative Binomial likelihood, often need an extra auxiliary parameter to parametrize the model so that it can handle under- or over-dispersion. If you are specifying a custom prior for one of these types of models, you should also specify a prior for the auxiliary parameter.
Non-robust models do not need an auxiliary parameter and you can pass nothing as the auxiliary argument.
Example for a non-robust model: @formula(y, ...), data; priors=CustomPrior(Normal(0, 2.5), Normal(10, 5), nothing)
Example for a robust model: @formula(y, ...), data; priors=CustomPrior(Normal(0, 2.5), Normal(10, 5), Exponential(1))
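The two argument lists above can be expanded into complete calls. This is a hedged sketch assuming TuringGLM.jl, Distributions.jl and StatsModels.jl are installed; the data and the column names y, x1 and x2 are made up for illustration, and the prior values are simply the ones from the examples above, not recommendations.

```julia
using TuringGLM                # assumed to provide turing_model and CustomPrior
using Distributions            # Normal, Exponential, TDist
using StatsModels: @formula

# Synthetic data as a NamedTuple of columns (a Tables.jl-compatible source)
data = (; y = randn(100), x1 = randn(100), x2 = randn(100))

# Non-robust model (default Normal likelihood): no auxiliary prior needed
priors = CustomPrior(Normal(0, 2.5), Normal(10, 5), nothing)
model = turing_model(@formula(y ~ x1 + x2), data; priors=priors)

# Robust model (Student-t likelihood): also give a prior for the auxiliary parameter
robust_priors = CustomPrior(Normal(0, 2.5), Normal(10, 5), Exponential(1))
robust_model = turing_model(@formula(y ~ x1 + x2), data; model=TDist, priors=robust_priors)
```

The positional order follows the CustomPrior(predictors, intercept, auxiliary) signature given above.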
standardize
Whether to standardize your data (true or false) to mean 0 and standard deviation 1 before inference. Some scientific fields prefer to analyze and report effects in terms of standard deviations. Also, whenever measurement scales differ, it is often suggested to standardize the effects for better comparison. By default, turing_model sets standardize=false.
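What standardization does to a predictor matrix can be sketched with the Statistics standard library. This illustrates the transformation only, under the assumption that rows are observations and columns are variables; it is not TuringGLM's internal code, and the matrix X is made-up data.

```julia
using Statistics

# Two predictors on very different scales (illustrative data)
X = [1.0 10.0;
     2.0 20.0;
     3.0 30.0]

μ = mean(X; dims=1)      # 1xK matrix of column means
σ = std(X; dims=1)       # 1xK matrix of column standard deviations
X_std = (X .- μ) ./ σ    # each column now has mean 0 and standard deviation 1
```

After this transformation both columns are on the same scale, so their coefficients are expressed in standard deviations and can be compared directly. To have turing_model apply standardization, pass standardize=true.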
This document was generated with Documenter.jl version 1.8.0 on Saturday 30 November 2024. Using Julia version 1.11.1.
diff --git a/dev/assets/documenter.js b/dev/assets/documenter.js
index 235cb2e5..c802defb 100644
--- a/dev/assets/documenter.js
+++ b/dev/assets/documenter.js
@@ -613,176 +613,194 @@ function worker_function(documenterSearchIndex, documenterBaseURL, filters) {
};
}
-// `worker = Threads.@spawn worker_function(documenterSearchIndex)`, but in JavaScript!
-const filters = [
- ...new Set(documenterSearchIndex["docs"].map((x) => x.category)),
-];
-const worker_str =
- "(" +
- worker_function.toString() +
- ")(" +
- JSON.stringify(documenterSearchIndex["docs"]) +
- "," +
- JSON.stringify(documenterBaseURL) +
- "," +
- JSON.stringify(filters) +
- ")";
-const worker_blob = new Blob([worker_str], { type: "text/javascript" });
-const worker = new Worker(URL.createObjectURL(worker_blob));
-
/////// SEARCH MAIN ///////
-// Whether the worker is currently handling a search. This is a boolean
-// as the worker only ever handles 1 or 0 searches at a time.
-var worker_is_running = false;
-
-// The last search text that was sent to the worker. This is used to determine
-// if the worker should be launched again when it reports back results.
-var last_search_text = "";
-
-// The results of the last search. This, in combination with the state of the filters
-// in the DOM, is used to compute the results to display on calls to update_search.
-var unfiltered_results = [];
-
-// Which filter is currently selected
-var selected_filter = "";
-
-$(document).on("input", ".documenter-search-input", function (event) {
- if (!worker_is_running) {
- launch_search();
- }
-});
-
-function launch_search() {
- worker_is_running = true;
- last_search_text = $(".documenter-search-input").val();
- worker.postMessage(last_search_text);
-}
-
-worker.onmessage = function (e) {
- if (last_search_text !== $(".documenter-search-input").val()) {
- launch_search();
- } else {
- worker_is_running = false;
- }
-
- unfiltered_results = e.data;
- update_search();
-};
+function runSearchMainCode() {
+ // `worker = Threads.@spawn worker_function(documenterSearchIndex)`, but in JavaScript!
+ const filters = [
+ ...new Set(documenterSearchIndex["docs"].map((x) => x.category)),
+ ];
+ const worker_str =
+ "(" +
+ worker_function.toString() +
+ ")(" +
+ JSON.stringify(documenterSearchIndex["docs"]) +
+ "," +
+ JSON.stringify(documenterBaseURL) +
+ "," +
+ JSON.stringify(filters) +
+ ")";
+ const worker_blob = new Blob([worker_str], { type: "text/javascript" });
+ const worker = new Worker(URL.createObjectURL(worker_blob));
+
+ // Whether the worker is currently handling a search. This is a boolean
+ // as the worker only ever handles 1 or 0 searches at a time.
+ var worker_is_running = false;
+
+ // The last search text that was sent to the worker. This is used to determine
+ // if the worker should be launched again when it reports back results.
+ var last_search_text = "";
+
+ // The results of the last search. This, in combination with the state of the filters
+ // in the DOM, is used to compute the results to display on calls to update_search.
+ var unfiltered_results = [];
+
+ // Which filter is currently selected
+ var selected_filter = "";
+
+ $(document).on("input", ".documenter-search-input", function (event) {
+ if (!worker_is_running) {
+ launch_search();
+ }
+ });
-$(document).on("click", ".search-filter", function () {
- if ($(this).hasClass("search-filter-selected")) {
- selected_filter = "";
- } else {
- selected_filter = $(this).text().toLowerCase();
+ function launch_search() {
+ worker_is_running = true;
+ last_search_text = $(".documenter-search-input").val();
+ worker.postMessage(last_search_text);
}
- // This updates search results and toggles classes for UI:
- update_search();
-});
+ worker.onmessage = function (e) {
+ if (last_search_text !== $(".documenter-search-input").val()) {
+ launch_search();
+ } else {
+ worker_is_running = false;
+ }
-/**
- * Make/Update the search component
- */
-function update_search() {
- let querystring = $(".documenter-search-input").val();
+ unfiltered_results = e.data;
+ update_search();
+ };
- if (querystring.trim()) {
- if (selected_filter == "") {
- results = unfiltered_results;
+ $(document).on("click", ".search-filter", function () {
+ if ($(this).hasClass("search-filter-selected")) {
+ selected_filter = "";
} else {
- results = unfiltered_results.filter((result) => {
- return selected_filter == result.category.toLowerCase();
- });
+ selected_filter = $(this).text().toLowerCase();
}
- let search_result_container = ``;
- let modal_filters = make_modal_body_filters();
- let search_divider = ``;
+ // This updates search results and toggles classes for UI:
+ update_search();
+ });
- if (results.length) {
- let links = [];
- let count = 0;
- let search_results = "";
-
- for (var i = 0, n = results.length; i < n && count < 200; ++i) {
- let result = results[i];
- if (result.location && !links.includes(result.location)) {
- search_results += result.div;
- count++;
- links.push(result.location);
- }
- }
+ /**
+ * Make/Update the search component
+ */
+ function update_search() {
+ let querystring = $(".documenter-search-input").val();
- if (count == 1) {
- count_str = "1 result";
- } else if (count == 200) {
- count_str = "200+ results";
+ if (querystring.trim()) {
+ if (selected_filter == "") {
+ results = unfiltered_results;
} else {
- count_str = count + " results";
+ results = unfiltered_results.filter((result) => {
+ return selected_filter == result.category.toLowerCase();
+ });
}
- let result_count = `
${count_str}
`;
- search_result_container = `
+ let search_result_container = ``;
+ let modal_filters = make_modal_body_filters();
+ let search_divider = ``;
+
+ if (results.length) {
+ let links = [];
+ let count = 0;
+ let search_results = "";
+
+ for (var i = 0, n = results.length; i < n && count < 200; ++i) {
+ let result = results[i];
+ if (result.location && !links.includes(result.location)) {
+ search_results += result.div;
+ count++;
+ links.push(result.location);
+ }
+ }
+
+ if (count == 1) {
+ count_str = "1 result";
+ } else if (count == 200) {
+ count_str = "200+ results";
+ } else {
+ count_str = count + " results";
+ }
+ let result_count = `
`;
+function waitUntilSearchIndexAvailable() {
+ // It is possible that the documenter.js script runs before the page
+ // has finished loading and documenterSearchIndex gets defined.
+ // So we need to wait until the search index actually loads before setting
+ // up all the search-related stuff.
+ if (typeof documenterSearchIndex !== "undefined") {
+ runSearchMainCode();
+ } else {
+ console.warn("Search Index not available, waiting");
+ setTimeout(waitUntilSearchIndexAvailable, 1000);
+ }
}
+// The actual entry point to the search code
+waitUntilSearchIndexAvailable();
+
})
////////////////////////////////////////////////////////////////////////////////
require(['jquery'], function($) {
diff --git a/dev/index.html b/dev/index.html
index 370e991f..e500c4e8 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -1,462 +1,2 @@
-Home · TuringGLM.jl
-
-
-
-
-
-
-
The @formula macro is extended from StatsModels.jl along with MixedModels.jl for the random-effects (a.k.a. group-level predictors).
A formula is written with the @formula macro: the dependent variable, followed by a tilde ~, then the independent variables separated by plus signs +.
Example:
@formula(y ~ x1 + x2 + x3)
Moderations/interactions can be specified with the asterisk sign *, e.g. x1 * x2. This is expanded to x1 + x2 + x1:x2 because, following the principle of hierarchy, the main effects must be included along with the interaction effect. Here x1:x2 means that the values of x1 are multiplied (interacted) with the values of x2.
Random-effects (a.k.a. group-level effects) can be specified with (term | group) inside the @formula, where term is the independent variable and group is the categorical variable (i.e., either a column of Strings or a CategoricalArray in data). You can specify a random intercept with (1 | group).
Take a look at the tutorials for all supported likelihoods and models.
diff --git a/dev/objects.inv b/dev/objects.inv
index 085d7af0..4ba9e287 100644
Binary files a/dev/objects.inv and b/dev/objects.inv differ
diff --git a/dev/search_index.js b/dev/search_index.js
index 0a1919b1..90059c13 100644
--- a/dev/search_index.js
+++ b/dev/search_index.js
@@ -1,3 +1,3 @@
var documenterSearchIndex = {"docs":
-[{"location":"api/","page":"API reference","title":"API reference","text":"CurrentModule = TuringGLM","category":"page"},{"location":"api/#TuringGLM","page":"API reference","title":"TuringGLM","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"Documentation for TuringGLM.","category":"page"},{"location":"api/","page":"API reference","title":"API reference","text":"","category":"page"},{"location":"api/","page":"API reference","title":"API reference","text":"Modules = [TuringGLM]","category":"page"},{"location":"api/#TuringGLM.CustomPrior","page":"API reference","title":"TuringGLM.CustomPrior","text":"CustomPrior(predictors, intercept, auxiliary)\n\nstruct to hold information regarding user-specified custom priors.\n\nUsage\n\nThe CustomPrior struct has 3 fields:\n\npredictors: the β coefficients.\nintercept: the α intercept.\nauxiliary: an auxiliary parameter.\n\nIn robust models, e.g. Linear Regression with Student-t likelihood or Count Regression with Negative Binomial likelihood, often there is an extra auxiliary parameter that is needed to parametrize to model to overcome under- or over-dispersion. 
If you are specifying a custom prior for one of these type of models, then you should also specify a prior for the auxiliary parameter.\n\nNon-robust models do not need an auxiliary parameter and you can pass nothing as the auxiliary argument.\n\n\n\n\n\n","category":"type"},{"location":"api/#TuringGLM.NegativeBinomial2-Union{Tuple{T}, Tuple{T, T}} where T<:Real","page":"API reference","title":"TuringGLM.NegativeBinomial2","text":"NegativeBinomial2(μ, ϕ)\n\nAn alternative parameterization of the Negative Binomial distribution:\n\ntextNegative-Binomial(n mid mu phi) sim binomn + phi - 1n left( fracmumu + phi right)^n left( fracphimu + phi right)^phi\n\nwhere the expectation is μ and variance is (μ + μ²/ϕ).\n\nThe alternative parameterization is inspired by the Stan's neg_binomial_2 function.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.center_predictors-Tuple{AbstractMatrix}","page":"API reference","title":"TuringGLM.center_predictors","text":"center_predictors(X::AbstractMatrix)\n\nCenters the columns of a matrix X of predictors to mean 0.\n\nReturns a tuple with:\n\nμ_X: 1xK Matrix of Float64s of the means of the K columns in the original X\n\nmatrix.\n\nX_centered: A Matrix of Float64s with the same dimensions as the original matrix\n\nX with the columns centered on mean μ=0.\n\nArguments\n\nX::AbstractMatrix: a matrix of predictors where rows are observations and columns are\n\nvariables.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.convert_str_to_indices-Tuple{AbstractVector}","page":"API reference","title":"TuringGLM.convert_str_to_indices","text":"convert_str_to_indices(v::AbstractVector)\n\nConverts a vector v to a vector of indices, i.e. a vector where all the entries are integers. Returns a tuple with the first element as the converted vector and the second element a Dict specifying which string is which integer.\n\nThis function is especially useful for random-effects varying-intercept hierarchical models. 
Normally v would be a vector of group membership with values such as \"group_1\", \"group_2\" etc. For random-effect models with varying-intercepts, Turing needs the group membership values to be passed as Ints.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.data_fixed_effects-Union{Tuple{D}, Tuple{StatsModels.FormulaTerm, D}} where D","page":"API reference","title":"TuringGLM.data_fixed_effects","text":"data_fixed_effects(formula::FormulaTerm, data)\n\nConstructs the matrix X of fixed-effects (a.k.a. population-level) predictors.\n\nReturns a Matrix of the fixed-effects predictors variables in the formula and present inside data.\n\nArguments\n\nformula: a FormulaTerm created by @formula macro.\ndata: a data object that satisfies the\n\nTables.jl interface such as a DataFrame.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.data_random_effects-Union{Tuple{D}, Tuple{StatsModels.FormulaTerm, D}} where D","page":"API reference","title":"TuringGLM.data_random_effects","text":"data_random_effects(formula::FormulaTerm, data)\n\nConstructs the vector(s)/matrix(ces) Z(s) of random-effects (a.k.a. 
group-level) slope predictors.\n\nReturns a Dict{String, AbstractArray} of Vector/Matrix as values of the random-effects predictors slope variables (keys) in the formula and present inside data.\n\nArguments\n\nformula: a FormulaTerm created by @formula macro.\ndata: a data object that satisfies the\n\nTables.jl interface such as a DataFrame.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.data_response-Union{Tuple{D}, Tuple{StatsModels.FormulaTerm, D}} where D","page":"API reference","title":"TuringGLM.data_response","text":"data_response(formula::FormulaTerm, data)\n\nConstructs the response y vector.\n\nReturns a Vector of the response variable in the formula and present inside data.\n\nArguments\n\nformula: a FormulaTerm created by @formula macro.\ndata: a data object that satisfies the\n\nTables.jl interface such as a DataFrame.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.get_idx-Union{Tuple{D}, Tuple{StatsModels.Term, D}} where D","page":"API reference","title":"TuringGLM.get_idx","text":"get_idx(term::Term, data)\n\nReturns a tuple with the first element as the ID vector of Ints that represent group membership for a specific random-effect intercept group t of observations present in data. 
The second element of the tuple is a Dict specifying which string is which integer in the ID vector.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.get_var-Union{Tuple{D}, Tuple{StatsModels.Term, D}} where D","page":"API reference","title":"TuringGLM.get_var","text":"get_var(term::Term, data)\n\nReturns the corresponding vector of column in data for the a specific random-effect slope term of observations present in data.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.has_ranef-Tuple{StatsModels.FormulaTerm}","page":"API reference","title":"TuringGLM.has_ranef","text":"has_ranef(formula::FormulaTerm)\n\nReturns true if any of the terms in formula is a FunctionTerm or false otherwise.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.intercept_per_ranef-Tuple{Tuple}","page":"API reference","title":"TuringGLM.intercept_per_ranef","text":"intercept_per_ranef(terms::Tuple{RandomEffectsTerm})\n\nReturns a vector of Strings where the entries are the grouping variables that have a group-level intercept.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.n_ranef-Tuple{StatsModels.FormulaTerm}","page":"API reference","title":"TuringGLM.n_ranef","text":"n_ranef(formula::FormulaTerm)\n\nReturns the number of RandomEffectsTerms in formula.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.ranef-Tuple{StatsModels.FormulaTerm}","page":"API reference","title":"TuringGLM.ranef","text":"ranef(formula::FormulaTerm)\n\nReturns a tuple of the FunctionTerms parsed as RandomEffectsTerms in formula. 
If there are no FunctionTerms in formula returns nothing.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.slope_per_ranef-Tuple{Tuple}","page":"API reference","title":"TuringGLM.slope_per_ranef","text":"slope_per_ranef(terms::Tuple{RandomEffectsTerm})\n\nReturns a SlopePerRanEf object where the entries are the grouping variables that have a group-level slope.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.standardize_predictors-Tuple{AbstractMatrix}","page":"API reference","title":"TuringGLM.standardize_predictors","text":"standardize_predictors(X::AbstractMatrix)\n\nStandardizes the columns of a matrix X of predictors to mean 0 and standard deviation 1.\n\nReturns a tuple with:\n\nμ_X: 1xK Matrix of Float64s of the means of the K columns in the original X\n\nmatrix.\n\nσ_X: 1xK Matrix of Float64s of the standard deviations of the K columns in the\n\noriginal X matrix.\n\nX_std: A Matrix of Float64s with the same dimensions as the original matrix\n\nX with the columns centered on mean μ=0 and standard deviation σ=1.\n\nArguments\n\nX::AbstractMatrix: a matrix of predictors where rows are observations and columns are\n\nvariables.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.standardize_predictors-Tuple{AbstractVector}","page":"API reference","title":"TuringGLM.standardize_predictors","text":"standardize_predictors(x::AbstractVector)\n\nStandardizes the vector x to mean 0 and standard deviation 1.\n\nReturns a tuple with:\n\nμ_X: Float64s of the mean of the original vector x.\nσ_X: Float64s of the standard deviations of the original vector x.\nx_std: A Vector of Float64s with the same length as the original vector\n\nx with the values centered on mean μ=0 and standard deviation σ=1.\n\nArguments\n\nx::AbstractVector: a vector.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.tuple_length-Union{Tuple{NTuple{N, Any}}, Tuple{N}} where N","page":"API 
reference","title":"TuringGLM.tuple_length","text":"tuple_length(::NTuple{N, Any}) where {N} = Int(N)\n\nThis is a hack to get the length of any tuple.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.turing_model-Union{Tuple{T}, Tuple{StatsModels.FormulaTerm, Any}} where T<:(UnivariateDistribution)","page":"API reference","title":"TuringGLM.turing_model","text":"turing_model(formula, data; model=Normal, priors=DefaultPrior(), standardize=false)\n\nCreate a Turing model using formula syntax and a data source.\n\nformula\n\nformula is the the same friendly interface to specify used to specify statistical models by brms, rstarnarm, bambi, StatsModels.jl and MixedModels.jl. The syntax is done by using the @formula macro and then specifying the dependent variable followed by a tilde ~ then the independent variables separated by a plus sign +.\n\nExample: @formula(y ~ x1 + x2 + x3).\n\nModerations/interactions can be specified with the asterisk sign *, e.g. x1 * x2. This will be expanded to x1 + x2 + x1:x2, which, following the principle of hierarchy, the main effects must also be added along with the interaction effects. Here x1:x2 means that the values of x1 will be multiplied (interacted) with the values of x2.\n\nRandom-effects (a.k.a. group-level effects) can be specified with the (term | group) inside the @formula, where term is the independent variable and group is the categorical representation (i.e., either a column of Strings or a CategoricalArray in data). You can specify a random-intercept with (1 | group).\n\nExample: @formula(y ~ (1 | group) + x1).\n\nNotice: random-effects are currently only implemented for a single group-level intercept. Future versions of TuringGLM.jl will support slope random-effects and multiple group-level effets.\n\ndata\n\ndata can be any Tables.jl-compatible data interface. 
The most popular ones are DataFrames and NamedTuples.\n\nmodel\n\nmodel represents the likelihood function which you want to condition your data on. It has to be a subtype of Distributions.UnivariateDistribution. Currently, TuringGLM.jl supports:\n\nNormal (the default if not specified): linear regression\nTDist: robust linear regression\nBernoulli: logistic regression\nPoisson: Poisson count data regression\nNegativeBinomial: negative binomial robust count data regression\n\npriors\n\nTuringGLM.jl comes with state-of-the-art default priors, based on the literature and the Stan community. By default, turing_model will use DefaultPrior. But you can specify your own with priors=CustomPrior(predictors, intercept, auxiliary). All models take a predictors and intercept priors.\n\nIn robust models, e.g. Linear Regression with Student-t likelihood or Count Regression with Negative Binomial likelihood, often there is an extra auxiliary parameter that is needed to parametrize to model to overcome under- or over-dispersion. If you are specifying a custom prior for one of these type of models, then you should also specify a prior for the auxiliary parameter.\n\nNon-robust models do not need an auxiliary parameter and you can pass nothing as the auxiliary argument.\n\nExample for a non-robust model: @formula(y, ...), data; priors=CustomPrior(Normal(0, 2.5), Normal(10, 5), nothing)\n\nExample for a robust model: @formula(y, ...), data; priors=CustomPrior(Normal(0, 2.5), Normal(10, 5), Exponential(1))\n\nstandardize\n\nWhether true or false to standardize your data to mean 0 and standard deviation 1 before inference. Some science fields prefer to analyze and report effects in terms of standard devations. Also, whenever measurement scales differs, it is often suggested to standardize the effects for better comparison. 
By default, turing_model sets standardize=false.\n\n\n\n\n\n","category":"method"},{"location":"tutorials/robust_regression/","page":"Robust Regression","title":"Robust Regression","text":"\n\n\n\n\n\n\n\n
For the Robust Regression with Student-\\(t\\) distribution as the likelihood, we'll use a famous dataset called kidiq (Gelman & Hill, 2007), which is data from a survey of adult American women and their respective children. Dated from 2007, it has 434 observations and 4 variables:
kid_score: child's IQ
mom_hs: binary/dummy (0 or 1) if the child's mother has a high school diploma
We instantiate our model with turing_model passing a keyword argument model=TDist to indicate that the model is a robust regression with the Student's t-distribution:
For our example on Negative Binomial Regression, let's use a famous dataset called roaches (Gelman & Hill, 2007), which is data on the efficacy of a pest management system at reducing the number of roaches in urban apartments. It has 262 observations and the following variables:
y – number of roaches caught.
roach1 – pretreatment number of roaches.
treatment – binary/dummy (0 or 1) for treatment indicator.
senior – binary/dummy (0 or 1) for only elderly residents in building.
exposure2 – number of days for which the roach traps were used.
We instantiate our model with turing_model passing a keyword argument model=NegativeBinomial to indicate that the model is a negative binomial regression:
\n\n
model = turing_model(fm, roaches; model=NegativeBinomial);
Let's cover the Linear Regression example with the kidiq dataset (Gelman & Hill, 2007), which is data from a survey of adult American women and their respective children. Dated from 2007, it has 434 observations and 4 variables:
kid_score: child's IQ
mom_hs: binary/dummy (0 or 1) if the child's mother has a high school diploma
We instantiate our model with turing_model without specifying any model, so the default model (model=Normal) is used. Notice that we are specifying the priors keyword argument:
Currently, TuringGLM only supports hierarchical models with a single random-intercept. This is done by including a (1 | group) term inside the @formula macro.
For our Hierarchical Model example, let's use a famous dataset called cheese (Boatwright, McCulloch & Rossi, 1999), which is data from cheese ratings. A group of 10 rural and 10 urban raters rated 4 different types of cheese (A, B, C and D) in two samples. So we have \\(4 \\cdot 20 \\cdot 2 = 160\\) observations and 4 variables:
Boatwright, P., McCulloch, R., & Rossi, P. (1999). Account-level modeling for trade promotion: An application of a constrained parameter hierarchical model. Journal of the American Statistical Association, 94(448), 1063–1073.
Let's cover Linear Regression with a famous dataset called kidiq (Gelman & Hill, 2007), which is data from a survey of adult American women and their respective children. Dated from 2007, it has 434 observations and 4 variables:
kid_score: child's IQ
mom_hs: binary/dummy (0 or 1) if the child's mother has a high school diploma
mom_iq: mother's IQ
mom_age: mother's age
For the purposes of this tutorial, we download the dataset from the TuringGLM repository:
Next, we instantiate our model with turing_model without specifying any model, so the default model (model=Normal) is used:
\n\n
model = turing_model(fm, kidiq);
\n\n\n
n_samples = 2_000;
\n\n\n\n
This model is a valid Turing model, which we can pass to the default sample function from Turing to get our parameter estimates. We use the NUTS sampler with 2000 samples.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
\n\n","category":"page"},{"location":"tutorials/linear_regression/","page":"Linear Regression","title":"Linear Regression","text":"EditURL = \"https://github.com/TuringLang/TuringGLM.jl/blob/main/docs/src/tutorials/linear_regression.jl\"","category":"page"},{"location":"#TuringGLM","page":"Home","title":"TuringGLM","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Documentation for TuringGLM. Please file an issue if you run into any problems.","category":"page"},{"location":"","page":"Home","title":"Home","text":"TuringGLM supports Julia version 1.7+. We recommend always using it with the latest stable Julia release.","category":"page"},{"location":"#Getting-Started","page":"Home","title":"Getting Started","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"TuringGLM makes it easy to specify Bayesian Generalized Linear Models using the formula syntax and returns an instantiated Turing model.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Heavily inspired by brms (uses RStan or CmdStanR) and bambi (uses PyMC3).","category":"page"},{"location":"#@formula","page":"Home","title":"@formula","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"The @formula macro is extended from StatsModels.jl along with MixedModels.jl for the random-effects (a.k.a. group-level predictors).","category":"page"},{"location":"","page":"Home","title":"Home","text":"A formula is written with the @formula macro: the dependent variable, followed by a tilde ~, then the independent variables separated by a plus sign +.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Example:","category":"page"},{"location":"","page":"Home","title":"Home","text":"@formula(y ~ x1 + x2 + x3)","category":"page"},{"location":"","page":"Home","title":"Home","text":"Moderations/interactions can be specified with the asterisk sign *, e.g. x1 * x2. 
This will be expanded to x1 + x2 + x1:x2 because, following the principle of hierarchy, the main effects must also be added along with the interaction effects. Here x1:x2 means that the values of x1 will be multiplied (interacted) with the values of x2.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Random-effects (a.k.a. group-level effects) can be specified with (term | group) inside the @formula, where term is the independent variable and group is the categorical representation (i.e., either a column of Strings or a CategoricalArray in data). You can specify a random-intercept with (1 | group).","category":"page"},{"location":"","page":"Home","title":"Home","text":"Example:","category":"page"},{"location":"","page":"Home","title":"Home","text":"@formula(y ~ (1 | group) + x1)","category":"page"},{"location":"#Data","page":"Home","title":"Data","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"TuringGLM supports any Tables.jl-compatible data interface. The most popular ones are DataFrames and NamedTuples.","category":"page"},{"location":"#Supported-Models","page":"Home","title":"Supported Models","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"TuringGLM supports non-hierarchical and hierarchical models. 
For hierarchical models, only single random-intercept hierarchical models are supported.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Currently, TuringGLM.jl supports the following likelihoods:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Normal (the default if not specified): linear regression\nTDist: robust linear regression\nBernoulli: logistic regression\nPoisson: Poisson count data regression\nNegativeBinomial: negative binomial robust count data regression","category":"page"},{"location":"#Tutorials","page":"Home","title":"Tutorials","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Take a look at the tutorials for all supported likelihoods and models.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Pages = [\n \"tutorials/linear_regression.md\",\n \"tutorials/logistic_regression.md\",\n \"tutorials/poisson_regression.md\",\n \"tutorials/negativebinomial_regression.md\",\n \"tutorials/robust_regression.md\",\n \"tutorials/hierarchical_models.md\",\n \"tutorials/custom_priors.md\"\n]\nDepth = 1","category":"page"},{"location":"tutorials/logistic_regression/","page":"Logistic Regression","title":"Logistic Regression","text":"\n\n\n\n\n\n\n\n
For our tutorial on Logistic Regression, let's use a famous dataset called wells (Gelman & Hill, 2007), which is data from a survey of 3,200 residents in a small area of Bangladesh suffering from arsenic contamination of groundwater. Respondents with elevated arsenic levels in their wells had been encouraged to switch their water source to a safe public or private well in the nearby area and the survey was conducted several years later to learn which of the affected residents had switched wells. It has 3,200 observations and the following variables:
switch – binary/dummy (0 or 1) for well-switching.
arsenic – arsenic level in respondent's well.
dist – distance (meters) from the respondent's house to the nearest well with safe drinking water.
association – binary/dummy (0 or 1) if member(s) of household participate in community organizations.
For our example on Poisson Regression, let's use a famous dataset called roaches (Gelman & Hill, 2007), which is data on the efficacy of a pest management system at reducing the number of roaches in urban apartments. It has 262 observations and the following variables:
y – number of roaches caught.
roach1 – pretreatment number of roaches.
treatment – binary/dummy (0 or 1) for treatment indicator.
senior – binary/dummy (0 or 1) for only elderly residents in building.
exposure2 – number of days for which the roach traps were used.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
\n\n","category":"page"},{"location":"tutorials/poisson_regression/","page":"Poisson Regression","title":"Poisson Regression","text":"EditURL = \"https://github.com/TuringLang/TuringGLM.jl/blob/main/docs/src/tutorials/poisson_regression.jl\"","category":"page"}]
+[{"location":"api/","page":"API reference","title":"API reference","text":"CurrentModule = TuringGLM","category":"page"},{"location":"api/#TuringGLM","page":"API reference","title":"TuringGLM","text":"","category":"section"},{"location":"api/","page":"API reference","title":"API reference","text":"Documentation for TuringGLM.","category":"page"},{"location":"api/","page":"API reference","title":"API reference","text":"","category":"page"},{"location":"api/","page":"API reference","title":"API reference","text":"Modules = [TuringGLM]","category":"page"},{"location":"api/#TuringGLM.CustomPrior","page":"API reference","title":"TuringGLM.CustomPrior","text":"CustomPrior(predictors, intercept, auxiliary)\n\nstruct to hold information regarding user-specified custom priors.\n\nUsage\n\nThe CustomPrior struct has 3 fields:\n\npredictors: the β coefficients.\nintercept: the α intercept.\nauxiliary: an auxiliary parameter.\n\nIn robust models, e.g. Linear Regression with Student-t likelihood or Count Regression with Negative Binomial likelihood, there is often an extra auxiliary parameter that is needed to parametrize the model to overcome under- or over-dispersion. 
If you are specifying a custom prior for one of these types of models, then you should also specify a prior for the auxiliary parameter.\n\nNon-robust models do not need an auxiliary parameter and you can pass nothing as the auxiliary argument.\n\n\n\n\n\n","category":"type"},{"location":"api/#TuringGLM.NegativeBinomial2-Union{Tuple{T}, Tuple{T, T}} where T<:Real","page":"API reference","title":"TuringGLM.NegativeBinomial2","text":"NegativeBinomial2(μ, ϕ)\n\nAn alternative parameterization of the Negative Binomial distribution:\n\n\\(\\text{Negative-Binomial}(n \\mid \\mu, \\phi) \\sim \\binom{n + \\phi - 1}{n} \\left( \\frac{\\mu}{\\mu + \\phi} \\right)^{n} \\left( \\frac{\\phi}{\\mu + \\phi} \\right)^{\\phi}\\)\n\nwhere the expectation is μ and variance is (μ + μ²/ϕ).\n\nThe alternative parameterization is inspired by Stan's neg_binomial_2 function.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.center_predictors-Tuple{AbstractMatrix}","page":"API reference","title":"TuringGLM.center_predictors","text":"center_predictors(X::AbstractMatrix)\n\nCenters the columns of a matrix X of predictors to mean 0.\n\nReturns a tuple with:\n\nμ_X: 1xK Matrix of Float64s of the means of the K columns in the original X\n\nmatrix.\n\nX_centered: A Matrix of Float64s with the same dimensions as the original matrix\n\nX with the columns centered on mean μ=0.\n\nArguments\n\nX::AbstractMatrix: a matrix of predictors where rows are observations and columns are\n\nvariables.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.convert_str_to_indices-Tuple{AbstractVector}","page":"API reference","title":"TuringGLM.convert_str_to_indices","text":"convert_str_to_indices(v::AbstractVector)\n\nConverts a vector v to a vector of indices, i.e. a vector where all the entries are integers. Returns a tuple with the first element as the converted vector and the second element a Dict specifying which string is which integer.\n\nThis function is especially useful for random-effects varying-intercept hierarchical models. 
Normally v would be a vector of group membership with values such as \"group_1\", \"group_2\" etc. For random-effect models with varying-intercepts, Turing needs the group membership values to be passed as Ints.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.data_fixed_effects-Union{Tuple{D}, Tuple{StatsModels.FormulaTerm, D}} where D","page":"API reference","title":"TuringGLM.data_fixed_effects","text":"data_fixed_effects(formula::FormulaTerm, data)\n\nConstructs the matrix X of fixed-effects (a.k.a. population-level) predictors.\n\nReturns a Matrix of the fixed-effects predictors variables in the formula and present inside data.\n\nArguments\n\nformula: a FormulaTerm created by @formula macro.\ndata: a data object that satisfies the\n\nTables.jl interface such as a DataFrame.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.data_random_effects-Union{Tuple{D}, Tuple{StatsModels.FormulaTerm, D}} where D","page":"API reference","title":"TuringGLM.data_random_effects","text":"data_random_effects(formula::FormulaTerm, data)\n\nConstructs the vector(s)/matrix(ces) Z(s) of random-effects (a.k.a. 
group-level) slope predictors.\n\nReturns a Dict{String, AbstractArray} of Vector/Matrix as values of the random-effects predictors slope variables (keys) in the formula and present inside data.\n\nArguments\n\nformula: a FormulaTerm created by @formula macro.\ndata: a data object that satisfies the\n\nTables.jl interface such as a DataFrame.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.data_response-Union{Tuple{D}, Tuple{StatsModels.FormulaTerm, D}} where D","page":"API reference","title":"TuringGLM.data_response","text":"data_response(formula::FormulaTerm, data)\n\nConstructs the response y vector.\n\nReturns a Vector of the response variable in the formula and present inside data.\n\nArguments\n\nformula: a FormulaTerm created by @formula macro.\ndata: a data object that satisfies the\n\nTables.jl interface such as a DataFrame.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.get_idx-Union{Tuple{D}, Tuple{StatsModels.Term, D}} where D","page":"API reference","title":"TuringGLM.get_idx","text":"get_idx(term::Term, data)\n\nReturns a tuple with the first element as the ID vector of Ints that represent group membership for a specific random-effect intercept group t of observations present in data. 
The second element of the tuple is a Dict specifying which string is which integer in the ID vector.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.get_var-Union{Tuple{D}, Tuple{StatsModels.Term, D}} where D","page":"API reference","title":"TuringGLM.get_var","text":"get_var(term::Term, data)\n\nReturns the corresponding column vector in data for a specific random-effect slope term of observations present in data.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.has_ranef-Tuple{StatsModels.FormulaTerm}","page":"API reference","title":"TuringGLM.has_ranef","text":"has_ranef(formula::FormulaTerm)\n\nReturns true if any of the terms in formula is a FunctionTerm or false otherwise.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.intercept_per_ranef-Tuple{Tuple}","page":"API reference","title":"TuringGLM.intercept_per_ranef","text":"intercept_per_ranef(terms::Tuple{RandomEffectsTerm})\n\nReturns a vector of Strings where the entries are the grouping variables that have a group-level intercept.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.n_ranef-Tuple{StatsModels.FormulaTerm}","page":"API reference","title":"TuringGLM.n_ranef","text":"n_ranef(formula::FormulaTerm)\n\nReturns the number of RandomEffectsTerms in formula.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.ranef-Tuple{StatsModels.FormulaTerm}","page":"API reference","title":"TuringGLM.ranef","text":"ranef(formula::FormulaTerm)\n\nReturns a tuple of the FunctionTerms parsed as RandomEffectsTerms in formula. 
If there are no FunctionTerms in formula returns nothing.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.slope_per_ranef-Tuple{Tuple}","page":"API reference","title":"TuringGLM.slope_per_ranef","text":"slope_per_ranef(terms::Tuple{RandomEffectsTerm})\n\nReturns a SlopePerRanEf object where the entries are the grouping variables that have a group-level slope.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.standardize_predictors-Tuple{AbstractMatrix}","page":"API reference","title":"TuringGLM.standardize_predictors","text":"standardize_predictors(X::AbstractMatrix)\n\nStandardizes the columns of a matrix X of predictors to mean 0 and standard deviation 1.\n\nReturns a tuple with:\n\nμ_X: 1xK Matrix of Float64s of the means of the K columns in the original X\n\nmatrix.\n\nσ_X: 1xK Matrix of Float64s of the standard deviations of the K columns in the\n\noriginal X matrix.\n\nX_std: A Matrix of Float64s with the same dimensions as the original matrix\n\nX with the columns centered on mean μ=0 and standard deviation σ=1.\n\nArguments\n\nX::AbstractMatrix: a matrix of predictors where rows are observations and columns are\n\nvariables.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.standardize_predictors-Tuple{AbstractVector}","page":"API reference","title":"TuringGLM.standardize_predictors","text":"standardize_predictors(x::AbstractVector)\n\nStandardizes the vector x to mean 0 and standard deviation 1.\n\nReturns a tuple with:\n\nμ_X: Float64s of the mean of the original vector x.\nσ_X: Float64s of the standard deviations of the original vector x.\nx_std: A Vector of Float64s with the same length as the original vector\n\nx with the values centered on mean μ=0 and standard deviation σ=1.\n\nArguments\n\nx::AbstractVector: a vector.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.tuple_length-Union{Tuple{NTuple{N, Any}}, Tuple{N}} where N","page":"API 
reference","title":"TuringGLM.tuple_length","text":"tuple_length(::NTuple{N, Any}) where {N} = Int(N)\n\nThis is a hack to get the length of any tuple.\n\n\n\n\n\n","category":"method"},{"location":"api/#TuringGLM.turing_model-Union{Tuple{T}, Tuple{StatsModels.FormulaTerm, Any}} where T<:(UnivariateDistribution)","page":"API reference","title":"TuringGLM.turing_model","text":"turing_model(formula, data; model=Normal, priors=DefaultPrior(), standardize=false)\n\nCreate a Turing model using formula syntax and a data source.\n\nformula\n\nformula is the same friendly interface used to specify statistical models by brms, rstanarm, bambi, StatsModels.jl and MixedModels.jl. A formula is written with the @formula macro: the dependent variable, followed by a tilde ~, then the independent variables separated by a plus sign +.\n\nExample: @formula(y ~ x1 + x2 + x3).\n\nModerations/interactions can be specified with the asterisk sign *, e.g. x1 * x2. This will be expanded to x1 + x2 + x1:x2 because, following the principle of hierarchy, the main effects must also be added along with the interaction effects. Here x1:x2 means that the values of x1 will be multiplied (interacted) with the values of x2.\n\nRandom-effects (a.k.a. group-level effects) can be specified with (term | group) inside the @formula, where term is the independent variable and group is the categorical representation (i.e., either a column of Strings or a CategoricalArray in data). You can specify a random-intercept with (1 | group).\n\nExample: @formula(y ~ (1 | group) + x1).\n\nNotice: random-effects are currently only implemented for a single group-level intercept. Future versions of TuringGLM.jl will support slope random-effects and multiple group-level effects.\n\ndata\n\ndata can be any Tables.jl-compatible data interface. 
The most popular ones are DataFrames and NamedTuples.\n\nmodel\n\nmodel represents the likelihood function which you want to condition your data on. It has to be a subtype of Distributions.UnivariateDistribution. Currently, TuringGLM.jl supports:\n\nNormal (the default if not specified): linear regression\nTDist: robust linear regression\nBernoulli: logistic regression\nPoisson: Poisson count data regression\nNegativeBinomial: negative binomial robust count data regression\n\npriors\n\nTuringGLM.jl comes with state-of-the-art default priors, based on the literature and the Stan community. By default, turing_model will use DefaultPrior. But you can specify your own with priors=CustomPrior(predictors, intercept, auxiliary). All models take predictors and intercept priors.\n\nIn robust models, e.g. Linear Regression with Student-t likelihood or Count Regression with Negative Binomial likelihood, there is often an extra auxiliary parameter that is needed to parametrize the model to overcome under- or over-dispersion. If you are specifying a custom prior for one of these types of models, then you should also specify a prior for the auxiliary parameter.\n\nNon-robust models do not need an auxiliary parameter and you can pass nothing as the auxiliary argument.\n\nExample for a non-robust model: @formula(y, ...), data; priors=CustomPrior(Normal(0, 2.5), Normal(10, 5), nothing)\n\nExample for a robust model: @formula(y, ...), data; priors=CustomPrior(Normal(0, 2.5), Normal(10, 5), Exponential(1))\n\nstandardize\n\nWhether to standardize your data (true or false) to mean 0 and standard deviation 1 before inference. Some science fields prefer to analyze and report effects in terms of standard deviations. Also, whenever measurement scales differ, it is often suggested to standardize the effects for better comparison. 
By default, turing_model sets standardize=false.\n\n\n\n\n\n","category":"method"},{"location":"tutorials/robust_regression/","page":"Robust Regression","title":"Robust Regression","text":"\n\n\n\n\n\n\n\n
For the Robust Regression with Student-\\(t\\) distribution as the likelihood, we'll use a famous dataset called kidiq (Gelman & Hill, 2007), which is data from a survey of adult American women and their respective children. Dated from 2007, it has 434 observations and 4 variables:
kid_score: child's IQ
mom_hs: binary/dummy (0 or 1) if the child's mother has a high school diploma
We instantiate our model with turing_model passing a keyword argument model=TDist to indicate that the model is a robust regression with the Student's t-distribution:
For our example on Negative Binomial Regression, let's use a famous dataset called roaches (Gelman & Hill, 2007), which is data on the efficacy of a pest management system at reducing the number of roaches in urban apartments. It has 262 observations and the following variables:
y – number of roaches caught.
roach1 – pretreatment number of roaches.
treatment – binary/dummy (0 or 1) for treatment indicator.
senior – binary/dummy (0 or 1) for only elderly residents in building.
exposure2 – number of days for which the roach traps were used.
We instantiate our model with turing_model passing a keyword argument model=NegativeBinomial to indicate that the model is a negative binomial regression:
\n\n
model = turing_model(fm, roaches; model=NegativeBinomial);
Let's cover the Linear Regression example with the kidiq dataset (Gelman & Hill, 2007), which is data from a survey of adult American women and their respective children. Dated from 2007, it has 434 observations and 4 variables:
kid_score: child's IQ
mom_hs: binary/dummy (0 or 1) if the child's mother has a high school diploma
We instantiate our model with turing_model without specifying any model, so the default model (model=Normal) is used. Notice that we are specifying the priors keyword argument:
Currently, TuringGLM only supports hierarchical models with a single random-intercept. This is done by including a (1 | group) term inside the @formula macro.
For our Hierarchical Model example, let's use a famous dataset called cheese (Boatwright, McCulloch & Rossi, 1999), which is data from cheese ratings. A group of 10 rural and 10 urban raters rated 4 different types of cheese (A, B, C and D) in two samples. So we have \\(4 \\cdot 20 \\cdot 2 = 160\\) observations and 4 variables:
Boatwright, P., McCulloch, R., & Rossi, P. (1999). Account-level modeling for trade promotion: An application of a constrained parameter hierarchical model. Journal of the American Statistical Association, 94(448), 1063–1073.
Let's cover Linear Regression with a famous dataset called kidiq (Gelman & Hill, 2007), which is data from a survey of adult American women and their respective children. Dated from 2007, it has 434 observations and 4 variables:
kid_score: child's IQ
mom_hs: binary/dummy (0 or 1) if the child's mother has a high school diploma
mom_iq: mother's IQ
mom_age: mother's age
For the purposes of this tutorial, we download the dataset from the TuringGLM repository:
Next, we instantiate our model with turing_model without specifying any model, so the default model (model=Normal) is used:
\n\n
model = turing_model(fm, kidiq);
\n\n\n
n_samples = 2_000;
\n\n\n\n
This model is a valid Turing model, which we can pass to the default sample function from Turing to get our parameter estimates. We use the NUTS sampler with 2000 samples.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
\n\n","category":"page"},{"location":"tutorials/linear_regression/","page":"Linear Regression","title":"Linear Regression","text":"EditURL = \"https://github.com/TuringLang/TuringGLM.jl/blob/main/docs/src/tutorials/linear_regression.jl\"","category":"page"},{"location":"#TuringGLM","page":"Home","title":"TuringGLM","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Documentation for TuringGLM. Please file an issue if you run into any problems.","category":"page"},{"location":"","page":"Home","title":"Home","text":"TuringGLM supports Julia version 1.7+. We recommend always using it with the latest stable Julia release.","category":"page"},{"location":"#Getting-Started","page":"Home","title":"Getting Started","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"TuringGLM makes it easy to specify Bayesian Generalized Linear Models using the formula syntax and returns an instantiated Turing model.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Heavily inspired by brms (uses RStan or CmdStanR) and bambi (uses PyMC3).","category":"page"},{"location":"#@formula","page":"Home","title":"@formula","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"The @formula macro is extended from StatsModels.jl along with MixedModels.jl for the random-effects (a.k.a. group-level predictors).","category":"page"},{"location":"","page":"Home","title":"Home","text":"A formula is written with the @formula macro: the dependent variable, followed by a tilde ~, then the independent variables separated by a plus sign +.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Example:","category":"page"},{"location":"","page":"Home","title":"Home","text":"@formula(y ~ x1 + x2 + x3)","category":"page"},{"location":"","page":"Home","title":"Home","text":"Moderations/interactions can be specified with the asterisk sign *, e.g. x1 * x2. 
This will be expanded to x1 + x2 + x1:x2 because, following the principle of hierarchy, the main effects must also be added along with the interaction effects. Here x1:x2 means that the values of x1 will be multiplied (interacted) with the values of x2.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Random-effects (a.k.a. group-level effects) can be specified with (term | group) inside the @formula, where term is the independent variable and group is the categorical representation (i.e., either a column of Strings or a CategoricalArray in data). You can specify a random-intercept with (1 | group).","category":"page"},{"location":"","page":"Home","title":"Home","text":"Example:","category":"page"},{"location":"","page":"Home","title":"Home","text":"@formula(y ~ (1 | group) + x1)","category":"page"},{"location":"#Data","page":"Home","title":"Data","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"TuringGLM supports any Tables.jl-compatible data interface. The most popular ones are DataFrames and NamedTuples.","category":"page"},{"location":"#Supported-Models","page":"Home","title":"Supported Models","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"TuringGLM supports non-hierarchical and hierarchical models. 
For hierarchical models, only single random-intercept hierarchical models are supported.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Currently, TuringGLM.jl supports the following likelihoods:","category":"page"},{"location":"","page":"Home","title":"Home","text":"Normal (the default if not specified): linear regression\nTDist: robust linear regression\nBernoulli: logistic regression\nPoisson: Poisson count data regression\nNegativeBinomial: negative binomial robust count data regression","category":"page"},{"location":"#Tutorials","page":"Home","title":"Tutorials","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Take a look at the tutorials for all supported likelihoods and models.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Pages = [\n \"tutorials/linear_regression.md\",\n \"tutorials/logistic_regression.md\",\n \"tutorials/poisson_regression.md\",\n \"tutorials/negativebinomial_regression.md\",\n \"tutorials/robust_regression.md\",\n \"tutorials/hierarchical_models.md\",\n \"tutorials/custom_priors.md\"\n]\nDepth = 1","category":"page"},{"location":"tutorials/logistic_regression/","page":"Logistic Regression","title":"Logistic Regression","text":"\n\n\n\n\n\n\n\n
For our tutorial on Logistic Regression, let's use a famous dataset called wells (Gelman & Hill, 2007), which is data from a survey of 3,200 residents in a small area of Bangladesh suffering from arsenic contamination of groundwater. Respondents with elevated arsenic levels in their wells had been encouraged to switch their water source to a safe public or private well in the nearby area and the survey was conducted several years later to learn which of the affected residents had switched wells. It has 3,200 observations and the following variables:
switch – binary/dummy (0 or 1) for well-switching.
arsenic – arsenic level in respondent's well.
dist – distance (meters) from the respondent's house to the nearest well with safe drinking water.
association – binary/dummy (0 or 1) if member(s) of household participate in community organizations.
For our example on Poisson Regression, let's use a famous dataset called roaches (Gelman & Hill, 2007), which is data on the efficacy of a pest management system at reducing the number of roaches in urban apartments. It has 262 observations and the following variables:
y – number of roaches caught.
roach1 – pretreatment number of roaches.
treatment – binary/dummy (0 or 1) for treatment indicator.
senior – binary/dummy (0 or 1) for only elderly residents in building.
exposure2 – number of days for which the roach traps were used.