
Commit aa1fcf6

Authored by Maximilian-Stefan-Ernst, alyst (Alexey Stukalov), and aaronpeikert
Release/v0.3.0 (#211)
* CommutationMatrix type: replace comm_matrix helper functions with a CommutationMatrix and overloaded linalg ops
* simplify elimination_matrix()
* simplify duplication_matrix()
* add tests for commutation/duplication/elimination matrices
* small unit test fixes
* commutation_matrix
* vec method
* more comm_matrix tests
* SemSpecification base type
* SemSpecification: use in methods
* rename identifier -> param
* identifier() -> param_indices() (Dict{Symbol, Int})
* get_identifier_indices() -> param_to_indices() (Vector{Int})
* parameters -> params (Vector{Symbol})
* ParTable: columns[:identifier] => columns[:param]
* getindex(EnsParTable, i) instead of get_group()
* replace no-op ctors with convert(T, obj): convert() is the proper method to call to avoid unnecessary construction; ctor semantics require that a new object is constructed
* ParamTable: convert vars from Dict to fields, make the type immutable
* ParamTable: update StenoGraphs-based ctor
* use graph as a main parameter
* simplify rows processing
* don't reallocate table.columns
Co-authored-by: Maximilian-Stefan-Ernst <[email protected]>
* rename Base.sort() to sort_vars(), because the ParTable contains rows and columns and it is not clear what sort() actually sorts
Co-authored-by: Maximilian-Stefan-Ernst <[email protected]>
* don't import ==
* don't import push!()
* don't import DataFrame
* remove no-op push!()
* ParTable ctor: simplify rows code
* use named tuples
* reduce code duplication
* use colnames vector instead of position_names Dict
* ParTable: full support for Iterator iface
* RAMConstant: simplify
* declare RAMConstant field types
* refactor constants collection to avoid code duplication
* RAMMatrices: optimize F_indices init
* RAMMatrices: declare types for all fields
* RAMMatrices: option to keep zero constants
* nonunique() helper function
* add check_vars() and check_params()
* RAMMatrices ctor: dims and vars checks
* RAMMatrices: cleanup params index
* simplify parameters() function to return just a vector of params
* RAMMatrices ctor: use check_params()
* include RAMMatrices before EnsParTable
* fix EnsParTable to Dict{RAMMatrices} convert
* this method is not a RAMMatrices ctor, it is a Dict{K, RAMMatrices} convert
* use comprehension to construct dict
* DataFrame(EnsParTable)
* params() API method
* remove n_par.jl
* remove identifier.jl
* EnsParTable ctor: enforce same params in tables
* fix EnsParTable container to Dict{Symbol, ParTable}
* don't use keywords for main params, as it complicates dispatch
Co-authored-by: Maximilian-Stefan-Ernst <[email protected]>
* formatting fixes
* ParTable ctor: allow providing columns data
* update_partable!() cleanup + docstring
* update_partable!(): SemFit methods use the basic one
* ParTable: add explicit params field
* n_par() -> nparams(), for clarity and to align with Julia naming conventions
* param_values(ParTable)
Co-authored-by: Maximilian-Stefan-Ernst <[email protected]>
* lavaan_param_values(lav_fit, partable)
* compare_estimates() -> test_estimates()
* do tests inside
* use param_values()/lavaan_param_values()
* update_partable!(): dict-based generic version
Co-authored-by: Maximilian-Stefan-Ernst <[email protected]>
* ParTable: getindex() returns NamedTuple, so the downstream code doesn't rely on the order of tuple elements
* ParTable: graph-based ctor supports params= kw
* rename parameter_type to relation for clarity
* sem_summary(): cleanup filters
* fix sem_summary method for partable
* show(ParTable): suppress NaNs
* sort_vars!(ParTable): cleanup
* Project.toml: disable SymbolicUtils 1.6, which causes problems with sparsehessian(); this is a temporary fix until the compatibility issues are resolved in Symbolics.jl
* Project.toml: support StenoGraphs 0.3
* RAM ctor: better error for missing meanstruct
* add function param_indices
* start fixing docs
* fix regularization docs
* introduce formatting error
* update_start(): fix docstring typo
Co-authored-by: Maximilian-Stefan-Ernst <[email protected]>
* push!(::ParTable, Tuple): check keys compat
Co-authored-by: Maximilian-Stefan-Ernst <[email protected]>
* SemObsCov ctor: restrict n_obs to integer; don't allow missing n_obs
* fixup param_indices()
* common.jl: common vars API methods
* SemSpecification: vars API
* RAMMatrices: vars API
* ParamTable: vars API
* SemImply: vars and params API
* RAM imply: use vars API
* RAMSymbolic: use vars API
* start_simple(): use vars API
* starts_fabin3: use vars API
* remove get_colnames(), replaced by observed_vars()
* remove get_n_nodes(), replaced by nvars()
* get_data() -> samples(), and add default implementation samples(::SemObserved)
* SemObsData: remove rowwise; it is unused, and if rowwise access were ever required, it could be done with eachrow(data) without allocation
* AbstractSemSingle: vars API
* rename n_obs() -> nsamples()
* rename n_man() -> nobserved_vars(); for missing data pattern: nobserved_vars() -> nmeasured_vars(), obs_cov/obs_mean -> measured_cov/measured_mean
* move Sem methods out of types.jl
* rows(::SemObservedMissing) -> pattern_rows()
* fix formatting
* samples(SemObsCov) throws an exception
* SemObserved tests: refactor and add var API tests
* ParTable(graph): group is only valid for ensemble
* ParTable(graph): fix NaN modification detection
* export vars, params and observed APIs
* refactor SemSpec tests
* add Sem unit tests
* don't allow fixed and labeled parameters
* add test for labeled and fixed parameters
* remove get_observed(): it does not seem to be used anywhere; also, the method signature does not match Julia conventions
* fix ridge eval
* MeanStructure, HessianEvaluation traits: replace has_meanstructure and approximate_hessian fields with trait-like typeparams
* remove methods for has_meanstructure-based dispatch
* obj/grad/hess: refactor evaluation API; the intent of this commit is to refactor the API for objective, gradient and hessian evaluation, so that the evaluation code does not have to be duplicated across functions that calculate different combinations of those quantities
* introduce EvaluationTargets class that handles selection of what to evaluate
* add evaluate!(EvalTargets, ...) methods for loss and imply objs that evaluate only what is required
* objective!(), obj_grad!() etc. calls are just wrappers of evaluate!() with the proper targets
* se_hessian(): rename hessian -> method for clarity
* se_hessian!(): optimize calc; explicitly use Cholesky factorization
* H_scaling(): cleanup, remove unnecessary arguments
* SemOptOptim: remove redundant sem_fit() by dispatching over optimizer
* SemOptNLopt: remove redundant sem_fit() by dispatching over optimizer
* SemOptOptim: use evaluate!() directly, no wrapper required
* SemOptNLopt: use evaluate!() directly
* SemWLS: dim checks
* fixup formatting
* WLS: use 5-arg mul!() to reduce allocations
* ML: use 5-arg mul!() to reduce allocations
* FIML: use 5-arg mul!() to avoid extra allocation
* fix the error message
Co-authored-by: Maximilian Ernst <[email protected]>
* HessianEvaluation -> HessianEval
* MeanStructure -> MeanStruct
* SemImply: replace common type params with fields
* close #216
* close #205
* update EnsembleParameterTable docs and add methods for par table equality
* close #213
* close #157
* add method for
* format
* increase test sample size
* Project.toml: update Symbolics deps
* tests/examples: import -> using; no declarations, so import is not required
* add ParamsArray: replaces RAMMatrices indices and constants vectors with a dedicated class that encapsulates this logic, resulting in an overall cleaner interface; A_ind, S_ind, M_ind become ParamsArray; F_ind becomes SparseMatrixCSC; parameters.jl is no longer required and is removed
* materialize!(Symm/LowTri/UpTri)
* ParamsArray: faster sparse materialize!
* ParamsArray: use Iterators.flatten() (faster)
* Base.hash(::ParamsArray)
* colnames -> vars
* update_partable!(): better params unique check
* start_fabin3: check obs_mean data & meanstructure
* params/vars API tweaks and tests
* generic imply: keep F sparse
* tests helper: is_extended_tests() to consolidate the ENV variable check
* Optim sem_fit(): use provided optimizer
* prepare_start_params(): arg-dependent dispatch
* convert to argument type-dependent dispatch
* replace start_val() function with prepare_start_params()
* refactor start_parameter_table() into prepare_start_params(start_val::ParameterTable, ...) and use the SEM model param indices
* unify processing of starting values by all optimizers
* support dictionaries of values
* prepare_param_bounds() API for optim
* u/l_bounds support for Optim.jl
* SemOptimizer(engine = ...) ctor
* SEMNLOptExt for NLopt
* NLopt: sem_fit(): use provided optimizer
* SEMProximalOptExt for Proximal opt
* merge diff/*.jl optimizer code into optimizer/*.jl
* Optim: document u/l bounds
* remove unused options field from Proximal optimizer
* decouple optimizer from Sem model
Co-authored-by: Maximilian Ernst <[email protected]>
* fix inequality constraints test: the NLopt minimum was 18.11, below what the test expected
* add ProximalSEM tests
* optim/documentation.jl: rename to abstract.jl
* ext: change folder layout
* Project.toml: fix ProximalOperators ID
* docs: fix nsamples, nobserved_vars
* cleanup data columns reordering: define a single source_to_dest_perm() function
* SemObservedCov: define as an alias of SemObservedData; reduces code duplication; also annotate types of ctor args; now samples(SemObsCov) returns nothing
* SemObserved: store observed_vars; add observed_vars(data::SemObserved)
* nsamples(observed::SemObserved): unify
* FIML: simplify index generation
* SemObservedMissing: refactor
* use SemObsMissingPattern struct to simplify code
* replace O(Nvars^2) common pattern detection with a Dict{}
* don't store row-wise, store sub-matrices of non-missing data instead
* use StatsBase.mean_and_cov()
* remove cov_and_mean(): not used anymore, StatsBase.mean_and_cov() is used instead
* SemObserved: unify data preparation
- SemObservedData: parameterize by cov/mean eltype instead of the whole container types
Co-authored-by: Maximilian Ernst <[email protected]>
* tests: update SemObserved tests to match the updated data preparation behaviour
* prep_data: warn if obs_vars order doesn't match spec
* SemObsData: observed_var_prefix kwarg to specify the prefix of the generated observed_vars if none provided could be inferred; defaults to :obs
* ParTable: add graph-based kw-only constructor
* Project.toml: fix ProximalAlgorithms to 0.5; v0.7 changed the diff interface (v0.6 was skipped)
* switch to ProximalAlgorithms.jl v0.7; also drop ProximalOperators and ProximalCore weak deps
* move params() to common.jl; it is available for many SEM types, not just SemSpec
* RAM ctor: use random parameters instead of NaNs to initialize RAM matrices; simplify check_acyclic()
* move check_acyclic() to abstract.jl; add verbose parameter
* AbstractSem: improve imply/observed API redirect
* imply -> implied, SemImply -> SemImplied
* imply -> implied: file renames
* close #158
* close #232
* Update ext/SEMProximalOptExt/ProximalAlgorithms.jl
* suppress uninformative warnings during package testing
* turn simplification of symbolic terms off by default
* new version of StenoGraphs results in fewer deprecation notices
* fix exporting structs from package extensions
* fix NLopt extension
* fix Proximal extension
* fix printing
* fix regularization docs
* start reworking docs
* finish rewriting docs
* rm ProximalSEM from docs deps
* fix docs
* fix docs
* try to fix svgs for docs
* try to fix svgs for docs
* update README
* bump version
* give macOS some slack and format

--------

Co-authored-by: Alexey Stukalov <[email protected]>
Co-authored-by: Alexey Stukalov <[email protected]>
Co-authored-by: Alexey Stukalov <[email protected]>
Co-authored-by: Alexey Stukalov <[email protected]>
Co-authored-by: Aaron Peikert <[email protected]>
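The commutation, duplication, and elimination matrices reworked in this release satisfy classical matrix-calculus identities; the headline one is that the commutation matrix K maps vec(A) to vec(A'). The sketch below is a minimal standalone illustration in plain Julia (it builds K naively and is not the package's `CommutationMatrix` type, which instead overloads linear-algebra ops to avoid materializing the matrix):

```julia
using LinearAlgebra, SparseArrays

# Commutation matrix K_{m,n}: K * vec(A) == vec(A') for any m×n matrix A.
# Built explicitly here for illustration only.
function commutation_matrix(m::Integer, n::Integer)
    K = spzeros(Int, m * n, m * n)
    for i in 1:m, j in 1:n
        # vec(A)[(j-1)m + i] == A[i, j] must map to vec(A')[(i-1)n + j]
        K[(i - 1) * n + j, (j - 1) * m + i] = 1
    end
    return K
end

A = reshape(1.0:6.0, 2, 3)   # a 2×3 matrix
K = commutation_matrix(2, 3)
@assert K * vec(A) == vec(permutedims(A))
```

Since K is a permutation matrix, multiplying by it is just an index shuffle, which is why replacing the old helper-function approach with a dedicated type and overloaded ops can avoid allocations entirely.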
1 parent: 131523a · commit: aa1fcf6

File tree

122 files changed: +5618 −5556 lines


Project.toml

Lines changed: 15 additions & 5 deletions
```diff
@@ -1,7 +1,7 @@
 name = "StructuralEquationModels"
 uuid = "383ca8c5-e4ff-4104-b0a9-f7b279deed53"
 authors = ["Maximilian Ernst", "Aaron Peikert"]
-version = "0.2.4"
+version = "0.3.0"

 [deps]
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
@@ -12,7 +12,6 @@ LazyArtifacts = "4af54fe1-eca0-43a8-85a7-787d91b784e3"
 LineSearches = "d3d80556-e9d4-5f37-9878-2ab0fcc64255"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 NLSolversBase = "d41bc354-129a-5804-8e4c-c37616107c6c"
-NLopt = "76087f3c-5699-56af-9a33-bf431cd00edd"
 Optim = "429524aa-4258-5aef-a3af-852621145aeb"
 Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
 PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"
@@ -22,10 +21,11 @@ Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
 StenoGraphs = "78862bba-adae-4a83-bb4d-33c106177f81"
 Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"
+SymbolicUtils = "d1185830-fcd6-423d-90d6-eec64667417b"

 [compat]
-julia = "1.9, 1.10"
-StenoGraphs = "0.2"
+julia = "1.9, 1.10, 1.11"
+StenoGraphs = "0.2 - 0.3, 0.4.1 - 0.5"
 DataFrames = "1"
 Distributions = "0.25"
 FiniteDiff = "2"
@@ -34,11 +34,21 @@ NLSolversBase = "7"
 NLopt = "0.6, 1"
 Optim = "1"
 PrettyTables = "2"
+ProximalAlgorithms = "0.7"
 StatsBase = "0.33, 0.34"
-Symbolics = "4, 5"
+Symbolics = "4, 5, 6"
+SymbolicUtils = "1.4 - 1.5, 1.7, 2, 3"

 [extras]
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

 [targets]
 test = ["Test"]
+
+[weakdeps]
+NLopt = "76087f3c-5699-56af-9a33-bf431cd00edd"
+ProximalAlgorithms = "140ffc9f-1907-541a-a177-7475e0a401e9"
+
+[extensions]
+SEMNLOptExt = "NLopt"
+SEMProximalOptExt = "ProximalAlgorithms"
```
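The new `[weakdeps]` and `[extensions]` sections use the package-extension mechanism available since Julia 1.9: NLopt and ProximalAlgorithms move out of hard `[deps]`, and the extension modules `SEMNLOptExt` and `SEMProximalOptExt` are loaded only when the corresponding trigger package is loaded into the session. A hypothetical session sketch (assumes both packages are installed; the `SemOptimizer(engine = ...)` constructor is the one named in the commit message):

```julia
# Loading only the base package does NOT load the extensions.
using StructuralEquationModels

# Loading the weak dependency triggers Julia to compile/load SEMNLOptExt,
# making the NLopt-backed optimizer engine available:
using NLopt

# e.g. (sketch): optimizer = SemOptimizer(engine = :NLopt)
```

This keeps installation light for users who never touch NLopt- or proximal-based optimization, while the `[compat]` entry for NLopt still constrains the version when it is present.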

README.md

Lines changed: 2 additions & 1 deletion
```diff
@@ -11,7 +11,7 @@ It is still *in development*.
 Models you can fit include
 - Linear SEM that can be specified in RAM (or LISREL) notation
 - ML, GLS and FIML estimation
-- Regularization
+- Regularized SEM (Ridge, Lasso, L0, ...)
 - Multigroup SEM
 - Sums of arbitrary loss functions (everything the optimizer can handle).

@@ -35,6 +35,7 @@ The package makes use of
 - Symbolics.jl for symbolically precomputing parts of the objective and gradients to generate fast, specialized functions.
 - SparseArrays.jl to speed up symbolic computations.
 - Optim.jl and NLopt.jl to provide a range of different Optimizers/Linesearches.
+- ProximalAlgorithms.jl for regularization.
 - FiniteDiff.jl and ForwardDiff.jl to provide gradients for user-defined loss functions.

 # At the moment, we are still working on:
```

docs/Project.toml

Lines changed: 2 additions & 0 deletions
```diff
@@ -1,4 +1,6 @@
 [deps]
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+NLopt = "76087f3c-5699-56af-9a33-bf431cd00edd"
+ProximalAlgorithms = "140ffc9f-1907-541a-a177-7475e0a401e9"
 ProximalOperators = "a725b495-10eb-56fe-b38b-717eba820537"
```

docs/make.jl

Lines changed: 1 addition & 1 deletion
```diff
@@ -32,7 +32,7 @@ makedocs(
 "Developer documentation" => [
 "Extending the package" => "developer/extending.md",
 "Custom loss functions" => "developer/loss.md",
-"Custom imply types" => "developer/imply.md",
+"Custom implied types" => "developer/implied.md",
 "Custom optimizer types" => "developer/optimizer.md",
 "Custom observed types" => "developer/observed.md",
 "Custom model types" => "developer/sem.md",
```

docs/src/assets/concept.svg

Lines changed: 0 additions & 26 deletions

docs/src/assets/concept_typed.svg

Lines changed: 0 additions & 26 deletions

docs/src/developer/extending.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 # Extending the package

-As discussed in the section on [Model Construction](@ref), every Structural Equation Model (`Sem`) consists of four parts:
+As discussed in the section on [Model Construction](@ref), every Structural Equation Model (`Sem`) consists of three (four with the optimizer) parts:

 ![SEM concept typed](../assets/concept_typed.svg)
```

docs/src/developer/implied.md

Lines changed: 98 additions & 0 deletions
# Custom implied types

We recommend first reading the section [Custom loss functions](@ref), as the overall implementation is the same; we describe it here more briefly.

Implied types are subtypes of `SemImplied`. To implement your own implied type, you should define a struct

```julia
struct MyImplied <: SemImplied
    ...
end
```

and a method for `update!`:

```julia
import StructuralEquationModels: update!

function update!(targets::EvaluationTargets, implied::MyImplied, model::AbstractSemSingle, params)

    if is_objective_required(targets)
        ...
    end

    if is_gradient_required(targets)
        ...
    end

    if is_hessian_required(targets)
        ...
    end

end
```

As you can see, `update!` gets passed `targets` as a first argument, which tells us whether the objective value, gradient, and/or hessian are needed.
We can then use the `is_..._required` functions and, conditional on what the optimizer needs, compute and store things we want to make available to the loss functions. For example, as we have seen in [Second example - maximum likelihood](@ref), the `RAM` implied type computes the model-implied covariance matrix and makes it available via `implied.Σ`.

Just as described in [Custom loss functions](@ref), you may define a constructor. Typically, this will depend on the `specification = ...` argument, which can be a `ParameterTable` or a `RAMMatrices` object.

We implement an `ImpliedEmpty` type in our package that does nothing but serve as an `implied` field in case you are using a loss function that does not need any implied type at all. You may use it as a template for defining your own implied type, as it also shows how to handle the specification objects:

```julia
############################################################################################
### Types
############################################################################################

"""
Empty placeholder for models that don't need an implied part.
(For example, models that only regularize parameters.)

# Constructor

    ImpliedEmpty(; specification, kwargs...)

# Arguments
- `specification`: either a `RAMMatrices` or `ParameterTable` object

# Examples
A multigroup model with ridge regularization could be specified as a `SemEnsemble` with one
model per group and an additional model with `ImpliedEmpty` and `SemRidge` for the regularization part.

# Extended help

## Interfaces
- `params(::ImpliedEmpty)` -> vector of parameter labels
- `nparams(::ImpliedEmpty)` -> number of parameters

## Implementation
Subtype of `SemImplied`.
"""
struct ImpliedEmpty{A, B, C} <: SemImplied
    hessianeval::A
    meanstruct::B
    ram_matrices::C
end

############################################################################################
### Constructors
############################################################################################

function ImpliedEmpty(; specification, meanstruct = NoMeanStruct(), hessianeval = ExactHessian(), kwargs...)
    return ImpliedEmpty(hessianeval, meanstruct, convert(RAMMatrices, specification))
end

############################################################################################
### methods
############################################################################################

update!(targets::EvaluationTargets, implied::ImpliedEmpty, par, model) = nothing

############################################################################################
### Recommended methods
############################################################################################

update_observed(implied::ImpliedEmpty, observed::SemObserved; kwargs...) = implied
```

As you see, similar to [Custom loss functions](@ref) we implement a method for `update_observed`.
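The `is_..._required` dispatch pattern can be mimicked in a self-contained sketch. The toy `MockTargets` type below is a hypothetical stand-in for the package's `EvaluationTargets` (whose internals may differ): Boolean type parameters encode which quantities were requested, so `update!`-style code computes only what the optimizer actually needs.

```julia
# Toy stand-in: three Bool type parameters encode whether the
# objective / gradient / hessian were requested by the optimizer.
struct MockTargets{O, G, H} end

is_objective_required(::MockTargets{O, G, H}) where {O, G, H} = O
is_gradient_required(::MockTargets{O, G, H}) where {O, G, H} = G
is_hessian_required(::MockTargets{O, G, H}) where {O, G, H} = H

# An update!-style function in this pattern, for the toy objective ‖θ‖²:
# it evaluates only the requested quantities and returns nothing otherwise.
function mock_update!(targets::MockTargets, θ::Vector{Float64})
    objective = is_objective_required(targets) ? sum(abs2, θ) : nothing
    gradient  = is_gradient_required(targets)  ? 2 .* θ       : nothing
    return objective, gradient
end

obj, grad = mock_update!(MockTargets{true, false, false}(), [1.0, 2.0])
```

Because the flags are type parameters, the `? :` branches are resolved at compile time for each `MockTargets` instantiation, so the unneeded computations are not merely skipped at runtime but compiled away.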

docs/src/developer/imply.md

Lines changed: 0 additions & 87 deletions
This file was deleted.
