
Commit a8ae75c

Merge pull request #248 from StructuralEquationModels/documentation/0.3.0
Documentation/0.3.0
2 parents bcfae6a + a4b7fe6 commit a8ae75c

35 files changed: +292 −326 lines changed

docs/Project.toml

Lines changed: 2 additions & 0 deletions
@@ -1,4 +1,6 @@
 [deps]
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+NLopt = "76087f3c-5699-56af-9a33-bf431cd00edd"
+ProximalAlgorithms = "140ffc9f-1907-541a-a177-7475e0a401e9"
 ProximalOperators = "a725b495-10eb-56fe-b38b-717eba820537"

docs/src/assets/concept.svg

Lines changed: 0 additions & 26 deletions

docs/src/assets/concept_typed.svg

Lines changed: 0 additions & 26 deletions

docs/src/developer/extending.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Extending the package
 
-As discussed in the section on [Model Construction](@ref), every Structural Equation Model (`Sem`) consists of four parts:
+As discussed in the section on [Model Construction](@ref), every Structural Equation Model (`Sem`) consists of three (four with the optimizer) parts:
 
 ![SEM concept typed](../assets/concept_typed.svg)
 

docs/src/developer/implied.md

Lines changed: 53 additions & 42 deletions
@@ -10,78 +10,89 @@ struct MyImplied <: SemImplied
 end
 ```
 
-and at least a method to compute the objective
+and a method for `update!`:
 
 ```julia
 import StructuralEquationModels: update!
 
-function objective!(implied::MyImplied, par, model::AbstractSemSingle)
-    ...
-    return nothing
-end
-```
+function update!(targets::EvaluationTargets, implied::MyImplied, model::AbstractSemSingle, params)
 
-This method should compute and store things you want to make available to the loss functions, and returns `nothing`. For example, as we have seen in [Second example - maximum likelihood](@ref), the `RAM` implied type computes the model-implied covariance matrix and makes it available via `Σ(implied)`.
-To make stored computations available to loss functions, simply write a function - for example, for the `RAM` implied type we defined
+    if is_objective_required(targets)
+        ...
+    end
 
-```julia
-Σ(implied::RAM) = implied.Σ
+    if is_gradient_required(targets)
+        ...
+    end
+    if is_hessian_required(targets)
+        ...
+    end
+
+end
 ```
 
-Additionally, you can specify methods for `gradient` and `hessian` as well as the combinations described in [Custom loss functions](@ref).
+As you can see, `update!` receives `targets` as its first argument, which tells us whether the objective value, gradient, and/or hessian are needed.
+We can then use the `is_..._required` functions and, conditional on what the optimizer needs, compute and store things we want to make available to the loss functions. For example, as we have seen in [Second example - maximum likelihood](@ref), the `RAM` implied type computes the model-implied covariance matrix and makes it available via `implied.Σ`.
 
-The last thing nedded to make it work is a method for `nparams` that takes your implied type and returns the number of parameters of the model:
 
-```julia
-nparams(implied::MyImplied) = ...
-```
 
 Just as described in [Custom loss functions](@ref), you may define a constructor. Typically, this will depend on the `specification = ...` argument that can be a `ParameterTable` or a `RAMMatrices` object.
 
 We implement an `ImpliedEmpty` type in our package that does nothing but serve as an `implied` field in case you are using a loss function that does not need any implied type at all. You may use it as a template for defining your own implied type, as it also shows how to handle the specification objects:
 
 ```julia
-############################################################################
+############################################################################################
 ### Types
-############################################################################
+############################################################################################
+"""
+Empty placeholder for models that don't need an implied part.
+(For example, models that only regularize parameters.)
 
-struct ImpliedEmpty{V, V2} <: SemImplied
-    identifier::V2
-    n_par::V
-end
+# Constructor
 
-############################################################################
-### Constructors
-############################################################################
+    ImpliedEmpty(;specification, kwargs...)
+
+# Arguments
+- `specification`: either a `RAMMatrices` or `ParameterTable` object
+
+# Examples
+A multigroup model with ridge regularization could be specified as a `SemEnsemble` with one
+model per group and an additional model with `ImpliedEmpty` and `SemRidge` for the regularization part.
 
-function ImpliedEmpty(;
-    specification,
-    kwargs...)
+# Extended help
 
-    ram_matrices = RAMMatrices(specification)
-    identifier = StructuralEquationModels.identifier(ram_matrices)
+## Interfaces
+- `params(::ImpliedEmpty)` -> Vector of parameter labels
+- `nparams(::ImpliedEmpty)` -> Number of parameters
 
-    n_par = length(ram_matrices.parameters)
+## Implementation
+Subtype of `SemImplied`.
+"""
+struct ImpliedEmpty{A, B, C} <: SemImplied
+    hessianeval::A
+    meanstruct::B
+    ram_matrices::C
+end
+
+############################################################################################
+### Constructors
+############################################################################################
 
-    return ImpliedEmpty(identifier, n_par)
+function ImpliedEmpty(;specification, meanstruct = NoMeanStruct(), hessianeval = ExactHessian(), kwargs...)
+    return ImpliedEmpty(hessianeval, meanstruct, convert(RAMMatrices, specification))
 end
 
-############################################################################
+############################################################################################
 ### methods
-############################################################################
+############################################################################################
 
-objective!(implied::ImpliedEmpty, par, model) = nothing
-gradient!(implied::ImpliedEmpty, par, model) = nothing
-hessian!(implied::ImpliedEmpty, par, model) = nothing
+update!(targets::EvaluationTargets, implied::ImpliedEmpty, par, model) = nothing
 
-############################################################################
+############################################################################################
 ### Recommended methods
-############################################################################
-
-identifier(implied::ImpliedEmpty) = implied.identifier
-n_par(implied::ImpliedEmpty) = implied.n_par
+############################################################################################
 
 update_observed(implied::ImpliedEmpty, observed::SemObserved; kwargs...) = implied
 ```
 
-As you see, similar to [Custom loss functions](@ref) we implement a method for `update_observed`. Additionally, you should store the `identifier` from the specification object and write a method for `identifier`, as this will make it possible to access parameter indices by label.
+As you see, similar to [Custom loss functions](@ref), we implement a method for `update_observed`.
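
To make the new interface concrete, here is a minimal sketch of a custom implied type written against the `update!`/`EvaluationTargets` API shown above (an editor's illustration, not part of the commit; `MyCovImplied` and its toy diagonal covariance model are hypothetical):

```julia
using LinearAlgebra
import StructuralEquationModels: update!

# Hypothetical implied type that stores a model-implied covariance matrix Σ.
struct MyCovImplied{A, B, C} <: SemImplied
    hessianeval::A
    meanstruct::B
    Σ::C  # preallocated storage, overwritten on every evaluation
end

MyCovImplied(nvars::Integer) =
    MyCovImplied(ExactHessian(), NoMeanStruct(), zeros(nvars, nvars))

function update!(targets::EvaluationTargets, implied::MyCovImplied,
                 model::AbstractSemSingle, params)
    # loss functions only read Σ here, so one update serves objective and gradient
    if is_objective_required(targets) || is_gradient_required(targets)
        implied.Σ .= Diagonal(params .^ 2)  # toy model: Σ = diag(params.^2)
    end
    return nothing
end
```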

docs/src/developer/loss.md

Lines changed: 27 additions & 53 deletions
@@ -20,17 +20,22 @@ end
 ```
 We store the hyperparameter α and the indices I of the parameters we want to regularize.
 
-Additionaly, we need to define a *method* to compute the objective:
+Additionally, we need to define a *method* of the function `evaluate!` to compute the objective:
 
 ```@example loss
-import StructuralEquationModels: objective!
+import StructuralEquationModels: evaluate!
 
-objective!(ridge::Ridge, par, model::AbstractSemSingle) = ridge.α*sum(par[ridge.I].^2)
+evaluate!(objective::Number, gradient::Nothing, hessian::Nothing, ridge::Ridge, model::AbstractSem, par) =
+    ridge.α * sum(i -> par[i]^2, ridge.I)
 ```
 
+The function `evaluate!` recognizes from the types of the arguments `objective`, `gradient` and `hessian` whether it should compute the objective value, gradient or hessian of the model w.r.t. the parameters.
+In this case, `gradient` and `hessian` are of type `Nothing`, signifying that only the objective value should be computed.
+
 That's all we need to make it work! For example, we can now fit [A first model](@ref) with ridge regularization:
 
 We first give some parameters labels to be able to identify them as targets for the regularization:
+
 ```@example loss
 observed_vars = [:x1, :x2, :x3, :y1, :y2, :y3, :y4, :y5, :y6, :y7, :y8]
 latent_vars = [:ind60, :dem60, :dem65]
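
For orientation, here is a hedged sketch of how this dispatch plays out (an editor's illustration, not part of the commit). The `model` argument is unused by the `Ridge` method, so any `AbstractSem` works, for example the `model` constructed further down this page; the indices are hypothetical:

```julia
ridge_example = Ridge(0.01, [2, 3])
par_example = [1.0, 2.0, 3.0]

# gradient and hessian are `nothing`, so dispatch selects the objective-only
# method defined above and returns 0.01 * (2.0^2 + 3.0^2) == 0.13
evaluate!(0.0, nothing, nothing, ridge_example, model, par_example)
```
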
@@ -65,7 +70,7 @@ partable = ParameterTable(
     observed_vars = observed_vars
 )
 
-parameter_indices = param_indices([:a, :b, :c], partable)
+parameter_indices = getindex.([param_indices(partable)], [:a, :b, :c])
 myridge = Ridge(0.01, parameter_indices)
 
 model = SemFiniteDiff(
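
The rewritten `parameter_indices` line uses a broadcasting idiom that is easy to misread. A minimal sketch of what it does, assuming `param_indices(partable)` returns a `Dict`-like mapping from parameter labels to positions in the parameter vector (an editor's note, not part of the commit):

```julia
# Wrapping the mapping in a one-element array keeps it fixed under broadcasting,
# so getindex is applied once per label on the right:
lookup = Dict(:a => 1, :b => 5, :c => 9)  # hypothetical label-to-index mapping
getindex.([lookup], [:a, :b, :c])         # == [1, 5, 9]
```
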
@@ -86,15 +91,23 @@ Note that the last argument to the `objective!` method is the whole model. There
 By far the biggest improvements in performance will result from specifying analytical gradients. We can do this for our example:
 
 ```@example loss
-import StructuralEquationModels: gradient!
-
-function gradient!(ridge::Ridge, par, model::AbstractSemSingle)
-    gradient = zero(par)
-    gradient[ridge.I] .= 2*ridge.α*par[ridge.I]
-    return gradient
+function evaluate!(objective, gradient, hessian::Nothing, ridge::Ridge, model::AbstractSem, par)
+    # compute gradient
+    if !isnothing(gradient)
+        fill!(gradient, 0)
+        gradient[ridge.I] .= 2 * ridge.α * par[ridge.I]
+    end
+    # compute objective
+    if !isnothing(objective)
+        return ridge.α * sum(i -> par[i]^2, ridge.I)
+    end
 end
 ```
 
+As you can see, in this method definition, both `objective` and `gradient` can be different from `nothing`.
+We then check whether to compute the objective value and/or the gradient with `isnothing(objective)`/`isnothing(gradient)`.
+This syntax makes it possible to compute the objective value and gradient at the same time, which is beneficial when the objective and gradient share common computations.
+
 Now, instead of specifying a `SemFiniteDiff`, we can use the normal `Sem` constructor:
 
 ```@example loss
@@ -119,46 +132,7 @@ using BenchmarkTools
 
 The exact results of those benchmarks are of course highly dependent on your system (processor, RAM, etc.), but you should see that the median computation time with analytical gradients drops to about 5% of the computation without analytical gradients.
 
-Additionally, you may provide analytic hessians by writing a method of the form
-
-```julia
-function hessian!(ridge::Ridge, par, model::AbstractSemSingle)
-    ...
-    return hessian
-end
-```
-
-however, this will only matter if you use an optimization algorithm that makes use of the hessians. Our default algorithmn `LBFGS` from the package `Optim.jl` does not use hessians (for example, the `Newton` algorithmn from the same package does).
-
-To improve performance even more, you can write a method of the form
-
-```julia
-function objective_gradient!(ridge::Ridge, par, model::AbstractSemSingle)
-    ...
-    return objective, gradient
-end
-```
-
-This is beneficial when the computation of the objective and gradient share common computations. For example, in maximum likelihood estimation, the model implied covariance matrix has to be inverted to both compute the objective and gradient. Whenever the optimization algorithmn asks for the objective value and gradient at the same point, we call `objective_gradient!` and only have to do the shared computations - in this case the matrix inversion - once.
-
-If you want to do hessian-based optimization, there are also the following methods:
-
-```julia
-function objective_hessian!(ridge::Ridge, par, model::AbstractSemSingle)
-    ...
-    return objective, hessian
-end
-
-function gradient_hessian!(ridge::Ridge, par, model::AbstractSemSingle)
-    ...
-    return gradient, hessian
-end
-
-function objective_gradient_hessian!(ridge::Ridge, par, model::AbstractSemSingle)
-    ...
-    return objective, gradient, hessian
-end
-```
+Additionally, you may provide analytic hessians by writing a respective method for `evaluate!`. However, this will only matter if you use an optimization algorithm that makes use of the hessians. Our default algorithm `LBFGS` from the package `Optim.jl` does not use hessians (the `Newton` algorithm from the same package, for example, does).
 
 ## Convenient
 
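The hessian note in the hunk above leaves the signature implicit. A hedged sketch of what such an `evaluate!` method could look like for `Ridge`, whose hessian is constant and diagonal (an editor's illustration, not part of the commit; Julia's dispatch still prefers the more specific `hessian::Nothing` methods when no hessian is requested):

```julia
function evaluate!(objective, gradient, hessian, ridge::Ridge, model::AbstractSem, par)
    if !isnothing(hessian)
        fill!(hessian, 0)
        for i in ridge.I
            hessian[i, i] = 2 * ridge.α  # second derivative of α * par[i]^2
        end
    end
    if !isnothing(gradient)
        fill!(gradient, 0)
        gradient[ridge.I] .= 2 * ridge.α * par[ridge.I]
    end
    if !isnothing(objective)
        return ridge.α * sum(i -> par[i]^2, ridge.I)
    end
end
```
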
@@ -241,11 +215,11 @@ With this information, we can implement maximum likelihood optimization as
 struct MaximumLikelihood <: SemLossFunction end
 
 using LinearAlgebra
-import StructuralEquationModels: Σ, obs_cov, objective!
+import StructuralEquationModels: obs_cov, evaluate!
 
-function objective!(semml::MaximumLikelihood, parameters, model::AbstractSem)
+function evaluate!(objective::Number, gradient::Nothing, hessian::Nothing, semml::MaximumLikelihood, model::AbstractSem, par)
     # access the model implied and observed covariance matrices
-    Σᵢ = Σ(implied(model))
+    Σᵢ = implied(model).Σ
     Σₒ = obs_cov(observed(model))
     # compute the objective
     if isposdef(Symmetric(Σᵢ)) # is the model implied covariance matrix positive definite?
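
The hunk ends inside the method body. For orientation, here is one plausible completion, following the standard ML fit function logdet(Σᵢ) + tr(Σₒ * inv(Σᵢ)) up to an additive constant (an editor's sketch under that assumption, not the commit's actual continuation):

```julia
using LinearAlgebra
import StructuralEquationModels: obs_cov, evaluate!

function evaluate!(objective::Number, gradient::Nothing, hessian::Nothing,
                   semml::MaximumLikelihood, model::AbstractSem, par)
    Σᵢ = implied(model).Σ          # model-implied covariance matrix
    Σₒ = obs_cov(observed(model))  # observed covariance matrix
    if isposdef(Symmetric(Σᵢ))     # only evaluate at feasible points
        return logdet(Symmetric(Σᵢ)) + tr(Σₒ * inv(Symmetric(Σᵢ)))
    else
        return Inf                 # steer the optimizer away from infeasible points
    end
end
```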
