# Calibrating Models to Data

Calibrating models to data, i.e. finding parameters which make a model a sufficiently close fit to data, is part of the core functionality of JuliaSimModelOptimizer. The `calibrate` function is designed as an automated mechanism for performing such model calibration in an accurate and efficient manner. It uses alternative methods for building the cost functions, also called transcription methods, such as multiple shooting and collocation, to improve the stability of the training process, along with techniques for sampling initial conditions, to ensure that an accurate fit can be generated with little to no effort.

`calibrate` is designed for finding a single best-fitting parameter set for a model. For a set of potential fits that characterizes fitting uncertainty, see the documentation on the `parametric_uq` function for parametric uncertainty quantification.

## Objective function definition

`JuliaSimModelOptimizer.objective` — Function

`objective(prob, alg)`

When considering the `InverseProblem` in the context of calibration, we can define an objective function that associates a number to a vector of possible search space values. This number is a measure of how close the given vector is to a solution of the inverse problem. The objective for an inverse problem is defined as a sum of the individual contributions from each experiment:

\[\sum_{i=1}^{N} \mathrm{loss}(i).\]

In the above, `loss(i)` is the individual contribution of the `i`th experiment. It can be computed via `cost_contribution` and is determined by the `loss_func` passed in the corresponding experiment constructor.

Calling `objective(prob, alg)` will return a function of the search space values, so in order to use it, it is best to store it in a variable as follows:

`cost = objective(prob, alg);`

In order to obtain the value that corresponds to particular search space values, one passes a `NamedTuple` or a vector of pairs to the above mentioned function.

As an example, one would call `cost([a => 1.0, b => 3.0])`, where the search space in the inverse problem was defined as containing the `a` and `b` parameters, i.e. `[a => (0.0, 10.0), b => (-5.0, 5.0)]`.

The above function also accepts calibration results as input, so if we have a calibration result obtained from something like `r = calibrate(prob, alg)`, then we can compute the cost value corresponding to the calibration result using `cost(r)`, where `cost` is the above mentioned variable.

The `cost` function returned by `objective` can also be called with 0 arguments, i.e. `cost()`, in which case the cost will be computed using the initial state of the optimization problem, which uses the default values present in the model for all parameters and initial conditions.

**Positional arguments**

- `prob`: the `InverseProblem` defining the objective.
- `alg`: the calibration algorithm that formulates the objective function.

`JuliaSimModelOptimizer.cost_contribution` — Function

`cost_contribution(alg, experiment::AbstractExperiment, prob, x)`

Compute the contribution of the given `experiment` to the total cost associated with the inverse problem `prob` using the given calibration algorithm (`alg`).

**Positional arguments**

- `alg`: the calibration algorithm that formulates the objective function.
- `experiment`: the experiment for which we compute the contribution.
- `prob`: the `InverseProblem` defining the objective.
- `x`: the values of the tuned parameters or initial conditions.
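As a sketch of how these pieces fit together, consider the following. The model, data, and search-space symbols (`model`, `data`, `a`, `b`) are illustrative assumptions, as is the exact `InverseProblem` constructor signature; they are not part of the API documented above.

```julia
using JuliaSimModelOptimizer

# Illustrative setup: an experiment built from data and a model, and an
# inverse problem searching over the parameters `a` and `b`.
experiment = Experiment(data, model)
prob = InverseProblem(experiment, model, [a => (0.0, 10.0), b => (-5.0, 5.0)])

alg = SingleShooting(maxiters = 100)
cost = objective(prob, alg);

cost()                      # cost at the model's default values
cost([a => 1.0, b => 3.0])  # cost at a specific point in the search space
```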

## The `calibrate` Function

`JuliaSimModelOptimizer.calibrate` — Function

`calibrate(prob, alg; adtype = Optimization.AutoForwardDiff(), progress = true)`

Find the best parameters to solve the inverse problem `prob` using the calibration algorithm given by `alg`.

**Arguments**

- `prob`: the `InverseProblem` to solve.
- `alg`: the calibration algorithm used for building the loss function. This can be `JuliaHubJob` for launching batch jobs in JuliaHub, in which case the actual `alg` is wrapped inside the `JuliaHubJob` object.

**Keyword Arguments**

- `adtype`: Automatic differentiation choice; see the Optimization.jl docs for details. Defaults to `AutoForwardDiff()`.
- `progress`: Show the progress of the optimization (current loss value) in a progress bar. Defaults to `true`.

`calibrate(res, alg; adtype = Optimization.AutoForwardDiff(), progress = true)`

Continue calibration from `res`, which is obtained from a previous calibration, using the algorithm given by `alg`.

**Arguments**

- `res`: object of type `CalibrationResult`, obtained from a previous calibration of an `InverseProblem`.
- `alg`: the calibration algorithm used for building the loss function. This can be `JuliaHubJob` for launching batch jobs in JuliaHub, in which case the actual `alg` is wrapped inside the `JuliaHubJob` object.

**Keyword Arguments**

- `adtype`: Automatic differentiation choice; see the Optimization.jl docs for details. Defaults to `AutoForwardDiff()`.
- `progress`: Show the progress of the optimization (current loss value) in a progress bar. Defaults to `true`.
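As a hedged sketch of both methods (assuming `prob` is an `InverseProblem` defined elsewhere), a calibration can be run and then continued from its result:

```julia
# First pass: single shooting with an iteration budget.
alg = SingleShooting(maxiters = 1000)
r = calibrate(prob, alg)

# Continue from the previous result, e.g. refining with multiple shooting.
r2 = calibrate(r, MultipleShooting(maxiters = 500, trajectories = 10))
```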

## Ipopt Optimizer

`JuliaSimModelOptimizer.IpoptOptimizer` — Function

```
IpoptOptimizer(;
    verbose = false,
    tol = 1e-4,
    max_iter = 1000,
    acceptable_iter = 250,
    acceptable_tol = 1e-3,
    hessian_approximation = "limited-memory",
    kwargs...)
```

A convenience constructor to create an `optimizer = Ipopt.Optimizer()` and set options. Other options can be passed as kwargs; see https://coin-or.github.io/Ipopt/OPTIONS.html for information about each option. The defaults provided here are more relaxed than the Ipopt defaults.
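For instance, one might tighten the tolerance and forward an extra Ipopt option through the kwargs. This is only a sketch, not a prescribed configuration; the option names come from the linked Ipopt options page.

```julia
# `print_level` is a standard Ipopt option forwarded via kwargs.
opt = IpoptOptimizer(tol = 1e-6, max_iter = 5000, print_level = 3)
```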

## Calibration Algorithms

### Single Shooting

`JuliaSimModelOptimizer.SingleShooting` — Type

`SingleShooting(; maxiters = nothing, maxtime = nothing, optimizer = IpoptOptimizer(; max_iter = maxiters), kwargs...)`

Single shooting is the simplest transcription method for building the loss function. In this case we solve all the model equations and then compare the solutions against the data with the user-given error functions.

The optimization method receives the cost function as the sum of the individual costs mentioned above. To change the optimization method, you can use the `optimizer` keyword argument. The default optimization method is `Ipopt.Optimizer`, which can be used from `OptimizationMOI`; we also have a convenience constructor, `IpoptOptimizer`, for defining it. The maximum number of iterations for the optimizer is given by `maxiters`. At least one of `maxiters` or `maxtime` is required; if both are given, optimization stops as soon as either condition is met.

**Keyword Arguments**

- `maxiters`: Maximum number of iterations when using the method with a time stepping optimizer.
- `maxtime`: Maximum time for the optimization process.
- `optimizer`: An Optimization.jl algorithm for the global optimization. Defaults to `Ipopt.Optimizer` constructed using `IpoptOptimizer`.
- `kwargs`: Keyword arguments passed onto `Optimization.solve`.
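As an illustration of the stopping criteria (assuming `prob` is an `InverseProblem` defined elsewhere), both budgets can be combined:

```julia
# Stop after 2000 iterations or 10 minutes, whichever is hit first.
alg = SingleShooting(maxiters = 2000, maxtime = 600)
r = calibrate(prob, alg)
```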

### Multiple Shooting

`JuliaSimModelOptimizer.MultipleShooting` — Type

`MultipleShooting(; maxiters = nothing, maxtime = nothing, optimizer = IpoptOptimizer(; max_iter = maxiters), trajectories, ensemblealg = EnsembleThreads(), continuitytype = ModelStatesPenalty, continuitylossfun = squaredl2loss, continuitylossweight = 100, initialization = DefaultSimulationInitialization(), kwargs...)`

Multiple shooting is a calibration algorithm in which we split the solves of the model equations into several segments, as it can be easier to avoid local minima if we fit the individual segments and then match them up. The number of segments can be controlled via the `trajectories` keyword argument. The initial conditions between the segments are added to the optimization as extra hidden parameters that have to be fit. The continuity of the resulting solution ensemble is enforced using a continuity penalty between segments.

The continuity loss function can be changed via `continuitylossfun`, defaulting to `squaredl2loss`, and the penalty used for the continuity is weighted via `continuitylossweight`, which defaults to 100. All segments can be initialized using the `initialization` keyword argument. It defaults to `DefaultSimulationInitialization()`, i.e. the values are obtained by simulating the model passed in the `Experiment` with default parameters. For information on other initializations, refer to the Initialization Methods section in the manual.

The optimization method receives the cost function as the sum of the individual costs mentioned above. To change the optimization method, you can use the `optimizer` keyword argument. The default optimization method is `Ipopt.Optimizer`, which can be used from `OptimizationMOI`; we also have a convenience constructor, `IpoptOptimizer`, for defining it. The maximum number of iterations for the optimizer is given by `maxiters`. At least one of `maxiters` or `maxtime` is required; if both are given, optimization stops as soon as either condition is met.

**Keyword Arguments**

- `maxiters`: Maximum number of iterations when using the method with a time stepping optimizer.
- `maxtime`: Maximum time for the optimization process.
- `optimizer`: An Optimization.jl algorithm for the global optimization. Defaults to `Ipopt.Optimizer`, which can be used from `OptimizationMOI`.
- `trajectories`: Required. The number of segments that the whole time span is split into.
- `ensemblealg`: Parallelization method for the ensemble problem. Defaults to `EnsembleThreads()`.
- `continuitytype`: Type of continuity enforced. This can be `ModelStatesPenalty`, which adds a term to the loss corresponding to the continuity loss given by `continuitylossfun`, or `ConstraintBased`, where the continuity equations are added as constraints for the optimizer.
- `continuitylossfun`: Loss function to compute the continuity penalty. Only applicable if `continuitytype` is `ModelStatesPenalty`.
- `continuitylossweight`: Weight multiplied with the continuity loss term in the total loss. Only applicable if `continuitytype` is `ModelStatesPenalty`.
- `initialization`: Initialization method for the segments. Only applicable if `continuitytype` is `ModelStatesPenalty`.
- `kwargs`: Keyword arguments passed onto `Optimization.solve`.
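A sketch of configuring multiple shooting with the keyword arguments above (assuming `prob` is an `InverseProblem` defined elsewhere):

```julia
# Penalty-based continuity (the default) with a custom weight, splitting the
# time span into 15 segments.
alg = MultipleShooting(maxiters = 2000,
                       trajectories = 15,
                       continuitylossweight = 500)
r = calibrate(prob, alg)
```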

#### Initialization Methods

`JuliaSimModelOptimizer.DefaultSimulationInitialization` — Type

`DefaultSimulationInitialization()`

This is used for initializing segments when doing calibration using `MultipleShooting`. The values for all the segments are computed by simulating the model passed in the `Experiment` with default parameters and using its `ODESolution`.

`JuliaSimModelOptimizer.DataInitialization` — Type

`DataInitialization(; interpolation = CubicSpline)`

This is used for initializing segments when doing calibration using `MultipleShooting`. For those states where data is present, the values are obtained by fitting the data using `interpolation`. For other states, the initial condition for that state is used for all the segments.

**Keyword Arguments**

- `interpolation`: Interpolation method from DataInterpolations used to fit the data when obtaining initial values for the segments.

`JuliaSimModelOptimizer.RandomInitialization` — Type

`RandomInitialization()`

This is used for initializing segments when doing calibration using `MultipleShooting`. The values for all the segments are computed as a random value from 0 to 1 multiplied by the initial condition for that state.
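For example, a data-based initialization with a different interpolation could be sketched as follows; `LinearInterpolation` is one of the interpolation types exported by DataInterpolations.

```julia
using DataInterpolations

# Initialize segment states by linearly interpolating the data where present.
init = DataInitialization(interpolation = LinearInterpolation)
alg = MultipleShooting(maxiters = 1000, trajectories = 10,
                       initialization = init)
```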

### Data Shooting

`JuliaSimModelOptimizer.DataShooting` — Type

`DataShooting(; maxiters = nothing, maxtime = nothing, optimizer = IpoptOptimizer(; max_iter = maxiters), groupsize, ensemblealg = EnsembleThreads(), continuitylossfun = squaredl2loss, continuitylossweight = 100, kwargs...)`

`DataShooting` is a variant of the multiple shooting method (see `MultipleShooting` for more details), where we use the data for the internal initial conditions between segments instead of having the optimizer find them. If the data is considered noisy, it might be better to let the optimizer find the initial conditions, as we would inevitably introduce errors if we use `DataShooting`. To specify the group size, i.e. the number of time points in each segment, where the data points at both ends are taken as initial conditions, we can use the `groupsize` keyword argument. The continuity of the resulting solution ensemble is enforced using a continuity penalty between segments. The continuity loss function can be changed via `continuitylossfun`, defaulting to `squaredl2loss`, and the penalty used for the continuity is weighted via `continuitylossweight`, which defaults to 100.

The optimization method receives the cost function as the sum of the individual costs mentioned above. To change the optimization method, you can use the `optimizer` keyword argument. The default optimization method is `Ipopt.Optimizer`, which can be used from `OptimizationMOI`; we also have a convenience constructor, `IpoptOptimizer`, for defining it. The maximum number of iterations for the optimizer is given by `maxiters`. At least one of `maxiters` or `maxtime` is required; if both are given, optimization stops as soon as either condition is met.

**Keyword Arguments**

- `maxiters`: Maximum number of iterations when using the method with a time stepping optimizer.
- `maxtime`: Maximum time for the optimization process.
- `optimizer`: An Optimization.jl algorithm for the global optimization. Defaults to `Ipopt.Optimizer` constructed using `IpoptOptimizer`.
- `groupsize`: Required. Number of time points in each segment.
- `ensemblealg`: Parallelization method for the ensemble problem. Defaults to `EnsembleThreads()`.
- `continuitytype`: Type of continuity enforced. This can be `ModelStatesPenalty`, which adds a term to the loss corresponding to the continuity loss given by `continuitylossfun`, or `ConstraintBased`, where the continuity equations are added as constraints for the optimizer.
- `continuitylossfun`: Loss function to compute the continuity penalty. Only applicable if `continuitytype` is `ModelStatesPenalty`.
- `continuitylossweight`: Weight multiplied with the continuity loss term in the total loss. Only applicable if `continuitytype` is `ModelStatesPenalty`.
- `initialization`: Initialization method for the segments. Only applicable if `continuitytype` is `ModelStatesPenalty`.
- `kwargs`: Keyword arguments passed onto `Optimization.solve`.
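A minimal sketch (assuming `prob` is an `InverseProblem` defined elsewhere and the data is not too noisy):

```julia
# Segments of 5 time points each; the endpoints are taken from the data.
alg = DataShooting(maxiters = 1000, groupsize = 5)
r = calibrate(prob, alg)
```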

### Collocation methods

Collocation methods work by computing the derivatives of the data instead of integrating the equations. For more details on how they work, take a look at the collocation manual page.

Collocation methods work with data by design, so they can only be used when you have data in the loss functions.

## Model transformations

### Prediction Error Methods

`JuliaSimModelOptimizer.DiscreteFixedGainPEM` — Type

`DiscreteFixedGainPEM(alpha)`

This model transformation implements the prediction error method as a discrete callback that brings the solution closer to the data at the corresponding timepoints as the integrator steps through, with a factor that depends on `alpha`, a scalar value from 0 to 1. If `alpha` is 0, this is equivalent to not applying any correction, while `alpha = 1` means that we start each step from the data.

If there is data for the unknowns, it is directly used. If there is data for observed variables, the corresponding equations which are linear with respect to the unknowns are used to estimate the other unknowns in those equations for which there is no data. This is only possible if the system of equations is invertible with respect to those other unknowns.

It is also useful for calibrating unstable systems, as pure simulation methods cannot track the true solution over longer time intervals and diverge.
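As a sketch, the transformation is attached when building the experiment; the `model_transformations` keyword name below is an assumption for illustration, not confirmed by the text above.

```julia
# Pull the solution 30% of the way toward the data at each data timepoint.
experiment = Experiment(data, model;
                        model_transformations = [DiscreteFixedGainPEM(0.3)])
```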