# Experiments

In JuliaSimModelOptimizer, the different variations of the model to be ran are called "experiments". For example, one experiment may specify that the model should be solved with a driving voltage of 10V pulse, while the next experiment specifies that the driving voltage is a 5V pulse. Each experiment is then optionally tied to a dataset which, when defined in an inverse problem, specifies a multi-simulation optimization problem that the further functions (`calibrate`

, `parametric_uq`

, etc.) generate solutions for. The type of experiment is used to signify what the data corresponds to measuring, i.e. whether the experiment is used to match data of time series, or steady states, etc.

## Experiment Types

The following describes the types of experiments which can be generated.

`JuliaSimModelOptimizer.Experiment`

— Type`Experiment(data, model; kwargs...)`

The `Experiment`

describes an experiment in which `data`

was obtained. The dynamics of the investigated system are represented in `model`

as an ODESystem. The experiment is used within the optimization problem, as part of `InverseProblem`

to fit the unknown `model`

parameters and initial conditions to `data`

. If there is no data or if no data is needed, one can just use `Experiment(model; kwargs...)`

, i.e. just avoid passing the `data`

argument. If the data is not passed, then one must provide the `tspan`

argument, representing the timespan used for integrating the differential equations of the model, otherwise this is inferred from the data.

When simulating an `Experiment`

, the parameters and initial conditions that are used in the equations are based on the following hierarchy: If the parameter is fixed by the experiment, that has the highest priority, otherwise if the parameter is to be estimated (i.e. it's present in the search space), than the estimated value is used, otherwise, the default value obtained from the model definition is used. When we say that a parameter is fixed by the experiment, that is meant to reflect the conditions under which the experiment was conducted. In this way we can have one experiment estimating a parameter that is known (or fixed) in another one at the same time. For example let's consider that we have two separate experiments in the inverse problem. The first experiment is characterized by knowing one parameter value, say `a=1`

. This means that for the first experiment we'll have to fix the value of the known parameter to its known value. The value of this parameter is not known for the other experiment. We want to find `a`

and also make use of what we know from experiment 1, so in this case we can set `params=[a=>1]`

only for experiment 1 and have `a`

in the search space for experiment 2. With this configuration, when we simulate the experiments, the tuned value for `a`

will be ignored in the first experiment and the fixed (i.e. `a=1`

) value will be used, while the second experiment will make use of it. Since the first experiment also contributes to the overall objective value associated with the inverse problem, we are making use of the information from the first experiment where the (globally) unknown parameter is known in a particular case. In order to specify the fixed parameters or initial conditions, one can use the `params`

keyword argument for the parameters (e.g. `params = [a => specific_value, b => other_value]`

) and `u0`

for the initial conditions (e.g. `u0 = [state_name => custom_initial_value]`

). The fixed values for the parameters and initial conditions can also be parametrized. For example if `a`

is in the search space, we can have the initial condition for a state `u1`

to be fixed to `2*a`

. In this case the value will depend on the tuned value of `a`

and will be different based on what tuned values are tried for `a`

, but the relation `u1=2a`

will always hold.

The contribution of the `Experiment`

to the cost function is computed using the `squaredl2loss`

function by default, but this can be changed by using the `err`

keyword argument (e.g. loss*func = (tuned*vals, sol, data) -> compute_error). The function requires 3 arguments, the tuned values of the parameters or initial conditions (i.e. what was provided as search space), the solution of the experiment and the data and is expected to return a scalar value corresponding to the loss of the experiment.

If an `MCMCOpt`

method is used, then a likelihood function is needed for each experiment. This likelihood describes the distribution that generated the data. By default, a multivariate Normal distribution is used, centered at the solution of the experiment at each saved timepoint `u`

with a standard deviation `s`

around it. This can be changed by using the `likelihood`

keyword argument (e.g. likelihood = (u, s) -> MvNormal(u, s)).

The `s`

parameter represents any observation noise around the solution of the experiment at each timepoint, `u`

. A prior distribution for `s`

is needed to run an `MCMCOpt`

method. The default value for this prior distribution is an `InverseGamma(2,3)`

. This can be changed by using the `noise_priors`

keyword argument (e.g. `noise_prior = [s1 => Exponential(1), s2 => Gamma(3,2)]`

). If only one prior is given, then it is assumed that it applies to all saved states (`save_names`

). Otherwise a vector containing pairs of states and prior distributions with length equal to `saved_names`

needs to be provided, to set different priors for each state. In the example above, the model had two states `s1`

and `s2`

so two pairs were provided. The noise parameters `s`

are assumed to be unique for each experiment and to be constant across timepoints for each state of each experiment.

In the case of GSA, a `reduction`

function is used. The output of `reduction`

is expected to be the quantity whose sensitivity is being investigated.

**Positional arguments**

`data`

: A`DataSet`

or a tabular data object. If there is no data or no data is needed, this can be omitted.`model`

: An`ODESystem`

describing the model that we are using for the experiment. The model needs to define defaults for all initial conditions and all parameters.

**Keyword arguments**

`u0`

: fix the initial conditions for the experiment (e.g.`u0 = [state_name => custom_initial_value]`

)`params`

: fix the parameters for the experiment (e.g.`params = [p1 => specific_value, ... p3 => other_value]`

)`model_transformations`

: Apply some transformations to the model used in this experiment, such as using`PredictionErrorMethod`

(e.g.`model_transformations = (PredictionErrorMethod(alpha),)`

)`callback`

: A callback or a callback set to be used during the simulation. See https://docs.sciml.ai/DiffEqDocs/stable/features/callback_functions/ for more details.`tspan`

: the timespan to use when integrating the equations. If data is passed, than it is automatically determined.`saveat`

: the times at which the solution of the differential equations will be saved. If the data is passed, the times for which we have data will be used and this argument does not need to be passed. If this argument is provided when using data, care must be taken in the experiment loss function, such that the correct time points are used.`save_names`

: the names of model variables or data columns to use from the given data. By default all model parameters and initial conditions that are present in the data are used. This argument should be used only if one wants to use only a subset of the available data.`loss_func`

: the contribution to the loss corresponding to this experiment (e.g.`loss_func = (tuned_vals, sol, data) -> compute_error`

).`prob_f_kwargs`

: A`NamedTuple`

indicating keyword arguments to be passed to the`ODEFunction`

constructor. See the ModelingToolkit docs for more details.`dependency`

: This keyword can be be assigned to a variable representing an other (previously defined) experiment to express the fact that the initial conditions for this experiment depend on the solution of the given experiment. For example if one experiment (e.g.`experiment_ss`

) defines a steady state, we can use that for the definition of the initial conditions for a subsequent experiment with`dependency=experiment_ss`

.

If additional keywords are passed, they will be forwarded to the `solve`

call from DifferentialEquations. For example, one can pass `alg=Tsit5()`

to specify what solver will be used. More information about supported arguments can be found here.

`JuliaSimModelOptimizer.DiscreteExperiment`

— Type`DiscreteExperiment(data, model, dt; kwargs...)`

Describes a experiment that is simulated in discrete time with the time increment `dt`

. This object can be initialized in the same way as an `Experiment`

object, with the only difference being that `dt`

is an additional argument here. The simulation for this experiment type corresponds to solving a `DiscreteProblem`

.

See the SciML documentation for background information on discrete time problems.

`JuliaSimModelOptimizer.SteadyStateExperiment`

— Type`SteadyStateExperiment(data, model; kwargs...)`

Describes a experiment that is ran until a steady state is reached. This object can be initialized in the same way as a `Experiment`

object, with the only difference being that `data`

needs to be a `Vector`

here. The `data`

in this case represents the values of the saved states when the system has reached its steady state. The simulation for this experiment type corresponds to solving a `SteadyStateProblem`

.

See the SciML documentation for background information on steady state problems.

## Design optimization

For design optimization problems, the `DesignConfiguration`

API can be used.

`JuliaSimModelOptimizer.DesignConfiguration`

— Function`DesignConfiguration(model; kwargs...)`

The `DesignConfiguration`

represents the target for a design optimization problem, making use of `Experiment`

internally. The dynamics of the investigated system are represented in `model`

as an ODESystem. The configuration is used within an optimization problem corresponding to a `InverseProblem`

to define the target objective that is to be achieved using the tunable parameters and initial conditions of the system.

The contribution of the `DesignConfiguration`

to the cost function is computed using the integrated running cost, expressed here by the `reduction`

and `running_cost`

keyword arguments. The `running_cost`

computes or specifies symbolically the elementwise running cost, i.e. the value of the running cost for each saved element of the solution and the `reduction`

receives those values as a single argument and returns the corresponding (integrated) value. In the symbolic form, the `running_cost`

keyword argument accepts a symbolic expression using variables and parameters corresponding to the model and is internally transformed into a function that evaluates the expression based on a given ODE solution that corresponds to a design configuration. The functional form for the `running_cost`

is a function that requires 3 arguments, the tuned values of the parameters or initial conditions (i.e. what was provided as the search space), the solution to the `ODEProblem`

corresponding to the design configuration and the last argument `data`

, which can be used to access additional information and can be provided via the `data`

keyword argument. The function is expected to return a scalar value corresponding to the loss that is associated to the design configuration defined by the tuned values passed in the first argument. If there is no easy way of expressing the loss function with `running_cost`

and `reduction`

, one can directly provide the `loss_func`

keyword argument from `Experiment`

.

The cost function corresponding to the design configuration forms the objective function for the optimization problem defined by the inverse problem. The tuned values of the parameters that are tried during the optimization are then used to solve the `ODEProblem`

corresponding to the design configuration. The solution is provided to the running cost and the available timepoints are defined by `saveat`

.

Constraints that define the design configuration can be provided using the `constraints`

keyword argument in the form of a vector of equations or inequalities. If the expression contains time dependent variables, then the expression will be automatically discretized and evaluated at `constraints_ts`

, which is by default the same as `saveat`

. If the constraints should be evaluated at different times from the running cost, such as when a denser discretization is needed around an event, the `constraints_ts`

keyword can be used to provide an arbitrary vector of timepoints to be used.

**Positional arguments**

`model`

: An`ODESystem`

describing the model that we are using for the design configuration. The model needs to define defaults for all initial conditions and all parameters.

**Keyword arguments**

`constraints`

: a vector of equations representing equality or inequality constraints.`u0`

: fix the initial conditions for the experiment (e.g.`u0 = [state_name => custom_initial_value]`

), see`Experiment`

for more details.`params`

: fix the parameters for the experiment (e.g.`params = [p1 => specific_value, ... p3 => other_value]`

), see`Experiment`

for more details.`model_transformations`

: Apply some transformations to the model used in this experiment, such as using`PredictionErrorMethod`

(e.g.`model_transformations = (PredictionErrorMethod(alpha),)`

)`callback`

: A callback or a callback set to be used during the simulation. See https://docs.sciml.ai/DiffEqDocs/stable/features/callback_functions/ for more details.`tspan`

: the timespan to use when integrating the equations.`saveat`

: the times at which the solution should be saved using interpolations. Defaults to saving where the integrator stops. This controls the timepoints when the running cost is evaluated.`constraints_ts`

: the times at which time dependent constraints should be evaluated. Defaults to`saveat`

.`running_cost`

: the contribution to the loss corresponding to this design configuration (e.g.`running_cost = (tuned_vals, sol, data) -> (sol[sys.var1] - ref_val)^2`

or running*cost = (sys.var1 - ref*val)^2).`reduction`

: this function is applied to the result of the`running_cost`

, acting like an integration, and is expected to return a scalar. By default`mean`

is used.`loss_func`

: if one wants to override the`reduction(running_cost)`

description of the loss function, a function with the`(tuned_vals, sol, data)`

signature can be passed.`prob_f_kwargs`

: A`NamedTuple`

indicating keyword arguments to be passed to the`ODEFunction`

constructor. See the ModelingToolkit docs for more details.`dependency`

: This keyword can be be assigned to a variable representing an other (previously defined) experiment to express the fact that the initial conditions for this experiment depend on the solution of the given experiment. For example if one experiment (e.g.`experiment_ss`

) defines a steady state, we can use that for the definition of the initial conditions for a subsequent experiment with`dependency=experiment_ss`

.

If additional keywords are passed, they will be forwarded to the `solve`

call. For example, one can pass `alg=Tsit5()`

to specify what solver will be used. More information about supported arguments can be found here.

## Simulation and Analysis Functions

To better understand and debug experiments, the experiments come with associated analysis functions to allow for easy investigation of the results in an experiment-by-experiment form. The following functions help the introspection of such experiments.

`JuliaSimModelOptimizer.simulate`

— Function`simulate(experiment::AbstractExperiment, prob::InverseProblem, x)`

Simulate the given `experiment`

using optimization-state point `x`

, which contains values for each parameter and initial condition that is optimized in `InverseProblem`

`prob`

.

## Loss Functions

By default, the loss function associated with a experiment against its data is the standard Euclidean distance, also known as the L2 loss. However, `JuliaSimModelOptimizer`

provides alternative loss definitions to allow for customizing the fitting strategy.

`JuliaSimModelOptimizer.squaredl2loss`

— Function`squaredl2loss(tuned_vals, sol, data)`

Sum of squared error loss:

$\sum_{i=1}^{M} \sum_{j=1}^{N} \left( \text{sol}_{i,j} - \text{data}_{i,j} \right)^2$

where N is the number of saved timepoints and M the number of measured states in the solution.

`JuliaSimModelOptimizer.l2loss`

— Function`l2loss(tuned_vals, sol, data)`

Sum of l2 loss:

$\sqrt{\sum_{i=1}^{M} \sum_{j=1}^{N} \left( \text{sol}_{i,j} - \text{data}_{i,j} \right)^2}$

where N is the number of saved timepoints and M the number of measured states in the solution.

`JuliaSimModelOptimizer.meansquaredl2loss`

— Function`meansquaredl2loss(tuned_vals, sol, data)`

Mean of squared l2 loss:

$\frac{(\sum_{i=1}^{M} \sum_{j=1}^{N} \left( \text{sol}_{i,j} - \text{data}_{i,j} \right)^2)}{N}$

where N is the number of saved timepoints and M the number of measured states in the solution.

`JuliaSimModelOptimizer.root_meansquaredl2loss`

— Function`root_meansquaredl2loss(sol, data)`

Root of mean squared l2 loss:

$\sqrt{\frac{(\sum_{i=1}^{M} \sum_{j=1}^{N} \left( \text{sol}_{i,j} - \text{data}_{i,j} \right)^2)}{N}}$

where N is the number of saved timepoints and M the number of measured states in the solution.

`JuliaSimModelOptimizer.norm_meansquaredl2loss`

— Function`norm_meansquaredl2loss(tuned_vals, sol, data)`

Normalized mean squared l2 loss:

$\frac{(\sum_{i=1}^{M} \sum_{j=1}^{N} \left( \text{sol}_{i,j} - \text{data}_{i,j} \right)^2)}{(\sum_{i=1}^{M} \sum_{j=1}^{N} \left( \text{sol}_{i,j} - mean\_sol_{i} \right)^2}$

where N is the number of saved timepoints and M the number of measured states in the solution.

`JuliaSimModelOptimizer.ARMLoss`

— Function`ARMLoss(sol, bounds)`

Allen-Rieger-Musante (ARM) loss :

$\sum_{i=1}^{M} \sum_{j=1}^{N} \text{max} \left[ \left( \text{sol}_{i,j} - \frac{\text{u}_{i,j} + \text{l}_{i,j}}{2} \right)^2 - \left( \frac{\text{u}_{i,j} - \text{l}_{i,j}}{2} \right)^2, 0 \right]$

where N is the number of saved timepoints, M the number of measured states in the solution and `l, u`

are the lower and upper bounds of each measured state respectively.

**Reference**

Allen RJ, Rieger TR, Musante CJ. Efficient Generation and Selection of Virtual Populations in Quantitative Systems Pharmacology Models. CPT Pharmacometrics Syst Pharmacol. 2016 Mar;5(3):140-6. doi: 10.1002/psp4.12063. Epub 2016 Mar 17. PMID: 27069777; PMCID: PMC4809626.