Generating surrogates

Info

The JuliaSim software is available for preview only. If you are interested in evaluating JuliaSim, please contact us for access at info@juliacomputing.com.

Introduction

To demonstrate how JuliaSim.jl can be used in conjunction with OrdinaryDiffEq.jl, we construct a concise example: the so-called ROBER problem, a stiff system of three non-linear ordinary differential equations (ODEs).
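In the notation used in the code below, the system reads:

\[\begin{equation*} \begin{split} \frac{dy_1}{dt} &= -k_1 y_1 + k_3 y_2 y_3\\ \frac{dy_2}{dt} &= k_1 y_1 - k_2 y_2^2 - k_3 y_2 y_3\\ \frac{dy_3}{dt} &= k_2 y_2^2 \end{split} \end{equation*}\]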

First, we define the ODE in question:

using QuasiMonteCarlo, OrdinaryDiffEq, ModelingToolkit, Surrogates, JuliaSim
function rober(du, u, p, t)
    y₁, y₂, y₃ = u # state variables
    k₁, k₂, k₃ = p # rate constants
    du[1] = -k₁ * y₁ + k₃ * y₂ * y₃
    du[2] =  k₁ * y₁ - k₂ * y₂^2 - k₃ * y₂ * y₃
    du[3] =  k₂ * y₂^2
    nothing
end

tstop = 1e4                  # final simulation time
p = [0.04, 3e7, 1e4]         # nominal rate constants k₁, k₂, k₃
u0 = [1.0, 0.0, 0.0]         # initial condition
tspan = (0.0, tstop)

prob = ODEProblem(rober, u0, tspan, p)
sol = solve(prob, Rosenbrock23())

This creates an ODEProblem, which can be simulated for a given set of parameters. Now we need to specify the space of parameters over which we want to train our surrogate:

param_space = [(0.036, 0.044), (2.7e7, 3.3e7), (0.9e4, 1.1e4)] # ±10% around the nominal p
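Each tuple gives the (lower, upper) bounds for the corresponding entry of p. As a quick sanity check, the problem can be re-simulated at any point in this space using remake, a standard SciML utility (a sketch, not a JuliaSim API):

# Sketch: simulate the ODE at one corner of the parameter space.
prob_hi = remake(prob; p = [0.044, 3.3e7, 1.1e4])
sol_hi = solve(prob_hi, Rosenbrock23())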

Next, we specify the surrogate algorithm and cast the ODEProblem to DEProblemSimulation:

surralg = LPCTESN(1000)
sim = DEProblemSimulation(prob)

The abbreviation LPCTESN stands for 'Linear Projection Continuous-Time Echo State Network' (consult this paper for more information). The integer argument (here 1000) specifies the reservoir size, which must always be provided by the user.
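Reservoir size is a tunable hyperparameter: as discussed under 'Reducing Surrogate Training Time' below, LPCTESN training time grows linearly with it, while a larger reservoir typically yields a more expressive surrogate. For example:

surralg_small = LPCTESN(500)   # smaller reservoir: faster training
surralg_large = LPCTESN(2000)  # larger reservoir: slower training, typically more expressive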

Finally, we call surrogatize and simulate_ensemble as follows:

odesurrogate = JuliaSim.surrogatize(
    sim,
    param_space,
    surralg,
    100;                    # n_sample_pts: number of parameter sets to sample
    ensemble_kwargs = (;),
    component_name = :robertson_surrogate,
    verbose = true)
odesimresults = JuliaSim.simulate_ensemble(sim, param_space, 100)

Note that the number of parameter sets to be sampled from the parameter space (n_sample_pts) must be provided by the user.
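JuliaSim draws these parameter sets internally. For intuition, a comparable quasi-random design can be generated by hand with QuasiMonteCarlo (imported above); this sketch is not necessarily JuliaSim's exact sampling scheme:

# Sketch: draw 100 quasi-random parameter sets from param_space.
lb = first.(param_space)  # lower bounds: [0.036, 2.7e7, 0.9e4]
ub = last.(param_space)   # upper bounds: [0.044, 3.3e7, 1.1e4]
samples = QuasiMonteCarlo.sample(100, lb, ub, SobolSample())  # 3×100 matrix, one column per parameter set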

As this example shows, JuliaSim is an intuitive tool that composes well with the SciML ecosystem (and beyond). It should be emphasized that this simple example does not fully reflect JuliaSim's capabilities.

Plotting functions

Numerical results would be incomplete without proper visualization and summaries. JuliaSim provides plotting functionality to ease analysis of the results. In its simplest form, the user may simply call:

plot(odesimresults; ns = 1:2, output_dim = 2) # ns selects the samples to plot

to obtain a visualization of the ROBER problem defined above.

The user may also wish to inspect the training statistics of the CTESN:

plot(odetrainstats, odesurrogate; ns = 1:2, output_dim = 2, log_scale = true)

Below is a sample plot:

[Figure: training plot]

Finally, a comprehensive diagnostic report can be obtained by calling the weave_diagnostics function:

JuliaSim.weave_diagnostics(
    odesurrogate,
    "DCPM Temperature Diagnostic Report",  # heading of the report
    "output_fmu.pdf";                      # path to save the generated report
    doctype = "md2pdf",                    # any output format from Markdown works
    log_scale = false,
    include_errors = [:pointwise_error, :curve_distance],
    use_absolute_err = false,
    visualize_n_samples = 1
)

Here is a part of a sample diagnostic report of the ROBER problem:

[Figure: ROBER diagnostic report]

[Figure: ROBER plot]

Reducing Surrogate Training Time

Matrix linear solve operations make up the bulk of a CTESN's training time. When training takes too long, the user can infer which training hyperparameter is causing the delay and adjust it accordingly.

There are four types of linear solve operations (with varying L) that we observe in CTESNs:

  1. A[L, L] \ B[L, 500]
  2. A[500, 500] \ B[500, L]
  3. A[500, L] \ B[500, 500]
  4. A[500, 500] \ B[L, 500]

The figure below shows how each type scales in time as the matrix size grows along different dimensions. We see that, in practice, A[L, L] \ B[L, 500] scales quadratically in L, whereas the remaining types scale linearly; we take this as the average-case time complexity.

[Figure: linear solve scaling]

Average time taken for the linear solve operation (\) when run repeatedly on a single-node JuliaHub instance with 32 vCPUs, with increasing matrix sizes. 500 is an arbitrary control used to study how compute time scales as the matrix size grows along different dimensions. A[L, L] \ B[L, 500] scales quadratically, whereas the remaining types scale linearly.
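These measurements can be reproduced outside JuliaSim with a few lines of Julia. The sketch below times the first two operation types (exact numbers depend on hardware and BLAS threading; the first iteration also includes compilation time):

using LinearAlgebra

for L in (500, 1000, 2000, 4000)
    A1, B1 = randn(L, L), randn(L, 500)      # type 1: A[L, L] \ B[L, 500]
    A2, B2 = randn(500, 500), randn(500, L)  # type 2: A[500, 500] \ B[500, L]
    t1 = @elapsed A1 \ B1
    t2 = @elapsed A2 \ B2
    println("L = $L  type 1: $(round(t1; digits = 3)) s  type 2: $(round(t2; digits = 3)) s")
end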

NPCTESNs

With NPCTESNs, we see two main types of linear solve operations:

  • Op 1: A[n_time_pts, n_time_pts] \ B[n_time_pts, n_states], which occurs n_sample_pts times.
  • Op 2: A[n_sample_pts+1, n_sample_pts+1] \ B[n_sample_pts+1, n_time_pts*n_states], which occurs only once.

This implies that the average-case time complexity can be written as:

\[\begin{equation*} \begin{split} \Theta &= \text{n\_sample\_pts} \times time(\text{Op 1}) + time(\text{Op 2})\\ &= \text{n\_sample\_pts} \times \text{n\_time\_pts}^2 \times \text{n\_states} + (\text{n\_sample\_pts}+1)^2 \times \text{n\_time\_pts} \times \text{n\_states} \end{split} \end{equation*}\]

To summarize, the NPCTESN surrogate training time varies:

  • quadratically w.r.t. number of time points in the samples (n_time_pts)
  • quadratically w.r.t. number of sample points (n_sample_pts)
  • linearly w.r.t. number of states (n_states)
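A back-of-the-envelope cost model (hypothetical, for intuition only; not a JuliaSim API) makes these scalings concrete:

# Illustrative relative-cost model for NPCTESN training (arbitrary units).
npctesn_cost(n_sample_pts, n_time_pts, n_states) =
    n_sample_pts * n_time_pts^2 * n_states +      # Op 1, performed n_sample_pts times
    (n_sample_pts + 1)^2 * n_time_pts * n_states  # Op 2, performed once

# Doubling n_time_pts roughly quadruples the dominant Op 1 term:
npctesn_cost(100, 2000, 3) / npctesn_cost(100, 1000, 3)  # ≈ 3.8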

LPCTESNs

With LPCTESNs, we see two main types of linear solve operations:

  • Op 1: A[n_time_pts, reservoir_size] \ B[n_time_pts, n_states], which occurs n_sample_pts times.
  • Op 2: A[n_sample_pts+1, n_sample_pts+1] \ B[n_sample_pts+1, reservoir_size*n_states], which occurs only once.

\[\begin{equation*} \begin{split} \Theta &= \text{n\_sample\_pts} \times time(\text{Op 1}) + time(\text{Op 2})\\ &= \text{n\_sample\_pts} \times \text{n\_time\_pts} \times \text{reservoir\_size} \times \text{n\_states} + (\text{n\_sample\_pts}+1)^2 \times \text{reservoir\_size} \times \text{n\_states} \end{split} \end{equation*}\]

To summarize, the LPCTESN surrogate training time varies:

  • linearly w.r.t. number of time points in the samples (n_time_pts)
  • quadratically w.r.t. number of sample points (n_sample_pts)
  • linearly w.r.t. number of states (n_states)
  • linearly w.r.t. reservoir size (reservoir_size)
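The analogous hypothetical cost model for LPCTESNs:

# Illustrative relative-cost model for LPCTESN training (arbitrary units).
lpctesn_cost(n_sample_pts, n_time_pts, n_states, reservoir_size) =
    n_sample_pts * n_time_pts * reservoir_size * n_states +  # Op 1
    (n_sample_pts + 1)^2 * reservoir_size * n_states         # Op 2

# Doubling the reservoir size exactly doubles both terms (linear scaling):
lpctesn_cost(100, 1000, 3, 2000) / lpctesn_cost(100, 1000, 3, 1000)  # == 2.0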