Generating surrogates
The JuliaSim software is available for preview only. Please contact us for access, by emailing info@juliacomputing.com, if you are interested in evaluating JuliaSim.
Introduction
To demonstrate how JuliaSim.jl can be used in conjunction with OrdinaryDiffEq.jl, we construct a concise example: the ROBER problem, a stiff system of three non-linear ordinary differential equations (ODEs).
First, we define the ODE in question:
using QuasiMonteCarlo, OrdinaryDiffEq, ModelingToolkit, Surrogates, JuliaSim
function rober(du, u, p, t)
y₁, y₂, y₃ = u # current state (species concentrations)
k₁, k₂, k₃ = p # rate constants
du[1] = -k₁ * y₁ + k₃ * y₂ * y₃
du[2] = k₁ * y₁ - k₂ * y₂^2 - k₃ * y₂ * y₃
du[3] = k₂ * y₂^2
nothing
end
tstop = 1e4
p = [0.04, 3e7, 1e4]
u0 = [1.0, 0.0, 0.0] # initial condition
tspan = (0.0, tstop)
prob = ODEProblem(rober, u0, tspan, p)
sol = solve(prob, Rosenbrock23())
This creates an ODEProblem, which can be simulated for any given set of parameters. Now we need to specify the space of parameters over which we want to train our surrogate:
param_space = [(0.036, 0.044), (2.7e7, 3.3e7), (0.9e4, 1.1e4)]
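JuliaSim performs this sampling internally during training, but it can be helpful to see what a design over param_space looks like. The sketch below is purely illustrative (surrogatize does not require it) and uses QuasiMonteCarlo.jl, which is already loaded above; the variable names are our own:

lb = first.(param_space)   # lower bounds: [0.036, 2.7e7, 0.9e4]
ub = last.(param_space)    # upper bounds: [0.044, 3.3e7, 1.1e4]
# draw 100 space-filling parameter sets; the result is a 3×100 matrix, one column per set
design = QuasiMonteCarlo.sample(100, lb, ub, LatinHypercubeSample())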
Next, we specify the surrogate algorithm and cast the ODEProblem to a DEProblemSimulation:
surralg = LPCTESN(1000)
sim = DEProblemSimulation(prob)
The abbreviation LPCTESN stands for 'Linear Projection Continuous-Time Echo State Network' (consult this paper for more information). The integer 1000 specifies the reservoir size, which must always be provided by the user.
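The reservoir size is the main cost/accuracy knob of the LPCTESN: as discussed in 'Reducing Surrogate Training Time' below, training time grows linearly with it. As a purely illustrative variation, a smaller reservoir could be requested with:

surralg_small = LPCTESN(500)   # smaller reservoir: cheaper to train, typically less expressive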
Finally, we call surrogatize and simulate_ensemble as follows:
odesurrogate = JuliaSim.surrogatize(
sim,
param_space,
surralg,
100; # 'n_sample_pts'
ensemble_kwargs = (;),
component_name = :robertson_surrogate,
verbose = true)
odesimresults = JuliaSim.simulate_ensemble(sim, param_space, 100)
Note that the number of parameter sets to be sampled from the parameter space (n_sample_pts) has to be provided by the user.
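Conceptually, simulate_ensemble runs the full-order model over parameter sets drawn from param_space. For reference, a comparable ensemble of full solves can be assembled directly with OrdinaryDiffEq; the sketch below only illustrates the idea and does not reflect JuliaSim's internal sampling or bookkeeping:

# draw 100 illustrative parameter sets and solve the full ODE for each of them
lb, ub = first.(param_space), last.(param_space)
ps = QuasiMonteCarlo.sample(100, lb, ub, LatinHypercubeSample())   # 3×100 matrix
eprob = EnsembleProblem(prob; prob_func = (prob, i, repeat) -> remake(prob, p = ps[:, i]))
esol = solve(eprob, Rosenbrock23(), EnsembleThreads(); trajectories = 100)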
As this example shows, JuliaSim is an intuitive tool that composes well with the SciML ecosystem (and beyond). It should be emphasized that this simple example does not fully reflect JuliaSim's capabilities, which the preceding sections describe in more detail.
Plotting functions
Numerical results would be incomplete without proper visualization and summaries. JuliaSim provides plotting functionality that makes analysis of the results easier. In its simplest form, the user may simply call:
plot(odesimresults; ns = 1:2, output_dim = 2) # ns stands for 'samples to be plotted'
to obtain a visualization of the ROBER problem (see above).
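Assuming the plot recipe returns a standard Plots.jl plot object (an assumption on our part; JuliaSim's plotting backend is not spelled out here), the figure can be saved to disk in the usual way:

using Plots                       # assumption: the recipe above builds on Plots.jl
savefig("rober_ensemble.png")     # save the most recently displayed plot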
The user may also wish to inspect the training statistics of the CTESN:
plot(odetrainstats, odesurrogate; ns = 1:2, output_dim = 2, log_scale = true)
Below is a sample plot:
Finally, a comprehensive diagnostic report can be obtained by calling the weave_diagnostics function:
JuliaSim.weave_diagnostics(
    odesurrogate,
    "ROBER Diagnostic Report",              # title of the report
    "rober_report.pdf";                     # path where the generated report is saved
    doctype = "md2pdf",                     # any Weave output format that starts from markdown works
    log_scale = false,
    include_errors = [:pointwise_error, :curve_distance],
    use_absolute_err = false,
    visualize_n_samples = 1
)
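Because the report is generated from a markdown intermediate (see the doctype comment above), other Weave.jl output targets should work as well; for example, an HTML report could presumably be produced with:

JuliaSim.weave_diagnostics(
    odesurrogate,
    "ROBER Diagnostic Report",
    "rober_report.html";
    doctype = "md2html")   # HTML output instead of PDF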
Here is a part of a sample diagnostic report of the ROBER problem:
Reducing Surrogate Training Time
Matrix linear solve operations make up the bulk of a CTESN's training time. When training takes too long, the user can identify which training hyperparameter is responsible and adjust it accordingly.
There are four types of linear solve operations (with varying L) observed in CTESNs:

- A[L, L] \ B[L, 500]
- A[500, 500] \ B[500, L]
- A[500, L] \ B[500, 500]
- A[500, 500] \ B[L, 500]
The figure below shows how each type scales in time as the matrix size increases along different dimensions. In practice, A[L, L] \ B[L, 500] scales quadratically, whereas the rest of the types scale linearly; we can treat this as the average-case time complexity.
Average time taken for the linear solve operation (\) when run repeatedly on a single-node JuliaHub instance with 32 vCPUs, for increasing matrix sizes. The fixed size 500 is an arbitrary control used to study how compute time scales with matrix size along different dimensions. A[L, L] \ B[L, 500] scales quadratically, whereas the rest of the types scale linearly.
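A timing experiment of this kind is easy to reproduce. The sketch below is our own (not the benchmark behind the figure) and times the first operation type for a few values of L with BenchmarkTools.jl:

using BenchmarkTools

# time A[L, L] \ B[L, 500] for growing L; the other three types can be timed analogously
for L in (500, 1000, 2000)
    A = rand(L, L)
    B = rand(L, 500)
    t = @belapsed $A \ $B          # minimum elapsed time in seconds
    println("L = $L: $(round(t, digits = 3)) s")
end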
NPCTESNs
With NPCTESNs (non-linear projection CTESNs), we see two main types of linear solve operations:

- Op 1: A[n_time_pts, n_time_pts] \ B[n_time_pts, n_states], which occurs n_sample_pts times.
- Op 2: A[n_sample_pts+1, n_sample_pts+1] \ B[n_sample_pts+1, n_time_pts*n_states], which occurs only once.
This implies that the average-case time complexity can be written as:
\[\begin{equation*}
\begin{split}
\Theta &= \text{n\_sample\_pts} \times time(\text{Op 1}) + time(\text{Op 2})\\
&= \text{n\_sample\_pts} \times \text{n\_time\_pts}^2 \times \text{n\_states} + (\text{n\_sample\_pts}+1)^2 \times \text{n\_time\_pts} \times \text{n\_states}
\end{split}
\end{equation*}\]
To summarize, the NPCTESN surrogate training time varies:

- quadratically w.r.t. the number of time points in the samples (n_time_pts)
- quadratically w.r.t. the number of sample points (n_sample_pts)
- linearly w.r.t. the number of states (n_states)
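As a back-of-the-envelope check of this scaling, the expression above can be evaluated directly. The helper below is purely illustrative (proportionality constants are omitted, so the results are relative costs, not seconds), and the problem sizes are hypothetical:

# relative (unitless) NPCTESN training cost from the average-case expression above
theta_npctesn(n_sample_pts, n_time_pts, n_states) =
    n_sample_pts * n_time_pts^2 * n_states + (n_sample_pts + 1)^2 * n_time_pts * n_states

theta_npctesn(100, 1_000, 3)   # ≈ 3.3e8
theta_npctesn(100, 2_000, 3)   # ≈ 1.26e9, roughly 4× — quadratic in n_time_pts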
LPCTESNs
With LPCTESNs, the two main types of linear solve operations are:

- Op 1: A[n_time_pts, reservoir_size] \ B[n_time_pts, n_states], which occurs n_sample_pts times.
- Op 2: A[n_sample_pts+1, n_sample_pts+1] \ B[n_sample_pts+1, reservoir_size*n_states], which occurs once.

Correspondingly, the average-case time complexity can be written as:
\[\begin{equation*}
\begin{split}
\Theta &= \text{n\_sample\_pts} \times time(\text{Op 1}) + time(\text{Op 2})\\
&= \text{n\_sample\_pts} \times \text{n\_time\_pts} \times \text{reservoir\_size} \times \text{n\_states} + (\text{n\_sample\_pts}+1)^2 \times \text{reservoir\_size} \times \text{n\_states}
\end{split}
\end{equation*}\]
To summarize, the LPCTESN surrogate training time varies:

- linearly w.r.t. the number of time points in the samples (n_time_pts)
- quadratically w.r.t. the number of sample points (n_sample_pts)
- linearly w.r.t. the number of states (n_states)
- linearly w.r.t. the reservoir size (reservoir_size)
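The same kind of illustrative estimate for the LPCTESN (again relative costs only, with hypothetical sizes) shows how the reservoir removes the quadratic dependence on n_time_pts:

# relative (unitless) LPCTESN training cost from the average-case expression above
theta_lpctesn(n_sample_pts, n_time_pts, n_states, reservoir_size) =
    n_sample_pts * n_time_pts * reservoir_size * n_states +
    (n_sample_pts + 1)^2 * reservoir_size * n_states

theta_lpctesn(100, 1_000, 3, 1_000)   # ≈ 3.3e8
theta_lpctesn(100, 2_000, 3, 1_000)   # ≈ 6.3e8, only about 2× — linear in n_time_pts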