Model Autocomplete
In this tutorial, we show how to use PumasQSP to extend a partial model via the autocomplete functionality and find the missing physics.
Motivation
When calibrating models to data, we often face uncertainty not only about the parameter values but also about the structure of the model itself: no matter how well we tune the parameters, the fit may remain poor, which suggests that the equations of the model need to be adjusted. Universal Differential Equations (UDEs) address this issue. Starting from a partial model, they find a minimal mechanistic extension that provides a better fit.
The model we consider in this tutorial is the Lotka-Volterra model. Its equations are given by:
\[\begin{aligned} \frac{dx}{dt} &= \alpha x - \beta x y \\ \frac{dy}{dt} &= -\delta y + \gamma x y \\ \end{aligned}\]
It is a model of rabbits and wolves. $\alpha x$ is the exponential growth of the rabbit population in isolation, $-\beta x y$ and $\gamma x y$ are the interaction effects of wolves eating rabbits, and $-\delta y$ describes how the wolves starve in isolation.
Now assume that we have never seen rabbits and wolves in the same room. We only know the two isolated effects $\alpha x$ and $-\delta y$. Can we use Scientific Machine Learning to automatically discover an extension to what we already know? That is exactly what we will do with a universal differential equation.
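Concretely, the universal differential equation keeps the known terms and lets a neural network $U_\theta$, taking the two states as inputs and returning two outputs, stand in for the missing interaction terms (a sketch of the structure, not the exact formulation PumasQSP uses internally):
\[\begin{aligned} \frac{dx}{dt} &= \alpha x + U_{\theta,1}(x, y) \\ \frac{dy}{dt} &= -\delta y + U_{\theta,2}(x, y) \\ \end{aligned}\]
Training the network against data and then replacing it with symbolic terms recovers the missing mechanism, which is what the two steps below carry out.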
Julia environment
For this tutorial we will need the following packages:
Module | Description |
---|---|
PumasQSP | PumasQSP is used to formulate our problem and perform automated model discovery |
ModelingToolkit | The symbolic modeling environment |
OrdinaryDiffEq (DifferentialEquations.jl) | The numerical differential equation solvers |
DataSets | We will load our experimental data from datasets on JuliaHub |
CSV | Handling delimited text data |
DataFrames | Handling tabular data manipulation |
OptimizationOptimisers | The optimization solver package with Adam |
OptimizationOptimJL | The optimization solver package with LBFGS |
DataDrivenDiffEq | The symbolic regression interface |
DataDrivenSparse | The sparse regression symbolic regression solvers |
StableRNGs | Stable random seeding |
Plots | The plotting and visualization library |
Besides that we'll also need the following Julia standard libraries:
Module | Description |
---|---|
LinearAlgebra | Required for the norm function |
Statistics | Required for the mean function |
First, we prepare the environment by loading the packages we use within this example.
using PumasQSP
using ModelingToolkit
using OrdinaryDiffEq
using DataSets
using CSV
using DataFrames
using OptimizationOptimisers: Adam
using OptimizationOptimJL: LBFGS
using DataDrivenDiffEq
using DataDrivenSparse
using StableRNGs
using LinearAlgebra
using Statistics
using Plots
Plots.GRBackend()
Model setup
First, we define the incomplete model using ModelingToolkit.
# Model states and their default values
@variables begin
    t
    x(t) = 5.0
    y(t) = 5.0
end

# Model parameters and their default values
@parameters begin
    α = 1.3
    δ = 1.8
end

# Differential operator
D = Differential(t)

# Differential equations
eqs = [
    D(x) ~ α * x,
    D(y) ~ -δ * y,
]

# Model definition
@named incomplete_model = ODESystem(eqs)
\[ \begin{align} \frac{\mathrm{d} x\left( t \right)}{\mathrm{d}t} =& \alpha x\left( t \right) \\ \frac{\mathrm{d} y\left( t \right)}{\mathrm{d}t} =& - \delta y\left( t \right) \end{align} \]
Second, we read in the training data, which corresponds to the output of the correct model plus some small, normally distributed noise.
# Load the data
training_dataset = dataset("pumasqsptutorials/lotka_data")
data = open(IO, training_dataset) do io
    CSV.read(io, DataFrame)
end
Row | timestamp (Float64) | x(t) (Float64) | y(t) (Float64)
---|---|---|---
1 | 0.0 | 3.14639 | 1.54233 |
2 | 0.25 | 2.97895 | 1.81893 |
3 | 0.5 | 2.66341 | 2.04754 |
4 | 0.75 | 2.31933 | 2.14718 |
5 | 1.0 | 1.96996 | 2.09154 |
6 | 1.25 | 1.72233 | 1.93848 |
7 | 1.5 | 1.58525 | 1.71391 |
8 | 1.75 | 1.52222 | 1.50399 |
9 | 2.0 | 1.56685 | 1.29362 |
10 | 2.25 | 1.63834 | 1.14605 |
11 | 2.5 | 1.76952 | 1.00918 |
12 | 2.75 | 1.95116 | 0.932905 |
13 | 3.0 | 2.22475 | 0.914375 |
14 | 3.25 | 2.49791 | 0.941644 |
15 | 3.5 | 2.77768 | 1.00284 |
16 | 3.75 | 3.02139 | 1.14662 |
17 | 4.0 | 3.14939 | 1.37638 |
18 | 4.25 | 3.08999 | 1.6282 |
19 | 4.5 | 2.90096 | 1.90895 |
20 | 4.75 | 2.56037 | 2.08787 |
21 | 5.0 | 2.17786 | 2.14388 |
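For context, a dataset of this form can be produced by simulating the full Lotka-Volterra model and adding a small amount of normally distributed noise. The sketch below is purely illustrative and reuses the symbols defined above; the interaction parameters β and γ, the initial conditions, and the noise level are assumptions, not the values behind the dataset we just loaded.
# Illustrative only: β, γ, the initial conditions and the noise level are assumptions
@parameters begin
    β = 0.9
    γ = 0.8
end
full_eqs = [
    D(x) ~ α * x - β * x * y,
    D(y) ~ -δ * y + γ * x * y,
]
@named full_model = ODESystem(full_eqs)
full_prob = ODEProblem(full_model, [x => 3.1, y => 1.5], (0.0, 5.0))
true_sol = Array(solve(full_prob, Vern7(), saveat = 0.25))
noisy_data = true_sol .+ 0.05 .* randn(StableRNG(1234), size(true_sol)...)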
Lastly, we bring model and data together in a `Trial` object and build the `InverseProblem` with the neural component via the `neural_network` keyword argument.
# Trial definition
trial = Trial(training_dataset, incomplete_model;
    u0 = [x => data."x(t)"[1], y => data."y(t)"[1]],
    alg = Vern7(),
    abstol = 1e-6,
    reltol = 1e-6
)

# Construct inverse problem
prob = InverseProblem(trial, incomplete_model, [];
    neural_network = multi_layer_feed_forward(2, 2),
    nn_rng = StableRNG(1111))
InverseProblem for incomplete_model with one trial optimizing 0 parameters.
The `neural_network` keyword argument accepts a Lux.jl model. For ease of use, PumasQSP provides functions that build such models easily, such as `multi_layer_feed_forward`. As our model has two states, both the input and the output of the neural network are two-dimensional.
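For intuition, a small fully connected Lux.jl network of this shape could be written by hand as follows; the hidden width and the activation function are illustrative assumptions, not necessarily the defaults that `multi_layer_feed_forward` uses.
using Lux

# Illustrative sketch of a 2-in / 2-out feed-forward network;
# the hidden width (16) and the tanh activation are assumptions.
nn_sketch = Lux.Chain(
    Lux.Dense(2, 16, tanh),   # two model states in
    Lux.Dense(16, 16, tanh),
    Lux.Dense(16, 2)          # two correction terms out
)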
Step 1: Training the neural network
The training of the neural network is done by calibrating the inverse problem in two stages: we forward the result of a first calibration using the `Adam()` optimizer to a second calibration call that uses the `LBFGS()` optimizer.
res1 = calibrate(prob, SingleShooting(; maxiters = 5000, optimizer = Adam()))
res2 = calibrate(res1, SingleShooting(; maxiters = 1000, optimizer = LBFGS()))
Calibration result computed in 2 minutes and 45 seconds. Final objective value: 0.00271634.
Optimization ended with Failure Return Code.
We visualize the convergence of the two optimizers.
## pl_losses
pl_losses = convergenceplot([res1, res2], yscale = :log10, lw = 2.5)
We also plot the learnt model against the training data to assess the accuracy of the neural network.
## pl_trajectory
pl_trajectory = plot(trial, prob, res2, show_data = true, legend = :bottomleft)
Step 2: Learning a symbolic representation from the neural network
We can now use the training results to autocomplete our model with symbolic terms that match the data. This can be done with the `autocomplete` function, which is compatible with DataDrivenDiffEq algorithms. For this example we will use the `ADMM` algorithm from DataDrivenSparse.jl and we will assume that our missing terms come from a polynomial basis of degree 4, as illustrated below. In order to improve the accuracy of the recovered symbolic terms, we process the neural network predictions in batches and normalize the results using the Z-score transformation.
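For reference, a polynomial basis of degree 4 in the two states amounts to the set of monomials in $x$ and $y$ of total degree at most 4, from which the sparse regression then selects a handful of terms:
\[ \left\{ 1,\; x,\; y,\; x^2,\; x y,\; y^2,\; x^3,\; x^2 y,\; x y^2,\; y^3,\; x^4,\; x^3 y,\; x^2 y^2,\; x y^3,\; y^4 \right\} \]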
λ = exp10.(-3:0.01:3)
alg = ADMM(λ)
data_processing = DataProcessing(split = 0.9, batchsize = 30, shuffle = true)
res = autocomplete(res2, alg;
    basis_generator = (polynomial_basis, 4),
    digits = 1,
    normalize = DataNormalization(ZScoreTransform),
    data_processing)
Calibration result computed in 21 seconds and 950 milliseconds. Final objective value: 0.0032404.
Optimization ended with Success Return Code and returned.
┌─────────────┬─────────────┬───────────┬───────────┬────────────┬──────────┐
│ p₁ │ p₂ │ p₃ │ p₄ │ p₅ │ p₆ │
├─────────────┼─────────────┼───────────┼───────────┼────────────┼──────────┤
│ 0.000761669 │ -0.00695412 │ -0.896389 │ 0.0065833 │ 0.00210341 │ 0.798034 │
└─────────────┴─────────────┴───────────┴───────────┴────────────┴──────────┘
The `autocomplete` function returns a calibration result, as the new model parameters are calibrated to best match the data. We can extract the autocompleted model using `get_model` and retrieve the learnt equations.
learnt_model = get_model(res)
learnt_equations = equations(learnt_model)
\[ \begin{align} \frac{\mathrm{d} x\left( t \right)}{\mathrm{d}t} =& p_1 + \alpha x\left( t \right) + p_2 y\left( t \right) + p_3 x\left( t \right) y\left( t \right) \\ \frac{\mathrm{d} y\left( t \right)}{\mathrm{d}t} =& p_4 + p_5 y\left( t \right) - \delta y\left( t \right) + p_6 x\left( t \right) y\left( t \right) \end{align} \]
We can also use this new model to make predictions outside of the training data by solving for a larger timespan.
# Retrieve the trial from the autocomplete result and simulate the learnt model over a longer timespan
learnt_trial = only(get_trials(res))
plot(learnt_trial, res, tspan = (0, 50.), saveat = 0.1, show_data = true, legend = :topright, lw = 2.5)
1. Derived from ["Automatically Discover Missing Physics by Embedding Machine Learning into Differential Equations"](https://docs.sciml.ai/Overview/stable/showcase/missingphysics/) from SciML: Differentiable Modeling and Simulation Combined with Machine Learning, MIT licensed; see the repository for details.