Launching Batch Jobs in JuliaHub
JuliaSimModelOptimizer provides an interface for launching batch jobs on JuliaHub using JuliaHub.jl. This is particularly useful for scaling compute to multiple threads or processes, for example with `parametric_uq` or `MultipleShooting`, which can be massively parallelized. In this tutorial we walk through the process of launching these batch jobs, using the same example as in the Getting Started page, i.e. the De Sauty bridge.
Julia Environment
For this tutorial we will need the following packages:
| Module | Description |
| --- | --- |
| JuliaSimModelOptimizer | The high-level library used to formulate our problem and perform automated model discovery |
| ModelingToolkit | The symbolic modeling environment |
| ModelingToolkitStandardLibrary | Library for using standard modeling components |
| OrdinaryDiffEq | The numerical differential equation solvers |
| DataSets | We will load our experimental data from datasets on JuliaHub |
| Plots | The plotting and visualization library |
| JuliaHub | Package for a programmatic Julia interface to the JuliaHub platform |
| Serialization | Julia standard library for serializing and deserializing files |
```julia
using JuliaSimModelOptimizer
using ModelingToolkit
using ModelingToolkit: t_nounits as t, D_nounits as D
using ModelingToolkitStandardLibrary.Electrical
using ModelingToolkitStandardLibrary.Blocks
using OrdinaryDiffEq
using DataSets
using Plots
using JuliaHub
using Serialization
```
Before we launch any job, we need to authenticate to JuliaHub, which can be done using `JuliaHub.authenticate`. We will use this `auth` object when submitting jobs.
```julia
auth = JuliaHub.authenticate()
```
Data Setup
We can read tabular experimental data whose column names match the names of the model variables.
```julia
data = dataset("juliasimtutorials/circuit_data")
```
name = "juliasimtutorials/circuit_data"
uuid = "56ac293a-9ebf-401c-b8a9-57d986fb2747"
tags = []
description = "Data for the Model Optimizer getting started tutorial."
type = "Blob"
[storage]
driver = "JuliaHubDataRepo"
bucket_region = "us-east-1"
bucket = "internaljuilahubdatasets"
credentials_url = "https://internal.juliahub.com/datasets/56ac293a-9ebf-401c-b8a9-57d986fb2747/credentials"
prefix = "datasets"
vendor = "aws"
type = "Blob"
[storage.juliahub_credentials]
auth_toml_path = "/home/github_actions/actions-runner-1/home/.julia/servers/internal.juliahub.com/auth.toml"
[[storage.versions]]
blobstore_path = "u1"
date = "2023-06-10T23:53:27.845+00:00"
size = 2768
version = 1
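Before setting up the model, we can optionally take a quick look at the raw data. The following is a minimal sketch, assuming the dataset is stored as a CSV blob and that CSV.jl and DataFrames.jl are available in the environment; it is not required for the rest of the tutorial.

```julia
# Optional sanity check (sketch): read the blob as CSV and inspect the columns.
# Assumes the dataset is a CSV file and that CSV/DataFrames are installed.
using CSV, DataFrames

df = open(IO, data) do io
    CSV.read(io, DataFrame)
end
first(df, 5)  # peek at the first rows; column names should match the model variables
```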
Model Setup
We will now define the De Sauty circuit model.
```julia
function create_model(; C₁ = 3e-5, C₂ = 1e-6)
    # `t` is the shared independent variable imported above
    # (ModelingToolkit.t_nounits), so we do not redefine it here.
    @named resistor1 = Resistor(R = 5.0)
    @named resistor2 = Resistor(R = 2.0)
    @named capacitor1 = Capacitor(C = C₁)
    @named capacitor2 = Capacitor(C = C₂)
    @named source = Voltage()
    @named input_signal = Sine(frequency = 100.0)
    @named ground = Ground()
    @named ampermeter = CurrentSensor()

    eqs = [connect(input_signal.output, source.V)
           connect(source.p, capacitor1.n, capacitor2.n)
           connect(source.n, resistor1.p, resistor2.p, ground.g)
           connect(resistor1.n, capacitor1.p, ampermeter.n)
           connect(resistor2.n, capacitor2.p, ampermeter.p)]

    @named circuit_model = ODESystem(eqs, t,
        systems = [
            resistor1, resistor2, capacitor1, capacitor2,
            source, input_signal, ground, ampermeter,
        ])
end

model = create_model()
sys = structural_simplify(model)
```
\[
\begin{align}
\frac{\mathrm{d}\, capacitor2_{+}v\left( t \right)}{\mathrm{d}t} &= capacitor1_{+}vˍt\left( t \right) \\
0 &= - resistor2_{+}i\left( t \right) + capacitor2_{+}i\left( t \right) - resistor1_{+}i\left( t \right) + capacitor1_{+}i\left( t \right)
\end{align}
\]
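Before calibrating, it can be useful to check that the nominal model simulates cleanly over the data's time span. The following is a minimal sketch, assuming zero initial conditions (the same values used for the experiment below); it is not required for the workflow.

```julia
# Optional check (sketch): simulate the nominal circuit over the experiment's
# timespan with zero initial conditions and plot the measured current.
odeprob = ODEProblem(sys, [sys.capacitor2.v => 0.0, sys.capacitor1.i => 0.0], (0.0, 0.1))
sol = solve(odeprob, Rodas4())
plot(sol, idxs = sys.ampermeter.i)
```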
Defining Experiment and InverseProblem
In order to create an `Experiment`, we will use the default initial values of the states and parameters of our model. These serve as the initial guesses that the inverse problem will optimize in order to fit the given data.
```julia
@unpack capacitor1, capacitor2 = model

experiment = Experiment(data, sys;
    initial_conditions = [capacitor2.v => 0.0, capacitor1.i => 0.0],
    alg = Rodas4(), abstol = 1e-6, reltol = 1e-5)
```
```
Experiment for circuit_model with no fixed parameters or initial conditions.
The simulation of this experiment is given by:
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
timespan: (0.0, 0.1)
```
Once we have created the experiment, the next step is to create an `InverseProblem`. The inverse problem requires us to provide the search space as a vector of pairs, mapping each parameter we want to recover to the bounds we assume for it. Here we are trying to calibrate the value of `capacitor2.C`, the capacitance of `capacitor2`.
```julia
prob = InverseProblem(experiment, [capacitor2.C => (1.e-7, 1e-3)])
```
```
InverseProblem with one experiment with 1 elements in the search space.
```
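As a quick sanity check of the initial guess, one can simulate the experiment under the inverse problem's defaults and inspect the result. This is a sketch, assuming a `simulate(experiment, prob)` method that runs the experiment with the problem's initial parameter values.

```julia
# Sketch: simulate the experiment with the initial parameter values.
# Assumes `simulate(experiment, prob)` returns an ODE solution for this setup.
sol0 = simulate(experiment, prob)
plot(sol0, idxs = sys.ampermeter.i, label = "initial guess")
```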
Parametric Uncertainty Quantification
Now, let's do parametric uncertainty quantification using `parametric_uq`. This is where we need to provide the JuliaHub batch job specifications using `JuliaHubJob`.
`JuliaHubJob` is a wrapper around all algorithms (such as `SingleShooting`, `MultipleShooting`, or `parametric_uq`) that adds the compute specifications (number of CPUs, memory, etc.; see here for more details), the batch image to use, the name of the `JuliaHub.dataset` where the results will be uploaded, and the authentication token.
```julia
alg = StochGlobalOpt(
    method = SingleShooting(maxiters = 10^3),
    parallel_type = EnsembleDistributed())

specs = (ncpu = 8,
    memory = 64,
    nnodes = 1,
    process_per_node = false)

alg_juliahub = JuliaHubJob(; auth,
    batch_image = JuliaHub.batchimage("juliasim-batch", "JuliaSim - Stable"),
    node_specs = specs, dataset_name = "desauty",
    alg = alg)
```
```
JuliaHubJob{@NamedTuple{ncpu::Int64, memory::Int64, nnodes::Int64, process_per_node::Bool}, JuliaHub.Authentication, JuliaHub.BatchImage, String, StochGlobalOpt{SciMLBase.EnsembleDistributed, SingleShooting{Nothing, Nothing, MathOptInterface.OptimizerWithAttributes, Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}}}}((ncpu = 8, memory = 64, nnodes = 1, process_per_node = false), JuliaHub.Authentication("https://internal.juliahub.com", "JuliaSimSurrogates-CI", *****), JuliaHub.batchimage("juliasim-batch", "JuliaSim - Stable"), "desauty", StochGlobalOpt{SciMLBase.EnsembleDistributed, SingleShooting{Nothing, Nothing, MathOptInterface.OptimizerWithAttributes, Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}}}(SciMLBase.EnsembleDistributed(), SingleShooting{Nothing, Nothing, MathOptInterface.OptimizerWithAttributes, Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}}(nothing, nothing, MathOptInterface.OptimizerWithAttributes(Ipopt.Optimizer, Pair{MathOptInterface.AbstractOptimizerAttribute, Any}[MathOptInterface.RawOptimizerAttribute("tol") => 0.0001, MathOptInterface.RawOptimizerAttribute("max_iter") => 1000, MathOptInterface.RawOptimizerAttribute("acceptable_tol") => 0.001, MathOptInterface.RawOptimizerAttribute("acceptable_iter") => 250, MathOptInterface.RawOptimizerAttribute("print_level") => 0, MathOptInterface.RawOptimizerAttribute("hessian_approximation") => "limited-memory"]), Base.Pairs{Symbol, Union{}, Tuple{}, @NamedTuple{}}())))
```
Once we have defined this, running `parametric_uq` or `calibrate` works the same as before, except that instead of passing `alg`, we pass `alg_juliahub`. This returns a job object, as the batch job runs asynchronously.
```julia
sample_size = 100
ps_job = parametric_uq(prob, alg_juliahub; sample_size = sample_size)
```
Once the job is launched, we can monitor its progress and logs on the JuliaHub platform. When the job has finished, we can download the dataset containing the results and inspect it.
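It is also possible to wait for the job programmatically before downloading the results. The following is a sketch that assumes `ps_job` is a `JuliaHub.Job`, so it can be passed to `JuliaHub.wait_job`, which polls the platform until the job finishes.

```julia
# Sketch: block until the batch job completes before fetching the results.
# Assumes `ps_job` is a `JuliaHub.Job` object.
finished = JuliaHub.wait_job(ps_job; auth)
finished.status  # e.g. "Completed"
```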
```julia
JuliaHub.download_dataset("desauty", "./ps"; auth)
ps = deserialize("ps")
```
```
Parametric uncertainty ensemble of length 100 computed in 21 seconds and 411 milliseconds.
┌─────────────────┬──────────────┐
│ Final Objective │ capacitor2₊C │
├─────────────────┼──────────────┤
│       1.2189e-9 │   1.00257e-5 │
│     7.12926e-10 │   1.00066e-5 │
│      1.95654e-9 │   1.00356e-5 │
│       1.7821e-9 │   1.00332e-5 │
│      1.10027e-9 │   1.00094e-5 │
│        7.16e-10 │   1.00075e-5 │
│      4.05401e-8 │   1.01644e-5 │
│      1.29643e-9 │   1.00069e-5 │
│     7.62835e-10 │   1.00081e-5 │
│     7.02763e-10 │   1.00068e-5 │
│     7.52366e-10 │   1.00102e-5 │
│       1.0498e-9 │   1.00224e-5 │
│      2.97907e-9 │   1.00451e-5 │
│      1.37062e-9 │   1.00285e-5 │
│      1.02825e-9 │   1.00197e-5 │
│     7.73236e-10 │   1.00079e-5 │
│        ⋮        │      ⋮       │
└─────────────────┴──────────────┘
                   84 rows omitted
```
Visualization
We can see the results of the analysis by plotting its histogram.
```julia
plot(ps, bins = 100, legend = :best)
```
This way, we can launch batch jobs for long-running computations in a seamless manner.
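As noted above, `calibrate` can be dispatched to JuliaHub in exactly the same way. The following is a minimal sketch that reuses the pattern from this tutorial; the dataset name `"desauty_calibration"` and the local file path are hypothetical.

```julia
# Sketch: run a single calibration as a JuliaHub batch job, reusing the same
# compute specs and batch image. The dataset name below is hypothetical.
calibrate_juliahub = JuliaHubJob(; auth,
    batch_image = JuliaHub.batchimage("juliasim-batch", "JuliaSim - Stable"),
    node_specs = specs, dataset_name = "desauty_calibration",
    alg = SingleShooting(maxiters = 10^3))

calib_job = calibrate(prob, calibrate_juliahub)

# After the job finishes, download and deserialize the result as before.
JuliaHub.download_dataset("desauty_calibration", "./calibration_result"; auth)
calibration_result = deserialize("./calibration_result")
```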