Experiment Data Interface

The data ingested by DyadModelOptimizer needs a particular format. Let us first go through the rules and then demonstrate them with an example.

Julia Environment

For this tutorial, we will use the following packages:

Module                         | Description
DyadModelOptimizer             | DyadModelOptimizer is used to formulate our problem
ModelingToolkit                | The symbolic modeling environment
ModelingToolkitStandardLibrary | Library for using standard modeling components
OrdinaryDiffEq                 | The numerical differential equation solvers
DyadData                       | We will load our experimental data from datasets on JuliaHub
using DyadModelOptimizer
using ModelingToolkit
using ModelingToolkit: t_nounits as t
using ModelingToolkitStandardLibrary.Electrical
using ModelingToolkitStandardLibrary.Blocks: Sine
using OrdinaryDiffEq
using DyadData

Data Format

The rules are as follows:

  1. The data, which can come from either simulations or real-world measurements, is a collection of time series for the states/observables. It is arranged in a tabular format that has a Tables.jl interface, such as a DataFrame, a CSV.File, or a NamedTuple containing a vector for each column.
  2. The first column of every table should be named "timestamp". This column holds the series of time points. For steady-state problems, the timestamp column can contain Inf to signify that the row is steady-state data.
  3. The names of the remaining columns should match the names of the states or algebraic variables of the corresponding model. Note that:
       i. The independent variable suffix (usually "(t)") that ModelingToolkit generates is not required; it can be either present or omitted.
       ii. For models with sub-systems, the variables should be namespaced according to the system they belong to. The character used to delimit the namespace can be either "." or "₊".
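
As a minimal sketch of the rules above, the following builds two tables as NamedTuples of vectors using only base Julia. The numeric values are illustrative placeholders rather than real measurements, and the choice of columns is an assumption made for demonstration.

```julia
# Time-series data: one row per time point, with "timestamp" as the first column.
# Column names containing "." are not plain identifiers, so they are built with Symbol(...).
timeseries = NamedTuple{(:timestamp, Symbol("capacitor2.v"), Symbol("ampermeter.i"))}((
    [0.0, 0.001, 0.002],   # time points
    [0.0, 0.12, 0.23],     # illustrative values only
    [0.0, 0.05, 0.09],     # illustrative values only
))

# Steady-state data: a single row whose timestamp is Inf.
steady_state = NamedTuple{(:timestamp, Symbol("capacitor2.v"))}(([Inf], [1.0]))

# Basic checks of the format rules: first column name and consistent column lengths.
first_column_ok = first(keys(timeseries)) == :timestamp
lengths_ok = all(col -> length(col) == length(timeseries.timestamp), values(timeseries))
```

A NamedTuple of equally long vectors is one of the table types listed in rule 1, so no extra package is needed to express data in this shape.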

Now, let us go through an example, which is the same as the one on the getting started page.

function create_model(; C₁ = 3e-5, C₂ = 1e-6)
    @named resistor1 = Resistor(R = 5.0)
    @named resistor2 = Resistor(R = 2.0)
    @named capacitor1 = Capacitor(C = C₁)
    @named capacitor2 = Capacitor(C = C₂)
    @named source = Voltage()
    @named input_signal = Sine(frequency = 100.0)
    @named ground = Ground()
    @named ampermeter = CurrentSensor()

    eqs = [connect(input_signal.output, source.V)
        connect(source.p, capacitor1.n, capacitor2.n)
        connect(source.n, resistor1.p, resistor2.p, ground.g)
        connect(resistor1.n, capacitor1.p, ampermeter.n)
        connect(resistor2.n, capacitor2.p, ampermeter.p)]

    @named circuit_model = ODESystem(eqs, t,
        systems = [
            resistor1, resistor2, capacitor1, capacitor2,
            source, input_signal, ground, ampermeter,
        ])
end

model = create_model()
sys = structural_simplify(model)

\[ \begin{align} \frac{\mathrm{d} \mathtt{capacitor2.v}\left( t \right)}{\mathrm{d}t} &= \mathtt{capacitor1.vˍt}\left( t \right) \\ 0 &= - \mathtt{resistor2.i}\left( t \right) + \mathtt{capacitor2.i}\left( t \right) - \mathtt{resistor1.i}\left( t \right) + \mathtt{capacitor1.i}\left( t \right) \end{align} \]

We can list the unknowns of the model with:

unknowns(sys)
2-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 capacitor2₊v(t)
 capacitor1₊i(t)

We can see that the states of the model are named in an interpretable manner. For example, "capacitor2₊v(t)" means the voltage across capacitor2. So, using the rules defined above, the name of this column in the dataset can be any of:

  • "capacitor2₊v(t)"
  • "capacitor2₊v"
  • "capacitor2.v(t)"
  • "capacitor2.v"

All the above names map to the same state in the model.
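
To make the equivalence concrete, here is a small sketch of how such names could be reduced to one canonical form. The helper normalize_column_name is hypothetical and is not part of DyadModelOptimizer; it only illustrates the two naming rules and assumes the independent variable is t.

```julia
# Hypothetical helper for illustration; not the library's implementation.
function normalize_column_name(name::AbstractString)
    without_suffix = replace(name, r"\(t\)$" => "")  # drop a trailing "(t)" if present
    return replace(without_suffix, "₊" => ".")       # use "." as the only namespace delimiter
end

candidates = ["capacitor2₊v(t)", "capacitor2₊v", "capacitor2.v(t)", "capacitor2.v"]
normalized = normalize_column_name.(candidates)

# All four spellings collapse to the same canonical name.
all_same = all(==("capacitor2.v"), normalized)
```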

Data Storage

The data can be saved on disk in any format, as long as it can be deserialized into one of the formats mentioned above.

For ease, DyadModelOptimizer also works with DyadData, which provides an interface for working with JuliaHub datasets as well as local files and raw data.

Let us demonstrate this with an example. We will use the dataset from the getting started page.

data = DyadDataset("juliasimtutorials", "circuit_data", independent_var="timestamp", dependent_vars=["ampermeter.i(t)"])
experiment = Experiment(data, sys; overrides = [sys.capacitor2.v => 0.0], alg = Rodas5P(), abstol = 1e-6, reltol = 1e-5)
Experiment for circuit_model 
with the following overrides:
capacitor2₊v(t) => 0.0.
The simulation of this experiment is given by:
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
Initialization status: FULLY_DETERMINED
Non-trivial mass matrix: true
timespan: (0.0, 0.1)

We can see that we only need to pass the DyadDataset object directly to the Experiment constructor to use the data.

DyadData interface

DyadData.DyadDataset - Type
DyadDataset(
filepath::AbstractString = "";
independent_var::AbstractString,
dependent_vars::Vector{<:AbstractString},
kwargs...)

Represent a timeseries-like dataset that is backed by a local file.

Keyword arguments

  • independent_var: the name of the column that represents the independent variable (usually the time)
  • dependent_vars: a vector of the names of the columns for the dependent variables

When reading files (local file option or a downloaded JuliaHub dataset), CSV.jl is used. Additional keyword arguments passed to this function will be passed on to CSV.read. This can help with changing settings such as the delimiter used in the file. See https://csv.juliadata.org/stable/reading.html for more details.

DyadDataset(
username::AbstractString,
name::AbstractString;
independent_var::AbstractString,
dependent_vars::Vector{<:AbstractString},
kwargs...)

Represent a timeseries-like dataset that is backed by a JuliaHub dataset.

Keyword arguments

  • independent_var: the name of the column that represents the independent variable (usually the time)
  • dependent_vars: a vector of the names of the columns for the dependent variables

When reading files (local file option or a downloaded JuliaHub dataset), CSV.jl is used. Additional keyword arguments passed to this function will be passed on to CSV.read. This can help with changing settings such as the delimiter used in the file. See https://csv.juliadata.org/stable/reading.html for more details.

DyadDataset(
data;
independent_var::AbstractString,
dependent_vars::Vector{<:AbstractString},
kwargs...)

Represent a timeseries-like dataset that is backed by raw data.

Keyword arguments

  • independent_var: the name of the column that represents the independent variable (usually the time)
  • dependent_vars: a vector of the names of the columns for the dependent variables

DyadData.build_dataframe - Function
build_dataframe(d::DyadDataset)

Build a DataFrame out of the specified timeseries dataset. The column names will correspond to the names of the independent variable & the ones for the dependent variables. Note that the order of the columns is dictated by the order in the file, not by the order inside the dependent_vars argument for DyadDataset. The dependent_vars argument only specifies the available columns to use, not their order.
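
The column-order behavior described above can be sketched in plain Julia. The helper select_columns below is a hypothetical stand-in, not DyadData's implementation, and the file column names are assumed for illustration: it keeps the file's order and uses the requested names only as a filter.

```julia
# Hypothetical stand-in for the selection logic; not DyadData's implementation.
function select_columns(file_columns, independent_var, dependent_vars)
    wanted = Set([independent_var; dependent_vars])
    # Iterate in file order so the result follows the file, not dependent_vars.
    return [c for c in file_columns if c in wanted]
end

# Assumed column order in the file.
file_columns = ["timestamp", "capacitor2.v", "ampermeter.i", "resistor1.i"]

# dependent_vars is given in a different order than the file.
selected = select_columns(file_columns, "timestamp", ["ampermeter.i", "capacitor2.v"])
```

In this sketch, selected keeps "capacitor2.v" before "ampermeter.i" because that is their order in file_columns, and "resistor1.i" is dropped because it was not requested.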
