Experiment Data Interface

The data ingested by DyadModelOptimizer needs a particular format. Let us first go through the rules and then demonstrate them with an example.

Julia Environment

For this tutorial, we will use the following packages:

Module	Description
DyadModelOptimizer	DyadModelOptimizer is used to formulate our problem
ModelingToolkit	The symbolic modeling environment
ModelingToolkitStandardLibrary	Library for using standard modeling components
OrdinaryDiffEq	The numerical differential equation solvers
DyadData	The interface we will use to load our experimental data from datasets on JuliaHub

using DyadModelOptimizer
using ModelingToolkit
using ModelingToolkit: t_nounits as t
using ModelingToolkitStandardLibrary.Electrical
using ModelingToolkitStandardLibrary.Blocks: Sine
using OrdinaryDiffEq
using DyadData

Data Format

The rules are as follows:

The data, whether simulated or measured, is a collection of time series for the states/observables, arranged in a tabular format that implements the Tables.jl interface, such as a DataFrame, a CSV.File, or a NamedTuple containing a vector for each column.
The first column of every table should be named "timestamp". This column holds the series of time points. For steady state problems, the timestamp column can contain Inf to signify steady state data.
The names of the remaining columns should match the names of the states or algebraic variables of the corresponding model. Note that: i. The independent variable suffix (usually "(t)") that ModelingToolkit generates is optional; it can be present or omitted. ii. For models with sub-systems, the variables should be namespaced according to the system they belong to. The namespace delimiter can be either "." or "₊".
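
As a minimal sketch of these rules, the following builds two tables as plain NamedTuples of vectors, one of the Tables.jl-compatible formats listed above. The column names "tank.level" and "tank.outflow(t)" are hypothetical stand-ins for the namespaced variables of some model, and follows_rules is a hypothetical helper that only checks the "timestamp" rule and the column lengths. It does not reproduce any validation that DyadModelOptimizer performs internally.

```julia
# Hypothetical column names; in practice they must match the model's variables.
column_names = (:timestamp, Symbol("tank.level"), Symbol("tank.outflow(t)"))

# Time series data: one row per time point.
timeseries = NamedTuple{column_names}((
    [0.0, 0.5, 1.0],
    [1.00, 0.82, 0.67],
    [0.40, 0.33, 0.27],
))

# Steady state data: a single row with Inf in the timestamp column.
steady_state = NamedTuple{column_names}(([Inf], [0.0], [0.0]))

# Hypothetical check: the first column is "timestamp" and all columns have equal length.
function follows_rules(table::NamedTuple)
    first(keys(table)) == :timestamp || return false
    n = length(table.timestamp)
    return all(col -> length(col) == n, values(table))
end

follows_rules(timeseries) && follows_rules(steady_state)
```

Note that the "(t)" suffix is present on one column and omitted on the other, which the rules allow.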

Now, let us go through an example, which is the same as the one on the getting started page.

function create_model(; C₁ = 3e-5, C₂ = 1e-6)
    @named resistor1 = Resistor(R = 5.0)
    @named resistor2 = Resistor(R = 2.0)
    @named capacitor1 = Capacitor(C = C₁)
    @named capacitor2 = Capacitor(C = C₂)
    @named source = Voltage()
    @named input_signal = Sine(frequency = 100.0)
    @named ground = Ground()
    @named ampermeter = CurrentSensor()

    eqs = [connect(input_signal.output, source.V)
        connect(source.p, capacitor1.n, capacitor2.n)
        connect(source.n, resistor1.p, resistor2.p, ground.g)
        connect(resistor1.n, capacitor1.p, ampermeter.n)
        connect(resistor2.n, capacitor2.p, ampermeter.p)]

    @named circuit_model = System(eqs, t,
        systems = [
            resistor1, resistor2, capacitor1, capacitor2,
            source, input_signal, ground, ampermeter,
        ])
end

model = create_model()
sys = mtkcompile(model)

\[ \begin{align} \frac{\mathrm{d} \mathtt{capacitor2.v}\left( t \right)}{\mathrm{d}t} &= \mathtt{capacitor1.vˍt}\left( t \right) \\ 0 &= - \mathtt{resistor2.i}\left( t \right) + \mathtt{capacitor2.i}\left( t \right) - \mathtt{resistor1.i}\left( t \right) + \mathtt{capacitor1.i}\left( t \right) \end{align} \]

We can see the unknowns of the model by:

unknowns(sys)

2-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
 capacitor2₊v(t)
 capacitor1₊i(t)

We can see that the states of the model are named in an interpretable manner. For example, "capacitor2₊v(t)" is the voltage across capacitor2. So, using the rules defined above, the name of this column in the dataset can be any of:

"capacitor2₊v(t)"
"capacitor2₊v"
"capacitor2.v(t)"
"capacitor2.v"

All the above names map to the same state in the model.
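
One way to picture this equivalence is a small normalization step that drops the optional "(t)" suffix and uses a single namespace delimiter. The helper normalize_column_name below is hypothetical and is not part of DyadModelOptimizer; it is only a sketch of the matching rule, assuming the independent variable is named t.

```julia
# Hypothetical helper, not part of DyadModelOptimizer's API.
function normalize_column_name(name::AbstractString; iv::AbstractString = "t")
    suffix = "(" * iv * ")"
    # Drop the optional independent-variable suffix, e.g. "(t)".
    base = endswith(name, suffix) ? chop(name; tail = length(suffix)) : String(name)
    # Use "₊" as the single namespace delimiter.
    return replace(base, "." => "₊")
end

candidates = ["capacitor2₊v(t)", "capacitor2₊v", "capacitor2.v(t)", "capacitor2.v"]
normalized = normalize_column_name.(candidates)

# All four spellings collapse to the same name.
all(==("capacitor2₊v"), normalized)
```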

Data Storage

The data can be saved in any format on disk, as long as we can deserialize it into one of the formats mentioned above.

For ease, DyadModelOptimizer also works with DyadData, which provides an interface for working with JuliaHub datasets as well as local files and raw data.

Let us demonstrate this with an example. We will use the dataset from the getting started page.

data = DyadTimeseries("dyad+juliahub://juliahub.com/datasets/juliasimtutorials/circuit_data", independent_var="timestamp", dependent_vars=["ampermeter.i(t)"])
experiment = Experiment(data, sys; overrides = [sys.capacitor2.v => 0.0], alg = Rodas5P(), abstol = 1e-6, reltol = 1e-5)

Experiment for circuit_model 
with the following overrides:
capacitor2₊v(t) => 0.0.
The simulation of this experiment is given by:
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
Initialization status: FULLY_DETERMINED
Non-trivial mass matrix: true
timespan: (0.0, 0.1)

We can see that we only need to pass the DyadTimeseries object directly to the Experiment constructor to use the data.

DyadData interface

DyadData.DyadTimeseries — Type

DyadTimeseries(
uri::AbstractString;
independent_var::AbstractString,
dependent_vars::Vector{<:AbstractString},
kwargs...)

Represent a timeseries-like dataset specified by a URI.

Supported URI schemes

file:///absolute/path - local file with absolute path
dyad://package_name/local_path - dyad package asset (relative to assets/ folder)
dyad+juliahub://juliahub.com/datasets/username/dataset_name - JuliaHub dataset
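
Since the file scheme expects an absolute path, a relative path has to be expanded first. The sketch below uses a hypothetical helper, file_uri, and assumes POSIX-style paths without characters that would need percent-encoding; it is not part of DyadData.

```julia
# Hypothetical helper: turn a local path into a file:// URI (POSIX-style paths assumed).
file_uri(path::AbstractString) = "file://" * abspath(path)

# The relative path is resolved against the current working directory.
uri = file_uri("data/measurements.csv")
```

The resulting string can then be passed as the uri argument.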

Keyword arguments

independent_var: the name of the column that represents the independent variable (usually the time)
dependent_vars: a vector of the names of the columns for the dependent variables

When reading CSV files, additional keyword arguments will be passed to CSV.read. This can help with changing settings such as the delimiter used in the file. See https://csv.juliadata.org/stable/reading.html for more details.

Examples

# Local file
ds = DyadTimeseries("file:///home/user/data.csv";
                    independent_var="timestamp",
                    dependent_vars=["x", "y"])

# Dyad package asset
ds = DyadTimeseries("dyad://DyadData/lotka.csv";
                    independent_var="timestamp",
                    dependent_vars=["x(t)", "y(t)"])

# JuliaHub dataset
ds = DyadTimeseries("dyad+juliahub://juliahub.com/datasets/user/dataset_name";
                    independent_var="time",
                    dependent_vars=["var1", "var2"])

DyadTimeseries(
data::AbstractMatrix;
independent_var::AbstractString,
dependent_vars::Vector{<:AbstractString})

Represent a timeseries-like dataset that is backed by raw data (matrix).

Keyword arguments

independent_var: the name of the column that represents the independent variable (usually the time)
dependent_vars: a vector of the names of the columns for the dependent variables

Example

data = rand(10, 3)
ds = DyadTimeseries(data; independent_var="time", dependent_vars=["x", "y"])

DyadData.DyadTable — Type

DyadTable(
uri::AbstractString;
columns::Vector{<:AbstractString},
kwargs...)

Represent a table specified by a URI.

Supported URI schemes

file:///absolute/path - local file with absolute path
dyad://package_name/local_path - dyad package asset (relative to assets/ folder)
dyad+juliahub://juliahub.com/datasets/username/dataset_name - JuliaHub dataset

Keyword arguments

columns: a vector of the names of the columns

Examples

# Local file
tbl = DyadTable("file:///home/user/data.csv";
                columns=["var1", "var2", "var3"])

# Dyad package asset
tbl = DyadTable("dyad://DyadData/data.csv";
                columns=["s1(t)", "s1s2(t)", "s2(t)"])

# JuliaHub dataset
tbl = DyadTable("dyad+juliahub://juliahub.com/datasets/user/dataset_name";
                columns=["x", "y", "z"])

DyadTable(
data::AbstractMatrix;
columns::Vector{<:AbstractString})

Represent a table that is backed by raw data (e.g. a vector for a single row).

Keyword arguments

columns: a vector of the names of the columns

Example

data = [1.0, 2.0, 3.0]
tbl = DyadTable(data; columns=["x", "y", "z"])

DyadTable(
data;
columns::Vector{<:AbstractString})

Represent a table that is backed by raw data (e.g. a matrix).

Keyword arguments

columns: a vector of the names of the columns

Example

data = rand(10, 3)
tbl = DyadTable(data; columns=["x", "y", "z"])

DyadData.build_table — Function

build_table(d::DyadTimeseries)

Build a Table from TypedTables.jl out of the specified timeseries dataset. The column names will correspond to the names of the independent variable & the ones for the dependent variables.

Note that the order of the columns is dictated by the order in the file, not by the order inside the dependent_vars argument for DyadTimeseries. The dependent_vars argument only specifies the available columns to use, not their order.

Example

ds = DyadTimeseries("file:///path/to/data.csv";
                    independent_var="time",
                    dependent_vars=["x", "y"])
table = build_table(ds)
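
The ordering rule described in the note can be illustrated without reading any file. The function selected_columns below is a hypothetical illustration, not DyadData's implementation: it keeps the requested columns in the order in which they appear in the file, regardless of the order in dependent_vars.

```julia
# Hypothetical illustration of the ordering rule; this is not DyadData's code.
function selected_columns(file_columns, independent_var, dependent_vars)
    wanted = Set([independent_var; dependent_vars])
    # Keep the file's order and drop the columns that were not requested.
    return filter(in(wanted), file_columns)
end

# Assumed header of a file in which "y" is stored before "x".
file_columns = ["time", "y", "z", "x"]
cols = selected_columns(file_columns, "time", ["x", "y"])  # ["time", "y", "x"]
```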

build_table(d::DyadTable)

Build a Table from TypedTables.jl out of the specified table dataset. The column names will correspond to the columns.

Example

tbl = DyadTable("file:///path/to/data.csv";
                columns=["x", "y", "z"])
table = build_table(tbl)

build_table(d::DyadInterpolationTable2D)

Read the full table from the dataset (all columns). This delegates to the appropriate build_table method for the dataset type.
