Experiment Data Interface
DyadModelOptimizer expects its input data in a particular format. Let us first go through the rules and then demonstrate them with an example.
Julia Environment
For this tutorial, we will use the following packages:
Module | Description |
---|---|
DyadModelOptimizer | DyadModelOptimizer is used to formulate our problem |
ModelingToolkit | The symbolic modeling environment |
ModelingToolkitStandardLibrary | Library for using standard modeling components |
OrdinaryDiffEq | The numerical differential equation solvers |
DyadData | We will load our experimental data from datasets on JuliaHub |
using DyadModelOptimizer
using ModelingToolkit
using ModelingToolkit: t_nounits as t
using ModelingToolkitStandardLibrary.Electrical
using ModelingToolkitStandardLibrary.Blocks: Sine
using OrdinaryDiffEq
using DyadData
Data Format
The rules are as follows:
- Data, whether from simulations or real-world measurements, is a collection of time series for the states/observables, arranged in a tabular format that supports the Tables.jl interface, such as a DataFrame, a CSV.File, or a NamedTuple containing a vector for each column.
- The first column of every table should be named "timestamp". This column holds the series of time points. For steady-state problems, the timestamp column can contain Inf to signify that the row is steady-state data.
- The names of the remaining columns should match the names of the states or algebraic variables of the corresponding model. Note that:
  i. The independent variable suffix (usually "(t)") that ModelingToolkit generates is not required; it can be either present or omitted.
  ii. For models with subsystems, the variables should be namespaced according to the system they belong to. The character used to delimit the namespace can be either "." or "₊".
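As a small illustration of these rules, the sketch below builds a table as a NamedTuple of vectors using only Base Julia. The numeric values are made up for illustration and are not real measurements; the column name "capacitor2.v" is borrowed from the circuit model used in this tutorial.

```julia
# A column table in the form of a NamedTuple of vectors.
# Names containing "." are not plain Julia identifiers, so the
# NamedTuple is built from explicit Symbols.
timestamps = [0.0, 0.01, 0.02, 0.03]   # illustrative time points
voltages   = [0.0, 0.12, 0.21, 0.27]   # made-up values, for illustration only

column_names = (:timestamp, Symbol("capacitor2.v"))
data = NamedTuple{column_names}((timestamps, voltages))

# The first column is "timestamp"; the second is namespaced with "."
# and omits the optional "(t)" suffix.
@assert first(keys(data)) == :timestamp
@assert length(data.timestamp) == length(data[Symbol("capacitor2.v")])

# Steady-state data uses Inf in the timestamp column.
steady_state = NamedTuple{column_names}(([Inf], [0.3]))
```

A NamedTuple of equal-length vectors is one of the column-table shapes that Tables.jl recognizes, which is why it can stand in for a DataFrame or a CSV.File here.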
Now, let us go through an example, which is the same as the one on the getting started page.
function create_model(; C₁ = 3e-5, C₂ = 1e-6)
    @named resistor1 = Resistor(R = 5.0)
    @named resistor2 = Resistor(R = 2.0)
    @named capacitor1 = Capacitor(C = C₁)
    @named capacitor2 = Capacitor(C = C₂)
    @named source = Voltage()
    @named input_signal = Sine(frequency = 100.0)
    @named ground = Ground()
    @named ampermeter = CurrentSensor()
    eqs = [connect(input_signal.output, source.V)
           connect(source.p, capacitor1.n, capacitor2.n)
           connect(source.n, resistor1.p, resistor2.p, ground.g)
           connect(resistor1.n, capacitor1.p, ampermeter.n)
           connect(resistor2.n, capacitor2.p, ampermeter.p)]
    @named circuit_model = ODESystem(eqs, t,
        systems = [
            resistor1, resistor2, capacitor1, capacitor2,
            source, input_signal, ground, ampermeter,
        ])
end
model = create_model()
sys = structural_simplify(model)
\[ \begin{align} \frac{\mathrm{d} \mathtt{capacitor2.v}\left( t \right)}{\mathrm{d}t} &= \mathtt{capacitor1.vˍt}\left( t \right) \\ 0 &= - \mathtt{resistor2.i}\left( t \right) + \mathtt{capacitor2.i}\left( t \right) - \mathtt{resistor1.i}\left( t \right) + \mathtt{capacitor1.i}\left( t \right) \end{align} \]
We can see the unknowns of the model by:
unknowns(sys)
2-element Vector{SymbolicUtils.BasicSymbolic{Real}}:
capacitor2₊v(t)
capacitor1₊i(t)
We can see that the states of the model are named in an interpretable manner. For example, "capacitor2₊v(t)" means the voltage across capacitor2. So, using the rules defined above, the name of this column in the dataset can be:
- "capacitor2₊v(t)"
- "capacitor2₊v"
- "capacitor2.v(t)"
- "capacitor2.v"
All the above names map to the same state in the model.
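The equivalence of these four spellings can be sketched with a small helper. The function `canonical_name` below is hypothetical and written only for illustration; it is not how DyadModelOptimizer resolves names internally. It drops a trailing "(t)" and uses "." as the namespace delimiter.

```julia
# Hypothetical helper, for illustration only: reduce a column name to a
# canonical form by dropping a trailing "(t)" and replacing "₊" with ".".
function canonical_name(name::AbstractString)
    without_suffix = replace(name, r"\(t\)$" => "")
    return replace(without_suffix, "₊" => ".")
end

candidates = ["capacitor2₊v(t)", "capacitor2₊v", "capacitor2.v(t)", "capacitor2.v"]
canonical = canonical_name.(candidates)

# All four spellings reduce to the same canonical name.
@assert all(==("capacitor2.v"), canonical)
```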
Data Storage
The data can be saved in any format on disk, as long as we can deserialize it into one of the formats mentioned above.
For ease, DyadModelOptimizer also works with DyadData, which provides an interface for working with JuliaHub datasets as well as local files and raw data.
Let us demonstrate this with an example. We will use the dataset from the getting started page.
data = DyadDataset("juliasimtutorials", "circuit_data", independent_var="timestamp", dependent_vars=["ampermeter.i(t)"])
experiment = Experiment(data, sys; overrides = [sys.capacitor2.v => 0.0], alg = Rodas5P(), abstol = 1e-6, reltol = 1e-5)
Experiment for circuit_model
with the following overrides:
capacitor2₊v(t) => 0.0.
The simulation of this experiment is given by:
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
Initialization status: FULLY_DETERMINED
Non-trivial mass matrix: true
timespan: (0.0, 0.1)
We can see that we only need to pass the DyadDataset object directly to the Experiment constructor to use the data.
DyadData interface
DyadData.DyadDataset — Type
DyadDataset(
    filepath::AbstractString = "";
    independent_var::AbstractString,
    dependent_vars::Vector{<:AbstractString},
    kwargs...)
Represent a timeseries-like dataset that is backed by a local file.
Keyword arguments
- independent_var: the name of the column that represents the independent variable (usually the time)
- dependent_vars: a vector of the names of the columns for the dependent variables
When reading files (local file option or a downloaded JuliaHub dataset), CSV.jl is used. Additional keyword arguments passed to this function will be passed on to CSV.read. This can help with changing settings such as the delimiter used in the file. See https://csv.juliadata.org/stable/reading.html for more details.
DyadDataset(
    username::AbstractString,
    name::AbstractString;
    independent_var::AbstractString,
    dependent_vars::Vector{<:AbstractString},
    kwargs...)
Represent a timeseries-like dataset that is backed by a JuliaHub dataset.
Keyword arguments
- independent_var: the name of the column that represents the independent variable (usually the time)
- dependent_vars: a vector of the names of the columns for the dependent variables
When reading files (local file option or a downloaded JuliaHub dataset), CSV.jl is used. Additional keyword arguments passed to this function will be passed on to CSV.read. This can help with changing settings such as the delimiter used in the file. See https://csv.juliadata.org/stable/reading.html for more details.
DyadDataset(
    data;
    independent_var::AbstractString,
    dependent_vars::Vector{<:AbstractString},
    kwargs...)
Represent a timeseries-like dataset that is backed by raw data.
Keyword arguments
- independent_var: the name of the column that represents the independent variable (usually the time)
- dependent_vars: a vector of the names of the columns for the dependent variables
DyadData.build_dataframe — Function
build_dataframe(d::DyadDataset)
Build a DataFrame out of the specified timeseries dataset. The column names will correspond to the names of the independent variable and the dependent variables. Note that the order of the columns is dictated by the order in the file, not by the order inside the dependent_vars argument for DyadDataset. The dependent_vars argument only specifies the available columns to use, not their order.
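The column-ordering rule can be pictured with a Base-only sketch. This is an illustration of the described behavior, not the implementation inside DyadData; the file header below is hypothetical.

```julia
# Illustration only: selecting columns by membership keeps the file order.
file_columns = ["timestamp", "capacitor2.v", "ampermeter.i(t)"]  # hypothetical file header
independent_var = "timestamp"
dependent_vars = ["ampermeter.i(t)", "capacitor2.v"]             # requested in a different order

# Keep every header entry that was requested, walking the header left to right.
selected = filter(name -> name == independent_var || name in dependent_vars, file_columns)

# The result follows the header order, not the order of dependent_vars.
@assert selected == ["timestamp", "capacitor2.v", "ampermeter.i(t)"]
```

In practice this means that code consuming the resulting DataFrame should refer to columns by name rather than by position.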