Using Pumas-QSP: Virtual populations

Pumas-QSP provides the methodological framework to create and use Virtual Populations (VP) in Julia.

Methodological background

A schematical overview of a potential user case of VP generation is presented in the figure below.

Workflow of VP generation

Once VPs are generated given some clinical data, we can use them to expand our analysis scope to dimensions where, for example, experimental data collection is difficult. This objetive is schematically shown in the figure below. Here, we summarise three groups:

  1. We can use the VP to model the sytem for new values of model parameters. Note that here we refer to model parameters not as parameters that specify a patient, but other, independent parameters of the model. As an example we have the reaction rate \alpha.
  2. We can use the VP to model the system for new initial conditions for the observed states of the sytem. Here, this relates to Protein A and B.
  3. We can use the VP to model states of the system that have not been observed. These are Protein C to H in this example.

Objective of VPs

To create VPs with Pumas-QSP, three main ingredients are needed:

  • the QSP model,
  • the clinical trial data
  • some cost specifications.

In the following, we demonstrate how to set up Pumas-QSP for a QSP modelling workflow.

Julia environment

First we prepare the environment by listening the packages we are using within the example.

# Packages used for this example
using PumasQSP
using Plots
using OrdinaryDiffEq
using ModelingToolkit
using DataFrames
using StatsPlots
using GlobalSensitivity
using Statistics

Model setup

As the example model we are using the Hires model with eight variables. Their dynamics are described by eight equations and default values are set for the variables and parameters. We use the ModelingTookit.jl package to specify the model in Julia.

# Defining QSP model
iv = @variables t # Independent variables
states = @variables y1(t) y2(t) y3(t) y4(t) y5(t) y6(t) y7(t) y8(t) # Dependent variables
ps = @parameters a b  # Parameters of the differential equations
D = Differential(t)
# The differential equations for the dependent variables
eqs = [ D(y1)  ~  (-a*y1 + 0.43*y2 + 8.32*y3 + 0.0007),
        D(y2)  ~  (a*y1 - 8.75*y2),
        D(y3)  ~  (-10.03*y3 + 0.43*y4 + 0.035*y5),
        D(y4)  ~  (8.32*y2 + a*y3 - 1.12*y4),
        D(y5)  ~  (-1.745*y5 + 0.43*y6 + 0.43*y7),
        D(y6)  ~  (-b*y6*y8 + 0.69*y4 + a*y5 - 0.43*y6 + 0.69*y7),
        D(y7)  ~  (b*y6*y8 - 1.81*y7),
        D(y8)  ~  (-b*y6*y8 + 1.81*y7)
# These are the values for the states and parameters
defs = Dict(y1 => 1.,
        y2 => 0.,
        y3 => 0.,
        y4 => 0.,
        y5 => 0.,
        y6 => 0.,
        y7 => 0.,
        y8 => 0.0057,
        a => 1.71,
        b => 280.0
# Model definition
@named model = ODESystem(eqs, t, states, ps; defaults = defs, checks = false)

The model is described by an ODE system. Simulating the model produces some time series data over a given time frame and this process would relate to 1. Simulating parameter combination \theta_i in the Figure. A convenient way to manage clinical trial data, is to use the package DataFrames.jl. We can use it to read data easily and efficiently.

# Reading csv file using dataframe package
data1 = read(<path-to-file>)

This step relates to 2. Data in the Figure. For the sake of clarity, however, we simulate this data in this example. Here, we are considering two clinical conditions. We simulate to model for them and use the solutions as observation. Subsequently we captured the two conditions via Tiral objects. The first trial describes the system in it's default settings over a set timespan.

# Clinical condition 1
trial1_tspan = (0.0, 10.)

# Data for trial 1
trail1_n = 10
trial1_saveat = range(trial1_tspan[1], trial1_tspan[2], length = trail1_n)
trial1_prob = ODEProblem(model, model.defaults, trial1_tspan, saveat = trial1_saveat)
trial1_sol = solve(trial1_prob, Tsit5(), reltol=1e-8, abstol=1e-8)
trial1_data = DataFrame(trial1_sol)

# Creating the Trial object for trial 1
trial1 = Trial(trial1_data, model;
	        tspan = trial1_tspan,
		alg = Tsit5(),
                reduction = sol -> mean(sol, dims = 2)

The second conditions has altered parameters trial2_ps and a different timespan trial2_tspan.

# Clinical condition 2
trial2_ps = [a => 1.61,
                b => 350.0]
trial2_tspan = (0.0, 20)

# Simulated data for this condition
trail2_n = 20
trial2_saveat = range(trial2_tspan[1], trial2_tspan[2], length = trail2_n)
trial2_prob = ODEProblem(model, model.defaults, trial2_tspan, trial2_ps, saveat = trial2_saveat)
trial2_sol = solve(trial2_prob, Tsit5(), reltol=1e-8, abstol=1e-8)
trial2_data = DataFrame(trial2_sol)

# Creating the Trial object for trial 2
trial2 = Trial(trial2_data, model;
                params = trial2_ps,
                tspan = trial2_tspan,
                        reduction = sol -> mean(sol, dims = 2)

Note: Pumas-QSP supports different types of trials. For clarity, here we focus on the simplest one. See … for more details.


The cost function is specified by the Pumas-QSP object QSPCost. Next to the model and the trials, the user needs to specify a search space. This is to constrain the optimiser in which values to consider as potential values for the parameter combinations.

# Creating a QSPCost object
cost = QSPCost(model, [trial1, trial2],
        search_space = [s1 => (1., 3), k1 => (0, 5)])

Once the QSP model and the trials are in the program, the package can generate virtual patients, i.e. find parameter combinations for the provided model, which describe the clinical trial data sufficiently well measured in terms of the specified cost. This is done based on a stochastic optimization approach. In each optimization step, an new parameter combination is selected, the model is simulated (i.e. the ODE problem is solved) and the ODE solution is compared to the clinical trial data to assess its fit. This process relates to 3. Optimising parameter combination \theta_i in the Figure.

# Creating virtual population
trials = [trial1, trial2]
range_y1 = (0.1, 3)
cost = QSPCost(model, trials, search_space = [y1 => range_y1])
vp = vpop(cost, StochGlobalOpt( maxiters = 10), population_size = 500)

With this vpop call, 500 seperate optimization runs are started leading to a VP with 500 patients. This relates to 4. Repeating stochastic optimization n times in the Figure.


We can conveniently analyse and visualise the results by relying on the ploting recipes for Pumas-QSP.

# Viusalization (1/2): Simulation results
plot(vp, trials[1], legend = :outerright, grid = "off")
plot(vp, trials[2], legend = :outerright, grid = "off")

# Viusalization (2/2): Parameters
params = DataFrame(vp)[!,"y1(t)"]
	plt = violin(params, ylim = range_y1, grid = :off, legend = :outerright, label = "VP", ylabel = "y1", color = "purple")

Using VP

Once we have created and validated our VP, we can use it to explore the system for new clinical conditions. For this, we can use the virtual_trial object.

# Creating virtual trial
vt_4 = virtual_trial(trials[trial_ind_new]; model,
	u0 =[y3 =>2.1],
	params = [a =>1.7],
	tspan = (0, 100.),
	saveat = 1.);