Guide

In this section we will show step by step how PathWeightSampling.jl can be used to compute the mutual information for a simple model of gene expression.

Setting Up the System

The model considered in this example consists of four reactions:

                κ
reaction 1:  ∅ ---> S

                λ
reaction 2:  S ---> ∅

                ρ
reaction 3:  S ---> S + X

                μ
reaction 4:  X ---> ∅

The first two reactions specify the evolution of the input signal S, and the last two reactions specify the evolution of the output X. Thus, both the input and the output are modeled as simple birth-death processes; however, the birth rate of X increases with the copy number of S.
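To build intuition for these dynamics before using the package, the following self-contained sketch simulates the four reactions with a plain Gillespie (stochastic simulation) algorithm. It is purely illustrative and independent of PathWeightSampling.jl, which performs its own simulations internally:

using Random: randexp

# Minimal Gillespie simulation of the four-reaction model above
# (illustrative only; not part of PathWeightSampling.jl).
function gillespie(; κ=50.0, λ=1.0, ρ=10.0, μ=10.0, S=50, X=50, tmax=2.0)
    t = 0.0
    ts, Ss, Xs = [t], [S], [X]
    while true
        a = (κ, λ * S, ρ * S, μ * X)   # propensities of the four reactions
        atot = sum(a)
        t += randexp() / atot          # waiting time until the next reaction
        t > tmax && break
        r = rand() * atot              # pick a reaction proportional to its propensity
        if r < a[1]
            S += 1                     # reaction 1: ∅ → S
        elseif r < a[1] + a[2]
            S -= 1                     # reaction 2: S → ∅
        elseif r < a[1] + a[2] + a[3]
            X += 1                     # reaction 3: S → S + X
        else
            X -= 1                     # reaction 4: X → ∅
        end
        push!(ts, t); push!(Ss, S); push!(Xs, X)
    end
    ts, Ss, Xs
end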

The first step is to create the system that we are going to use. The simple gene expression model shown above is already included as an example in PathWeightSampling.jl and can be directly used as follows:

using PathWeightSampling

system = PathWeightSampling.gene_expression_system()
SimpleSystem with 4 reactions
Input variables: S(t)
Output variables: X(t)
Initial condition:
    S(t) = 50
    X(t) = 50
Parameters:
    κ = 50.0
    λ = 1.0
    ρ = 10.0
    μ = 10.0

The result is a system consisting of the 4 reactions mentioned above and default values for the initial condition and the parameters that specify the reaction rates.
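As a quick consistency check (our own back-of-the-envelope calculation, not part of the package output), the deterministic rate equations for this model,

$$\frac{\mathrm{d}\langle S\rangle}{\mathrm{d}t} = \kappa - \lambda\langle S\rangle , \qquad \frac{\mathrm{d}\langle X\rangle}{\mathrm{d}t} = \rho\langle S\rangle - \mu\langle X\rangle ,$$

have the steady state $\langle S\rangle = \kappa/\lambda = 50$ and $\langle X\rangle = \rho\langle S\rangle/\mu = 50$, which matches the default initial condition shown above.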

Generating and Plotting Trajectories

We can generate a configuration of this system. A configuration is a combination of an input trajectory and an output trajectory. Using generate_configuration, we can create a configuration by first simulating an input trajectory and then using that input trajectory to simulate a corresponding output trajectory.

conf = generate_configuration(system)
PathWeightSampling.SXconfiguration{StaticArrays.SVector{1, Int64}, StaticArrays.SVector{1, Int64}, Float64}(PathWeightSampling.Trajectory{StaticArrays.SVector{1, Int64}, Float64}(StaticArrays.SVector{1, Int64}[[50], [51], [52], [53], [52], [51], [52], [53], [54], [55]  …  [62], [61], [60], [59], [58], [57], [56], [57], [56], [55]], [0.017066157852485137, 0.01773655268442671, 0.01841403454153793, 0.024533955828140046, 0.03353816482729357, 0.06493163934905688, 0.06500850142592382, 0.06535439197396897, 0.06860776913704285, 0.07627806184167915  …  1.9336113414471015, 1.9380607015247613, 1.9387806054631993, 1.9394284917165354, 1.9405741315395435, 1.9617205860147646, 1.977517554359468, 1.9847841930381107, 1.9908957717563711, 2.0], [1, 1, 1, 2, 2, 1, 1, 1, 1, 2  …  2, 2, 2, 2, 2, 2, 1, 2, 2, 0]), PathWeightSampling.Trajectory{StaticArrays.SVector{1, Int64}, Float64}(StaticArrays.SVector{1, Int64}[[50], [49], [48], [47], [48], [47], [48], [49], [50], [51]  …  [58], [59], [60], [61], [62], [63], [62], [61], [60], [59]], [5.383280070821908e-5, 0.00015355815999044278, 0.0024926247925441, 0.0028680863818435764, 0.0032775631261098507, 0.006735265252268677, 0.007460182082917555, 0.009847199597742193, 0.011250435403179639, 0.01328022857068524  …  1.99055446682675, 1.99098338931463, 1.9914128245617542, 1.9921395219344515, 1.9930184984361838, 1.9945256806084624, 1.9961035140615497, 1.9981334658934224, 1.999080443292919, 2.0], [4, 4, 4, 3, 4, 3, 3, 3, 3, 4  …  3, 3, 3, 3, 3, 4, 4, 4, 4, 0]))

Let us plot the generated configuration:

using Plots
plot(conf)

We see a plot of the generated input and output trajectories that make up the configuration.

The individual trajectories of the configuration can also be accessed directly:

input_traj = conf.s_traj
output_traj = conf.x_traj
p1 = plot(input_traj, label="input")
p2 = plot(output_traj, label="output")
plot(p1, p2, layout = (2, 1))

Computing the Trajectory Mutual Information

For our system we can compute the trajectory mutual information straightforwardly.

result = PathWeightSampling.mutual_information(system, DirectMCEstimate(256), num_samples=100)

Progress: 100%|███████████████████████████| Time: 0:00:20 ( 0.21  s/it)

This performs a full PWS Monte Carlo simulation and displays a progress bar during the computation. Naturally, PathWeightSampling.mutual_information takes the system as its first argument. The second argument is an object specifying the marginalization algorithm to use for computing the marginal trajectory probability. Here we chose the simple brute-force Direct MC algorithm with $M=256$ samples; thus, we compute a "Direct PWS" estimate. The final keyword argument is the overall number of Monte Carlo samples to use for estimating the mutual information. This is the number of samples taken in the outer Monte Carlo simulation, as opposed to the $M=256$ samples taken in the inner Monte Carlo loop.
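Concretely (a standard restatement of the PWS scheme, not a verbatim excerpt from the package documentation), the Direct MC algorithm estimates the marginal probability of each output trajectory $\bm{x}$ by the sample average

$$\hat{\mathcal{P}}[\bm{x}] = \frac{1}{M} \sum_{i=1}^{M} \mathcal{P}[\bm{x} \mid \bm{s}^{(i)}] , \qquad \bm{s}^{(i)} \sim \mathcal{P}[\bm{s}] ,$$

and the outer loop then averages $\ln \left( \mathcal{P}[\bm{x} \mid \bm{s}] / \hat{\mathcal{P}}[\bm{x}] \right)$ over num_samples joint samples $(\bm{s}, \bm{x})$ to estimate the mutual information.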

result is a DataFrame containing the simulation results. We can plot the individual Monte Carlo samples:

plot(
    system.dtimes,
    result.MutualInformation,
    color=:black,
    linewidth=0.2,
    legend=false,
    xlabel="trajectory duration",
    ylabel="mutual information (nats)"
)

The final Monte Carlo estimate is simply the mean of the individual samples:

using Statistics
plot(
    system.dtimes,
    mean(result.MutualInformation),
    color=:black,
    linewidth=2,
    legend=false,
    xlabel="trajectory duration",
    ylabel="mutual information (nats)"
)

Note that since we only used 100 MC samples, the fluctuations of the result are relatively large. To judge the statistical error due to the finite number of Monte Carlo samples, we can additionally plot error bars. A common error measure in Monte Carlo simulations is the "standard error of the mean", defined as the standard deviation divided by the square root of the number of samples. We use this measure to draw error bars.

sem(x) = std(x) / sqrt(length(x))
plot(
    system.dtimes,
    mean(result.MutualInformation),
    yerr=sem(result.MutualInformation),
    color=:black,
    linewidth=2,
    legend=false,
    xlabel="trajectory duration",
    ylabel="mutual information (nats)"
)

More Advanced Marginalization Strategies

So far we have computed the mutual information using the brute-force Direct PWS algorithm. However, we can choose a different approach to perform the marginalization integrals. To change the marginalization strategy, we simply pass a different algorithm as the second argument of PathWeightSampling.mutual_information. The possible choices for the marginalization strategy are

  • DirectMCEstimate(m): The simple brute force marginalization using a Direct Monte Carlo estimate. The integer m specifies the number of samples to use per brute-force computation. This method works well for short trajectories but becomes exponentially worse for longer trajectories.
  • SMCEstimate(m): Improved computation of marginalization integrals using a sequential Monte Carlo technique (specifically, a particle filter); a generic sketch of this idea is shown after this list. The integer m specifies the number of "particles" that are propagated simultaneously. This method works much better than the DirectMCEstimate for long trajectories.
  • TIEstimate(burn_in, integration_nodes, num_samples): Use thermodynamic integration to compute the marginalization integrals. This will set up a number of MCMC simulations in path-space to perform the TI integral. burn_in specifies the number of initial samples from the MCMC simulation to be discarded, integration_nodes specifies the number of points to use in the Gaussian quadrature, and num_samples specifies the number of MCMC samples per integration node to generate.
  • AnnealingEstimate(subsample, num_temps, num_samples): Use annealed importance sampling to compute the marginalization integrals. This technique is very similar to thermodynamic integration and also uses MCMC simulations in path space. subsample specifies the number of Metropolis trials to perform before recording a new MCMC sample. num_temps sets how many different "temperatures" should be used for the annealing. num_samples is the number of MCMC samples to use per temperature setting.
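To make the particle-filter idea behind SMCEstimate concrete, here is a generic, self-contained sketch of how sequential Monte Carlo estimates a log marginal likelihood segment by segment. The helpers propagate (simulate the input over one time segment) and loglik (evaluate the conditional log-likelihood of the output on that segment) are hypothetical placeholders, not part of PathWeightSampling.jl's API:

using Statistics
using StatsBase: sample, Weights

# Generic SMC sketch: estimate log P[x] by propagating m particles
# through successive time segments, weighting each particle by its
# segment likelihood, and resampling. `propagate` and `loglik` are
# hypothetical placeholders for model-specific functions.
function smc_log_marginal(init, segments, m)
    particles = [init for _ in 1:m]
    logP = 0.0
    for seg in segments
        particles = [propagate(s, seg) for s in particles]
        logw = [loglik(s, seg) for s in particles]
        wmax = maximum(logw)
        w = exp.(logw .- wmax)
        logP += wmax + log(mean(w))       # log of the average segment weight
        idx = sample(1:m, Weights(w), m)  # resample in proportion to weight
        particles = particles[idx]
    end
    return logP
end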

We can compute the mutual information using each of these strategies and compare the results:

strategies = [
    DirectMCEstimate(128),
    SMCEstimate(128),
    TIEstimate(0, 8, 16),
    # AnnealingEstimate(0, 128, 1)
]
results = [PathWeightSampling.mutual_information(system, strat, num_samples=100, progress=false) for strat in strategies]

plot()
for (strat, r) in zip(strategies, results)
    plot!(
        system.dtimes,
        mean(r.MutualInformation),
        label=PathWeightSampling.name(strat),
        xlabel="trajectory duration",
        ylabel="mutual information (nats)"
    )
end
current()  # display the accumulated comparison plot

API Summary

Thus, the core function to estimate the trajectory mutual information is PathWeightSampling.mutual_information. A complete description of its arguments and return value is given below.

PathWeightSampling.mutual_information — Function
mutual_information(system, algorithm; num_samples=1, progress=true)

Perform a simulation to compute the mutual information between input and output trajectories of system.

Arguments

The required marginalization integrals to obtain the marginal probability $\mathcal{P}[\bm{x}]$ are performed using the specified algorithm.

Overall, num_samples Monte Carlo samples are drawn. For each individual sample, one or more marginalization operations need to be performed.

If progress == true, a progress bar will be shown during the computation.

Returns

Returns a DataFrame containing the results of the simulation. The resulting DataFrame has 3 columns. Assuming the returned value has been named result, the columns can be accessed by:

  • result.MutualInformation: A vector of vectors that contains the results of the simulation. Each element of the outer vector is the result of a single Monte Carlo sample. Each element is a vector containing the trajectory mutual information estimates for each time specified in system.dtimes.
  • result.TimeMarginal: A vector containing, for each sample, the CPU time in seconds used for the computation of the marginal entropy.
  • result.TimeConditional: A vector containing, for each sample, the CPU time in seconds used for the computation of the conditional entropy.
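For example, assuming the simulation above has been run, the mutual information estimate for the full trajectory duration can be extracted from the returned DataFrame like this:

using Statistics
result = PathWeightSampling.mutual_information(system, SMCEstimate(128), num_samples=100, progress=false)
mi = mean(result.MutualInformation)  # average over samples; one value per time in system.dtimes
mi[end]                              # estimate for the full trajectory duration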