Data Types
Problem & Model
The BosipProblem structure contains all information about the inference problem, as well as the model hyperparameters.
BOSIP.BosipProblem — Type

BosipProblem(X, Y; kwargs...)
BosipProblem(::ExperimentData; kwargs...)
Defines the likelihood-free inference problem and stores all data.
Args
The initial data are provided either as two column-wise matrices X and Y with the inputs and outputs of the simulator respectively, or as an instance of BOSS.ExperimentData.
Currently, at least one datapoint has to be provided (purely for implementation reasons).
Kwargs
- f::Any: The simulation to be queried for data.
- domain::Domain: The parameter domain of the problem.
- acquisition::BosipAcquisition: Defines the acquisition function.
- model::SurrogateModel: The surrogate model to be used to model the proxy δ.
- likelihood::Likelihood: The likelihood of the experiment observation z_o.
- x_prior::MultivariateDistribution: The prior p(x) on the input parameters.
- y_sets::Union{Nothing, Matrix{Bool}}: Optional parameter intended for advanced usage. The binary columns define subsets y_1, ..., y_m of the observation dimensions within y. The algorithm then trains multiple posteriors p(θ|y_1), ..., p(θ|y_m) simultaneously. The posteriors can be compared after the run is completed to see which observation subsets are most informative.
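For illustration, a minimal construction sketch follows. The toy simulator, the Domain constructor arguments, and the omission of the remaining keywords (left to their defaults) are assumptions, not a verified recipe.

using BOSIP, BOSS, Distributions

# Hypothetical toy simulator: 2 input parameters -> 1 output dimension.
simulator(x) = [sin(x[1]) + 0.1 * x[2]]

# One initial datapoint, stored column-wise (as required).
X = [0.5; 0.2;;]                          # 2×1 input matrix
Y = reduce(hcat, simulator.(eachcol(X)))  # 1×1 output matrix

problem = BosipProblem(X, Y;
    f = simulator,
    domain = Domain(; bounds = ([-5.0, -5.0], [5.0, 5.0])),  # constructor assumed from BOSS.jl
    likelihood = NormalLikelihood(; z_obs = [0.3], std_obs = [0.1]),
    x_prior = product_distribution(fill(Uniform(-5.0, 5.0), 2)),
    # `model` and `acquisition` omitted here; see their docs below.
)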
Likelihood
The abstract type Likelihood represents the likelihood distribution of the observation z_o.
BOSIP.Likelihood — Type

Represents the assumed likelihood of the experiment observation $z_o$.

See also the MonteCarloLikelihood for a simplified interface for likelihoods.
Defining a Custom Likelihood
To define a custom likelihood, create a new subtype of Likelihood and implement the following API.

Each subtype of Likelihood should implement:

loglike(::Likelihood, δ::AbstractVector{<:Real}, [x::AbstractVector{<:Real}])
log_likelihood_mean(::Likelihood, ::BosipProblem, ::ModelPosterior)
Each subtype of Likelihood should implement at least one of:
log_sq_likelihood_mean(::Likelihood, ::BosipProblem, ::ModelPosterior)
log_likelihood_variance(::Likelihood, ::BosipProblem, ::ModelPosterior)
Additionally, the following method must be implemented if a BosipProblem with !isnothing(problem.y_sets) is used:

get_subset(::Likelihood, y_set::AbstractVector{<:Bool})
The following additional methods are provided by default and need not be implemented:
log_approx_likelihood(::Likelihood, ::BosipProblem, ::ModelPosterior)
like(::Likelihood, δ::AbstractVector{<:Real}, [x::AbstractVector{<:Real}])
To implement a custom likelihood, either subtype Likelihood directly and implement its full interface, or alternatively subtype MonteCarloLikelihood, which provides a simplified interface. The full Likelihood interface can be used to define closed-form solutions for the integrals required to calculate the expected likelihood and its variance with respect to the surrogate model uncertainty. If one subtypes MonteCarloLikelihood, these integrals are automatically approximated using Monte Carlo integration.
BOSIP.MonteCarloLikelihood — Type

MonteCarloLikelihood <: Likelihood

An abstract type for a simplified definition of likelihoods in comparison to the default Likelihood interface.
Consider defining a custom likelihood by subtyping Likelihood and implementing the full interface to provide closed-form solutions for the integrals in log_likelihood_mean, log_sq_likelihood_mean, and log_likelihood_variance.
Defining a Custom Monte Carlo Likelihood
Each subtype of MonteCarloLikelihood should implement:
loglike(::MonteCarloLikelihood, δ::AbstractVector{<:Real}, [x::AbstractVector{<:Real}]) -> ::Real
mc_samples(::MonteCarloLikelihood) -> ::Int
The rest of the Likelihood interface is already implemented via Monte Carlo integration.
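As a sketch of this simplified interface, a hypothetical Laplace-noise likelihood might be defined as follows. The type name and its fields are illustrative; only the two documented methods are implemented.

using BOSIP

# Hypothetical likelihood: Laplace observation noise with known scales.
struct LaplaceLikelihood <: MonteCarloLikelihood
    z_obs::Vector{Float64}   # observed values
    scale::Vector{Float64}   # known noise scales
end

# Log-density of the observation z_obs given the simulator proxy δ.
function BOSIP.loglike(l::LaplaceLikelihood, δ::AbstractVector{<:Real})
    return sum(-abs.(l.z_obs .- δ) ./ l.scale .- log.(2 .* l.scale))
end

# Number of MC samples used to approximate the expected likelihood and its variance.
BOSIP.mc_samples(::LaplaceLikelihood) = 1000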
Alternatively, one can simply instantiate the CustomLikelihood and provide the mapping from the modeled variable to the log-likelihood. This is functionally equivalent to defining a new MonteCarloLikelihood subtype.
BOSIP.CustomLikelihood — Type

CustomLikelihood(; log_ψ::Function)

A custom likelihood defined by providing the log-likelihood mapping $\log ψ: (δ, x) \mapsto \log p(z_o|δ)$, where $z_o$ is the observation, $δ$ is the proxy variable modeled by the surrogate model, and $x$ are the input parameters (which will usually not be used for the calculation).

The parameters $x$ are provided for special cases where some transformation of the modeled variable, depending on the input parameters, is used.
Keywords
- log_ψ::Function: A function log(ℓ) = log_ψ(δ, x) computing the log-likelihood for a given model output δ and input parameters x. Here, δ is the proxy variable modeled by the surrogate model and x are the input parameters (which will usually not be used for the calculation).
- mc_samples::Int = 1000: Number of Monte Carlo samples to use when computing the expected log-likelihood and its variance.
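For example, the Laplace-noise likelihood sketched above could be expressed without defining a new type (the observation values here are illustrative):

z_obs, scale = [0.3, 1.2], [0.1, 0.1]
lik = CustomLikelihood(;
    log_ψ = (δ, x) -> sum(-abs.(z_obs .- δ) ./ scale .- log.(2 .* scale)),
    mc_samples = 1000,
)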
A list of some predefined likelihoods follows:
The NormalLikelihood assumes that the observation z_o has been drawn from a Gaussian distribution with a known diagonal covariance matrix with the std_obs values on the diagonal. The simulator is used to learn the mean function.
BOSIP.NormalLikelihood — Type

NormalLikelihood(; z_obs, std_obs)

The observation is assumed to have been generated from a normal distribution as z_o \sim Normal(f(x), Diagonal(std_obs)). We can use the simulator to query y = f(x).
Kwargs
- z_obs::Vector{Float64}: The observed values from the real experiment.
- std_obs::Union{Vector{Float64}, Nothing}: The standard deviations of the Gaussian observation noise on each dimension of the "ground truth" observation. (If the observation is considered to be generated from the simulator and not some "real" experiment, provide std_obs = nothing and the adaptively trained simulation noise deviation will be used in place of the experiment noise deviation as well. This may be the case for some toy problems or benchmarks.)
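A brief usage sketch (the observation values are illustrative):

# Two observed output dimensions with known noise std 0.1 each.
lik = NormalLikelihood(; z_obs = [0.3, 1.2], std_obs = [0.1, 0.1])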
The LogNormalLikelihood assumes that the observation z_o has been drawn from a log-normal distribution with a known coefficient of variation CV. The simulator is used to learn the mean function.
BOSIP.LogNormalLikelihood — Type

LogNormalLikelihood(; kwargs...)

The observation z is assumed to follow a log-normal distribution with the expected value \mathbf{E}[y] = z_obs and the fixed coefficient of variation CV, where y is the true response variable (without observation noise).

We assume that the surrogate model approximates the log-response log(y) = log(f(x)). Modeling the log-response is more suitable as y is strictly positive. Accordingly, the observation is provided in the log-space as log(z_obs) to avoid confusion. (This way, the simulator log(y) = log(f(x)) should return similar values to log(z_obs).)

Multiple dimensions of the observation z are assumed to be independent.

This likelihood model corresponds to many physical applications where measurement diagnostics have a relative error (e.g. "± 20%") rather than an absolute error (e.g. "± 0.1").
Kwargs
- log_z_obs::Vector{Float64}: Log of the observed values from the real experiment.
- CV::Vector{Float64}: The coefficients of variation of the observations describing the relative observation error. (If a measurement device is described to have precision "± 20%", this usually means that ~95% of the measurements fall within 20% of the true value, which corresponds to CV = 0.2 / 2 = 0.1.)
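A brief usage sketch (values illustrative); note that the observation is passed in log-space:

# A "±20%" measurement precision corresponds to CV ≈ 0.1 on each dimension.
lik = LogNormalLikelihood(; log_z_obs = log.([0.3, 1.2]), CV = [0.1, 0.1])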
The BinomialLikelihood assumes that the observation z_o has been drawn from a Binomial distribution with a known number of trials trials. The simulator is used to learn the probability parameter p as a function of the input parameters. The expectation over this likelihood (in case one wants to use posterior_mean and/or posterior_variance) is calculated via simple numerical integration on a predefined grid.
BOSIP.BinomialLikelihood — Type

BinomialLikelihood(; z_obs, trials, kwargs...)

The observation is assumed to have been generated from a Binomial distribution as z_o \sim Binomial(trials, f(x)). We can use the simulator to query y = f(x).
The simulator should only return values between 0 and 1. The GP estimates are clamped to this range.
Kwargs
- z_obs::Vector{Int64}: The observed values from the real experiment.
- trials::Vector{Int64}: The number of trials for each observation dimension.
- int_grid_size::Int64: The number of samples used to approximate the expected likelihood.
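A brief usage sketch (values illustrative):

# 8 successes out of 20 trials observed on a single output dimension.
lik = BinomialLikelihood(; z_obs = [8], trials = [20], int_grid_size = 200)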
The ExpLikelihood assumes that the function f of the BosipProblem already maps the parameters $x$ to the log-likelihood $\log p(z_o|y)$. Thus, the ExpLikelihood only exponentiates the surrogate model output $\delta$ to obtain the likelihood value.
BOSIP.ExpLikelihood — Type

ExpLikelihood()
Assumes the model approximates the log-likelihood directly (as a scalar). Only exponentiates the model prediction.
Acquisition Function
The abstract type BosipAcquisition represents the acquisition function.
BOSIP.BosipAcquisition — Type

An abstract type for BOSIP acquisition functions.

Required API for subtypes of BosipAcquisition:

- Implement method (::CustomAcq)(::Type{<:UniFittedParams}, ::BosipProblem, ::BosipOptions) -> (x -> ::Real).

Optional API for subtypes of BosipAcquisition:

- Implement method (::CustomAcq)(::Type{<:MultiFittedParams}, ::BosipProblem, ::BosipOptions) -> (x -> ::Real). A default fallback is provided for MultiFittedParams, which averages the individual acquisition functions for each sample.
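A hypothetical sketch of this API follows. The scoring rule is purely illustrative, and UniFittedParams is assumed to be available from BOSS.jl.

using BOSIP, BOSS

struct CenterAcq <: BosipAcquisition end

# Return a function scoring candidate points x; here (illustratively)
# we simply prefer points close to the origin.
function (acq::CenterAcq)(::Type{<:UniFittedParams}, problem::BosipProblem, options::BosipOptions)
    return x -> -sum(abs2, x)
end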
The MaxVar acquisition can be used to solve LFI problems. It maximizes the posterior variance to select the next evaluation point.
BOSIP.MaxVar — Type

MaxVar()
Selects the new evaluation point by maximizing the variance of the posterior approximation.
BOSIP.LogMaxVar — Type

LogMaxVar()
Selects the new evaluation point by maximizing the log variance of the posterior approximation.
The LogMaxVar acquisition is functionally equivalent to MaxVar. Either of the two can be more or less suitable depending on the scenario, and switching between them can help with numerical stability.
The IMMD acquisition maximizes the Integrated MMD as a proxy for the Expected Integrated Information Gain. That is, it attempts to minimize the entropy of the current distribution over the possible parameter posteriors (which is implicitly given by the surrogate model posterior). However, since calculating the KLD is too challenging, the MMD is used instead. Beware that there are no theoretical guarantees for this approximation.
BOSIP.IMMD — Type

IMMD(; kwargs...)
Selects new data point by maximizing the Integrated MMD (IMMD), where MMD stands for maximum mean discrepancy.
This acquisition function is (loosely) based on information gain. Ideally, we would like to calculate the mutual information between the new data point (a vector-valued random variable from a multivariate distribution given by the GPs) and the posterior approximation (a "random function" from an infinite-dimensional distribution).
Calculating the mutual information of an infinite-dimensional variable is infeasible. Thus, we calculate the mutual information of the new data point and the posterior probability value at a single point x, integrated over x. This integral is still infeasible, but can be approximated by Monte Carlo integration.
Mutual information is calculated as the Kullback-Leibler divergence (KLD) of the joint and marginal distributions of the two variables. Instead of the KLD, we use the MMD distance, as it can be readily estimated from samples. Finally, instead of the MMD between the joint and marginal distributions, we can calculate the HSIC (Hilbert-Schmidt independence criterion) of the two variables.
In conclusion, instead of the mutual information of the new data point (a vector-valued random variable) and the posterior pdf (a function-valued random variable), we calculate the HSIC between the new data point and some point x on the domain, integrated over x.
Kwargs
- y_samples::Int64: The number of samples drawn from the joint and marginal distributions to estimate the HSIC value.
- x_samples::Int64: The number of samples used to approximate the integral over the parameter domain.
- x_proposal::MultivariateDistribution: The distribution used to draw the parameter samples that numerically approximate the integral over the parameter domain.
- y_kernel::Kernel: The kernel used for the samples of the new data point.
- p_kernel::Kernel: The kernel used for the posterior function value samples.
The MWMV acquisition can be used to solve LFSS problems. It maximizes the "mass-weighted mean variance" of the posteriors given by the different sensor sets.
BOSIP.MWMV — Type

MWMV(; kwargs...)
The Mass-Weighted Mean Variance acquisition function.
Selects the next evaluation point by maximizing a weighted average of the variances of the individual posterior approximations given by different sensor sets. The weights are determined as the total probability mass of the current data w.r.t. each approximate posterior.
Keywords
- samples::Int: The number of samples used to estimate the evidence.
Termination Condition
The abstract type BosipTermCond represents the termination condition for the whole BOSIP procedure. Additionally, any BOSS.TermCond from the BOSS.jl package can be used with BOSIP.jl as well, and it will be automatically converted to a BosipTermCond.
BOSIP.BosipTermCond — Type

An abstract type for BOSIP termination conditions.

Implementing a custom termination condition:

- Create struct CustomTermCond <: BosipTermCond
- Implement method (::CustomTermCond)(::BosipProblem) -> ::Bool
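A hypothetical sketch of this API; both the problem.data.X field layout and the return-value convention (true meaning "continue") are assumptions to verify against your version.

using BOSIP

# Hypothetical condition: keep running until 50 datapoints are collected.
struct DataLimit <: BosipTermCond
    max_data::Int
end

# Assumed convention: return `true` to continue, `false` to terminate.
(cond::DataLimit)(problem::BosipProblem) = size(problem.data.X, 2) < cond.max_data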
The most basic termination condition is the BOSS.IterLimit, which can be used to simply terminate the procedure after a predefined number of iterations.
BOSIP.jl provides two specialized termination conditions: the AEConfidence and the UBLBConfidence. Both of them estimate the degree of convergence by comparing confidence regions given by two different approximations of the posterior.
BOSIP.AEConfidence — Type

AEConfidence(; kwargs...)

Calculates the q-confidence regions of the expected and the approximate posteriors. Terminates once the IoU of the two confidence regions surpasses r.
Keywords
- max_iters::Union{Nothing, <:Int}: The maximum number of iterations.
- samples::Int: The number of samples used to approximate the confidence regions and their IoU ratio. Only has an effect if isnothing(xs).
- xs::Union{Nothing, <:AbstractMatrix{<:Real}}: Can be used to provide a pre-sampled set of parameter samples from the x_prior defined in the BosipProblem.
- q::Float64: The confidence value of the confidence regions. Defaults to q = 0.95.
- r::Float64: The algorithm terminates once the IoU ratio surpasses r. Defaults to r = 0.95.
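A brief usage sketch with illustrative values:

# Terminate once the IoU of the 95%-confidence regions surpasses 0.9,
# or after at most 100 iterations.
term_cond = AEConfidence(; max_iters = 100, samples = 10_000, q = 0.95, r = 0.9)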
BOSIP.UBLBConfidence — Type

UBLBConfidence(; kwargs...)

Calculates the q-confidence regions of the UB and LB posterior approximations. Terminates once the IoU of the two confidence regions surpasses r. The UB and LB approximations are calculated using the GP mean ± n GP standard deviations.
Keywords
- max_iters::Union{Nothing, <:Int}: The maximum number of iterations.
- samples::Int: The number of samples used to approximate the confidence regions and their IoU ratio. Only has an effect if isnothing(xs).
- xs::Union{Nothing, <:AbstractMatrix{<:Real}}: Can be used to provide a pre-sampled set of parameter samples from the x_prior defined in the BosipProblem.
- n::Float64: The number of predictive deviations added/subtracted from the GP mean to get the two posterior approximations. Defaults to n = 1.
- q::Float64: The confidence value of the confidence regions. Defaults to q = 0.8.
- r::Float64: The algorithm terminates once the IoU ratio surpasses r. Defaults to r = 0.8.
Miscellaneous
The BosipOptions structure can be used to define miscellaneous settings of BOSIP.jl.
BOSIP.BosipOptions — Type

BosipOptions(; kwargs...)
Stores miscellaneous settings.
Keywords
- info::Bool: Setting info=false silences the algorithm.
- debug::Bool: Set debug=true to print stack traces of caught optimization errors.
- parallel_evals::Symbol: Possible values: :serial, :parallel, :distributed. Defaults to :parallel. Determines whether to run multiple objective function evaluations within one batch in serial, parallel, or distributed fashion. (Only has an effect if a batching AM is used.)
- callback::Union{<:BossCallback, <:BosipCallback}: If provided, the callback will be called before the BO procedure starts and after every iteration.
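A brief usage sketch:

# Silence the default output, but print stack traces of caught errors.
options = BosipOptions(; info = false, debug = true, parallel_evals = :serial)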
The abstract type BosipCallback can be derived to define a custom callback, which will be called once before the BOSIP procedure starts, and subsequently in every iteration.
For an example usage of this functionality, see the example in the package repository, where a custom callback is used to create the plots.
BOSIP.BosipCallback — Type

If a callback cb of type BosipCallback is defined in BosipOptions, the method cb(::BosipProblem; kwargs...) will be called in every iteration.
cb(problem::BosipProblem;
model_fitter::BOSS.ModelFitter,
acq_maximizer::BOSS.AcquisitionMaximizer,
term_cond::TermCond, # either `BOSS.TermCond` or a `BosipTermCond` wrapped into `TermCondWrapper`
options::BossOptions,
first::Bool,
)
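A hypothetical callback sketch matching this signature (the problem.data.X field layout is an assumption); it would be passed via the callback keyword of BosipOptions.

using BOSIP

struct SizeLogger <: BosipCallback end

function (cb::SizeLogger)(problem::BosipProblem; first::Bool, kwargs...)
    first && println("Starting the BOSIP run.")
    println("Current dataset size: ", size(problem.data.X, 2))  # field layout assumed
end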
Samplers
The subtypes of DistributionSampler can be used to draw samples from the trained parameter posterior distribution.
BOSIP.DistributionSampler — Type

DistributionSampler

Subtypes of DistributionSampler are used to sample from a probability distribution.

Each subtype of DistributionSampler should implement:
sample_posterior(::DistributionSampler, logpost::Function, domain::Domain, count::Int; kwargs...) -> (X, ws)
Each subtype of DistributionSampler may additionally implement:
sample_posterior(::DistributionSampler, loglike::Function, prior::MultivariateDistribution, domain::Domain, count::Int; kwargs...) -> (X, ws)
See also: PureSampler, WeightedSampler
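A hypothetical usage sketch of this interface with a toy log-posterior and the RejectionSampler described below; the Domain constructor arguments and the default sampler keywords are assumptions.

using BOSIP, BOSS

# Toy unnormalized log-posterior: standard normal in 2D.
logpost(x) = -sum(abs2, x) / 2

sampler = RejectionSampler()  # assuming default keywords
domain = Domain(; bounds = ([-5.0, -5.0], [5.0, 5.0]))  # constructor assumed from BOSS.jl
X, ws = sample_posterior(sampler, logpost, domain, 1_000)  # column-wise samples + weights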
BOSIP.PureSampler — Type

PureSampler <: DistributionSampler

A DistributionSampler which samples directly from the provided pdf, and always returns samples with uniform weights.
BOSIP.WeightedSampler — Type

WeightedSampler <: DistributionSampler

A DistributionSampler which does not sample directly from the pdf, but instead returns samples with non-uniform weights correcting for the sampling bias.
In particular, the following distribution samplers are currently provided.
BOSIP.RejectionSampler — Type

RejectionSampler(; kwargs...)
A sampler that uses trivial rejection sampling to draw samples from the posterior distribution.
Keywords
- logpdf_maximizer::LogpdfMaximizer: The optimizer used to find the maximum logpdf value.
BOSIP.TuringSampler — Type

TuringSampler <: DistributionSampler
TuringSampler(; kwargs...)

Aggregates settings for the sample_posterior function, which uses the Turing.jl package.
Keywords
- sampler::Any: The sampling algorithm used to draw the samples.
- warmup::Int: The number of initial unused 'warmup' samples in each chain.
- chain_count::Int: The number of independent chains sampled.
- leap_size: Every leap_size-th sample is used from each chain. (To avoid correlated samples.)
- parallel: If parallel=true, the chains are sampled in parallel.
Sampling Process

In each sampled chain:

- The first warmup samples are discarded.
- From the following leap_size * samples_in_chain samples, each leap_size-th one is kept.

Then the samples from all chains are concatenated and returned.

Total drawn samples: chain_count * (warmup + leap_size * samples_in_chain)
Total returned samples: chain_count * samples_in_chain
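A brief construction sketch, assuming Turing.jl's NUTS sampler:

using Turing

# 4 parallel chains; discard 500 warmup draws, then keep every 5th sample.
sampler = TuringSampler(;
    sampler = NUTS(),
    warmup = 500,
    chain_count = 4,
    leap_size = 5,
    parallel = true,
)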
BOSIP.AMISSampler — Type

AMIS(; kwargs...)

Adaptive Metropolis Importance Sampling (AMIS) sampler for posterior distributions.

The sampler first approximates the posterior distribution by a Laplace approximation centered on the maximum of the posterior, or with a Gaussian mixture model, and draws samples from it in the 0th iteration.

Afterwards, the AMIS algorithm is run for iters iterations with a simple Gaussian proposal distribution re-fitted in each iteration.
Keywords
- iters::Int: Number of iterations of the AMIS algorithm.
- proposal_fitter::DistributionFitter: The algorithm used to re-fit the proposal distribution in each iteration. Defaults to the AnalyticalFitter.
- gauss_mix_options::Union{Nothing, GaussMixOptions}: Options for the Gaussian mixture approximation used for the 0th iteration. Defaults to nothing, which means the Laplace approximation is used instead.
Evaluation Metric
The subtypes of DistributionMetric can be used to evaluate the quality of the learned parameter posterior distribution.
BOSIP.DistributionMetric — Type

Subtypes of DistributionMetric are used to evaluate the quality of the posterior approximation.

The DistributionMetrics are grouped into two categories: SampleMetric and PDFMetric.
BOSIP.SampleMetric — Type

SampleMetric is a subtype of DistributionMetric that evaluates the quality of the posterior approximation based on samples drawn from the true and approximate posteriors.

Each subtype of SampleMetric should implement:
calculate_metric(::DistributionMetric, true_samples::AbstractMatrix{<:Real}, approx_samples::AbstractMatrix{<:Real}; kwargs...) -> ::Real
See also: DistributionMetric, PDFMetric
BOSIP.PDFMetric — Type

PDFMetric is a subtype of DistributionMetric that evaluates the quality of the posterior approximation based on the log-probability density functions (logpdfs) of the true and approximate posteriors.

Each subtype of PDFMetric should implement:
calculate_metric(::DistributionMetric, true_logpost::Function, approx_logpost::Function; kwargs...) -> ::Real
See also: DistributionMetric, SampleMetric
In particular, the following metrics are currently provided.
BOSIP.MMDMetric — Type

MMDMetric(; kwargs...)
Measures the quality of the posterior approximation by sampling from the true posterior and the approximate posterior and calculating the Maximum Mean Discrepancy (MMD) between the two sample sets.
Keywords
- kernel::Kernel: The kernel used to calculate the MMD. It is important to choose appropriate lengthscales for the kernel.
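A brief usage sketch; the kernel choice, the lengthscale, and the toy sample sets are illustrative, with with_lengthscale and SqExponentialKernel assumed from KernelFunctions.jl.

using KernelFunctions

# Toy sample sets: columns are samples from the true/approximate posteriors.
true_samples = randn(2, 1000)
approx_samples = randn(2, 1000) .+ 0.1

metric = MMDMetric(; kernel = with_lengthscale(SqExponentialKernel(), 0.5))
mmd = calculate_metric(metric, true_samples, approx_samples)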
BOSIP.OptMMDMetric — Type

OptMMDMetric(; kwargs...)
Measures the quality of the posterior approximation by sampling from the true posterior and the approximate posterior and calculating the Maximum Mean Discrepancy (MMD).
In contrast to MMDMetric, this metric optimizes the kernel lengthscales automatically during each evaluation of the metric.
Keywords
- kernel::Kernel: The kernel used to calculate the MMD. (Provide a kernel without lengthscales, as they are optimized automatically.)
- bounds::AbstractBounds: The domain bounds of the BosipProblem.
- algorithm: The optimization algorithm used to optimize the kernel lengthscales.
- kwargs...: Additional keyword arguments passed to the optimization algorithm.
BOSIP.TVMetric — Type

TVMetric(; kwargs...)
Measures the quality of the posterior approximation by approximating the Total Variation (TV) distance based on a precomputed parameter grid.
Keywords
- grid::Matrix{Float64}: The parameter grid used to approximate the TV integral.
- log_ws::Vector{Float64}: The log-weights for the grid points. Should be 1 / q(x), where q(x) is the probability density function of the distribution used to sample the grid points. (1 / domain_area is appropriate for an evenly distributed grid.) (It is also possible to provide the non-logarithmic weights ws instead.)
- true_logpost::Function: The log-pdf of the true posterior distribution. If provided, the log-pdf values on the grid are cached, which greatly improves performance.
References
[1] Gutmann, Michael U., and Jukka Corander. "Bayesian optimization for likelihood-free inference of simulator-based statistical models." Journal of Machine Learning Research 17.125 (2016): 1-47.
[2] Järvenpää, Marko, et al. "Efficient acquisition rules for model-based approximate Bayesian computation." Bayesian Analysis 14.2 (2019): 595-622.