Functions

This page contains documentation for all exported functions.

Training

Call the main function bosip! to run the BOSIP procedure, which sequentially queries the expensive blackbox simulator to learn the parameter posterior efficiently.

BOSIP.bosip! - Function
bosip!(::BosipProblem; kwargs...)

Run the BOSIP method on the given BosipProblem.

The bosip! function is a wrapper for BOSS.bo!, which implements the underlying Bayesian optimization procedure.

Arguments

  • problem::BosipProblem: Defines the inference problem, together with all model hyperparameters.

Keywords

  • model_fitter::BOSS.ModelFitter: Defines the algorithm used to estimate the model hyperparameters.
  • acq_maximizer::BOSS.AcquisitionMaximizer: Defines the algorithm used to maximize the acquisition function in order to select the next evaluation point in each iteration.
  • term_cond::Union{<:BOSS.TermCond, <:BosipTermCond}: Defines the termination condition of the whole procedure.
  • options::BosipOptions: Can be used to specify additional miscellaneous options.

References

BOSS.bo!, BosipProblem, BosipAcquisition, BOSS.ModelFitter, BOSS.AcquisitionMaximizer, BOSS.TermCond, BosipTermCond, BosipOptions

Examples

See 'https://soldasim.github.io/BOSIP.jl/stable/example_lfi' for example usage.

source

Call the function estimate_parameters! to fit the model hyperparameters to the current dataset. (Alternatively, call bosip! with term_cond = IterLimit(0) to fit the hyperparameters without running any simulations. In that case the model is refit only if the dataset has changed since the last parameter estimation, whereas estimate_parameters! always re-runs the estimation.)

BOSS.estimate_parameters! - Function
estimate_parameters!(::BossProblem, ::ModelFitter)

Estimate the model parameters & hyperparameters using the given model_fitter algorithm.

Keywords

  • options::BossOptions: Defines miscellaneous settings.
source
estimate_parameters!(::BosipProblem, ::ModelFitter)

Estimate the hyperparameters of the model. Uses the provided ModelFitter to fit the hyperparameters of the model according to the data stored in the BosipProblem.

Keywords

  • options::BosipOptions: Defines miscellaneous settings.
source

Analytically compute the optimal estimate of the distribution parameters according to the given data xs.

This function is a part of the optional API of the ProposalDistribution and may not be implemented for every distribution.

source

Call the function maximize_acquisition to obtain a promising candidate for the next simulation.

BOSS.maximize_acquisition - Function
x = maximize_acquisition(::BossProblem, ::AcquisitionMaximizer)

Maximize the given acquisition function via the given acq_maximizer algorithm to find the optimal next evaluation point(s).

Keywords

  • options::BossOptions: Defines miscellaneous settings.
source
x = maximize_acquisition(::BosipProblem, ::AcquisitionMaximizer)

Select parameters for the next simulation. Uses the provided AcquisitionMaximizer to maximize the acquisition function and find the optimal candidate parameters.

Keywords

  • options::BosipOptions: Defines miscellaneous settings.
source

Call the function eval_objective! to start a simulation run.

BOSS.eval_objective! - Function
eval_objective!(::BossProblem, x::AbstractVector{<:Real})

Evaluate the objective function and update the data.

Keywords

  • options::BossOptions: Defines miscellaneous settings.
source
eval_objective!(::BosipProblem, x::AbstractVector{<:Real})

Evaluate the blackbox simulation for the given parameters x.

Keywords

  • options::BosipOptions: Defines miscellaneous settings.
source
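
Taken together, the three functions above form one iteration of the procedure: fit the model, select a candidate, run the simulation. The following is a minimal sketch of such a manual loop. To stay self-contained it takes the three steps as plain functions; manual_loop! and the toy functions are hypothetical and not part of BOSIP. With BOSIP loaded, one would presumably pass closures around estimate_parameters!, maximize_acquisition, and eval_objective! using the signatures documented above.

```julia
# Hypothetical helper: runs `iters` iterations of fit -> propose -> evaluate.
function manual_loop!(problem, fit!, propose, evaluate!; iters::Int)
    for _ in 1:iters
        fit!(problem)            # e.g. p -> estimate_parameters!(p, model_fitter)
        x = propose(problem)     # e.g. p -> maximize_acquisition(p, acq_maximizer)
        evaluate!(problem, x)    # e.g. (p, x) -> eval_objective!(p, x)
    end
    return problem
end

# Toy stand-in for a problem: a growing list of evaluated points.
toy_problem = Vector{Vector{Float64}}()
fit_calls = Ref(0)

toy_fit!(p) = (fit_calls[] += 1)             # "fit": only counts the calls
toy_propose(p) = [Float64(length(p) + 1)]    # "propose": next integer as a 1-D point
toy_evaluate!(p, x) = push!(p, x)            # "evaluate": store the point

manual_loop!(toy_problem, toy_fit!, toy_propose, toy_evaluate!; iters = 3)
```

Running the steps by hand like this is mainly useful when something has to happen between them, for example logging or a custom stopping rule; otherwise bosip! covers the whole loop.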

Parameter Posterior & Likelihood

This section contains functions used to obtain the trained parameter posterior/likelihood approximations.

The approx_posterior function can be used to obtain the (un)normalized approximate posterior $p(x|z_o) \propto p(z_o|x) p(x)$ obtained by substituting the predictive means of the GPs directly as the discrepancies from the true observation.

BOSIP.approx_posterior - Function
approx_posterior(::BosipProblem; kwargs...)

Return the MAP estimation of the unnormalized approximate posterior $\hat{p}(z_o|x) p(x)$ as a function of $x$.

If normalize=true, the resulting posterior is approximately normalized.

The posterior is approximated by directly substituting the predictive means of the GPs as the discrepancies from the true observation and ignoring both the uncertainty of the GPs due to a lack of data and due to the simulator evaluation noise.

The choice between approx_posterior and posterior_mean controls whether the uncertainty in the discrepancy estimate is integrated over. In addition, passing a ModelFitter{MAP} or a ModelFitter{BI} to bosip! controls whether the uncertainty in the GP hyperparameters is integrated over.

Keywords

  • normalize::Bool: If normalize is set to true, the evidence $\hat{p}(z_o)$ is estimated by sampling and the normalized approximate posterior $\hat{p}(z_o|x) p(x) / \hat{p}(z_o)$ is returned instead of the unnormalized one.
  • xs::Union{Nothing, <:AbstractMatrix{<:Real}}: Can be used to provide a pre-sampled set of samples from the parameter prior $p(x)$ as a column-wise matrix. Only has an effect if normalize == true.
  • samples::Int: Controls the number of samples used to estimate the evidence. Only has an effect if normalize == true and isnothing(xs).

See Also

posterior_mean, posterior_variance, approx_likelihood

source

The posterior_mean function can be used to obtain the expected value of the (un)normalized posterior $\mathbb{E}\left[p(x|z_o)\right] \propto \mathbb{E}\left[p(z_o|x)p(x)\right]$ obtained by analytically integrating over the uncertainty of the GPs and the simulator.

BOSIP.posterior_mean - Function
posterior_mean(::BosipProblem; kwargs...)

Return the expectation of the unnormalized posterior $\mathbb{E}[\hat{p}(z_o|x) p(x)]$ as a function of $x$.

If normalize=true, the resulting expected posterior is approximately normalized.

The returned function maps parameters x to the expected posterior probability density value integrated over the uncertainty of the GPs due to a lack of data and due to the simulator evaluation noise.

The choice between approx_posterior and posterior_mean controls whether the uncertainty in the discrepancy estimate is integrated over. In addition, passing a ModelFitter{MAP} or a ModelFitter{BI} to bosip! controls whether the uncertainty in the GP hyperparameters is integrated over.

Keywords

  • normalize::Bool: If normalize is set to true, the evidence $\hat{p}(z_o)$ is estimated by sampling and the normalized expected posterior $\mathbb{E}[\hat{p}(z_o|x) p(x) / \hat{p}(z_o)]$ is returned instead of the unnormalized one.
  • xs::Union{Nothing, <:AbstractMatrix{<:Real}}: Can be used to provide a pre-sampled set of samples from the parameter prior $p(x)$ as a column-wise matrix. Only has an effect if normalize == true.
  • samples::Int: Controls the number of samples used to estimate the evidence. Only has an effect if normalize == true and isnothing(xs).

See Also

approx_posterior, posterior_variance, likelihood_mean

source
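
The difference between the plug-in estimate (approx_posterior) and the expected value (posterior_mean) can be illustrated on a toy case. The sketch below assumes a single discrepancy δ with a Gaussian predictive distribution N(μ, s²) and a Gaussian likelihood N(δ; 0, σ²). These modelling choices and all names are illustrative assumptions, not BOSIP API. Under them the plug-in value is N(μ; 0, σ²), while the expectation over δ has the closed form N(μ; 0, σ² + s²).

```julia
using Random, Statistics

# Density of N(mean, std^2) evaluated at x.
normal_pdf(x, mean, std) = exp(-0.5 * ((x - mean) / std)^2) / (std * sqrt(2π))

μ, s = 0.8, 0.5    # assumed predictive mean and std of the discrepancy
σ = 0.3            # assumed std of the Gaussian likelihood

# Plug-in estimate: substitute the predictive mean for the discrepancy.
plug_in = normal_pdf(μ, 0.0, σ)

# Expected likelihood: integrate over δ ~ N(μ, s^2) in closed form.
expected = normal_pdf(μ, 0.0, sqrt(σ^2 + s^2))

# Monte Carlo check of the closed form.
rng = MersenneTwister(1)
δs = μ .+ s .* randn(rng, 200_000)
expected_mc = mean(normal_pdf.(δs, 0.0, σ))
```

In this toy setting the plug-in value is much smaller than the expected value, because the plug-in estimate ignores the predictive uncertainty s and therefore treats the point as confidently far from the observation.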

The posterior_variance function can be used to obtain the variance of the (un)normalized posterior $\mathbb{V}\left[p(x|z_o)\right] \propto \mathbb{V}\left[p(z_o|x)p(x)\right]$ obtained by analytically integrating over the uncertainty of the GPs and the simulator.

BOSIP.posterior_variance - Function
posterior_variance(::BosipProblem; kwargs...)

Return the variance of the unnormalized posterior $\mathbb{V}[\hat{p}(z_o|x) p(x)]$ as a function of $x$.

If normalize=true, the resulting posterior variance is approximately normalized.

The returned function maps parameters x to the variance of the posterior probability density value estimate caused by the uncertainty of the GPs due to a lack of data and due to the simulator evaluation noise.

Passing a ModelFitter{MAP} or a ModelFitter{BI} to bosip! controls whether the variance also accounts for the uncertainty in the GP hyperparameters.

Keywords

  • normalize::Bool: If normalize is set to true, the evidence $\hat{p}(z_o)$ is estimated by sampling and the normalized posterior variance $\mathbb{V}[\hat{p}(z_o|x) p(x) / \hat{p}(z_o)]$ is returned instead of the unnormalized one.
  • xs::Union{Nothing, <:AbstractMatrix{<:Real}}: Can be used to provide a pre-sampled set of samples from the parameter prior $p(x)$ as a column-wise matrix. Only has an effect if normalize == true.
  • samples::Int: Controls the number of samples used to estimate the evidence. Only has an effect if normalize == true and isnothing(xs).

See Also

approx_posterior, posterior_mean, likelihood_variance

source

The approx_likelihood function can be used to obtain the approximate likelihood $p(z_o|x)$ obtained by substituting the predictive means of the GPs directly as the discrepancies from the true observation.

BOSIP.approx_likelihood - Function
approx_likelihood(::BosipProblem)

Return the MAP estimation of the likelihood $\hat{p}(z_o|x)$ as a function of $x$.

The likelihood is approximated by directly substituting the predictive means of the GPs as the discrepancies from the true observation and ignoring both the uncertainty of the GPs due to a lack of data and due to the simulator evaluation noise.

The choice between approx_likelihood and likelihood_mean controls whether the uncertainty in the discrepancy estimate is integrated over. In addition, passing a ModelFitter{MAP} or a ModelFitter{BI} to bosip! controls whether the uncertainty in the GP hyperparameters is integrated over.

See Also

likelihood_mean, likelihood_variance, approx_posterior

source
BOSIP.log_approx_likelihood - Function
log_approx_likelihood(::Likelihood, ::BosipProblem, ::ModelPosterior)

Return a function log_approx_like mapping $x$ to $\log \hat{p}(z_o|x)$, with the following two methods:

  • log_approx_like(x::AbstractVector{<:Real}) -> ::Real
  • log_approx_like(X::AbstractMatrix{<:Real}) -> ::AbstractVector{<:Real}
source
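
The two-method convention (a vector for a single point, a column-wise matrix for a batch of points) can be sketched as follows. The function toy_log_like and its quadratic form are hypothetical placeholders; only the calling convention mirrors the description above.

```julia
# Single point: x is a parameter vector, the result is a scalar.
toy_log_like(x::AbstractVector{<:Real}) = -0.5 * sum(abs2, x)

# Batch: each column of X is one parameter vector, the result is a vector.
toy_log_like(X::AbstractMatrix{<:Real}) = map(toy_log_like, eachcol(X))

X = [0.0 1.0 2.0;
     0.0 1.0 0.0]          # three 2-D points stored column-wise
vals = toy_log_like(X)     # one value per column
```

The same convention applies to the functions returned by log_likelihood_mean and log_likelihood_variance.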

The likelihood_mean function can be used to obtain the expected value of the likelihood $\mathbb{E}\left[p(z_o|x)\right]$ obtained by analytically integrating over the uncertainty of the GPs and the simulator.

BOSIP.likelihood_mean - Function
likelihood_mean(::BosipProblem)

Return the expectation of the likelihood approximation $\mathbb{E}[\hat{p}(z_o|x)]$ as a function of $x$.

The returned function maps parameters x to the expected likelihood probability density value integrated over the uncertainty of the GPs due to a lack of data and due to the simulator evaluation noise.

The choice between approx_likelihood and likelihood_mean controls whether the uncertainty in the discrepancy estimate is integrated over. In addition, passing a ModelFitter{MAP} or a ModelFitter{BI} to bosip! controls whether the uncertainty in the GP hyperparameters is integrated over.

See Also

approx_likelihood, likelihood_variance, posterior_mean

source
BOSIP.log_likelihood_mean - Function
log_likelihood_mean(::Likelihood, ::BosipProblem, ::ModelPosterior)

Return a function log_like_mean mapping $x$ to $\log \mathbb{E}[ \hat{p}(z_o|x) | GP ]$, with the following two methods:

  • log_like_mean(x::AbstractVector{<:Real}) -> ::Real
  • log_like_mean(X::AbstractMatrix{<:Real}) -> ::AbstractVector{<:Real}
source

The likelihood_variance function can be used to obtain the variance of the likelihood $\mathbb{V}\left[p(z_o|x)\right]$ obtained by analytically integrating over the uncertainty of the GPs and the simulator.

BOSIP.likelihood_variance - Function
likelihood_variance(::BosipProblem)

Return the variance of the likelihood approximation $\mathbb{V}[\hat{p}(z_o|x)]$ as a function of $x$.

The returned function maps parameters x to the variance of the likelihood probability density value estimate caused by the uncertainty of the GPs due to a lack of data and the uncertainty of the simulator due to the evaluation noise.

Passing a ModelFitter{MAP} or a ModelFitter{BI} to bosip! controls whether the variance also accounts for the uncertainty in the GP hyperparameters.

See Also

approx_likelihood, likelihood_mean, posterior_variance

source
BOSIP.log_likelihood_variance - Function
log_likelihood_variance(::Likelihood, ::BosipProblem, ::ModelPosterior)

Return a function log_like_var mapping $x$ to $\log \mathbb{V}[ \hat{p}(z_o|x) | GP ]$, with the following two methods:

  • log_like_var(x::AbstractVector{<:Real}) -> ::Real
  • log_like_var(X::AbstractMatrix{<:Real}) -> ::AbstractVector{<:Real}
source

The evidence function can be used to approximate the evidence $p(z_o)$ of a given posterior function by sampling. It is advisable to use this estimate only in low parameter dimensions, as it will require many samples to achieve reasonable precision on high-dimensional domains.

The evidence is the normalization constant needed to obtain the normalized posterior. The evidence function is used to normalize the posterior if one calls approx_posterior, posterior_mean, or posterior_variance with normalize=true.

BOSIP.evidence - Function
evidence(post, x_prior; kwargs...)

Return the estimated evidence $\hat{p}(z_o)$.

Arguments

  • post: A function ::AbstractVector{<:Real} -> ::Real representing the posterior $p(x|z_o)$.
  • x_prior: A multivariate distribution representing the prior $p(x)$.

Keywords

  • xs::Union{Nothing, <:AbstractMatrix{<:Real}}: Can be used to provide a pre-sampled set of samples from the x_prior as a column-wise matrix.
  • samples::Int: Controls the number of samples used to estimate the evidence. Only has an effect if isnothing(xs).
source
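
A sampling-based evidence estimate can be sketched in a few lines. The version below assumes that post is the unnormalized posterior, i.e. likelihood times prior, and averages post(x) / prior_pdf(x) over samples drawn from the prior. This is one standard Monte Carlo estimator and not necessarily what evidence does internally; all names are illustrative.

```julia
using Random, Statistics

# Monte Carlo evidence estimate from a column-wise matrix of prior samples.
function mc_evidence(post, prior_pdf, xs::AbstractMatrix{<:Real})
    return mean(post(x) / prior_pdf(x) for x in eachcol(xs))
end

# Toy problem: uniform prior on [-5, 5], standard normal likelihood in x.
prior_pdf(x) = 0.1
like(x) = exp(-0.5 * x[1]^2) / sqrt(2π)
post(x) = like(x) * prior_pdf(x)          # unnormalized posterior

rng = MersenneTwister(42)
xs = 10 .* rand(rng, 1, 100_000) .- 5     # prior samples, one per column
Z = mc_evidence(post, prior_pdf, xs)      # close to 0.1 for this toy problem
```

The standard error of such an estimate shrinks only with the square root of the sample count, and in high dimensions few prior samples land where the likelihood is non-negligible, which is why the estimate is recommended for low-dimensional domains only.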

The functions like and loglike can be used to evaluate the likelihood value.

Acquisition Function

The function construct_acquisition can be used to obtain the acquisition function.

Sampling from the Posterior

The sample_approx_posterior, sample_expected_posterior, and sample_posterior functions can be used to obtain approximate samples from the trained parameter posterior.

BOSIP.sample_approx_posterior - Function
xs, ws = sample_approx_posterior(bosip::BosipProblem, sampler::DistributionSampler, count::Int; kwargs...)

Draw count samples from the approximate posterior of the BosipProblem using the specified sampler. Return the drawn samples xs as a column-wise matrix together with the corresponding weights ws.

Keywords

  • options::BosipOptions: Miscellaneous preferences. Defaults to BosipOptions().

See Also

sample_expected_posterior, sample_posterior, resample

source
BOSIP.sample_expected_posterior - Function
xs, ws = sample_expected_posterior(bosip::BosipProblem, sampler::DistributionSampler, count::Int; kwargs...)

Draw count samples from the expected posterior (i.e. the posterior mean) of the BosipProblem using the specified sampler. Return the drawn samples xs as a column-wise matrix together with the corresponding weights ws.

Keywords

  • options::BosipOptions: Miscellaneous preferences. Defaults to BosipOptions().

See Also

sample_approx_posterior, sample_posterior, resample

source
BOSIP.sample_posterior - Function
sample_posterior(::DistributionSampler, logpost::Function, domain::Domain, count::Int; kwargs...)
sample_posterior(::DistributionSampler, loglike::Function, prior::MultivariateDistribution, domain::Domain, count::Int; kwargs...)

Sample count samples from the given posterior log-density function.

Keywords

  • options::BosipOptions: Miscellaneous preferences. Defaults to BosipOptions().
source
BOSIP.resample - Function
xs = resample(xs::AbstractMatrix{<:Real}, ws::AbstractVector{<:Real}, count::Int)

Resample count samples from the given data set xs weighted by the given weights ws with replacement to obtain a new un-weighted data set.

Some data points may repeat in the resampled data set. Increasing the sample size of the initial data set may help to reduce the number of repetitions.

source
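
Weighted resampling with replacement can be sketched with the standard library alone. The helper toy_resample below is a hypothetical illustration of the idea (inverse-CDF sampling over the normalized weights), not the implementation of resample.

```julia
using Random

# Draw `count` columns of xs with replacement, with probabilities proportional to ws.
function toy_resample(rng, xs::AbstractMatrix{<:Real}, ws::AbstractVector{<:Real}, count::Int)
    cdf = cumsum(ws) ./ sum(ws)    # normalized cumulative weights
    # 1 - rand(rng) lies in (0, 1], so zero-weight columns are never selected;
    # `min` guards against rounding in the last cumulative weight.
    idx = [min(searchsortedfirst(cdf, 1 - rand(rng)), length(ws)) for _ in 1:count]
    return xs[:, idx]
end

xs = [1.0 2.0 3.0;
      10.0 20.0 30.0]      # three weighted samples stored column-wise
ws = [0.0, 1.0, 0.0]       # all weight on the second sample
new_xs = toy_resample(MersenneTwister(0), xs, ws, 5)
```

Here every resampled column equals the second sample, an extreme case of the repetitions mentioned above.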

The sampling is performed via the Turing.jl package. Because Turing.jl is a rather heavy dependency, it is not loaded by default. To sample from the posterior, first load it with using Turing, which also compiles the sample_posterior function.

Confidence Sets

This section contains functions used to extract approximate confidence sets from the posterior. It is advised to use these approximations only with low-dimensional parameter domains, as they will require many samples to reach reasonable precision in high-dimensional domains.

The find_cutoff function can be used to estimate some confidence set of a given posterior function.

BOSIP.find_cutoff - Function
c = find_cutoff(target_pdf, xs, q)
c = find_cutoff(target_pdf, xs, ws, q)

Estimate the cutoff value c such that the set {x | target_pdf(x) >= c} contains a fraction q of the total probability mass.

The value c is estimated based on the provided samples xs sampled according to the target_pdf.

Alternatively, one can provide samples xs sampled according to some proposal_pdf with corresponding importance weights ws = target_pdf.(eachcol(xs)) ./ proposal_pdf.(eachcol(xs)).

See Also

approx_cutoff_area, set_iou

source
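
For samples drawn directly from target_pdf, the cutoff can be estimated as the (1 - q)-quantile of the density values at the samples. The sketch below illustrates this unweighted case on a standard normal density; toy_find_cutoff is a hypothetical helper, and the actual find_cutoff may be implemented differently.

```julia
using Random, Statistics

# Cutoff c such that a fraction q of the samples satisfies target_pdf(x) >= c.
function toy_find_cutoff(target_pdf, xs::AbstractMatrix{<:Real}, q::Real)
    vals = [target_pdf(x) for x in eachcol(xs)]
    return quantile(vals, 1 - q)
end

target_pdf(x) = exp(-0.5 * x[1]^2) / sqrt(2π)   # standard normal density
xs = randn(MersenneTwister(7), 1, 100_000)       # samples from target_pdf
c = toy_find_cutoff(target_pdf, xs, 0.95)        # close to target_pdf([1.96])
```

For the standard normal the 95% highest-density region is roughly |x| <= 1.96, so the estimated cutoff should be close to the density value at 1.96.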

The approx_cutoff_area function can be used to estimate the ratio between the area of a confidence set given by some cutoff constant (for example one found by find_cutoff) and the area of the whole domain.

BOSIP.approx_cutoff_area - Function
V = approx_cutoff_area(target_pdf, xs, c)
V = approx_cutoff_area(target_pdf, xs, ws, c)

Approximate the ratio of the area where target_pdf(x) >= c relative to the whole support of target_pdf.

The area is estimated based on the provided samples xs sampled uniformly from the whole support of target_pdf.

Alternatively, one can provide samples xs sampled according to some proposal_pdf with corresponding importance weights ws = 1 ./ proposal_pdf.(eachcol(xs)).

See Also

find_cutoff, set_iou

source
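
With uniformly drawn samples, the area ratio is simply the fraction of samples whose density value reaches the cutoff. The sketch below shows this unweighted case, treating [-5, 5] as the bounded domain of a standard normal density; toy_cutoff_area is a hypothetical helper and only illustrates the idea.

```julia
using Random, Statistics

# Fraction of uniformly drawn samples whose density value reaches the cutoff c.
function toy_cutoff_area(target_pdf, xs::AbstractMatrix{<:Real}, c::Real)
    return mean(target_pdf(x) >= c for x in eachcol(xs))
end

target_pdf(x) = exp(-0.5 * x[1]^2) / sqrt(2π)           # standard normal density
xs = 10 .* rand(MersenneTwister(3), 1, 100_000) .- 5    # uniform samples on [-5, 5]
c = target_pdf([1.96])                                   # cutoff of the central 95% region
V = toy_cutoff_area(target_pdf, xs, c)                   # close to 3.92 / 10
```

The region |x| <= 1.96 has length 3.92 inside a domain of length 10, so the estimated ratio should be close to 0.392.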

The set_iou function can be used to estimate the intersection-over-union (IoU) value between two sets.

BOSIP.set_iou - Function
iou = set_iou(in_A, in_B, x_prior, xs)

Approximate the intersection-over-union of two sets A and B.

The parameters in_A, in_B are binary arrays declaring which samples from xs fall into the sets A and B. The column-wise matrix xs contains the parameter samples. The samples have to be drawn from the common prior x_prior.

See Also

find_cutoff, approx_cutoff_area

source
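
The core of an IoU estimate from membership indicators can be sketched as below. This simplified version assumes the samples cover the domain uniformly and therefore counts every sample equally; the documented set_iou additionally takes x_prior and xs, presumably to account for samples drawn from a non-uniform prior, which this sketch ignores. The helper toy_set_iou is hypothetical.

```julia
# IoU from membership indicators, assuming uniformly distributed samples.
function toy_set_iou(in_A::AbstractVector{Bool}, in_B::AbstractVector{Bool})
    union_count = count(in_A .| in_B)
    union_count == 0 && return 0.0    # convention for two empty sets (an assumption)
    return count(in_A .& in_B) / union_count
end

in_A = [true, true, false, false]
in_B = [false, true, true, false]
iou = toy_set_iou(in_A, in_B)    # one shared sample out of three in the union
```

In practice in_A and in_B would come from comparing density values against cutoffs, e.g. target_pdf(x) >= c for each column x of xs.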

Plotting Posterior Marginals

The functions plot_marginals_int and plot_marginals_kde are provided to visualize the trained posterior. Both functions create a matrix of figures containing the approximate marginal posteriors of each pair of parameters and the individual marginals on the diagonal.

The function plot_marginals_int approximates the marginals by numerical integration whereas the function plot_marginals_kde approximates the marginals by kernel density estimation.

BOSIP.plot_marginals_int - Function
using CairoMakie
plot_marginals_int(::BosipProblem; kwargs...)

Create a matrix of plots displaying the marginal posterior distribution of each pair of parameters with the individual marginals of each parameter on the diagonal.

Approximates the marginals by numerically integrating the marginal integrals over a generated latin hypercube grid of parameter samples. The plots are normalized according to the plotting grid.

Also provides an option to plot "marginals" of different functions by using the func and normalize keywords.

Kwargs

  • func::Function: Defines the function which is plotted. The plotted function f is defined as f = func(::BosipProblem). Reasonable options for func include approx_posterior, posterior_mean, posterior_variance etc.
  • normalize::Bool: Specifies whether the plotted marginals are normalized. If normalize=false, the plotted values are simply averages over the random LHC grid. If normalize=true, the plotted values are additionally normalized to sum to 1. Defaults to true.
  • lhc_grid_size::Int: The number of samples in the generated LHC grid. The higher the number, the more precise the marginal plots.
  • plot_settings::PlotSettings: Settings for the plotting.
  • info::Bool: Set to false to disable prints.
  • display::Bool: Set to false to not display the figure. It is still returned.
  • matrix_ops::Bool: Set to false to disable the use of matrix operations for plotting the marginals if they are not supported for the given func. Disabling matrix operations can significantly hinder performance.
source
BOSIP.plot_marginals_kde - Function
using CairoMakie, Turing
plot_marginals_kde(::BosipProblem; kwargs...)

Create a matrix of plots displaying the marginal posterior distribution of each pair of parameters with the individual marginals of each parameter on the diagonal.

Approximates the marginals by kernel density estimation over parameter samples drawn by MCMC methods from the Turing.jl package. The plots are normalized according to the plotting grid.

One should experiment with different kernel length-scales to obtain a good approximation of the marginals. The kernel and length-scales are provided via the kernel and lengthscale keyword arguments.

Kwargs

  • turing_options::TuringOptions: Settings for the MCMC sampling.
  • kernel::Kernel: The kernel used in the KDE.
  • lengthscale::Union{<:Real, <:AbstractVector{<:Real}}: The lengthscale for the kernel used in the KDE. Either provide a single length-scale used for all parameter dimensions as a real number, or provide individual length-scales for each parameter dimension as a vector of real numbers.
  • plot_settings::PlotSettings: Settings for the plotting.
  • info::Bool: Set to false to disable prints.
  • display::Bool: Set to false to not display the figure. It is still returned.
source
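
The effect of the length-scale on a kernel density estimate can be seen in a one-dimensional sketch. The code below uses a Gaussian kernel and hypothetical names; it does not reproduce the KDE used by plot_marginals_kde and only illustrates why the length-scale is worth tuning.

```julia
# One-dimensional Gaussian kernel density estimate with length-scale h.
function toy_kde(samples::AbstractVector{<:Real}, h::Real)
    kernel(u) = exp(-0.5 * u^2) / sqrt(2π)
    return x -> sum(kernel((x - s) / h) for s in samples) / (length(samples) * h)
end

samples = [-1.0, 0.0, 0.2, 1.5]
narrow = toy_kde(samples, 0.1)    # small length-scale: spiky estimate
wide = toy_kde(samples, 1.0)      # large length-scale: smooth estimate

grid = -6:0.01:6
mass_wide = sum(wide.(grid)) * step(grid)    # Riemann sum, close to 1
```

A length-scale that is too small produces isolated spikes around individual samples, while one that is too large washes out the structure of the marginal.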

Utils

The function approx_by_gauss_mix together with the structure GaussMixOptions can be used to obtain a Gaussian mixture approximation of the provided probability density function.

BOSIP.approx_by_gauss_mix - Function

Approximate the given posterior by a Gaussian mixture.

Find all modes via Optimization.jl, then approximate each mode with a multivariate Gaussian whose mean is the mode and whose variance is given by the second derivative of the true posterior at the mode.

source
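
Fitting a Gaussian to a single mode from second-derivative information can be sketched in one dimension. The version below follows the standard Laplace approximation, which uses the second derivative of the log-density; whether approx_by_gauss_mix works with the density or the log-density is not stated here, so treat this as an assumption. All names are hypothetical.

```julia
# Variance of a local Gaussian fit at a mode of a 1-D log-density,
# using a central finite difference for the second derivative.
function toy_laplace_variance(logdensity, mode::Real; h::Real = 1e-3)
    d2 = (logdensity(mode + h) - 2 * logdensity(mode) + logdensity(mode - h)) / h^2
    return -1 / d2
end

toy_logdensity(x) = -0.5 * ((x - 2.0) / 0.5)^2    # Gaussian log-density up to a constant
var_at_mode = toy_laplace_variance(toy_logdensity, 2.0)    # close to 0.5^2
```

For an exactly Gaussian log-density the recovered variance matches the true one; for other densities it only captures the local curvature at the mode.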
BOSIP.GaussMixOptions - Type
GaussMixOptions(; kwargs...)

Contains all hyperparameters for the function approx_by_gauss_mix.

Kwargs

  • algorithm: Optimization algorithm used to find the modes.
  • multistart::Int: Number of optimization restarts.
  • parallel::Bool: Controls whether the individual optimization runs are performed in parallel.
  • static_schedule::Bool: If static_schedule=true then the :static schedule is used for parallelization. This makes the parallel tasks sticky (non-migrating), but can decrease performance.
  • autodiff::SciMLBase.AbstractADType: Defines the autodiff library used for the optimization. (Only relevant if a gradient-based optimizer is set as algorithm.)
  • cluster_ϵs::Union{Nothing, Vector{Float64}}: The minimum distance between modes. Modes which are too close to a "more important" mode are discarded. Also defines the minimum distance of a mode from a domain boundary.
  • rel_min_weight::Float64: The minimum pdf value of a mode to be considered relative to the highest pdf value among all found modes.
  • kwargs...: Other kwargs are passed to the optimization algorithm.
source