Getting started
Assume we have a simple model, fsens
, which depends on two parameters, a
and b
and produces two outputs, target1
and target2
.
fsens = (a,b) -> (;target1 = 10a + b -1, target2 = a + b -0.5)
Our knowledge about reasonable model parameters is encoded by a prior probability distribution. We can specify those by the kind of distribution, its mode and an upper quantile.
using SubglobalSensitivityAnalysis, Distributions
install_R_dependencies(["sensitivity"])
paramsModeUpperRows = [
(:a, LogNormal, 0.2 , 0.5),
(:b, LogitNormal, 0.7 , 0.9),
]
┌ Warning: RCall.jl: Loading required namespace: sensitivity
│ Registered S3 method overwritten by 'sensitivity':
│ method from
│ print.src dplyr
└ @ RCall ~/.julia/packages/RCall/YrsKg/src/io.jl:172
The output DataFrame reports
- the estimated index and confidence bounds (column value, cflower, cfupper)
- for each of the parameter/index_type/output combinations
We can provide this directly to estimate_subglobal_sobol_indices
below, or we estimate/specify distribution parameters directly in a DataFrame with column :dist
.
df_dist = fit_distributions(paramsModeUpperRows)
Row | par | dType | mode | upper | dist |
---|---|---|---|---|---|
Symbol | UnionAll | Float64 | Float64 | Distribu… | |
1 | a | LogNormal | 0.2 | 0.5 | Distributions.LogNormal{Float64}(μ=-1.60944, σ=0.467504) |
2 | b | LogitNormal | 0.7 | 0.9 | Distributions.LogitNormal{Float64}(μ=0.847298, σ=0.688751) |
While these distributions are reasonable for each parameter, there are probably parameter combinations that produce unreasonable results. Hence, we want to restrict our analysis to a parameter space around a central parameter vector, p0
.
p0 = Dict(:a => 0.34, :b => 0.6)
By default a range around p0
is created that covers 20% of the cumulative probability range, i.e a span of 0.2.
For this range 20% of the area under the probability density function is covered.
The design matrix for the sensitivity analysis is constructed in the cumulative densities and transformed to parameter values. For each of the parameter vectors of the design matrix an output is computed. Now the Sobol indices and their confidence ranges can be computed for this output.
All this encapsulated by function estimate_subglobal_sobol_indices
.
# note, for real analysis use a larger sample size
df_sobol = estimate_subglobal_sobol_indices(fsens, df_dist, p0; n_sample = 50)
df_sobol
Row | par | index | value | cf_lower | cf_upper | target |
---|---|---|---|---|---|---|
Symbol | Symbol | Float64 | Float64 | Float64 | Symbol | |
1 | a | first_order | 0.996736 | 0.929827 | 1.06364 | target1 |
2 | b | first_order | -0.127308 | -0.424991 | 0.170376 | target1 |
3 | a | total | 1.12624 | 0.818976 | 1.4335 | target1 |
4 | b | total | 0.00376793 | -0.130406 | 0.137942 | target1 |
5 | a | first_order | 0.719658 | 0.582963 | 0.856353 | target2 |
6 | b | first_order | 0.0455479 | -0.165535 | 0.256631 | target2 |
7 | a | total | 0.966002 | 0.694506 | 1.2375 | target2 |
8 | b | total | 0.328559 | 0.0935427 | 0.563574 | target2 |
The resulting DataFrame reports:
- the estimated Sobol indices and their confidence bounds (columns value, cflower, cfupper)
- for all the combinations of parameter, which index, and output (columns par, index, target)