Distribution Fitting to aggregate statistics

This package provides methods to fit a distribution to a given set of aggregate statistics.

# to specified moments
d = fit(LogNormal, Moments(3.0,4.0))
(mean(d), var(d)) .≈ (3.0, 4.0)

# to mean and upper quantile point
d = fit(LogNormal, 3, @qp_uu(8))
(mean(d), quantile(d, 0.975)) .≈ (3.0, 8.0)

# to mode and upper quantile point
d = fit(LogNormal, 3, @qp_uu(8), Val(:mode))
(mode(d), quantile(d, 0.975)) .≈ (3.0, 8.0)

# to two quantiles, i.e. a confidence range
d = fit(LogNormal, @qp_ll(1.0), @qp_uu(8))
(quantile(d, 0.025), quantile(d, 0.975)) .≈ (1.0, 8.0)

# approximate a different distribution by matching moments
dn = Normal(3,2)
d = fit(LogNormal, moments(dn))
(mean(d), var(d)) .≈ (3.0, 4.0)

Fit to statistical moments

StatsBase.fit (method)
fit(D, m)

Fit a statistical distribution of type D to given moments m.

Arguments

  • D: The type of distribution to fit
  • m: The moments of the distribution

Notes

This can be used to approximate one distribution by another.

See also AbstractMoments, moments.

Examples

d = fit(LogNormal, Moments(3.2,4.6));
(mean(d), var(d)) .≈ (3.2,4.6)
d = fit(LogNormal, moments(Normal(3,1.2)));
(mean(d), std(d)) .≈ (3,1.2)
plot(d); plot!(Normal(3,1.2))  # overlay both densities; assumes StatsPlots is loaded
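For a LogNormal, matching mean and variance has a closed form. The following is a minimal sketch of that calculation using only Distributions.jl. The helper name lognormal_from_mean_var is hypothetical and not part of this package; it illustrates the idea and is not necessarily how fit is implemented here.

```julia
using Distributions

# Hypothetical helper (not part of the package API):
# match a LogNormal to a given mean m and variance v in closed form.
function lognormal_from_mean_var(m, v)
    σ2 = log(1 + v / m^2)   # variance on the log scale
    μ = log(m) - σ2 / 2     # mean on the log scale
    return LogNormal(μ, sqrt(σ2))
end

d = lognormal_from_mean_var(3.2, 4.6)
(mean(d), var(d)) .≈ (3.2, 4.6)
```

The sketch follows from mean = exp(μ + σ²/2) and var = (exp(σ²) - 1) * mean², solved for μ and σ.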
LogNormals.moments (method)
moments(D, ::Val{N} = Val(2))

Get the first N moments of a distribution.

See also type AbstractMoments.

Examples

moments(LogNormal(), Val(4))  # first four moments 
moments(Normal())  # mean and variance

The syntax Moments(mean,var) produces an object of type Moments <: AbstractMoments.

LogNormals.AbstractMoments (type)
AbstractMoments{N}

A representation of statistical moments of a distribution

The following functions are supported

  • n_moments(m): get the number of recorded moments

The following getters return a single moment or throw an error if the moment has not been recorded

  • mean(m): get the mean
  • var(m): get the variance
  • skewness(m): get the skewness
  • kurtosis(m): get the kurtosis
  • getindex(m,i): get the ith moment, i.e. indexing m[i]

The basic implementation Moments is immutable and convert(AbstractArray, m::Moments) returns an SArray{N,T}.

Examples

m = Moments(1,0.2);
n_moments(m) == 2
var(m) == m[2]
kurtosis(m) # throws an error because only the first two moments are recorded

Fit to several quantile points

StatsBase.fit (method)
fit(D, lower, upper)

Fit a statistical distribution to two quantile points.

Arguments

  • D: The type of the distribution to fit
  • lower: lower QuantilePoint(q,p)
  • upper: upper QuantilePoint(q,p)

Notes

Several macros help to construct QuantilePoints

  • @qp(q,p) quantile at specified p: QuantilePoint(q,p)
  • @qp_ll(q0_025) quantile at very low p: QuantilePoint(q0_025,0.025)
  • @qp_l(q0_05) quantile at low p: QuantilePoint(q0_05,0.05)
  • @qp_m(median) quantile at median: QuantilePoint(median,0.5)
  • @qp_u(q0_95) quantile at high p: QuantilePoint(q0_95,0.95)
  • @qp_uu(q0_975) quantile at very high p: QuantilePoint(q0_975,0.975)

Examples

d = fit(LogNormal, @qp_m(3), @qp_uu(5));
quantile.(d, [0.5, 0.975]) ≈ [3,5]
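For a LogNormal, two quantile points determine the parameters through two linear equations on the log scale, μ + σ*z(p) = log(q), where z(p) is the standard normal quantile. The following is a minimal sketch using only Distributions.jl. The helper lognormal_from_quantiles is hypothetical and takes plain numbers rather than QuantilePoint objects; it is not necessarily the package's implementation.

```julia
using Distributions

# Hypothetical helper (not part of the package API):
# solve μ + σ*z(p) = log(q) for a lower and an upper quantile point.
function lognormal_from_quantiles(q_lower, p_lower, q_upper, p_upper)
    z_lower = quantile(Normal(), p_lower)   # standard normal quantile at p_lower
    z_upper = quantile(Normal(), p_upper)   # standard normal quantile at p_upper
    σ = (log(q_upper) - log(q_lower)) / (z_upper - z_lower)
    μ = log(q_lower) - σ * z_lower
    return LogNormal(μ, σ)
end

# median 3 and 97.5% quantile 5, as in the example above
d = lognormal_from_quantiles(3, 0.5, 5, 0.975)
(quantile(d, 0.5), quantile(d, 0.975)) .≈ (3, 5)
```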

Fit to mean, mode, or median and a quantile point

StatsBase.fit (method)
fit(D, val, qp, ::Val{stats} = Val(:mean))

Fit a statistical distribution to a quantile point and a given statistic (mean, mode, or median).

Arguments

  • D: The type of distribution to fit
  • val: The value of the statistic
  • qp: QuantilePoint(q,p)
  • stats: Which statistic to fit; defaults to Val(:mean). Alternatives are Val(:mode) and Val(:median)

Examples

d = fit(LogNormal, 5, @qp_uu(14));
(mean(d),quantile(d, 0.975)) .≈ (5,14)
d = fit(LogNormal, 5, @qp_uu(14), Val(:mode));
(mode(d),quantile(d, 0.975)) .≈ (5,14)
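For a LogNormal, the mean or the mode combined with one quantile point each lead to a quadratic equation in σ. The following is a minimal sketch using only Distributions.jl. Both helper names are hypothetical, they take plain numbers rather than a QuantilePoint, and the choice of the smaller root in the mean case is an assumption; the package may resolve this differently.

```julia
using Distributions

# Hypothetical helpers (not part of the package API).

# Mean m and quantile q at probability p:
#   μ + σ^2/2 = log(m)  and  μ + σ*z = log(q),  z = quantile(Normal(), p)
# Eliminating μ gives σ^2/2 - z*σ + log(q/m) = 0.
function lognormal_from_mean_quantile(m, q, p)
    z = quantile(Normal(), p)
    disc = z^2 - 2 * log(q / m)
    disc >= 0 || throw(ArgumentError("no LogNormal matches this mean and quantile"))
    σ = z - sqrt(disc)   # smaller root (assumption); z + sqrt(disc) also solves the equations
    σ > 0 || throw(ArgumentError("solution requires a positive σ"))
    μ = log(m) - σ^2 / 2
    return LogNormal(μ, σ)
end

# Mode mo and quantile q at probability p:
#   μ - σ^2 = log(mo)  and  μ + σ*z = log(q)
# Eliminating μ gives σ^2 + z*σ - log(q/mo) = 0.
function lognormal_from_mode_quantile(mo, q, p)
    z = quantile(Normal(), p)
    σ = (-z + sqrt(z^2 + 4 * log(q / mo))) / 2   # positive root, assuming q > mo
    μ = log(mo) + σ^2
    return LogNormal(μ, σ)
end

d_mean = lognormal_from_mean_quantile(5, 14, 0.975)
(mean(d_mean), quantile(d_mean, 0.975)) .≈ (5, 14)

d_mode = lognormal_from_mode_quantile(5, 14, 0.975)
(mode(d_mode), quantile(d_mode, 0.975)) .≈ (5, 14)
```

In the mean case, two values of σ can satisfy both constraints, so a concrete implementation has to pick one; the sketch takes the less dispersed solution.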

Implementing support for another distribution

In order to use the fitting framework for a distribution MyDist, one needs to implement the following four methods.

StatsBase.fit(::Type{MyDist}, m::AbstractMoments)

fit_mean_quantile(::Type{MyDist}, mean, qp::QuantilePoint)

fit_mode_quantile(::Type{MyDist}, mode, qp::QuantilePoint)

StatsBase.fit(::Type{MyDist}, lower::QuantilePoint, upper::QuantilePoint)

The default method for fit with stats = Val(:median) already works, because it relies on the method for two quantile points. If that general method for two quantile points cannot be provided, one can alternatively implement the method:

fit_median_quantile(::Type{MyDist}, median, qp::QuantilePoint)