Default Handling of missing values

Several functions and macros help extend functions that were not designed to handle missing values.

Main macros

The main tools are three macros:

  • @handlemissings_stub: defines only the dispatch infrastructure, to be extended manually.
  • @handlemissings_typed: additionally defines default handling for PassMissing and SkipMissing for original methods whose argument eltype does not accept missings.
  • @handlemissings_any: defines this handling for original methods whose argument eltype already accepts missings, including Any.

MissingStrategies.@handlemissings_stub (Macro)

@handlemissings_stub(fun, ...)

Calls handlemissings to create only the dispatching methods, without any implementations that handle missings yet.

  • Defaults to argument type Any and no default strategy (use the arguments to change this).
  • Creates a method with an inserted MissingStrategy argument that forwards to the dispatching function.
  • Creates a dispatching method for eltypes that do not allow missings, valid for any MissingStrategy, that calls the original function without the MissingStrategy argument.

Arguments: see handlemissings

You can then define the other methods yourself using @traitfn from SimpleTraits.

using MissingStrategies
using SimpleTraits
f1(x::AbstractArray{<:Real}) = "method that is not accepting missings in eltype"
@handlemissings_stub(
  # signature matching that of the original function to be called
  f1(x::AbstractArray{<:Real}) = 0,
  # pos_missing, pos_strategy, type_missing, defaultstrategy
  1,2,AbstractArray{<:Union{Missing,Real}}, PassMissing()
) 
methods(f1) # just to see that new methods have been defined
# the new methods forward to new function f1_hm that can be extended for special cases
# note the argument order: missing strategy comes first in the dispatching function
@traitfn function f1_hm(ms::PassMissing, x::::IsEltypeSuperOfMissing) 
  "method handling missings in eltype"
end
f1([1.0,2.0]) == "method that is not accepting missings in eltype"
f1([1.0,2.0], PassMissing()) == "method that is not accepting missings in eltype"
f1([1.0,2.0,missing]) == "method handling missings in eltype"

MissingStrategies.@handlemissings_typed (Macro)

@handlemissings_typed(fun, ...)

Calls handlemissings with defaults tailored to an original method whose eltype does not accept missings:

  • Dispatching methods are created as with @handlemissings_stub.
  • The argument type of the new function must be specified; Any may be used. A default method (without the MissingStrategy argument) is created that forwards to the PassMissing method. Hence, make sure that the argument type differs from that of the original method so that the original method is not overwritten.
  • The PassMissing method calls the original method with a broadcast in which each element has been converted to the corresponding nonmissing type (mgen.passmissing_convert).
  • The SkipMissing method collects the skipmissing object before calling the original function (mgen.handlemissing_collect_skip).

Arguments: see handlemissings

Note on default MissingStrategy

Note that defining a default MissingStrategy at an argument position before further optional arguments behaves in a way that may not be intuitive.

f2(x::AbstractArray{<:Real},optarg=1:3) = x
@handlemissings_typed(f2(x::AbstractArray{<:Real},optarg=1:3)=0,1,2,Any)
# f2(x,ms::MissingStrategy=PassMissing(),optarg=1:3) # generated
# f2([1.0,missing], 2:4) # no method defined -> rethink argument ordering

In order to call the PassMissing variant in the above case, one needs to call @handlemissings_typed separately for the method with a single argument and for the method with two arguments, and place the default missing strategy behind the second argument in the second case.

f3(x::AbstractArray{<:Real},optarg=1:3) = x
@handlemissings_typed(f3(x::AbstractArray{<:Real})=0,1,2,Any)
@handlemissings_typed(f3(x::AbstractArray{<:Real}, optarg)=0,1,3,Any)
ismissing(f3([1.0,missing], 2:4))
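
The behavior described in this note follows from how Julia expands optional positional arguments. A minimal sketch in plain Julia, assuming a hypothetical function h2 in which a Symbol stands in for the MissingStrategy argument (neither name is part of MissingStrategies):

```julia
# h2 and the Symbol-typed strategy are hypothetical stand-ins for illustration.
# A default in the middle of the signature expands to three methods:
# h2(x), h2(x, s), and h2(x, s, opt).
h2(x, s::Symbol = :pass, opt = 1:3) = (s, opt)

h2([1.0])                          # uses both defaults
h2([1.0], :skip)                   # the second positional argument must be the strategy
applicable(h2, [1.0], 2:4)         # false: 2:4 cannot fill the strategy slot
applicable(h2, [1.0], :pass, 2:4)  # true: optarg is only reachable after the strategy
```

Because positional defaults are filled from left to right, the optional argument can only be supplied once the strategy has been given explicitly, which is why the strategy is better placed last.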

MissingStrategies.@handlemissings_any (Macro)

@handlemissings_any(fun, ...)

Calls handlemissings with defaults tailored to an original method whose eltype already accepts missings:

  • Dispatching methods are created as with @handlemissings_stub.
  • The argument type of the new function defaults to Any. No default method (without the MissingStrategy argument) is created.
  • The PassMissing method calls the original directly, without converting the type of the argument with missings (mgen.passmissing_nonconvert).
  • The HandleMissingStrategy method calls the original directly with the skipmissing() iterator object (mgen.handlemissing_skip).

Arguments: see handlemissings

Note that if the original method allows missing in its eltype, you need to pass PassMissing() explicitly as an argument. A potential default method would overwrite the original method and either not be called at all or call itself recursively, causing an infinite loop.
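
Why a default method would be problematic can be seen in plain Julia. In this sketch, k and the Symbol-typed strategy are hypothetical stand-ins; the point is only that a default value creates a method with the same signature as the original:

```julia
# k and the Symbol-typed strategy are hypothetical stand-ins for illustration.
k(x) = "original"

# The default value also defines a one-argument method k(x),
# which has the same signature as the original and replaces it.
k(x, s::Symbol = :pass) = "forwarder"

k(1)                 # now reaches the forwarder, not the original
length(methods(k))   # 2: k(x) and k(x, s)
```

If the two-argument method tried to reach the original by calling k(x), it would land in the replaced one-argument method again, which is the recursion the note warns about.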


Infrastructure

The macros above just wrap a call to the handlemissings function using different argument values, especially different generator functions.

MissingStrategies.handlemissings (Function)

handlemissings(fun, ...)

Creates an expression that defines new methods that allow missings in the eltype of an argument.

Arguments

  • fun: Expression of a function to extend
  • pos_missing = 1: The position of the argument that should handle missings
  • pos_strategy = pos_missing + 1: The position at which the argument of MissingStrategy is to be inserted into the function signature
  • type_missing = :Any: The new type of the argument that should handle missings. This can be an expression of the value.
  • defaultstrategy::Union{Nothing,MissingStrategy} = :nothing: the value of the default of the strategy argument. Use :nothing to indicate not specifying a default value. This can be an expression of the value.
  • gens = (): Tuple of generator functions (see mgen)
  • argname_strategy = :ms: symbol of the argument name of the strategy argument
  • suffix="_hm": attached to the name of the dispatching function to avoid method ambiguities

Usually this function is called from a macro that provides suitable default values.


The SimpleTraits package, together with the IsEltypeSuperOfMissing trait, is used to dispatch to different methods depending on whether the eltype of a given argument allows missings.

The original method is extended by a method whose signature modifies the type of a given argument, usually to eltype Union{Missing,<eltype_orig>}, and adds an argument ms::MissingStrategy. The new method forwards to a dispatching function named <name_orig><suffix>, with the MissingStrategy as the first argument and the given argument x declared as x::::IsEltypeSuperOfMissing. The suffix defaults to "_hm" but can be changed to avoid method ambiguities if methods are extended that differ only by the original type of x. A further advantage of using a separate dispatching function is that the original function is not extended by too many new methods.

The dispatching function can be extended with SimpleTraits.@traitfn to handle the different combinations of whether the eltype of x allows missing and the different MissingStrategy types. See @handlemissings_stub for an example.
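
The overall pattern can be sketched by hand in plain Julia. This is only an illustration under stated assumptions, not the code that handlemissings generates: the types DemoStrategy, DemoPass and DemoSkip, the function g, and the helper allowsmissing are hypothetical, and an ordinary boolean check stands in for the IsEltypeSuperOfMissing trait.

```julia
# Hypothetical stand-ins for the strategy types (not the package's own types).
abstract type DemoStrategy end
struct DemoPass <: DemoStrategy end
struct DemoSkip <: DemoStrategy end

# Original method: does not accept missings in its eltype.
g(x::AbstractArray{<:Real}) = sum(x)

# Forwarder: widened argument type plus an inserted strategy argument.
# It forwards to the dispatching function with the strategy in first position.
g(x::AbstractArray{<:Union{Missing,Real}}, ms::DemoStrategy = DemoPass()) = g_hm(ms, x)

# Stand-in for the trait: can the eltype hold a missing value?
allowsmissing(x) = Missing <: eltype(x)

# Dispatching function: fall back to the original if the eltype cannot hold missing.
g_hm(ms::DemoStrategy, x) = allowsmissing(x) ? g_hm_missing(ms, x) : g(x)

# Pass strategy: propagate missing, otherwise convert to the nonmissing eltype.
g_hm_missing(::DemoPass, x) =
    any(ismissing, x) ? missing : g(convert.(nonmissingtype(eltype(x)), x))

# Skip strategy: drop missings into a freshly allocated vector.
g_hm_missing(::DemoSkip, x) = g(collect(skipmissing(x)))

g([1.0, 2.0])                       # original method
g([1.0, missing])                   # forwarder with the default pass strategy
g([1.0, missing, 2.0], DemoSkip())  # forwarder with the skip strategy
```

Keeping the strategy-specific methods on g_hm rather than on g means that g itself gains only the forwarder methods.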

Generator functions

The argument gens of handlemissings takes a tuple of generator functions that do the actual work of defining additional methods.

The generator functions used are defined in the submodule mgen.

MissingStrategies.mgen.forwarder (Method)

forwarder(...)

Defines a method with the new type of the argument that handles missings and an inserted MissingStrategy argument. This method forwards to the dispatching function of the new name, with the MissingStrategy at the first position.

MissingStrategies.mgen.passmissing_nonconvert (Method)

passmissing_nonconvert(...)

Defines a trait method for PassMissing for arguments whose eltype allows missings. This method returns missing if any missing items are encountered; otherwise it calls the original function with unmodified arguments, i.e. with a type that allows missings in its eltype.

MissingStrategies.mgen.passmissing_convert (Method)

passmissing_convert(...)

Defines a trait method for PassMissing for arguments whose eltype allows missings. This method returns missing if any missing items are encountered; otherwise it calls the original function, but converts the argument to the corresponding nonmissing type.
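
The effect of the conversion can be illustrated with Base Julia alone. In this sketch, strict_mean and demo_pass_convert are hypothetical names chosen for illustration; they are not part of mgen and only mimic the behavior described above.

```julia
# Hypothetical original function that rejects eltypes allowing missing.
strict_mean(x::AbstractArray{<:Real}) = sum(x) / length(x)

# Hypothetical helper mimicking the described behavior:
# return missing if any element is missing, otherwise convert and call f.
demo_pass_convert(f, x) =
    any(ismissing, x) ? missing : f(convert.(nonmissingtype(eltype(x)), x))

x = Union{Missing,Float64}[1.0, 2.0]   # eltype allows missing, none present

applicable(strict_mean, x)                      # false: eltype is not <: Real
eltype(convert.(nonmissingtype(eltype(x)), x))  # Float64 after the broadcast
demo_pass_convert(strict_mean, x)               # calls strict_mean on the converted array
demo_pass_convert(strict_mean, [1.0, missing])  # missing
```

The non-converting variant would skip the broadcast and hand x to the original function as is, which only works when the original accepts an eltype that allows missing.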

MissingStrategies.mgen.handlemissing_collect_skip (Method)

handlemissing_collect_skip(...)

Defines a trait method for HandleMissingStrategy for arguments whose eltype allows missings. This method transforms the argument by collect(skipmissing(x)) before passing it on to the original function.

Hence it passes a vector with corresponding nonmissing eltype, but does require allocation.

MissingStrategies.mgen.handlemissing_skip (Method)

handlemissing_skip(...)

Defines a trait method for HandleMissingStrategy for arguments whose eltype allows missings. This method transforms the argument by skipmissing(x) before passing it on to the original function.

Hence it passes an iterator of unspecified type, but does not require allocations.
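
The trade-off between the collecting and the non-collecting variant can be seen with Base Julia alone. This sketch uses no package functions:

```julia
x = [1.0, missing, 3.0]          # Vector{Union{Missing, Float64}}

it = skipmissing(x)              # lazy wrapper around x, no copy
v  = collect(it)                 # new Vector{Float64}, allocates

sum(it) == sum(v)                # both visit the same nonmissing values
v isa AbstractArray{<:Real}      # true: fits a method typed for real arrays
it isa AbstractArray             # false: only fits a more loosely typed argument
```

This is why the collecting variant suits original methods with a narrowly typed array argument, while the iterator variant suits originals that accept arbitrary iterables.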


Each generator function requires arguments that are collected by getdispatchinfo and passed by splatting the resulting dictionary in the call of the generator function.

MissingStrategies.getdispatchinfo (Function)

getdispatchinfo(...)

Collect information for dispatching non-missing types into a dictionary.

Arguments

see handlemissings.

Return value

A dictionary with entries

  • dict_forig: dictionary result of MacroTools.splitdef(fun),
  • fname_disp: Symbol of the dispatch function name,
  • argnames: the argument names of fun,
  • kwargpasses: a Vector of expressions of the form argname = argname that pass on the keyword parameters,
  • pos_missing: position of the argument that should handle missings,
  • type_missing: the new type of that argument,
  • pos_strategy: the position at which the argument of MissingStrategy is inserted,
  • defaultstrategy: the default of that argument of type MissingStrategy,
  • suffix = "_hm": string appended to the dispatching function name to avoid method ambiguities.

These match the required arguments of the method generators in module mgen.
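
How a dictionary of this kind can feed a generator may be sketched as follows. This assumes the generators take keyword arguments; demo_generator and the dictionary contents are hypothetical and much simpler than the real generators in mgen.

```julia
# Hypothetical generator: picks the entries it needs and ignores the rest.
# It returns an expression that calls the dispatching function.
demo_generator(; fname_disp, argnames, kwargs...) = :($fname_disp($(argnames...)))

# Hypothetical subset of the entries listed above.
info = Dict{Symbol,Any}(
    :fname_disp => :f1_hm,
    :argnames   => [:ms, :x],
    :suffix     => "_hm",
)

# Splatting the dictionary passes each entry as a keyword argument.
ex = demo_generator(; info...)
ex == :(f1_hm(ms, x))
```

Accepting kwargs... lets every generator receive the same dictionary while using only the entries relevant to it.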
