Default Handling of missing values
Several functions and macros help to extend functions that were not designed to handle missing values.
Main macros
The main tools are the following macros:
- @handlemissings_stub: defines only the dispatch infrastructure, to be extended manually.
- @handlemissings_typed: additionally defines default handling for PassMissing and SkipMissing for arguments whose eltype does not match missings.
- @handlemissings_any: defines this handling for arguments whose eltype does match missings, including Any.
MissingStrategies.@handlemissings_stub (Macro)
@handlemissings_stub(fun, ...)
Calls handlemissings to create only the dispatching methods, without any implementations that handle missings yet:
- Defaults to argument type Any and provides no default strategy (use the arguments to change this).
- Defines a method with an inserted MissingStrategy argument that forwards to the dispatching function.
- Defines a dispatching method for eltypes that do not allow missings, valid for any MissingStrategy, which calls the original function without the MissingStrategy.
Arguments: see handlemissings
You can then define the other methods yourself using the @traitfn macro from SimpleTraits.
using MissingStrategies
using SimpleTraits
f1(x::AbstractArray{<:Real}) = "method that is not accepting missings in eltype"
@handlemissings_stub(
# signature matching that of the original function to be called
f1(x::AbstractArray{<:Real}) = 0,
# pos_missing, pos_strategy, type_missing, defaultstrategy
1,2,AbstractArray{<:Union{Missing,Real}}, PassMissing()
)
methods(f1) # just to see that new methods have been defined
# the new methods forward to new function f1_hm that can be extended for special cases
# note the argument order: missing strategy comes first in the dispatching function
@traitfn function f1_hm(ms::PassMissing, x::::IsEltypeSuperOfMissing)
"method handling missings in eltype"
end
f1([1.0,2.0]) == "method that is not accepting missings in eltype"
f1([1.0,2.0], PassMissing()) == "method that is not accepting missings in eltype"
f1([1.0,2.0,missing]) == "method handling missings in eltype"
MissingStrategies.@handlemissings_typed (Macro)
@handlemissings_typed(fun, ...)
Calls handlemissings with defaults tailored to an original method whose eltype does not accept missings:
- Dispatching methods as with @handlemissings_stub.
- The argument type of the new function must be specified. You may use Any. A default method (without the MissingStrategy argument) is created that forwards to the PassMissing method. Hence, make sure that the argument type differs from that of the original method, so that the original method is not overwritten.
- The PassMissing method calls the original method with a broadcast in which each element has been converted to the corresponding nonmissing type (mgen.passmissing_convert).
- The SkipMissing method collects the skipmissing object before calling the original function (mgen.handlemissing_collect_skip).
Arguments: see handlemissings
Note on default MissingStrategy
Note that defining a default MissingStrategy at an argument position before further optional arguments behaves in a way that may not be intuitive.
f2(x::AbstractArray{<:Real},optarg=1:3) = x
@handlemissings_typed(f2(x::AbstractArray{<:Real},optarg=1:3)=0,1,2,Any)
# f2(x,ms::MissingStrategy=PassMissing(),optarg=1:3) # generated
# f2([1.0,missing], 2:4) # no method defined -> rethink argument ordering
In order to call the PassMissing variant in the above case, one needs to call @handlemissings_typed separately for the method with a single argument and for the method with two arguments, and place the default missing strategy behind the second argument in the second case.
f3(x::AbstractArray{<:Real},optarg=1:3) = x
@handlemissings_typed(f3(x::AbstractArray{<:Real})=0,1,2,Any)
@handlemissings_typed(f3(x::AbstractArray{<:Real}, optarg)=0,1,3,Any)
ismissing(f3([1.0,missing], 2:4))
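The behaviour noted above follows from how Julia expands default values of positional arguments. The following is a minimal sketch using Base Julia only; DemoStrategy, DemoPass and g are hypothetical stand-ins and not part of MissingStrategies.

```julia
# Hypothetical stand-ins for the package's strategy types, so that the
# sketch runs with Base Julia only.
abstract type DemoStrategy end
struct DemoPass <: DemoStrategy end

# A strategy argument with a default, placed before another optional argument.
g(x, ms::DemoStrategy=DemoPass(), optarg=1:3) = (ms, optarg)

# Julia expands this definition into exactly three methods:
#   g(x), g(x, ms::DemoStrategy), g(x, ms::DemoStrategy, optarg)
n_methods = length(methods(g))

# The strategy can be omitted only when optarg is omitted as well.
only_x = g([1.0, 2.0])
all_args = g([1.0, 2.0], DemoPass(), 2:4)

# Passing optarg in second position finds no method, because the second
# positional argument must be a DemoStrategy.
skipping_fails = try
    g([1.0, 2.0], 2:4)
    false
catch err
    err isa MethodError
end
```

Since trailing defaults can only be dropped from the right, the strategy argument has to come after every optional argument that callers want to set while still relying on the default strategy.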
MissingStrategies.@handlemissings_any (Macro)
@handlemissings_any(fun, ...)
Calls handlemissings with defaults tailored to an original method whose eltype already accepts missings:
- Dispatching methods as with @handlemissings_stub.
- The argument type of the new function defaults to Any. No default method (without the MissingStrategy argument) is created.
- The PassMissing method calls the original directly without converting the type of the argument with missings (mgen.passmissing_nonconvert).
- The HandleMissing method calls the original directly with the skipmissing() iterator object (mgen.handlemissing_skip).
Arguments: see handlemissings
Note that if the original method allows missing in eltype, you need to pass PassMissing() explicitly as an argument. A potential default method would override the original method and would either not be called at all or call itself recursively, causing an infinite loop.
Infrastructure
The macros above just wrap a call to the handlemissings
function using different argument values, especially different generator functions.
MissingStrategies.handlemissings (Function)
handlemissings(fun, ...)
Creates an expression that defines new methods that allow missings in the eltype of an argument.
Arguments
- fun: Expression of a function to extend
- pos_missing = 1: The position of the argument that should handle missings
- pos_strategy = pos_missing + 1: The position at which the argument of MissingStrategy is to be inserted into the function signature
- type_missing = :Any: The new type of the argument that should handle missings. This can be an expression of the value.
- defaultstrategy::Union{Nothing,MissingStrategy} = :nothing: the value of the default of the strategy argument. Use :nothing to indicate not specifying a default value. This can be an expression of the value.
- gens = (): Tuple of generator functions (see mgen)
- argname_strategy = :ms: symbol of the argument name of the strategy argument
- suffix = "_hm": attached to the name of the dispatching function to avoid method ambiguities
Usually this function is called from a macro that provides suitable default values:
- @handlemissings_typed: suitable if the type of the original method does not allow missing values.
- @handlemissings_any: suitable if the type of the original method does allow missing values.
- @handlemissings_stub: suitable for writing user-specified handling routines.
The SimpleTraits package together with the IsEltypeSuperOfMissing
trait is used to dispatch to different methods depending on whether the eltype of a given argument allows for missing or does not allow for missings.
The original method is extended by a method signature with a modified type of a given argument, usually to eltype Union{Missing,<eltype_orig>}, and an additional argument ms::MissingStrategy. The new method forwards to a dispatching function of name <name_orig><suffix> with MissingStrategy as the first argument and the given argument x of type x::::IsEltypeSuperOfMissing. The suffix defaults to "_hm" but can be changed to avoid method ambiguities if methods are extended that differ only by the original type of x. A further advantage of using a separate dispatching function is that the original function is not extended by too many new methods.
The dispatching function can be extended by SimpleTraits.@traitfn to handle the different combinations of whether the eltype of x allows missing or not and the different missing strategies. See @handlemissings_stub for an example.
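To make the pattern described above concrete, here is a hand-written sketch of the kind of methods involved. It deliberately uses Base Julia only: a Val-based check stands in for the SimpleTraits IsEltypeSuperOfMissing trait, and DemoStrategy, DemoPass, DemoSkip, total and total_hm are hypothetical names, so this is an illustration of the idea rather than the code the package generates.

```julia
# Hypothetical stand-ins; the package's own MissingStrategy types and the
# SimpleTraits-based trait are not used here.
abstract type DemoStrategy end
struct DemoPass <: DemoStrategy end
struct DemoSkip <: DemoStrategy end

# Original method: does not accept missings in the eltype.
total(x::AbstractArray{<:Real}) = sum(x)

# Trait-like value: does the eltype of x allow missing?
allowsmissing(x) = Val(Missing <: eltype(x))

# Forwarder: widened argument type plus an inserted strategy argument.
# It calls the dispatching function with the strategy in first position.
total(x::AbstractArray{<:Union{Missing,Real}}, ms::DemoStrategy) =
    total_hm(ms, x, allowsmissing(x))

# Eltype cannot hold missing: any strategy calls the original directly.
total_hm(::DemoStrategy, x, ::Val{false}) = total(x)

# Eltype can hold missing: one method per strategy.
total_hm(::DemoPass, x, ::Val{true}) =
    any(ismissing, x) ? missing : total(convert.(nonmissingtype(eltype(x)), x))
total_hm(::DemoSkip, x, ::Val{true}) = total(collect(skipmissing(x)))
```

Keeping the strategy-specific methods on the separate function total_hm means that total itself gains only the single forwarding method.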
Generator functions
The argument gens of handlemissings takes a tuple of generator functions that do the actual work of defining additional methods.
The generator functions used are defined in the submodule mgen.
MissingStrategies.mgen (Module)
Submodule defining functions that generate specific methods that handle missings. They are called from within handlemissings.
The keyword arguments of the generator functions correspond to entries in the dictionary returned by getdispatchinfo.
MissingStrategies.mgen.forwarder (Method)
forwarder(...)
Defines a method with the new type of the missing argument and an inserted MissingStrategy argument. It forwards to the dispatching function of the new name, with the MissingStrategy at first position.
MissingStrategies.mgen.missingstrategy_notsuperofeltype (Method)
missingstrategy_notsuperofeltype(...)
Defines a trait method for any MissingStrategy for arguments whose eltype does not allow missing. This just forwards to the original function.
MissingStrategies.mgen.passmissing_nonconvert (Method)
passmissing_nonconvert(...)
Defines a trait method for PassMissing for arguments whose eltype allows missing. This method returns missing if any missing items are encountered; otherwise it calls the original function with unmodified arguments, i.e. with a type that allows missings in its eltype.
MissingStrategies.mgen.passmissing_convert (Method)
passmissing_convert(...)
Defines a trait method for PassMissing for arguments whose eltype allows missing. This method returns missing if any missing items are encountered; otherwise it calls the original function, but converts the argument to the corresponding nonmissing type.
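The conversion matters because an array whose eltype merely allows missing is rejected by a signature restricted to nonmissing eltypes, even when it holds no missing value. The sketch below shows this with Base Julia only; mean_real and pass_convert are hypothetical names that mirror the described behaviour and are not the package's generated code.

```julia
# Original method that rejects arrays whose eltype allows missing.
mean_real(x::AbstractArray{<:Real}) = sum(x) / length(x)

# No missing value is present, but the eltype allows one.
x = Union{Missing,Float64}[1.0, 2.0, 3.0]

# Calling the original directly fails on the eltype alone.
direct_fails = try
    mean_real(x)
    false
catch err
    err isa MethodError
end

# Hypothetical helper mirroring the described PassMissing behaviour:
# return missing if any item is missing, else convert the eltype and call f.
function pass_convert(f, x)
    any(ismissing, x) && return missing
    f(convert.(nonmissingtype(eltype(x)), x))
end

converted = pass_convert(mean_real, x)
with_missing = pass_convert(mean_real, [1.0, missing])
```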
MissingStrategies.mgen.handlemissing_collect_skip (Method)
handlemissing_collect_skip(...)
Defines a trait method for HandleMissingStrategy
for arguments whose eltype allows missings. This method transforms the argument by collect(skipmissing(x))
before passing it on to the original function.
Hence it passes a vector with corresponding nonmissing eltype, but does require allocation.
MissingStrategies.mgen.handlemissing_skip (Method)
handlemissing_skip(...)
Defines a trait method for HandleMissingStrategy for arguments whose eltype allows missings. This method transforms the argument by skipmissing(x) before passing it on to the original function.
Hence it passes an iterator of undefined type, but does not require allocations.
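The trade-off between the two skip variants can be seen with Base Julia alone. In this sketch accepts_array and accepts_iterable are hypothetical stand-ins for original functions with a narrow and a wide signature.

```julia
x = [1.0, missing, 3.0]

collected = collect(skipmissing(x))   # allocates a new Vector{Float64}
lazy = skipmissing(x)                 # lazy wrapper around x, no copy

# The collected version satisfies array-typed signatures.
accepts_array(v::AbstractArray{<:Real}) = sum(v)

# The iterator version needs a signature that accepts any iterable.
accepts_iterable(v) = sum(v)

sum_collected = accepts_array(collected)
sum_lazy = accepts_iterable(lazy)

# The lazy wrapper is not an AbstractArray, so accepts_array(lazy)
# would raise a MethodError.
lazy_is_array = lazy isa AbstractArray
```

This suggests why the collecting variant fits original methods with array-typed arguments, while the lazy variant fits methods that accept general iterables.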
Each generator function requires arguments that are collected by getdispatchinfo and passed by splatting the resulting dictionary in the call of the generator function.
MissingStrategies.getdispatchinfo (Function)
getdispatchinfo(...)
Collects information for dispatching non-missing types into a dictionary.
Arguments
see handlemissings.
Return value
A dictionary with entries
- dict_forig: dictionary result of MacroTools.splitdef(fun),
- fname_disp: Symbol of the dispatch function name,
- argnames: the argument names of fun,
- kwargpasses: a Vector of expressions of keyword parameters 'argname = argname',
- pos_missing: position of the argument that should handle missings,
- type_missing: the new type of that argument,
- pos_strategy: the position at which the argument of MissingStrategy is inserted,
- defaultstrategy: the default of that argument of type MissingStrategy,
- suffix = "_hm": suffix to be appended to fname_disp to avoid method ambiguities.
These match the required arguments for the method generators in module mgen.
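As a rough illustration of the kind of information listed above, the following sketch inspects a signature expression with Base Julia only. The package is described as using MacroTools.splitdef for this step; the manual Expr handling here, including the helper argname, is a simplified and hypothetical stand-in.

```julia
# Signature expression in the form passed to the macros.
ex = :(f1(x::AbstractArray{<:Real}, optarg=1:3) = 0)
call = ex.args[1]                  # the call part: f1(x::..., optarg=1:3)

fname = call.args[1]               # :f1
fname_disp = Symbol(fname, "_hm")  # name of the dispatching function

# Strip default values and type annotations to obtain the argument names.
function argname(a)
    a isa Expr && a.head == :kw && (a = a.args[1])    # drop default value
    a isa Expr && a.head == :(::) && (a = a.args[1])  # drop type annotation
    a
end
argnames = [argname(a) for a in call.args[2:end]]

# Insert a strategy argument at pos_strategy. The function name occupies
# the first slot of call.args, hence the offset of one.
pos_strategy = 2
newcall = copy(call)
insert!(newcall.args, pos_strategy + 1, :(ms::MissingStrategy))
```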