
Overimputation Diagnostics for Multiple Imputation
Source: vignettes/articles/vismi_overimp_demo.qmd
Obtain overimputation objects
For demonstration, we use the newborn dataset included in the vismi package. This is an incomplete dataset with missing values in variables of various types.
We can obtain an overimputation object with 5 imputations (m = 5), an extra 20% of values set to missing (p = 0.2), and a test set ratio of 20% (test_ratio = 0.2). The imputation method can be set to "mixgb" or "mice", which calls mixgb() or mice() in the backend. Users can also pass additional arguments for mixgb() or mice() through overimp().
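The call described above might look like the sketch below. The argument names m, p and test_ratio come from the text; passing the data set as the first argument and naming the method argument `method` are assumptions, so check ?overimp before relying on them. The call is wrapped in a requireNamespace() guard so the snippet does nothing when vismi is not installed.

```r
# Run only when vismi is available
if (requireNamespace("vismi", quietly = TRUE)) {
  library(vismi)

  # Assumption: the data set is the first argument and the imputation
  # backend is chosen with an argument called `method`.
  obj <- overimp(
    newborn,
    m = 5,             # number of imputations
    p = 0.2,           # proportion of extra missing values
    test_ratio = 0.2,  # share of rows held out as test data
    method = "mixgb"   # or "mice"
  )
}
```

The resulting obj is the object passed to vismi_overimp() in the plotting calls that follow.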
Under this setting, an extra 20% of missing values is introduced and the data are split into training data (80%) and test data (20%). The imputation model is built using only the training data and is then used to impute the extra missing values in both the training data and the test data.
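To make the procedure concrete, here is a minimal base R sketch of the same idea on simulated data. It is not the package's implementation: the toy data frame, the single masked variable, and the use of lm() as a stand-in imputation model are all assumptions made for illustration.

```r
set.seed(2024)

# Toy complete data standing in for a real data set (hypothetical values)
n <- 200
toy <- data.frame(x2 = rnorm(n))
toy$x1 <- 1 + 0.8 * toy$x2 + rnorm(n, sd = 0.5)

# 1. Split rows into training (80%) and test (20%) data
test_ratio <- 0.2
is_test <- seq_len(n) %in% sample(n, size = round(test_ratio * n))

# 2. Set an extra 20% of the observed x1 values to missing, keeping the truth
p <- 0.2
observed <- which(!is.na(toy$x1))
mask_idx <- sample(observed, size = round(p * length(observed)))
truth <- toy$x1[mask_idx]
masked <- toy
masked$x1[mask_idx] <- NA

# 3. Fit the stand-in imputation model on training rows only
#    (lm() drops rows with missing x1 by default)
fit <- lm(x1 ~ x2, data = masked[!is_test, ])

# 4. Impute the masked values in both training and test rows
imputed <- predict(fit, newdata = masked[mask_idx, , drop = FALSE])

# 5. Compare imputed values with the held-back truth, separately by split
in_test <- is_test[mask_idx]
rmse_train <- sqrt(mean((imputed[!in_test] - truth[!in_test])^2))
rmse_test <- sqrt(mean((imputed[in_test] - truth[in_test])^2))
c(train = rmse_train, test = rmse_test)
```

Because the masked values are known, imputed and true values can be compared directly. The plots in the rest of this article make that comparison visually.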
Visual diagnostic for overimputation
1D visualisation
Numeric variable
The options for a numeric variable include: cv, density, ridge, qq, and qqline.
vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "cv")
vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "density")
vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "ridge")
vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "qq")
vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "qqline")
Factor variable
The options for a factor variable include: cv, bar, dodge.
vismi_overimp(obj = obj, x = "ethnicity", fac_plot = "cv")
vismi_overimp(obj = obj, x = "ethnicity", fac_plot = "cv", stack_y = TRUE)
vismi_overimp(obj = obj, x = "ethnicity", fac_plot = "cv", stack_y = TRUE,
              diag_color = "white")
vismi_overimp(obj = obj, x = "health", fac_plot = "bar")
vismi_overimp(obj = obj, x = "health", fac_plot = "dodge")
2D visualisation
2 numeric variables
vismi_overimp(obj = obj, x = "head_circumference_cm", y = "recumbent_length_cm")
1 factor and 1 numeric variable
vismi_overimp(obj = obj, y = "sex", x = "head_circumference_cm",
              alpha = 0.5, point_size = 0.2, boxpoints = "all")
2 factor variables
vismi_overimp(obj = obj, x = "health", y = "sex")