Obtain overimputation objects

For demonstration, we use the newborn dataset included in the vismi package. This is an incomplete dataset with missing values in variables of various types.

We can obtain an overimputation object with five imputed datasets (m = 5), an extra 20% of values set to missing (p = 0.2), and a test set ratio of 20% (test_ratio = 0.2). The imputation method can be set to "mixgb" or "mice", which calls mixgb() or mice() in the backend. Users can also pass additional arguments for mixgb() or mice() through overimp().

Under this setting, 20% extra missing values are introduced and the data are split into training data (80%) and test data (20%). An imputation model is built using only the training data and is then used to impute the extra missing values in both the training data and the test data.
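
To make the idea concrete, here is a minimal base R sketch of the split-and-mask step on a toy data frame. It is an illustration only, not how overimp() is implemented: the toy data, the helper mask_column(), the order of splitting and masking, and the choice to mask 20% of the observed values in a single column are all assumptions made for this example.

```r
# Illustrative sketch only; this is NOT the vismi implementation.
set.seed(1)

# Hypothetical toy data with two fully observed numeric columns
toy <- data.frame(a = rnorm(100), b = rnorm(100))
n <- nrow(toy)
test_ratio <- 0.2  # share of rows held out as test data
p <- 0.2           # share of observed values to mask

# Split rows into training (80%) and test (20%) sets
test_idx <- sample.int(n, size = round(test_ratio * n))
train <- toy[-test_idx, , drop = FALSE]
test <- toy[test_idx, , drop = FALSE]

# Hypothetical helper: set a proportion p of the observed values to NA
# and remember which positions were masked
mask_column <- function(x, p) {
  observed <- which(!is.na(x))
  n_mask <- round(p * length(observed))
  masked <- observed[sample.int(length(observed), size = n_mask)]
  truth <- x[masked]
  x[masked] <- NA
  list(values = x, masked = masked, truth = truth)
}

masked_a <- mask_column(train$a, p)

# The masked positions now hold NA; the true values are kept aside so
# that imputed values can later be compared against them
sum(is.na(masked_a$values))
```

Because the true values at the masked positions are retained, any imputation of those positions can be checked against them, which is what the diagnostic plots in the following sections compare.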

library(vismi)
obj <- overimp(data = newborn, m = 5, p = 0.2, test_ratio = 0.2,
    method = "mixgb", pmm.type = "auto")

Visual diagnostic for overimputation

1D visualisation

Numeric variable

The options for a numeric variable include: cv, density, ridge, qq, and qqline.

vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "cv")

vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "density")

vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "ridge")

vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "qq")

vismi_overimp(obj = obj, x = "head_circumference_cm", num_plot = "qqline")

Factor variable

The options for a factor variable include: cv, bar, and dodge.

vismi_overimp(obj = obj, x = "ethnicity", fac_plot = "cv")

vismi_overimp(obj = obj, x = "ethnicity", fac_plot = "cv", stack_y = TRUE)

vismi_overimp(obj = obj, x = "ethnicity", fac_plot = "cv", stack_y = TRUE,
    diag_color = "white")

vismi_overimp(obj = obj, x = "health", fac_plot = "bar")

vismi_overimp(obj = obj, x = "health", fac_plot = "dodge")

2D visualisation

2 numeric variables

vismi_overimp(obj = obj, x = "head_circumference_cm", y = "recumbent_length_cm")

1 factor and 1 numeric variable

vismi_overimp(obj = obj, y = "sex", x = "head_circumference_cm",
    alpha = 0.5, point_size = 0.2, boxpoints = "all")

2 factor variables

vismi_overimp(obj = obj, x = "health", y = "sex")