Clarification on the Role of predictorMatrix in the mice Package for Imputation #674
-
If there are no predictors, |
-
Hi, I have the following three questions about using the mice package for imputation.
-
It sounds as if you have duplicate or linearly dependent variables that cause the multicollinearity error. If so, you can remove some of them. The error can also occur when there are sparse categorical variables, so check for that as well.
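The two checks suggested above can be sketched in base R. This is only an illustration on toy data: the data frame `df`, its column names, and the threshold of 5 observations per level are all made up for the example and are not taken from the thread.

```r
# Toy data (hypothetical): `c` is an exact linear combination of `a` and `b`,
# and the factor `grp` has one level that occurs only once.
set.seed(1)
df <- data.frame(a = rnorm(20), b = rnorm(20))
df$c <- df$a + 2 * df$b
df$grp <- factor(c(rep("x", 19), "rare"))

# Check 1: linear dependence among the numeric columns.
# If the rank of the numeric matrix is below its number of columns,
# at least one column is a linear combination of the others.
num <- as.matrix(df[sapply(df, is.numeric)])
rank_deficient <- qr(num)$rank < ncol(num)

# Check 2: sparse categories.
# Tabulate each factor and report levels with fewer than 5 observations
# (the cutoff of 5 is an arbitrary choice for this sketch).
level_counts <- lapply(df[sapply(df, is.factor)], table)
sparse_levels <- lapply(level_counts, function(tab) names(tab)[as.vector(tab) < 5])

rank_deficient
sparse_levels
```

If the rank check flags a problem, dropping one of the dependent columns (here `c`) before imputation is the simplest remedy; sparse levels can be merged with a neighbouring category.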
-
Thank you very much for your patient response. I would like to ask a few more questions:
-
Hi,
I'm using the mice package for imputation and wanted to speed up the process by using quickpred. I understand that in the default case (pred <- quickpred(data, minpuc = 0, mincor = 0.1)), the variable in each row is imputed using the column variables marked with 1.
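As a small sketch of that reading of the matrix (rows are targets, columns are predictors), the following uses the `nhanes` example data that ships with mice rather than the data from this post. The helper name `predictors_for` is made up for the illustration.

```r
library(mice)

# nhanes ships with mice: age is complete; bmi, hyp and chl have missing values.
pred <- quickpred(nhanes, mincor = 0.1, minpuc = 0)

# For each target variable (row), list the columns marked with 1,
# i.e. the variables that would be used to impute it.
predictors_for <- setNames(
  lapply(rownames(pred), function(v) colnames(pred)[pred[v, ] == 1]),
  rownames(pred)
)

pred
predictors_for
```

The diagonal is always 0, since a variable is never used to predict itself, and in this sketch the row for the fully observed variable `age` comes out empty because there is nothing to impute for it.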
> print(pred)
                                OSESmean.income PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 sex age TIV total_WhiteSurfArea_area total_MeanThickness_thickness edu
OSESmean.income                 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1
PC1                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC2                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC3                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC4                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC5                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC6                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC7                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC8                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC9                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC10                            0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sex                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
age                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
TIV                             0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0
total_WhiteSurfArea_area        0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0
total_MeanThickness_thickness   0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
edu                             0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0
height                          0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1
weightMRI                       0 1 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0
urban_score15                   1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1
urban_score18                   1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1
lh_bankssts_area                0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0
lh_caudalanteriorcingulate_area 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0
lh_caudalmiddlefrontal_area     0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0
lh_cuneus_area                  0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0

                                height weightMRI urban_score15 urban_score18 lh_bankssts_area lh_caudalanteriorcingulate_area lh_caudalmiddlefrontal_area
OSESmean.income                 0 0 1 1 0 0 0
PC1                             0 0 0 0 0 0 0
PC2                             0 0 0 0 0 0 0
PC3                             0 0 0 0 0 0 0
PC4                             0 0 0 0 0 0 0
PC5                             0 0 0 0 0 0 0
PC6                             0 0 0 0 0 0 0
PC7                             0 0 0 0 0 0 0
PC8                             0 0 0 0 0 0 0
PC9                             0 0 0 0 0 0 0
PC10                            0 0 0 0 0 0 0
sex                             0 0 0 0 0 0 0
age                             0 0 0 0 0 0 0
TIV                             1 1 0 0 1 1 1
total_WhiteSurfArea_area        1 1 0 0 1 1 1
total_MeanThickness_thickness   0 0 0 0 0 0 0
edu                             1 0 1 1 0 0 0
height                          0 1 1 1 1 1 1
weightMRI                       1 0 1 1 1 1 1
urban_score15                   1 1 0 1 0 0 0
urban_score18                   1 1 1 0 0 0 0
lh_bankssts_area                1 1 0 0 0 1 1
lh_caudalanteriorcingulate_area 1 1 0 0 1 0 1
lh_caudalmiddlefrontal_area     1 1 0 0 1 1 0
lh_cuneus_area                  1 1 0 0 1 1 1

                                lh_cuneus_area
OSESmean.income                 0
PC1                             0
PC2                             0
PC3                             0
PC4                             0
PC5                             0
PC6                             0
PC7                             0
PC8                             0
PC9                             0
PC10                            0
sex                             0
age                             0
TIV                             1
total_WhiteSurfArea_area        1
total_MeanThickness_thickness   0
edu                             0
height                          1
weightMRI                       1
urban_score15                   0
urban_score18                   0
lh_bankssts_area                1
lh_caudalanteriorcingulate_area 1
lh_caudalmiddlefrontal_area     1
lh_cuneus_area                  0
However, when I set minpuc = 0.6 and mincor = 0.6, some rows (e.g. OSESmean.income or lh_bankssts_area, both of which have missing data) contain only zeros. How is a target variable imputed when its row in predictorMatrix has no predictors (all zeros)?
> print(pred)
                                OSESmean.income PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 sex age TIV total_WhiteSurfArea_area total_MeanThickness_thickness edu
OSESmean.income                 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC1                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC2                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC3                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC4                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC5                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC6                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC7                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC8                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC9                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
PC10                            0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sex                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
age                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
TIV                             0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
total_WhiteSurfArea_area        0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
total_MeanThickness_thickness   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
edu                             0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
height                          0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
weightMRI                       0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
urban_score15                   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
urban_score18                   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lh_bankssts_area                0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lh_caudalanteriorcingulate_area 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lh_caudalmiddlefrontal_area     0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lh_cuneus_area                  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

                                height weightMRI urban_score15 urban_score18 lh_bankssts_area lh_caudalanteriorcingulate_area lh_caudalmiddlefrontal_area
OSESmean.income                 0 0 0 0 0 0 0
PC1                             0 0 0 0 0 0 0
PC2                             0 0 0 0 0 0 0
PC3                             0 0 0 0 0 0 0
PC4                             0 0 0 0 0 0 0
PC5                             0 0 0 0 0 0 0
PC6                             0 0 0 0 0 0 0
PC7                             0 0 0 0 0 0 0
PC8                             0 0 0 0 0 0 0
PC9                             0 0 0 0 0 0 0
PC10                            0 0 0 0 0 0 0
sex                             0 0 0 0 0 0 0
age                             0 0 0 0 0 0 0
TIV                             0 0 0 0 0 0 0
total_WhiteSurfArea_area        0 0 0 0 0 0 0
total_MeanThickness_thickness   0 0 0 0 0 0 0
edu                             0 0 0 0 0 0 0
height                          0 0 0 0 0 0 0
weightMRI                       0 0 0 0 0 0 0
urban_score15                   0 0 0 0 0 0 0
urban_score18                   0 0 0 0 0 0 0
lh_bankssts_area                0 0 0 0 0 0 0
lh_caudalanteriorcingulate_area 0 0 0 0 0 0 0
lh_caudalmiddlefrontal_area     0 0 0 0 0 0 0
lh_cuneus_area                  0 0 0 0 0 0 0

                                lh_cuneus_area
OSESmean.income                 0
PC1                             0
PC2                             0
PC3                             0
PC4                             0
PC5                             0
PC6                             0
PC7                             0
PC8                             0
PC9                             0
PC10                            0
sex                             0
age                             0
TIV                             0
total_WhiteSurfArea_area        0
total_MeanThickness_thickness   0
edu                             0
height                          0
weightMRI                       0
urban_score15                   0
urban_score18                   0
lh_bankssts_area                0
lh_caudalanteriorcingulate_area 0
lh_caudalmiddlefrontal_area     0
lh_cuneus_area                  0
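One way to probe the all-zero-row case empirically is a small experiment on the `nhanes` example data that ships with mice (not the data from this post). The sketch below blanks out the predictor row for `bmi` and then inspects what mice does with it. The names `bmi_filled` and `bmi_from_observed` are made up for the illustration, and the number of imputations and iterations is kept small only to make it run quickly.

```r
library(mice)

# Start from the default predictor matrix for nhanes, then blank out
# the row for bmi so that this target has no predictors at all.
pred <- make.predictorMatrix(nhanes)
pred["bmi", ] <- 0

# All nhanes columns are numeric, so the default method is pmm.
imp <- mice(nhanes, predictorMatrix = pred,
            m = 2, maxit = 2, seed = 123, printFlag = FALSE)

# Were the missing bmi values filled in despite the empty row?
bmi_filled <- !anyNA(complete(imp, 1)$bmi)

# Do all imputed bmi values coincide with observed bmi values?
observed_bmi <- nhanes$bmi[!is.na(nhanes$bmi)]
bmi_from_observed <- all(unlist(imp$imp$bmi) %in% observed_bmi)

bmi_filled
bmi_from_observed

# Any problems mice ran into are recorded here (NULL if there were none).
imp$loggedEvents
```

If both checks come out TRUE, that would be consistent with the variable being imputed without any covariate information, that is, from its own observed distribution only. This is an inference from the experiment, not a statement of documented behaviour.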
Thanks for clarifying this behavior!
Best,
Jasmine