Replies: 6 comments
-
You have basically recreated

set.seed(1)
d <- data.frame(x1 = 1:10,
                x2 = c(1:9, 10.1))
d$y1 <- d$x1 + rnorm(nrow(d), 0, 0.1)
d$y2 <- d$x2 + rnorm(nrow(d), 0, 0.1)
cor(d)
#>           x1        x2        y1        y2
#> x1 1.0000000 0.9999608 0.9996862 0.9993470
#> x2 0.9999608 1.0000000 0.9995650 0.9994010
#> y1 0.9996862 0.9995650 1.0000000 0.9985884
#> y2 0.9993470 0.9994010 0.9985884 1.0000000

Created on 2020-10-09 by the reprex package (v0.3.0)

Add more variance and up the threshold to avoid

library(mice)
#>
#> Attaching package: 'mice'
#> The following objects are masked from 'package:base':
#>
#> cbind, rbind
set.seed(1)
d <- data.frame(x1 = 1:10,
                x2 = c(1:9, 10.1))
d$y1 <- d$x1 + rnorm(nrow(d), 0, 2)
d$y2 <- d$x2 + rnorm(nrow(d), 0, 2)
cor(d)
#>           x1        x2        y1        y2
#> x1 1.0000000 0.9999608 0.9104953 0.8441215
#> x2 0.9999608 1.0000000 0.9091064 0.8443825
#> y1 0.9104953 0.9091064 1.0000000 0.6746252
#> y2 0.8441215 0.8443825 0.6746252 1.0000000
meth <- c(x1 = "", x2 = "", y1 = "norm", y2 = "norm")
predMat <- matrix(0, nrow = ncol(d), ncol = ncol(d))
rownames(predMat) <- colnames(predMat) <- names(d)
predMat["y1", "x1"] <- 1
predMat["y2", "x2"] <- 1
predMat
#>    x1 x2 y1 y2
#> x1  0  0  0  0
#> x2  0  0  0  0
#> y1  1  0  0  0
#> y2  0  1  0  0
mice.out <- mice(data = d,
                 m = 5,
                 maxit = 10,
                 method = meth,
                 predictorMatrix = predMat,
                 print = FALSE,
                 threshold = 1) # multicollinearity threshold
mice.out$predictorMatrix
#>    x1 x2 y1 y2
#> x1  0  0  0  0
#> x2  0  0  0  0
#> y1  1  0  0  0
#> y2  0  1  0  0
mice.out$loggedEvents
#> NULL

Created on 2020-10-09 by the reprex package (v0.3.0)
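
To see which variable pairs a correlation-based collinearity check could flag at a given cutoff, the correlation matrix above can be screened directly. This is a minimal base R sketch, not mice's internal check: `high_cor_pairs` is a hypothetical helper, and the cutoff of 0.999 is an illustrative value chosen only because it sits below the x1/x2 correlation of 0.9999608 shown above.

```r
# Hypothetical helper (not part of mice): list variable pairs whose absolute
# pairwise correlation is strictly greater than `cutoff`.
high_cor_pairs <- function(data, cutoff) {
  r <- abs(cor(data, use = "pairwise.complete.obs"))
  # Look only at the upper triangle so each pair is reported once
  idx <- which(upper.tri(r) & r > cutoff, arr.ind = TRUE)
  data.frame(var1 = rownames(r)[idx[, "row"]],
             var2 = colnames(r)[idx[, "col"]],
             abs_cor = r[idx],
             stringsAsFactors = FALSE)
}

# Same data as in the reprex above (noise sd = 2)
set.seed(1)
d <- data.frame(x1 = 1:10,
                x2 = c(1:9, 10.1))
d$y1 <- d$x1 + rnorm(nrow(d), 0, 2)
d$y2 <- d$x2 + rnorm(nrow(d), 0, 2)

flagged <- high_cor_pairs(d, cutoff = 0.999) # illustrative cutoff (assumption)
none <- high_cor_pairs(d, cutoff = 1)        # a cutoff of 1 flags nothing

print(flagged)
print(none)
```

With these data only the x1/x2 pair exceeds the illustrative cutoff, and raising the cutoff to 1 leaves no pair flagged, which is in line with the empty `loggedEvents` above.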
-
Thanks for your reply Gerko. I will use the

What I was suggesting was that, as far as I can see, the high correlation between

But I can accept that I will have to set

Thanks.
-
The behaviour is as intended. There are two different things happening in your code:
-
Thanks for the explanation Stef.
-
Related #150
-
On rereading, I think there is an error in my advice above: It is the
-
I've seen some behaviour from mice that seems strange to me: high-correlation logged events for predictor variables that are not used in the same model. I'm using mice version 3.11.0.

In the following example, x1 and x2 have very high correlation but they are not used in the same model, so I would not have expected a problem.

Since x1 and x2 don't appear as predictor variables in the same model, I don't understand why their high correlation causes a logged event. We also see that the imputed values are bad.

Of course this is an extreme example and we could run mice separately for y1 and y2, but in another situation they could have predictor variables in common.

Maybe I'm misunderstanding something. Is this behaviour intended?

Thanks.
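
Whether two predictors ever enter the same imputation model can be read off the predictor matrix. The sketch below is a rough illustration in base R: it assumes the mice convention that rows are imputation targets and columns are predictors, and `shared_models` is a hypothetical helper, not a mice function.

```r
# Predictor matrix matching the setup described above:
# y1 is imputed from x1 only, y2 from x2 only.
vars <- c("x1", "x2", "y1", "y2")
predMat <- matrix(0, nrow = 4, ncol = 4, dimnames = list(vars, vars))
predMat["y1", "x1"] <- 1
predMat["y2", "x2"] <- 1

# Hypothetical helper: names of the target variables (rows) whose model
# uses both `a` and `b` as predictors (columns).
shared_models <- function(pm, a, b) {
  rownames(pm)[pm[, a] == 1 & pm[, b] == 1]
}

# No target uses x1 and x2 together
separate <- shared_models(predMat, "x1", "x2")

# For contrast: add x2 to the model for y1, so y1 now uses both
predMat2 <- predMat
predMat2["y1", "x2"] <- 1
together <- shared_models(predMat2, "x1", "x2")

print(separate)
print(together)
```

An empty result for the first matrix confirms that x1 and x2 are never predictors in the same model, which is the situation the question is about.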