This morning I'm working with some data that hasn't been touched since November (over 7 months ago). I'm the maintainer for this data, which lives on my personal machine, and I use UNF to validate which version of the dataset I'm working with. Today I'm getting UNF values that are inconsistent with the values calculated last November. I'm getting similar inconsistencies for some of the examples in ?unf (shown below), in particular for unf(longley, ver=4, digits=3) and for unf(cbind.data.frame(x1,x2),ver=3) and its equivalents. The UNFs for my data were calculated using version 6.
Both calculations were done using UNF version 2.0.6 on the same machine. One potential difference is that last November I was using R 3.5.1 and today I'm using R 4.0.0.
Please specify whether your issue is about:
a possible bug
a question about package functionality
a suggested code or documentation change, improvement to the code, or feature request
Put your code here:
library(UNF)
# Version 6 #

### FORTHCOMING ###

# Version 5 #
## vectors
### just numerics
unf5(1:20) # UNF:5:/FIOZM/29oC3TK/IE52m2A==
#> UNF5:/FIOZM/29oC3TK/IE52m2A==
unf5(-3:3, dvn_zero=TRUE) # UNF:5:pwzm1tdPaqypPWRWDeW6Jw==
#> UNF5:pwzm1tdPaqypPWRWDeW6Jw==
### characters and factors
unf5(c('test','1','2','3')) # UNF:5:fH4NJMYkaAJ16OWMEE+zpQ==
#> UNF5:fH4NJMYkaAJ16OWMEE+zpQ==
unf5(as.factor(c('test','1','2','3'))) # UNF:5:fH4NJMYkaAJ16OWMEE+zpQ==
#> UNF5:fH4NJMYkaAJ16OWMEE+zpQ==
### logicals
unf5(c(TRUE,TRUE,FALSE), dvn_zero=TRUE) # UNF:5:DedhGlU7W6o2CBelrIZ3iw==
#> UNF5:DedhGlU7W6o2CBelrIZ3iw==
### missing values
unf5(c(1:5,NA)) # UNF:5:Msnz4m7QVvqBUWxxrE7kNQ==
#> UNF5:Msnz4m7QVvqBUWxxrE7kNQ==
## variable order and object structure is irrelevant
unf(data.frame(1:3,4:6,7:9)) # UNF:5:ukDZSJXck7fn4SlPJMPFTQ==
#> UNF6:ukDZSJXck7fn4SlPJMPFTQ==
unf(data.frame(7:9,1:3,4:6))
#> UNF6:ukDZSJXck7fn4SlPJMPFTQ==
unf(list(1:3,4:6,7:9))
#> UNF6:ukDZSJXck7fn4SlPJMPFTQ==
# Version 4 #
# version 4
data(longley)
unf(longley, ver=4, digits=3) # PjAV6/R6Kdg0urKrDVDzfMPWJrsBn5FfOdZVr9W8Ybg=
#> UNF4:3,128:KjRoxvNqv+Gkbso2DZ5N3lztfFYA02PPy8KlAByze9s=
# version 4.1
unf(longley, ver=4.1, digits=3) # 8nzEDWbNacXlv5Zypp+3YCQgMao/eNusOv/u5GmBj9I=
#> UNF4.1:3,128:8nzEDWbNacXlv5Zypp+3YCQgMao/eNusOv/u5GmBj9I=
# Version 3 #
x1 <- 1:20
x2 <- x1 + .00001
unf3(x1) # HRSmPi9QZzlIA+KwmDNP8w==
#> UNF3:M+FD+2bN2GJGqHJmhZeWig==
unf3(x2) # OhFpUw1lrpTE+csF30Ut4Q==
#> UNF3:cN+0PxPJHvbQQd5I+pLKpg==
# UNFs are identical at specified level of rounding
identical(unf3(x1), unf3(x2))
#> [1] FALSE
identical(unf3(x1, digits=5),unf3(x2, digits=5))
#> [1] TRUE
# dataframes, matrices, and lists are all treated identically:
unf(cbind.data.frame(x1,x2),ver=3) # E8+DS5SG4CSoM7j8KAkC9A==
#> UNF3:eIjrbuHf+6rWU/XD+4F7+g==
unf(list(x1,x2), ver=3)
#> UNF3:eIjrbuHf+6rWU/XD+4F7+g==
unf(cbind(x1,x2), ver=3)
#> UNF3:eIjrbuHf+6rWU/XD+4F7+g==
sessionInfo()
#> R version 4.0.0 (2020-04-24)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Catalina 10.15.5
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] UNF_2.0.6
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_4.0.0 magrittr_1.5 tools_4.0.0 htmltools_0.4.0
#> [5] base64enc_0.1-3 yaml_2.2.1 Rcpp_1.0.4.6 stringi_1.4.6
#> [9] rmarkdown_2.1 highr_0.8 knitr_1.28 stringr_1.4.0
#> [13] xfun_0.13 digest_0.6.25 rlang_0.4.6 evaluate_0.14
Thanks for this report. Definitely concerning, but I'm wondering if it's unique to R 4.0.0. I'm not seeing these mismatches in 4.0.2, nor are there any issues on CRAN.
It's been a long time since I've looked at this code, so it's definitely possible there's a problem. But there's an intentionally thorough test suite to catch these kinds of things, so I'm hopeful it's an upstream problem that has since been resolved.
Just a note that I'm still seeing this in R 4.1.2 and UNF 2.0.8: the hashes I'm getting are the same as the ones I reported in the reprex, not what's in the docs. Here's an updated sessionInfo():
R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.6.8
Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] UNF_2.0.8
loaded via a namespace (and not attached):
[1] compiler_4.1.2 tools_4.1.2 base64enc_0.1-3 digest_0.6.29
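For anyone trying to reproduce this, here is a minimal sketch that tabulates which documented values differ from what the reprex printed. It uses base R only. The "documented" strings are copied from the trailing comments in the ?unf examples and the "computed" strings from the `#>` lines of the reprex above. `compare_unfs()` is a hypothetical helper written for this comment, not part of the UNF package.

```r
# Hypothetical helper (not part of UNF): line up documented and computed
# fingerprints and flag the pairs that differ.
compare_unfs <- function(documented, computed) {
  stopifnot(length(documented) == length(computed))
  data.frame(
    call       = names(documented),
    documented = unname(documented),
    computed   = unname(computed),
    match      = unname(documented == computed),
    stringsAsFactors = FALSE
  )
}

# Documented values: the comment after each call in the ?unf examples.
documented <- c(
  "unf(longley, ver=4, digits=3)"      = "PjAV6/R6Kdg0urKrDVDzfMPWJrsBn5FfOdZVr9W8Ybg=",
  "unf(longley, ver=4.1, digits=3)"    = "8nzEDWbNacXlv5Zypp+3YCQgMao/eNusOv/u5GmBj9I=",
  "unf3(x1)"                           = "HRSmPi9QZzlIA+KwmDNP8w==",
  "unf3(x2)"                           = "OhFpUw1lrpTE+csF30Ut4Q==",
  "unf(cbind.data.frame(x1,x2),ver=3)" = "E8+DS5SG4CSoM7j8KAkC9A=="
)

# Computed values: the hash part of each #> line in the reprex, same order.
computed <- c(
  "KjRoxvNqv+Gkbso2DZ5N3lztfFYA02PPy8KlAByze9s=",
  "8nzEDWbNacXlv5Zypp+3YCQgMao/eNusOv/u5GmBj9I=",
  "M+FD+2bN2GJGqHJmhZeWig==",
  "cN+0PxPJHvbQQd5I+pLKpg==",
  "eIjrbuHf+6rWU/XD+4F7+g=="
)

result <- compare_unfs(documented, computed)
print(result[, c("call", "match")])
```

With the values from this thread, only the ver=4.1 call matches its documented value. The version 3 and version 4 calls do not, while the version 5 and 6 examples in the reprex agree with the docs. On another machine, replacing the `computed` vector with locally obtained hashes should show whether the same calls are affected.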
Created on 2020-06-27 by the reprex package (v0.3.0)