Control over NA equality in base::rle() #157

Open
teunbrand opened this issue Dec 13, 2023 · 0 comments

I'd like an option in rle() to treat consecutive NAs as equal, so that they are counted as a single run. Currently, rle() treats each consecutive NA as a separate run of length 1:

x <- c(1, NA, NA, 3, 3, 3)
rle(x)
#> Run Length Encoding
#>   lengths: int [1:4] 1 1 1 3
#>   values : num [1:4] 1 NA NA 3

While I think this behaviour is technically correct, in practice run-length encoding is mostly used for compression or segmentation, where a separate run for every individual NA is of little use.
I'd like something like the following, where one can opt in to treating NAs as equal to each other:

rle2 <- function(x, na.equal = FALSE) {
  if (!is.vector(x) && !is.list(x)) 
    stop("'x' must be a vector of an atomic type")
  n <- length(x)
  if (n == 0L) 
    return(structure(list(lengths = integer(), values = x), class = "rle"))
  if (isTRUE(na.equal)) { # changed
    ux <- unique(x)       #
    x <- match(x, ux)     #
  }                       #
  y <- x[-1L] != x[-n]
  i <- c(which(y | is.na(y)), n)
  values <- x[i]
  if (isTRUE(na.equal)) { # changed
    values <- ux[values]  #
  }                       #
  structure(list(lengths = diff(c(0L, i)), values = values), class = "rle")
}
rle2(x, na.equal = TRUE)
#> Run Length Encoding
#>   lengths: int [1:3] 1 2 3
#>   values : num [1:3] 1 NA 3
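
Until such an option exists, the same idea can probably be applied as a thin wrapper around the unmodified base::rle(). The following is only a sketch under that assumption: `rle_na_equal` is a hypothetical helper name, not an existing function. It encodes the integer codes from match() and then maps them back to the original values.

```r
# Hypothetical helper (not part of base R): run-length encode `x`,
# treating consecutive NAs as a single run.
rle_na_equal <- function(x) {
  ux <- unique(x)            # distinct values; NA is kept as its own entry
  r <- rle(match(x, ux))     # encode integer codes, which never contain NA
  r$values <- ux[r$values]   # map the codes back to the original values
  r
}

x <- c(1, NA, NA, 3, 3, 3)
r <- rle_na_equal(x)
r$lengths
#> [1] 1 2 3
r$values
#> [1]  1 NA  3

# The result should still decode to the original vector.
identical(inverse.rle(r), x)
#> [1] TRUE
```

Because the result is an ordinary "rle" object, inverse.rle() works on it unchanged.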

The vctrs::vec_unrep() function also treats NAs this way:

vctrs::vec_unrep(x)
#>   key times
#> 1   1     1
#> 2  NA     2
#> 3   3     3