Dramatically improve performance #5

Open
egnha opened this issue Apr 15, 2018 · 5 comments
Labels
help wanted Extra attention is needed

Comments

@egnha
Owner

egnha commented Apr 15, 2018

%<<-% is fine for interactive use, but currently it is too slow for general use in runtime functions (e.g., package exports). It should be on par with doing multiple standard assignments. This will require the core algorithm—in particular, reify_names()—to be reimplemented in C.

@egnha added the "help wanted" label on Jul 2, 2018
@egnha changed the title from "Improve performance for assignments of flat-list components" to "Dramatically improve performance" on Jul 2, 2018
@dirkschumacher

Is this still relevant?

@egnha
Owner Author

egnha commented Aug 29, 2019

It would definitely be nice to resolve this. However, I don't see myself getting around to it any time soon. PR welcome. :)

@dirkschumacher

I will add it to my list of fun coding problems to be considered if I have energy :)

@dirkschumacher

OK, I played around with an Rcpp sugar version of dots_matched(), but that did not really speed anything up for shorter expressions. A carefully optimized version in C might speed things up. What do you think the bottlenecks are? Do you have any benchmarks?

@egnha
Owner Author

egnha commented Aug 31, 2019

Probably not what you're asking about, but here (for the record) is a benchmark for %<<-% itself vs multiple assignment in a fairly typical setting:

library(dub)

a <- list(1, list(2, list(3)))

bench::mark(
  (x:(y:(z))) %<<-% a,
  {x <- a[[1]]; y <- a[[c(2, 1)]]; z <- a[[c(2, 2, 1)]]},
  check = FALSE,
  iterations = 1e5
)
#> # A tibble: 2 x 13
#>   expression                                                  min   median `itr/sec`
#>   <bch:expr>                                             <bch:tm> <bch:tm>     <dbl>
#> 1 (x:(y:(z))) %<<-% a                                     163.4µs  193.1µs     5024.
#> 2 { x <- a[[1]] y <- a[[c(2, 1)]] z <- a[[c(2, 2, 1)]] }   1.16µs   1.29µs   721889.
#> # … with 9 more variables: mem_alloc <bch:byt>, `gc/sec` <dbl>, n_itr <int>, n_gc <dbl>,
#> #   total_time <bch:tm>, result <list>, memory <list>, time <list>, gc <list>

😖

It would be nice to get an order of magnitude speedup, in general, thus in this case a reduction to around 10 microseconds. (By the way, zeallot performance is comparably poor, even somewhat worse.)

I agree with you: a rewrite in C is probably the only option (if dub is to remain dependency-free). To be frank, I haven't tried profiling dots_matched(), or the other functions in names.R, so I don't know exactly where the bottlenecks are. Certainly there is a penalty for making so many recursive calls. But I doubt that converting those to more imperative R code would substantially help. (Then again, I haven't tried.)
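
For anyone who wants to look for the bottlenecks, here is a minimal sketch of one way to profile the operator using base R's sampling profiler. It assumes dub is installed and reuses the nested list from the benchmark above. The iteration count and sampling interval are arbitrary choices, and the variable names (prof_file, top_self) are just for illustration.

```r
# Sketch only: sample the call stack while %<<-% runs in a loop,
# then rank functions by the time spent in their own code.
library(dub)

a <- list(1, list(2, list(3)))

prof_file <- tempfile(fileext = ".out")

Rprof(prof_file, interval = 0.01)   # start sampling every 10 ms
for (i in seq_len(2e4)) {
  (x:(y:(z))) %<<-% a               # same expression as in the benchmark
}
Rprof(NULL)                         # stop sampling

# "by.self" attributes time to the function actually executing,
# so internal helpers should show up here if they dominate.
top_self <- head(summaryRprof(prof_file)$by.self, 10)
print(top_self)
```

If most of the self time lands in a handful of internal helpers, those would be the natural first candidates for a C port. If it is spread thinly across many small calls, the overhead is more likely the recursion itself.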
