Expedite the DLTMAP from orbit #504

cliu-sift · 2021-08-05T23:16:26Z

cliu-sift
Aug 5, 2021

Hi Orbit team, I'm playing with the orbit package recently with some time series prediction tasks. So simply speaking, what I want to do is to make k-step ahead prediction for candidate i (i=1, 150). For each candidate i, I have a time series with length about 4000 time stamps, and I want to make rolling forecast to the time series values starting from around the 2000th time stamp. This means I will need to make prediction for each candidate about 2000 times. For each prediction, I used DLTMAP from package.

Since my prediction times is high, and I need to repetitively calling the DLTMAP() function from orbit, I tried to make it more efficient through parallel. I used the Parallel function from joblib, and the code to parallel looks like below:

res = Parallel(n_jobs=60)(delayed(wrapper_function_containing_DLTMAP)(cid, watchtower_data, split_time, 
                     response_col, date_col, pred_len, trend_day) for cid in customer_id)

When I call like this, I suffer from memory leaks or worker unexpectedly stopped error if the number of customer in customer_id is large. I feel there potentially exists some internal conflict between the parallel function here and something calling inside DLTMAP, but I'm not sure. Does anyone know better ways to make the process more efficient?

I also tried to increase the number of cores specified in the DLTMAP function like below:

            dlt = DLTMAP(
                response_col=response_col,
                date_col=date_col,
                seasonality=48,
                prediction_percentiles=[0, 0.25, 0.5, 1, 99, 99.5, 99.75, 100],
                global_trend_option='flat',
                cores = 60
            )

But unfortunately, I found out that it wasn't helpful, the running time for dlt.fit() is almost exactly the same no matter I specify cores = 1 or cores = 60. Did I miss anything here? Any feedback is appreciated. Thank you!

Answered by edwinnglabs

Aug 11, 2021

Hi @cliu-sift I don't think the arg cores help in MAP since it is a single chain optimization. Meanwhile, i wonder if you could leverage package like joblib in that case. @wangzhishi @ppstacy

View full answer

edwinnglabs · 2021-08-11T05:17:12Z

edwinnglabs
Aug 11, 2021
Maintainer

Hi @cliu-sift I don't think the arg cores help in MAP since it is a single chain optimization. Meanwhile, i wonder if you could leverage package like joblib in that case. @wangzhishi @ppstacy

1 reply

edwinnglabs Aug 21, 2021
Maintainer

@cliu-sift In some version, you can may end up running bootstrap to get uncertainties for the prediction and it may cost most of the time in your fit-predict process. To avoid that, you can try having n_bootstrap_draws=-1. It may help.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Expedite the DLTMAP from orbit #504

{{title}}

Replies: 1 comment 1 reply

{{title}}

{{title}}

Select a reply

Expedite the DLTMAP from orbit #504

cliu-sift Aug 5, 2021

Replies: 1 comment · 1 reply

edwinnglabs Aug 11, 2021 Maintainer

edwinnglabs Aug 21, 2021 Maintainer

cliu-sift
Aug 5, 2021

Replies: 1 comment 1 reply

edwinnglabs
Aug 11, 2021
Maintainer

edwinnglabs Aug 21, 2021
Maintainer