Replies: 6 comments
-
@gmgeorg Thanks for pointing me to the ... I haven't gone through your useful links yet, but what does a ... Can you maybe give a step-by-step guide, including density plots of ...? Very much looking forward to your reply!
-
@StatMixedML Great questions. I will respond briefly here w/ references, but for all details I defer to the original papers, the LambertW R package vignette / manual / examples, and various Cross Validated posts.
They transform latent random variables X ~ F into observed random variables (data) Y ~ Lambert W x F, using a bijective transformation that introduces skewness/heavier tails compared to F. This is especially interesting for F being a Normal distribution. The key point here is that you can back-transform the observed data into the latent F space (e.g., see here for transforming even a Cauchy distribution into something that is indistinguishable from a random sample of a Normal distribution).
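To make the back-transformation concrete, here is a minimal sketch using `scipy.special.lambertw` (my own illustration, not code from the LambertW package; `delta = 0.3` is an arbitrary choice). The forward heavy-tail transform is z = u * exp(delta/2 * u^2), and it is inverted with the principal branch of the Lambert W function:

```python
import numpy as np
from scipy.special import lambertw

def heavy_tail_transform(u, delta):
    # forward transform: z = u * exp(delta/2 * u^2) introduces heavy tails
    return u * np.exp(0.5 * delta * u ** 2)

def back_transform(z, delta):
    # inverse via principal branch: u = sign(z) * sqrt(W(delta * z^2) / delta)
    return np.sign(z) * np.sqrt(np.real(lambertw(delta * z ** 2)) / delta)

rng = np.random.default_rng(0)
u = rng.standard_normal(10_000)          # latent Gaussian sample
z = heavy_tail_transform(u, delta=0.3)   # observed, heavy-tailed sample
u_back = back_transform(z, delta=0.3)    # recovered latent sample
```

The roundtrip `u -> z -> u_back` recovers the latent Gaussian sample exactly (up to floating-point error), which is the sense in which observed heavy-tailed data can be "Gaussianized".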
Almost :) ... it allows transforming any latent variable ...
With the caveat from above that the direction of the transformation is usually reversed: the distribution of the transformed data is "F" (whatever you think is most reasonable for the latent process; obviously, in practice, check the assumption by testing x = y_{transformed} after training the model).
Yes, it's a trivial sampling strategy (a deterministic function of samples of ...).
Lambert W x F distributions have one (or two) additional parameters on top of the original F distribution: a skew parameter $\gamma$ for the skewed version, or tail parameter(s) $\delta$ (and $\alpha$) for the heavy-tail version.
The latent random variable / data property is lost here; also, a 3-parameter family is much more parsimonious for capturing heavy tails than a mixture of K Gaussians, which also doesn't actually capture heavy-tailedness (let alone that likelihood estimation requires EM algorithms, compared to simple, direct optimization with a global optimum).
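The direct likelihood optimization is possible because the Lambert W x Gaussian density has a closed form via change of variables. A sketch (my own helper, not from the LambertW package; `delta = 0.2` is arbitrary):

```python
import numpy as np
from scipy.special import lambertw
from scipy.stats import norm

def lambertw_gauss_pdf(y, mu, sigma, delta):
    """Heavy-tail Lambert W x Gaussian density via change of variables.

    z = (y - mu) / sigma,  u = sign(z) * sqrt(W(delta * z^2) / delta),
    p_Y(y) = phi(u) * |du/dz| / sigma.
    """
    z = (y - mu) / sigma
    u = np.sign(z) * np.sqrt(np.real(lambertw(delta * z ** 2)) / delta)
    # Jacobian: dz/du = exp(delta/2 * u^2) * (1 + delta * u^2)
    dz_du = np.exp(0.5 * delta * u ** 2) * (1.0 + delta * u ** 2)
    return norm.pdf(u) / (dz_du * sigma)

# density integrates to 1 and has polynomial (heavy) tails, so the
# likelihood can be optimized directly, with no EM-style machinery
grid = np.linspace(-100.0, 100.0, 200_001)
pdf = lambertw_gauss_pdf(grid, mu=0.0, sigma=1.0, delta=0.2)
```

Plugging this density into a log-likelihood and optimizing over (mu, sigma, delta) directly is the "simple, direct optimization" referred to above.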
PTAL at the various Cross Validated posts and run the examples in the LambertW package for illustration.
Other helpful links
-
Thanks for your great answer. I need some time, though, to get familiar with the framework. Sounds very interesting indeed!
-
@gmgeorg Can you create a PR that implements the Lambert W x F as a new distribution, including example, documentation and unit-tests? I currently don't find the time. Thanks.
-
I'll take a stab at it. Btw, https://github.com/gmgeorg/torchlambertw is now public, so it can be included here if needed later down the road. Will let you know if I run into any issues trying to implement it for Lambert W x Normal.
-
@StatMixedML added them here.
-
It would be great if XGBoostLSS could support Lambert W x F distributions; particularly useful are Lambert W x Gaussian distributions (Tukey's h is a special case of this for $\alpha = 1$ and $h = \delta$), as they can be used to transform data to normally distributed data, even if the original data is (very) heavy-tailed.
In the XGBoostLSS context, I can see this being useful any time normal regression might be too restrictive to give correct tail probability estimates (e.g., low sample size; financial data), and one can inspect the $\delta$ predictions from XGBoostLSS to see which parts of the space have longer/heavier tails (more uncertainty) than others. Secondly, skewed/heavy-tailed Lambert W x Gamma distributions are useful to impose a heavier right tail for survival-like problems.
I'm not aware of a `pytorch` implementation of the Lambert W function, let alone Lambert W x F distributions. TensorFlow has both implemented; `scipy.special.lambertw` implements the Lambert W function.
If a `pytorch` implementation of the `distribution` is required to make this work in XGBoostLSS, then as an alternative, AFAICT this should be possible to accomplish using normalizing flows, with the heavy-tail Lambert W transformation as a specific normalizing flow function.
References
- Heavy-tail Lambert W x F distributions (Goerg, 2015)
- LambertW R package: https://github.com/gmgeorg/LambertW
- TensorFlow Probability implementation of Lambert W bijectors and the Lambert W x Gaussian distribution
- Gaussianization layers based on Lambert W x F transformations/distributions: https://openreview.net/forum?id=OXP9Ns0gnIq
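As a rough illustration of the normalizing-flow route mentioned above, here is a minimal `pytorch` sketch (my own, not the `torchlambertw` implementation): since `pytorch` has no Lambert W function, the principal branch is approximated with a Newton iteration (an assumption; 20 iterations, nonnegative inputs only), and the heavy-tail transform is wrapped as a `torch.distributions.Transform`:

```python
import torch
from torch.distributions import Normal, Transform, TransformedDistribution, constraints

def lambertw_torch(x, iters=20):
    # principal branch of Lambert W via Newton's method (assumes x >= 0)
    w = torch.log1p(x)  # reasonable initial guess for x >= 0
    for _ in range(iters):
        ew = torch.exp(w)
        w = w - (w * ew - x) / (ew * (1.0 + w))
    return w

class HeavyTailTransform(Transform):
    """Bijection z = u * exp(delta/2 * u^2) for use as a normalizing flow."""
    domain = constraints.real
    codomain = constraints.real
    bijective = True
    sign = +1

    def __init__(self, delta):
        super().__init__()
        self.delta = delta

    def _call(self, u):
        return u * torch.exp(0.5 * self.delta * u ** 2)

    def _inverse(self, z):
        w = lambertw_torch(self.delta * z ** 2)
        return torch.sign(z) * torch.sqrt(w / self.delta)

    def log_abs_det_jacobian(self, u, z):
        # dz/du = exp(delta/2 * u^2) * (1 + delta * u^2)
        return 0.5 * self.delta * u ** 2 + torch.log1p(self.delta * u ** 2)

# heavy-tail Lambert W x Gaussian as a transformed base distribution
base = Normal(0.0, 1.0)
dist = TransformedDistribution(base, [HeavyTailTransform(0.3)])
y = dist.sample((1000,))
```

Because both `_inverse` and `log_abs_det_jacobian` are available, `dist.log_prob` works out of the box, which is exactly what a flow-based likelihood needs.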