Replies: 6 comments
-
@gmgeorg Thanks for pointing me to the ... I haven't gone through your useful links yet, but what does a ... Can you maybe give a step-by-step guide, including density plots of ...? Very much looking forward to your reply!
-
@StatMixedML Great questions. I will respond briefly here w/ references, but for all details I defer to the original papers, the LambertW R package vignette / manual / examples, and various Cross Validated posts.
They transform latent random variables X ~ F into observed random variables (data) Y ~ Lambert W x F, using a bijective transformation that introduces skewness/heavier tails compared to F. This is especially interesting for F being a Normal distribution. The key point here is that you can back-transform the observed data into the latent F space (e.g., see here for transforming even a Cauchy distribution into something that is indistinguishable from a random sample of a Normal distribution).
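To make the back-transformation concrete, here is a minimal sketch using `scipy.special.lambertw` (my own illustration, not code from the LambertW package; `delta = 0.3` is an arbitrary choice). The forward heavy-tail transform is z = u * exp(delta/2 * u^2), and it is inverted with the principal branch of the Lambert W function:

```python
import numpy as np
from scipy.special import lambertw

def heavy_tail_transform(u, delta):
    # forward transform: z = u * exp(delta/2 * u^2) introduces heavy tails
    return u * np.exp(0.5 * delta * u ** 2)

def back_transform(z, delta):
    # inverse via principal branch: u = sign(z) * sqrt(W(delta * z^2) / delta)
    return np.sign(z) * np.sqrt(np.real(lambertw(delta * z ** 2)) / delta)

rng = np.random.default_rng(0)
u = rng.standard_normal(10_000)          # latent Gaussian sample
z = heavy_tail_transform(u, delta=0.3)   # observed, heavy-tailed sample
u_back = back_transform(z, delta=0.3)    # recovered latent sample
```

The roundtrip `u -> z -> u_back` recovers the latent Gaussian sample exactly (up to floating-point error), which is the sense in which observed heavy-tailed data can be "Gaussianized".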
Almost :) ... it allows transforming any latent variable ...
With the caveat from above that the direction of the transformation is usually reversed: the distribution of the transformed data is "F" (whatever you think is most reasonable for the latent process; obviously, in practice, check the assumption by testing x = y_{transformed} after training the model).
Yes, it's a trivial sampling strategy (a deterministic function of samples of ...).
Lambert W x F distributions have one (or two) additional parameters on top of the original F distribution: a skew parameter $\gamma$ for the skewed version, or tail parameter(s) $\delta$ (and $\alpha$) for the heavy-tail version.
The latent random variable / data property is lost here; also, a 3-parameter family is much more parsimonious for capturing heavy tails than a mixture of K Gaussians, which also doesn't actually capture heavy-tailedness (let alone that likelihood estimation requires EM algorithms, compared to simple, direct optimization with a global optimum).
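The direct likelihood optimization is possible because the Lambert W x Gaussian density has a closed form via change of variables. A sketch (my own helper, not from the LambertW package; `delta = 0.2` is arbitrary):

```python
import numpy as np
from scipy.special import lambertw
from scipy.stats import norm

def lambertw_gauss_pdf(y, mu, sigma, delta):
    """Heavy-tail Lambert W x Gaussian density via change of variables.

    z = (y - mu) / sigma,  u = sign(z) * sqrt(W(delta * z^2) / delta),
    p_Y(y) = phi(u) * |du/dz| / sigma.
    """
    z = (y - mu) / sigma
    u = np.sign(z) * np.sqrt(np.real(lambertw(delta * z ** 2)) / delta)
    # Jacobian: dz/du = exp(delta/2 * u^2) * (1 + delta * u^2)
    dz_du = np.exp(0.5 * delta * u ** 2) * (1.0 + delta * u ** 2)
    return norm.pdf(u) / (dz_du * sigma)

# density integrates to 1 and has polynomial (heavy) tails, so the
# likelihood can be optimized directly, with no EM-style machinery
grid = np.linspace(-100.0, 100.0, 200_001)
pdf = lambertw_gauss_pdf(grid, mu=0.0, sigma=1.0, delta=0.2)
```

Plugging this density into a log-likelihood and optimizing over (mu, sigma, delta) directly is the "simple, direct optimization" referred to above.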
PTAL at the various Cross Validated posts and run the examples in the LambertW package for illustration.
Other helpful links
-
Thanks for your great answer. I need some time, though, to get familiar with the framework. Sounds very interesting indeed!
-
@gmgeorg Can you create a PR that implements the Lambert W x F as a new distribution, including example, documentation and unit-tests? I currently don't find the time. Thanks.
-
I'll take a stab at it. Btw, https://github.com/gmgeorg/torchlambertw is now public, so it can be included here if needed later down the road. Will let you know if I run into any issues trying to implement it for Lambert W x Normal.
-
@StatMixedML added them here.
-
It would be great if XGBoostLSS could support Lambert W x F distributions; particularly useful are Lambert W x Gaussian distributions (Tukey's h is a special case of this for $\alpha = 1$ and $h = \delta$), as they can be used to transform data to normally distributed data, even if the original data is (very) heavy-tailed.
In the XGBoostLSS context, I can see this being useful any time normal regression might be too restrictive to give correct tail probability estimates (e.g., low sample size; financial data), and one can inspect the $\delta$ predictions from XGBoostLSS to see which parts of the space have longer/heavier tails (more uncertainty) than others. Secondly, skewed/heavy-tailed Lambert W x Gamma distributions are useful to impose a heavier right tail for survival-like problems.
I'm not aware of a `pytorch` implementation of the Lambert W function, let alone Lambert W x F distributions. TensorFlow has both implemented; `scipy.special.lambertw` implements the Lambert W function.
If a `pytorch` implementation of the `distribution` is required to make this work in XGBoostLSS, then as an alternative, AFAICT this should be possible to accomplish using normalizing flows, with the heavy-tail Lambert W transformation as a specific normalizing flow function.
References
- Heavy-tail Lambert W x F distributions (Goerg, 2015)
- LambertW R package: https://github.com/gmgeorg/LambertW
- TensorFlow Probability implementation of Lambert W bijectors and the Lambert W x Gaussian distribution
- Gaussianization layers based on Lambert W x F transformations/distributions: https://openreview.net/forum?id=OXP9Ns0gnIq
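As a rough illustration of the normalizing-flow route mentioned above, here is a minimal `pytorch` sketch (my own, not the `torchlambertw` implementation): since `pytorch` has no Lambert W function, the principal branch is approximated with a Newton iteration (an assumption; 20 iterations, nonnegative inputs only), and the heavy-tail transform is wrapped as a `torch.distributions.Transform`:

```python
import torch
from torch.distributions import Normal, Transform, TransformedDistribution, constraints

def lambertw_torch(x, iters=20):
    # principal branch of Lambert W via Newton's method (assumes x >= 0)
    w = torch.log1p(x)  # reasonable initial guess for x >= 0
    for _ in range(iters):
        ew = torch.exp(w)
        w = w - (w * ew - x) / (ew * (1.0 + w))
    return w

class HeavyTailTransform(Transform):
    """Bijection z = u * exp(delta/2 * u^2) for use as a normalizing flow."""
    domain = constraints.real
    codomain = constraints.real
    bijective = True
    sign = +1

    def __init__(self, delta):
        super().__init__()
        self.delta = delta

    def _call(self, u):
        return u * torch.exp(0.5 * self.delta * u ** 2)

    def _inverse(self, z):
        w = lambertw_torch(self.delta * z ** 2)
        return torch.sign(z) * torch.sqrt(w / self.delta)

    def log_abs_det_jacobian(self, u, z):
        # dz/du = exp(delta/2 * u^2) * (1 + delta * u^2)
        return 0.5 * self.delta * u ** 2 + torch.log1p(self.delta * u ** 2)

# heavy-tail Lambert W x Gaussian as a transformed base distribution
base = Normal(0.0, 1.0)
dist = TransformedDistribution(base, [HeavyTailTransform(0.3)])
y = dist.sample((1000,))
```

Because both `_inverse` and `log_abs_det_jacobian` are available, `dist.log_prob` works out of the box, which is exactly what a flow-based likelihood needs.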