Add pareto distribution #82

tblazina · 2021-02-14T22:07:07Z

I'm not really familiar with this distribution and am a bit confused by all the different ways it is parameterized depending on which library you look at. For example, Stan and PyMC3 both use a shape and a scale parameter, but the jax.scipy.stats implementation uses a parameter b as well as loc and scale parameters. I guess the use of the b parameter stems from the jax.random.pareto function, which seems to be similar to the NumPy implementation, whose documentation says "The Lomax or Pareto II distribution is a shifted Pareto distribution". I am not sure which would be preferable to use; some guidance/input would be appreciated. 🙏
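To make the "shifted Pareto" relationship concrete, here is a minimal standard-library sketch. It is not the code from this PR, and it does not claim which convention jax.random.pareto follows (check the JAX documentation for that). The function names `sample_pareto_type1` and `sample_lomax` are hypothetical. It relies only on the fact that a Lomax (type II) draw is a type I draw with the scale subtracted.

```python
import random


def sample_pareto_type1(rng, shape, scale=1.0):
    """Draw from a Pareto type I by inverse-CDF sampling; support is x >= scale."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so u is never zero
    return scale * u ** (-1.0 / shape)


def sample_lomax(rng, shape, scale=1.0):
    """Draw from a Lomax (Pareto type II): a type I draw shifted to start at 0."""
    return sample_pareto_type1(rng, shape, scale) - scale


rng = random.Random(0)
type1 = [sample_pareto_type1(rng, shape=3.0, scale=2.0) for _ in range(1000)]
lomax = [sample_lomax(rng, shape=3.0, scale=2.0) for _ in range(1000)]
```

The only difference between the two samplers is the support: type I draws never fall below `scale`, while Lomax draws start at 0. This is why a sampler built on one convention needs a shift to match a density written in the other.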

This is necessary to account for the jax.random.pareto function using the type II Pareto distribution

rlouf · 2021-02-16T14:23:52Z

If we denote by $m$ the scale parameter and by $b$ the shape parameter, my understanding is that JAX implements:

$$P(x) = \frac{b m^b}{(x - \mathrm{loc})^{b+1}}$$

While on the other hand PyMC3 implements

$$P(x) = \frac{b m^{b}}{x^{b+1}}$$

What I would do is leave scale and shape as the first two arguments when initializing the distribution (same defaults as PyMC3) and add a keyword argument loc = 0. So it would have the signature Pareto(shape, scale, loc=0). What do you think?
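As a rough illustration of the proposed `Pareto(shape, scale, loc=0)` parameterization, the log-density from the first formula can be sketched in plain Python. The function name `pareto_logpdf` and the scalar-only interface are assumptions for illustration, not the mcx implementation. The support is taken to be $x - loc \geq scale$, and parameter validation (shape > 0, scale > 0) is omitted.

```python
import math


def pareto_logpdf(x, shape, scale, loc=0.0):
    """Log of b * m**b / (x - loc)**(b + 1), with b = shape and m = scale."""
    z = x - loc
    if z < scale:
        # Outside the support the density is zero, so the log-density is -inf.
        return -math.inf
    return math.log(shape) + shape * math.log(scale) - (shape + 1) * math.log(z)


# With loc = 0 this reduces to the second (PyMC3-style) formula.
no_shift = pareto_logpdf(2.0, shape=3.0, scale=1.0)
# Shifting both x and loc by the same amount leaves the density unchanged.
shifted = pareto_logpdf(7.0, shape=3.0, scale=1.0, loc=5.0)
```

Keeping `loc` as a keyword argument that defaults to 0 means callers who only pass `shape` and `scale` get the PyMC3-style density.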

tblazina · 2021-02-16T16:49:59Z

Sounds reasonable to me 👍
Would you prefer using b and m as the parameter names, or rather shape and scale? I personally like shape and scale better, but I'm not sure if there is some reason that being consistent with the JAX implementation would be preferable.

rlouf · 2021-02-16T17:27:37Z

shape and scale make more sense to me, and it is better to have an API close to PyMC3's. You can keep loc.

Also added Pareto distribution to mcx.distribution init

Add in promo_shapes and brodcast_to

rlouf

Thank you for taking the time to add the Pareto distribution! It will be ready to merge once we have tests for the sampling shape and support correctness.

rlouf · 2021-03-12T10:54:42Z

tests/distributions/pareto_test.py

+        numerator = (scale ** 2) * shape
+        denominator = ((shape - 1) ** 2) * (shape - 2)
+        return numerator / denominator
+


Great addition! However, before I merge we'll need to add tests for the shape and the support! Would you mind adding those?

indeed, was planning on it when I get some time, hopefully in the next few days!

Is there anything I can do to help?

I'll let you know when I get to it this weekend. Last 2.5 weeks I had a kidney stone which involved two surgeries and like 5 nights in hospitals, but things seem to be resolved now. 2021 has not been my year in terms of health. Nonetheless, I should finally have some time this weekend!

OK, I found some time to add more tests, but I'm having one issue with a failing test for the variance in the case when the shape parameter is <= 2, and I'm not entirely sure what I've implemented wrong. Not being totally familiar with the Pareto distribution, I've just followed the information on https://en.wikipedia.org/wiki/Pareto_distribution, which states that the variance should be infinite when the shape parameter is <= 2. However, this is not the case in the current implementation. I'd appreciate some feedback!

Extensive test suite, great job!

Remember that we defined shape = b in this case. The variance should thus be theoretically infinite when $shape < 1$ per the formulae above.

Then, if you measure the variance of samples drawn from the distribution, you should get a very large number but not strictly $\infty$. You can check that $\sigma > 10 \mu$ for instance when $0 < shape < 1$. It would also be nice to check that $\mu \rightarrow \infty$ when $shape < 0$.

Alright, I'll update the tests to reflect this. Thanks for the clarification!

Sorry, I was away from this for too long and am a bit confused. In your suggestion you use the $\sigma$ and $\mu$ notation, and I'm not sure what you are referring to. When you say "The variance should thus be theoretically infinite when $shape < 1$ per the formulae above", I'm not sure exactly which formulae you mean, because the way I've implemented it, having $shape < 1$ doesn't result in the variance being infinite:

    numerator = (scale ** 2) * shape
    denominator = ((shape - 1) ** 2) * (shape - 2)
    return numerator / denominator

I get that the variance of the samples won't strictly be $\infty$, but I think I have implemented the Pareto distribution incorrectly and can't figure out what I've done wrong. I would need some additional assistance, thanks!
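For reference, the closed-form expression above is only valid for shape > 2. Evaluated outside that range it returns a finite negative number instead of infinity: shape = 0.5 with scale = 1 gives about -1.33. One possible fix, sketched here under the standard type I results cited from Wikipedia (mean infinite for shape <= 1, variance infinite for shape <= 2), is to guard the domain explicitly. The function names are hypothetical and the code is scalar-only. An array-based version would presumably use something like `jnp.where` instead of `if`.

```python
import math


def pareto_mean(shape, scale):
    """Mean of a Pareto type I distribution; infinite for shape <= 1."""
    if shape <= 1:
        return math.inf
    return shape * scale / (shape - 1)


def pareto_variance(shape, scale):
    """Variance of a Pareto type I distribution; infinite for shape <= 2."""
    if shape <= 2:
        return math.inf
    numerator = (scale ** 2) * shape
    denominator = ((shape - 1) ** 2) * (shape - 2)
    return numerator / denominator
```

A location shift would add `loc` to the mean and leave the variance unchanged, so the same guards apply to the `Pareto(shape, scale, loc=0)` signature discussed earlier.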

rlouf · 2021-06-14T06:31:47Z

Hi @tblazina Looks like the tests are not passing :( Are you still planning on working on this?

tblazina · 2021-06-15T11:33:11Z

Hi @rlouf - sorry about that, this fell by the wayside, but yes, I do plan on finishing this! I'll try to get to it in the next few days, and if I don't think I can get around to it I will let you know!

tblazina added 2 commits February 14, 2021 23:06

Add pareto distribution

74eb7b0

Fix sample method

03ee596

This is necessary to account for the jax.random.pareto function using the type II Pareto distribution

rlouf changed the title ~~Add pareto distribution~~ [WIP] Add pareto distribution Feb 16, 2021

Update parameter names in Pareto distribution

d41346b

Also added Pareto distribution to mcx.distribution init

rlouf force-pushed the master branch 3 times, most recently from f8f3e6b to 965f6dd Compare February 23, 2021 11:28

tblazina added 4 commits March 7, 2021 20:57

Merge branch 'master' into add-pareto-distribution

957a6f3

Update pareto implementation

1b9dd28

Add in promo_shapes and brodcast_to

Add some first tests for Pareto distribution

7de71bf

Fix import sorting and linting errors

ea00e36

rlouf requested changes Mar 12, 2021

Add more tests for Pareto distribution

403d897

rlouf changed the title ~~[WIP] Add pareto distribution~~ Add pareto distribution Apr 12, 2021

Merge branch 'master' into add-pareto-distribution

dadb1c7


rlouf left a comment

rlouf Mar 12, 2021

tblazina Mar 13, 2021

rlouf Mar 30, 2021

tblazina Mar 30, 2021

tblazina Mar 31, 2021

rlouf Apr 12, 2021

tblazina Apr 15, 2021

tblazina Jul 3, 2021
