Equations for a "fast" $\frac{1}{\sqrt[3]{x}}$ method.
This repository contains a set of procedures to compute numerical methods in the vein of the fast inverse root method. In particular, we will generate code that

- Computes rational powers ($x^{\frac{a}{b}}$) to an arbitrary precision.
- Computes irrational powers ($x^c$) to within 10% relative error.
- Computes $\exp(x)$ to within 10% relative error.
- Computes $\log(x)$ to within 10% relative error.
- Computes the geometric mean $\sqrt[n]{\prod_k^n x_k}$ of an `std::array` quickly, to within 10% error.
Additionally, we will do so using mostly just integer arithmetic.
You can use everything in `floathacks` by including `hacks.h`:

```cpp
#include <floathacks/hacks.h>
using namespace floathacks; // Comment this out if you don't want your top-level namespace to be polluted
```
This document is compiled from `READOTHER.md` by `readme2tex`. Make sure that you `pip install readme2tex`. You can run

```
python -m readme2tex --output README.md --branch svgs
```

to recompile these docs.
To generate an estimation for $x^c$ (here shown for $c = 0.12345$), run

```cpp
float approximate_root = fpow<FLOAT(0.12345)>::estimate(x);
```
Since estimates of `pow` can be refined into better iterates (as long as `c` is "rational enough"), you can also compute a more exact result via

```cpp
float root = pow<FLOAT(0.12345), n>(x);
```

where `n` is the number of Newton iterations to perform. The code generated by this template will unroll itself, so it's relatively efficient.
However, the optimized code does not let you use it as a `constexpr` or where the exponent is not constant. In those cases, you can use `consts::fpow(x, c)` and `consts::pow(x, c, iterations = 2)` instead:

```cpp
float root = consts::pow(x, 0.12345, n);
```

Note that the compiler isn't able to deduce the optimal constants in these cases, so you'll incur additional penalties computing the constants of the method.
You can also compute an approximation of $\exp(x)$ with

```cpp
float guess = fexp(x);
```
Unfortunately, since there are no refinement methods available for exponentials, we can't do much with this result if it's too coarse for your needs. In addition, due to overflow, this method breaks down when $x$ approaches 90.
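That cutoff is roughly where single-precision floats themselves give out. As a back-of-the-envelope check (my own numbers, not something quoted by the library):

$$ \exp(x) > \textrm{FLT\_MAX} \approx 3.4 \times 10^{38} \quad \text{whenever} \quad x > \log\left(3.4 \times 10^{38}\right) \approx 88.7 $$

so the intermediate representation overflows shortly before $x = 90$.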
Similarly, you can also compute an approximation of $\log(x)$ with

```cpp
float guess = flog(x);
```
Again, as with `fexp`, there are no refinement methods available for logarithms either.
All of the `f***` methods above have bounded relative errors of at most 10%. The refined `pow` method can be made to give arbitrary precision by increasing the number of refinement iterations. Each refinement iteration takes time proportional to the number of digits in the floating point representation of the exponent. Note that since floats are finite, this is bounded above by 32 (and more tightly, 23).
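For a sense of scale (my own example, not a figure from the library): an exponent like $c = 0.375 = 0.011_2$ has only a couple of significant mantissa bits, so each refinement pass is short, while a "generic" constant such as $0.12345$ occupies essentially the full 23-bit mantissa and pays close to the worst case.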
You can compute the geometric mean ($\sqrt[n]{\prod_k^n x_k}$) of an `std::array<float, n>` with

```cpp
float guess = fgmean<3>({ 1, 2, 3 });
```
This can be refined, but you typically do not care about the absolute precision of a mean-like statistic. To refine this, you can run Newton's method on $f(y) = y^n - \prod_k x_k$, whose positive root is the geometric mean.
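To make that concrete, here is a minimal sketch of such a refinement. The name `refine_gmean` is hypothetical (this is not the library's API), and it computes the product naively, which can itself overflow for many or large inputs:

```cpp
#include <array>
#include <cstddef>

// Hypothetical sketch: refine a geometric-mean guess by running Newton's
// method on f(y) = y^n - P, where P is the product of the inputs.
// One step is y <- ((n - 1) * y + P / y^(n - 1)) / n.
template <std::size_t n>
float refine_gmean(const std::array<float, n>& xs, float guess, int iters = 2) {
    float P = 1.0f;
    for (float x : xs) P *= x;                     // naive product (may overflow)
    float y = guess;
    for (int i = 0; i < iters; ++i) {
        float ypow = 1.0f;                         // compute y^(n - 1)
        for (std::size_t k = 0; k + 1 < n; ++k) ypow *= y;
        y = ((n - 1) * y + P / ypow) / n;          // Newton update
    }
    return y;
}
```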
The key ingredient of these types of methods is the pair of transformations

- $\textrm{f2l}(x)$ takes an IEEE 754 single precision floating point number and outputs its "machine" representation. In essence, it acts like

```cpp
unsigned long f2l(float x) { union {float fl; unsigned long lg;} lens = { x }; return lens.lg; }
```

- $\textrm{l2f}(z)$ takes an unsigned long representing a float and returns an IEEE 754 single precision floating point number. It acts like

```cpp
float l2f(unsigned long z) { union {float fl; unsigned long lg;} lens; lens.lg = z; return lens.fl; }
```
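One caveat worth noting: on most 64-bit platforms `unsigned long` is wider than a `float`, and type-punning through a union is technically undefined behavior in C++. A portable sketch of the same two helpers (my own formulation, not the library's internals) uses a fixed-width integer and `memcpy`; the later snippets in this README assume these versions:

```cpp
#include <cstdint>
#include <cstring>

// Portable bit-reinterpretation helpers. std::uint32_t matches the width of
// an IEEE 754 single-precision float, and memcpy sidesteps the union
// aliasing question entirely (compilers turn it into a single register move).
inline std::uint32_t f2l(float x) {
    std::uint32_t z;
    std::memcpy(&z, &x, sizeof z);
    return z;
}

inline float l2f(std::uint32_t z) {
    float x;
    std::memcpy(&x, &z, sizeof x);
    return x;
}
```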
So for example, the fast inverse root method:

```cpp
union {float fl; unsigned long lg;} lens = { x };
lens.lg = 0x5f3759df - lens.lg / 2;
float y = lens.fl;
```

can be equivalently expressed as $$ \textrm{l2f}\left(\textrm{0x5f3759df} - \frac{\textrm{f2l}(x)}{2}\right) $$
In a similar vein, a fast inverse cube-root method is presented at the start of this page. $$ \textrm{l2f}\left(\text{0x54a2fa8c} - \frac{\textrm{f2l}(x)}{3}\right) \approx \frac{1}{\sqrt[3]{x}} $$
We will justify this in the next section.
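Spelled out with the helpers above, the cube-root estimate is only a couple of lines. This is an illustrative sketch (with an optional Newton step on $f(y) = y^{-3} - x$ tacked on), not the code that floathacks generates:

```cpp
// Estimate x^(-1/3) via l2f(0x54a2fa8c - f2l(x) / 3), then optionally tighten
// it with one Newton step: y <- y * (4 - x * y^3) / 3.
inline float fast_inv_cbrt(float x) {
    float y = l2f(0x54a2fa8cu - f2l(x) / 3);   // initial guess
    y = y * (4.0f - x * y * y * y) / 3.0f;     // one Newton refinement
    return y;
}
```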
We can approximate $x^c$ with $$ x^c \approx \textrm{l2f}\left(c \times \textrm{f2l}(x) + (1 - c) \times \left(\textrm{0x3f800000} + \textrm{bias}\right)\right) $$ Any bias, as long as it is reasonably small, will work. At $\textrm{bias} = 0$, the method computes a value whose error is completely positive.
Therefore, by nudging the bias slightly negative, we can shift some of the error down into the negative plane and
halve the error.
As seen in the fast-inverse-root method, a bias of `-0x5c416` tends to work well for pretty much every case that I've tried, as long as we tack on at least one Newton refinement stage at the end. It works well without refinement as well, but an even bias of `-0x5c000` works even better.
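As a concrete (and purely illustrative) translation of the formula above, a runtime version of the estimate might look like the following. The name `fpow_estimate` is hypothetical, and unlike the `fpow` template in `hacks.h` it blends in floating point rather than with precomputed integer constants; it also assumes the `f2l`/`l2f` helpers sketched earlier:

```cpp
#include <cstdint>

// Estimate x^c as l2f(c * f2l(x) + (1 - c) * (0x3f800000 + bias)).
// With c = -1/2 and bias = -0x5c416, the constant term (1 - c) * (0x3f800000 + bias)
// is exactly 0x5f3759df, the fast-inverse-root magic number.
inline float fpow_estimate(float x, float c, std::int32_t bias = -0x5c416) {
    const double k = 0x3f800000 + static_cast<double>(bias);        // biased f2l(1)
    const double blended = c * static_cast<double>(f2l(x)) + (1.0 - c) * k;
    // Assumes the blend stays within the 32-bit range of valid representations.
    return l2f(static_cast<std::uint32_t>(static_cast<std::int64_t>(blended)));
}
```

For example, `fpow_estimate(x, -0.5f)` behaves essentially like the classic fast inverse square root without its Newton step.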
Why does this work? See these slides for the derivation. In particular, the fast inverse square-root is a subclass of this method.
We can approximate $\exp(x)$ with $$ \exp(x) \approx \textrm{l2f}\left(\epsilon^{-1} x + \textrm{0x3f800000}\right) $$ Here, $\epsilon = \frac{\log 2}{2^{23}}$, so $\epsilon^{-1} \approx 12102203$.
To give a derivation of this equation, we'll need to borrow a few mathematical tools from analysis. In particular, while `l2f` and `f2l` have many discontinuities, we will pretend that they are smooth enough to differentiate.
Consider the function $\textrm{f2l}(f(z))$ for some differentiable $f$. Its derivative is
$$
\frac{\partial \textrm{f2l}(f(z))}{\partial z} = \textrm{f2l}'(f(z)) \cdot f'(z)
$$
where the equality is a consequence of the chain-rule, assuming that $\textrm{f2l}$ is differentiable at the particular value of $f(z)$. But what does the derivative of $\textrm{f2l}$ even mean?
Well, it's not all that mysterious. The derivative of `f2l` is just the rate at which a number's IEEE 754 machine representation changes as we make small perturbations to that number. Unfortunately, while it might be easy to compute this derivative numerically, we still don't have an approximate form that we can manipulate algebraically.
While
Here, equality holds when
Therefore, $$ \textrm{l2f}'(z) \approx \epsilon \times \textrm{l2f}(z) $$
From here, we also have

\begin{align*}
\textrm{f2l}(\textrm{l2f}(z)) &= z \\
\frac{\partial \textrm{f2l}(\textrm{l2f}(z))}{\partial z} &= \frac{dz}{dz} \\
\textrm{f2l}'(\textrm{l2f}(z)) \cdot \textrm{l2f}'(z) &= 1 & \text{chain rule} \\
\textrm{f2l}'(\textrm{l2f}(z)) &= \frac{1}{\textrm{l2f}'(z)} \\
&\approx \frac{1}{\epsilon \times \textrm{l2f}(z)} & \text{since } \textrm{l2f}'(z) \approx \epsilon \, \textrm{l2f}(z) \\
\textrm{f2l}'(x) &\approx \frac{\epsilon^{-1}}{x} & \text{by substituting } x = \textrm{l2f}(z)
\end{align*}
Given $\textrm{f2l}'(x) \approx \frac{\epsilon^{-1}}{x}$, integrating both sides gives $$ \textrm{f2l}(x) \approx \epsilon^{-1} \log(x) + C $$ for some constant of integration $C$. Similarly, since $\textrm{l2f}'(z) \approx \epsilon \times \textrm{l2f}(z)$, we get $\textrm{l2f}(z) \approx \exp\left(\epsilon z + C'\right)$ for some constant $C'$.
This makes sense, since we'd like these two functions to be inverses of each other.
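To get a feel for how good this approximation is, here is a tiny spot-check (an illustration of the claim, not library code). It takes the constant of integration to be $\textrm{f2l}(1) = \textrm{0x3f800000}$, which is the value the boundary conditions below end up choosing, and uses the `f2l` helper sketched earlier:

```cpp
#include <cmath>
#include <cstdio>

// Compare f2l(x) against eps^-1 * log(x) + C with eps^-1 = 2^23 / ln(2)
// and C = 0x3f800000.
int main() {
    const double inv_eps = 8388608.0 / std::log(2.0);   // 2^23 / ln 2
    const float xs[] = {0.1f, 0.5f, 1.0f, 3.0f, 42.0f};
    for (float x : xs) {
        const double approx = inv_eps * std::log(static_cast<double>(x)) + 0x3f800000;
        std::printf("x = %-5g  f2l(x) = %u  approx = %.0f\n",
                    x, static_cast<unsigned>(f2l(x)), approx);
    }
}
```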
Consider $\exp(x)$. Applying the approximation for $\textrm{f2l}$ gives
$$
\textrm{f2l}(\exp(x)) \approx \epsilon^{-1} \log(\exp(x)) + C = \epsilon^{-1} x + C
$$
which suggests that
$$
\exp(x) \approx \textrm{l2f}\left(\epsilon^{-1} x + C\right)
$$
Since we would like $\exp(0) = 1$, we can pin down the constant from $\textrm{l2f}(C) = 1$, which gives $C = \textrm{f2l}(1) = \textrm{0x3f800000}$.
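Putting the pieces together gives a one-line exponential estimate. Again, this is a sketch of the derivation (using $\epsilon^{-1} = 2^{23}/\log 2 \approx 12102203$ and the `l2f` helper from earlier), not necessarily the exact constants `fexp` uses internally:

```cpp
#include <cstdint>

// exp(x) ~ l2f(eps^-1 * x + 0x3f800000). Only meaningful while the argument
// stays within the range of valid float representations, which is why the
// estimate breaks down as x approaches 90.
inline float fexp_sketch(float x) {
    const float inv_eps = 12102203.0f;                              // 2^23 / ln(2)
    const std::int32_t z = static_cast<std::int32_t>(inv_eps * x) + 0x3f800000;
    return l2f(static_cast<std::uint32_t>(z));
}
```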
In a similar spirit, we can use the approximation
$$
\textrm{f2l}(x) \approx \epsilon^{-1} \log(x) + C
$$
to derive
$$
\log(x) \approx \epsilon \times \left(\textrm{f2l}(x) - C\right)
$$
Imposing a boundary condition at $x = 1$, where $\log(1) = 0$, again gives $C = \textrm{f2l}(1) = \textrm{0x3f800000}$.
However, this actually computes some other logarithm
where the
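The corresponding logarithm sketch, assuming the `f2l` helper from earlier, looks like this (illustration only; `flog` itself may use slightly different constants):

```cpp
#include <cstdint>

// log(x) ~ eps * (f2l(x) - 0x3f800000), with eps = ln(2) / 2^23.
inline float flog_sketch(float x) {
    const float eps = 8.2629586e-8f;   // ln(2) / 2^23
    return eps * static_cast<float>(static_cast<std::int32_t>(f2l(x)) - 0x3f800000);
}
```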
There's a straightforward derivation of the geometric mean. Consider the approximations of $\log$ and $\exp$ from above: since the geometric mean is $\exp\left(\frac{1}{n}\sum_k \log(x_k)\right)$, substituting $\log(x_k) \approx \epsilon\left(\textrm{f2l}(x_k) - C\right)$ and $\exp(x) \approx \textrm{l2f}\left(\epsilon^{-1} x + C\right)$ makes the constants cancel, leaving $$ \sqrt[n]{\prod_k^n x_k} \approx \textrm{l2f}\left(\frac{1}{n}\sum_k^n \textrm{f2l}(x_k)\right) $$ Notice that we just add a series of integers, followed by an integer divide, which is pretty efficient.
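In code, the whole method is just an integer mean of the machine representations. The following mirrors the formula above using the `f2l`/`l2f` helpers sketched earlier; it is an illustration, not the `fgmean` template itself:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Geometric mean ~ l2f of the average of the f2l representations.
template <std::size_t n>
float fgmean_sketch(const std::array<float, n>& xs) {
    std::uint64_t sum = 0;
    for (float x : xs) sum += f2l(x);                 // add integer representations
    return l2f(static_cast<std::uint32_t>(sum / n));  // integer divide, reinterpret
}
```

For instance, `fgmean_sketch<3>({1, 2, 3})` lands within a couple of percent of $\sqrt[3]{6} \approx 1.817$.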
For more information on how the constant (
Equations rendered with readme2tex.