Computations related to the paper Gaussian Pre-Activations in Neural Networks: Myth or Reality? (P. Wolinski, J. Arbel).
This package computes the functions described below; a full worked example is given in the notebook `Example01_Full_computation.ipynb`.
We want to find a pair $(\mathrm{Q}_{\theta}, \phi_{\theta})$, where $\mathrm{Q}_{\theta}$ is the distribution of the weights of the network and $\phi_{\theta}$ is its activation function, such that the pre-activations remain Gaussian as they propagate through the network. We have chosen to index the candidate pairs by a parameter $\theta$. Thus, given $\theta$, the construction consists of two steps:
- find a symmetric distribution $\mathrm{Q}_{\theta}$ such that the constraint below can be satisfied;
- find an odd function $\phi_{\theta}$ that satisfies it.
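Schematically, the constraint at stake can be written as follows (a paraphrase with assumed notation: $W$ a weight, $X$ an input pre-activation, $n$ the layer width; the paper states the exact conditions):

$$W \sim \mathrm{Q}_{\theta}, \quad X \sim \mathcal{N}(0, 1) \ \text{(independent)} \quad \Longrightarrow \quad W \, \phi_{\theta}(X) \sim \mathcal{N}(0, 1),$$

so that the normalized pre-activation $\frac{1}{\sqrt{n}} \sum_{j=1}^{n} W_j \, \phi_{\theta}(X_j)$ of the next layer is itself exactly $\mathcal{N}(0, 1)$ at every width $n$.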
We approximate the density we are looking for by a parameterized function $g_{\Lambda}$. Therefore, we only have to optimize the resulting loss $\ell$ with respect to the parameters $\Lambda$.
For that, we:
- use `integration.ParameterizedFunction` as the function $g_{\Lambda}$;
- perform the integration with `integration.Integrand`;
- compute the loss;
- backpropagate the gradient of the loss through the computation of the integral to compute $\frac{\partial \ell}{\partial \Lambda}$;
- make a gradient step to train $\Lambda$.
This optimization process is coded in `optimization.find_density`.
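For illustration, here is a minimal, self-contained sketch of such a training loop in plain PyTorch. It does not reproduce the actual API of `integration.ParameterizedFunction`, `integration.Integrand`, or `optimization.find_density`: the density model, the quadrature grid, and the loss are placeholders chosen for readability.

```python
import torch

# Hypothetical stand-in for integration.ParameterizedFunction: a small
# network whose positive output plays the role of g_Lambda.
class DensityModel(torch.nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, hidden), torch.nn.Tanh(), torch.nn.Linear(hidden, 1)
        )

    def forward(self, x):
        # Positivity via softplus; normalization is enforced in the loss below.
        return torch.nn.functional.softplus(self.net(x.unsqueeze(-1))).squeeze(-1)

# Quadrature grid (stand-in for integration.Integrand): trapezoidal rule.
x = torch.linspace(-10.0, 10.0, 2001)

model = DensityModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(1000):
    g = model(x)                                   # g_Lambda on the grid
    mass = torch.trapezoid(g, x)                   # integral of g_Lambda
    second_moment = torch.trapezoid(x**2 * g, x)   # placeholder integral constraint
    # Placeholder loss: g_Lambda should be a density (mass 1) with unit second moment.
    loss = (mass - 1.0) ** 2 + (second_moment - 1.0) ** 2
    optimizer.zero_grad()
    loss.backward()     # gradients flow through the quadrature, as in the package
    optimizer.step()
```

The point mirrored here is the last step of the list above: since the quadrature is written with differentiable tensor operations, the gradient of the loss flows through the integral down to the parameters $\Lambda$.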
Once $\mathrm{Q}_{\theta}$ has been found, the activation function $\phi_{\theta}$ is defined implicitly (the paper gives the exact characterization). To compute $\phi_{\theta}$ in practice, we proceed in two steps:
1. an interpolation: make a numerical computation of $\phi_{\theta}(x)$ for several values of $x$;
2. an approximation: build a function that approximates well the "interpolation" (or "graph") computed at step 1:
   a. propose a family of parameterized functions `activation.ActivationFunction` that would fit the interpolation well for any $\theta$,
   b. for a given $\theta$, optimize the parameters of `activation.ActivationFunction` by gradient descent, so that the graph of the final function is close to the interpolation.
This optimization process is coded in `optimization.find_activation`.
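Again for illustration, here is a minimal sketch of step 2 in plain PyTorch, with a made-up parametric family of odd functions standing in for `activation.ActivationFunction` (whose actual form is not shown here) and fake interpolation points standing in for the output of step 1:

```python
import torch

# Hypothetical parametric family of odd functions (stand-in for
# activation.ActivationFunction): phi(x) = a * tanh(b * x) + c * x.
class OddActivation(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.a = torch.nn.Parameter(torch.tensor(1.0))
        self.b = torch.nn.Parameter(torch.tensor(1.0))
        self.c = torch.nn.Parameter(torch.tensor(0.1))

    def forward(self, x):
        # Odd by construction: phi(-x) = -phi(x).
        return self.a * torch.tanh(self.b * x) + self.c * x

# Step 1 would provide the interpolation points; fake ones for illustration.
xs = torch.linspace(-4.0, 4.0, 200)
ys = torch.sign(xs) * torch.abs(xs) ** 0.8   # placeholder "graph" of phi_theta

phi = OddActivation()
optimizer = torch.optim.Adam(phi.parameters(), lr=1e-2)

for step in range(2000):
    loss = torch.mean((phi(xs) - ys) ** 2)   # least-squares fit to the graph
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Keeping the family odd by construction guarantees that the fitted function preserves the symmetry required above, whatever the parameter values.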