
High-dimensional ensembles seem to be weird #130

Open
tcstewar opened this issue Jul 25, 2016 · 5 comments

Here's a minimal(ish) example that exhibits some strange behaviour:

import nengo
import numpy as np
from nengo import spa

import ctn_benchmark
class Vision(ctn_benchmark.Benchmark):
    def params(self):
        self.default('number of neurons', n_neurons=80)
        self.default('dim input', dim_input=1260)
        self.default('dim output', dim_output=32)
        self.default('fixed input', fixed_input=False)
        self.default('function of time', function_of_time=False)
        self.default('aligned vector', aligned_vector=False)

    def model(self, p):
        model = nengo.Network()
        with model:
            a = nengo.Ensemble(p.n_neurons, p.dim_input)
            b = nengo.Node(None, size_in=p.dim_output)

            vocab = spa.Vocabulary(p.dim_output)
            if p.aligned_vector:
                vocab.add('A', np.eye(p.dim_output)[0])

            nengo.Connection(a, b, synapse=0.005,
                             function=lambda x: vocab.parse('A').v)

            if p.fixed_input:
                stim = nengo.Node(np.eye(p.dim_input)[0])
            else:
                def stim_func(t):
                    return np.eye(p.dim_input)[0]
                stim = nengo.Node(stim_func)
                if p.function_of_time and p.backend == 'nengo_spinnaker':
                    import nengo_spinnaker
                    nengo_spinnaker.add_spinnaker_params(model.config)
                    model.config[stim].function_of_time = True
            nengo.Connection(stim, a)

            self.p = nengo.Probe(b, synapse=0.03)
            self.vocab = vocab

        return model

    def evaluate(self, p, sim, plot):
        sim.run(0.5)
        if plot is not None:
            plot.plot(sim.trange(), np.dot(sim.data[self.p],
                                           self.vocab.vectors.T))
        return {}

if __name__ == '__main__':
    Vision().run()

This model feeds a large-vector input into a single ensemble, decodes a smaller-dimensional function from it, and sends the result to a passthrough Node, which we probe.

Using branch dev0716, if we run this with 100 neurons, we run into this slicing problem:

INFO:nengo_spinnaker.simulator:Building netlist
Traceback (most recent call last):
  File "vision3.py", line 57, in <module>
    Vision().run()
  File "c:\users\terry\documents\github\ctn_benchmarks\ctn_benchmark\benchmark.py", line 122, in run
    sim = Simulator(model, dt=p.dt)
  File "c:\users\terry\documents\github\nengo_spinnaker\nengo_spinnaker\simulator.py", line 139, in __init__
    self.netlist = self.model.make_netlist(self.max_steps or 0)
  File "c:\users\terry\documents\github\nengo_spinnaker\nengo_spinnaker\builder\builder.py", line 340, in make_netlist
    self, *args, **kwargs
  File "c:\users\terry\documents\github\nengo_spinnaker\nengo_spinnaker\operators\lif.py", line 453, in make_vertices
    cluster_vertices = cluster.make_vertices(cycles)
  File "c:\users\terry\documents\github\nengo_spinnaker\nengo_spinnaker\operators\lif.py", line 619, in make_vertices
    assert n_slices <= 16  # Too many cores in the cluster
AssertionError

To work around this, we adjust the memory-padding constraints in lif.py:

        dtcm_constraint = partition.Constraint(16 * 64 * 2**10,
                                               0.9)  # 90% of 16 cores DTCM

        # The number of cycles available is 200MHz * the machine timestep; or
        # 200 * the machine timestep in microseconds.
        cycles = 200 * model.machine_timestep
        cpu_constraint = partition.Constraint(cycles * 16,
                                              0.8)  # 80% of 16 cores compute

For the examples run on this page, I've set the 0.9 and 0.8 values to 0.5 and 0.5.
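For reference, the slice count scales inversely with those slack fractions. Here is a rough back-of-envelope sketch of the DTCM side of that arithmetic (an assumption-laden illustration: it counts only the encoder matrix at 4 bytes per entry, whereas the real partitioner also weighs the CPU-cycle constraint, which may be what actually pushes n_slices past 16 here):

```python
import math

DTCM_PER_CORE = 64 * 2**10  # 64 KB of DTCM per core

def n_slices(n_neurons, dim_input, usable_fraction):
    """Slices needed to fit the encoder matrix in DTCM (hypothetical model)."""
    encoder_bytes = n_neurons * dim_input * 4  # assume 4-byte encoder entries
    per_core_budget = usable_fraction * DTCM_PER_CORE
    return math.ceil(encoder_bytes / per_core_budget)

# With the default 90% budget, 100 neurons x 1260 dims needs:
print(n_slices(100, 1260, 0.9))  # 9
# Halving the budget to 50% nearly doubles the slice count:
print(n_slices(100, 1260, 0.5))  # 16 -- right at the n_slices <= 16 limit
```

This only illustrates why loosening the slack fractions changes how many cores a cluster wants; it does not reproduce the exact numbers the backend computes.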

Running this model with different parameters produces a variety of results.

If I run the model as it is (n_neurons=80, fixed_input=False, function_of_time=False, aligned_vector=False), then the input is passed in from the host at runtime and I get a whole bunch of watchdog errors:

Core (2, 9, 2) in state AppState.watchdog
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:584 Malloc ensemble.input (5040 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:286 Malloc filters->filters (20 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:308 Malloc filters->filters[f].input (8 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:310 Malloc filters->filters[f].input->value (360 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:318 Malloc filters->filters[f].output (360 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:81 Malloc filter->state (8 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:240 Malloc filters->routes (16 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:655 Malloc ensemble.encoders (30240 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:661 Malloc ensemble.bias (24 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:666 Malloc ensemble.gain (24 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:671 Malloc ensemble.population_lengths (56 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:697 Malloc ensemble.spikes (56 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:703 Malloc ensemble.decoders (640 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:716 Malloc ensemble.keys (8 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/neuron_lif.c:17 Malloc ensemble->state (16 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/neuron_lif.c:21 Malloc state->voltages (24 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/neuron_lif.c:25 Malloc state->refractory (24 bytes)
PES learning: Num rules:0
Voja learning: Num rules:0, One over radius:1.000000
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/recording.c:14 Malloc buffer->buffer (12 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/recording.c:14 Malloc buffer->buffer (4 bytes)

If I give it a fixed input (n_neurons=80, fixed_input=True, function_of_time=False, aligned_vector=False), then the value gets precomputed, rolled into the bias, and it works great:

[image]

But if I now increase the number of neurons to 100 (n_neurons=100, fixed_input=True, function_of_time=False, aligned_vector=False), it runs but gives an incorrect result (much smaller than it should be):

[image]

Bizarrely, if I set aligned_vector to True, it works fine at both 80 and 100 neurons, even though the only change is that the desired output is [1,0,0,0,0,0,...] instead of a randomly chosen 32-dimensional unit vector.

(n_neurons=100, fixed_input=True, function_of_time=False, aligned_vector=True)
[image]

(n_neurons=80, fixed_input=True, function_of_time=False, aligned_vector=True)
[image]
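One numerical difference between the two cases (an observation, not a diagnosis): every component of a random 32-dimensional unit vector is small, on the order of 1/sqrt(32), whereas the aligned vector has a single component of exactly 1.0. A minimal NumPy sketch (the fixed seed is just for illustration; spa.Vocabulary draws its own random vectors):

```python
import numpy as np

rng = np.random.RandomState(0)

aligned = np.eye(32)[0]              # [1, 0, 0, ...], as with aligned_vector=True
v = rng.randn(32)
random_unit = v / np.linalg.norm(v)  # a randomly chosen 32-d unit vector

print(np.abs(aligned).max())      # 1.0
print(np.abs(random_unit).max())  # well below 1; typical component ~1/sqrt(32)
```

If the decoded values are being quantized or scaled somewhere, small components would be affected very differently from a lone 1.0.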

Now let's see what happens if we set the input to be a function_of_time Node.

The function_of_time approach works great with 100 neurons and aligned vectors:

(n_neurons=100, fixed_input=False, function_of_time=True, aligned_vector=True)
[image]

If we don't use an aligned_vector, we get the same problem as above:

(n_neurons=100, fixed_input=False, function_of_time=True, aligned_vector=False)
[image]

But now if I go down to 80 neurons, it dies with a watchdog error:
(n_neurons=80, fixed_input=False, function_of_time=True, aligned_vector=False)

INFO:nengo_spinnaker.simulator:Running simulation...
Core (7, 8, 1) in state AppState.watchdog
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:584 Malloc ensemble.input (5040 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:286 Malloc filters->filters (20 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:308 Malloc filters->filters[f].input (8 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:310 Malloc filters->filters[f].input->value (360 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:318 Malloc filters->filters[f].output (360 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:81 Malloc filter->state (8 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/common/input_filtering.c:240 Malloc filters->routes (16 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:655 Malloc ensemble.encoders (30240 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:661 Malloc ensemble.bias (24 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:666 Malloc ensemble.gain (24 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:671 Malloc ensemble.population_lengths (56 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:697 Malloc ensemble.spikes (56 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:703 Malloc ensemble.decoders (960 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/ensemble.c:716 Malloc ensemble.keys (12 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/neuron_lif.c:17 Malloc ensemble->state (16 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/neuron_lif.c:21 Malloc state->voltages (24 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/neuron_lif.c:25 Malloc state->refractory (24 bytes)
PES learning: Num rules:0
Voja learning: Num rules:0, One over radius:1.000000
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/recording.c:14 Malloc buffer->buffer (12 bytes)
/local/mundya/nengo_devel/nengo_spinnaker/spinnaker_components/ensemble/recording.c:14 Malloc buffer->buffer (4 bytes)
tcstewar commented Jul 25, 2016:

Also, given that last surprising result (that it works with 100 neurons but not with 80), I tried running it with neither the function_of_time nor the fixed_input setting at 100 neurons:

(n_neurons=100, fixed_input=False, function_of_time=False, aligned_vector=False)
[image]

(n_neurons=100, fixed_input=False, function_of_time=False, aligned_vector=True)
[image]

So it looks like a normal input Node (i.e. one executed on the host) runs fine at 100 neurons but not at 80, and it hits the same strange problem where the output value is only correct if the output vector is aligned.

tcstewar commented:

A hopefully useful summary of the above:

  • function_of_time=False, fixed_input=False
    • aligned_vector=False
      • n_neurons=80: watchdog crash
      • n_neurons=100: runs, but wrong answer
    • aligned_vector=True
      • n_neurons=80: watchdog crash
      • n_neurons=100: runs, correct answer
  • function_of_time=False, fixed_input=True
    • aligned_vector=False
      • n_neurons=80: runs, correct answer
      • n_neurons=100: runs, but wrong answer
    • aligned_vector=True
      • n_neurons=80: runs, correct answer
      • n_neurons=100: runs, correct answer
  • function_of_time=True, fixed_input=False
    • aligned_vector=False
      • n_neurons=80: watchdog crash
      • n_neurons=100: runs, but wrong answer
    • aligned_vector=True
      • n_neurons=80: watchdog crash
      • n_neurons=100: runs, correct answer

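For completeness, the twelve configurations in this summary can be enumerated mechanically (a hypothetical driver loop; the actual runs used the benchmark script's command line, not this code):

```python
import itertools

configs = [
    dict(function_of_time=fot, fixed_input=fixed,
         aligned_vector=aligned, n_neurons=n)
    for fot, fixed, aligned, n in itertools.product(
        [False, True], [False, True], [False, True], [80, 100])
    # fixed_input=True precomputes the input, so function_of_time is moot there
    if not (fot and fixed)
]
for c in configs:
    print(c)
print(len(configs))  # 12, matching the summary above
```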
mundya commented Jul 25, 2016:

Thank you for all the detail!

Note to self, or @tcstewar if you want to try - what happens if the vector is [0, 0, ..., 1] and supplied using a function_of_time input? Prediction - the output isn't as expected. If so, the cause may well be that the compute constraint is being violated - maybe because the packet processing cost hasn't been accounted for.
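For reference, the suggested experiment amounts to swapping the aligned vector for the last basis vector in the benchmark's vocab.add('A', ...) call (NumPy-only sketch of the vector in question; the rest of the script is unchanged, and would be run with fixed_input=False, function_of_time=True):

```python
import numpy as np

dim_output = 32
trailing_aligned = np.eye(dim_output)[-1]  # [0, 0, ..., 1]

# Still a unit vector, but aligned with the last axis instead of the first.
print(trailing_aligned[-1], np.linalg.norm(trailing_aligned))  # 1.0 1.0
```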

tcstewar commented:

> Note to self, or @tcstewar if you want to try - what happens if the vector is [0, 0, ..., 1] and supplied using a function_of_time input? Prediction - the output isn't as expected. If so, the cause may well be that the compute constraint is being violated - maybe because the packet processing cost hasn't been accounted for.

Hmm... I just tried this, and the output is actually correct. :(

mundya commented Oct 25, 2016:

I'm wondering if running the nodes on the host is adding extra noise when debugging this problem, largely because the place-and-route solution isn't (currently) necessarily repeatable; consequently, where the network is placed will affect how reliably data from the host gets into the simulation.

That said, I'll investigate this a little later today.
