
Investigate the negative effect of one PopulationRateMonitor per Subgroup #2

denisalevi opened this issue Apr 13, 2022

Each NeuronGroup object with N neurons and a threshold condition defines its own spikespace. In the C++ and CUDA standalone devices, this spikespace is declared as an array of N + 1 entries. Each time step, the IDs of all spiking neurons are stored in the spikespace, and the last entry stores the number of spiking neurons. All other entries default to -1. Let's consider a NeuronGroup with N=5 neurons, of which neurons 0, 3 and 4 spike in the current time step. For cpp_standalone, the neuron IDs in the spikespace are sorted, so it would look like this (using pseudocode here):

spikespace = [0, 3, 4, -1, -1, 3]
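
To make the layout concrete, here is a minimal sketch of a serial threshold pass that fills such a spikespace. This is illustrative code only, not what Brian2 actually generates:

```cuda
#include <vector>

// Sketch only: a serial threshold pass that fills a spikespace of size
// N + 1. Spiking neuron IDs are appended in order (hence sorted), unused
// slots stay -1, and the last entry holds the number of spikes.
std::vector<int> fill_spikespace(const std::vector<bool>& crossed_threshold) {
    const int N = static_cast<int>(crossed_threshold.size());
    std::vector<int> spikespace(N + 1, -1);
    int count = 0;
    for (int i = 0; i < N; i++) {  // serial loop, so IDs end up sorted
        if (crossed_threshold[i])
            spikespace[count++] = i;
    }
    spikespace[N] = count;  // last entry: number of spiking neurons
    return spikespace;
}
// For crossed_threshold = {1, 0, 0, 1, 1} this yields {0, 3, 4, -1, -1, 3}.
```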

For cuda_standalone, the neuron IDs are in general randomly ordered, because the spikespace is filled in parallel by multiple CUDA threads during threshold detection (see the Brian2CUDA paper, section 2.3.1). The spikespace could, for example, look like this:

spikespace = [3, 0, 4, -1, -1, 3]
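
The following kernel is a simplified sketch in the spirit of the Brian2CUDA threshold pass, not the actual generated code. Each thread handles one neuron and reserves a slot in the spikespace with an atomic increment; which slot a spiking neuron ends up in depends on the order in which the atomics complete, which is why the IDs are effectively unordered:

```cuda
// Sketch only: parallel threshold pass. spikespace[N] is assumed to be
// reset to 0 and spikespace[0..N-1] to -1 before the launch.
__global__ void threshold_kernel(const bool* crossed_threshold,
                                 int* spikespace, int N) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N) return;
    if (crossed_threshold[i]) {
        int slot = atomicAdd(&spikespace[N], 1);  // reserve the next free slot
        spikespace[slot] = i;                     // slot order is nondeterministic
    }
}
```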

A PopulationRateMonitor records the instantaneous population firing rate at each time step. If it records from the full NeuronGroup, it just divides the number of spiking neurons by the total number of neurons (and by the time step dt, which turns the spiking fraction into a rate in Hz):

population_firing_rate = spikespace[N] / (N * dt)

This works for both cpp_standalone and cuda_standalone.

But when the PopulationRateMonitor records from a Subgroup of the NeuronGroup, things get more complicated: the spike count in the spikespace exists only per NeuronGroup, not per Subgroup, so the PopulationRateMonitor has to count the spiking neurons in the spikespace that belong to the recorded Subgroup. In Brian2, Subgroups are always contiguous, that is, a Subgroup can consist of neurons 1, 2 and 3 in our example, but never of neurons 1 and 3 without neuron 2. For cpp_standalone, the PopulationRateMonitor implementation can exploit that the spikespace is sorted: it finds the positions in the spikespace of the first and (one past the) last spiking neuron belonging to the recorded Subgroup, and the number of spiking neurons in the Subgroup is then just the difference of those positions (see the sketch below).
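
As a sketch of that idea (again illustrative, not the generated code), two binary searches over the filled part of the sorted spikespace suffice for a contiguous Subgroup covering neurons start to stop - 1:

```cuda
#include <algorithm>

// Sketch only: count spikes of a contiguous subgroup [start, stop) in a
// sorted spikespace. Two binary searches locate the first spiking neuron
// >= start and the first >= stop; the count is the difference.
int count_subgroup_spikes_sorted(const int* spikespace, int N,
                                 int start, int stop) {
    const int num_spikes = spikespace[N];
    const int* first = std::lower_bound(spikespace, spikespace + num_spikes, start);
    const int* last  = std::lower_bound(spikespace, spikespace + num_spikes, stop);
    return static_cast<int>(last - first);
}
```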

For cuda_standalone, where the neurons in the spikespace are not sorted, this is not possible. Instead, we need to go through all entries in the spikespace and count those that belong to the Subgroup. Our current implementation of this works, but is terribly inefficient. A first fix is straightforward and should be implemented soon, see brian-team/brian2cuda#285.
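
A minimal sketch of such a linear scan (the actual Brian2CUDA implementation differs in detail; see the issue linked above for the planned improvement):

```cuda
// Sketch only: every spikespace entry is inspected, and a global counter is
// atomically incremented for each spiking neuron that falls into the
// contiguous subgroup [start, stop). *subgroup_count is assumed to be reset
// to 0 before the launch.
__global__ void count_subgroup_spikes_unsorted(const int* spikespace, int N,
                                               int start, int stop,
                                               int* subgroup_count) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N) return;
    int neuron_id = spikespace[i];
    if (neuron_id != -1 && neuron_id >= start && neuron_id < stop)
        atomicAdd(subgroup_count, 1);
}
```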

Nevertheless, recording from a Subgroup will always be less efficient than recording from the full NeuronGroup, and even more so in cuda_standalone. We should investigate the extent of this effect. If it is too strong, one could try a few workarounds:

  • If a SpikeMonitor is recording all spikes of a Subgroup (or of the NeuronGroup), we could determine the firing rates after the simulation from the SpikeMonitor data (see the sketch after this list).
  • If concurrent kernel execution is implemented (Making use of concurrent kernel execution brian-team/brian2cuda#65, we are working on it), we could consider using many NeuronGroups instead of a single merged group. This would need some benchmarking to decide whether it makes sense, since many aspects play a role here: how much slower the concurrent execution of many small kernels is compared to one large kernel, how costly the population rate recording is, what the effect on compile times is when many NeuronGroups generate many code object source files, etc.
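
For the first workaround, here is a hedged sketch of how the rates could be reconstructed on the host after the simulation from recorded spike IDs and times (a hypothetical helper, not part of Brian2):

```cuda
#include <cmath>
#include <vector>

// Sketch only: reconstruct the instantaneous population rate of a contiguous
// subgroup [start, stop) from SpikeMonitor-style data (parallel arrays of
// neuron IDs and spike times). rate[k] is the rate in Hz for time step k.
std::vector<double> rates_from_spike_data(const std::vector<int>& ids,
                                          const std::vector<double>& times,
                                          int start, int stop,
                                          double dt, int num_steps) {
    const int subgroup_size = stop - start;
    std::vector<double> rate(num_steps, 0.0);
    for (size_t s = 0; s < ids.size(); s++) {
        if (ids[s] < start || ids[s] >= stop)
            continue;  // spike does not belong to the subgroup
        int step = static_cast<int>(std::lround(times[s] / dt));
        if (step >= 0 && step < num_steps)
            rate[step] += 1.0;  // count spikes per time step
    }
    for (int k = 0; k < num_steps; k++)
        rate[k] /= subgroup_size * dt;  // spike counts -> Hz
    return rate;
}
```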