Each `NeuronGroup` object with `N` neurons and a `threshold` condition defines its own spikespace. In the C++ and CUDA standalone devices, this spikespace is declared as an array of length `N + 1`. Each time step, the IDs of all spiking neurons are stored in the spikespace, and the last entry stores the number of spiking neurons. All other entries default to `-1`. Let's consider a `NeuronGroup` with `N = 5` neurons, of which neurons 0, 3, and 4 spike in the current time step. For `cpp_standalone`, the neuron IDs in the spikespace are sorted, so it would look like this (in pseudocode):
```
spikespace = [0, 3, 4, -1, -1, 3]
```
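For illustration, a sequential threshold pass in the style of `cpp_standalone` could fill such a spikespace as in the following sketch (hypothetical names, not the actual generated code):

```cuda
#include <cstdint>

// Sketch of a sequential threshold pass. `spikespace` has N + 1
// entries; `spiked[i]` stands in for the evaluated threshold
// condition of neuron i (hypothetical name).
void fill_spikespace(int32_t* spikespace, const bool* spiked, int N)
{
    int count = 0;
    for (int i = 0; i < N; i++)
    {
        if (spiked[i])
            spikespace[count++] = i;  // IDs end up sorted automatically
    }
    for (int i = count; i < N; i++)
        spikespace[i] = -1;           // unused slots stay at -1
    spikespace[N] = count;            // last entry: number of spikes
}
```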
For `cuda_standalone`, the neuron IDs are generally in random order, because the spikespace is filled in parallel by multiple CUDA threads during threshold detection (see the Brian2CUDA paper, section 2.3.1). The spikespace could, for example, look like this:
```
spikespace = [3, 0, 4, -1, -1, 3]
```
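The unsorted order follows from how a parallel threshold kernel typically reserves slots: each spiking thread atomically increments the spike counter and writes its neuron ID into whichever slot it obtained. A minimal sketch of this pattern (hypothetical names, not the actual Brian2CUDA kernel):

```cuda
#include <cstdint>

// Sketch of a parallel threshold pass: one thread per neuron. The
// slot order depends on the order in which the atomicAdd calls
// happen to execute, hence the unsorted neuron IDs. This assumes
// entries 0..N-1 were reset to -1 and spikespace[N] to 0 beforehand.
__global__ void threshold_kernel(int32_t* spikespace, const bool* spiked, int N)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N && spiked[i])
    {
        // Reserve the next free slot; the returned old value is unique.
        int slot = atomicAdd(&spikespace[N], 1);
        spikespace[slot] = i;
    }
}
```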
A `PopulationRateMonitor` records the instantaneous population firing rate at each time step. If it records from the full `NeuronGroup`, it simply divides the number of spiking neurons by the total number of neurons:
```
population_firing_rate = spikespace[N] / N
```
This works for both `cpp_standalone` and `cuda_standalone`.
But when the `PopulationRateMonitor` records from a `Subgroup` of the `NeuronGroup`, things get more complicated: the number of spikes in the spikespace only exists per `NeuronGroup`, not per `Subgroup`, so the `PopulationRateMonitor` has to count the spiking neurons in the spikespace that belong to the recorded `Subgroup`. In Brian2, `Subgroup`s are always contiguous, meaning a `Subgroup` can consist of neurons `1, 2, 3` in our example, but never of neurons `1, 3` without neuron `2`. For `cpp_standalone`, the `PopulationRateMonitor` implementation then only has to find the indices of the first and last spiking neurons in the spikespace that belong to the recorded `Subgroup`; the number of spiking neurons for the `Subgroup` is then just the difference of those indices.
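Since the spikespace is sorted and the subgroup is a contiguous ID range, finding those boundary indices amounts to a range search over the spiking IDs. A minimal sketch, assuming a half-open subgroup range `[sub_start, sub_stop)` (hypothetical names; shown with binary search for brevity, the actual generated code may scan linearly):

```cuda
#include <algorithm>
#include <cstdint>

// Count the spikes of a contiguous subgroup [sub_start, sub_stop)
// in a SORTED spikespace (cpp_standalone style).
int count_subgroup_spikes(const int32_t* spikespace, int N,
                          int32_t sub_start, int32_t sub_stop)
{
    const int32_t* begin = spikespace;
    const int32_t* end   = spikespace + spikespace[N];  // spikes this step
    // First spiking ID inside the subgroup ...
    const int32_t* lo = std::lower_bound(begin, end, sub_start);
    // ... and one past the last spiking ID inside the subgroup.
    const int32_t* hi = std::lower_bound(begin, end, sub_stop);
    return static_cast<int>(hi - lo);  // the difference of the indices
}
```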
For `cuda_standalone`, where the neuron IDs in the spikespace are not sorted, this is not possible. Instead, we need to go through all entries in the spikespace and count those that belong to the `Subgroup`. Our current implementation of this works, but is terribly inefficient. A first fix is straightforward and should be implemented soon, see brian-team/brian2cuda#285.
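For illustration, such a counting pass over the unsorted spikespace could look like the following sketch (hypothetical names, not the actual Brian2CUDA implementation):

```cuda
#include <cstdint>

// Count the spikes of a contiguous subgroup [sub_start, sub_stop) in
// an UNSORTED spikespace: every entry has to be inspected, there is
// no shortcut via sorted boundaries. `count` must be zeroed before
// the kernel launch.
__global__ void count_subgroup_spikes(const int32_t* spikespace, int N,
                                      int32_t sub_start, int32_t sub_stop,
                                      int32_t* count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= N)
        return;
    int32_t id = spikespace[i];
    if (id >= sub_start && id < sub_stop)  // -1 entries never match
        atomicAdd(count, 1);
}
```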
Nevertheless, recording from a subgroup will always be less efficient than recording from the full group, and even more so in `cuda_standalone`. We should investigate the extent of this effect. If it is too strong, one could try a few workarounds:
- If a `SpikeMonitor` is recording all spikes of a `Subgroup` (or the `NeuronGroup`), we could determine the firing rates after the simulation from the `SpikeMonitor` data (see the sketch after this list).
- If concurrent kernel execution is implemented (Making use of concurrent kernel execution, brian-team/brian2cuda#65; we are working on it), we could consider using many `NeuronGroup`s instead of a single merged group. This would need some benchmarking to see whether it makes sense, since many aspects play a role here: how much slower the concurrent execution of many small kernels is compared to one large kernel, how costly the population rate recording is, what the effect on compile times is when many `NeuronGroup`s produce many code-object source files, etc.
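For the first workaround, reconstructing the per-time-step rate from recorded spike times is straightforward. A minimal host-side sketch (hypothetical names; it divides by the neuron count only, matching the pseudocode above):

```cuda
#include <cstddef>
#include <vector>

// Reconstruct the per-time-step population rate of a subgroup from
// recorded spike times (e.g. SpikeMonitor data): bin the spikes by
// time step, then normalize by the number of neurons in the subgroup.
std::vector<double> rates_from_spikes(const std::vector<double>& spike_times,
                                      double dt, double duration,
                                      int n_sub_neurons)
{
    std::size_t n_bins = static_cast<std::size_t>(duration / dt);
    std::vector<double> rate(n_bins, 0.0);
    for (double t : spike_times)
    {
        std::size_t bin = static_cast<std::size_t>(t / dt);
        if (bin < n_bins)
            rate[bin] += 1.0;          // spikes per time step
    }
    for (double& r : rate)
        r /= n_sub_neurons;            // fraction of neurons spiking
    return rate;
}
```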