spore::SynapticSamplingRewardGradientConnection< targetidentifierT > Class Template Reference

Reward-based synaptic sampling connection class.

#include <synaptic_sampling_rewardgradient_connection.h>

Inheritance diagram for spore::SynapticSamplingRewardGradientConnection< targetidentifierT >: the class derives from nest::Connection< targetidentifierT >.

Public Types

typedef SynapticSamplingRewardGradientCommonProperties CommonPropertiesType
 Type to use for representing common synapse properties.
 
typedef nest::Connection< targetidentifierT > ConnectionBase
 Shortcut for base class.
 

Public Member Functions

 SynapticSamplingRewardGradientConnection ()
 
 SynapticSamplingRewardGradientConnection (const SynapticSamplingRewardGradientConnection< targetidentifierT > &rhs)
 
 ~SynapticSamplingRewardGradientConnection ()
 
void check_connection (nest::Node &s, nest::Node &t, nest::rport receptor_type, double t_lastspike, const CommonPropertiesType &cp)
 
void get_status (DictionaryDatum &d) const
 
void set_status (const DictionaryDatum &d, nest::ConnectorModel &cm)
 Status setter function.
 
void send (nest::Event &e, nest::thread t, double t_lastspike, const CommonPropertiesType &cp)
 
void check_synapse_params (const DictionaryDatum &syn_spec) const
 

Detailed Description

template<typename targetidentifierT>
class spore::SynapticSamplingRewardGradientConnection< targetidentifierT >

Reward-based synaptic sampling connection class.

This connection type implements the reward-based synaptic sampling algorithm introduced in [1,2,3]. The target node to which synapses of this type are connected must be derived from TracingNode. A second node, also derived from TracingNode, must be registered with the synapse model through its reward_transmitter parameter. The synapse model performs a stochastic policy search that tries to maximize the reward signal provided by the reward_transmitter node. At the same time, synaptic weights are constrained by a Gaussian prior with mean $\mu$ and standard deviation $\sigma$. This synapse type cannot change its sign, i.e. synapses are either excitatory or inhibitory, depending on the sign of the weight_scale parameter. If a synaptic weight falls below a threshold (determined by the parameter_mapping_offset parameter), it is clipped to zero (a retracted synapse). The synapse model also implements an optional mechanism to automatically remove retracted synapses from the simulation, which can be turned on using the delete_retracted_synapses parameter.
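In PyNEST, the wiring this implies might be sketched as follows. This is a minimal sketch, not code from the SPORE distribution: the module name 'sporemodule', the synapse model name 'synaptic_sampling_rewardgradient_synapse', and the placeholder neuron model 'tracing_neuron_model' are assumptions; substitute whatever TracingNode-derived models your SPORE installation provides.

    import nest

    nest.Install('sporemodule')  # assumed name of the SPORE extension module

    # 'tracing_neuron_model' is a hypothetical placeholder for any neuron
    # model derived from TracingNode.
    pre = nest.Create('parrot_neuron')
    post = nest.Create('tracing_neuron_model')
    reward = nest.Create('tracing_neuron_model')

    # reward_transmitter must point to a TracingNode before simulation starts.
    nest.SetDefaults('synaptic_sampling_rewardgradient_synapse',
                     {'reward_transmitter': reward[0],
                      'weight_scale': 1.0,  # positive sign: excitatory synapses
                      'delete_retracted_synapses': False})

    nest.Connect(pre, post,
                 syn_spec={'model': 'synaptic_sampling_rewardgradient_synapse'})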

Parameters and state variables

The following parameters can be set in the common properties dictionary (default values and constraints are given in parentheses, corresponding symbols in the equations given below and in references [1,2,3] are given in braces):

name type comment
learning_rate double learning rate (5e-08, ≥0.0) { $\beta$}
temperature double amplitude of parameter noise (0.1, ≥0.0) { $T_\theta$}
gradient_noise double amplitude of gradient noise (0.0, ≥0.0) { $T_g$}
psp_tau_rise double rise time constant of the double-exponential PSP kernel (2.0, >0.0) [ms] { $\tau_r$}
psp_tau_fall double decay time constant of the double-exponential PSP kernel (20.0, >0.0) [ms] { $\tau_m$}
psp_cutoff_amplitude double PSP is clipped to 0 below this value (0.0001, ≥0.0)
integration_time double time of gradient integration (50000.0, >0.0) [ms] { $\tau_g$}
episode_length double length of eligibility trace (1000.0, >0.0) [ms] { $\tau_e$}
weight_update_interval double interval of synaptic weight updates (100.0, >0.0) [ms]
parameter_mapping_offset double offset parameter for computing synaptic weight (3.0) { $\theta_0$}
weight_scale double scaling factor for the synaptic weight (1.0) { $w_0$}
direct_gradient_rate double rate of directly applying changes to the synaptic parameter (0.0) { $c_e$}
gradient_scale double scaling parameter for the gradient (1.0) { $c_g$}
max_param double maximum synaptic parameter (5.0)
min_param double minimum synaptic parameter (-2.0)
max_param_change double maximum synaptic parameter change (40.0, ≥0.0)
reward_transmitter long GID of the synapse's reward transmitter*
bap_trace_id long ID of the BAP trace (0, ≥0)
dopa_trace_id long ID of the dopamine trace (0, ≥0)
simulate_retracted_synapses bool continue simulating retracted synapses (false)
delete_retracted_synapses bool delete retracted synapses (false)

*) reward_transmitter must be set to the GID of a TracingNode before simulation startup.
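Since these are common properties, they are set on the synapse model rather than on individual connections. A sketch, reusing the assumed model name and the reward node from the example above (keys are the common-properties names from the table):

    nest.SetDefaults('synaptic_sampling_rewardgradient_synapse',
                     {'learning_rate': 5e-8,            # beta
                      'temperature': 0.1,               # T_theta
                      'integration_time': 50000.0,      # tau_g [ms]
                      'episode_length': 1000.0,         # tau_e [ms]
                      'weight_update_interval': 100.0,  # [ms]
                      'reward_transmitter': reward[0]})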

The following parameters can be set in the status dictionary:

name type comment
synaptic_parameter double current synaptic parameter { $\theta(t)$}
weight double current synaptic weight { $w(t)$}
eligibility_trace double current eligibility trace { $e(t)$}
reward_gradient double current reward gradient { $g(t)$}
prior_mean double mean of the Gaussian prior { $\mu$}
prior_precision double precision of the Gaussian prior { $c_p$}
recorder_times [double] time points of parameter recordings*
weight_values [double] array of recorded synaptic weight values*
synaptic_parameter_values [double] array of recorded synaptic parameter values*
reward_gradient_values [double] array of recorded reward gradient values*
eligibility_trace_values [double] array of recorded eligibility trace values*
psp_values [double] array of recorded psp values*
recorder_interval double interval of synaptic recordings [ms]
reset_recorder bool clear all recorded values now* (write only)

*) Recorder fields are read-only. If reset_recorder is set to true, all recorder fields are cleared instantaneously.
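Reading these entries per connection might look as follows. This is a sketch: GetStatus/SetStatus are standard PyNEST calls, the key names are taken from the table above, and pre/post refer to the nodes from the earlier sketch.

    conns = nest.GetConnections(pre, post)

    # Per-synapse state (keys from the status table above).
    theta, w = nest.GetStatus(conns, ['synaptic_parameter', 'weight'])[0]

    # Enable recording, simulate, then read the recorded traces.
    nest.SetStatus(conns, {'recorder_interval': 100.0})
    nest.Simulate(10000.0)
    times = nest.GetStatus(conns, 'recorder_times')[0]
    weights = nest.GetStatus(conns, 'weight_values')[0]

    # Clear all recorder fields (write-only flag).
    nest.SetStatus(conns, {'reset_recorder': True})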

Implementation Details

This connection type is a diligent synapse model; updates are therefore triggered at a regular interval, which is ensured by the ConnectionUpdateManager. The state of each synapse consists of the variables $y(t), e(t), g(t), \theta(t), w(t)$. The variable $y(t)$ is the presynaptic spike train filtered with a PSP kernel $\epsilon(t)$ of the form

\[ \epsilon(t) \;=\; \frac{\tau_r}{\tau_m - \tau_r}\left( e^{-\frac{t}{\tau_m}} - e^{-\frac{t}{\tau_r}} \right)\;. \hspace{24px} (1) \]
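As a plain numerical illustration (not SPORE code; the function name psp_kernel and the use of NumPy are assumptions), eq. (1) with the default time constants from the table above can be evaluated like this:

    import numpy as np

    def psp_kernel(t, tau_r=2.0, tau_m=20.0, cutoff=1e-4):
        """Double-exponential PSP kernel, eq. (1); t in ms."""
        eps = tau_r / (tau_m - tau_r) * (np.exp(-t / tau_m) - np.exp(-t / tau_r))
        # psp_cutoff_amplitude: values below the cutoff are clipped to 0.
        return np.where(eps < cutoff, 0.0, eps)

    t = np.arange(0.0, 100.0, 0.1)   # 0.1 ms grid (the NEST default resolution)
    y = psp_kernel(t)                # filtered response to a spike at t = 0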

A node derived from type TracingNode must be registered to the synapse model at its reward_transmitter parameter. The trace of this node at id dopa_trace_id is used as reward signal $dopa(t)$. The trace of the postsynaptic neuron with id bap_trace_id is used as back-propagating signal $bap(t)$. The synapse then solves the following set of differential equations:

\[ \frac{d e(t)}{dt} \;=\; -\frac{1}{\tau_e} e(t) \,+\, w(t)\,y(t)\,bap(t) \hspace{24px} (2) \]

\[ \frac{d g(t)}{dt} \;=\; -\frac{1}{\tau_g} g(t) \,+\, dopa(t)\,e(t) \,+\, T_g\,d \mathcal{W}_g \hspace{24px} (3) \]

\[ d \theta(t) \;=\; \beta\,\bigg( c_p (\mu - \theta(t)) + c_g\,g(t) + c_e \, dopa(t) \,e(t) \bigg) dt \,+\, \sqrt{ 2 T_\theta \beta } \, d\mathcal{W}_{\theta} \hspace{24px} (4) \]

\[ w(t) \;=\; w_0 \, \exp ( \theta(t) - \theta_0 ) \hspace{24px} (5) \]

The precision of the prior in equation (4) relates to the standard deviation as $c_p = 1/\sigma^2$. Setting $c_p=0$ corresponds to a non-informative (flat) prior.
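For example, a Gaussian prior with standard deviation $\sigma = 0.5$ corresponds to prior_precision $c_p = 1/0.5^2 = 4$; increasing $\sigma$ drives $c_p$ toward 0 and weakens the pull of the prior on $\theta(t)$.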

The differential equations (2-5) are solved using Euler integration. The dynamics of the postsynaptic term $y(t)$, the eligibility trace $e(t)$ and the reward gradient $g(t)$ are updated at each NEST time step, whereas the dynamics of $\theta(t)$ and $w(t)$ are updated on a time grid determined by weight_update_interval; synaptic weights remain constant between two updates. The synapse recorder is only invoked after each weight update, which means that recorder_interval must be a multiple of weight_update_interval. Synaptic parameters are clipped at min_param and max_param, and parameter gradients are clipped at +/- max_param_change. Synaptic weights of synapses for which $\theta(t)$ falls below 0 are clipped to 0 (retracted synapses). If simulate_retracted_synapses is set to false, the simulation of $y(t)$, $e(t)$ and $g(t)$ is not continued for retracted synapses; only the stochastic dynamics of $\theta(t)$ are simulated until the synapse is reformed, and during this time the reward gradient $g(t)$ is fixed at 0. If delete_retracted_synapses is set to true, retracted synapses are removed from the network using the garbage collector of the ConnectionUpdateManager. A minimal numerical sketch of one update step is given below.
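The following Euler-Maruyama sketch of a single time step is illustrative only, not the library implementation: it assumes scalar state variables, a parameter dictionary p keyed by the common-properties names above, and applies the $\theta(t)$/$w(t)$ update at every step rather than on the weight_update_interval grid.

    import numpy as np

    def synapse_step(y, e, g, theta, w, bap, dopa, dt, p):
        # Eq. (2): eligibility trace driven by PSP, weight and BAP signal.
        e += dt * (-e / p['episode_length'] + w * y * bap)
        # Eq. (3): reward gradient with gradient noise amplitude T_g.
        g += dt * (-g / p['integration_time'] + dopa * e) \
             + p['gradient_noise'] * np.sqrt(dt) * np.random.randn()
        # Eq. (4): clip the parameter gradient, then update theta.
        grad = (p['prior_precision'] * (p['prior_mean'] - theta)
                + p['gradient_scale'] * g
                + p['direct_gradient_rate'] * dopa * e)
        grad = np.clip(grad, -p['max_param_change'], p['max_param_change'])
        theta += p['learning_rate'] * grad * dt \
                 + np.sqrt(2.0 * p['temperature'] * p['learning_rate'] * dt) \
                 * np.random.randn()
        theta = np.clip(theta, p['min_param'], p['max_param'])
        # Eq. (5): exponential parameter-to-weight mapping; theta < 0 retracts.
        w = p['weight_scale'] * np.exp(theta - p['parameter_mapping_offset']) \
            if theta > 0.0 else 0.0
        return e, g, theta, w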

References

[1] David Kappel, Robert Legenstein, Stefan Habenschuss, Michael Hsieh and Wolfgang Maass. A Dynamic Connectome Supports the Emergence of Stable Computational Function of Neural Circuits through Reward-Based Learning. eNeuro, 2018. https://doi.org/10.1523/ENEURO.0301-17.2018

[2] David Kappel, Robert Legenstein, Stefan Habenschuss, Michael Hsieh and Wolfgang Maass. Reward-based self-configuration of neural circuits. 2017. https://arxiv.org/abs/1704.04238

[3] Zhaofei Yu, David Kappel, Robert Legenstein, Sen Song, Feng Chen and Wolfgang Maass. CaMKII activation supports reward-based neural network optimization through Hamiltonian sampling. 2016. https://arxiv.org/abs/1606.00157

Author
David Kappel, Michael Hsieh
See also
TracingNode, DiligentConnectorModel

Constructor & Destructor Documentation

◆ SynapticSamplingRewardGradientConnection() [1/2]

template<typename targetidentifierT >
spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::SynapticSamplingRewardGradientConnection ( )

Default Constructor.

◆ SynapticSamplingRewardGradientConnection() [2/2]

template<typename targetidentifierT >
spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::SynapticSamplingRewardGradientConnection ( const SynapticSamplingRewardGradientConnection< targetidentifierT > &  rhs)

Copy Constructor.

◆ ~SynapticSamplingRewardGradientConnection()

template<typename targetidentifierT >
spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::~SynapticSamplingRewardGradientConnection ( )

Destructor.

Member Function Documentation

◆ check_connection()

template<typename targetidentifierT>
void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::check_connection ( nest::Node &  s,
nest::Node &  t,
nest::rport  receptor_type,
double  t_lastspike,
const CommonPropertiesType &  cp 
)
inline

Checks if the type of the postsynaptic node is supported. Throws an IllegalConnection exception if the postsynaptic node is not derived from TracingNode.

◆ check_synapse_params()

template<typename targetidentifierT >
void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::check_synapse_params ( const DictionaryDatum &  syn_spec) const

Checks the syn_spec dictionary for parameters that are not allowed for this connection. Issues a warning or throws an error if such a parameter is found.

◆ get_status()

template<typename targetidentifierT >
void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::get_status ( DictionaryDatum &  d) const

Status getter function.

◆ send()

template<typename targetidentifierT >
void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::send ( nest::Event &  e,
nest::thread  thread,
double  t_last_spike,
const CommonPropertiesType &  cp 
)

Sends an event to the postsynaptic neuron. This updates the synapse state and the synaptic weight up to the current slice origin and delivers the spike event. The method is also triggered by the ConnectionUpdateManager to indicate that the synapse state is running out of date; in that case an invalid rport of -1 is passed and the spike is not delivered to the postsynaptic neuron.

Parameters
e	the spike event.
thread	the id of the connection's thread.
t_last_spike	the time of the last spike.
cp	the synapse type common properties.

◆ set_status()

template<typename targetidentifierT >
void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::set_status ( const DictionaryDatum &  d,
nest::ConnectorModel &  cm 
)

Status setter function.

Note
The weight will be overwritten the next time the synapse is updated.
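For example (sketch, reusing conns from the status example above):

    # Reset the synaptic parameter of existing connections; the weight entry
    # is derived from it and is recomputed at the next update.
    nest.SetStatus(conns, {'synaptic_parameter': 0.0})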

The documentation for this class was generated from the following file:
synaptic_sampling_rewardgradient_connection.h