Reward-based synaptic sampling connection class.

#include <synaptic_sampling_rewardgradient_connection.h>

Public Types
typedef SynapticSamplingRewardGradientCommonProperties	CommonPropertiesType
	Type to use for representing common synapse properties.

typedef nest::Connection< targetidentifierT >	ConnectionBase
	Shortcut for base class.

Public Member Functions
	SynapticSamplingRewardGradientConnection ()

	SynapticSamplingRewardGradientConnection (const SynapticSamplingRewardGradientConnection< targetidentifierT > &rhs)

	~SynapticSamplingRewardGradientConnection ()

void	check_connection (nest::Node &s, nest::Node &t, nest::rport receptor_type, double t_lastspike, const CommonPropertiesType &cp)

void	get_status (DictionaryDatum &d) const

void	set_status (const DictionaryDatum &d, nest::ConnectorModel &cm)
	Status setter function.

void	send (nest::Event &e, nest::thread t, double t_lastspike, const CommonPropertiesType &cp)

void	check_synapse_params (const DictionaryDatum &syn_spec) const

Detailed Description

template<typename targetidentifierT>
class spore::SynapticSamplingRewardGradientConnection< targetidentifierT >

Reward-based synaptic sampling connection class.

This connection type implements the reward-based synaptic sampling algorithm introduced in [1,2,3]. The target node to which synapses of this type are connected must be derived from TracingNode. A second node, also derived from TracingNode, must be registered with the synapse model through its reward_transmitter parameter. The synapse model performs a stochastic policy search that tries to maximize the reward signal provided by the reward_transmitter node. At the same time, synaptic weights are constrained by a Gaussian prior with mean $\mu$ and standard deviation $\sigma$ . A synapse of this type cannot change its sign, i.e. synapses are either excitatory or inhibitory depending on the sign of the weight_scale parameter. If synaptic weights fall below a threshold (determined by the parameter_mapping_offset parameter), they are clipped to zero (retracted synapses). The synapse model also implements an optional mechanism that automatically removes retracted synapses from the simulation. This mechanism can be turned on using the delete_retracted_synapses parameter.

Parameters and state variables

The following parameters can be set in the common properties dictionary. Default values and constraints are given in parentheses; the corresponding symbols in the equations below and in references [1,2,3] are given in braces:

name	type	comment
learning_rate	double	learning rate (5e-08, ≥0.0) { $\beta$ }
temperature	double	amplitude of parameter noise (0.1, ≥0.0) { $T_\theta$ }
gradient_noise	double	amplitude of gradient noise (0.0, ≥0.0) { $T_g$ }
psp_tau_rise	double	double exponential PSP kernel rise (2.0, >0.0) [ms] { $\tau_r$ }
psp_tau_fall	double	double exponential PSP kernel decay (20.0, >0.0) [ms] { $\tau_m$ }
psp_cutoff_amplitude	double	PSP is clipped to 0 below this value (0.0001, ≥0.0) { }
integration_time	double	time of gradient integration (50000.0, >0.0) [ms] { $\tau_g$ }
episode_length	double	length of eligibility trace (1000.0, >0.0) [ms] { $\tau_e$ }
weight_update_interval	double	interval of synaptic weight updates (100.0, >0.0) [ms]
parameter_mapping_offset	double	offset parameter for computing synaptic weight (3.0) { $\theta_0$ }
weight_scale	double	scaling factor for the synaptic weight (1.0) { $w_0$ }
direct_gradient_rate	double	rate of directly applying changes to the synaptic parameter (0.0) { $c_e$ }
gradient_scale	double	scaling parameter for the gradient (1.0) { $c_g$ }
max_param	double	maximum synaptic parameter (5.0)
min_param	double	minimum synaptic parameter (-2.0)
max_param_change	double	maximum synaptic parameter change (40.0, ≥0.0)
reward_transmitter	long	GID of the synapse's reward transmitter*
bap_trace_id	long	ID of the BAP trace (0, ≥0)
dopa_trace_id	long	ID of the dopamine trace (0, ≥0)
simulate_retracted_synapses	bool	continue simulating retracted synapses (false)
delete_retracted_synapses	bool	delete retracted synapses (false)

*) reward_transmitter must be set to the GID of a TracingNode before simulation startup.

The following parameters can be set in the status dictionary:

name	type	comment
synaptic_parameter	double	current synaptic parameter { $\theta(t)$ }
weight	double	current synaptic weight { $w(t)$ }
eligibility_trace	double	current eligibility trace { $e(t)$ }
reward_gradient	double	current reward gradient { $g(t)$ }
prior_mean	double	mean of the Gaussian prior { $\mu$ }
prior_precision	double	precision of the Gaussian prior { $c_p$ }
recorder_times	[double]	time points of parameter recordings*
weight_values	[double]	array of recorded synaptic weight values*
synaptic_parameter_values	[double]	array of recorded synaptic parameter values*
reward_gradient_values	[double]	array of recorded reward gradient values*
eligibility_trace_values	[double]	array of recorded eligibility trace values*
psp_values	[double]	array of recorded psp values*
recorder_interval	double	interval of synaptic recordings [ms]
reset_recorder	bool	clear all recorded values now* (write only)

*) Recorder fields are read only. If reset_recorder is set to true all recorder fields will be cleared instantaneously.

Implementation Details

This connection type is a diligent synapse model, so updates are triggered at regular intervals, which the ConnectionUpdateManager ensures. The state of each synapse consists of the variables $y(t), e(t), g(t), \theta(t), w(t)$ . The variable $y(t)$ is the presynaptic spike train filtered with a PSP kernel $\epsilon(t)$ of the form

$\epsilon(t) \;=\; \frac{\tau_r}{\tau_m - \tau_r}\left( e^{-\frac{t}{\tau_m}} - e^{-\frac{t}{\tau_r}} \right)\;. \hspace{24px} (1)$
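
The kernel in equation (1) can be sketched as a small standalone function. This is an illustration only, not the SPORE implementation: it assumes that the exponents are $-t/\tau_m$ and $-t/\tau_r$ and that the kernel is zero for $t < 0$. The function names psp_kernel and psp_peak_time are hypothetical; the default arguments are the documented defaults of psp_tau_rise and psp_tau_fall.

```cpp
#include <cassert>
#include <cmath>

// Double-exponential PSP kernel following equation (1).
// Assumption: exponents are -t/tau_fall and -t/tau_rise, kernel is 0 for t < 0.
// tau_rise and tau_fall stand for psp_tau_rise and psp_tau_fall [ms].
double psp_kernel(double t, double tau_rise = 2.0, double tau_fall = 20.0)
{
    assert(tau_rise > 0.0 && tau_fall > 0.0 && tau_rise != tau_fall);
    if (t < 0.0)
        return 0.0;
    const double norm = tau_rise / (tau_fall - tau_rise);
    return norm * (std::exp(-t / tau_fall) - std::exp(-t / tau_rise));
}

// Time [ms] at which the kernel peaks, from setting d(epsilon)/dt = 0.
double psp_peak_time(double tau_rise = 2.0, double tau_fall = 20.0)
{
    assert(tau_rise > 0.0 && tau_fall > 0.0 && tau_rise != tau_fall);
    return tau_rise * tau_fall / (tau_fall - tau_rise)
           * std::log(tau_fall / tau_rise);
}
```

With the default time constants the kernel starts at zero, rises to its maximum after roughly 5.1 ms and then decays with the time constant $\tau_m$ .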

A node derived from type TracingNode must be registered to the synapse model at its reward_transmitter parameter. The trace of this node at id dopa_trace_id is used as reward signal $dopa(t)$ . The trace of the postsynaptic neuron with id bap_trace_id is used as back-propagating signal $bap(t)$ . The synapse then solves the following set of differential equations:

$\frac{d e(t)}{dt} \;=\; -\frac{1}{\tau_e} e(t) \,+\, w(t)\,y(t)\,bap(t) \hspace{24px} (2)$

$\frac{d g(t)}{dt} \;=\; -\frac{1}{\tau_g} g(t) \,+\, dopa(t)\,e(t) \,+\, T_g\,d \mathcal{W}_g \hspace{24px} (3)$

$d \theta(t) \;=\; \beta\,\bigg( c_p (\mu - \theta(t)) + c_g\,g(t) + c_e \, dopa(t) \,e(t) \bigg) dt \,+\, \sqrt{ 2 T_\theta \beta } \, d \mathcal{W}_{\theta} \hspace{24px} (4)$

$w(t) \;=\; w_0 \, \exp ( \theta(t) - \theta_0 ) \hspace{24px} (5)$

The precision of the prior in equation (4) relates to the standard deviation as $c_p = 1/\sigma^2$ . Setting $c_p=0$ corresponds to a non-informative (flat) prior.
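
As a hedged illustration of the role of $c_p$ , consider the special case in which the reward terms of equation (4) are switched off ( $c_g = c_e = 0$ ), the noise term is read as $\sqrt{2 T_\theta \beta}\,d\mathcal{W}_{\theta}$ , and the clipping of the parameter is ignored. Under these assumptions the parameter follows an Ornstein-Uhlenbeck process:

$d \theta(t) \;=\; \beta\,c_p\,(\mu - \theta(t))\,dt \,+\, \sqrt{ 2 T_\theta \beta } \, d \mathcal{W}_{\theta}$

Its stationary distribution is Gaussian with mean $\mu$ and variance

$\frac{2 T_\theta \beta}{2 \beta c_p} \;=\; \frac{T_\theta}{c_p} \;=\; T_\theta\,\sigma^2$

The learning rate $\beta$ cancels, so it sets only the speed of the dynamics and not the spread. For $T_\theta = 1$ the parameter samples from the prior itself, while the default temperature of 0.1 gives a stationary standard deviation of $\sqrt{0.1}\,\sigma \approx 0.32\,\sigma$ .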

The differential equations (2-5) are solved using Euler integration. The dynamics of the filtered presynaptic spike train $y(t)$ , the eligibility trace $e(t)$ and the reward gradient $g(t)$ are updated at each NEST time step. The dynamics of $\theta(t)$ and $w(t)$ are updated on a time grid based on weight_update_interval. The synaptic weights remain constant between two updates. The synapse recorder is only invoked after each weight update, which means that recorder_interval must be a multiple of weight_update_interval.

Synaptic parameters are clipped at min_param and max_param. Parameter gradients are clipped at +/- max_param_change. Synaptic weights of synapses for which $\theta(t)$ falls below 0 are clipped to 0 (retracted synapses). If simulate_retracted_synapses is set to false, the simulation of $y(t), e(t)$ and $g(t)$ is not continued for retracted synapses. This means that only the stochastic dynamics of $\theta(t)$ are simulated until the synapse is formed again. During this time, the reward gradient $g(t)$ is fixed to 0. If delete_retracted_synapses is set to true, retracted synapses are removed from the network by the garbage collector of the ConnectionUpdateManager.
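
The two-level update scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the SPORE code. The types SynapseParams and SynapseState and all function names are hypothetical. The mapping of $T_g$ , $w_0$ , $c_e$ , $c_g$ and $c_p$ to gradient_noise, weight_scale, direct_gradient_rate, gradient_scale and prior_precision is inferred from the equations. The values given for prior_mean and prior_precision are placeholders. The noise terms are discretized with an Euler-Maruyama step, and the clipping at max_param_change is applied to the bracketed drift term of equation (4).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical parameter container; field comments name the assumed parameter.
struct SynapseParams {
    double beta = 5e-8;              // learning_rate
    double T_theta = 0.1;            // temperature
    double T_g = 0.0;                // gradient_noise (assumed mapping)
    double tau_g = 50000.0;          // integration_time [ms]
    double tau_e = 1000.0;           // episode_length [ms]
    double theta_0 = 3.0;            // parameter_mapping_offset
    double w_0 = 1.0;                // weight_scale (assumed mapping)
    double c_e = 0.0;                // direct_gradient_rate (assumed mapping)
    double c_g = 1.0;                // gradient_scale (assumed mapping)
    double c_p = 0.0;                // prior_precision (placeholder value)
    double mu = 0.0;                 // prior_mean (placeholder value)
    double min_param = -2.0;
    double max_param = 5.0;
    double max_param_change = 40.0;
};

// Hypothetical state container for e(t), g(t), theta(t) and w(t).
struct SynapseState {
    double e = 0.0, g = 0.0, theta = 0.0, w = 0.0;
};

inline double clip(double x, double lo, double hi)
{
    return std::max(lo, std::min(hi, x));
}

// Equation (5); synapses with theta < 0 are treated as retracted (weight 0).
double weight_from_theta(double theta, const SynapseParams& p)
{
    return theta < 0.0 ? 0.0 : p.w_0 * std::exp(theta - p.theta_0);
}

// One Euler step of equations (2) and (3) with step size dt [ms].
// y, bap and dopa are the current values of y(t), bap(t) and dopa(t).
template <typename Rng>
void step_traces(SynapseState& s, const SynapseParams& p, double y,
                 double bap, double dopa, double dt, Rng& rng)
{
    std::normal_distribution<double> normal(0.0, 1.0);
    const double e_old = s.e;  // use e(t) from before this step in equation (3)
    s.e += dt * (-s.e / p.tau_e + s.w * y * bap);
    s.g += dt * (-s.g / p.tau_g + dopa * e_old)
           + p.T_g * std::sqrt(dt) * normal(rng);
}

// One Euler step of equations (4) and (5) with step size dt_w [ms],
// which plays the role of weight_update_interval.
template <typename Rng>
void step_parameter(SynapseState& s, const SynapseParams& p, double dopa,
                    double dt_w, Rng& rng)
{
    std::normal_distribution<double> normal(0.0, 1.0);
    double drift = p.c_p * (p.mu - s.theta) + p.c_g * s.g + p.c_e * dopa * s.e;
    drift = clip(drift, -p.max_param_change, p.max_param_change);
    s.theta += p.beta * drift * dt_w
               + std::sqrt(2.0 * p.T_theta * p.beta * dt_w) * normal(rng);
    s.theta = clip(s.theta, p.min_param, p.max_param);
    s.w = weight_from_theta(s.theta, p);  // weight stays fixed until next call
}
```

In such a sketch, step_traces would be called once per simulation time step and step_parameter once per weight_update_interval, so that the weight used in equation (2) stays constant between two parameter updates.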

References

[1] David Kappel, Robert Legenstein, Stefan Habenschuss, Michael Hsieh and Wolfgang Maass. A Dynamic Connectome Supports the Emergence of Stable Computational Function of Neural Circuits through Reward-Based Learning. eNeuro, 2018. https://doi.org/10.1523/ENEURO.0301-17.2018

[2] David Kappel, Robert Legenstein, Stefan Habenschuss, Michael Hsieh and Wolfgang Maass. Reward-based self-configuration of neural circuits. 2017. https://arxiv.org/abs/1704.04238

[3] Zhaofei Yu, David Kappel, Robert Legenstein, Sen Song, Feng Chen and Wolfgang Maass. CaMKII activation supports reward-based neural network optimization through Hamiltonian sampling. 2016. https://arxiv.org/abs/1606.00157

Author: David Kappel, Michael Hsieh

See also: TracingNode, DiligentConnectorModel

Constructor & Destructor Documentation

◆ SynapticSamplingRewardGradientConnection() [1/2]

template<typename targetidentifierT >

spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::SynapticSamplingRewardGradientConnection ( )

Default Constructor.

◆ SynapticSamplingRewardGradientConnection() [2/2]

template<typename targetidentifierT >

spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::SynapticSamplingRewardGradientConnection ( const SynapticSamplingRewardGradientConnection< targetidentifierT > & rhs )

Copy Constructor.

◆ ~SynapticSamplingRewardGradientConnection()

template<typename targetidentifierT >

spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::~SynapticSamplingRewardGradientConnection ( )

Destructor.

Member Function Documentation

◆ check_connection()

template<typename targetidentifierT>

void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::check_connection	(	nest::Node &	s,
		nest::Node &	t,
		nest::rport	receptor_type,
		double	t_lastspike,
		const CommonPropertiesType &	cp
	)

inline

Checks if the type of the postsynaptic node is supported. Throws an IllegalConnection exception if the postsynaptic node is not derived from TracingNode.

◆ check_synapse_params()

template<typename targetidentifierT >

void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::check_synapse_params ( const DictionaryDatum & syn_spec ) const

Checks the syn_spec dictionary for parameters that are not allowed for this connection. Issues a warning or throws an error if such a parameter is found.

◆ get_status()

template<typename targetidentifierT >

void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::get_status ( DictionaryDatum & d ) const

Status getter function.

◆ send()

template<typename targetidentifierT >

void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::send	(	nest::Event &	e,
		nest::thread	thread,
		double	t_last_spike,
		const CommonPropertiesType &	cp
	)

Send an event to the postsynaptic neuron. This will update the synapse state and synaptic weights to the current slice origin and send the spike event. This method is also triggered by the ConnectionUpdateManager to indicate that the synapse is running out of date. In this case an invalid rport of -1 is passed and the spike is not delivered to the postsynaptic neuron.

Parameters

e	the spike event.
thread	the id of the connection's thread.
t_last_spike	the time of the last spike.
cp	the synapse type common properties.

◆ set_status()

template<typename targetidentifierT >

void spore::SynapticSamplingRewardGradientConnection< targetidentifierT >::set_status	(	const DictionaryDatum &	d,
		nest::ConnectorModel &	cm
	)

Status setter function.

Note: weight will be overwritten the next time the synapse is updated.

The documentation for this class was generated from the following file:

synaptic_sampling_rewardgradient_connection.h
