Recurrent Spiking Networks Solve Planning Tasks
Research output: Contribution to journal › Article › Research › peer-review
In: Scientific reports (e-only), Vol. 6.2016, No. 21142, 21142, 18.02.2016.
RIS (suitable for import to EndNote)
TY - JOUR
T1 - Recurrent Spiking Networks Solve Planning Tasks
AU - Rückert, Elmar
AU - Kappel, David
AU - Tanneberg, Daniel
AU - Pecevski, Dejan
AU - Peters, Jan
N1 - Funding Information: The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreements No. 248311 (AMARSI) and No. 600716 (CoDyCo). The authors would like to thank Wolfgang Maass, Matthew Botvinick and Tucker Hermans for comments that greatly improved the manuscript. We would also like to thank Brad Pfeiffer and David Foster for the permission to print parts of their inspiring results (ref. 10).
PY - 2016/2/18
Y1 - 2016/2/18
AB - A recurrent spiking neural network is proposed that implements planning as probabilistic inference for finite and infinite horizon tasks. The architecture splits this problem into two parts: The stochastic transient firing of the network embodies the dynamics of the planning task. With appropriate injected input, these dynamics are shaped to generate high-reward state trajectories. A general class of reward-modulated plasticity rules for these afferent synapses is presented. The updates optimize the likelihood of getting a reward through a variant of the Expectation-Maximization algorithm, and learning is guaranteed to converge to a local maximum. We find that the network dynamics are qualitatively similar to transient firing patterns during planning and foraging in the hippocampus of awake behaving rats. The model extends classical attractor models and provides a testable prediction on identifying modulating contextual information. In a real robot arm reaching and obstacle avoidance task, the ability to represent multiple task solutions is investigated. The neural planning method with its local update rules provides the basis for future neuromorphic hardware implementations with promising potential, such as large-scale data processing and early initiation of strategies to avoid dangerous situations in robot co-worker scenarios.
UR - http://www.scopus.com/inward/record.url?scp=84959020509&partnerID=8YFLogxK
U2 - 10.1038/srep21142
DO - 10.1038/srep21142
M3 - Article
AN - SCOPUS:84959020509
VL - 6.2016
JO - Scientific reports (e-only)
JF - Scientific reports (e-only)
SN - 2045-2322
IS - 21142
M1 - 21142
ER -