Recurrent Spiking Networks Solve Planning Tasks

Publications: Contribution to journal › Article › Research › (peer-reviewed)

Standard

Recurrent Spiking Networks Solve Planning Tasks. / Rückert, Elmar; Kappel, David; Tanneberg, Daniel et al.
In: Scientific reports (e-only), Vol. 6.2016, No. 21142, 21142, 18.02.2016.


Harvard

Rückert, E, Kappel, D, Tanneberg, D, Pecevski, D & Peters, J 2016, 'Recurrent Spiking Networks Solve Planning Tasks', Scientific reports (e-only), vol. 6.2016, no. 21142, 21142. https://doi.org/10.1038/srep21142

APA

Rückert, E., Kappel, D., Tanneberg, D., Pecevski, D., & Peters, J. (2016). Recurrent Spiking Networks Solve Planning Tasks. Scientific reports (e-only), 6.2016(21142), Article 21142. https://doi.org/10.1038/srep21142

Vancouver

Rückert E, Kappel D, Tanneberg D, Pecevski D, Peters J. Recurrent Spiking Networks Solve Planning Tasks. Scientific reports (e-only). 2016 Feb 18;6.2016(21142):21142. doi: 10.1038/srep21142

Author

Rückert, Elmar ; Kappel, David ; Tanneberg, Daniel et al. / Recurrent Spiking Networks Solve Planning Tasks. In: Scientific reports (e-only). 2016 ; Vol. 6.2016, No. 21142.

BibTeX

@article{f8c3160ad7d748e9ab1ec57753380c6c,
title = "Recurrent Spiking Networks Solve Planning Tasks",
abstract = "A recurrent spiking neural network is proposed that implements planning as probabilistic inference for finite and infinite horizon tasks. The architecture splits this problem into two parts: The stochastic transient firing of the network embodies the dynamics of the planning task. With appropriate injected input this dynamics is shaped to generate high-reward state trajectories. A general class of reward-modulated plasticity rules for these afferent synapses is presented. The updates optimize the likelihood of getting a reward through a variant of an Expectation Maximization algorithm and learning is guaranteed to converge to a local maximum. We find that the network dynamics are qualitatively similar to transient firing patterns during planning and foraging in the hippocampus of awake behaving rats. The model extends classical attractor models and provides a testable prediction on identifying modulating contextual information. In a real robot arm reaching and obstacle avoidance task the ability to represent multiple task solutions is investigated. The neural planning method with its local update rules provides the basis for future neuromorphic hardware implementations with promising potentials like large data processing abilities and early initiation of strategies to avoid dangerous situations in robot co-worker scenarios.",
author = "Elmar R{\"u}ckert and David Kappel and Daniel Tanneberg and Dejan Pecevski and Jan Peters",
note = "Funding Information: The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreements No. 248311 (AMARSI) and No. 600716 (CoDyCo). The authors would like to thank Wolfgang Maass, Matthew Botvinick and Tucker Hermans for comments that greatly improved the manuscript. We would also like to thank Brad Pfeiffer and David Foster for the permission to print parts of their inspiring results [10].",
year = "2016",
month = feb,
day = "18",
doi = "10.1038/srep21142",
language = "English",
volume = "6.2016",
journal = "Scientific reports (e-only)",
issn = "2045-2322",
publisher = "Nature Publishing Group",
number = "21142",
}
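
The abstract above describes reward-modulated plasticity rules whose updates optimize the likelihood of reward via a variant of Expectation Maximization. As a purely illustrative sketch (not the paper's model: the winner-take-all layer, the reward definition, and the update rule below are all simplified assumptions), a reward-weighted update on the afferent weights of a stochastic state layer might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy only: a stochastic winner-take-all "state" layer driven
# by afferent weights W from a fixed context input x. All names and sizes
# are assumptions for this sketch, not the paper's architecture.
n_states, n_inputs = 10, 5
goal = 7                                  # hypothetical high-reward state
W = rng.normal(0.0, 0.1, (n_states, n_inputs))
x = rng.random(n_inputs)                  # fixed contextual input

def run_episode(W, T=20):
    """Sample a state trajectory; each state is drawn from a softmax over W @ x."""
    drive = W @ x
    p = np.exp(drive - drive.max())
    p /= p.sum()
    states = [rng.choice(n_states, p=p) for _ in range(T)]
    return states, states.count(goal) / T  # reward: fraction of time at goal

# Reward-weighted Hebbian-style update: each visited state's afferent
# weights move toward x, scaled by the episode reward (a crude stand-in
# for the reward-modulated, EM-derived plasticity rules in the abstract).
eta = 0.5
for _ in range(200):
    states, r = run_episode(W)
    for s in states:
        W[s] += eta * r * x / len(states)

states, r_final = run_episode(W, T=100)
print(r_final)  # stochastic; typically rises above the ~0.1 chance level
```

The design mirrors the EM flavour only loosely: episodes with higher reward contribute larger weight updates, so state visits that correlate with reward are reinforced, and the softmax sampling keeps the dynamics stochastic throughout learning.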

RIS (suitable for import to EndNote)

TY - JOUR

T1 - Recurrent Spiking Networks Solve Planning Tasks

AU - Rückert, Elmar

AU - Kappel, David

AU - Tanneberg, Daniel

AU - Pecevski, Dejan

AU - Peters, Jan

N1 - Funding Information: The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreements No. 248311 (AMARSI) and No. 600716 (CoDyCo). The authors would like to thank Wolfgang Maass, Matthew Botvinick and Tucker Hermans for comments that greatly improved the manuscript. We would also like to thank Brad Pfeiffer and David Foster for the permission to print parts of their inspiring results [10].

PY - 2016/2/18

Y1 - 2016/2/18

N2 - A recurrent spiking neural network is proposed that implements planning as probabilistic inference for finite and infinite horizon tasks. The architecture splits this problem into two parts: The stochastic transient firing of the network embodies the dynamics of the planning task. With appropriate injected input this dynamics is shaped to generate high-reward state trajectories. A general class of reward-modulated plasticity rules for these afferent synapses is presented. The updates optimize the likelihood of getting a reward through a variant of an Expectation Maximization algorithm and learning is guaranteed to converge to a local maximum. We find that the network dynamics are qualitatively similar to transient firing patterns during planning and foraging in the hippocampus of awake behaving rats. The model extends classical attractor models and provides a testable prediction on identifying modulating contextual information. In a real robot arm reaching and obstacle avoidance task the ability to represent multiple task solutions is investigated. The neural planning method with its local update rules provides the basis for future neuromorphic hardware implementations with promising potentials like large data processing abilities and early initiation of strategies to avoid dangerous situations in robot co-worker scenarios.

AB - A recurrent spiking neural network is proposed that implements planning as probabilistic inference for finite and infinite horizon tasks. The architecture splits this problem into two parts: The stochastic transient firing of the network embodies the dynamics of the planning task. With appropriate injected input this dynamics is shaped to generate high-reward state trajectories. A general class of reward-modulated plasticity rules for these afferent synapses is presented. The updates optimize the likelihood of getting a reward through a variant of an Expectation Maximization algorithm and learning is guaranteed to converge to a local maximum. We find that the network dynamics are qualitatively similar to transient firing patterns during planning and foraging in the hippocampus of awake behaving rats. The model extends classical attractor models and provides a testable prediction on identifying modulating contextual information. In a real robot arm reaching and obstacle avoidance task the ability to represent multiple task solutions is investigated. The neural planning method with its local update rules provides the basis for future neuromorphic hardware implementations with promising potentials like large data processing abilities and early initiation of strategies to avoid dangerous situations in robot co-worker scenarios.

UR - http://www.scopus.com/inward/record.url?scp=84959020509&partnerID=8YFLogxK

U2 - 10.1038/srep21142

DO - 10.1038/srep21142

M3 - Article

AN - SCOPUS:84959020509

VL - 6.2016

JO - Scientific reports (e-only)

JF - Scientific reports (e-only)

SN - 2045-2322

IS - 21142

M1 - 21142

ER -