Reward-based training of recurrent neural networks for cognitive and value-based tasks

  1. H Francis Song
  2. Guangyu R Yang
  3. Xiao-Jing Wang (corresponding author)
  1. New York University, United States

Abstract

Trained neural network models, which exhibit features of neural activity recorded from behaving animals, may provide insights into the circuit mechanisms of cognitive functions through systematic analysis of network activity and connectivity. However, in contrast to the graded error signals commonly used to train networks through supervised learning, animals learn from reward feedback on definite actions through reinforcement learning. Reward maximization is particularly relevant when optimal behavior depends on an animal's internal judgment of confidence or subjective preferences. Here, we implement reward-based training of recurrent neural networks in which a value network guides learning by using the activity of the decision network to predict future reward. We show that such models capture behavioral and electrophysiological findings from well-known experimental paradigms. Our work provides a unified framework for investigating diverse cognitive and value-based computations, and predicts a role for value representation that is essential for learning, but not executing, a task.
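The role of the value network can be illustrated with a minimal sketch. It assumes, since the abstract does not specify this, that "predicting future reward" means regressing the summed reward from each time step onward onto the decision network's activity, and that the prediction serves as a baseline subtracted from the actual reward. The function names, the linear readout, and the `discount` parameter are all hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def future_rewards(rewards, discount=1.0):
    """Sum of (optionally discounted) rewards from each time step onward."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount * running
        returns[t] = running
    return returns

def _with_bias(activity):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([activity, np.ones((activity.shape[0], 1))])

def fit_value_readout(activity, returns):
    """Hypothetical stand-in for a value network: a least-squares linear
    readout from decision-network activity (time x units) to future reward."""
    weights, *_ = np.linalg.lstsq(_with_bias(activity), returns, rcond=None)
    return weights

def advantages(activity, returns, weights):
    """Actual future reward minus the predicted future reward (the baseline)."""
    return returns - _with_bias(activity) @ weights

# Toy trial: reward is delivered only at the final time step.
rewards = [0.0, 0.0, 1.0]
activity = np.array([[0.1, 0.0], [0.4, 0.2], [0.9, 0.7]])  # made-up activity
returns = future_rewards(rewards)
weights = fit_value_readout(activity, returns)
print(advantages(activity, returns, weights))
```

In this sketch the baseline is used only when computing the learning signal; choosing an action would not require it, which is in the spirit of the abstract's claim that value representation is needed for learning but not for executing a task.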

Article and author information

Author details

  1. H Francis Song

    Center for Neural Science, New York University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Guangyu R Yang

    Center for Neural Science, New York University, New York, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Xiao-Jing Wang

    Center for Neural Science, New York University, New York, United States
    For correspondence
    xjwang@nyu.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0003-3124-8474

Funding

Office of Naval Research (N00014-13-1-0297)

  • H Francis Song
  • Guangyu R Yang
  • Xiao-Jing Wang

Google

  • H Francis Song
  • Guangyu R Yang
  • Xiao-Jing Wang

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Timothy EJ Behrens, University College London, United Kingdom

Version history

  1. Received: September 13, 2016
  2. Accepted: January 12, 2017
  3. Accepted Manuscript published: January 13, 2017 (version 1)
  4. Version of Record published: February 6, 2017 (version 2)

Copyright

© 2017, Song et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.


Cite this article

H Francis Song, Guangyu R Yang, Xiao-Jing Wang (2017)
Reward-based training of recurrent neural networks for cognitive and value-based tasks
eLife 6:e21492.
https://doi.org/10.7554/eLife.21492
