4600 • The Journal of Neuroscience, April 20, 2016 • 36(16):4600–4613

Behavioral/Cognitive

The Medial Orbitofrontal Cortex Regulates Sensitivity to Outcome Value

Shannon L. Gourley,1,2,3* Kelsey S. Zimmermann,1,2,3* Amanda G. Allen,1,3 and Jane R. Taylor4,5,6

1Department of Pediatrics, 2Graduate Program in Neuroscience, and 3Yerkes National Primate Research Center, Emory University, Atlanta, Georgia 30329, and 4Interdepartmental Neuroscience Program and Departments of 5Psychiatry and 6Psychology, Yale University, New Haven, Connecticut 06520

An essential component of goal-directed decision-making is the ability to maintain flexible responding based on the value of a given reward, or “reinforcer.” The medial orbitofrontal cortex (mOFC), a subregion of the ventromedial prefrontal cortex, is uniquely positioned to regulate this process. We trained mice to nose poke for food reinforcers and then stimulated this region using CaMKII-driven Gs-coupled designer receptors exclusively activated by designer drugs (DREADDs). In other mice, we silenced the neuroplasticity-associated neurotrophin brain-derived neurotrophic factor (BDNF). Activation of Gs-DREADDs increased behavioral sensitivity to reinforcer devaluation, whereas Bdnf knockdown blocked sensitivity. These changes were accompanied by modifications in breakpoint ratios in a progressive ratio task, and they were recapitulated in Bdnf+/− mice. Replacement of BDNF selectively in the mOFC in Bdnf+/− mice rescued behavioral deficiencies, as well as phosphorylation of extracellular signal-regulated kinase 1/2 (ERK1/2). Thus, BDNF expression in the mOFC is both necessary and sufficient for the expression of typical effort allocation relative to an anticipated reinforcer. Additional experiments indicated that expression of the immediate-early gene c-fos was aberrantly elevated in the Bdnf+/− dorsal striatum, and BDNF replacement in the mOFC normalized expression.
Also, systemic administration of a MAP kinase kinase inhibitor increased breakpoint ratios, whereas the addition of discrete cues bridging the response–outcome contingency rescued breakpoints in Bdnf+/− mice. We argue that BDNF–ERK1/2 in the mOFC is a key regulator of “online” goal-directed action selection.

Key words: cue; dorsal striatum; neurotrophin; operant; orbital; progressive ratio

Significance Statement
Goal-directed response selection often involves predicting the consequences of one’s actions and the value of potential payoffs. Lesions or chemogenetic inactivation of the medial orbitofrontal cortex (mOFC) in rats induces failures in retrieving outcome identity memories (Bradfield et al., 2015), suggesting that the healthy mOFC serves to access outcome value information when it is not immediately observable and thereby guide goal-directed decision-making. Our findings suggest that the mOFC also bidirectionally regulates effort allocation for a given reward and that expression of the neurotrophin BDNF in the mOFC is both necessary and sufficient for mice to sustain stable representations of reinforcer value.

Received Nov. 23, 2015; revised March 4, 2016; accepted March 8, 2016.
Author contributions: S.L.G. and J.R.T. designed research; S.L.G., K.S.Z., and A.G.A. performed research; S.L.G. and J.R.T. analyzed data; S.L.G., K.S.Z., A.G.A., and J.R.T. wrote the paper.
This work was supported by National Institutes of Health Grants DA011717, DA027844 (J.R.T.), and MH101477 (S.L.G.), the Children’s Center for Neuroscience Research (S.L.G.), and the Connecticut Department of Mental Health and Addiction Services (J.R.T.). The Emory Viral Vector Core is supported by National Institute of Neurological Disorders and Stroke Core Facilities Grant P30NS055077. The Yerkes National Primate Research Center is supported by P51OD011132. We thank Alexia Kedves, Tendi Hungwe, and Courtni Andrews for valuable assistance and the Duman laboratory for providing Bdnf+/− mice used here. We also thank Dr. Glenn Schafe for critical comments on this manuscript and Dr. R. Jude Samulski at the University of North Carolina Viral Vector Core.
*S.L.G. and K.S.Z. contributed equally to this work.
Correspondence should be addressed to Dr. Shannon L. Gourley, Yerkes National Primate Research Center, 954 Gatewood Drive NE, Atlanta, GA 30329. E-mail: shannon.l.gourley@emory.edu.
DOI:10.1523/JNEUROSCI.4253-15.2016
Copyright © 2016 the authors 0270-6474/16/364600-14$15.00/0
Gourley, Zimmermann et al. • The Role of BDNF in the Medial Orbital Cortex

Introduction
An essential component of goal-directed decision-making is assessing the value of a reinforcer and engaging response strategies accordingly. Electrophysiological studies in nonhuman primates and neuroimaging studies in humans suggest that the ventral and medial orbitofrontal cortex (mOFC) encodes the value of rewards during real or hypothetical tasks (Arana et al., 2003; Paulus and Frank, 2003; Padoa-Schioppa and Assad, 2006, 2008; Plassmann et al., 2007; Valentin et al., 2007). Furthermore, mOFC neurons are sensitive to satiety-specific reinforcer devaluation and notably more so than centrolateral OFC neurons (Bouret and Richmond, 2010). This observation is consistent with the perspective that, across species, ventromedial prefrontal cortical structures are concerned with representations of outcome value, as determined by internal means such as hunger/satiety, as well as inhibitory control processes; meanwhile, the lateral OFC integrates multisensory information to build representations of task states to guide optimal response selection strategies (Wallis, 2012; Wilson et al., 2014; Stalnaker et al., 2015; Gourley and Taylor, 2016).

What are the molecular mechanisms regulating mOFC function? One candidate is BDNF.
BDNF belongs to a family of structurally and functionally related peptide growth factors and is expressed throughout the brain. BDNF facilitates synaptic transmission (Kang and Schuman, 1995, 1996) and long-term potentiation (LTP) in the adult hippocampus (Figurov et al., 1996; Korte et al., 1996; Patterson et al., 1996). In the cerebral cortex, BDNF regulates AMPA receptor subunit expression, experience-dependent synaptogenesis, and dendritic modeling (McAllister et al., 1995, 1997; Gorski et al., 2003; Genoud et al., 2004; Nakata and Nakamura, 2007). Bdnf−/− mutant mice do not survive to adulthood, but Bdnf+/− mutants are viable and grossly normal on multiple tests of “emotionality” and memory (Montkowski and Holsboer, 1997; but see Linnarsson et al., 1997; MacQueen et al., 2001; Chourbaji et al., 2004). Nonetheless, hippocampal LTP is impaired in these animals (Korte et al., 1995), indicating that even incomplete loss of BDNF has functional consequences.

To assess the role of mOFC BDNF in reward-related decision-making, we first turned to reinforcer devaluation assays. In both primates and rodents, sensitivity to outcome value can be quantified using these procedures, e.g., allowing mice that have been trained previously to respond for a food reinforcer unconditional access to the food before testing. Sensitivity to outcome value is reflected by diminished responding for the now-devalued food (Colwill and Rescorla, 1986; Dickinson, 1980; Balleine and O’Doherty, 2010). In another task, the progressive ratio (PR) schedule of reinforcement can quantify perceived outcome value using instrumental response requirements that progressively increase with each reinforcer delivery (Hodos, 1961).
In this case, the highest response/reinforcer ratio (the “breakpoint ratio”) can serve as an indicator of perceived “value.”

We report that viral-mediated mOFC-selective Bdnf knockdown decreases behavioral sensitivity to reinforcer devaluation and inflates responding on a PR schedule of reinforcement. Constitutive Bdnf+/− mice also generate aberrant breakpoints, and mOFC-selective BDNF replacement in these mice rescues behavioral abnormalities and immediate-early gene expression in the downstream striatum.

Goal-directed response selection often involves predicting the consequences of one’s actions and the value of future outcomes. Bradfield et al. (2015) recently reported that the mOFC sustains goal-directed behavior by retrieving outcome identity information to guide response strategies when reward value is not immediately observable (e.g., during certain stages of reinforcer devaluation tasks). Our data suggest that this property also helps to gate appropriate effort allocation (as in the PR task) and that the neurotrophin BDNF is a critical molecular substrate mediating these mOFC functions. To further support these perspectives, we additionally report that sensitivity to outcome value and PR schedules of reinforcement can be enhanced by stimulating CaMKII-driven Gs-coupled excitatory designer receptor exclusively activated by designer drugs (DREADDs) in the mOFC. These findings raise the possibility that the mOFC could be a viable target for therapeutic interventions aimed at correcting, normalizing, or enriching goal-directed behaviors. Examples include therapies aimed at bringing about behavioral change in depression or addiction, illnesses commonly characterized by behavioral rigidity and inflexibility.

Materials and Methods

Subjects
Mice included the following: (1) adult male C57BL/6 mice (10–12 weeks old) obtained from Charles River or The Jackson Laboratory; (2) Bdnf+/− and littermate wild-type controls bred on a C57BL/6 background (The Jackson Laboratory); or (3) adult male mice homozygous for a floxed Bdnf gene (exon V) bred on a mixed BALB/c background (The Jackson Laboratory). Mutant mice were tested between 10 weeks and 6 months of age, at which point body weights in Bdnf+/− animals significantly increase (Kernie et al., 2000), even with food restriction (S.L.G. and J.R.T., unpublished observations). Genotypes were determined by PCR of tail tissue and, in the case of Bdnf+/− mice, confirmed by postmortem analysis of homogenized hippocampal tissue by BDNF ELISA (25 µl/well; Promega; methods below). Mice were food-restricted during instrumental conditioning, maintaining ~93% original body weight unless otherwise noted. All tests were conducted during the light phase of the 12 h light cycle (7:00 A.M. lights on). The Yale and Emory University Animal Care and Use Committees approved procedures as appropriate.

Viral vector delivery
AAV8–CaMKII–HA–rM3D(Gs)–IRES–mCitrine (AAV–Gs–DREADD–mCitrine) or AAV8–CaMKII–GFP (AAV–GFP) viral vectors were generated by the University of North Carolina Viral Vector Core and infused into wild-type mice. Lentiviral vectors expressing GFP or Cre recombinase (Cre) under the CMV promoter were generated by the Emory University Viral Vector Core and infused into floxed Bdnf mice. Mice were anesthetized with ketamine/dexdomitor. With needles centered at bregma, stereotaxic coordinates were located on the leveled skull using a digitized stereotaxic frame. Viral vectors were delivered to +2.3 mm anteroposterior (AP), −2.8 mm dorsoventral (DV), and ±0.1 mm mediolateral (ML; Gourley et al., 2010) in a volume of 0.5 µl/side.
The microsyringe remained in place for 5 min after infusion. Mice were sutured and allowed to recover for at least 3 weeks before behavioral testing. After testing, mice were deeply anesthetized and transcardially perfused with 4% paraformaldehyde, then brains were sectioned into 40 µm sections, and mCitrine or GFP was imaged to confirm that infusions infected the mOFC.

Clozapine-N-oxide injection in DREADDs-expressing mice and general experimental design
The DREADDs ligand clozapine-N-oxide (CNO; 1 mg/kg, i.p., in 2% DMSO and PBS; Sigma) was prepared fresh daily and administered immediately before the 30 min prefeeding period for devaluation experiments and then again 30 min before testing for PR experiments. A final injection was delivered 30 min before a 1 h locomotor monitoring test. The 30 min inject-to-test interval allowed CNO time to penetrate the brain. All mice received CNO, which increases the excitability of infected neurons in Gs–DREADDs-expressing mice but does not affect control GFP-expressing neurons (Urban and Roth, 2015).

In these experiments, the acute stress resulting from injection would be expected to reduce sensitivity to reinforcer devaluation (Schwabe and Wolf, 2011), as was indeed observed. Mice were accordingly trained for four additional sessions and then retested in the devaluation procedure to confirm that they could ultimately develop sensitivity to reinforcer devaluation with habituation to the injection.

Instrumental response training
Instrumental response training was conducted as reported previously (Gourley et al., 2008a,b). Briefly, experimenters used standard aluminum operant conditioning chambers for mice (16 × 14 × 12.5 cm) controlled by MedPC software and equipped with two or three nose-poke recesses (Med-Associates). Each chamber was housed in a sound-attenuating outer chamber equipped with a white noise generator, fan, and house light. A dispenser delivered grain-based food pellets (20 mg; BioServ) into the magazine.
Head entries into the one active nose-poke aperture and magazine were detected by photocell. Mice were initially trained to perform the operant response over several 25 min sessions, during which one, two, or three responses resulted in food reinforcement (a variable ratio 2 schedule of reinforcement). Once responding stabilized, experiments proceeded as indicated below. Response rates were compared by two-factor (genotype or treatment × session) ANOVA with repeated measures (RM). In the case of interactions, Tukey’s post hoc comparisons were used for all experiments.

Satiety-specific reinforcer devaluation
To investigate behavioral sensitivity to reinforcer devaluation, trained mice were allowed unlimited access to the reinforcer pellets in a clean cage for 30 min immediately before a 15 min probe test conducted in extinction. Response rates were compared by two-factor ANOVA to responding in a session when ad libitum access to the reinforcer pellets had not been available, but rather mice were given regular chow. These “devalued” (pellets) and “value” (chow) sessions were counterbalanced, with one test per day on sequential days. In one experiment, a main effect of testing order indicated that mice extinguished responding during the second test regardless of devalued or value condition. To address this issue, response rates after the first prefeeding period are compared with those generated on the final day of training, referred to as “baseline.”

PR test
PR testing was conducted in mice trained to nose poke as described above. We then applied a linearly increasing response/reinforcement requirement (1, 5, 9, x + 4 responses/reinforcement). The test ended when no active responses were emitted for 5 min or, if the mouse had not “timed out” by then, after 4 h.
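As an illustration only (this is not the MedPC program used in these experiments), the linearly increasing schedule can be sketched in Python. The function names `pr_requirements` and `breakpoint_ratio` are hypothetical, and the sketch assumes that every active response counts toward the current requirement and that the breakpoint ratio is the last requirement completed before the session ends.

```python
from itertools import count, islice


def pr_requirements(start=1, step=4):
    """Yield the response requirement for each successive reinforcer: 1, 5, 9, ..."""
    for n in count():
        yield start + step * n


def breakpoint_ratio(total_active_responses, start=1, step=4):
    """Return the highest requirement completed with the given number of responses.

    Assumption: responses are spent on requirements in order, and a partially
    completed requirement does not count. Returns 0 if no reinforcer was earned.
    """
    remaining = total_active_responses
    last_completed = 0
    for requirement in pr_requirements(start, step):
        if remaining < requirement:
            return last_completed
        remaining -= requirement
        last_completed = requirement


# First four requirements of the schedule.
print(list(islice(pr_requirements(), 4)))  # [1, 5, 9, 13]

# 28 responses complete the 1, 5, 9, and 13 requirements exactly.
print(breakpoint_ratio(28))  # 13
```

Under these assumptions, a mouse that emits 27 active responses would have a breakpoint ratio of 9, because the 13-response requirement was not completed.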
The highest response/reinforcement ratios achieved, termed breakpoint ratios, were compared by two-factor (genotype × session) RM-ANOVA or t test as appropriate. For mice used in viral vector experiments, this test followed reinforcer devaluation testing to conserve animal usage. In Bdnf+/− mice, independent groups were used for devaluation and PR tests. Breakpoint ratios >2 SDs outside of the group means were excluded (Gourley et al., 2008a). Throughout all experiments, these exclusions were as follows: n = 1 lenti-GFP; n = 3 intact wild-type; n = 1 Bdnf+/− plus vehicle; n = 2 Bdnf+/− plus BDNF. Additionally, two Bdnf knockdown mice were excluded because of experimenter error.

The PR task is commonly used as an assay of “reward value.” Measuring post-reinforcement pausing (PRP) in animals with differing breakpoint ratios can, however, determine whether differences in primary motivation can instead account for group differences (Skjoldager et al., 1993). For example, with increased food restriction, presumably increasing “motivation,” PRPs shorten, whereas changes in reinforcer value (e.g., one pellet vs three pellets) do not affect PRPs (Skjoldager et al., 1993). Thus, this metric (the time between magazine head entry after reinforcer delivery and initiation of the next trial) was also compared between wild-type and Bdnf+/− mice, which experience adult-onset obesity (Kernie et al., 2000). The first, median, and final pauses for each session were extracted and compared by RM-ANOVA. The ratio of responses on the active versus inactive apertures was also compared by two-factor RM-ANOVA to verify that mice distinguished the active from inactive apertures.

BDNF microinfusion
For BDNF microinfusions, mice were experimentally naive Bdnf+/− and littermate wild-type mice trained to perform the instrumental response as described and then subjected to three PR test sessions.
Based on breakpoint ratios, mice were matched and assigned to groups (wild-type plus vehicle, wild-type plus BDNF, Bdnf+/− plus vehicle, Bdnf+/− plus BDNF). Mice were then anesthetized with 1:1 2-methyl-2-butanol and tribromoethanol diluted 40-fold with saline. The head was shaved and placed in a digitized stereotaxic frame. The scalp was incised, skin retracted, and bregma and lambda identified. The head was leveled, and recombinant human BDNF (Millipore Bioscience Research Reagents) dissolved in saline (0.4 µg/µl; Gourley et al., 2008b, 2009a, 2012) was infused into the mOFC (+2.3 mm AP, −2.8 mm DV, ±0.1 mm ML; Gourley et al., 2010) in a volume of 0.2 µl over 6 min using a digital coordinate system with 1/100 mm resolution (David Kopf Instruments). Pilot experiments indicated that a single infusion of BDNF affects PR responding up to 8 d after infusion, so mice here were allowed 4 d recovery, followed by 3 consecutive days of PR testing, with a single PR session per day. Each animal’s breakpoint ratios were averaged, yielding a single value per mouse, and these values were compared between groups by two-factor (genotype × infusion) ANOVA.

Mice were killed by rapid decapitation immediately after the last session, brains were rapidly sectioned, bilateral needle entry sites were documented, and tissues were frozen on dry ice for Western blot analyses. Additional groups of behaviorally naive wild-type mice were infused and killed 24 h after infusion for Western blot and immunostaining analyses.

Western blotting
A single experimenter dissected frozen tissue punches from the ventromedial prefrontal cortex (vmPFC), dorsal striatum, dorsal hippocampus, and nucleus accumbens (NAc) core and shell using 1.2 and 0.50 mm tissue cores (Fine Science Tools). vmPFC samples were collected with a single midline punch with the tissue core aimed at the rostroventral-most part of the vmPFC, containing the mOFC.
Samples likely included both mOFC and ventral prelimbic PFC to generate sufficient protein concentrations for multiple blots. Hence, these samples are referred to as “vmPFC” tissue. Tissue was sonicated in lysis buffer [75–200 µl: 137 mM NaCl, 20 mM Tris-HCl, pH 8, 1% Igepal, 10% glycerol, and 1:100 Phosphatase Inhibitor Cocktails 1 and 2 (Sigma)] and stored at −80°C. Protein concentrations were determined using a Bradford colorimetric assay (Pierce), and 20 µg of each sample was separated by SDS-PAGE on an 8–16% gradient Tris-glycine gel (Invitrogen). After transfer to nitrocellulose membrane, blots were blocked with 5% nonfat milk for 1 h.

The following primary antibodies were used: anti-phosphorylated (p) ERK1/2 (mouse; 1:1000; Cell Signaling Technology), anti-ERK1/2 (rabbit; 1:2000; Cell Signaling Technology), anti-trkB (mouse; 1:1000; BD Biosciences), anti-p-trk (rabbit; 1:500; Cell Signaling Technology), anti-GluR1 (rabbit; 1:500; Millipore Bioscience Research Reagents), and anti-c-fos (rabbit; 1:500; Santa Cruz Biotechnology). Membranes were incubated with primary antibodies at 4°C for 1 h or overnight and then incubated with IRDye 700 Dx Anti-Rb IgG and IRDye 800 Dx Anti-Ms IgG (both 1:5000; Rockland Immunochemicals) for 1 h. Bands were then quantified using infrared densitometry analysis (Odyssey Infrared Imaging System). Membranes were reprobed with anti-GAPDH, which served as a loading control (mouse; 1:20,000; Advanced Immunochemical). pERK1/2 was normalized to total ERK1/2, which was not changed in any comparison. Infrared values were converted to a percentage of control samples from the same membrane to control for variance between gels. Group means were then compared by t test or two-factor ANOVA as appropriate.

BDNF ELISA
Fresh frozen brain tissue was dissected and homogenized as for Western blotting experiments.
BDNF was quantified by ELISA (Promega) in duplicate in accordance with the instructions of the manufacturer except for exclusion of the extraction step. BDNF concentrations were normalized to total protein concentrations in each sample.

pERK1/2 immunostaining
One group of wild-type mice was infused with BDNF or saline as described. Then, 24 h later, brains were harvested and stored in 4% paraformaldehyde for 48 h and then transferred to 30% w/v sucrose before being sectioned into 45 µm sections. Sections were blocked in a PBS solution containing 2% normal goat serum, 1% bovine serum albumin (BSA), and 0.3% Triton X-100 (Sigma) for 1 h at room temperature. Sections were then incubated in primary antibody solution containing 0.3% normal goat serum, 1% BSA, and 0.3% Triton X-100 at 4°C for 48 h. pERK1/2 (1:400; Cell Signaling Technology) served as the primary antibody. Sections were incubated in secondary antibody solution containing 0.5% normal goat serum and 0.3% Triton X-100, with Alexa Fluor 633 (1:200; Life Technologies) serving as the secondary antibody. Sections were imaged on a Nikon 4550s SMZ18 microscope with settings held constant. Concentric rings 35 µm apart were generated in NIH ImageJ with the center of the first ring positioned at the base of the infusion site. Integrated intensity was measured along the perimeter of each ring to assess the spread of ERK1/2 phosphorylation after BDNF infusion, and values were compared by RM-ANOVA. Each mouse contributed a single value.

Experimental design for experiments using a MAP kinase kinase inhibitor
To evaluate whether suppressing MAP kinase kinase (MEK), a kinase directly upstream of ERK1/2, could regulate PR responding, intact wild-type mice were trained to perform an instrumental response for food reinforcement as described.
During this period, mice were also habituated to injection by nightly handling and mock intraperitoneal injection to ensure injections before PR tests did not interfere with performance. Mice were then matched based on reinforcers acquired during training and assigned to either vehicle or PD0184161 (5-bromo-2-[(2-chloro-4-iodophenyl)amino]-N-(cyclopropylmethoxy)-3,4-difluorobenzamide; 30 mg/kg, generously provided by Dr. David Russell, Yale University, New Haven, CT) groups. Because the drug necessitated 100% DMSO for dissolution, mice were maintained at 29–31 g body weight to allow for accurate injection of a very small injection volume (30 µl/mouse). In contrast to other experiments, this translates to maintaining mice at ~100% of their original body weight. PD0184161 was administered within 48 h of dissolution and kept at 4°C when not in use. Thirty minutes before the PR tests, mice were injected with either drug or vehicle alone. An additional control group was injected at the end of the session to evaluate whether MEK inhibition interfered with postsession memory consolidation. Breakpoint ratios were compared by RM-ANOVA. Three weeks after the last test, mice were tested without drug to identify whether PD0184161 had long-term consequences.

Locomotor monitoring
Ambulation in a clean cage was quantified using the automated Omnitech Digiscan Micromonitor system equipped with 16 photocells or a customized Med-Associates locomotor monitoring system identically equipped with 16 photocells. Mice were food restricted overnight to recapitulate locomotor activity levels during instrumental response training and testing. Consecutive photobeams broken across 60 min were compared by two-factor (group × time bin) RM-ANOVA. Mice expressing DREADDs and their GFP-expressing counterparts were administered CNO and then placed in the locomotor monitoring chambers 30 min later.
Given evidence that the mOFC may regulate repetitive behavior (Ahmari et al., 2013), ambulatory counts were segregated from repetitive interruption of the same photobeam, an indicator of stereotypy-like behavior. Both types of photobeam breaks were compared between groups by RM-ANOVA.

Additional behavioral testing in drug-naive mice
Cued reinforcer delivery. A naive group of Bdnf+/− and wild-type mice was used. We added discrete stimuli (a 2 s, 2.9 kHz tone and extinction of the house light during the 2 s between nose-poke response and food pellet delivery) signaling reinforcer delivery. These stimuli were delivered during both training and PR testing. Otherwise, training, testing, and analytic procedures were identical.

Extinction conditioning. In trained mice, responding in the absence of reinforcement (extinction) was evaluated. Here, the tube connecting the food hopper and magazine was disconnected; daily tests were otherwise identical to those during response training. Response rates were compared by two-factor (group × session) RM-ANOVA.

Sucrose consumption. Mice were habituated to drinking a 1% (w/v) sucrose (Sigma) solution in place of water for 2 d as part of an experimental protocol that was aimed at evaluating animals’ hedonic response to a desirable food product. Mice were next fluid deprived for 4 h, followed by 1 h access to the sucrose solution. The deprivation periods were then extended to 14 and 19 h to habituate mice to water restriction. Finally, each mouse was allowed 1 h access to the solution in its home cage, while cage mates were housed in a clean cage in the colony room. Each mouse had 1 h access such that the average deprivation period was 14 h. ANOVA with testing order as the independent measure confirmed no effects on consumption (F < 1). Consumption values were then normalized to body weight and compared by t test.
This procedure has been shown previously to be sensitive to other manipulations (Gourley et al., 2008b, 2013a; Gourley and Taylor, 2009).

Results

Stimulation of the mOFC enhances behavioral sensitivity to reinforcer devaluation
We first expressed CaMKII-driven AAV–GFP in the mOFC. GFP was mostly contained within the mOFC (Fig. 1a). Within the downstream dorsal striatum, labeled terminals were confined to the medial-most aspects of the rostral striatum, adjacent to the lateral ventricles (Fig. 1b). This innervation pattern is highly consistent with previous reports (Schilman et al., 2008; Hoover and Vertes, 2011). Innervation of the amygdala was also consistent with previous reports (Hoover and Vertes, 2011; Fig. 1c); relative to projections of the infralimbic and prelimbic cortices (McDonald et al., 1996), the mOFC has sparse innervation of the lateral capsular division of the central nucleus of the amygdala (CeA). Rather, projections mostly avoid the CeA, instead innervating the ventral portion of the lateral nucleus of the amygdala and the medial aspects of the basal nucleus (Fig. 1c).

Next, CaMKII-driven AAV–Gs–DREADD–mCitrine or AAV–GFP was infused into the mOFC of separate mice. Fluorescence distribution is represented in Figure 1d, with the majority of infusions selective to the mOFC. Mice were trained to nose poke for food pellets. We identified no group differences in response acquisition (interaction, F(5,40) = 1.7, p = 0.15; effect of group, F < 1; Fig. 1e). We then delivered the DREADDs ligand CNO via systemic injection to all mice, regardless of viral vector, and gave the mice access to food ad libitum: the reinforcer pellets in one session and regular chow in another session. Throughout, food intake did not differ between groups (t(8) = −1.44, p = 0.19; Fig. 1f). Mice were then placed in the operant conditioning chambers.
Control mice generated robust response rates throughout, insensitive to reinforcer devaluation. In contrast, Gs–DREADDs-expressing mice inhibited responding after prefeeding with the reinforcer pellets (session × group, F(1,8) = 7.6, p = 0.03; Fig. 1g). Thus, Gs–DREADDs-mediated mOFC stimulation enhanced behavioral sensitivity to reinforcer devaluation.

Mice were then trained for four additional sessions (Fig. 1e), and the reinforcer devaluation procedure was repeated. With this additional identical test (and habituation to the injection stressor), all mice displayed sensitivity to reinforcer devaluation as expected, inhibiting responding after pellet prefeeding (main effect, F(1,16) = 22.9, p < 0.001; Fig. 1h). These findings indicate that CNO does not itself impair sensitivity to prefeeding devaluation (i.e., in GFP control mice) but rather enhances sensitivity to reinforcer devaluation in Gs–DREADDs-expressing mice. Activation of mOFC Gs–DREADDs also does not appear to obviously affect extinction conditioning because overall response rates were indistinguishable between groups during the probe tests (Fig. 1g,h), which are conducted in extinction.

We hypothesized that stimulation of the mOFC may increase sensitivity to effort requirements, i.e., the effort required to obtain a reinforcer, relative to the value of the reinforcer. Thus, we tested the same mice in a PR task. In this test, the response requirements progressively increase for a reinforcer of fixed value. Consistent with our hypothesis, activation of Gs–DREADDs decreased breakpoint ratios (t(14) = 2.2, p = 0.04; Fig. 1i).

There is some evidence that hyperexcitation of the mOFC causes repetitive stereotypy-like behaviors (Ahmari et al., 2013). Previous experiments used optogenetic stimulation of the mOFC rather than DREADDs approaches, which instead regulate the firing threshold of neurons (Urban and Roth, 2015).
We quantified spontaneous locomotor activity for 1 h 30 min after CNO administration (matching the timing of devaluation and PR testing). Mice engaged in ambulatory behavior more than repetitive stereotypy-like behavior in general (F(1,28) = 103, p < 0.001), but we identified no group differences in either ambulation or repetitive stereotypy-like locomotor counts (F < 1; Fig. 1j). This same locomotor monitoring system has been used to document changes in psychostimulant-elicited locomotion and stereotypy-like behavior (Gourley et al., 2009b), increasing confidence in this null result.
Figure 1. Chemogenetic stimulation of the mOFC enhances behavioral sensitivity to reinforcer value and PR response requirements. a, Viral vectors expressing GFP were infused into the mOFC, annotated by the anatomical boundaries outlined at right. b, Fluorescing axons were detectable in a stereotyped pattern hugging the medial wall of the dorsal striatum, particularly in the rostral portion highlighted by the gray dashed lines (cf. Schilman et al., 2008). c, Terminals were also detected in the medial compartment of the basal amygdala. The corresponding coronal sections are shown at right. Blue boxes outline the areas shown in the photomicrographs. d, Infection spread for viral vector experiments is represented on images from the Mouse Brain Library (Rosen et al., 2000), with black indicating the largest spread and white the smallest. e, Mice expressing CaMKII-driven AAV–GFP or AAV–Gs–DREADD–mCitrine acquired the instrumental response. Breaks in the response acquisition curves indicate tests for behavioral sensitivity to reinforcer devaluation. f, Mice were fed the reinforcer pellets or regular chow ad libitum before probe tests (devalued and value conditions). Groups did not differ in food consumption. g, Activation of Gs–DREADDs augmented behavioral sensitivity to reinforcer devaluation, decreasing response rates. Meanwhile, control mice did not modify their response patterns. h, A second experience with the reinforcer devaluation procedure and injection stress ultimately resulted in response inhibition in both groups (n = 4–6 per group). i, Activating Gs–DREADDs also reduced breakpoints in a PR test (n = 6–10 per group). j, Locomotor activity was not affected by Gs–DREADDs stimulation. Ambulatory and repetitive photobeam breaks are represented in 5 min bins (left) and 1 h total counts (right). Mice in both instrumental conditioning experiments were tested. Bars and symbols represent means ± SEMs. *p < 0.05, **p < 0.001.
Selective Bdnf knockdown in the mOFC decreases behavioral sensitivity to reinforcer devaluation
Excitatory pyramidal neurons, those targeted by CaMKII-driven AAV–Gs–DREADD, are a primary source of the proplasticity neurotrophin BDNF in the cortex. We hypothesized that BDNF may be a molecular mechanism by which the mOFC regulates reward-related decision-making. To test this perspective, we next reduced expression of Bdnf in the mOFC using a viral vector approach (for representation of viral vector spread, see Fig. 1d). Subsequently, mice acquired the nose-poke response with no group differences (interaction and main effect, F < 1; Fig. 2a). After pellet prefeeding (and in the absence of the injection stressor used in the studies above), control GFP-expressing mice decreased response rates as expected. Meanwhile, mice with selective Bdnf knockdown failed to modify response rates (interaction, F(1,13) = 5.3, p = 0.04; Fig. 2b), indicating insensitivity to reinforcer devaluation. Instead, these mice responded identically to those that had been prefed with regular chow, leaving the value of the food pellet intact (value groups). (Note that response rates generated during the probe tests were compared with each animal’s own baseline because an effect of testing order in this experiment indicated that mice extinguished responding during a second probe test not shown, regardless of devalued or value condition.)
Figure 2. Selective Bdnf knockdown in the mOFC decreases behavioral sensitivity to reinforcer value and PR response requirements. a, Cre-expressing viral vectors were infused into the mOFC (as in Fig. 1d) of “floxed” Bdnf mice. Mice were trained to nose poke for food reinforcers, with no differences between groups. b, After prefeeding devaluation (and in the absence of the injection stressor used in Fig. 1), control GFP-expressing mice decreased response rates relative to baseline. Meanwhile, selective Bdnf knockdown mice failed to modify response rates. Instead, response rates were indistinguishable from those generated by mice that had access to regular chow before test (value groups). c, mOFC Bdnf knockdown also increased breakpoint ratios. d, Meanwhile, response extinction was not affected. e, Furthermore, spontaneous locomotor activity was unaffected. Ambulatory and repetitive photobeam breaks are represented in 5 min bins (left) and 1 h total counts (right). Symbols represent means ± SEMs. *p < 0.05. n = 8 per group.
mOFC-selective Bdnf knockdown mice also generated higher breakpoint ratios in a PR test (t(13) = −2.1, p = 0.05; Fig. 2c), again the opposite pattern relative to mOFC Gs–DREADDs-expressing mice. This could not obviously be attributed to resistance to extinction because selective Bdnf knockdown did not affect extinction conditioning (main effect of session, F(2,26) = 14.2, p < 0.001; effect of group and interaction, F < 1; Fig. 2d).
The lack of effect on extinction conditioning replicates our previous findings on this topic (Gourley et al., 2009a), and this pattern overall is consistent with the suggestion that the mouse mOFC, like the primate mOFC, regulates behavioral sensitivity to outcome value. As with the Gs–DREADDs mice, locomotor activity counts did not differ between groups (ambulation, F(1,14) = 1.7, p = 0.2; repetitive photobeam breaks, F(1,14) = 1.2, p = 0.3; interactions, F < 1; Fig. 2e).
Bdnf+/− mice are behaviorally insensitive to reinforcer devaluation
Our findings suggest that mOFC BDNF serves as an inhibitory brake on reward-related responding. Bdnf+/− mice are viable, meaning that we were next able to assess whether these mice develop the same phenotype as selective Bdnf knockdown mice and whether it could be rescued by selective replacement of BDNF in the mOFC. We first confirmed that brain (hippocampal) BDNF expression in Bdnf+/− mice was approximately half that in wild-type mice as expected (p < 0.001; Fig. 3a). Mice in this experiment successfully acquired the food-reinforced instrumental response. We detected no differences between groups (F < 1; Fig. 3b). Food consumption during the prefeeding periods was also unaffected by genotype (t(26) = 1.2, p = 0.25; Fig. 3c). Wild-type mice subsequently reduced responding associated with the now devalued reinforcer, but Bdnf+/− mice failed to inhibit responding, showing insensitivity to reinforcer devaluation (interaction, F(1,26) = 4.11, p = 0.05; within-group post hoc p = 0.8; Fig. 3d). In addition to unchanged food intake during ad libitum feeding (Fig. 3c), consumption of a palatable sucrose solution was also indistinguishable between genotypes (p = 0.6; Fig. 3e). This pattern suggests that adult Bdnf+/− mice are behaviorally insensitive to reinforcer devaluation, as opposed to, for example, the hedonic valence of the reinforcer. This is an important distinction, given that Bdnf+/− mice develop late-life obesity (Kernie et al., 2000).
Figure 3. Bdnf+/− mice are behaviorally insensitive to reinforcer devaluation. a, Bdnf+/− mice expressed approximately half as much BDNF in hippocampal lysates as wild-type littermates (n = 11–12 per group). b, Mice acquired the nose-poke response without differences between groups. c, Mice also consumed equivalent amounts of food under ad libitum conditions. d, After freely consuming reinforcer pellets, however, only wild-type mice significantly decreased response rates, whereas Bdnf+/− mice failed to significantly modify response patterns. e, Nonetheless, the same mice consumed equivalent amounts of a palatable sucrose solution (SUC; n = 16 wild-type, 12 Bdnf+/− littermates). Bars and symbols represent means ± SEMs per group. *p < 0.05; **p < 0.001. wt, Wild-type.
Figure 4. Responding on a PR schedule is inflated in Bdnf+/− mice. a, Bdnf+/− mice achieved higher breakpoint ratios on a PR schedule of reinforcement, particularly during test sessions 4–7. b, Despite this, responding was equally selective for the active, relative to inactive, apertures between groups. c, d, Comparisons of the average first, median, and last PRPs from sessions 1–3, when breakpoint ratios did not significantly differ, and sessions 4–7, when breakpoints differed, revealed main effects of time as expected, but no genotype effects. These patterns suggest that motivational differences between wild-type and Bdnf+/− mice cannot obviously account for differences in breakpoints. e, vmPFC expression of p-trk, GluR1, and pERK1/2 was decreased in Bdnf+/− mice. The latter could be attributed to reduced phosphorylation of the ERK2 isoform. Representative bands are adjacent. Bars and symbols represent means ± SEMs per group. *p < 0.05; **p < 0.001. wt, Wild-type. n = 13–14 per group.
PR responding is elevated in Bdnf+/− mice
As in our experiments with mOFC-selective Bdnf knockdown, responding on a PR schedule was also assessed. Here, we tested mice daily over the course of 1 week, revealing an interaction between genotype and session (F(6,150) = 2.6, p = 0.02; Fig. 4a). Breakpoint ratios gradually grew in Bdnf+/− mice, coupled with a modest decline in wild-type mice. However, the ratio of responses on the active versus inactive apertures was unchanged (main effect of genotype, F(1,25) = 1; interaction, F(6,150) = 1.6, p = 0.17; Fig. 4b), indicating that responding was equally selective for the active nose-poke aperture between groups.
An increase in breakpoint ratios could conceivably be attributed to increased perceived value of the reinforcer or increased motivation to acquire the reinforcer. PRPs can dissociate these factors. PRPs decrease when the motivation to acquire an outcome increases; for example, rats more rapidly initiate a new trial after collecting a reinforcer when hungry (Skjoldager et al., 1993). In contrast, increasing the quantity of reinforcers, for example, by providing three pellets instead of one, also increases PR breakpoints but leaves PRPs unaffected. To determine whether motivational contributions potentially increased breakpoint ratios in Bdnf+/− mice, we extracted the first, median, and last PRP for each mouse for each test session (Skjoldager et al., 1993). Several analytic approaches failed to identify an effect of genotype. For example, we averaged PRPs for sessions 1–3 (when breakpoints did not significantly differ) and compared them with sessions 4–7 (when breakpoints differed). These analyses failed to reveal any effects of genotype (F values ≤1; Fig. 4c,d). Only main effects of time were detected as expected (session 1–3, F(2,23) = 6.8, p = 0.002; session 4–7, F(2,23) = 6.9, p = 0.002). As an additional example, PRPs did not differ between groups during session 1 when breakpoints also did not differ, nor during session 7 when breakpoints did differ (F values ≤1; data not shown). In fact, during no test session did the PRPs differ as a function of genotype (all p values >0.05). This pattern of responding (increased breakpoint ratios, coupled with unaffected PRPs) suggests that differences in primary motivation do not account for breakpoint ratio differences between Bdnf+/− and littermate wild-type mice.
Figure 5. mOFC BDNF infusion enhances pERK1/2 expression. a, Experimental timeline. Mice were infused into the mOFC with saline or BDNF, then brains were collected 24 h later. b, BDNF increased expression of c-fos and pERK1/2, as measured by Western blot. Representative bands are adjacent. n = 6 per group. c, In another cohort of mice, pERK1/2 expression was quantified in concentric rings radiating from the base of the infusion site in fixed, hemisected coronal sections. The schematic represents the rings overlaid on representations of the largest, smallest, and representative (middle line) pERK1/2 fluorescence signal. d, Quantitative immunostaining revealed increased pERK1/2 expression at the infusion site 24 h after infusion, and the pERK1/2 signal was further elevated in BDNF-infused mice. Representative image is adjacent. n = 3–4 per group. Bars and symbols represent means ± SEMs per group. *p < 0.05. sal, Saline.
pERK2 is reduced in Bdnf+/− vmPFC
pERK1/2 has been proposed as a marker of neuronal activity. Furthermore, BDNF binding to its high-affinity receptor trkB activates the ERK MAP kinase signaling pathway. For these reasons, pERK1/2 was analyzed in vmPFC tissue samples, which include the mOFC, immediately after the last session.
pERK1/2 was decreased in Bdnf+/− mice as expected (t(22) = 3.3, p = 0.004; Fig. 4e). Analysis of the individual pERK1/2 isoforms indicated that pERK2, which is preferentially associated with activity-dependent neuroplasticity (English and Sweatt, 1996), was significantly reduced (p = 0.006). Expression of the primary BDNF receptor trkB was unchanged (p > 0.6), but phosphorylation of the receptor was decreased as expected, as was expression of the GluR1 subunit of the AMPA receptor (p < 0.05; Fig. 4e). pERK1/2 was also analyzed in the dorsal hippocampus and NAc; notably, there were no differences in expression levels in these regions (data not shown).
BDNF replacement in the mOFC blocks behavioral abnormalities
We next aimed to block behavioral abnormalities in Bdnf+/− mice. We first developed a BDNF microinfusion protocol that, in genetically intact mice, increased levels of pERK1/2 and the immediate-early gene c-fos at the infusion site, detectable 24 h after infusion (all p < 0.05; Fig. 5a,b). To determine the anatomical distribution of BDNF-mediated ERK1/2 stimulation, we also immunostained for pERK1/2 24 h after infusion and quantified expression in 35 μm concentric rings around the infusion site in hemisected coronal sections (Fig. 5c). pERK1/2 expression was increased proximal to the infusion site in both groups, but pERK1/2 was higher in the BDNF group within 500 μm of the infusion terminus (interaction, F(19,95) = 1.9, p = 0.05; Fig. 5d).
Next, we trained a naive cohort of Bdnf+/− and littermate wild-type mice to perform the instrumental response. We again detected no differences in response rates or reinforcement rates between groups (data not shown). Then, groups were assigned by matching breakpoint ratios collected during three initial PR test sessions (Fig. 6a). After BDNF infusion in the mOFC, mice were tested again, and an interaction between genotype and infusion was detected (F(1,37) = 10.3, p = 0.003; Fig. 6b).
mOFC BDNF replacement in Bdnf+/− mice normalized breakpoint ratios compared with saline-infused Bdnf+/− mice (p = 0.01) and wild-type mice infused with BDNF (p = 0.009). In other words, selective BDNF replacement fully rescued responding in Bdnf+/− mice, and thus, mOFC BDNF is both necessary and sufficient for typical responding in this task. Interestingly, BDNF infusion in wild-type mice modestly increased breakpoint ratios (p = 0.08), suggesting that mOFC BDNF regulates action selection according to an inverted U-shaped curve. This finding, although unexpected, bears some similarity to the inverted U-shaped influence of BDNF met gene dosing on gray and white matter morphometry in humans (Forde et al., 2014). Also of note, synaptic scaling can occur after supra-physiological BDNF overexpression (Rutherford et al., 1998), which here could conceivably impair optimal mOFC function and account for a modest inflation of breakpoints in wild-type mice administered BDNF.
Immediately after the last session, mice were killed, and vmPFC tissue was homogenized to evaluate whether BDNF infusion normalized pERK2 expression, in parallel with behavioral responding. As expected, pERK2 was decreased in Bdnf+/− mice infused with saline (interaction, F(1,26) = 6.3, p = 0.02, post hoc p = 0.02; Fig. 6c). However, BDNF infusion restored pERK2 levels such that BDNF-infused Bdnf+/− mice did not differ from control mice (p = 0.3).
As shown, rodents can learn to select actions according to the value of a reinforcer. Meanwhile, behavioral insensitivity to reinforcer value is associated with a dorsolateral striatal “habit” circuit (Yin et al., 2008, 2009). Thus, the dorsal striatum was also extracted and immunoblotted for the immediate-early gene c-fos. Expression patterns strongly resembled behavioral response patterns (interaction, F(1,26) = 18, p = 0.003; Fig. 6d, compare with b): Bdnf+/− mice had high c-fos expression levels (p < 0.05), whereas BDNF infusion normalized expression (p = 0.005 compared with saline-infused Bdnf+/− mice). BDNF infusion in wild-type mice increased c-fos (p = 0.007). Even from a correlational perspective, high striatal c-fos was associated with high breakpoint ratios (r = 0.37, p = 0.045; Fig. 6e).
Figure 6. mOFC BDNF infusion rescues behavioral sensitivity to a PR schedule of reinforcement in Bdnf+/− mice. a, Baseline responding on a PR schedule was established in Bdnf+/− mice and wild-type littermates before intracranial BDNF infusion. b, A single mOFC BDNF infusion occluded the expected increase in PR responding in Bdnf+/− mice. Mean breakpoint ratios achieved during three test sessions are shown. Inset, Needle tracks terminated within the anatomical boundaries drawn at approximately +2.46 and +2.1 mm from bregma on images from the Mouse Brain Library (Rosen et al., 2000). n = 8–13 per group. c, Mice were killed immediately after the last session. As expected, pERK2 expression was decreased by 30–40% in Bdnf+/− mice infused with saline compared with wild-type mice infused with saline. BDNF infusion blocked this deficit. Representative blots are below. d, In the dorsal striatum, c-fos patterns mirrored behavioral response patterns (compare with b), with elevated c-fos associated with high breakpoint ratios. Representative blots are adjacent. e, Breakpoint ratios covaried with striatal c-fos. Bars and symbols represent means ± SEMs per group, except in e, in which each symbol represents a single mouse. *p ≤ 0.05, **p < 0.001. wt, Wild-type; sal, saline.
Discrete stimuli signaling reinforcer availability rescue responding
Dorsolateral striatal systems are associated with stimulus-dependent, as opposed to value-dependent, decision-making (Yin et al., 2008; Hart et al., 2014).
Thus, we evaluated whether PR responding could be normalized if Bdnf+/− mice were provided with discrete stimuli signaling reinforcer availability; this would presumably access cue-sensitive striatal systems. A light/tone stimulus was coupled with reinforcer delivery during both the response training (Fig. 7a) and PR testing phases (Fig. 7b). Response rates and breakpoint ratios were indistinguishable between Bdnf+/− and wild-type mice, suggesting that Bdnf+/− mice were able to use pavlovian stimuli to regulate responding (PR test main effect, F < 1; interaction, F(4,56) = 1.6, p = 0.2; Fig. 7b). An alternative perspective is that stimulus–outcome associations energized responding in wild-type mice, but it does not necessarily account for the lack of group differences during training, when the stimuli were also present. Nevertheless, we replicated this experiment in a separate group of mice, and again, wild-type and Bdnf+/− mice did not differ (main effect and interaction F values <1; Fig. 7b′).
Extinction responding in Bdnf+/− mice
To rule out insensitivity to non-reinforcement or general behavioral inflexibility as causal factors in behavioral abnormalities in Bdnf+/− mice, reinforcement was withheld entirely (extinction). A main effect of session (F(5,130) = 14.5, p < 0.001), but no effect of genotype or interactions, was detected (F values <1), again indicating that responding was indistinguishable based on genotype (Fig. 7c). Finally, we also confirmed that locomotor activity was unaffected by genotype (F values <1; Fig. 7d). Overall, activity decreased across the session (F(11,110) = 2.7, p = 0.004), indicating habituation to the novel environment, but we detected no differences between groups. Importantly, mice were food restricted during this test, recapitulating locomotor activity levels during PR and devaluation testing.
Figure 7. Cues signaling reinforcer delivery rescue PR responding in Bdnf+/− mice. No differences in extinction conditioning or locomotor activity. a, A group of wild-type and Bdnf+/− mice was tested on a PR schedule as in previous experiments but with the addition of discrete stimuli that signaled reinforcer delivery. Instrumental response acquisition (with stimuli presentation) was unaffected by genotype. b, When presented with stimuli signaling reinforcer delivery, Bdnf+/− mice were able to regulate their responding identically to wild-type mice. Gray bars sit at the overall mean breakpoint for each group, with the width representing ± SEM. Light gray, Wild-type; dark gray, Bdnf+/−. b′, We replicated this experiment in another group of naive mice, and again, breakpoints did not differ when mice were provided discrete stimuli signaling reinforcer delivery. c, Responding in extinction was also unaffected. d, Ambulation counts were also equivalent in food-restricted Bdnf+/− and wild-type mice. Symbols represent means ± SEMs. wt, Wild-type. n = 6–18 per group throughout.
Figure 8. Pretest but not posttest MEK inhibition elevates breakpoint ratios. a, Naive mice were initially trained to perform the instrumental response. b, When a MEK inhibitor [PD0184161 (PD)] was administered before the session, responding on a PR schedule was elevated. Mice were given a 3 week drug washout period and retested, revealing no persistent effects on responding. c, As a control, other mice were injected after the sessions; in this case, breakpoint ratios were unchanged. Bars and symbols represent means ± SEMs per treatment group. *p < 0.05 versus vehicle. n = 10–11 per group.
MEK inhibition elevates PR responding
ERK1/2 phosphorylation was blunted in Bdnf+/− mice. To evaluate whether suppressing MEK, a kinase directly upstream of ERK1/2, could mimic Bdnf heterozygosity, adult male C57BL/6 mice were trained to perform an instrumental response for food reinforcement. Mice were matched based on reinforcements earned during training and assigned to either vehicle or PD0184161 groups (Fig. 8a). When mice were treated with PD0184161, breakpoint ratios differed (F(3,27) = 4.1, p = 0.02). Specifically, wild-type mice injected with PD0184161 before the test achieved higher breakpoint ratios than mice injected with vehicle before the test (p = 0.02; pretest injections represented in Fig. 8b), whereas mice injected with PD0184161 after the test did not differ from corresponding control mice (p = 0.17; posttest injections represented in Fig. 8c). Also, when tested drug free, PD0184161-exposed mice were indistinguishable from control mice (p = 0.4; Fig. 8b, inset). This pattern suggests that ERK1/2 regulates action selection online rather than via postsession memory consolidation (i.e., consolidating information regarding the PR schedule of reinforcement).
PR responding predicts vmPFC, but not striatal, BDNF in wild-type mice
Our findings suggest that BDNF within the mOFC regulates effortful instrumental response selection. However, an alternative possibility is that mOFC-derived BDNF acts via axonal transport from the mOFC to the dorsal striatum. To explore this possibility, we characterized responding on a PR schedule in several naive mice (Fig. 9a). Immediately after the final session, vmPFC and dorsal striatal samples were extracted. vmPFC BDNF significantly covaried with responding during the last session (r = 0.62, p = 0.02; Fig. 9b), whereas striatal BDNF did not (r = 0.05, p = 0.86; Fig. 9c). This outcome suggests that local mOFC BDNF is a determining factor in action selection strategies.
Figure 9. Endogenous prefrontal, but not dorsal striatal, BDNF covaries with instrumental responding in intact mice. a, A large group of naive mice was trained to perform the nose-poke response, and then several PR tests were conducted. Total responses on the active aperture are represented. b, vmPFC BDNF expression correlated with response number during the last test session. c, In contrast, dorsal striatal BDNF did not covary with the same measure. Symbols represent means ± SEMs per group in a; symbols in b and c represent individual mice. n = 15.
Discussion
The mOFC is a primary component of an anatomically interconnected medial prefrontal cortical network and is considered distinct relative to the central and lateral compartments of the OFC (Öngür and Price, 2000; Wallis, 2012). These compartments are instead part of a “sensory integration” network classically associated with behavioral flexibility during reversal conditioning (Iversen and Mishkin, 1970; Schoenbaum et al., 2002; McAlonan and Brown, 2003; for contemporary models, see Stalnaker et al., 2015). In contrast, vmPFC structures may be essential to determining behavioral response strategies based on general representations of outcome value (Wallis, 2012). Goal-directed action selection after reinforcer devaluation correlates significantly with neural activity in the human mOFC (Valentin et al., 2007). Furthermore, the mOFC is activated during willingness-to-pay calculations (Plassmann et al., 2007), a finding that may be particularly germane to our current study, because we find that the rodent mOFC is essential to behavioral sensitivity to outcome value and the appropriate “pay” (i.e., effort expenditure) for a given reinforcer in a PR task. Lesions or DREADD-mediated inactivation of the mOFC in rats induce failures in retrieving outcome identity memories (Bradfield et al., 2015), suggesting that the healthy mOFC serves to access outcome value information when it is not immediately observable and thereby guide goal-directed decision-making.
We argue that BDNF is essential for this function, given that both tasks used in Bdnf-deficient mice here require mice to sustain a stable representation of reinforcer value to appropriately inhibit responding (after reinforcer devaluation) or gate responding when response requirements escalate (as in the PR task).
Bidirectional regulation of behavioral sensitivity to reinforcer value
We initiated these studies by expressing in the mOFC Gs-coupled DREADDs, engineered G-protein-coupled receptors activated by the otherwise inert ligand CNO (Urban and Roth, 2015). We then decreased reinforcer value using a prefeeding procedure wherein mice can freely consume reinforcer pellets before a probe test conducted in extinction. Additionally, instrumental responding according to a PR schedule of reinforcement was tested. Gs–DREADDs stimulation enhanced response inhibition after reinforcer devaluation, evidence of increased sensitivity to decreased outcome value. Mice were also more sensitive to the escalating demands of the PR schedule, achieving lower breakpoints. mOFC stimulation did not affect ambulation or stereotypy-like behaviors, contrary to evidence that optogenetic stimulation of a mOFC-striatal circuit induces repetitive stereotypy (Ahmari et al., 2013). Repeated burst-like activity caused by optogenetic stimulation, relative to slow depolarization caused by Gs–DREADDs, could account for different behavioral consequences.
What molecular factors might regulate mOFC function? Excitatory pyramidal neurons are a primary source of BDNF in the cortex, and high-frequency stimulation induces synaptic BDNF secretion (Hartmann et al., 2001). To determine the role of locally synthesized BDNF in mOFC-dependent decision-making, we reduced local Bdnf, revealing the opposite behavioral profile. Specifically, mOFC-selective Bdnf knockdown interfered with behavioral sensitivity to reinforcer devaluation, and PR responding was inflated.
Bdnf+/− mice developed the same aberrantly elevated PR breakpoints as those with site-selective knockdown, which allowed us to next confirm that selective infusion of BDNF into the mOFC normalized responding in constitutive Bdnf+/− mice. Thus, mOFC BDNF is both necessary and sufficient for appropriate response inhibition. Notably, PRPs did not differ between Bdnf+/− and wild-type littermates, suggesting that exaggerated primary motivation to acquire the reinforcer does not account for breakpoint differences (Skjoldager et al., 1993). Moreover, elevated breakpoints could not be attributed to motoric hyperactivity or enhanced hedonic sensitivity to food. These are important measures, given that Bdnf+/− mice develop late-adulthood obesity (Kernie et al., 2000).
Involvement of ERK/MAP kinase
The ERK MAP kinase signaling cascade is coupled to multiple receptor systems, including trkB, the high-affinity receptor for BDNF. ERK1/2 is also implicated in several forms of learning, memory, and neuroplasticity (Mazzucchelli and Brambilla, 2000; Rodrigues et al., 2004), and furthermore, LTP induction preferentially activates ERK2 (English and Sweatt, 1996). Phosphorylation of both trkB and ERK1/2 was decreased in the Bdnf+/− vmPFC, driven specifically by decreased pERK2, suggesting that ERK2-mediated signaling may be essential to mOFC function. Pharmacologically inhibiting MEK, immediately upstream of ERK1/2, recapitulated the effects of mOFC-selective Bdnf knockdown, augmenting breakpoints in the PR task. Notably, post-session MEK inhibition had no effects, implicating ERK1/2 signaling in online action selection rather than postsession consolidation processes.
This perspective is further supported by our DREADDs studies, given that CNO was on board during testing, and it is in general agreement with the argument that the mOFC retrieves outcome identity/value information during task performance (Bradfield et al., 2015). Interestingly, this selective function contrasts with that of the ventrolateral orbitofrontal cortex, in which BDNF–trkB appears to also regulate the consolidation or retention of response–outcome associative memory (Zimmermann et al., 2015).
Replacement of BDNF selectively within the mOFC rescued pERK1/2 in Bdnf+/− mice, and immunostaining for pERK1/2 revealed a sustained elevation in expression fanning from the infusion site. BDNF-induced pERK1/2 extended in some mice into the dorsally situated prelimbic cortex. We believe that the behavioral effects of BDNF infusion can nonetheless be attributed to actions in the mOFC, because Bdnf knockdown in the prelimbic PFC decreases, rather than increases, PR responding (Gourley et al., 2012). Prelimbic-selective knockdown also facilitates extinction conditioning, whereas extinction was spared here (Fig. 2; Gourley et al., 2009a). These opposing roles for BDNF in the mOFC and prelimbic cortex may account for why mOFC-selective Bdnf knockdown rapidly modified PR response patterns, whereas genotypic differences between Bdnf+/− mice and wild-type littermates emerged only with repeated testing.
Corticostriatal interactions in reward-related decision-making
BDNF is subject to anterograde transport. For example, cortical pyramidal neurons are a predominant source of BDNF in the striatum, which contains little Bdnf mRNA (Altar et al., 1997). Furthermore, BDNF infusion in the dorsal PFC can increase striatal and amygdalar BDNF (McGinty et al., 2010). Conversely, PFC-selective Bdnf knockdown reduces BDNF in downstream structures (Gourley et al., 2009a, 2013c; Zimmermann et al., 2015).
Thus, it is conceivable that mOFC BDNF regulates action selection by binding locally or in distal targets, such as the dorsomedial striatum (Fig. 1; Schilman et al., 2008). However, we found that BDNF in the vmPFC, but not striatum, predicted response patterns. Thus, BDNF subjected to axonal transport from the mOFC to dorsal striatum may serve important purposes, but striatal BDNF was a poor predictor of response patterns here.
Unlike BDNF, striatal immediate-early gene expression closely mirrored response patterns, with high c-fos expression in Bdnf+/− mice that responded persistently, despite escalating response demands in the PR task. Why might this be? Classically, goal-directed behaviors are defined as those sensitive to outcome value. mPFC damage (lesions, inactivations) and stressors can induce a shift from goal-directed to “habitual” modes of response that are, in contrast, insensitive to reinforcer value (Balleine and O’Doherty, 2010; Schwabe and Wolf, 2011). Converging neuroanatomical models characterize this process as a transition from PFC–striatal systems that act in concert to a cue-sensitive dorsolateral striatum neurocircuit (Yin et al., 2008, 2009; Kimchi et al., 2009; Gourley et al., 2013b). Thus, c-fos expression here may reflect the recruitment of a dorsolateral striatal system that drives stimulus-dependent responding in the absence of BDNF-mediated signaling in the mOFC. To test this perspective, we provided discrete stimuli signaling reinforcer availability, “bridging” the association between the nose-poke response and reinforcer. These stimuli eliminated response differences between groups. This effect was robust, detected in multiple experiments and despite cohort variances, suggesting that stimulus-dependent response regulation is intact in Bdnf+/− mice.
In marmosets, vmPFC lesions including the mOFC increase breakpoints (Pears et al., 2003), and mice with mOFC lesions develop nearly identical patterns of responding on a PR schedule relative to Bdnf+/- mice (Gourley et al., 2010). However, we have also reported previously that mOFC lesions do not affect behavioral sensitivity to reinforcer devaluation (Gourley et al., 2010). How might we reconcile these findings? Previously, mice were trained before lesion placement, whereas here, Bdnf was knocked down first. It is thus possible that information acquired before insult can help to sustain inhibitory control.
Effects of acute BDNF infusion
We used multiple complementary genetic, pharmacological, viral, and chemogenetic approaches to provide evidence that the mOFC inhibits instrumental responding when response demands escalate and that BDNF is necessary for this function. One additional discovery was that a single BDNF infusion into the mOFC had sustained behavioral consequences, normalizing response patterns and pERK1/2 in Bdnf+/- mice multiple days after infusion (Fig. 6). In conceptually similar studies, BDNF infusion into the dorsomedial PFC after cocaine self-administration in rats reduced the reinstatement of cocaine seeking several days later (Berglind et al., 2007, 2009; Whitfield et al., 2011), likewise a sustained response. However, the relationship between mPFC BDNF and reward seeking is likely quite complex. For example, cocaine abstinence causes progressive increases in mPFC BDNF above control levels (Lu et al., 2010; McGinty et al., 2010; Giannotti et al., 2014), and the precise behavioral significance of cocaine-induced BDNF overexpression is debated (Sadri-Vakili et al., 2010; Pitts et al., 2016). Ongoing studies using targeted manipulation of BDNF in specific circuits and structures such as the mOFC will help to clarify how BDNF regulates the function of corticolimbic regions in balancing reward seeking versus behavioral inhibition.
References
Ahmari SE, Spellman T, Douglass NL, Kheirbek MA, Simpson HB, Deisseroth K, Gordon JA, Hen R (2013) Repeated cortico-striatal stimulation generates persistent OCD-like behavior. Science 340:1234–1239.
Altar CA, Cai N, Bliven T, Juhasz M, Conner JM, Acheson AL, Lindsay RM, Wiegand SJ (1997) Anterograde transport of brain-derived neurotrophic factor and its role in the brain. Nature 389:856–860.
Arana FS, Parkinson JA, Hinton E, Holland AJ, Owen AM, Roberts AC (2003) Dissociable contributions of the human amygdala and orbitofrontal cortex to incentive motivation and goal selection. J Neurosci 23:9632–9638.
Balleine BW, O’Doherty JP (2010) Human and rodent homologies in action control: corticostriatal determinants of goal-directed and habitual action. Neuropsychopharmacology 35:48–69.
Berglind WJ, See RE, Fuchs RA, Ghee SM, Whitfield TW Jr, Miller SW, McGinty JF (2007) A BDNF infusion into the medial prefrontal cortex suppresses cocaine seeking in rats. Eur J Neurosci 26:757–766.
Berglind WJ, Whitfield TW Jr, LaLumiere RT, Kalivas PW, McGinty JF (2009) A single intra-PFC infusion of BDNF prevents cocaine-induced alterations in extracellular glutamate within the nucleus accumbens. J Neurosci 29:3715–3719.
Bouret S, Richmond BJ (2010) Ventromedial and orbital prefrontal neurons differentially encode internally and externally driven motivational value in monkeys. J Neurosci 30:8591–8601.
Bradfield LA, Dezfouli A, van Holstein M, Chieng B, Balleine BW (2015) Medial orbitofrontal cortex mediates outcome retrieval in partially observable task situations. Neuron 88:1268–1280.
Chourbaji S, Hellweg R, Brandis D, Zörner B, Zacher C, Lang UE, Henn FA, Hörtnagl H, Gass P (2004) Mice with reduced brain-derived neurotrophic factor expression show decreased choline acetyltransferase activity, but regular brain monoamine levels and unaltered emotional behavior. Mol Brain Res 121:28–36.
Colwill RM, Rescorla RA (1986) Associative structures in instrumental learning. In: Psychology of learning and motivation, Vol 20 (Bower GH, ed). New York: Elsevier.
Dickinson A (1980) Contemporary animal learning theory. Cambridge, MA: Cambridge UP.
English JD, Sweatt JD (1996) Activation of p42 mitogen-activated protein kinase in hippocampal long term potentiation. J Biol Chem 271:24329–24332.
Figurov A, Pozzo-Miller LD, Olafsson P, Wang T, Lu B (1996) Regulation of synaptic responses to high-frequency stimulation and LTP by neurotrophins in the hippocampus. Nature 381:706–709.
Forde NJ, Ronan L, Suckling J, Scanlon C, Neary S, Holleran L, Leemans A, Tait R, Rua C, Fletcher PC, Jeurissen B, Dodds CM, Miller SR, Bullmore ET, McDonald C, Nathan PJ, Cannon DM (2014) Structural neuroimaging correlates of allelic variation of the BDNF val66met polymorphism. Neuroimage 90:280–289.
Genoud C, Knott GW, Sakata K, Lu B, Welker E (2004) Altered synapse formation in the adult somatosensory cortex of brain-derived neurotrophic factor heterozygote mice. J Neurosci 24:2394–2400.
Giannotti G, Caffino L, Calabrese F, Racagni G, Riva MA, Fumagalli F (2014) Prolonged abstinence from developmental cocaine exposure dysregulates BDNF and its signaling network in the medial prefrontal cortex of adult rats. Int J Neuropsychopharmacol 17:625–634.
Gorski JA, Zeiler SR, Tamowski S, Jones KR (2003) Brain-derived neurotrophic factor is required for the maintenance of cortical dendrites. J Neurosci 23:6856–6865.
Gourley SL, Taylor JR (2009) Recapitulation and reversal of a persistent depression-like syndrome in rodents. Curr Protoc Neurosci Chapter 9:Unit 9.32.
Gourley SL, Taylor JR (2016) Going and stopping: Dichotomies in behavioral control by the prefrontal cortex. Nat Neurosci, in press.
Gourley SL, Wu FJ, Kiraly DD, Ploski JE, Kedves AT, Duman RS, Taylor JR (2008a) Regionally specific regulation of ERK MAP kinase in a model of antidepressant-sensitive chronic depression. Biol Psychiatry 63:353–359.
Gourley SL, Kiraly DD, Howell JL, Olausson P, Taylor JR (2008b) Acute hippocampal BDNF restores motivational and forced swim performance after corticosterone. Biol Psychiatry 64:884–890.
Gourley SL, Howell JL, Rios M, DiLeone RJ, Taylor JR (2009a) Prelimbic cortex bdnf knock-down reduces instrumental responding in extinction. Learn Mem 16:756–760.
Gourley SL, Koleske AJ, Taylor JR (2009b) Loss of dendrite stabilization by the Abl-related gene (Arg) kinase regulates behavioral flexibility and sensitivity to cocaine. Proc Natl Acad Sci U S A 106:16859–16864.
Gourley SL, Lee AS, Howell JL, Pittenger C, Taylor JR (2010) Dissociable regulation of goal-directed action within mouse prefrontal cortex. Eur J Neurosci 32:1726–1734.
Gourley SL, Swanson AM, Jacobs AM, Howell JL, Mo M, DiLeone RJ, Koleske AJ, Taylor JR (2012) Action control is mediated by prefrontal BDNF and glucocorticoid receptor binding. Proc Natl Acad Sci U S A 109:20714–20719.
Gourley SL, Swanson AM, Koleske AJ (2013a) Corticosteroid-induced neural remodeling predicts behavioral vulnerability and resilience. J Neurosci 33:3107–3112.
Gourley SL, Olevska A, Gordon J, Taylor JR (2013b) Cytoskeletal determinant of stimulus-response habits. J Neurosci 33:11811–11816.
Gourley SL, Olevska A, Zimmermann KS, Ressler KJ, DiLeone RJ, Taylor JR (2013c) The orbitofrontal cortex regulates outcome-based decision-making via the lateral striatum. Eur J Neurosci 38:2382–2388.
Hart G, Leung BK, Balleine BW (2014) Dorsal and ventral streams: the distinct role of striatal subregions in the acquisition and performance of goal-directed actions. Neurobiol Learn Mem 108:104–118.
Hartmann M, Heumann R, Lessmann V (2001) Synaptic secretion of BDNF after high-frequency stimulation of glutamatergic synapses. EMBO J 20:5887–5897.
Hodos W (1961) Progressive ratio as a measure of reward strength. Science 134:943–944.
Hoover WB, Vertes RP (2011) Projections of the medial orbital and ventral orbital cortex in the rat. J Comp Neurol 519:3766–3801.
Iversen SD, Mishkin M (1970) Perseverative interference in monkeys following selective lesions of the inferior prefrontal convexity. Exp Brain Res 11:376–386.
Kang H, Schuman EM (1995) Long-lasting neurotrophin-induced enhancement of synaptic transmission in the adult hippocampus. Science 267:1658–1662.
Kang H, Schuman EM (1996) A requirement for local protein synthesis in neurotrophin-induced hippocampal synaptic plasticity. Science 273:1402–1406.
Kernie SG, Liebl DJ, Parada LF (2000) BDNF regulates eating behavior and locomotor activity in mice. EMBO J 19:1290–1300.
Kimchi EY, Torregrossa MM, Taylor JR, Laubach M (2009) Neuronal correlates of instrumental learning in the dorsal striatum. J Neurophysiol 102:475–489.
Korte M, Carroll P, Wolf E, Brem G, Thoenen H, Bonhoeffer T (1995) Hippocampal long-term potentiation is impaired in mice lacking brain-derived neurotrophic factor. Proc Natl Acad Sci U S A 92:8856–8860.
Korte M, Griesbeck O, Gravel C, Carroll P, Staiger V, Thoenen H, Bonhoeffer T (1996) Virus-mediated gene transfer into hippocampal CA1 region restores long-term potentiation in brain-derived neurotrophic factor mutant mice. Proc Natl Acad Sci U S A 93:12547–12552.
Linnarsson S, Björklund A, Ernfors P (1997) Learning deficit in BDNF mutant mice. Eur J Neurosci 9:2581–2587.
Lu H, Cheng PL, Lim BK, Khoshnevisrad N, Poo MM (2010) Elevated BDNF after cocaine withdrawal facilitates LTP in medial prefrontal cortex by suppressing GABA inhibition. Neuron 67:821–833.
MacQueen GM, Ramakrishnan K, Croll SD, Siuciak JA, Yu G, Young LT, Fahnestock M (2001) Performance of heterozygous brain-derived neurotrophic factor knockout mice on behavioral analogues of anxiety, nociception, and depression. Behav Neurosci 115:1145–1153.
Mazzucchelli C, Brambilla R (2000) Ras-related and MAPK signaling in neuronal plasticity and memory formation. Cell Mol Life Sci 57:604–611.
McAllister AK, Lo DC, Katz LC (1995) Neurotrophins regulate dendritic growth in developing visual cortex. Neuron 15:791–803.
McAllister AK, Katz LC, Lo DC (1997) Opposing roles for endogenous BDNF and NT-3 in regulating cortical dendritic growth. Neuron 18:767–778.
McAlonan K, Brown VJ (2003) Orbital prefrontal cortex mediates reversal learning and not attentional set shifting in the rat. Behav Brain Res 146:97–103.
McDonald AJ, Mascagni F, Guo L (1996) Projections of the medial and lateral prefrontal cortices to the amygdala: a Phaseolus vulgaris leucoagglutinin study in the rat. Neuroscience 71:55–75.
McGinty JF, Whitfield TW Jr, Berglind WJ (2010) Brain-derived neurotrophic factor and cocaine addiction. Brain Res 1314:183–193.
Montkowski A, Holsboer F (1997) Intact spatial learning and memory in transgenic mice with reduced BDNF. Neuroreport 8:779–782.
Nakata H, Nakamura S (2007) Brain-derived neurotrophic factor regulates AMPA receptor trafficking to post-synaptic densities via IP3R and TRPC calcium signaling. FEBS Lett 581:2047–2054.
Ongür D, Price JL (2000) The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys, and humans. Cereb Cortex 10:206–219.
Padoa-Schioppa C, Assad JA (2006) Neurons in the orbitofrontal cortex encode economic value. Nature 441:223–226.
Padoa-Schioppa C, Assad JA (2008) The representation of economic value in the orbitofrontal cortex is invariant for changes in menu. Nat Neurosci 11:95–102.
Patterson SL, Abel T, Deuel TA, Martin KC, Rose JC, Kandel ER (1996) Recombinant BDNF rescues deficits in basal synaptic transmission and hippocampal LTP in BDNF knockout mice. Neuron 16:1137–1145.
Paulus MP, Frank LR (2003) Ventromedial prefrontal cortex activation is critical for preference judgments. Neuroreport 14:1311–1315.
Pears A, Parkinson JA, Hopewell L, Everitt BJ, Roberts AC (2003) Lesions of the orbitofrontal but not medial prefrontal cortex disrupt conditioned reinforcement in primates. J Neurosci 23:11189–11201.
Pitts EG, Taylor JR, Gourley SL (2016) Prefrontal cortical BDNF: a regulatory key in cocaine- and food-reinforced behaviors. Neurobiol Dis. Advance online publication. Retrieved March 12, 2016. doi:10.1016/j.nbd.2016.02.021.
Plassmann H, O’Doherty J, Rangel A (2007) Orbitofrontal cortex encodes willingness to pay in everyday economic transactions. J Neurosci 27:9984–9988.
Rodrigues SM, Schafe GE, LeDoux JE (2004) Molecular mechanisms underlying emotional learning and memory in the lateral amygdala. Neuron 44:75–91.
Rosen G, Williams AG, Capra JA, Connolly MT, Cruz B, Lu L, Airey DC, Kulkarni A, Williams RW (2000) The mouse brain library at www.Mbl.Org. Presented at the 14th International Mouse Genome Conference, Crete, Greece. Accessed May 2, 2011.
Rutherford LC, Nelson SB, Turrigiano GG (1998) BDNF has opposite effects on the quantal amplitude of pyramidal neuron and interneuron excitatory synapses. Neuron 21:521–530.
Sadri-Vakili G, Kumaresan V, Schmidt HD, Famous KR, Chawla P, Vassoler FM, Overland RP, Xia E, Bass CE, Terwilliger EF, Pierce RC, Cha JH (2010) Cocaine-induced chromatin remodeling increases brain-derived neurotrophic factor transcription in the rat medial prefrontal cortex, which alters the reinforcing efficacy of cocaine. J Neurosci 30:11735–11744.
Schilman EA, Uylings HB, Galis-de Graaf Y, Joel D, Groenewegen HJ (2008) The orbital cortex in rats topographically projects to central parts of the caudate-putamen complex. Neurosci Lett 432:40–45.
Schoenbaum G, Nugent SL, Saddoris MP, Setlow B (2002) Orbitofrontal lesions in rats impair reversal but not acquisition of go, no-go odor discriminations. Neuroreport 13:885–890.
Schwabe L, Wolf OT (2011) Stress-induced modulation of instrumental behavior: from goal-directed to habitual control of action. Behav Brain Res 219:321–328.
Skjoldager P, Pierre PJ, Mittleman G (1993) Reinforcer magnitude and progressive ratio responding in the rat: effects of increased effort, prefeeding, and extinction. Learn Motiv 24:303–343.
Stalnaker TA, Cooch NK, Schoenbaum G (2015) What the orbitofrontal cortex does not do. Nat Neurosci 18:620–627.
Urban DJ, Roth BL (2015) DREADDs (designer receptors exclusively activated by designer drugs): chemogenetic tools with therapeutic utility. Annu Rev Pharmacol Toxicol 55:399–417.
Valentin VV, Dickinson A, O’Doherty JP (2007) Determining the neural substrates of goal-directed learning in the human brain. J Neurosci 27:4019–4026.
Wallis JD (2012) Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat Neurosci 15:13–19.
Whitfield TW Jr, Shi X, Sun WL, McGinty JF (2011) The suppressive effect of an intra-prefrontal cortical infusion of BDNF on cocaine-seeking is Trk receptor and extracellular signal-regulated protein kinase mitogen-activated protein kinase dependent. J Neurosci 31:834–842.
Wilson RC, Takahashi YK, Schoenbaum G, Niv Y (2014) Orbitofrontal cortex as a cognitive map of task space. Neuron 81:267–279.
Yin HH, Ostlund SB, Balleine BW (2008) Reward-guided learning beyond dopamine in the nucleus accumbens: the integrative functions of the cortico-basal ganglia networks. Eur J Neurosci 28:1437–1448.
Yin HH, Mulcare SP, Hilário MR, Clouse E, Holloway T, Davis MI, Hansson AC, Lovinger DM, Costa RM (2009) Dynamic reorganization of striatal circuits during the acquisition and consolidation of a skill. Nat Neurosci 12:333–341.
Zimmermann KS, Yamin JA, Rainnie DG, Ressler KJ, Gourley SL (2015) Connections of the mouse orbitofrontal cortex and regulation of goal-directed action selection by BDNF-trkB. Biol Psychiatry. Advance online publication. Retrieved March 12, 2016. doi:10.1016/j.biopsych.2015.10.026.