Project Details
A functional model for emotions
Applicant
Dr. Prakhar Godara
Subject Area
General, Cognitive and Mathematical Psychology
Term
since 2024
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 546913108
The two-year research project explores the role of emotions in decision-making within the context of reinforcement learning (RL). Its core objective is to integrate emotional mechanisms into the RL framework, conceptualizing emotions as a hierarchical controller that influences a baseline (emotionless) RL agent. The approach is designed to be broad, encompassing various theoretical perspectives on embedding emotions in RL agents. Rather than predefining the specific functional form of emotional control, the project adopts a novel perspective, treating emotional mechanisms as the outcome of an optimization problem. This optimization aligns with evolutionary theories of emotion, aiming to endow RL agents with emotional mechanisms that optimize their survival across diverse environments. While this perspective has previously been applied to generate intrinsic reward signals, we extend it more broadly to other aspects of emotional control, described below. By refraining from prescribing a fixed emotional control structure, the project makes it possible to study the relationship between the space of environments and the agent's emotions. This knowledge promises not only to help us understand the origin of our own emotional mechanisms, but also to equip us with the right tools for developing generally intelligent artificial agents.

Specifically, the hierarchical control of the emotional mechanism influences the baseline agent in three key areas (sketched in code after this list):

1. Learning: The agent's learning process is modified. This could involve, for instance, preferentially weighting certain memories so as to alter the learned representation of the environment.

2. Planning: When the agent's model is used to predict potential futures (rollouts), the emotional mechanism lets the agent selectively sample a subset of possible trajectories, akin to attention control.

3. Action selection: At each time step, potential futures are ranked by the expected reward gained in those futures. In addition to the external reward signal, the emotional mechanism introduces an intrinsic reward at each time step, extending temporally into the agent's future. The agent then maximizes a weighted sum of extrinsic and intrinsic rewards.

The primary objectives of the project are an in-depth exploration of each of these functions, aiming to identify the distinctive behavioral dynamics that arise from each form of control. Since the main goal is to study the theoretical consequences of modelling emotions as hierarchical control, as is done implicitly across a range of existing literature, we illustrate our model primarily in bandit tasks. Owing to their simplicity, these tasks offer the cleanest illustration of our theory's predictions.
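To make the three control channels concrete, the following minimal sketch wraps a baseline bandit learner in a hypothetical emotional controller. It is an illustration under our own assumptions, not the project's actual model: the names BaselineBanditAgent and EmotionalController, the functional forms of the memory weight, the attention subset, and the intrinsic bonus, and the parameter vector theta are all invented for exposition. In a bandit, candidate arms stand in for candidate futures (rollouts).

import numpy as np

rng = np.random.default_rng(0)

class BaselineBanditAgent:
    """Emotionless baseline: keeps a running value estimate per arm."""
    def __init__(self, n_arms, lr=0.1):
        self.q = np.zeros(n_arms)
        self.lr = lr

    def update(self, arm, reward, memory_weight=1.0):
        # Channel 1 (learning): the controller re-weights how strongly this
        # experience is written into the learned representation.
        self.q[arm] += memory_weight * self.lr * (reward - self.q[arm])

class EmotionalController:
    """Hypothetical hierarchical controller with three control channels."""
    def __init__(self, theta):
        # theta: parameters that, in the project, would be the outcome of an
        # optimization over a space of environments (see the next sketch).
        self.w_memory, self.k_attend, self.w_intrinsic = theta

    def memory_weight(self, reward):
        # Channel 1: here, surprising (large-magnitude) outcomes weigh more.
        return 1.0 + self.w_memory * abs(reward)

    def attend(self, n_arms):
        # Channel 2 (planning): consider only a sampled subset of candidate
        # futures, akin to attention control over rollouts.
        k = max(1, min(n_arms, int(self.k_attend)))
        return rng.choice(n_arms, size=k, replace=False)

    def intrinsic_reward(self, q, arm):
        # Channel 3 (action selection): an intrinsic bonus, here a simple
        # novelty-like term; the project leaves its form to optimization.
        return self.w_intrinsic * (1.0 - np.tanh(abs(q[arm])))

def run_episode(agent, ctrl, arm_means, steps=500):
    total = 0.0
    for _ in range(steps):
        candidates = ctrl.attend(len(arm_means))                  # channel 2
        scores = [agent.q[a] + ctrl.intrinsic_reward(agent.q, a)
                  for a in candidates]                            # channel 3
        arm = candidates[int(np.argmax(scores))]
        reward = rng.normal(arm_means[arm], 1.0)
        agent.update(arm, reward, ctrl.memory_weight(reward))     # channel 1
        total += reward
    return total

arm_means = rng.normal(0.0, 1.0, size=10)   # one bandit environment
agent = BaselineBanditAgent(n_arms=10)
ctrl = EmotionalController(theta=(0.5, 5, 0.2))
print(run_episode(agent, ctrl, arm_means))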
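The evolutionary perspective can be sketched in the same spirit, reusing the definitions above: instead of fixing the emotional control law, search its parameters so that agents perform well across a distribution of environments. A crude random search stands in for whatever optimizer the project ultimately adopts; sample_environment, the fitness function (average return over sampled environments, as a survival proxy), and the search loop are all assumptions made for illustration.

def sample_environment(n_arms=10):
    # A "space of environments": bandits with randomly drawn arm means.
    return rng.normal(0.0, 1.0, size=n_arms)

def fitness(theta, n_envs=20):
    # Survival proxy: average return of an emotionally controlled agent
    # across environments drawn from the space.
    total = 0.0
    for _ in range(n_envs):
        arm_means = sample_environment()
        agent = BaselineBanditAgent(n_arms=len(arm_means))
        total += run_episode(agent, EmotionalController(theta), arm_means,
                             steps=200)
    return total / n_envs

best_theta, best_fit = None, -np.inf
for _ in range(50):   # crude random search over controller parameters
    theta = (rng.uniform(0, 1), int(rng.integers(1, 10)), rng.uniform(0, 1))
    f = fitness(theta)
    if f > best_fit:
        best_theta, best_fit = theta, f
print("selected emotional parameters:", best_theta)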
DFG Programme
WBP Fellowship
International Connection
USA