Project Details
RoGeRL: Robust and General Reinforcement Learning via AutoML
Applicant
Professor Dr. Marius Lindauer
Subject Area
Methods in Artificial Intelligence and Machine Learning
Term
since 2024
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 555323245
Reinforcement learning (RL) enables learning through interaction with an environment. It is thus a crucial part of AI systems that make sequential decisions and learn from them, e.g., in robotics, in natural sciences such as physics, in medicine, or in large language models. RL, however, is not only powerful but also hard to apply: current methods tend to be brittle, possess only limited generalization abilities, and their success depends heavily on design decisions such as neural architectures or hyperparameters. All three of these factors interact, necessitating extensive tuning and thus making RL research and application tedious and time-consuming. In recent years, automated reinforcement learning (AutoRL) methods have gained traction as a way to achieve improved performance, robustness, and training efficiency through systematic and data-driven approaches. However, we do not yet have general and efficient AutoRL methods, established benchmarks, or approaches for many non-standard RL paradigms such as unsupervised or contextual RL.

Our goal is to improve both the efficiency and robustness of AutoRL to enable broad usage of these tools for RL practitioners and researchers alike. We will make these methods applicable to extensions of the standard RL problem formulation, such as unsupervised and contextual RL. Furthermore, we will extend the capabilities of AutoRL methods themselves to enable learning on multiple RL tasks with a single tuning run. To accomplish this, we develop new approaches for studying the landscape properties of AutoRL such that we can design new AutoRL algorithms in a data-driven way. Next, we will extend multi-fidelity optimization to AutoRL; this will reduce compute demands and thus make RL more accessible on small computational budgets. We continue with the contextual setting based on meta-learned AutoRL and novel meta-features.
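To make the multi-fidelity idea concrete, the following is a minimal sketch of successive halving, one standard multi-fidelity scheme that the project could build on: all candidate configurations are evaluated on a small training budget, only the best fraction survives to the next round with a larger budget. The function names and the toy scoring function are illustrative assumptions, not part of the project's actual implementation.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=3):
    """Multi-fidelity search sketch: evaluate all configurations on a
    small budget, keep the top 1/eta fraction, multiply the budget by
    eta, and repeat.  `evaluate(config, budget)` returns a score where
    higher is better (e.g., average RL return after `budget` steps)."""
    budget = min_budget
    survivors = list(configs)
    for _ in range(rounds):
        scored = [(evaluate(cfg, budget), cfg) for cfg in survivors]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        keep = max(1, len(survivors) // eta)
        survivors = [cfg for _, cfg in scored[:keep]]
        budget *= eta
    return survivors[0]


# Illustrative usage with a toy, budget-independent objective that
# prefers a learning rate near 1e-3 (a stand-in for RL training runs):
configs = [{"lr": lr} for lr in (0.1, 0.01, 0.001, 0.0001)]
best = successive_halving(
    configs, lambda cfg, budget: -(cfg["lr"] - 0.001) ** 2, rounds=2
)
```

Because poor configurations are discarded after cheap, low-budget runs, far less total compute is spent than with full-budget evaluation of every candidate.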
In contrast to other fields of machine learning, small and simplistic architectures are the norm in RL, but this might be suboptimal. We will develop neural architecture search methods suited to the dynamic nature of RL. Lastly, to apply our methods to unsupervised RL, we need to develop reliable performance estimation pipelines for this special case, where pre-training must serve several downstream tasks. Throughout, we will analyze the behavior of the systems we develop with methods from interpretable AutoML. AutoRL will contribute to democratizing RL training and thus open new possibilities in all areas of RL research and application. As a result of this project, we will provide (i) robust and efficient AutoRL systems that give practitioners better results at lower computational cost, (ii) protocols and intuitions for efficiently researching RL methods even in small labs, and (iii) improved generalization of AutoRL. We believe this is an integral step towards general RL agents.
DFG Programme
Research Grants