Project Details
Online convex optimization for control of dynamical systems
Applicant
Professor Dr.-Ing. Matthias Müller
Subject Area
Automation, Mechatronics, Control Systems, Intelligent Technical Systems, Robotics
Term
since 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 505182457
Online convex optimization (OCO) is a framework originally developed in the online optimization and machine learning communities. In OCO, at each time instant a learner has to decide on an action based solely on previous actions and cost functions. Only after the action has been implemented does the environment reveal a new cost function. The goal is to minimize the learner's accumulated cost over a finite horizon. The main advantages of the OCO framework are its ability to handle a priori unknown and time-varying cost functions, which arise in many applications (e.g., changing energy prices or loads in energy dispatch), and its ability to take constraints on the action into account. Furthermore, OCO algorithms are generally computationally efficient since they do not require solving the optimization problem at hand. Instead, only one iteration of an appropriate online optimization scheme, e.g., one gradient descent step, is performed at every time instant. These advantages are also highly desirable in the context of control of dynamical systems. Therefore, OCO-based schemes for control have recently been proposed by our group and others. Some first theoretical guarantees for OCO-based control algorithms have been obtained. These, however, require restrictive assumptions and typically either limit the considered setting to the stabilization of a priori unknown and time-varying setpoints or neglect constraints. The main goal of this project is the development of OCO-based control schemes for general cost functions and constraints without relying on restrictive assumptions. To this end, we plan to employ OCO to directly control dynamical systems as well as to pair OCO with established control strategies such as reference governors and model predictive control (MPC). We will develop algorithms and investigate their specific advantages as well as the resulting closed-loop properties and performance.
Furthermore, within this project we will exploit insights from OCO to study the regret (a measure typically used in OCO to evaluate the resulting performance) of general MPC schemes in the context of time-varying and a priori unknown cost functions.
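To make the OCO loop described above concrete, the following is a minimal illustrative sketch and not one of the project's algorithms. It assumes a scalar action, a box constraint [lo, hi], quadratic costs f_t(x) = (x - theta_t)^2 with a time-varying target theta_t, and a step size eta0 / sqrt(t); all function names, parameters, and numerical values are hypothetical choices made for illustration. The learner commits to an action, the cost is revealed only afterwards, and a single projected gradient step is taken. Regret is computed here in its static form, i.e., against the best fixed action in hindsight, which is one common variant.

```python
import math


def project(x, lo, hi):
    """Project a scalar action onto the constraint interval [lo, hi]."""
    return min(max(x, lo), hi)


def online_gradient_descent(targets, lo=-1.0, hi=1.0, x0=0.0, eta0=0.5):
    """Projected online gradient descent for the assumed costs (x - theta_t)^2.

    At each time instant the action is fixed first; only then is theta_t
    (and hence the cost function) revealed. One gradient step is performed
    per time instant instead of solving an optimization problem.
    """
    x = project(x0, lo, hi)
    actions, costs = [], []
    for t, theta in enumerate(targets, start=1):
        actions.append(x)               # action chosen before the cost is known
        costs.append((x - theta) ** 2)  # cost revealed by the environment
        grad = 2.0 * (x - theta)        # gradient of the revealed cost at x
        step = eta0 / math.sqrt(t)      # assumed diminishing step size
        x = project(x - step * grad, lo, hi)
    return actions, costs


def static_regret(targets, costs, lo=-1.0, hi=1.0):
    """Accumulated cost minus the cost of the best fixed feasible action."""
    # For a sum of scalar quadratics, the constrained minimizer is the
    # projection of the mean target onto [lo, hi].
    best = project(sum(targets) / len(targets), lo, hi)
    best_cost = sum((best - theta) ** 2 for theta in targets)
    return sum(costs) - best_cost


# Hypothetical slowly varying target sequence, unknown to the learner in advance.
targets = [0.3 + 0.5 * math.sin(0.1 * t) for t in range(200)]
actions, costs = online_gradient_descent(targets)
regret = static_regret(targets, costs)
print(f"accumulated cost: {sum(costs):.3f}, static regret: {regret:.3f}")
```

Note that this sketch chooses the action directly and contains no dynamical system; in the control setting addressed by the project, the action would additionally have to pass through the system dynamics, which is what makes the analysis harder.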
DFG Programme
Research Grants