Relational Exploration, Learning and Inference - Foundations of Autonomous Learning in Natural Environments

Applicants Professor Dr. Kristian Kersting; Professor Dr. Marc Toussaint

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing

Funding Funded from 2011 to 2016

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 200318003

Year of creation 2019

Summary of Project Results

In this project we used the task of robot table tennis as a test-bed to study several learning paradigms of sequential decision making under the constraints of a physical system. These constraints encouraged the development of learning algorithms focused on modularity, sample efficiency and safety. In imitation learning, we developed robust learning methods for probabilistic movement primitives. We leveraged the probabilistic nature of the primitives in a new set of operators that temporally scale and couple the primitives in a safe way. In reinforcement learning, we developed sample-efficient optimizers to locally improve pre-trained primitives. Sample efficiency was obtained by modeling the agent’s behavior. One of the main takeaways of our work was that modeling the reward was more efficient than modeling the forward dynamics. We then extended our model-based principle to hierarchical reinforcement learning to allow the composition of multiple primitives. In the future, we want to extend our work to the two-robot table tennis setup that we have built at the MPI in Tübingen, which allows training through self-play. We hope that such a goal will foster our understanding of the mechanisms with which robots can autonomously learn skills within the constraints of the physical world.

Project-Related Publications (Selection)

Exploration in model-based reinforcement learning by empirically estimating learning progress. In Neural Information Processing Systems (NIPS 2012), 2012
Manuel Lopes, Tobias Lang, and Marc Toussaint
Exploration in relational domains for model-based reinforcement learning. Journal of Machine Learning Research, 13:3691–3734, 2012
Tobias Lang, Marc Toussaint, and Kristian Kersting
Active learning for teaching a robot grounded relational symbols. In Proc. of the Int. Joint Conf. on Artificial Intelligence (IJCAI 2013), 2013
Johannes Kulick, Marc Toussaint, Tobias Lang, and Manuel Lopes
Exploiting symmetries for scaling loopy belief propagation and relational training. Machine Learning, 92(1): 91–132, 2013
Babak Ahmadi, Kristian Kersting, Martin Mladenov, and Sriraam Natarajan
Reduce and re-lift: Bootstrapped lifted likelihood maximization for MAP. In Marie desJardins and Michael L. Littman, editors, Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence (AAAI-2013), 2013
Fabian Hadiji and Kristian Kersting
Efficient lifting of MAP LP relaxations using k-locality. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS-2014), pages 623–632, 2014
Martin Mladenov, Kristian Kersting, and Amir Globerson
Lifted message passing as reparametrization of graphical models. In Nevin L. Zhang and Jin Tian, editors, Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI-2014), pages 603–612, 2014
Martin Mladenov, Amir Globerson, and Kristian Kersting
Model-based relational RL when object existence is partially observable. In Proc. of the Int. Conf. on Machine Learning (ICML 2014), 2014
Ngo Anh Vien and Marc Toussaint
Poisson dependency networks: Gradient boosted models for multivariate count data. Machine Learning, 100(2-3):477–507, 2015
Fabian Hadiji, Alejandro Molina, Sriraam Natarajan, and Kristian Kersting
“Model-Free Trajectory Optimization for Reinforcement Learning”. In: International Conference on Machine Learning (ICML). 2016
Riad Akrour, Abbas Abdolmaleki, Hany Abdulsamad, and Gerhard Neumann
“Probabilistic Inference for Determining Options in Reinforcement Learning”. In: Machine Learning (2016)
Christian Daniel, Herke van Hoof, Jan Peters, and Gerhard Neumann
“Using Probabilistic Movement Primitives for Striking Movements”. In: International Conference on Humanoid Robots (Humanoids). 2016
Sebastian Gomez-Gonzalez, Gerhard Neumann, Bernhard Schölkopf, and Jan Peters
“Empowered skills”. In: International Conference on Robotics and Automation (ICRA). 2017
Alexander Gabriel, Riad Akrour, Jan Peters and Gerhard Neumann
“Layered Direct Policy Search for Learning Hierarchical Skills”. In: International Conference on Robotics and Automation (ICRA). 2017
Felix End, Riad Akrour, Jan Peters and Gerhard Neumann
“Local Bayesian Optimization of Motor Skills”. In: International Conference on Machine Learning (ICML). 2017
Riad Akrour, Dmitry Sorokin, Jan Peters, and Gerhard Neumann
“Adaptation and Robust Learning of Probabilistic Movement Primitives”. In: IEEE Transactions on Robotics (T-RO) (2018)
Sebastian Gomez-Gonzalez, Gerhard Neumann, Bernhard Schölkopf, and Jan Peters
“Model-Free Trajectory-based Policy Optimization with Monotonic Improvement”. In: Journal of Machine Learning Research (JMLR) (2018)
Riad Akrour, Abbas Abdolmaleki, Hany Abdulsamad, Jan Peters, and Gerhard Neumann
“Regularizing Reinforcement Learning with State Abstraction”. In: International Conference on Intelligent Robots and Systems (IROS). 2018
Riad Akrour, Filipe Veiga, Jan Peters, and Gerhard Neumann
