Project Details

Relational exploration, learning and inference - Foundations of autonomous learning in natural environments

Subject Area: Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term: from 2011 to 2016
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 200318003

Final Report Year: 2019

Final Report Abstract

In this project we used the task of robot table tennis as a test-bed to study several learning paradigms of sequential decision making under the constraints of a physical system. These constraints encouraged the development of learning algorithms focused on modularity, sample efficiency and safety. In imitation learning, we developed robust learning methods for probabilistic movement primitives. We leveraged the probabilistic nature of the primitives in a new set of operators that temporally scale and couple the primitives in a safe way. In reinforcement learning, we developed sample-efficient optimizers to locally improve pre-trained primitives. Sample efficiency was obtained by modeling the agent’s behavior. One of the main takeaways of our work was that modeling the reward was more efficient than modeling the forward dynamics. We then extended our model-based principle to hierarchical reinforcement learning to allow the composition of multiple primitives. In the future, we want to extend our work to the two-robot table tennis system that we have set up at the MPI in Tübingen, which allows training through self-play. We hope that pursuing this goal will improve our understanding of the mechanisms by which robots can autonomously learn skills within the constraints of the physical world.

Publications

  • Exploration in model-based reinforcement learning by empirically estimating learning progress. In Neural Information Processing Systems (NIPS 2012), 2012
    Manuel Lopes, Tobias Lang, and Marc Toussaint
  • Exploration in relational domains for model-based reinforcement learning. Journal of Machine Learning Research, 13:3691–3734, 2012
    Tobias Lang, Marc Toussaint, and Kristian Kersting
  • Active learning for teaching a robot grounded relational symbols. In Proc. of the Int. Joint Conf. on Artificial Intelligence (IJCAI 2013), 2013
    Johannes Kulick, Marc Toussaint, Tobias Lang, and Manuel Lopes
  • Exploiting symmetries for scaling loopy belief propagation and relational training. Machine Learning, 92(1): 91–132, 2013
    Babak Ahmadi, Kristian Kersting, Martin Mladenov, and Sriraam Natarajan
  • Reduce and re-lift: Bootstrapped lifted likelihood maximization for MAP. In Marie desJardins and Michael L. Littman, editors, Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence (AAAI-2013), 2013
    Fabian Hadiji and Kristian Kersting
  • Efficient lifting of MAP LP relaxations using k-locality. In Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics (AISTATS-2014), pages 623–632, 2014
    Martin Mladenov, Kristian Kersting, and Amir Globerson
  • Lifted message passing as reparametrization of graphical models. In Nevin L. Zhang and Jin Tian, editors, Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence (UAI-2014), pages 603–612, 2014
    Martin Mladenov, Amir Globerson, and Kristian Kersting
  • Model-based relational RL when object existence is partially observable. In Proc. of the Int. Conf. on Machine Learning (ICML 2014), 2014
    Ngo Anh Vien and Marc Toussaint
  • Poisson dependency networks: Gradient boosted models for multivariate count data. Machine Learning, 100(2-3):477–507, 2015
    Fabian Hadiji, Alejandro Molina, Sriraam Natarajan, and Kristian Kersting
    (See online at https://doi.org/10.1007/s10994-015-5506-z)
  • “Model-Free Trajectory Optimization for Reinforcement Learning”. In: International Conference on Machine Learning (ICML). 2016
    Riad Akrour, Abbas Abdolmaleki, Hany Abdulsamad, and Gerhard Neumann
  • “Probabilistic Inference for Determining Options in Reinforcement Learning”. In: Machine Learning (2016)
    Christian Daniel, Herke van Hoof, Jan Peters, and Gerhard Neumann
    (See online at https://doi.org/10.1007/s10994-016-5580-x)
  • “Using Probabilistic Movement Primitives for Striking Movements”. In: International Conference on Humanoid Robots (Humanoids). 2016
    Sebastian Gomez-Gonzalez, Gerhard Neumann, Bernhard Schölkopf, and Jan Peters
    (See online at https://doi.org/10.1109/HUMANOIDS.2016.7803322)
  • “Empowered skills”. In: International Conference on Robotics and Automation (ICRA). 2017
    Alexander Gabriel, Riad Akrour, Jan Peters and Gerhard Neumann
    (See online at https://doi.org/10.1109/ICRA.2017.7989760)
  • “Layered Direct Policy Search for Learning Hierarchical Skills”. In: International Conference on Robotics and Automation (ICRA). 2017
    Felix End, Riad Akrour, Jan Peters and Gerhard Neumann
    (See online at https://doi.org/10.1109/ICRA.2017.7989761)
  • “Local Bayesian Optimization of Motor Skills”. In: International Conference on Machine Learning (ICML). 2017
    Riad Akrour, Dmitry Sorokin, Jan Peters, and Gerhard Neumann
  • “Adaptation and Robust Learning of Probabilistic Movement Primitives”. In: IEEE Transactions on Robotics (T-RO) (2018)
    Sebastian Gomez-Gonzalez, Gerhard Neumann, Bernhard Schölkopf, and Jan Peters
    (See online at https://doi.org/10.1109/TRO.2019.2937010)
  • “Model-Free Trajectory-based Policy Optimization with Monotonic Improvement”. In: Journal of Machine Learning Research (JMLR) (2018)
    Riad Akrour, Abbas Abdolmaleki, Hany Abdulsamad, Jan Peters, and Gerhard Neumann
    (See online at https://doi.org/10.5445/IR/1000118268)
  • “Regularizing Reinforcement Learning with State Abstraction”. In: International Conference on Intelligent Robots and Systems (IROS). 2018
    Riad Akrour, Filipe Veiga, Jan Peters, and Gerhard Neumann
    (See online at https://doi.org/10.1109/IROS.2018.8594201)
 
 
