
Scalable Autonomous Reinforcement Learning through Reduction of Prestructuring

Subject Area: Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Funding: 2014 to 2021
Project Identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 260194412
 
Year created: 2019

Summary of Project Results

In summary, the project achieved its planned goals, albeit not all at the same performance level. We accomplished several tasks exactly as planned, addressing the topic of learning state representations for RL. We deviated from the plan by reorienting towards transfer learning, to facilitate easier exploration, and towards representation-related topics, publishing several papers in order to gain more insight into real-world applications of deep neural networks. Because the robot tetherball experiments gave rise to multiple conflicting objectives, we addressed the topic of multi-objective reinforcement learning (MORL), going beyond the minimalist plan of the proposal and achieving state-of-the-art results. Due to a substantial change of personnel, the fragility of the robot hardware, and the large number of repairs needed, the performance of follow-up tasks fell short of expectations. Overall, we gained important insights into scaling many different aspects of reinforcement learning towards autonomy. Based on the results of the project, we are continuing our research towards scalable autonomous reinforcement learning, and we believe we can solve the remaining pieces of the puzzle.
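To illustrate the notion of "multiple conflicting objectives" mentioned above: in multi-objective RL, a policy receives a vector-valued return (e.g. task success vs. energy cost), and no single policy is best for every trade-off. The sketch below is not the project's method, only a minimal, generic Python illustration of two standard MORL building blocks: linear scalarization of a reward vector under a preference weighting, and filtering candidate return vectors down to the Pareto-optimal (non-dominated) set.

```python
def scalarize(reward_vec, weights):
    """Collapse a vector-valued reward into a scalar via a preference weighting."""
    return sum(r * w for r, w in zip(reward_vec, weights))

def dominates(a, b):
    """True if return vector a Pareto-dominates b (>= in every objective, > in at least one)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(returns):
    """Keep only the non-dominated return vectors (the Pareto front)."""
    return [r for r in returns
            if not any(dominates(other, r) for other in returns if other != r)]

# Example: four candidate policies evaluated on two objectives.
candidates = [(3, 1), (1, 3), (2, 2), (1, 1)]
front = pareto_front(candidates)  # (1, 1) is dominated by (2, 2)
```

With equal weights, `scalarize((3, 1), (0.5, 0.5))` yields 2.0; changing the weights expresses a different trade-off between the objectives, which is why a set of Pareto-optimal policies, rather than a single policy, is the natural solution concept.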

Project-related Publications (Selection)

  • Autonomous learning of state representations for control: An emerging field aims to autonomously learn state representations for reinforcement learning agents from their real-world sensor observations. KI - Künstliche Intelligenz, 29(4), Nov 2015
    Wendelin Böhmer, Jost Tobias Springenberg, Joschka Boedecker, Martin Riedmiller, and Klaus Obermayer
    (See online at https://doi.org/10.1007/s13218-015-0356-1)
  • Embed to control: A locally linear latent dynamics model for control from raw images. In Advances in Neural Information Processing Systems (NIPS), 2015
    Manuel Watter, Jost Springenberg, Joschka Boedecker, and Martin Riedmiller
  • Multimodal deep learning for robust RGB-D object recognition. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015
    Andreas Eitel, Jost Tobias Springenberg, Luciano Spinello, Martin Riedmiller, and Wolfram Burgard
    (See online at https://doi.org/10.1109/IROS.2015.7353446)
  • Reinforcement learning vs human programming in tetherball robot games. In Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2015
    Simone Parisi, Hany Abdulsamad, Alexandros Paraschos, Christian Daniel, and Jan Peters
    (See online at https://doi.org/10.1109/IROS.2015.7354296)
  • Local-utopia policy selection for multi-objective reinforcement learning. In Proceedings of the International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2016
    Simone Parisi, Alexander Blank, Tobias Viernickel, and Jan Peters
    (See online at https://doi.org/10.1109/SSCI.2016.7849369)
  • Unsupervised and semi-supervised learning with categorical generative adversarial networks. In International Conference on Learning Representations (ICLR), 2016
    Jost Tobias Springenberg
  • Deep reinforcement learning with successor features for navigation across similar environments. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017
    Jingwei Zhang, Jost Tobias Springenberg, Joschka Boedecker, and Wolfram Burgard
    (See online at https://doi.org/10.1109/IROS.2017.8206049)
  • Goal-driven dimensionality reduction for reinforcement learning. In Proceedings of the International Conference on Intelligent Robots and Systems (IROS), 2017
    Simone Parisi, Simon Ramstedt, and Jan Peters
    (See online at https://doi.org/10.1109/IROS.2017.8206334)
  • Manifold-based multi-objective policy search with sample reuse. Neurocomputing, 263:3–14, 2017
    Simone Parisi, Matteo Pirotta, and Jan Peters
    (See online at https://doi.org/10.1016/j.neucom.2016.11.094)
  • Policy search with high-dimensional context variables. In Proceedings of the Conference on Artificial Intelligence (AAAI), 2017
    Voot Tangkaratt, Herke van Hoof, Simone Parisi, Gerhard Neumann, Jan Peters, and Masashi Sugiyama
 
 
