
Korean AI Association

Academic Events

2019 Reinforcement Learning Summer Short Course


Speakers and Lecture Introductions
▶ Introduction to Reinforcement Learning
 
Abstract:
Reinforcement learning is a subfield of machine learning, but it applies broadly to automated decision making and AI. This lecture introduces statistical learning techniques in which an agent explicitly takes actions and interacts with the world. As society pays growing attention to interactive agents, such as robots and chatbots, and to intelligent decision making, understanding the importance of learning agents and the challenges they pose has become essential. This lecture introduces the cornerstones of reinforcement learning: the Markov Decision Process (MDP), the exploration/exploitation tradeoff, and decision-making through value functions.
Prof. Chang D. Yoo (KAIST) Homepage: http://slsp.kaist.ac.kr
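As a small illustration of these cornerstones, the sketch below runs value iteration on a made-up two-state MDP and applies an epsilon-greedy rule for the exploration/exploitation tradeoff. The MDP (transitions, rewards, discount factor) is an assumption made purely for illustration and is not taken from the lecture.

    import numpy as np

    # Hypothetical 2-state, 2-action MDP (illustrative assumption only).
    # P[s, a, s'] = transition probability, R[s, a] = expected reward.
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.8, 0.2], [0.1, 0.9]]])
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    gamma = 0.9  # discount factor

    # Value iteration: V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s,a,s') V(s') ]
    V = np.zeros(2)
    for _ in range(1000):
        Q = R + gamma * P @ V            # Q[s, a]
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < 1e-8:
            break
        V = V_new
    greedy_policy = Q.argmax(axis=1)

    # Epsilon-greedy: explore with probability epsilon, otherwise exploit
    # the greedy action under the current value estimates.
    rng = np.random.default_rng(0)
    def epsilon_greedy(s, epsilon=0.1):
        if rng.random() < epsilon:
            return int(rng.integers(R.shape[1]))
        return int(greedy_policy[s])

    print("V =", V, "| greedy policy =", greedy_policy, "| action in state 0:", epsilon_greedy(0))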
▶ Reinforcement Learning with Generalization
 
Abstract:

The theoretical foundation of reinforcement learning rests on solving a Markov Decision Process. This foundation is in deep tension with applications of reinforcement learning, which rely heavily on generalization---the ability to successfully learn what to do given never-before-seen circumstances. We've pursued an agenda of adding generalization to reinforcement learning for well over a decade now, resulting in:
  Contextual Bandits: Addressing learning of immediate rewards with generalization. This is now a service (http://aka.ms/personalizer) that won the AI system of the year award at IJCAI 2019. (A minimal sketch of the contextual-bandit setting appears after this list.)

  Learning to Search: An efficient approach to improving on existing policies, with or without simulators.

  Contextual Decision Processes: A new theory directly combining strategic exploration, generalization, and temporal credit assignment which we've used to solve 2^100-sparse problems.
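As referenced in the first item above, the sketch below illustrates the contextual-bandit setting on a synthetic linear-reward problem with a plain epsilon-greedy learner; the environment and learner are assumptions made for illustration, not the personalizer service or the authors' algorithms.

    import numpy as np

    rng = np.random.default_rng(0)
    n_actions, dim = 3, 5
    true_w = rng.normal(size=(n_actions, dim))   # hidden reward model (synthetic)

    # One linear reward predictor per action, updated online by SGD.
    w = np.zeros((n_actions, dim))
    epsilon, lr = 0.1, 0.05

    total_reward = 0.0
    for t in range(5000):
        x = rng.normal(size=dim)                 # observed context
        if rng.random() < epsilon:               # explore
            a = int(rng.integers(n_actions))
        else:                                    # exploit current estimates
            a = int(np.argmax(w @ x))
        # Bandit feedback: the reward is observed only for the chosen action.
        r = true_w[a] @ x + 0.1 * rng.normal()
        w[a] += lr * (r - w[a] @ x) * x          # squared-error SGD step
        total_reward += r

    print("average reward:", total_reward / 5000)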

John Langford Homepage: http://hunch.net/~jl/
▶ Model-based Reinforcement Learning
 
Abstract:
Reinforcement learning (RL) algorithms, a class of iterative methods that solve optimal control problems through self-play, have demonstrated an ability to succeed in a few arduous tasks, emerging as a general framework for decision making in robotics and neuroscience. Recent studies have improved their design for the sake of adaptation and task generalization. The first part of the course provides a concise introduction to model-based RL theory and algorithms. The second part outlines a new approach to model-based RL design with human-like intelligence, called neuroscience-inspired AI.
Prof. Sang Wan Lee (KAIST) Homepage: http://aibrain.kaist.ac.kr
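To make the term concrete, the sketch below shows one simple flavor of model-based RL on a made-up tabular environment: estimate transition and reward models from experience by counting, then plan against the estimated model with value iteration. The environment and the certainty-equivalence planner are assumptions for illustration, not the specific algorithms covered in this lecture.

    import numpy as np

    rng = np.random.default_rng(1)
    nS, nA, gamma = 3, 2, 0.95

    # Hidden ground-truth dynamics, used only to generate experience (synthetic).
    P_true = rng.dirichlet(np.ones(nS), size=(nS, nA))
    R_true = rng.normal(size=(nS, nA))

    # 1) Collect experience and fit a tabular model by counting.
    counts = np.zeros((nS, nA, nS))
    reward_sum = np.zeros((nS, nA))
    s = 0
    for _ in range(20000):
        a = int(rng.integers(nA))
        s_next = int(rng.choice(nS, p=P_true[s, a]))
        counts[s, a, s_next] += 1
        reward_sum[s, a] += R_true[s, a] + 0.1 * rng.normal()
        s = s_next
    n_sa = counts.sum(axis=2, keepdims=True)
    P_hat = counts / np.maximum(n_sa, 1)               # estimated transitions
    R_hat = reward_sum / np.maximum(n_sa[..., 0], 1)   # estimated rewards

    # 2) Plan in the learned model with value iteration.
    V = np.zeros(nS)
    for _ in range(500):
        V = (R_hat + gamma * P_hat @ V).max(axis=1)
    policy = (R_hat + gamma * P_hat @ V).argmax(axis=1)
    print("planned policy:", policy)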
▶ Model Predictive Control
 
Abstract:
Model predictive control (MPC), a powerful control method, is becoming more popular with the advent of powerful GPUs. MPC's popularity is evidenced by the fact that it was used by top-ranked teams in the 2019 AlphaPilot Innovation Challenge Qualifiers organized by Lockheed Martin. In this course, we will cover not only MPC but also dynamic programming and linear-quadratic regulators, which are precursors to MPC. We will see many similarities between MPC and reinforcement learning, which will undoubtedly help us understand reinforcement learning better.
Prof. Dong Eui Chang (KAIST) Homepage: http://control.kaist.ac.kr
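As a pointer to the dynamic-programming/LQR material mentioned above, the sketch below solves a finite-horizon linear-quadratic regulator by a backward Riccati recursion and then uses it in a receding-horizon loop, which is the basic pattern behind MPC. The double-integrator dynamics and cost weights are assumptions made for illustration.

    import numpy as np

    # Hypothetical discrete-time double integrator: x = [position, velocity].
    dt = 0.1
    A = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([[0.5 * dt**2], [dt]])
    Q = np.diag([1.0, 0.1])   # state cost
    R = np.array([[0.01]])    # control cost
    N = 50                    # planning horizon

    # Backward Riccati recursion (dynamic programming for the LQ problem).
    P = Q.copy()
    gains = []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # feedback gain
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    gains = gains[::-1]        # gains[k] is the gain for step k of the horizon

    # Receding-horizon use (the MPC pattern): apply only the first control,
    # advance the state, and re-plan; here the model is time-invariant, so
    # the same gain sequence is reused instead of being re-solved each step.
    x = np.array([1.0, 0.0])
    for _ in range(100):
        u = -gains[0] @ x
        x = A @ x + B @ u
    print("final state:", x)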
▶ Robot Learning: When Machine Learning Meets Robotics
 
Abstract:
With recent advances in hardware, sensing, and algorithms, we are witnessing the emergence of a new robotics industry. I will present a few examples of new services provided by upcoming service robots that will assist us in the near future in places such as offices, malls, and homes. But for a robot to coexist with humans and operate successfully in crowded and dynamic environments, it must be able to learn from experience to act safely and harmoniously with human participants. I will discuss research challenges for service robots and our attempts to address those challenges. In particular, I will present our recent work on foundations of robot learning: nested sparse networks, which allow a single deep neural network to perform multiple tasks in a resource-aware manner, and Tsallis reinforcement learning, a unifying framework for maximum entropy reinforcement learning. If time permits, I will describe other research activities in our lab.
Prof. Songhwai Oh (Seoul National University) Homepage: http://rllab.snu.ac.kr/
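As a small aside on the maximum entropy RL framework mentioned above, the sketch below runs soft value iteration on a made-up two-state MDP using the standard Shannon-entropy (log-sum-exp) backup; the Tsallis framework presented in the lecture generalizes this kind of backup, so this sketch is only a simplified stand-in, not the lecture's method.

    import numpy as np

    # Hypothetical 2-state, 2-action MDP (illustrative assumption only).
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.8, 0.2], [0.1, 0.9]]])
    R = np.array([[1.0, 0.0],
                  [0.0, 2.0]])
    gamma, alpha = 0.9, 0.5   # discount factor and entropy temperature

    # Soft value iteration: V(s) = alpha * log sum_a exp(Q(s,a) / alpha),
    # a smooth (entropy-regularized) replacement for the hard max.
    V = np.zeros(2)
    for _ in range(2000):
        Q = R + gamma * P @ V
        V = alpha * np.log(np.exp(Q / alpha).sum(axis=1))

    # The optimal policy is a softmax over Q rather than a greedy argmax.
    pi = np.exp((Q - V[:, None]) / alpha)
    print("soft values:", V)
    print("stochastic policy:\n", pi)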