Behavior adaptation by means of reinforcement learning
Author keywords: Optimal Policy, Reinforcement Learning, Autonomous Underwater Vehicle, Learning Sample, Future Reward
Authors:
- Carreras, M.
- El-fakdi, A.
- Ridao, P.
Abstract:
Machine learning techniques can be used to learn the action-decision problem that most autonomous robots face when working in unknown and changing environments. Reinforcement learning (RL) offers the possibility of learning a state-action policy that solves a particular task without any previous experience. A reinforcement function, designed by a human operator, is the only information required to determine, after some experimentation, how to solve the task. This chapter proposes the use of RL algorithms to learn reactive autonomous underwater vehicle (AUV) behaviors, removing the need to define the state-action mapping that solves the task. The algorithms will find the policy that optimizes the task and will adapt to any environment dynamics encountered. The advantage of the approach is that the same algorithms can be applied to a range of tasks, assuming that the problem is correctly sensed and defined. The two main methodologies that have been applied in RL-based robot learning for the past two decades, value-function methods and policy gradient methods, are presented in this chapter and evaluated in two AUV tasks. In both cases, a well-known theoretical algorithm has been modified to fulfill the requirements of the AUV task and has been applied with a real AUV. Results show the effectiveness of both approaches, each with its own advantages and disadvantages, and point to further investigation of these methods as a way to make AUVs perform more robustly and adaptively in future applications.
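As a rough illustration of the value-function family mentioned in the abstract, the sketch below runs tabular Q-learning on a toy one-dimensional corridor. This is not the chapter's algorithm or task: the abstract does not name the algorithms used, and the corridor, the names `reinforcement`, `step`, `q_learning` and `greedy_policy`, and all parameter values are assumptions chosen only to show how a hand-designed reinforcement function alone can yield a state-action policy.

```python
import random

N_STATES = 6        # hypothetical corridor: positions 0..5, position 5 is the goal
ACTIONS = (-1, +1)  # move left or move right


def reinforcement(state):
    """Hand-designed reinforcement function: reward only on reaching the goal."""
    return 1.0 if state == N_STATES - 1 else 0.0


def step(state, action):
    """Toy environment dynamics: move one cell, clipped to the corridor."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, reinforcement(next_state), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])             # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # One-step temporal-difference target; no bootstrap from a terminal state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """State-action mapping read off the learned values (non-terminal states only)."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

The state-action mapping is never written by hand here; it is read off the learned value table, which is the point the abstract makes about avoiding an explicit mapping. Policy gradient methods, the other family named in the abstract, would instead adjust the parameters of the policy directly.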
IMIS is developed and hosted by VLIZ.