Self-Paced Absolute Learning Progress as a Regularized Approach to Curriculum Learning

Abstract

The usability of Reinforcement Learning is restricted by the long computation times it requires. Curriculum Reinforcement Learning speeds up learning by defining a helpful order in which an agent encounters tasks, i.e., from simple to hard. Curricula based on Absolute Learning Progress (ALP) have proven successful in different environments, but waste computation on repeating already learned behaviour in new tasks. We address this problem by introducing a new regularization method based on Self-Paced (Deep) Learning, called Self-Paced Absolute Learning Progress (SPALP). We evaluate our method in three different environments. SPALP achieves performance comparable to original ALP in all cases and reaches that performance faster than ALP in two of them. We also illustrate possibilities for further improving the efficiency and performance of SPALP.

Type
Publication
This report was created for the course “Integrated Project: Robot Learning” at the Technical University of Darmstadt and is not intended for publication elsewhere.
Tobias Niehues
PhD Student in Cognitive Science

My research interests include statistical modeling of human behavior, especially human decision-making and the interplay of perception and action.