Maithra Raghu
Google Brain and Cornell University
maithrar@gmail.com
Alex Irpan
Google Brain
Jacob Andreas
University of California, Berkeley
Robert Kleinberg
Cornell University
Quoc V. Le
Google Brain
Jon Kleinberg
Cornell University
ABSTRACT
Deep reinforcement learning has achieved many recent successes, but our understanding
of its strengths and limitations is hampered by the lack of rich environments
in which we can fully characterize optimal behavior, and correspondingly
diagnose individual actions against such a characterization. Here we consider a
family of combinatorial games, arising from work of Erdos, Selfridge, and Spencer,
and we propose their use as environments for evaluating and comparing different
approaches to reinforcement learning. These games have a number of appealing
features: they are challenging for current learning approaches, but they form (i)
a low-dimensional, simply parametrized environment where (ii) there is a linear
closed-form solution for optimal behavior from any state, and (iii) the difficulty
of the game can be tuned by changing environment parameters in an interpretable
way. We use these Erdos-Selfridge-Spencer games not only to compare different
algorithms, but also to compare approaches based on supervised and reinforcement
learning, to analyze the power of multi-agent approaches in improving performance,
and to evaluate generalization to environments outside the training set.
CAN DEEP REINFORCEMENT LEARNING SOLVE ERDOS-SELFRIDGE-SPENCER GAMES?