Sequential Classification Based Optimization for Direct Policy Search

iericzhu 17 0 PDF 2021-03-02 00:03:22

Direct policy search often results in high-quality policies in complex reinforcement learning problems, which employs some optimization algorithms to search the parameters of the policy for maximizing the its total reward. Classificationbased optimization is a recently developed framework for deriva

用户评论
请输入评论内容
评分:
暂无评论