Implementing the REINFORCE Reinforcement Learning Algorithm in Keras

long_biti 79 0 2018-12-28 23:12:09

# Policy Gradient

Minimal implementation of the stochastic policy gradient algorithm in Keras.

## Pong Agent

![pg](./assets/pg.gif)

This PG agent appears to win more frequently after about 8000 episodes. Below is the score graph.
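As a rough illustration of what REINFORCE computes, the sketch below shows the two core pieces in plain NumPy: the discounted return for each step of an episode, and the loss whose gradient is the Monte Carlo policy gradient. It is not taken from the repository; the function names `discount_rewards` and `reinforce_loss`, the default `gamma=0.99`, and the `1e-8` stabilizer are assumptions made for this example.

```python
import numpy as np


def discount_rewards(rewards, gamma=0.99):
    """Compute discounted returns G_t = r_t + gamma * G_{t+1} for one episode.

    Hypothetical helper for illustration; gamma=0.99 is an assumed default.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    returns = np.zeros_like(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the discounted future reward.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(action_probs, actions, returns):
    """REINFORCE loss: -mean(G_t * log pi(a_t | s_t)).

    action_probs: (T, num_actions) policy outputs, one row per step.
    actions:      (T,) integer indices of the actions actually taken.
    returns:      (T,) discounted returns from discount_rewards.
    """
    action_probs = np.asarray(action_probs, dtype=np.float64)
    actions = np.asarray(actions)
    returns = np.asarray(returns, dtype=np.float64)
    # Probability the policy assigned to each action that was taken.
    chosen = action_probs[np.arange(len(actions)), actions]
    # Small constant (an assumption) avoids log(0).
    return -np.mean(returns * np.log(chosen + 1e-8))


episode_returns = discount_rewards([0.0, 0.0, 1.0], gamma=0.5)
loss = reinforce_loss([[0.5, 0.5]], [0], [1.0])
```

In Keras, one common way to get the same weighting is to train a softmax policy with categorical cross-entropy and pass the returns as `sample_weight` to `fit` or `train_on_batch`; whether this repository does so is not stated in the description.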
