Skip to content

policy_gradient没有效果 #12

@yaoxunji

Description

@yaoxunji

https://github.com/ljpzzz/machinelearning/blob/master/reinforcement-learning/policy_gradient.py

直接复制运行
episode: 0 Evaluation Average Reward: 15.0
episode: 100 Evaluation Average Reward: 10.2
episode: 200 Evaluation Average Reward: 9.2
episode: 300 Evaluation Average Reward: 9.3
episode: 400 Evaluation Average Reward: 9.4
episode: 500 Evaluation Average Reward: 9.0
episode: 600 Evaluation Average Reward: 9.6
episode: 700 Evaluation Average Reward: 9.4
episode: 800 Evaluation Average Reward: 9.7
episode: 900 Evaluation Average Reward: 9.5
episode: 1000 Evaluation Average Reward: 9.6
episode: 1100 Evaluation Average Reward: 9.7
episode: 1200 Evaluation Average Reward: 9.1
episode: 1300 Evaluation Average Reward: 9.2
episode: 1400 Evaluation Average Reward: 9.3
episode: 1500 Evaluation Average Reward: 9.3
episode: 1600 Evaluation Average Reward: 9.4
episode: 1700 Evaluation Average Reward: 9.3
episode: 1800 Evaluation Average Reward: 9.4
episode: 1900 Evaluation Average Reward: 8.8
episode: 2000 Evaluation Average Reward: 9.3
episode: 2100 Evaluation Average Reward: 9.4
episode: 2200 Evaluation Average Reward: 9.6
episode: 2300 Evaluation Average Reward: 9.6
episode: 2400 Evaluation Average Reward: 9.3
episode: 2500 Evaluation Average Reward: 9.3
episode: 2600 Evaluation Average Reward: 9.4
episode: 2700 Evaluation Average Reward: 9.7
episode: 2800 Evaluation Average Reward: 9.6
episode: 2900 Evaluation Average Reward: 9.8

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions