Policy Gradient REINFORCE algorithm not converging. #26

Open
padmaja-kulkarni opened this issue May 16, 2020 · 0 comments
padmaja-kulkarni commented May 16, 2020

First of all, thank you for the tutorial here!

I am trying to run the code from the tutorial. However, the results do not converge after 500 steps as shown in the image 'Reward: Training progress of Policy Gradient RL in Cartpole environment'. Even after 5000 steps, the reward is still around 10. Is this expected?

Thanks again!
