First of all, thank you for the tutorial here!
I am trying to run the code from the tutorial, but the results do not converge after 500 steps as shown in the image "Reward: Training progress of Policy Gradient RL in Cartpole environment". Even after 5000 steps, the reward is still around 10. Is this expected?
Thanks again!