
What is the default setting (e.g., total training steps, learning rate) of DQN? #14

Closed
xinghua-qu opened this issue May 6, 2020 · 4 comments


@xinghua-qu

Hi,

In your code, the training parameter settings are imported from utils:
from rlzoo.common.utils import call_default_params

May I check whether there is any documentation that explains what these default settings are and how you fixed them?

@quantumiracle
Member

Hi,
The call_default_params function returns the hyper-parameters stored in two dictionaries, alg_params and learn_params, which can be printed to see what they contain. Hyper-parameters in these two dictionaries can also be changed by the user before instantiating the agent and starting the learning process.
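
For example, here is a minimal sketch of that workflow with DQN, assuming the usage pattern from the RLzoo README; the exact build_env/DQN signatures and the 'train_episodes' key may differ across versions, so print the two dictionaries to confirm the actual keys:

```python
# Minimal sketch (assumed API, following the RLzoo README usage pattern).
from rlzoo.common.env_wrappers import build_env
from rlzoo.common.utils import call_default_params
from rlzoo.algorithms import DQN

EnvName, EnvType = 'CartPole-v0', 'classic_control'
env = build_env(EnvName, EnvType)

# Two dictionaries of defaults: agent construction vs. training settings.
alg_params, learn_params = call_default_params(env, EnvType, 'DQN')
print(alg_params)    # inspect agent hyper-parameters
print(learn_params)  # inspect training hyper-parameters (steps, lr, ...)

# Override a default before instantiating the agent; the key name
# 'train_episodes' is an assumption -- check the printed keys above.
learn_params['train_episodes'] = 500

agent = DQN(**alg_params)
agent.learn(env=env, mode='train', render=False, **learn_params)
```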

If you want to know exactly where the default hyper-parameters come from, they are stored in a separate Python script, default.py, inside each algorithm's folder under ./rlzoo/algorithms/.
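
To inspect one of those files directly, something like this should print its location on disk; the dotted module path is an assumption based on the ./rlzoo/algorithms/<alg>/default.py layout described above:

```python
# Locate the DQN defaults on disk (module path assumed from the layout above).
import rlzoo.algorithms.dqn.default as dqn_defaults
print(dqn_defaults.__file__)  # e.g. .../rlzoo/algorithms/dqn/default.py
```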

@quantumiracle
Member

We will release a new version of RLzoo with a much more explicit hyper-parameter configuration process soon!

@xinghua-qu
Author

Many thanks for the clarification. It's much clearer to me now.
It's really a nicer baseline compared with Stable Baselines and OpenAI Baselines.

BTW, if you can provide some well-tuned benchmark policies (just like what has been done in the Stable Baselines zoo), that would be great. That way, the toolbox could serve as a standard initialization for some research directions (e.g., offline RL and adversarial robustness).

If you already have some policies well trained on Freeway, BankHeist, Boxing, etc., could you please share them?

@quantumiracle
Member

Thanks for your suggestions.
We will consider providing the benchmark policies later on (soon), but right now we do not have these results yet. A thorough benchmark places substantial demands on compute and human labour, so I hope you can understand the situation at present.
