demonstration file #9

Open
rasoolfa opened this issue Feb 24, 2020 · 3 comments

rasoolfa commented Feb 24, 2020

Hi @aravindr93,

Thanks for releasing the code and demonstration files for this work.

I have two questions about the demonstration files (e.g. hammer-v0_demos.pickle).
I might be missing something, but how does DDPGfD use the demonstrations when these files contain only (s, a, r) and not (s, s', a, r), where s' is the next state? Also, could you please provide more details about the structure of these files, so that it is easier to compare against and reproduce the results in your paper?

Thanks.

@rasoolfa
Author

I guess one way to use the demonstrations is to collect samples by replaying the actions stored in those files.

@bennevans
Collaborator

Since the observations/states are stored in a time-ordered list, you can get s' by taking the state at the next index in the list.
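The next-index trick can be sketched as follows. This is only an illustration, not the code used for the paper: the function name `trajectory_to_transitions` and the toy data are made up, and it assumes each trajectory is a dictionary holding equal-length 'observations', 'actions', and 'rewards' arrays (the keys listed elsewhere in this thread). The last time step is dropped because it has no recorded successor state.

```python
import numpy as np

def trajectory_to_transitions(traj):
    """Turn one time-ordered trajectory into aligned (s, a, r, s') arrays.

    Assumes 'observations', 'actions' and 'rewards' all have the same
    length T along their first axis (an assumption, not confirmed here).
    """
    obs = np.asarray(traj["observations"])
    act = np.asarray(traj["actions"])
    rew = np.asarray(traj["rewards"])
    # s' for step t is simply the observation stored at index t + 1.
    # The final step has no successor in the file, so it is dropped.
    return {
        "s": obs[:-1],
        "a": act[:-1],
        "r": rew[:-1],
        "s_next": obs[1:],
    }

# Toy trajectory (made-up numbers): 4 steps, 3-dim states, 2-dim actions.
toy = {
    "observations": np.arange(12, dtype=float).reshape(4, 3),
    "actions": np.zeros((4, 2)),
    "rewards": np.array([0.0, 1.0, 2.0, 3.0]),
}
batch = trajectory_to_transitions(toy)
print(batch["s"].shape, batch["s_next"].shape)
```

Whether the final step should be dropped or kept with a terminal flag depends on how the learner treats episode ends, so treat that choice as a placeholder.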

@bennevans
Collaborator

bennevans commented Mar 3, 2020

As for the structure of a file: it is a list of dictionaries, each representing one trajectory. Each dictionary has the keys

['actions', 'observations', 'rewards', 'init_state_dict']

which correspond to the actions, states, and rewards at each time step of the trajectory, plus the initial state information.
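For anyone who wants to check this layout on their own copy, here is a rough sketch of loading such a file and printing the array shapes per trajectory. The helpers `load_demos` and `summarize` are hypothetical names, and the snippet writes a small stand-in file with made-up shapes so it runs without the real data. The contents of 'init_state_dict' are not described in this thread, so the stand-in leaves it empty.

```python
import os
import pickle
import tempfile

import numpy as np

def load_demos(path):
    """Load a demonstration file: a pickled list of trajectory dicts."""
    # Note: only unpickle files from a source you trust.
    with open(path, "rb") as f:
        return pickle.load(f)

def summarize(demos):
    """Return the array shape stored under each per-step key, per trajectory."""
    keys = ("observations", "actions", "rewards")
    return [{key: np.shape(traj[key]) for key in keys} for traj in demos]

# Stand-in data with the same keys as described above (shapes are made up).
fake_demos = [
    {
        "observations": np.zeros((5, 3)),
        "actions": np.zeros((5, 2)),
        "rewards": np.zeros(5),
        "init_state_dict": {},  # placeholder; real contents not specified here
    }
    for _ in range(2)
]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_demos.pickle")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(fake_demos, f)
    demos = load_demos(path)

summary = summarize(demos)
print(len(demos), summary[0])
```

Pointing `load_demos` at a real file such as hammer-v0_demos.pickle should then show the actual number of trajectories and their dimensions.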
