add hpt model and corresponding examples. #839
base: main
from .bcq import BCQ
from .edac import EDAC
from .qgpo import QGPO
from .ebm import EBM, AutoregressiveEBM
from .havac import HAVAC
from .policy_stem import PolicyStem
merge policy_stem into hpt
class HPT(nn.Module):
    def __init__(self, state_dim, action_dim):
        super(HPT, self).__init__()
        # Initialize the Policy Stem
add English comments
from ding.utils.registry_factory import MODEL_REGISTRY
from ding.model.template.policy_stem import PolicyStem
@MODEL_REGISTRY.register('hpt')
class HPT(nn.Module):
add the paper link and the original GitHub repo link to credit the work this is based on
# Check whether the model is on the GPU
for param in model.parameters():
    print("Device of model parameter:", param.device)
use logging.info
Description
Added HPT to ding.model.template; it calls PolicyStem. The model can be used to replace existing network structures. A replacement example is in lunarlander_hpt_example.py.
Related Issue
TODO
Check List