
reset_parameters fixes #1199

Merged: 2 commits into main from carmocca/reset-parameters-fixes on Mar 26, 2024
Conversation

carmocca (Contributor)

No description provided.

@carmocca carmocca self-assigned this Mar 26, 2024
```diff
@@ -151,7 +151,8 @@ def scaled_dot_product_attention(
         return y + self.gating_factor * ay

     def reset_parameters(self) -> None:
-        torch.nn.init.zeros_(self.gating_factor)
+        if hasattr(self, "gating_factor"):
+            torch.nn.init.zeros_(self.gating_factor)
```
Collaborator
Good catch. The adapter is only added starting at config.adapter_start_layer, so not all layers have it.
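For context, a minimal hypothetical sketch of the situation the guard handles (AttnBlock and its constructor arguments are illustrative stand-ins, not the repository's actual code): only layers at or past adapter_start_layer create gating_factor, yet reset_parameters() may be called on every layer.

```python
import torch
import torch.nn as nn

class AttnBlock(nn.Module):
    """Illustrative block: only some layers carry the adapter parameter."""

    def __init__(self, block_idx: int, adapter_start_layer: int) -> None:
        super().__init__()
        if block_idx >= adapter_start_layer:
            # Only adapter-enabled layers define gating_factor.
            self.gating_factor = nn.Parameter(torch.empty(1))
        self.reset_parameters()

    def reset_parameters(self) -> None:
        # Without the guard, layers below adapter_start_layer would raise
        # AttributeError here, since they never defined gating_factor.
        if hasattr(self, "gating_factor"):
            torch.nn.init.zeros_(self.gating_factor)
```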

```diff
-        self.lora_A = nn.Parameter(torch.zeros((r, in_features)))
-        self.lora_B = nn.Parameter(torch.zeros((out_features, r)))
+        self.lora_A = nn.Parameter(torch.empty((r, in_features)))
+        self.lora_B = nn.Parameter(torch.empty((out_features, r)))
         self.scaling = self.lora_alpha / self.r
         self.reset_parameters()
```
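The reset_parameters() called here is what makes switching to torch.empty safe: every element is overwritten before use. A sketch of such a method, assuming the standard LoRA initialization (the class name and the exact init are assumptions, not the file's actual body):

```python
import math
import torch.nn as nn

class LoRALayer(nn.Module):  # hypothetical container, for illustration only
    ...

    def reset_parameters(self) -> None:
        # Assumed standard LoRA init: lora_A gets nn.Linear's default kaiming
        # init, lora_B is zeroed so lora_B @ lora_A starts as an exact no-op.
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        nn.init.zeros_(self.lora_B)
```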
Collaborator

I feel like reset_parameters should be called automatically by torch during layer creation, but I'm not at all confident about that.

carmocca (Contributor, Author)

It's more of a silent convention. Only FSDP calls it.
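A minimal sketch of that convention as it shows up in meta-device initialization (the steps below mirror the spirit of what FSDP does when materializing meta-device modules, not its actual internals): storage is allocated uninitialized, and reset_parameters() is the hook expected to fill it in.

```python
import torch
import torch.nn as nn

# Build the module on the meta device: parameters exist but have no storage.
with torch.device("meta"):
    model = nn.Linear(16, 16)

# Materialize with uninitialized memory, then rely on the convention:
model = model.to_empty(device="cpu")
for module in model.modules():
    if callable(getattr(module, "reset_parameters", None)):
        module.reset_parameters()  # the hook FSDP-style flows call
```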

Collaborator

Got it.

@carmocca carmocca merged commit d296c98 into main Mar 26, 2024
6 of 8 checks passed
@carmocca carmocca deleted the carmocca/reset-parameters-fixes branch March 26, 2024 19:33
rasbt pushed a commit that referenced this pull request Apr 3, 2024