
Remove generate/lora #1115

Merged
2 commits merged into wip on Mar 14, 2024
Conversation

@awaelchli (Contributor) commented Mar 14, 2024

The special script generate/lora.py (litgpt generate lora) is no longer needed as of #1081, because the LoRA weights are now automatically merged into the final checkpoint during finetuning. The docs were already updated.

We could in theory keep this, but the benefits are very minor.

Pros for keeping lora.py:

  • Educational purposes
  • Slightly faster, because you don't need to merge and save the checkpoint first (on the other hand, merging is a one-time cost).

Cons for keeping lora.py:

  • We don't have a chat/lora.py anyway, would we add it?
  • More duplicated code to maintain
  • The generate script still exposes all LoRA hyperparameters, which is error-prone. We would have to fix this like we did for merging, so that the correct parameters are picked up automatically (extra work, but not a real con)

Alternative:

  • If it's confusing to have generate {base, adapter} but no lora, we could also keep the lora CLI command but reroute it to base.py
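For context, the merge that now happens automatically at the end of finetuning folds the low-rank update into the frozen weight, after which a plain base-model generate script works unchanged. A minimal sketch of the idea, assuming the standard LoRA parameterization W' = W + (alpha / r) * B @ A; the function name is hypothetical and not litgpt's actual API:

```python
import torch


def merge_lora_weight(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor,
                      alpha: float, r: int) -> torch.Tensor:
    """Merge a LoRA update into a frozen pretrained weight (sketch).

    W: frozen weight of shape (out_features, in_features)
    A: LoRA down-projection of shape (r, in_features)
    B: LoRA up-projection of shape (out_features, r)
    Returns W + (alpha / r) * B @ A, which has the same shape as W,
    so the merged checkpoint is a drop-in replacement for the base model.
    """
    return W + (alpha / r) * (B @ A)


# Toy example: zero base weight, all-ones LoRA factors.
W = torch.zeros(4, 3)
A = torch.ones(2, 3)
B = torch.ones(4, 2)
merged = merge_lora_weight(W, A, B, alpha=1.0, r=2)
```

Because the merged tensor has the same shape as the original, inference code never needs to know LoRA was used, which is why a separate generate/lora.py becomes redundant.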

@awaelchli awaelchli marked this pull request as draft March 14, 2024 00:06
@rasbt (Collaborator) commented Mar 14, 2024

One thing is that the lora.py generate script shows the elegance of LoRA: you never have to modify the original pretrained model. That said, in practice I don't think it is much of an advantage in terms of runtime or memory, because you always have to run the pretrained model anyway (merged or unmerged, it's the same size).

The only caveat is that merging could require additional memory, but I don't think it does, because of the lazy merging.
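The memory point above can be illustrated: if the update is added into the existing weight tensor in place, layer by layer, no second full-size copy of the model is ever allocated; the only transient extra allocation is the small rank-r product for the current layer. A hedged sketch, not litgpt's actual merge code:

```python
import torch


@torch.no_grad()
def merge_in_place(W: torch.Tensor, A: torch.Tensor, B: torch.Tensor,
                   alpha: float, r: int) -> None:
    """Add the LoRA update directly into the existing weight tensor.

    Using the in-place add_ means peak extra memory is only the
    (out_features, in_features) product for this one layer, not a
    duplicate of the whole model's weights.
    """
    W.add_((alpha / r) * (B @ A))


# Toy example: merging with alpha/r == 1 adds B @ A into W directly.
W = torch.zeros(4, 3)
A = torch.ones(2, 3)
B = torch.ones(4, 2)
merge_in_place(W, A, B, alpha=2.0, r=2)
```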

Overall, I am strongly in favor of removing it, because it will simplify the code base and leave us with less code to maintain, as you said. It will also make writing tutorials etc. easier. Let's remove it :).

@lantiga (Contributor) commented Mar 14, 2024

Let's remove it for now @awaelchli. We can revisit in the future if we want to serve multiple LoRA layers or increase velocity for users, but that would require a proper flow.

@awaelchli awaelchli marked this pull request as ready for review March 14, 2024 18:46
@carmocca carmocca merged commit 232810e into wip Mar 14, 2024
6 of 8 checks passed
@carmocca carmocca deleted the remove/generate-lora branch March 14, 2024 21:14
awaelchli added a commit that referenced this pull request Mar 15, 2024
rasbt pushed a commit that referenced this pull request Mar 18, 2024