Phi-2 model dies upon reaching the context size limit when using console llama.cpp app #4625
Upon reaching the context size limit (2048), the Phi-2 model suddenly goes silent or starts producing gibberish (random letters). Logs are included.

NOTE: The server app is not affected by this bug, only the console app.

llama.cpp versions: 1661 (a7aee47), 1698 (753be37)
Model: https://huggingface.co/TheBloke/dolphin-2_6-phi-2-GGUF
Logs: 1661, 1698
Reference: #4490

Comments
This is not an inference issue, but rather a limitation of the model itself. The same gibberish output occurs when running inference with LLaMA 1. LLaMA 2 is better designed to maintain coherence past its 4k context, but it still normally falls into incoherent loops after ~4.3k.
To my knowledge, upon reaching the context size limit llama.cpp clears the context, then fills half of it with the immediately preceding text and resumes generation. That is not happening here; the model completely breaks down. You can check the attached log. I don't know why everyone outright reads 'gibberish' as 'normal but somewhat incoherent text'.
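For reference, the context shift in the console app looks roughly like the following (a paraphrased sketch of the logic in `examples/main/main.cpp` around these builds, not the verbatim code; variable names and exact bounds may differ):

```cpp
// Sketch of the context shift in examples/main/main.cpp (paraphrased).
// When the next batch would overflow the context window, the oldest
// half of the non-kept tokens is dropped and the rest slides down.
if (n_past + (int) embd.size() > n_ctx) {
    const int n_left    = n_past - params.n_keep - 1; // tokens eligible for discarding
    const int n_discard = n_left / 2;                 // discard the older half

    // erase the discarded span from the KV cache ...
    llama_kv_cache_seq_rm   (ctx, 0, params.n_keep + 1, params.n_keep + 1 + n_discard);
    // ... and shift the remaining tokens down into the freed positions
    llama_kv_cache_seq_shift(ctx, 0, params.n_keep + 1 + n_discard, n_past, -n_discard);

    n_past -= n_discard;
}
```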
Try to keep the system prompt in the context using the `--keep` option.
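For example (a hypothetical invocation; the GGUF filename and prompt are placeholders, and `--keep -1` tells `main` to retain the entire initial prompt through context shifts):

```sh
# Keep the full initial (system) prompt when the 2048-token
# context fills up and gets shifted.
./main -m dolphin-2_6-phi-2.Q4_K_M.gguf -c 2048 --keep -1 \
       -p "System: You are Dolphin, a helpful assistant.\nUser: " -i
```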
The model completely breaks (as in it becomes silent to responses or produces gibberish: a bunch of random letters). I tried running Phi-2 on the llama.cpp server to see what happens when the context gets full, and it doesn't break like the console app: it proceeds by clearing half of the KV cache and refilling it with the immediately preceding text, as it's supposed to, and then continues working normally.
Fixed via #4889 |