
Phi-2 model dies upon reaching the context size limit when using console llama.cpp app #4625

Closed
Slider2k opened this issue Dec 24, 2023 · 5 comments


@Slider2k

Slider2k commented Dec 24, 2023

Upon reaching the context size limit (2048), the Phi-2 model suddenly becomes silent or starts producing gibberish (random letters). A log is included.
NOTE: The server app is not affected by this bug, only the console app.

llama.cpp: version: 1661 (a7aee47), 1698 (753be37)
Model: https://huggingface.co/TheBloke/dolphin-2_6-phi-2-GGUF

Logs:
1661
1698

Reference: #4490

@BarfingLemurs
Contributor

This is not an inference issue, but rather a limitation of the model itself. The same gibberish output occurs when running inference with LLaMA 1.

LLaMA 2 is better at maintaining coherence past its 4k context, but it still typically falls into incoherent loops after ~4.3k tokens.

@Slider2k
Author

Slider2k commented Dec 26, 2023

To my knowledge, upon reaching the context size limit llama.cpp clears the context, refills it halfway with the immediately preceding text, and resumes generation. That is not happening here; the model completely breaks down, as the attached log shows. I don't know why everyone reads 'gibberish' as 'normal text, but incoherent'.

@ggerganov
Owner

Try to keep the system prompt in the context using --keep and see if this helps

@Slider2k
Author

Slider2k commented Dec 27, 2023

@ggerganov

> Try to keep the system prompt in the context using --keep and see if this helps

The model completely breaks (it either stops responding or produces gibberish: a bunch of random letters). I tried running Phi-2 on the llama.cpp server to see what happens when the context gets full, and it doesn't break like the console app does: it proceeds by clearing the KV cache and refilling it halfway with the immediately preceding text, as it's supposed to, then continues working normally.

@Slider2k Slider2k changed the title Phi-2 model dies upon reaching the context size limit Phi-2 model dies upon reaching the context size limit when using console llama.cpp app Dec 28, 2023
@ggerganov
Owner

Fixed via #4889
