This repository has been archived by the owner on Jun 24, 2024. It is now read-only.
```
warning: llm (lib) generated 1 warning (run `cargo fix --lib -p llm` to apply 1 suggestion)
    Finished release [optimized] target(s) in 0.26s
     Running `target/release/llm infer -m ../models/vicuna-13b-v1.5.Q4_K_M.gguf -p 'Write a long story' -r mistralai/Mistral-7B-v0.1`
⣻ Loading model...2024-02-08T17:56:25.386579Z  INFO infer: cached_path::cache: Cached version of https://huggingface.co/mistralai/Mistral-7B-v0.1/resolve/main/tokenizer.json is up-to-date
✓ Loaded 363 tensors (7.9 GB) after 292ms
The application panicked (crashed).
Message:  not yet implemented
Location: crates/llm-base/src/inference_session.rs:120
Backtrace omitted. Run with RUST_BACKTRACE=1 environment variable to display it.
Run with RUST_BACKTRACE=full to include source snippets.
```
The relevant code appears to be commented out (the session panics with "not yet implemented"), so inference isn't possible. Is there an ETA for resolving this? Can we get an update on the current status? Where is help needed?
Hi, apologies - I realised that updating to the latest llama.cpp would require a rewrite, and it's been hard to find the motivation to do so. I have a few ideas for a redesign / reimplementation, but I haven't made the time to attend to them.
In the meantime, I'd suggest sticking to the gguf branch (which uses an older llama.cpp's GGML and supports Llama/Mistral) or https://github.com/edgenai/llama_cpp-rs .
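For those following the suggestion above, one way to stay on the `gguf` branch is to pin the dependency to it in `Cargo.toml`. This is a minimal sketch, assuming the crate is consumed as a git dependency from the rustformers/llm repository (repository URL inferred from context, not stated in this thread):

```toml
[dependencies]
# Pin to the gguf branch, which uses an older llama.cpp GGML
# and supports Llama/Mistral models.
llm = { git = "https://github.com/rustformers/llm", branch = "gguf" }
```

Alternatively, when building the CLI from a checkout, `git checkout gguf` before `cargo build --release` achieves the same thing.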