Support for Embedding Models #4117
-
Same here, trying to find a working model in GGUF format. Is there anything I did wrong?
-
Did anyone figure out how to make this work? I tried using the server-mode /embedding endpoint to get some embeddings, but all I get is an array of 0.0 values. Do I need to use a specific model for this, or can any model work?
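A quick way to confirm the all-zeros symptom is to request one embedding and test the vector directly. This is only a sketch: the request body (`{"content": ...}`) and response key (`"embedding"`) are assumptions about the server's JSON shape, and `fetch_embedding` and `is_all_zero` are hypothetical helper names. One thing worth ruling out (an assumption, not confirmed in this thread) is that the server may need to be started with embeddings enabled.

```python
import json
import urllib.request


def is_all_zero(embedding, tol=0.0):
    """Return True if the vector is empty or every component is within tol of zero."""
    return all(abs(x) <= tol for x in embedding)


def fetch_embedding(base_url, text):
    """POST text to the server's /embedding endpoint and return the vector.

    Assumed shapes (not confirmed here): {"content": ...} in, {"embedding": [...]} out.
    """
    payload = json.dumps({"content": text}).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/embedding",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["embedding"]
```

With a server running, `is_all_zero(fetch_embedding(base_url, "hello"))` returning `True` would reproduce the problem described above; `base_url` is whatever address your server listens on.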
-
Mainly, I want to add my support for a solid embedding model in GGML. Even without typical LLM capabilities, I think there would be a ton of uses for embedding documents of various types and doing similarity lookup, even if the result were simply the matched passage. That would be so much better than string search.
I have found this fork, which appears to be abandoned but works:
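To make the similarity-lookup idea concrete, here is a minimal sketch in plain Python. It assumes the embedding vectors already exist (from whatever model ends up working), and `cosine_similarity` and `best_match` are hypothetical helper names, not part of GGML or any library mentioned here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_vec, passages):
    """Return (text, score) for the passage most similar to the query.

    passages is a list of (text, vector) pairs.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in passages]
    return max(scored, key=lambda pair: pair[1])
```

Given a query vector and a list of `(passage, vector)` pairs, `best_match` returns the passage text and its score, which is the "simply return the matched passage" case. A linear scan like this is fine for small collections; a larger one would want an index.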
https://github.com/skeskinen/bert.cpp
Note for Windows users: I needed to make two modifications to get it to work:
In the Python sample_client.py, change this
with open(os.path.join(os.path.dirname(__file__), txt_file), 'r') as f:
to this
with open(os.path.join(os.path.dirname(__file__), txt_file), 'r', encoding="utf-8") as f:
And in server.cpp on Windows, change this
ssize_t bytes_received = read(socket, buffer, sizeof(buffer));
to
ssize_t bytes_received = recv(socket, buffer, sizeof(buffer), 0);
With those changes, it at least runs the example.
I also found this discussion from a month ago, which said support seemed close:
#3667
BERT is better than nothing, but BGE is one of the top retrieval embedding models on the Hugging Face embedding leaderboard:
https://huggingface.co/spaces/mteb/leaderboard
Hopefully this isn't too far off, as I'd love to just drop this into an app I'm building.
Thanks for all the effort on GGML - it is an amazing offering btw.