
Accept buffer in LLMPipeline ctor #1262

Draft
pavel-esir wants to merge 4 commits into base: releases/2024/5

Conversation

pavel-esir (Contributor) commented:

Ticket: CVS-158144

@github-actions bot added the category: LLM (LLM pipeline (stateful, static)), category: sampling (Sampling / Decoding algorithms), category: tokenizers (Tokenizer class or submodule update), category: samples (GenAI samples), and category: GenAI C++ API (Changes in GenAI C++ public headers) labels on Nov 27, 2024
@pavel-esir changed the title from "initial" to "Accept buffer in LLMPipeline ctor" on Nov 27, 2024
if (m_core) {
return m_core;
}
m_core = std::make_shared<ov::Core>();
src/cpp/src/utils.hpp (review comment from a Contributor, resolved)
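
For context, the quoted hunk lazily constructs a shared ov::Core on first use. A hypothetical reconstruction of the full helper, since the quote cuts off before the final return (the function name and the storage are assumptions, not the PR's actual code):

#include <memory>
#include "openvino/openvino.hpp"

// Hypothetical reconstruction; only the if/make_shared body is quoted above.
std::shared_ptr<ov::Core> get_core_singleton() {
    static std::shared_ptr<ov::Core> m_core;  // assumed storage for the cached core
    if (m_core) {
        return m_core;
    }
    m_core = std::make_shared<ov::Core>();
    return m_core;  // assumed trailing return, truncated in the quote
}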
@@ -260,6 +260,23 @@ void slice_matmul_statefull_model(std::shared_ptr<ov::Model> model) {
}
}

template <typename T>
Contributor:

A template function cannot be defined in a .cpp file; how does this work?
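
For what it's worth, a template defined in a .cpp file does compile and link as long as every instantiation happens inside that same translation unit. A minimal self-contained sketch (illustrative only, not this PR's code):

// helper.cpp (illustrative only, not from this PR)
template <typename T>
T twice(T v) { return v + v; }  // template defined in a .cpp file

// Links fine: the only instantiation, twice<int>, is requested in the same
// translation unit that contains the definition. Calling twice from another
// .cpp (with just a declaration visible) would fail at link time.
int do_work() { return twice(21); }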


auto ov_tokenizer = core.read_model(tokenizer_model_str, tokenizer_weights_tensor);
auto ov_detokenize = core.read_model(detokenizer_model_str, detokenizer_weights_tensor);
*this = TokenizerImpl(std::make_pair(ov_tokenizer, ov_detokenize), properties);
Contributor:

Instead of such hacks, we can check whether tokenizer_model_str and tokenizer_weights_tensor are non-empty and, if so, call read_model.

In that case we don't need TokenizerImpl(const std::pair<std::shared_ptr<ov::Model>, std::shared_ptr<ov::Model>>& models, const ov::AnyMap& properties)
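
A minimal sketch of that suggestion, reusing the names from the snippet above; the m_tokenizer/m_detokenizer targets and the emptiness test on the tensors are assumptions:

// Hypothetical constructor body following the suggestion above.
if (!tokenizer_model_str.empty() && tokenizer_weights_tensor.get_byte_size() > 0) {
    m_tokenizer = core.read_model(tokenizer_model_str, tokenizer_weights_tensor);  // assumed member
}
if (!detokenizer_model_str.empty() && detokenizer_weights_tensor.get_byte_size() > 0) {
    m_detokenizer = core.read_model(detokenizer_model_str, detokenizer_weights_tensor);  // assumed member
}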

@ilya-lavrenov added this to the 2024.6 milestone on Nov 27, 2024
std::string model_path = models_path + "/openvino_model.xml";
std::string weights_path = std::regex_replace(model_path, std::regex(".xml"), ".bin");
std::ifstream model_file(model_path, std::ios::binary | std::ios::ate);
std::ifstream weights_file(weights_path, std::ios::binary | std::ios::ate);
@ilya-lavrenov (Contributor) commented on Nov 27, 2024:

Let's create a dedicated sample that demonstrates models being loaded via some "decrypt_llm_model", "decrypt_tokenizer", and "decrypt_detokenizer" functions. Such functions would just read the files into a str + Tensor and return them.

Note that the str + Tensor pair can hold ONNX, PDPD, or another model type.
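
A rough sketch of what such helpers could look like; all names are illustrative, and the "decrypt" step is a stand-in for whatever decryption the real sample would perform:

#include <fstream>
#include <string>
#include <utility>
#include "openvino/openvino.hpp"

// Illustrative helpers only; a real sample would decrypt the bytes it reads.
static std::string read_file_to_string(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    return {std::istreambuf_iterator<char>(file), std::istreambuf_iterator<char>()};
}

static ov::Tensor read_file_to_tensor(const std::string& path) {
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    const auto size = static_cast<size_t>(file.tellg());
    ov::Tensor tensor(ov::element::u8, {size});
    file.seekg(0);
    file.read(reinterpret_cast<char*>(tensor.data()), size);
    return tensor;
}

// Returns the model source and its weights, suitable for core.read_model(str, tensor).
std::pair<std::string, ov::Tensor> decrypt_llm_model(const std::string& models_path) {
    return {read_file_to_string(models_path + "/openvino_model.xml"),
            read_file_to_tensor(models_path + "/openvino_model.bin")};
}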

@pavel-esir (Contributor, author) replied:

This sample was added by mistake. Of course, for reading from a buffer I will create a separate sample.

Tokenizer(
std::string& tokenizer_model_str,
ov::Tensor& tokenizer_weights_tensor,
std::string& detokenizer_model_str,
Contributor:

Suggested change:
- std::string& detokenizer_model_str,
+ const std::string& detokenizer_model_str,

@github-actions bot removed the category: samples (GenAI samples) label on Nov 28, 2024
Labels
category: continuous batching (Continuous batching)
category: GenAI C++ API (Changes in GenAI C++ public headers)
category: LLM (LLM pipeline (stateful, static))
category: sampling (Sampling / Decoding algorithms)
category: speculative decoding (Speculative decoding)
category: tokenizers (Tokenizer class or submodule update)
3 participants