Accept buffer in LLMPipeline ctor #1262
base: releases/2024/5
Conversation
if (m_core) {
    return m_core;
}
m_core = std::make_shared<ov::Core>();
@@ -260,6 +260,23 @@ void slice_matmul_statefull_model(std::shared_ptr<ov::Model> model) {
    }
}

template <typename T>
A template function cannot normally be defined in a .cpp file; how does this work here? (See the sketch below.)
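For context, a template defined in a .cpp file is still usable within that same translation unit, and it can also be exposed to other translation units via explicit instantiation. A minimal sketch of the latter, with hypothetical file and function names (not the actual patch):

// utils.cpp -- the template body is hidden from other translation units.
#include <cstddef>
#include <vector>

template <typename T>
T read_value(const std::vector<T>& values, std::size_t index) {
    return values.at(index);
}

// Explicit instantiations force the compiler to emit these symbols here,
// so callers in other .cpp files link against them without seeing the body.
template float read_value<float>(const std::vector<float>&, std::size_t);
template int read_value<int>(const std::vector<int>&, std::size_t);

If the template is only called from within the same .cpp file, implicit instantiation there is enough and no explicit instantiation is needed.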
auto ov_tokenizer = core.read_model(tokenizer_model_str, tokenizer_weights_tensor);
auto ov_detokenize = core.read_model(detokenizer_model_str, detokenizer_weights_tensor);
*this = TokenizerImpl(std::make_pair(ov_tokenizer, ov_detokenize), properties);
Instead of such hacks, we can check whether tokenizer_model_str and tokenizer_weights_tensor are non-empty and, if so, call read_model (see the sketch below).
In that case we don't need TokenizerImpl(const std::pair<std::shared_ptr<ov::Model>, std::shared_ptr<ov::Model>>& models, const ov::AnyMap& properties).
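A minimal sketch of that suggestion, assuming ov::Tensor's explicit operator bool reports whether the tensor holds data (variable names are taken from the snippet above; the surrounding structure is hypothetical):

// Hypothetical constructor body: read from in-memory buffers only when provided.
if (!tokenizer_model_str.empty() && tokenizer_weights_tensor) {
    auto ov_tokenizer = core.read_model(tokenizer_model_str, tokenizer_weights_tensor);
    auto ov_detokenizer = core.read_model(detokenizer_model_str, detokenizer_weights_tensor);
    // ... continue initialization with the two models directly ...
}

This keeps a single construction path and removes the need for the pair-based TokenizerImpl overload.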
std::string model_path = models_path + "/openvino_model.xml";
std::string weights_path = std::regex_replace(model_path, std::regex(".xml"), ".bin");
std::ifstream model_file(model_path, std::ios::binary | std::ios::ate);
std::ifstream weights_file(weights_path, std::ios::binary | std::ios::ate);
Let's create a dedicated sample which demonstrates that models are loaded via some "decrypt_llm_model", "decrypt_tokenizer", and "decrypt_detokenizer" functions. Such functions just read the files into a std::string + ov::Tensor and return them; a sketch follows below.
Note: the std::string + ov::Tensor pair can hold an ONNX, PDPD, or other model type.
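A minimal sketch of such a helper, assuming the "decryption" is simply reading the files verbatim from disk (the function name and signature are placeholders, not the final sample API):

#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <utility>
#include <openvino/runtime/core.hpp>

// Hypothetical helper: reads a model into a std::string (the model text)
// plus an ov::Tensor (the weights), the pair that ov::Core::read_model() accepts.
std::pair<std::string, ov::Tensor> decrypt_model(const std::string& model_path,
                                                 const std::string& weights_path) {
    std::ifstream model_file(model_path);
    std::string model_str{std::istreambuf_iterator<char>(model_file),
                          std::istreambuf_iterator<char>()};

    std::ifstream weights_file(weights_path, std::ios::binary | std::ios::ate);
    const std::size_t size = weights_file.tellg();
    ov::Tensor weights(ov::element::u8, {size});
    weights_file.seekg(0, std::ios::beg);
    weights_file.read(reinterpret_cast<char*>(weights.data()), size);

    return {std::move(model_str), std::move(weights)};
}

The returned pair can then be passed to ov::Core::read_model(model_str, weights); the same buffer-based path works whether the buffers hold IR, ONNX, PDPD, or another format the core can read from memory.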
This sample was added by mistake. Of course, for reading from a buffer I will create a separate sample.
Tokenizer(
    std::string& tokenizer_model_str,
    ov::Tensor& tokenizer_weights_tensor,
    std::string& detokenizer_model_str,
Suggested change:
-    std::string& detokenizer_model_str,
+    const std::string& detokenizer_model_str,
Ticket: CVS-158144