Pull requests: openvinotoolkit/openvino.genai
StaticLLMPipeline: Decide when to enable NPUW_DQ_FULL property
category: LLM
LLM pipeline (stateful, static)
category: sampling
Sampling / Decoding algorithms
Fixed pyi file build when OpenVINO_DIR is externally defined
bug
Something isn't working
category: cmake / build
Cmake scripts
category: Python API
Python API for GenAI
Use whole history in case of undetermined tokenization of sequence
category: LLM
LLM pipeline (stateful, static)
category: sampling
Sampling / Decoding algorithms
category: visual language
Visual language pipeline
#1268
opened Nov 27, 2024 by
sbalandi
StaticLLMPipeline: Handle CACHE_DIR for NPUW
category: LLM
LLM pipeline (stateful, static)
category: NPU
port to LTS
PR needs to be ported to LTS
Text2Img models from buffer
category: GenAI C++ API
Changes in GenAI C++ public headers
category: sampling
Sampling / Decoding algorithms
category: text to image
Text 2 image pipeline
[VLM] Support compile OV model and weight from buffer
category: GenAI C++ API
Changes in GenAI C++ public headers
category: sampling
Sampling / Decoding algorithms
category: visual language
Visual language pipeline
VLM performance metrics.
category: GenAI C++ API
Changes in GenAI C++ public headers
category: Python API
Python API for GenAI
category: visual language
Visual language pipeline
category: whisper
Whisper pipeline
Accept buffer in LLMPipeline ctor
category: continuous batching
Continuous batching
category: GenAI C++ API
Changes in GenAI C++ public headers
category: LLM
LLM pipeline (stateful, static)
category: sampling
Sampling / Decoding algorithms
category: speculative decoding
Speculative decoding
category: tokenizers
Tokenizer class or submodule update
Add slice before matmul transformation for CB scenario
category: continuous batching
Continuous batching
category: sampling
Sampling / Decoding algorithms
category: speculative decoding
Speculative decoding
do_not_merge
do_not_review
no-match-files
Update requirements.txt for 2024.6 validation
do_not_merge
#1260
opened Nov 27, 2024 by
peterchen-intel
•
Draft
StaticLLMPipeline: Decide when to enable NPUW_DQ_FULL property
category: LLM
LLM pipeline (stateful, static)
category: NPU
[VLM] Image resize model
category: GHA
CI based on Github actions
category: tokenizers
Tokenizer class or submodule update
category: visual language
Visual language pipeline
Use whole history in case of undetermined tokenization of sequence
category: LLM
LLM pipeline (stateful, static)
category: visual language
Visual language pipeline
no-match-files
port to LTS
PR needs to be ported to LTS
Parallel sampling with threadpool
category: continuous batching
Continuous batching
category: sampling
Sampling / Decoding algorithms
no-match-files
#1252
opened Nov 25, 2024 by
mzegla
fill prompt for sampler analysis with real tokens in VLM pipeline
category: visual language
Visual language pipeline
[Prompt lookup]
category: cmake / build
Cmake scripts
category: continuous batching
Continuous batching
category: GenAI C++ API
Changes in GenAI C++ public headers
category: LLM
LLM pipeline (stateful, static)
category: samples
GenAI samples
category: speculative decoding
Speculative decoding
no-match-files
Static llm pipeline dynamic shape model
category: LLM
LLM pipeline (stateful, static)
category: samples
GenAI samples
#1240
opened Nov 20, 2024 by
AsyaPronina
•
Draft
Parallel sampling with ov::threading
category: cmake / build
Cmake scripts
category: continuous batching
Continuous batching
category: sampling
Sampling / Decoding algorithms
no-match-files
Move beam search in case of chat scenario to sampler.cpp
category: continuous batching
Continuous batching
category: LLM
LLM pipeline (stateful, static)
category: visual language
Visual language pipeline
no-match-files
Added chat template to CLI.
category: WWB
PR changes WWB
#1208
opened Nov 13, 2024 by
andreyanufr
•
Draft
[CPU] Change kvcache default type of PagedAttention to u8 for CPU plugin
category: continuous batching
Continuous batching
category: GHA
CI based on Github actions
Test master logits
category: LLM
LLM pipeline (stateful, static)
category: samples
GenAI samples
category: sampling
Sampling / Decoding algorithms
do_not_merge
do_not_review