fix: Fix various tests caused by cases byte size validation not handled properly #364
…e split across buffers
src/infer_request.cc
int64_t buffer_memory_id;

// Validate elements until all buffers have been fully processed.
while (remaining_buffer_size || buffer_idx < buffer_count) {
Why not structure it like below, to keep the variables in scope?
for (buf : buffers) {
  buffer_size = ...;
  for (checked_size = 0; checked_size < buffer_size; ...) {
    ...
  }
}
Re-assigning to @yinggeh, as this involves a lot of refactoring of the original validation logic feature he wrote, and he seems to have some ideas for further refactoring.
See new comments in opened discussion.
Please update the description with the latest information and pipeline ID; I need to see the test status.
I believe I have updated it with the latest info and pipeline ID. Is there anything in particular you think is missing?
I can't approve this PR because I originally opened it, but I am verbally approving this if all the affected CI tests in the test plan description are passing. I'd like to get these fixes in ASAP to fix the broken tests in CI and reduce cherry-picks.
I added a follow-up ticket DLIS-6833 to address some of the logical refactoring brought up by Guan after code freeze.
Please also review the related server PR: github.com/triton-inference-server/server/pull/7326
What does the PR do?
The raw binary tensor in the failing test case consists of two memory chunks by the time it gets validated:
This PR refactors the validation to handle any series of buffers, where an element is not guaranteed to fit within a single buffer and a buffer is not guaranteed to contain a single element. It also skips byte size validation if the input memory type is GPU or the input platform is TensorRT.
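The buffer walk described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/infer_request.cc: it assumes each element is a 4-byte length prefix in host byte order followed by that many payload bytes, and the names Buffer, AppendElement, and ValidateBytesBuffers are hypothetical. It does not model the GPU or TensorRT skip.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for one memory chunk of a raw input tensor.
struct Buffer {
  const char* data;
  size_t size;
};

// Test helper (assumption): serializes one element as <uint32 length><payload>.
void AppendElement(std::string* out, const std::string& element)
{
  assert(element.size() <= UINT32_MAX);
  const uint32_t len = static_cast<uint32_t>(element.size());
  out->append(reinterpret_cast<const char*>(&len), sizeof(len));
  out->append(element);
}

// Walks length-prefixed elements across a series of buffers.
// Returns an empty string on success, otherwise an error message.
std::string ValidateBytesBuffers(
    const std::vector<Buffer>& buffers, size_t expected_element_count)
{
  size_t element_count = 0;
  // Payload bytes still owed to the element currently being consumed.
  size_t remaining_element_size = 0;

  for (const Buffer& buf : buffers) {
    size_t offset = 0;
    while (offset < buf.size) {
      const size_t available = buf.size - offset;
      if (remaining_element_size > 0) {
        // The payload may have started in an earlier buffer; keep consuming.
        const size_t consumed = std::min(available, remaining_element_size);
        offset += consumed;
        remaining_element_size -= consumed;
        continue;
      }
      // At an element boundary the whole prefix must be in this buffer.
      if (available < sizeof(uint32_t)) {
        return "element byte size indicator is split across buffers";
      }
      uint32_t len = 0;
      std::memcpy(&len, buf.data + offset, sizeof(len));
      offset += sizeof(len);
      remaining_element_size = len;
      ++element_count;
    }
  }

  if (remaining_element_size != 0) {
    return "last element extends past the end of the final buffer";
  }
  if (element_count != expected_element_count) {
    return "unexpected number of elements";
  }
  return "";
}
```

Scoping the offset to the per-buffer loop keeps the bookkeeping local, so only the unfinished element size has to be carried from one buffer to the next.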
Caveats:
The byte_size indicator for a given element is assumed to be contained within a single buffer and not split across buffers. If a byte_size is split across buffers, an error is returned. This was just easier to implement; open to suggestions if anyone has a slick solution. Technically the buffer APIs should support doing this, but I assume it is unlikely to be done. Example of this currently unsupported case:
buffer1=[<size1><element1><size2_partial>]
buffer2=[<size2_partial><element2>...]
Here <size2_partial> is split across buffers, and is currently rejected.
Open Items
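For the split byte_size case in the caveat, one possible direction is to accumulate the prefix bytes across buffers before interpreting them. The sketch below is only an illustration of that idea and is not part of this PR; PrefixAccumulator is a hypothetical helper, and it assumes a 4-byte length prefix in host byte order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: collects the 4 bytes of a length prefix even when
// they arrive in pieces from consecutive buffers.
class PrefixAccumulator {
 public:
  // Copies at most the bytes still missing from `data`.
  // Returns how many bytes were taken from the caller's buffer.
  size_t Feed(const char* data, size_t size)
  {
    const size_t taken = std::min(size, sizeof(bytes_) - filled_);
    std::memcpy(bytes_ + filled_, data, taken);
    filled_ += taken;
    return taken;
  }

  // True once all 4 prefix bytes have been collected.
  bool Complete() const { return filled_ == sizeof(bytes_); }

  // Decoded element size; only valid once Complete() is true.
  uint32_t Value() const
  {
    assert(Complete());
    uint32_t value = 0;
    std::memcpy(&value, bytes_, sizeof(value));
    return value;
  }

  // Prepares the accumulator for the next element's prefix.
  void Reset() { filled_ = 0; }

 private:
  char bytes_[sizeof(uint32_t)] = {};
  size_t filled_ = 0;
};
```

A validation loop could call Feed at each element boundary, advance its offset by the returned count, and only start consuming payload once Complete() is true. Whether this is worth the extra state depends on how likely split prefixes are in practice.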
Checklist
<commit_type>: <Title>
Commit Type:
Check the conventional commit type box here and add the label to the GitHub PR.
Related PRs:
triton-inference-server/server#7326
Where should the reviewer start?
N/A
Test plan:
CI Pipeline ID: 15635924
Background
None
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
None