enhance: prevent multiple query nodes from causing excessive occupancy of a single node, leading to GPU memory overflow. #38617

Open. Wants to merge 1 commit into base: master.

Presburger
Member

Add GPU resource watermark management to prevent excessive loading, which could otherwise lead to system crashes.

@sre-ci-robot added the area/compilation and size/L (denotes a PR that changes 100-499 lines) labels on Dec 20, 2024
@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Presburger
To complete the pull request process, please assign czs007 after the PR has been reviewed.
You can assign the PR to them by writing /assign @czs007 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Presburger changed the title from "prevent multiple query nodes from causing excessive occupancy of a single node, leading to GPU memory overflow." to "enhence: prevent multiple query nodes from causing excessive occupancy of a single node, leading to GPU memory overflow." on Dec 20, 2024
@mergify bot added the dco-passed (DCO check passed) label on Dec 20, 2024
Contributor

mergify bot commented Dec 20, 2024

@Presburger

Invalid PR Title Format Detected

Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:

  1. Title Format: The PR title must begin with one of these prefixes:
  • feat: for introducing a new feature.
  • fix: for bug fixes.
  • enhance: for improvements to existing functionality.
  • test: for adding tests to existing functionality.
  • doc: for modifying documentation.
  • auto: for pull requests from a bot.
  2. Description Requirement: The PR must include a non-empty description, detailing the changes and their impact.

Required Title Structure:

[Type]: [Description of the PR]

Where Type is one of feat, fix, enhance, test or doc.

Example:

enhance: improve search performance significantly 

Please review and update your PR to comply with these guidelines.
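The title rule quoted above can be approximated with a single regular expression. The following Go sketch is only an illustration of that rule, not the bot's actual implementation; the names `titlePattern` and `validTitle` are hypothetical, and the pattern assumes the prefix must be followed by a colon and a non-empty description.

```go
package main

import (
	"fmt"
	"regexp"
)

// titlePattern is a hypothetical encoding of the rule: one of the listed
// prefixes, a colon, optional whitespace, then a non-empty description.
var titlePattern = regexp.MustCompile(`^(feat|fix|enhance|test|doc|auto):\s*\S.*$`)

// validTitle reports whether a PR title matches the assumed format.
func validTitle(title string) bool {
	return titlePattern.MatchString(title)
}

func main() {
	titles := []string{
		"prevent multiple query nodes from causing excessive occupancy of a single node",
		"enhence: prevent multiple query nodes from causing excessive occupancy of a single node",
		"enhance: prevent multiple query nodes from causing excessive occupancy of a single node",
	}
	for _, t := range titles {
		fmt.Printf("%t  %s\n", validTitle(t), t)
	}
}
```

Under this pattern, a misspelled prefix such as "enhence:" is rejected in the same way as a title with no prefix at all.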

@Presburger changed the title from "enhence: prevent multiple query nodes from causing excessive occupancy of a single node, leading to GPU memory overflow." to "enhance: prevent multiple query nodes from causing excessive occupancy of a single node, leading to GPU memory overflow." on Dec 20, 2024
@mergify bot added the kind/enhancement (issues or changes related to enhancement) label and removed the do-not-merge/invalid-pr-format label on Dec 20, 2024
Contributor

mergify bot commented Dec 20, 2024

@Presburger Please link the related issue in the body of your pull request (e.g. “issue: #”).


codecov bot commented Dec 20, 2024

Codecov Report

Attention: Patch coverage is 45.61404% with 31 lines in your changes missing coverage. Please review.

Project coverage is 81.03%. Comparing base (bb5f38e) to head (e78def5).
Report is 7 commits behind head on master.

Files with missing lines                          Patch %   Lines
internal/querynodev2/segments/segment_loader.go   31.11%    29 Missing and 2 partials ⚠️
Additional details and impacted files

@@           Coverage Diff           @@
##           master   #38617   +/-   ##
=======================================
  Coverage   81.02%   81.03%           
=======================================
  Files        1380     1381    +1     
  Lines      195145   195208   +63     
=======================================
+ Hits       158115   158183   +68     
- Misses      31448    31453    +5     
+ Partials     5582     5572   -10     
Components   Coverage Δ
Client       78.26% <ø> (ø)
Core         69.33% <ø> (ø)
Go           83.01% <45.61%> (+<0.01%) ⬆️

Files with missing lines                          Coverage Δ
pkg/util/hardware/gpu_mem_info.go                 100.00% <100.00%> (ø)
pkg/util/paramtable/component_param.go            98.38% <100.00%> (+<0.01%) ⬆️
internal/querynodev2/segments/segment_loader.go   71.40% <31.11%> (-1.55%) ⬇️

... and 29 files with indirect coverage changes

@Presburger force-pushed the master branch 2 times, most recently from d5c6921 to e78def5, on December 24, 2024 09:46
@mergify bot added the ci-passed label on Dec 24, 2024
Labels: area/compilation, ci-passed, dco-passed (DCO check passed), do-not-merge/missing-related-issue, kind/enhancement (issues or changes related to enhancement), size/L (denotes a PR that changes 100-499 lines)