Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include GPU check validation #20840

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

yarunachalam
Copy link
Contributor

Include GPU check validation for NVIDIA test flavor

my @out = split(' ', script_output("kubectl get pods --namespace $namespace -o custom-columns=':metadata.name'"));
record_info("Pod names", join(" ", @out));
POD_LOOP: foreach my $pod (@out) {

if ($pod =~ /^ollama/) {
$ollama_pod = $pod;
Copy link
Contributor

@m-dati m-dati Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you expect 1 only possible ollama pod, then all is ok.

Is there any possiblity that get pods would return 2 or more? In that case you'd store last one only, so, be sure that 1 only is possible here, unless even multiple are equivalent.

@@ -188,6 +194,25 @@ sub install_aistack_chart {
} # pod loop
Copy link
Contributor

@m-dati m-dati Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this point, having you detected eventual errors in pod or logs, it'd be better to soon report here issues and react on that.
Therefore I'd suggest to move here your checks performed below at lines from 211 to 218.

Moreover, also cosider if needed to exit on error detected in log pod: but up to you, based on system requirements.

Copy link
Contributor

@m-dati m-dati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also consider an update as by suggestions in the previous PR #20723 (comment), related to Lines here starting from L261 up to L278

Do not exist for log issues only pod failure.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants