Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

t176 #3329

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from
Draft

t176 #3329

wants to merge 7 commits into from

Conversation

waltdisgrace
Copy link
Contributor

No description provided.

@waltdisgrace waltdisgrace marked this pull request as draft January 17, 2024 15:58
Copy link
Contributor

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start

include/crm/common/output.h Outdated Show resolved Hide resolved
lib/common/output_xml.c Outdated Show resolved Hide resolved
lib/pacemaker/pcmk_output.c Outdated Show resolved Hide resolved
@kgaillot
Copy link
Contributor

Changed my mind on validate vs verify, updated comment :)

Copy link
Contributor

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, it's getting there

lib/pacemaker/pcmk_output.c Outdated Show resolved Hide resolved
lib/pacemaker/pcmk_output.c Outdated Show resolved Hide resolved
cts/cli/regression.crm_mon.exp Outdated Show resolved Hide resolved
@@ -27,13 +27,13 @@
<shortdesc lang="en">Pacemaker controller options</shortdesc>
<parameters>
<parameter name="dc-version">
<longdesc lang="en">Includes a hash which identifies the exact revision the code was built from. Used for diagnostic purposes.</longdesc>
<longdesc lang="en">Includes a hash which identifies the exact changeset the code was built from. Used for diagnostic purposes.</longdesc>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was updated in #3325. I'm not sure how you ended up with just some of the changes from that; maybe try exporting your commits as patches and reapply them to a clean branch.

@kgaillot
Copy link
Contributor

kgaillot commented Feb 1, 2024

Seeing the XML output reminds me that I forgot the schema portion of the crm_verify project :/ I'll open a new task for that ...

@kgaillot
Copy link
Contributor

kgaillot commented Feb 1, 2024

Seeing the XML output reminds me that I forgot the schema portion of the crm_verify project :/ I'll open a new task for that ...

Actually, crm_verify XML output only has status and errors, so it fits in the existing schemas. So, all we need to do is update the crm_mon schema for this project once it's ready.

Copy link
Contributor

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great.

Add crm_mon regression tests -- you can use one of the crm_verify_invalid*xml files to test invalid syntax and crm_mon.xml to test valid syntax. Test text and XML for each using defaults, and also test --include=verifications for text+valid and --exclude=verifications for XML+invalid

Is there anything left you're unsure about?

lib/pacemaker/pcmk_output.c Outdated Show resolved Hide resolved
include/crm/common/output.h Outdated Show resolved Hide resolved
lib/pacemaker/pcmk_output.c Outdated Show resolved Hide resolved
lib/pacemaker/pcmk_output.c Outdated Show resolved Hide resolved
tools/crm_mon.c Show resolved Hide resolved
@kgaillot
Copy link
Contributor

Add crm_mon regression tests -- you can use one of the crm_verify_invalid*xml files to test invalid syntax and crm_mon.xml to test valid syntax. Test text and XML for each using defaults, and also test --include=verifications for text+valid and --exclude=verifications for XML+invalid

Actually we have plenty of text/XML + valid tests, so we just need to add tests for invalid and --include/--exclude

Copy link
Contributor

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting there! When you're close to happy with it, flatten the code commits and the test output commits for the next review. Thanks!

pcmk__output_free(verify_out);

if (verify_rc == pcmk_rc_ok) {
if (pcmk_is_set(section_opts, pcmk_section_verify)) {
PCMK__OUTPUT_LIST_HEADER(out, false, rc, "Cluster Summary");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this macro works is that rc has to be initialized to pcmk_rc_no_output, and the macro will output the header the first time it's called and change rc to pcmk_rc_ok. Later calls will do nothing since the rc is changed. So, all calls for the same header have to use the same rc variable, which is why the header checks are all in the cluster-summary implementations.

This is a little different since we always want to output errors. Probably the easiest approach is to separate the verification itself from the output. We can always do the verification in the cluster-summary implementations, then output the message if the flag is set or there are errors.

Copy link
Contributor Author

@waltdisgrace waltdisgrace May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make sure I'm understanding this correctly, do you think the verification (pcmk__verify() call) should happen in cluster-summary and the output (PCMK__OUTPUT_LIST_HEADER() and out->list_item()) should happen in cluster-verify?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking cluster-summary should call pcmk__verify() and output the header, and cluster-verify would just output the message based on the rc. however at that point we might as well just do everything in cluster-summary and not have a separate message for cluster-verify

scheduler = pe_new_working_set();
scheduler->priv = verify_out;

verify_rc = pcmk__verify(scheduler, verify_out, scheduler->input);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't want a new scheduler object here, because it won't have input set to anything. To avoid output, you can save the initial priv value, reset it as you have here, then set it back after this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I use the old scheduler object, there are some different failures:

  • Failed (rc=000): crm_resource - Try to move a resource to its existing location
  • Failed (rc=064): crm_resource - Move a resource from its existing location
  • Failed (rc=108): crm_resource - Move dummy to node1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling pcmk__verify() shouldn't have any effects on the scheduler object, not sure how that's possible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try rebasing on current main and see if that changes anything

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also ditto for scheduler->priv, that could be causing output that's changing the tests


pcmk__output_new(&verify_out, "none", NULL, NULL);
verify_rc = pcmk__verify(scheduler, verify_out, scheduler->input);
pcmk__output_free(verify_out);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also need to set scheduler->priv = verify_out before calling pcmk__verify() (and save and restore the original value) because the scheduler code will output messages there too (pcmk__output_cluster_status() sets scheduler->priv before calling the cluster-status message)

out->info(out, "CIB syntax is valid");
}
} else {
out->info(out, "CIB syntax has errors (for details, run crm_verify -LV).");
Copy link
Contributor

@kgaillot kgaillot May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

info() does nothing for XML, you have to add the XML element and attribute like you defined in the schema

scheduler = pe_new_working_set();
scheduler->priv = verify_out;

verify_rc = pcmk__verify(scheduler, verify_out, scheduler->input);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also ditto for scheduler->priv, that could be causing output that's changing the tests

pcmk__output_free(verify_out);

if (verify_rc == pcmk_rc_ok) {
if (pcmk_is_set(section_opts, pcmk_section_verify)) {
PCMK__OUTPUT_LIST_HEADER(out, false, rc, "Cluster Summary");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking cluster-summary should call pcmk__verify() and output the header, and cluster-verify would just output the message based on the rc. however at that point we might as well just do everything in cluster-summary and not have a separate message for cluster-verify

Copy link
Contributor

@kgaillot kgaillot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see what's happening ...

@@ -18,6 +18,8 @@
#include <crm/common/xml.h>
#include <crm/pengine/internal.h>

#include <pcmki/pcmki_verify.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a problem ...

libpe_status can't use anything from libpacemaker, which is higher in the library stack. We could move pcmk__verify() to libpe_status except that it calls pcmk__schedule_actions() from libpacemaker. So, we'll have to move the cluster-summary messages from libpe_status (lib/pengine/pe_output.c) to libpacemaker (lib/pacemaker/pcmk_output.c). The only caller of cluster-summary is in libpacemaker anyway, and output messages are not public API, so there's no problem that way.

cluster-summary and other messages use get_node_feature_set(), so that will have to be exposed (with a pcmk__ prefix)

@@ -450,6 +458,28 @@ cluster_summary(pcmk__output_t *out, va_list args) {
scheduler->localhost, last_written, user, client, origin);
}

// Use the existing scheduler, but avoid scheduler output
pcmk__output_new(&verify_out, "none", NULL, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check return value for errors

} else {
/* If there are verification errors, always print a statement about that, even if not requested */
PCMK__OUTPUT_LIST_HEADER(out, false, rc, "Cluster Summary");
out->list_item(out, NULL, "CIB syntax has errors (for details, run crm_verify -LV)");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking at this again, let's omit the crm_verify arguments -- the user could be running against a CIB_file. it's up to the user to figure out what arguments they need.

priv_orig = scheduler->priv;
scheduler->priv = verify_out;

verify_rc = pcmk__verify(scheduler, verify_out, scheduler->input);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's happening is that pcmk__verify() calls pcmk__schedule_actions(), and then crm_simulate calls pcmk__schedule_actions() again, and it's not happy with being called twice.

One of these two approaches should work:

  • Call pe_reset_working_set() after running pcmk__verify(). That will free scheduler->input, so you'll have to make a copy of that first and set it back afterward.
  • Create a new scheduler object and copy the input and now members.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants