TestHistoryMetrics.testEvents graph changes structure in updates-testing #21237

cockpituous · 2024-11-09T02:26:19Z

The job fedora-40/updates-testing failed on commit f474f0d.

Log: https://cockpit-logs.us-east-1.linodeobjects.com/pull-0-f474f0da-20241109-013334-fedora-40-updates-testing/log.html

The text was updated successfully, but these errors were encountered:

martinpitt · 2024-11-10T07:42:33Z

Confirmed in #21241 this seems to be real.

jelly · 2024-11-13T11:34:29Z

Relevant package updates:

 pcp                   x86_64 6.3.2-1.fc40                updates-testing 1.5 M
 pcp-conf              x86_64 6.3.2-1.fc40                updates-testing  30 k
 pcp-libs              x86_64 6.3.2-1.fc40                updates-testing 651 k
 pcp-selinux           x86_64 6.3.2-1.fc40                updates-testing  31 k

Previous version:

pcp-selinux-6.3.1-1.fc40.x86_64
pcp-conf-6.3.1-1.fc40.x86_64
pcp-libs-6.3.1-1.fc40.x86_64
pcp-6.3.1-1.fc40.x86_64
python3-pcp-6.3.1-1.fc40.x86_64

Changelog: https://github.com/performancecopilot/pcp/blob/main/CHANGELOG

Test assertion failure:

  File "/home/jelle/projects/cockpit/main/./test/verify/check-metrics", line 306, in testEvents
    self.assertGreaterEqual(getCompressedMinuteValue(test=self, g_type="cpu", saturation=True, hour=1598950800000, minute=8), 0.3)
AssertionError: 0.12799999713897706 not greater than or equal to 0.3

The metrics are historical so from a known dump so the values should not really change.

That version changes the interpretation of existing data somewhat, and thus events slightly move around. Make the assertions flexible enough to work with old and new versions. There was not much science in the original assertions anyway -- I simply recorded some stuff and then pinned down the status quo. We can make this stricter again once all supported OSes have PCP ≥ 6.3.2 (but that'll be a few years). Addresses the non-pixel parts of cockpit-project#21237

martinpitt · 2024-11-21T11:44:02Z

#21295 should fix the textual bits. Making the pixel tests accomodate both versions is just too awful. https://bodhi.fedoraproject.org/updates/FEDORA-2024-22b21606d8 should hit updates in one or two days, then let's refresh the image and accept the new pixel refs. We can suffer through two more nightly failures. (There's also the Python crash, though)

mvollmer · 2024-11-21T12:17:07Z

I could reproduce the differences with pmval:

With pcp-6.3.1-1.fc40.x86_64:

# pmval -t 60 -a /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta kernel.all.load

metric:    kernel.all.load
archive:   /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta
host:      localhost.localdomain
start:     Tue Sep  1 08:59:46 2020
end:       Tue Sep  1 09:08:16 2020
semantics: instantaneous value
units:     none
samples:   9
interval:  60.00 sec
08:59:46.594  No values available

                 1 minute      5 minute     15 minute 
09:00:46.594       0.6400        0.1300     4.000E-02 
09:01:46.594       0.4600        0.1700     6.000E-02 
09:02:46.594       0.2900        0.1800     7.000E-02 
09:03:46.594       0.3300        0.2100     9.000E-02 
09:04:46.594      32.61          7.640         2.540  
09:05:46.594      15.22          7.620         2.880  
09:06:46.594      50.38         17.47          6.470  
09:07:46.594      20.93         15.34          6.440

With pcp-6.3.2-2.el10.x86_64:

# pmval -t 60 -a /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta kernel.all.load

metric:    kernel.all.load
archive:   /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta
host:      localhost.localdomain
start:     Tue Sep  1 04:59:46 2020
end:       Tue Sep  1 05:08:16 2020
semantics: instantaneous value
units:     none
samples:   9
interval:  60.00 sec
04:59:46.594  No values available

                 1 minute      5 minute     15 minute 
05:00:46.594       0.4600        0.1700     6.000E-02 
05:01:46.594       0.2900        0.1800     7.000E-02 
05:02:46.594       0.3300        0.2100     9.000E-02 
05:03:46.594      32.61          7.640         2.540  
05:04:46.594      15.22          7.620         2.880  
05:05:46.594      50.38         17.47          6.470  
05:06:46.594      20.93         15.34          6.440  
05:07:46.594       7.680        12.54          6.040

Note how the values jump up one minute later in the newer version.

Both versions show the same raw data in the archives, of course:

# pmval -U /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta kernel.all.load

metric:    kernel.all.load
archive:   /var/log/pcp/pmlogger/localhost.localdomain/20200901.04.59.meta
host:      localhost.localdomain
start:     Tue Sep  1 04:59:46 2020
end:       Tue Sep  1 05:08:16 2020
semantics: instantaneous value
units:     none
samples:   all

                 1 minute      5 minute     15 minute 
04:59:46.609       0.6400        0.1300     4.000E-02 
05:00:46.617       0.4600        0.1700     6.000E-02 
05:01:46.621       0.2900        0.1800     7.000E-02 
05:02:46.630       0.3300        0.2100     9.000E-02 
05:03:46.620      32.61          7.640         2.540  
05:04:46.618      15.22          7.620         2.880  
05:05:46.623      50.38         17.47          6.470  
05:06:46.632      20.93         15.34          6.440  
05:07:46.627       7.680        12.54          6.040

In fact, there isn't any interpolation of values going on. The sample time is rounded differently. In the old version 05:03:46.594 gets the raw data for 05:02:46.630, while in the new version, it gets the raw data for 05:03:46.620. The new version seems "more correct", the old version seems to round down, while the new version might round to nearest.

Anyway, no bug to report. We have to survive with both.

That version changes the interpretation of existing data somewhat, and thus events slightly move around. Make the assertions flexible enough to work with old and new versions. There was not much science in the original assertions anyway -- I simply recorded some stuff and then pinned down the status quo. We can make this stricter again once all supported OSes have PCP ≥ 6.3.2 (but that'll be a few years). Addresses the non-pixel parts of #21237

mvollmer · 2024-11-21T13:09:07Z

What about not doing pixel tests for updates-testing? We don't have a mechanism to cope with pixel alternatives and any failure in updates-testing will just be pain.

If we write code like

self.assert_pixels("dialog" if not updates_testing else "dialog-testing", ...)

then the machinery will yet at us that one of "dialog" or "dialog-testing" is unused... but we could invent something if we want to.

martinpitt · 2024-11-21T13:17:17Z

What about not doing pixel tests for updates-testing?

That crossed my mind, but I feel that would diminish their value too much. They did help us to find this difference (which could have been a regression). I wouldn't mind skipping this specific pixel test on updates-testing if you prefer. But honestly I currently lean towards dragging our feet a bit and then forgetting about it.

martinpitt · 2024-11-25T06:48:41Z

cockpit-project/bots#7126 refreshes the image, but that fails on a handful of other issues. We'll update the pixel tests as part of that.

cockpituous added the nightly label Nov 9, 2024

martinpitt changed the title ~~Tests failed on f474f0da1c6f723b9f55b4b599b5468047fc8a59~~ TestHistoryMetrics.testEvents graph changes structure in updates-testing Nov 10, 2024

martinpitt mentioned this issue Nov 10, 2024

Tests failed on f474f0da1c6f723b9f55b4b599b5468047fc8a59 #21241

Closed

martinpitt added this to Pilot tasks Nov 10, 2024

martinpitt moved this to detriment in Pilot tasks Nov 10, 2024

This was referenced Nov 11, 2024

Tests failed on f63bd0b0e4f3d531d740630de870f999615e3715 #21244

Closed

Tests failed on 9601aee69e8606957fcd17091d60107cedfdf576 #21252

Closed

Tests failed on 6598ea0ffeb4482a9b2050204acbcd0fda251dbd #21259

Closed

martinpitt moved this from detriment to next in Pilot tasks Nov 13, 2024

martinpitt mentioned this issue Nov 21, 2024

Image refresh for centos-10 cockpit-project/bots#7091

Merged

3 tasks

martinpitt self-assigned this Nov 21, 2024

martinpitt mentioned this issue Nov 21, 2024

test: Add enough TestHistoryMetrics.testEvents wiggle room for PCP 6.3.2 #21295

Merged

This was referenced Nov 22, 2024

Python core dumps with metrics tests #21301

Closed

Tests failed on c7d327f89f01ae54bd127020e7665dccea00aff3 #21313

Closed

martinpitt mentioned this issue Nov 24, 2024

rpc.statd crash regression in Fedora 40 #21312

Closed

This was referenced Nov 25, 2024

test: Update pixel references for PCP 6.3.2 #21322

Merged

Tests failed on 1f14cc1f0656f4cc49782b3617c8a14099e5cfd1 #21328

Closed

martinpitt closed this as completed in #21322 Nov 26, 2024

github-project-automation bot moved this from next to improvement in Pilot tasks Nov 26, 2024

martinpitt moved this from improvement to detriment in Pilot tasks Nov 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TestHistoryMetrics.testEvents graph changes structure in updates-testing #21237

TestHistoryMetrics.testEvents graph changes structure in updates-testing #21237

cockpituous commented Nov 9, 2024

martinpitt commented Nov 10, 2024

jelly commented Nov 13, 2024 •

edited

Loading

martinpitt commented Nov 21, 2024

mvollmer commented Nov 21, 2024

mvollmer commented Nov 21, 2024

martinpitt commented Nov 21, 2024

martinpitt commented Nov 25, 2024

TestHistoryMetrics.testEvents graph changes structure in updates-testing #21237

TestHistoryMetrics.testEvents graph changes structure in updates-testing #21237

Comments

cockpituous commented Nov 9, 2024

martinpitt commented Nov 10, 2024

jelly commented Nov 13, 2024 • edited Loading

martinpitt commented Nov 21, 2024

mvollmer commented Nov 21, 2024

mvollmer commented Nov 21, 2024

martinpitt commented Nov 21, 2024

martinpitt commented Nov 25, 2024

jelly commented Nov 13, 2024 •

edited

Loading