Skip to content

Commit

Permalink
TarOutput: Close collected files after reading them to free buffers
Browse files Browse the repository at this point in the history
With tar output, TarOutput.addRealFile() did not close file objects.
So in tar mode, the file objects of all captures regular files were
not closed.

As the fixed line use `TarFile.addfile(tarinfo, open("fn"))` the
opened file object could have never been passed anywhere else so
it is safe to replace this pattern using a context manager.

Summary:

- The changes to xen-bugtool are 4 safe lines to use `with open`
- All other changes are improvements for the unit tests to cover it.

Description:

This was found by PYTHONDEVMODE=yes to trace unclosed resources.
I set this up in .pre-commit-config.yaml, so while developing
tests, I saw the warning while testing with pre-commit run -v.

Fixes in this PR:

- Use context managers to fix the locations which are easy to fix.
- Extend the unit tests (those in master now) to cover both locations.
- While updating those tests, improve them to check the file list of
  the ZIP and tar archives generated in this particular test case.
- Configure GitHub CI to create source annotations for PYTHONWARNINGS
  which are issued by such cases and are shown in code review of PRs.

Notes:

In the course of a year, my development machine accumulated 24000
files in /var/xapi/blobs (spread evenly over all subdirectories),
so Python apparently does some garbage collection, but it's certainly
not good to never close so many open file-like objects in Python.

Closing them in time has the potential to reduce the memory usage
of the xen-bugtool / xenserver-status-report process.

There are more locations where Python 3.10 reports unclosed file
objects, but it seems we may have to use enable tracemalloc as
recommended by the issued PYTHONWARNINGS to find their source.

Signed-off-by: Bernhard Kaindl <[email protected]>
  • Loading branch information
bernhardkaindl committed Jan 12, 2024
1 parent 1aae840 commit 5fc5d42
Show file tree
Hide file tree
Showing 8 changed files with 80 additions and 11 deletions.
17 changes: 17 additions & 0 deletions .github/workflows/PYTHONWARNINGS-problemMatcher.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
{
"problemMatcher": [
{
"owner": "python-libs-PYTHONWARNINGS",
"severity": "warning",
"pattern": [
{
"regexp": "^.*/python-libs/(.+):([0-9]*):(.*):(.*)$",
"file": 1,
"line": 2,
"code": 3,
"message": 4
}
]
}
]
}
8 changes: 8 additions & 0 deletions .github/workflows/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -89,9 +89,17 @@ jobs:
path: ~/.cache/pre-commit
key: pre-commit|${{ env.pythonLocation }}|${{ hashFiles('.pre-commit-config.yaml') }}

# https://docs.python.org/3/library/devmode.html#resourcewarning-example
# If pytest runs with PYTHONDEVMODE=yes, it enables resource checks like
# unclosed file warnings. Configure GitHub to show all such warnings as
# annotations at the source location they occur in the PR code review:

- run: echo "::add-matcher::.github/workflows/PYTHONWARNINGS-problemMatcher.json"

- uses: pre-commit/[email protected]
name: Run pre-commit checks
env:
PYTHONDEVMODE: yes # Enable Python3 checks. See the comment above.
# Skip the no-commit-to-branch check inside of GitHub CI (for CI on merge to master)
SKIP: no-commit-to-branch

Expand Down
1 change: 1 addition & 0 deletions tests/integration/dom0-template/etc/group
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
root:x:0:
1 change: 1 addition & 0 deletions tests/integration/dom0-template/etc/passwd
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
root:x:0:0:root:/root:/bin/bash
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
<collect>
<list>/etc</list>
<files>/etc/passwd</files>
<files>/etc/group</files>
<files>/proc/self/status</files>
<directory pattern=".*ax_user_watches">/proc/sys/fs/epoll</directory>
<command label="proc_version">cat /proc/version</command>
Expand Down
12 changes: 10 additions & 2 deletions tests/unit/test_load_plugins.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,15 +23,23 @@ def test_load_plugins(bugtool, dom0_template):
False,
9,
)
# Assert the size of the created mock capability:
assert bugtool.cap_sizes["mock"] == 0
# Assert the size of the files of the created mock inventory entry:
assert bugtool.cap_sizes["mock"] > 0 # /etc/passwd should have content
# Assert the entries added to the bugtool.data dict:
assert bugtool.data == {
"ls -l /etc": {
"cap": "mock",
"cmd_args": ["ls", "-l", "/etc"],
"filter": None,
},
"/etc/passwd": {
"cap": "mock",
"filename": "/etc/passwd",
},
"/etc/group": {
"cap": "mock",
"filename": "/etc/group",
},
"/proc/self/status": {
"cap": "mock",
"filename": "/proc/self/status",
Expand Down
43 changes: 36 additions & 7 deletions tests/unit/test_output.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,24 @@ def assert_valid_inventory_schema(inventory_tree):
XMLSchema(parse(xml_schema)).assertValid(inventory_tree)


def assert_mock_bugtool_plugin_output(extracted):
def assert_mock_bugtool_plugin_output(temporary_directory, subdir, names):
"""Assertion check of the output files from the mock bugtool plugin"""

# Assert the list of file names in the status report archive:
expected_names = [
subdir + "/etc/group",
subdir + "/etc/passwd.tar",
subdir + "/inventory.xml",
subdir + "/ls-l-%etc.out",
subdir + "/proc/self/status",
subdir + "/proc/sys/fs/epoll/max_user_watches",
subdir + "/proc_version.out",
]
assert sorted(names) == expected_names

extracted = "%s/%s/" % (temporary_directory, subdir)

# Will be refactored to be more easy in a separate commit soon:
assert_valid_inventory_schema(parse(extracted + "inventory.xml"))
with open(extracted + "proc_version.out") as proc_version:
assert proc_version.read()[:14] == "Linux version "
Expand All @@ -24,6 +40,17 @@ def assert_mock_bugtool_plugin_output(extracted):
assert status.read()[:5] == "Name:"
with open(extracted + "proc/sys/fs/epoll/max_user_watches") as max_user_watches:
assert int(max_user_watches.read()) > 0
with open(extracted + "etc/group") as group:
assert group.readline() == "root:x:0:\n"

# Check the contents of the sub-archive "etc/passwd.tar":
with tarfile.TarFile(extracted + "etc/passwd.tar") as tar:
assert tar.getnames() == [subdir + "/etc/passwd"]
# TarFile.extractfile() does not support context managers on Python2:
passwd = tar.extractfile(subdir + "/etc/passwd")
assert passwd
assert passwd.readline() == b"root:x:0:0:root:/root:/bin/bash\n"
passwd.close()


def minimal_bugtool(bugtool, dom0_template, archive, subdir):
Expand All @@ -32,7 +59,9 @@ def minimal_bugtool(bugtool, dom0_template, archive, subdir):
# Load the mock plugin from dom0_template and process the plugin's caps:
bugtool.PLUGIN_DIR = dom0_template + "/etc/xensource/bugtool"
bugtool.entries = ["mock"]
archive.declare_subarchive("/etc/passwd", subdir + "/etc/passwd.tar")
bugtool.load_plugins(just_capabilities=False)
# Mock the 2nd argument of the ls -l /etc to collect it using dom0_template:
bugtool.data["ls -l /etc"]["cmd_args"][2] = dom0_template + "/etc"
bugtool.collect_data(subdir, archive)
bugtool.include_inventory(archive, subdir)
Expand All @@ -51,10 +80,9 @@ def test_tar_output(bugtool, tmp_path, dom0_template):

# Check the TarFile contents
tmp = tmp_path.as_posix()
tar_archive = tarfile.TarFile(tmp + "/tarball.tar")
tar_archive.extractall(tmp)
tar_archive.close()
assert_mock_bugtool_plugin_output(tmp + "/" + subdir + "/")
with tarfile.TarFile(tmp + "/tarball.tar") as tar:
tar.extractall(tmp)
assert_mock_bugtool_plugin_output(tmp, subdir, tar.getnames())


def test_zip_output(bugtool, tmp_path, dom0_template):
Expand All @@ -69,5 +97,6 @@ def test_zip_output(bugtool, tmp_path, dom0_template):

# Check the ZipFile contents
tmp = tmp_path.as_posix()
zipfile.ZipFile(tmp + "/zipfile.zip").extractall(tmp)
assert_mock_bugtool_plugin_output(tmp + "/" + subdir + "/")
with zipfile.ZipFile(tmp + "/zipfile.zip") as zip:
zip.extractall(tmp)
assert_mock_bugtool_plugin_output(tmp, subdir, zip.namelist())
7 changes: 5 additions & 2 deletions xen-bugtool
Original file line number Diff line number Diff line change
Expand Up @@ -1697,7 +1697,8 @@ class TarSubArchive(io.BytesIO):
:param name: Recorded path of the the file in the tar archive for extraction
:param filename: Real file name of the file to be added to the tar archive
"""
self.file.addfile(self.file.gettarinfo(filename, name), open(filename, "rb"))
with open(filename, "rb") as buffered_reader:
self.file.addfile(self.file.gettarinfo(filename, name), buffered_reader)

class ArchiveWithTarSubarchives(object):
"""Base class for TarOutput and ZipOutput with support to create sub-archives"""
Expand Down Expand Up @@ -1762,7 +1763,9 @@ class TarOutput(ArchiveWithTarSubarchives):
s = os.stat(filename)
ti.mtime = s.st_mtime
ti.size = s.st_size
self.tf.addfile(ti, open(filename, "rb"))
with open(filename, "rb") as buffered_reader:
self.tf.addfile(ti, buffered_reader)


def add_path_with_data(self, name, data):
ti = self._getTi(name)
Expand Down

1 comment on commit 5fc5d42

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coverage

https://github.com/marketplace/actions/pytest-coverage-comment
FileStmtsMissCoverMissing
__init__.py00100% 
conftest.py300100% 
test_dir_list.py110100% 
test_fs_funcs.py100100% 
test_load_plugins.py50100% 
test_output.py470100% 
test_process_output.py80100% 
test_xapidb_filter.py170100% 
xen-bugtool152368654%95, 100–101, 410, 490–491, 493–494, 511, 517, 540, 551–552, 555–556, 602, 607–608, 612, 614–618, 620, 632–634, 637, 656–657, 660, 662–666, 668–669, 672–673, 677–678, 680–681, 685–688, 690–693, 696–698, 700–701, 705–707, 710–711, 720, 722, 724–727, 729–731, 733–734, 738–739, 741, 743–749, 753–754, 756–760, 765–773, 775–777, 779–781, 783–784, 786, 801, 803–807, 809–810, 812, 814, 816–839, 842–846, 848, 850–871, 874–875, 877, 879, 882, 884–893, 895, 897–900, 902–912, 914–930, 932–936, 938–944, 947–976, 978–988, 990–993, 995–1015, 1017–1020, 1022, 1024, 1026–1027, 1029–1030, 1032, 1047–1048, 1050–1051, 1055–1056, 1058–1059, 1061, 1063, 1065–1066, 1068–1069, 1071–1080, 1083–1089, 1092–1093, 1095–1096, 1105–1106, 1108–1109, 1113, 1135–1138, 1140–1144, 1146–1153, 1155, 1157–1160, 1162, 1165–1167, 1169–1175, 1178–1180, 1184–1185, 1187–1188, 1190–1192, 1195–1196, 1199, 1201–1202, 1204, 1206, 1208–1209, 1213–1216, 1219, 1222, 1225–1227, 1230–1234, 1237–1238, 1241–1244, 1250–1256, 1259–1263, 1265–1281, 1284–1292, 1295–1296, 1298, 1300–1301, 1303–1311, 1314, 1316–1319, 1321, 1323–1324, 1358, 1386, 1401, 1409–1410, 1414–1416, 1418, 1421, 1423–1425, 1427–1431, 1433, 1437, 1440, 1443–1449, 1451, 1453–1457, 1463–1465, 1467–1472, 1475–1479, 1483–1484, 1486, 1503–1504, 1506, 1508–1509, 1511–1513, 1515–1522, 1524–1529, 1533–1534, 1536, 1538, 1540, 1556–1557, 1559, 1561–1563, 1566–1567, 1569, 1571–1573, 1576–1577, 1579, 1581–1584, 1587–1597, 1600–1604, 1614, 1626, 1636, 1640, 1651, 1675–1677, 1744, 1750, 1785, 1809, 1824, 1884, 1903, 1905, 1907, 1910–1918, 1921–1922, 1925–1932, 1934–1935, 1937, 1941–1942, 1946, 1950–1952, 1956–1957, 1960, 1964, 1969, 1975–1976, 1978, 1982, 1984–1987, 1990–2000, 2004, 2006, 2091–2093, 2096–2098, 2110, 2129–2131, 2171–2175, 2182, 2184–2187, 2191, 2205
TOTAL165168658% 

Tests Skipped Failures Errors Time
11 0 💤 0 ❌ 0 🔥 0.895s ⏱️

Please sign in to comment.