Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test: Consolidate test cases in test_identity.py #7214

Closed
wants to merge 1 commit into from
Closed

Test: Consolidate test cases in test_identity.py #7214

wants to merge 1 commit into from

Conversation

liswang89
Copy link
Contributor

@liswang89 liswang89 commented Feb 26, 2024

Consolidate test cases in test_identity.py with similar setups for tidying up.

It combines everything that uses "id" into one test case, and "getent" into another test case.
test_identity__lookup_username_with_id
test_identity__lookup_uid_with_id
test_identity__lookup_group_membership_by_username_with_id
test_identity__lookup_group_membership_by_group_with_id

test_identity__lookup_groupname_with_getent
test_identity__lookup_gid_with_getent
test_identity__lookup_user_with_getent
test_identity__lookup_user_by_group_with_getent
test_identity__lookup_initgroups_with_getent

@shridhargadekar
Copy link
Contributor

A test case should test ideally single test scenario instead of combining multiple scenarios in one.

@jakub-vavra-cz
Copy link
Contributor

I would rather keep them separated as sssd is caching and one query affects the following ones.
Merging in the same scenario can introduce inherent risk of masking an issue.

@spoore1
Copy link
Contributor

spoore1 commented Mar 5, 2024

I would rather keep them separated as sssd is caching and one query affects the following ones. Merging in the same scenario can introduce inherent risk of masking an issue.

@jakub-vavra-cz Good point about the cache. Wouldn't clearing the cache between the lookups be less overhead than the setup/teardown for separate test cases?

These particular tests probably run pretty quick but, I believe there are others as well that are being reviewed for consolidation like this. The point as I understand it is to reduce the run times by reducing the number of setup/teardowns where possible. This includes tests where the steps performed are almost identical.

Maybe we should have some type of guide for when/where consolidation like this is or isn't desired?

12. Check that users have correct ids
13. Find users with getent.initgroups(name)
14. Check that user has correct name
15. Check that user has correct initgroups
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Steps do not match expected results.

3. Check that users are members of correct group using memberof(group)
4. Check that users are members of correct groups using memberof(gid)
5. Find 'user1', 'user2' and 'user3' with id(uid)
4. Check that users have correct names and ids
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeated 4.
Steps do not match results.

@jakub-vavra-cz
Copy link
Contributor

@spoore1 Generally speaking: Cleaning caches between lookups (where different lookups are performed) can help and would be cheaper than full setups/teardowns. When different asserts are applied to the same command we can safely merge.
We should avoid running cache/cleanups (sssd restarts) in loops as they are also costly.

"""
:title: Resolve user by name with id
:title: Resolve user by name and uid, group membershif by username and group with id
Copy link
Contributor

@jakub-vavra-cz jakub-vavra-cz Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@danlavu
Copy link

danlavu commented Mar 15, 2024

@jakub-vavra-cz , @shridhargadekar

A test case should test ideally single test scenario instead of combining multiple scenarios in one.

I think we need to start moving away from this idea, that a test does one one specific thing. Remember, one generic provider test is now 5 tests, it runs against AD, IPA, KRB, LDAP and Samba. I think clubbing them into one test is good, for a few reasons. We have several other look up tests, specific cache tests, so we have to consider the other tests that do something similar. The maintenance and management of tests will be easier with fewer test cases. It's more functional than doing a single lookup.

What is different about these tests? It uses different commands to lookup users, it looks up different types of objects, and by different names. What are you gaining from having one test for each lookup?

test_id:id user1 ; id group1 ; id netgroup1
test_getent: getent user1; getent group1; getent netgroup1
(10 test cases)

test_id user1
test_id group1
test_id netgroup1
test_getent user1
test_getent group1
test_getent netgrup1
(30 test cases)

@spoore1 Generally speaking: Cleaning caches between lookups (where different lookups are performed) can help and would be cheaper than full setups/teardowns. When different asserts are applied to the same command we can safely merge.
We should avoid running cache/cleanups (sssd restarts) in loops as they are also costly.

Both are costly, for generic providers tests, setups and teardowns are significantly more costly now. I think these kind of tests we need to be functional, what are we gaining from clearing the cache? It doesn't lookup the server on IDs that has already been queried but when it does groups, it's not cached yet. FWIW, I'm fine with clearing or not clearing the cache. We have tests that explicitly test the cache, negative cache and mem_cache. So let's do more singular tests for the caches where it seems to make a little more sense? I do feel strongly that we should merge a lot of things single tests together when it comes to the generic test provider but It doesn't mean we can't do both. We will discuss this in the test porting meeting next week.

@liswang89 I'm sorry for making you do more work if the team doesn't like this approach. I know I suggested it, and I still think it's the better way forward. It needs to be discussed.

@pbrezina
Copy link
Member

pbrezina commented Mar 19, 2024

@jakub-vavra-cz , @shridhargadekar

A test case should test ideally single test scenario instead of combining multiple scenarios in one.

I think we need to start moving away from this idea, that a test does one one specific thing. Remember, one generic provider test is now 5 tests, it runs against AD, IPA, KRB, LDAP and Samba. I think clubbing them into one test is good, for a few reasons. We have several other look up tests, specific cache tests, so we have to consider the other tests that do something similar. The maintenance and management of tests will be easier with fewer test cases. It's more functional than doing a single lookup.

What is different about these tests? It uses different commands to lookup users, it looks up different types of objects, and by different names. What are you gaining from having one test for each lookup?

In my opinion, there are three main use cases for having each test case really separated from each other:

  1. If a test fails, you already know what it does and you don't have to look into particular line that fails.
  2. If a test fails you already know if only single test is affected or a set of tests. In this case, if test_identity__lookup_with_id fails on the first assertions that checks id $name how do you tell if id $uid is also affected or not?
  3. It gives you clean environment. In this case, uid lookups take different code path because the user is already cached, it takes different code path, so it does different thing then the original test.

If you want to merge them together, you have to be much smarter about how do you merge the tests cases in order to make sure that it follows the path that you want it to. Even if you have the knowledge to decide, it is easy to miss something even during review. Further more, it relies on internal implementation that can change.

Therefore, in general, I am personally against it. We should do it were it brings us large performance benefits. If you want to speed up the identity tests, it would be better to only use getent tests or id tests since its using the same POSIX interface (but id also run initgroups).

test_id:id user1 ; id group1 ; id netgroup1 test_getent: getent user1; getent group1; getent netgroup1 (10 test cases)

test_id user1 test_id group1 test_id netgroup1 test_getent user1 test_getent group1 test_getent netgrup1 (30 test cases)

@spoore1 Generally speaking: Cleaning caches between lookups (where different lookups are performed) can help and would be cheaper than full setups/teardowns. When different asserts are applied to the same command we can safely merge.
We should avoid running cache/cleanups (sssd restarts) in loops as they are also costly.

Both are costly, for generic providers tests, setups and teardowns are significantly more costly now. I think these kind of tests we need to be functional, what are we gaining from clearing the cache? It doesn't lookup the server on IDs that has already been queried but when it does groups, it's not cached yet. FWIW, I'm fine with clearing or not clearing the cache. We have tests that explicitly test the cache, negative cache and mem_cache. So let's do more singular tests for the caches where it seems to make a little more sense? I do feel strongly that we should merge a lot of things single tests together when it comes to the generic test provider but It doesn't mean we can't do both. We will discuss this in the test porting meeting next week.

Actually, teardown is only costly for IPA. Other backends are quite fast actually. SSSD start is also fast. The bottleneck is ssh communication and the amount of commands that we are executing. But we can't really avoid that, otherwise we are back in bash script.

Soon, I want to provide per-module and per-class setup and teardown so where it makes sense to share the users or other environment setting where it would bring large performance benefit. But in reality, it is already reeeeally fast.

In PR CI we run:

  • old multihost basic tests/multihost/basic: 25 test cases in 20 minutes (48 seconds per test) - these tests are only executed against 389ds running on the same host as SSSD
  • pytest-mh tests/system: 383 test cases in 41 minutes (6.5 seconds per test) running against ldap, samba and ipa.

@liswang89 I'm sorry for making you do more work if the team doesn't like this approach. I know I suggested it, and I still think it's the better way forward. It needs to be discussed.

@pbrezina
Copy link
Member

pbrezina commented Mar 19, 2024

We can measure time required to run each test case and see what tests are slow, but think that identity tests won't be it, or do you already have some numbers?

@danlavu
Copy link

danlavu commented Mar 20, 2024

We can measure time required to run each test case and see what tests are slow, but think that identity tests won't be it, or do you already have some numbers?

    <div class="time">3.15 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_groupname_with_getent (ad)</span>
--
                                                <div class="time">3.21 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_gid_with_getent (ad)</span>
--
                                                <div class="time">3.35 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_user_with_getent (ad)</span>
--
                                                <div class="time">3.12 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_user_by_group_with_getent (ad)</span>
--
                                                <div class="time">4.47 s</div>
                                            </em>test_identity__lookup_group_membership_by_username_with_id</span>
--
                                                <div class="time">4.46 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_group_membership_by_group_with_id (ad)</span>
--
                                                <div class="time">6.69 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_initgroups_with_getent (ad)</span>
--
                                                <div class="time">3.02 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_users_with_fully_qualified_name (ad)</span>
--
                                                <div class="time">3.86 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_users_when_case_insensitive (ad)</span>
--
                                                <div class="time">7.18 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_users_fully_qualified_name_and_case_insensitive (ad)</span>
--
                                                <div class="time">4.78 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_idmapping_of_posix_and_non_posix_user_and_group (ad)</span>
--
                                                <div class="time">2.58 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_when_private_groups_set_to_hybrid (ad)</span>
--
                                                <div class="time">5.12 s</div>
                                            </em>test_identity__lookup_username_with_id</span>
--
                                                <div class="time">5.44 s</div>
                                            </em>test_identity__lookup_uid_with_id</span>
--
                                                <div class="time">4.73 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_groupname_with_getent (ipa)</span>
--
                                                <div class="time">4.61 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_gid_with_getent (ipa)</span>
--
                                                <div class="time">4.48 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_user_with_getent (ipa)</span>
--
                                                <div class="time">4.30 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_user_by_group_with_getent (ipa)</span>
--
                                                <div class="time">6.58 s</div>
                                            </em>test_identity__lookup_group_membership_by_username_with_id</span>
--
                                                <div class="time">6.08 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_group_membership_by_group_with_id (ipa)</span>
--
                                                <div class="time">6.83 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_initgroups_with_getent (ipa)</span>
--
                                                <div class="time">4.28 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_users_with_fully_qualified_name (ipa)</span>
--
                                                <div class="time">5.14 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_users_when_case_insensitive (ipa)</span>
--
                                                <div class="time">9.28 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_users_fully_qualified_name_and_case_insensitive (ipa)</span>
--
                                                <div class="time">4.34 s</div>
                                            </em><em class="status">passed</em>test_identity__lookup_when_private_groups_set_to_hybrid (ipa)</span>

@liswang89 , I'm sorry for making you do more work, the team has decided that it makes better sense not to combine the tests. Can you revert it to the first PR?

@danlavu
Copy link

danlavu commented Mar 25, 2024

@liswang89 I'm sorry to have to do this, and thank you so much for your work on it. The consolidation effort is no longer required for this bit of code. Let's sync up and find you another task.

@danlavu danlavu closed this Mar 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants