REF: Add tests.groupby.methods #55312

rhshadrach · 2023-09-27T22:06:45Z

Is this a bad idea? Just moving around code without much value? It has a similar structure to tests.frame.methods. The end goal is to have test_function.py completely removed.

mroeschke · 2023-09-28T16:21:16Z

I think this is a good idea. It's always good to have more tightly scoped test organization.

lithomas1 · 2023-09-28T18:19:32Z

In general, I like splitting big files; however, the downside is that when running just 1 test by hand, the pytest command becomes longer.

In this case, I don't think it's worth it if there's just 1 test in a file.

Maybe we can split the test_function.py file into a file for reductions and another file for transformations instead?
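
For readers unfamiliar with the two categories named above, here is a minimal sketch (not code from this PR; the frame and column names are made up for illustration). A groupby reduction returns one value per group, while a transformation returns a result aligned with the original rows.

```python
import pandas as pd

# Illustrative data; "key" and "val" are hypothetical column names.
df = pd.DataFrame({"key": ["a", "a", "b"], "val": [1, 2, 3]})
gb = df.groupby("key")["val"]

# Reduction: one row per group, indexed by the group keys.
reduced = gb.sum()

# Transformation: same length and index as the input.
transformed = gb.cumsum()

print(reduced)
print(transformed)
```

Here `sum` collapses each group to a single row, whereas `cumsum` keeps every input row, which is the distinction a reductions/transformations split of the test files would follow.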

rhshadrach · 2023-09-28T20:54:50Z

> In this case, I don't think it's worth it if there's just 1 test in a file.

That isn't going to be the final state - but I didn't want to put too much into a PR. I moved the ones that were clearly meant just to test the given method from test_function.py. There will be more.

> Maybe we can split the test_function.py file into a file for reductions and another file for transformations instead?

We already have method-specific files for some reductions and transformations. Are you suggesting to consolidate all of the existing ones into these files? E.g.

nunique
quantile
rank
size
skew
min_max
nth
any_all
shift_diff

lithomas1 · 2023-09-28T22:26:01Z

> > In this case, I don't think it's worth it if there's just 1 test in a file.
>
> That isn't going to be the final state - but I didn't want to put too much into a PR. I moved the ones that were clearly meant just to test the given method from test_function.py. There will be more.

Are you planning on moving other tests in addition to tests from test_function.py?
(I think you've moved over half of the functions from test_function already)

IMO, if there are fewer than 10 tests in a file, I don't think it'd be worth it to make the new file in general.

> > Maybe we can split the test_function.py file into a file for reductions and another file for transformations instead?
>
> We already have method-specific files for some reductions and transformations. Are you suggesting to consolidate all of the existing ones into these files? E.g.
>
> nunique
> quantile
> rank
> size
> skew
> min_max
> nth
> any_all
> shift_diff

Yeah, I think you already do that for the cum* methods (I saw you put them in test_cum.py).

This might make the PR too big though. Some of those files are also really large for some reason (at least min_max can be improved with more parameterization).
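
As an illustration of the parameterization mentioned above (a sketch only, not code from this PR or from the existing min_max tests), near-identical tests for `min` and `max` can be folded into one test with `pytest.mark.parametrize`. The test name and data below are hypothetical.

```python
import pandas as pd
import pytest


@pytest.mark.parametrize(
    "method, expected_values",
    [
        ("min", [1, 3]),
        ("max", [2, 4]),
    ],
)
def test_groupby_min_max(method, expected_values):
    # Hypothetical data: two groups with two values each.
    df = pd.DataFrame({"key": ["a", "a", "b", "b"], "val": [1, 2, 3, 4]})

    # Look up the groupby method by name so one test body covers both cases.
    result = getattr(df.groupby("key")["val"], method)()

    expected = pd.Series(
        expected_values, index=pd.Index(["a", "b"], name="key"), name="val"
    )
    pd.testing.assert_series_equal(result, expected)
```

Each parameter set is collected as its own test case, so a failure still reports which method broke.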

rhshadrach · 2023-09-29T01:50:37Z

I don't think line or test counts per file should be our primary concern. To me, the main thing I want to make easy is the scenario "I'm a new contributor, I am fixing a behavior, where do I put a test?". For this, I think we should prefer a more consistent test organization rather than sacrificing layout for grouping a sufficient number of tests to satisfy some limit.

That said, having too many files in the test suite can definitely be detrimental. I don't think this structure approaches that amount, but that is quite opinionated (as is the rest of this comment 😉 ).

lithomas1 · 2023-09-29T02:35:01Z

> I don't think line or test counts per file should be our primary concern. To me, the main thing I want to make easy is the scenario "I'm a new contributor, I am fixing a behavior, where do I put a test?". For this, I think we should prefer a more consistent test organization rather than sacrificing layout for grouping a sufficient number of tests to satisfy some limit.

The concern for me is typing out the extra characters in the path to the test file when I want to run a single test.
e.g.
pytest pandas/tests/groupby/test_function.py -v -k ... -> pytest pandas/tests/groupby/methods/test_foo.py -v -k ...
(if there's only a couple of tests in that file, then it isn't worth the annoyance for me esp. if I break multiple tests across files and have to type a long path a bunch of times)

This isn't a very strong opinion, though, and the current version in the PR is definitely better than what we have in main.
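
For context on the command-line concern above, a small sketch (not part of the PR): besides `-k`, pytest accepts a node ID of the form `path::test_name` to select a single test. The helper name `single_test_command` and the test name `test_bar` are made up for illustration; `test_foo.py` is the placeholder file name used in the comment above.

```python
import shlex


def single_test_command(test_file: str, test_name: str) -> str:
    """Build a pytest command that runs one test via its node ID."""
    # A node ID is the file path and the test function name joined by "::".
    node_id = f"{test_file}::{test_name}"
    return shlex.join(["pytest", node_id, "-v"])


# Hypothetical file and test names, mirroring the paths discussed above.
old_layout = single_test_command("pandas/tests/groupby/test_function.py", "test_bar")
new_layout = single_test_command("pandas/tests/groupby/methods/test_foo.py", "test_bar")
print(old_layout)
print(new_layout)
```

The new layout only adds the `methods/` path segment to the command; whether that is a meaningful cost is the judgment call being discussed.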

jbrockmendel · 2023-10-07T16:39:52Z

pandas/tests/groupby/methods/test_cum.py

@@ -0,0 +1,291 @@
+import numpy as np


In tests.frame and tests.series we have test_cumulative.py. Can we use that pattern? And if we're really trying to follow the pattern, that file goes outside the methods/ directory.

jbrockmendel · 2023-10-07T16:40:31Z

pandas/tests/groupby/methods/test_mean.py

+    assert df.groupby("user")["connections"].mean()["A"] == 3689348814740003840
+
+
+def test_mean_on_timedelta():


Elsewhere we have test_reductions.py, which is less fine-grained than this.

jbrockmendel · 2023-10-07T16:41:41Z

> the main thing I want to make easy is the scenario "I'm a new contributor, I am fixing a behavior, where do I put a test?"

+1. This can be non-trivial for even experienced contributors.

mroeschke · 2023-10-12T16:27:48Z

pandas/tests/plotting/frame/test_frame.py

@@ -603,11 +603,11 @@ def test_area_lim(self, stacked):
        lines = ax.get_lines()
        assert xmin <= lines[0].get_data()[0][0]
        assert xmax >= lines[0].get_data()[0][-1]
-        assert ymin == 0
+        assert ymin == 0, ymin


Was this intended to be included in this PR?

Whoops - no, that was for another PR. Thanks, reverted.

mroeschke · 2023-10-12T22:31:52Z

Thanks @rhshadrach. This is definitely a net positive.

REF: Add tests.groupby.methods

0c8ab05

rhshadrach added the Refactor, Testing, Groupby, and Needs Discussion labels Sep 27, 2023

mroeschke added this to the 2.2 milestone Sep 28, 2023

mroeschke approved these changes Sep 28, 2023

rhshadrach requested a review from jbrockmendel October 7, 2023 14:52

jbrockmendel reviewed Oct 7, 2023

rhshadrach added 4 commits October 7, 2023 13:31

Merge branch 'main' of https://github.com/pandas-dev/pandas into clea…

5384d8f

Merge cleanup

10aaffb

Refactor

8448dd9

Refactor

e118109

rhshadrach requested a review from jbrockmendel October 7, 2023 17:44

rhshadrach added 2 commits October 8, 2023 21:31

Merge branch 'main' of https://github.com/pandas-dev/pandas into clea…

9c24ba6

Show value of ymin

dcb0626

jbrockmendel reviewed Oct 9, 2023

pandas/tests/groupby/methods/test_corrwith.py (resolved)

rhshadrach requested a review from jbrockmendel October 9, 2023 20:58

rhshadrach added 2 commits October 10, 2023 22:32

Merge branch 'main' of https://github.com/pandas-dev/pandas into clea…

78ded9b

fixup

45455d6

mroeschke reviewed Oct 12, 2023

Merge branch 'main' of https://github.com/pandas-dev/pandas into clea…

4fe7688

rhshadrach added 2 commits October 12, 2023 17:00

Revert

63002c0

Revert

1b741eb

mroeschke approved these changes Oct 12, 2023

mroeschke merged commit 9de2a19 into pandas-dev:main Oct 12, 2023
33 checks passed

rhshadrach deleted the clean_gb_tests_5 branch October 13, 2023 02:14
