KEP-2170: Add unit and E2E tests for model and dataset initializers #2323
# Private HuggingFace dataset test
# (
#     "HuggingFace - Private dataset",
#     "huggingface",
#     {
#         "storage_uri": "hf://username/private-dataset",
#         "use_real_token": True,
#         "expected_files": ["config.json", "dataset.safetensors"],
#         "expected_error": None
#     }
# ),
# Invalid HuggingFace dataset test
Do we have an access token for testing login and downloading resources from a private repo?
Not yet. Maybe we can track this in a separate issue: we should create a Kubeflow-owned HuggingFace account to provide the token.
current_dir = os.path.dirname(os.path.abspath(__file__))
self.temp_dir = tempfile.mkdtemp(dir=current_dir)
os.environ[VOLUME_PATH_DATASET] = self.temp_dir
I currently test the dataset/model download by downloading resources to a temp folder and removing the temp folder after the test.
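The temp-folder approach described above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: the helper name `temp_download_dir` and the environment variable name `DATASET_PATH` are placeholders standing in for the `VOLUME_PATH_DATASET` constant used in the snippet.

```python
import contextlib
import os
import shutil
import tempfile

# Hypothetical stand-in for the VOLUME_PATH_DATASET constant from the PR.
VOLUME_PATH_DATASET = "DATASET_PATH"


@contextlib.contextmanager
def temp_download_dir(env_var=VOLUME_PATH_DATASET):
    """Point env_var at a fresh temp dir, then clean everything up afterwards."""
    previous = os.environ.get(env_var)
    temp_dir = tempfile.mkdtemp()
    os.environ[env_var] = temp_dir
    try:
        yield temp_dir
    finally:
        # Remove the downloaded files and restore the original environment.
        shutil.rmtree(temp_dir, ignore_errors=True)
        if previous is None:
            os.environ.pop(env_var, None)
        else:
            os.environ[env_var] = previous
```

In a pytest suite this could be wrapped in a fixture that does `with temp_download_dir() as path: yield path`, so cleanup also runs when a test fails.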
run: pytest ./sdk/python/kubeflow/training/api/training_client_test.py
run: |
  pytest ./sdk/python/kubeflow/training/api/training_client_test.py
  pytest ./pkg/initializer_v2/test/unit
I currently put the unit test under the training SDK step. Should I add another step?
Yeah, let's add another step called "Run Python unit tests for v2".
@pytest.fixture
def real_hf_token():
    """Fixture to provide real HuggingFace token for E2E tests"""
    token = os.getenv("HUGGINGFACE_TOKEN")
    # if not token:
    #     pytest.skip("HUGGINGFACE_TOKEN environment variable not set")
    return token
If we have a private token, I will use this fixture to inject the token. If we don't, I can remove this.
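One possible way to keep token-dependent tests in the suite without failing when no token is configured is a `skipif` marker. The sketch below is only an illustration under that assumption; the marker name `requires_hf_token` and the test body are hypothetical and not part of the PR.

```python
import os

import pytest

# Skip (rather than fail) tests that need a real token when none is configured.
requires_hf_token = pytest.mark.skipif(
    not os.getenv("HUGGINGFACE_TOKEN"),
    reason="HUGGINGFACE_TOKEN environment variable not set",
)


@requires_hf_token
def test_private_dataset_download():
    # Hypothetical test body: a real test would pass the token to the initializer.
    token = os.environ["HUGGINGFACE_TOKEN"]
    assert token
```

The marker is evaluated at import time, so the token has to be present in the environment before pytest collects the tests.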
python3 -m pip install -e sdk/python; pytest -s sdk/python/test --log-cli-level=debug --namespace=default
env:
  GANG_SCHEDULER_NAME: ${{ matrix.gang-scheduler-name }}

- name: Run specific tests for Python 3.10+
Since `match` was introduced in Python 3.10, I created another step for the E2E tests.
Where do you use `match` in the tests?
I didn't use `match` in the tests. `match` is used in https://github.com/kubeflow/training-operator/blob/master/pkg/initializer_v2/model/__main__.py#L23 and https://github.com/kubeflow/training-operator/blob/master/pkg/initializer_v2/dataset/__main__.py#L23
Oh, good point.
Let's actually use the same Python version that we use in our initializer images: https://github.com/kubeflow/training-operator/blob/master/cmd/initializer_v2/dataset/Dockerfile#L1.
E.g. Python 3.11
"HuggingFace - Public dataset",
"huggingface",
{
    "storage_uri": "hf://karpathy/tiny_shakespeare",
Does anyone know which dataset/model on HuggingFace is suitable for the connectivity test?
@seanlaii Which connectivity test do you want to perform ?
I would like to test the actual download process and would like to know if there is a recommended dataset/model for testing. I currently use a dataset that is only 1.11 MB.
Signed-off-by: wei-chenglai <[email protected]>
Hi @andreyvelich, could you help review this PR? I have some questions. Once the SDK's PR gets approved, I will modify it accordingly. Thank you!
@seanlaii Sorry for the delay. Sure, I will review it today.
@@ -0,0 +1,52 @@
import os
I would suggest we put the E2E tests under /test/e2e/initializer_v2/.... and the unit tests close to the actual files, e.g. /pkg/initializer_v2/dataset/huggingface_test.py.
That is what we do for Go, and we've done the same for the SDK V1 unit tests: https://github.com/kubeflow/training-operator/tree/c6e0a832afd019a7d1fa8fa9442b81caf53b54c0/sdk/python/kubeflow/training/api
WDYT @seanlaii @kubeflow/wg-training-leads @Electronic-Waste @droctothorpe ?
@andreyvelich I agree with you, since it would be clearer.
Sounds good. Thanks for the advice! I will change it.
Thank you for this effort @seanlaii!
I left my initial thoughts.
Please take a look @Electronic-Waste @deepanker13 @kubeflow/wg-training-leads @varshaprasad96 @akshaychitneni @saileshd1402
@@ -0,0 +1,86 @@
import runpy
Why do we want to use runpy to execute the tests ?
Do you really need to execute `__main__.py` as part of your unit tests?
E.g. the entire logic can be tested in dataset/huggingface_test.py and model/huggingface_test.py.
I guess you can verify that `__main__.py` executes correctly as part of your E2E tests.
WDYT @seanlaii?
Yes, I can verify `__main__.py` in the E2E tests.
Or perhaps we can wrap the logic in the script in a function, e.g. `main()`, so we can avoid using `runpy` to run the script and just validate the `main()` function in huggingface_test.py.
The reason I tried to execute the script is mainly to validate the overall flow and some exceptions.
I see, that makes sense. Maybe we should wrap our logic in a `main()` function in the `__main__.py` file, given that `runpy` is usually used for integration tests, not for unit testing.
So we can have this.
`__main__.py`:
def main():
    # logic here.

if __name__ == "__main__":
    main()
`main_test.py`: Contains unit tests for the `main()` function.
WDYT @seanlaii?
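A unit test for such a `main()` function might look like the sketch below. Everything here is an assumption made for illustration: the `download` placeholder, the `STORAGE_URI` variable name, and the `downloader` parameter are hypothetical and do not describe the actual initializer code.

```python
import os
from unittest import mock


def download(storage_uri: str) -> None:
    """Placeholder for the real download logic (hypothetical)."""
    raise NotImplementedError("network access is not part of this sketch")


def main(downloader=download) -> None:
    """Entry point: read the URI from the environment and trigger the download."""
    storage_uri = os.environ["STORAGE_URI"]  # hypothetical variable name
    downloader(storage_uri)


def test_main_calls_downloader():
    """Call main() directly with a fake downloader, without runpy or network."""
    fake_downloader = mock.Mock()
    with mock.patch.dict(os.environ, {"STORAGE_URI": "hf://org/name"}):
        main(downloader=fake_downloader)
    fake_downloader.assert_called_once_with("hf://org/name")
```

Passing the downloader as a parameter is just one way to make `main()` testable; patching the imported function with `unittest.mock.patch` would work as well.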
Yes, it sounds good to me.
from pkg.initializer_v2.model.config import HuggingFaceModelInputConfig


def test_huggingface_model_config_creation():
Should we use @pytest.mark.parametrize with test cases here for consistency across all unit tests?
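A parametrized version of a config-creation test could look roughly like this. The `ModelInputConfig` dataclass is a hypothetical stand-in for `HuggingFaceModelInputConfig`, and its fields are assumptions made so that the sketch is self-contained.

```python
from dataclasses import dataclass
from typing import Optional

import pytest


@dataclass
class ModelInputConfig:
    """Hypothetical stand-in for HuggingFaceModelInputConfig."""

    storage_uri: str
    access_token: Optional[str] = None


@pytest.mark.parametrize(
    "test_name, kwargs, expected_token",
    [
        ("token omitted", {"storage_uri": "hf://org/model"}, None),
        (
            "token provided",
            {"storage_uri": "hf://org/model", "access_token": "abc"},
            "abc",
        ),
    ],
)
def test_model_config_creation(test_name, kwargs, expected_token):
    # Each tuple above becomes one independent test case.
    config = ModelInputConfig(**kwargs)
    assert config.storage_uri == kwargs["storage_uri"]
    assert config.access_token == expected_token
```

Adding a new scenario then only requires one more tuple in the list rather than a new test function.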
@@ -0,0 +1,86 @@
import runpy
I would name this file huggingface_test.py, where we are going to unit test all functionality from the huggingface.py file.
@@ -0,0 +1,25 @@
import pytest
You can name this utils_test.py.
from sdk.python.kubeflow.storage_initializer.constants import VOLUME_PATH_MODEL


class TestModelE2E:
@seanlaii @kubeflow/wg-training-leads @deepanker13 @Electronic-Waste @saileshd1402 What do you think about actually using Kubernetes to perform E2E tests for our initializers?
E.g. we can deploy a single Pod that runs two initContainers for the initializers and one container that just verifies that the model and dataset exist under the /workspace/model and /workspace/dataset dirs.
In that case, our E2E tests verify that our Docker containers actually work to initialize assets.
Do we see any value in the tests that I propose compared to just running the initializers' Python scripts?
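The proposed Pod layout could be sketched as a manifest built in Python, for example to be submitted by an E2E test. This is only an illustration: the function name, container names, Pod name, `busybox` verifier image, and the shared `emptyDir` volume are assumptions, and the environment variables the initializers would need are omitted.

```python
import json


def build_initializer_pod(dataset_image: str, model_image: str) -> dict:
    """Build a Pod manifest with two initializer initContainers and a verifier."""
    # All containers share one volume so the verifier can see the downloads.
    mount = {"name": "workspace", "mountPath": "/workspace"}
    # The verifier exits non-zero if either directory is missing or empty.
    check = (
        'test -n "$(ls -A /workspace/model)" && '
        'test -n "$(ls -A /workspace/dataset)"'
    )
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "initializer-e2e"},
        "spec": {
            "restartPolicy": "Never",
            "volumes": [{"name": "workspace", "emptyDir": {}}],
            "initContainers": [
                {"name": "dataset-initializer", "image": dataset_image,
                 "volumeMounts": [mount]},
                {"name": "model-initializer", "image": model_image,
                 "volumeMounts": [mount]},
            ],
            "containers": [
                {"name": "verify", "image": "busybox",
                 "command": ["sh", "-c", check], "volumeMounts": [mount]},
            ],
        },
    }


if __name__ == "__main__":
    # Image names below are placeholders, not published images.
    print(json.dumps(build_initializer_pod("dataset:dev", "model:dev"), indent=2))
```

Because initContainers run to completion before the main container starts, the Pod reaching the `Succeeded` phase would indicate that both initializers ran and the assets exist.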
What this PR does / why we need it:
I added unit tests and e2e tests for model and dataset initializers.
Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #2305
Checklist: