Rationalize and expand testing model #24
Conversation
…nerate artifacts
…simplified global TestRunner tag=value configuration as TestRunSession.test_run_parameters slot; further documented usage in a few places; removed some TestCase subclasses that felt superfluous given the aforementioned TestRunner configuration changes.
type TestRunSession
{
    id: Uriorcurie!
    name: String
    description: String
    tags: [String]
    testRunnerSettings: [String]
    components: [ComponentEnum]
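For reference, the TestRunSession type above could be sketched in Python roughly as follows. This is a hand-written dataclass approximation, not the actual LinkML-generated Pydantic class; field names follow the schema snippet, and the example identifiers are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TestRunSession:
    """Approximation of the TestRunSession type shown in the diff."""
    id: str  # Uriorcurie in the schema; modeled here as a plain string
    name: Optional[str] = None
    description: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    test_runner_settings: List[str] = field(default_factory=list)
    components: List[str] = field(default_factory=list)  # infores identifiers

# Illustrative instance (the id and infores values are made up)
session = TestRunSession(
    id="urn:example:run-1",
    components=["infores:ars", "infores:aragorn"],
)
```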
I don't think components should be all the way up at the TestRunSession. I think it needs to stay down at TestCase. We could conceivably want to run tests against different components in the same Test Run (Session)
A use case could be TRAPI validation. We would hit all levels, KPs, ARAs, ARS, make sure that none of them are breaking the standard.
Hi @maximusunc, `components` is a `List[str]` of infores identifiers of target components. The implicit assumption that was made (perhaps incorrect, I guess) was that all of the `test_entities` (test data: TestAssets/Cases/Suites) of a given TestRunSession will be run against all of the specified `components`. If a TestRunSession is a heterogeneous mix of testing targets for which the data is not mapped one-to-one, then of course, this assumption is violated. I guess it just depends how one decides to cut the cake.
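The "all test entities against all components" assumption described above amounts to a simple cross product. A minimal sketch, where `run_test` is a generic stand-in for a real TestRunner invocation and the entity and component values are invented:

```python
# Hypothetical sketch of the implicit assumption: every test entity in a
# TestRunSession is run against every listed component (a cross product).
def run_session(test_entities, components, run_test):
    results = []
    for component in components:          # outer loop: target components
        for entity in test_entities:      # inner loop: test data
            results.append((component, entity, run_test(component, entity)))
    return results

# 2 components x 2 entities -> 4 test executions
results = run_session(
    test_entities=["case-1", "case-2"],
    components=["infores:arax", "infores:molepro"],
    run_test=lambda component, entity: "PASSED",  # placeholder runner
)
```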
As promised @maximusunc (cc: @sierra-moxon), I will try to express my sense of the testing landscape. I suspect there is no "right answer" to this... that your current implementation is equally tenable... Just trying to "think out loud" here.
Anyhow, I would see the planning of the test process as an ordered series of test configuration decisions. Here's an order I have sort of followed in earlier testing work:
- which component do you wish to test (UI, ARS, ARA, KP, ??)
- within which environment are you running the component to be tested?
- which category of test ("TestRunner") do you wish to run against the component?
- within what global context do you wish to test (e.g. TRAPI version, Biolink version, ???)
- what test data (TestAssets/Cases/Suites) do you wish to use as input?
- when do you want to run the test?
- how do you wish to assess the test results?
My general sense is that the answers to all of the above questions could be aggregated within a top level TestRunSession declaration.
What I understand, however, is that from your perspective, a TestRunSession is mainly a scheduled (timestamped) container for any ad hoc set of test data (TestAssets/Cases/Suites) to be processed, and that the test data itself will specify where it is consumed (i.e. which component). Thus, different members of the `test_entities` array of test data could be routed to different target components, albeit all processed by the same specified TestRunner.
Nothing wrong with that, I guess... one is simply embedding the answer to question 1 above inside the test data returned in response to question 5. On that note, would the `test_env` also belong there as well, or does that parameter tend to be more global to a given test run?
Just an alternate way of organizing the testing... "cutting the cake..." LOL. In terms of granularity, I might tend to suggest that the granularity for this is the TestSuite - in that components might "own" a TestSuite, right (even if in some cases, there is duplication of test data in TestSuite data across components).
I guess that is actually what the legacy SRI_Testing test harness did to some extent, since the test data was resolved from KPs using their Translator SmartAPI `info.x-trapi.test_data_location` property. That said, the top-level test run specification decision of SRI_Testing was still to specify which component(s) were to be tested... OK, just a different way of cutting the cake.
In the context of the above list of questions, I should probably ask your opinion about question 4. In particular, I suppose that TestRunner specific parameters like TRAPI version and Biolink version (e.g. for a standards validation test runner) could be pushed down into the TestSuite level, if it is the case that specific TestSuite instances are run as checks on standards. Alternately, if one expects to generally run the standards validation as an overlay test using many of the test data sets used in other tests, then where is it best specified? The TestRunSession seemed to be the best location but if so, should these Translator level declarations be specified at the TestHarness level CLI (perhaps defaulting to the 'current' consortium releases, if not specified)?
Come to think of it, the TestRunner also currently seems to be specified at the TestCase level. This seems very granular. Is there a possible rationale for moving this up to the TestSuite level, or even the TestRunSession level? (Actually, there is already a `test_runner_name` slot in TestRunSession which reflects this idea...) More alternative ways of cutting the cake?
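The routing discussed above (each piece of test data naming its own target component, all handled by one shared TestRunner) could be sketched roughly as follows. The dict-shaped cases and the `component` key are illustrative assumptions, not the actual model classes:

```python
from collections import defaultdict

# Hypothetical sketch: each test case carries its own target component,
# and the harness groups cases by component before handing each group to
# the single TestRunner specified for the session.
def group_by_component(test_cases):
    grouped = defaultdict(list)
    for case in test_cases:
        grouped[case["component"]].append(case["id"])
    return dict(grouped)

grouped = group_by_component([
    {"id": "case-1", "component": "infores:arax"},
    {"id": "case-2", "component": "infores:molepro"},
    {"id": "case-3", "component": "infores:arax"},
])
```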
Any further thoughts @maximusunc and @sierra-moxon on any or all of the above?
@@ -53,6 +53,10 @@ class TestObjectiveEnum(str, Enum):
    BenchmarkTest = "BenchmarkTest"
    # Quantitative test
    QuantitativeTest = "QuantitativeTest"
    # One Hop Tests of knowledge graph navigation
    OneHopTests = "OneHopTests"
Should this be plural?
I guess not, for consistency. I'll change that. Done
# One Hop Tests of knowledge graph navigation
OneHopTests = "OneHopTests"
# TRAPI and Biolink Model ("reasoner-validator") validation
StandardsValidation = "StandardsValidation"
Should we slap a "Test" on the end of this?
Or remove "Test" from all?
OK. Done
dummy = "dummy"
What's the goal here?
No idea... the file was autogenerated (maybe @sierra-moxon knows?). This is an open-ended "reachable_from" Enum. Perhaps the LinkML code generator needs further work to spit out something more sensible?
@@ -558,6 +487,7 @@ class StandardsComplianceTestSuite(TestSuite):
    name: Optional[str] = Field(None, description="""A human-readable name for a Test Entity""")
    description: Optional[str] = Field(None, description="""A human-readable description for a Test Entity""")
    tags: Optional[List[str]] = Field(default_factory=list, description="""A human-readable tags for categorical memberships of a TestEntity (preferably a URI or CURIE). Typically used to aggregate instances of TestEntity into formally typed or ad hoc lists.""")
    test_runner_settings: Optional[List[str]] = Field(default_factory=list, description="""Scalar parameters for the TestRunner processing a given TestEntity.""")

class OneHopTestSuite(TestSuite):
Does the OneHopTestSuite not need a TestCase class?
Same for TRAPIValidation
@maximusunc, isn't that covered by the class attribute:
test_cases: Optional[Dict[str, TestCase]] = Field(default_factory=dict, description="""List of explicitly enumerated Test Cases.""")
That being said, I'm less convinced that these subclasses of TestSuite - i.e. AcceptanceTestSuite, BenchmarkTestSuite (this one looks weird...), StandardsComplianceTestSuite and OneHopTestSuite - are even necessary (anymore, if they ever were) and could perhaps be removed from the model. @sierra-moxon .. any thoughts about this?
I think we use TestCase for the runner to help collate assets into a single query. Happy to talk this through on a call.
I'm fine with removing these subclass Suite classes. In the Test Harness, I basically start with grabbing a single TestSuite, and then for each TestCase, I look at the specified test_case_objective to determine which Runner to send it to.
I would also argue that a Suite can have different types of TestCases within it, so these subclass Suites might not even actually fit well.
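The dispatch step described above (inspecting each TestCase's `test_case_objective` to decide which Runner receives it) might look something like this sketch. The runner registry, its keys, and the return values here are hypothetical, not the real Test Harness code:

```python
# Hypothetical sketch of objective-based dispatch in a test harness:
# each TestCase's test_case_objective selects the runner that processes it.
RUNNERS = {
    "AcceptanceTest": lambda case: f"acceptance:{case}",
    "StandardsValidationTest": lambda case: f"standards:{case}",
    "OneHopTest": lambda case: f"one-hop:{case}",
}

def dispatch(test_case_objective, case_id):
    runner = RUNNERS.get(test_case_objective)
    if runner is None:
        raise ValueError(f"No runner registered for {test_case_objective!r}")
    return runner(case_id)

result = dispatch("OneHopTest", "case-42")
```

This shape also illustrates Max's point: because dispatch keys off each case's objective, a single TestSuite can freely mix TestCase types without needing typed Suite subclasses.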
… project artifacts
…ing but with better definitions)

* main: (25 commits)
  - adding additional suites
  - enable dynamic versioning
  - remove support for pydantic 1
  - add new asset file
  - add new asset file
  - remove debugging statements
  - pass entire dict instead of just identifier to dump method
  - refactoring methods to make code cleaner
  - add categories to chemical gene
  - merge in work with updated schema, not using TestRun yet
  - merge in work with updated schema, not using TestRun yet
  - Rationalize and expand testing model (#24)
  - making a new release of TranslatorTesting Model
  - fix up parsing script, update pf file
  - strip whitespace
  - minor changes to semantics of the name
  - Update test names to include expected output
  - Fixed TestObjectiveEnum values as per feedback from Max
  - ditto
  - Repaired example data to fix unit tests
  - ...

# Conflicts:
#   src/translator_testing_model/schema/translator_testing_model.yaml
- move `test_env` and `test_runner_settings` upwards to TestRunSession
- add `test_run_parameters` 'tag=value' slot in TestRunSession to allow for global configuration of TestRunner
- rename `runner_settings` to `test_runner_settings` and move upwards to TestEntity, for fine-grained test-data driven local configuration of a TestRunner
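Under this split, a TestRunner would presumably resolve its effective configuration by layering per-entity `test_runner_settings` over the session-wide `test_run_parameters`. A minimal sketch of that precedence, assuming simple 'tag=value' strings (the parameter names shown are illustrative, not defined by the model):

```python
# Hypothetical sketch of layered TestRunner configuration: session-wide
# "tag=value" test_run_parameters provide defaults, and per-entity
# test_runner_settings override them.
def effective_settings(test_run_parameters, test_runner_settings):
    def parse(pairs):
        # "tag=value" -> {"tag": "value"}; split only on the first "="
        return dict(p.split("=", 1) for p in pairs)

    merged = parse(test_run_parameters)         # global defaults
    merged.update(parse(test_runner_settings))  # local overrides win
    return merged

settings = effective_settings(
    ["trapi_version=1.4", "biolink_version=3.5"],  # TestRunSession level
    ["trapi_version=1.5"],                         # TestEntity level
)
```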