Rationalize and expand testing model #24
Conversation
…nerate artifacts
…simplified global TestRunner tag=value configuration as TestRunSession.test_run_parameters slot; further documented usage in a few places; removed some TestCase subclasses that felt superfluous given the aforementioned TestRunner configuration changes.
type TestRunSession
{
    id: Uriorcurie!
    name: String
    description: String
    tags: [String]
    testRunnerSettings: [String]
    components: [ComponentEnum]
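For reference, the TestRunSession type above could be sketched in Python roughly as follows. This is a hand-written dataclass approximation, not the actual LinkML-generated Pydantic class; field names follow the schema snippet, and the example identifiers are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TestRunSession:
    """Approximation of the TestRunSession type shown in the diff."""
    id: str  # Uriorcurie in the schema; modeled here as a plain string
    name: Optional[str] = None
    description: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    test_runner_settings: List[str] = field(default_factory=list)
    components: List[str] = field(default_factory=list)  # infores identifiers

# Illustrative instance (the id and infores values are made up)
session = TestRunSession(
    id="urn:example:run-1",
    components=["infores:ars", "infores:aragorn"],
)
```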
I don't think components should be all the way up at the TestRunSession. I think it needs to stay down at TestCase. We could conceivably want to run tests against different components in the same Test Run (Session)
A use case could be TRAPI validation. We would hit all levels, KPs, ARAs, ARS, make sure that none of them are breaking the standard.
Hi @maximusunc, `components` is a `List[str]` of infores identifiers of target components. The implicit assumption that was made (perhaps incorrect, I guess) was that all of the `test_entities` (test data: TestAssets/Cases/Suites) of a given TestRunSession will be run against all of the specified `components`. If a TestRunSession is a heterogeneous mix of testing targets for which the data is not mapped one-to-one, then of course, this assumption is violated. I guess it just depends how one decides to cut the cake.
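The "all test entities against all components" assumption described above amounts to a simple cross product. A minimal sketch, where `run_test` is a generic stand-in for a real TestRunner invocation and the entity and component values are invented:

```python
# Hypothetical sketch of the implicit assumption: every test entity in a
# TestRunSession is run against every listed component (a cross product).
def run_session(test_entities, components, run_test):
    results = []
    for component in components:          # outer loop: target components
        for entity in test_entities:      # inner loop: test data
            results.append((component, entity, run_test(component, entity)))
    return results

# 2 components x 2 entities -> 4 test executions
results = run_session(
    test_entities=["case-1", "case-2"],
    components=["infores:arax", "infores:molepro"],
    run_test=lambda component, entity: "PASSED",  # placeholder runner
)
```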
As promised @maximusunc (cc: @sierra-moxon), I will try to express my sense of the testing landscape. I suspect there is no "right answer" to this... that your current implementation is equally tenable... Just trying to "think out loud" here.
Anyhow, I would see the planning of the test process as an ordered series of test configuration decisions. Here's an order I have sort of followed in earlier testing work:
- which component do you wish to test (UI, ARS, ARA, KP, ??)
- within which environment are you running the component to be tested?
- which category of test ("TestRunner") do you wish to run against the component?
- within what global context do you wish to test (e.g. TRAPI version, Biolink version, ???)
- what test data (TestAssets/Cases/Suites) do you wish to use as input?
- when do you want to run the test?
- how do you wish to assess the test results?
My general sense is that the answers to all of the above questions could be aggregated within a top level TestRunSession declaration.
What I understand, however, is that from your perspective, a TestRunSession is mainly a scheduled (timestamped) container for any ad hoc set of test data (TestAssets/Cases/Suites) to be processed, and that the test data itself will specify where it is consumed (i.e. which component). Thus, different members of the `test_entities` array of test data could be routed to different target components, albeit all processed by the same specified TestRunner.
Nothing wrong with that, I guess... one is simply embedding the answer to question 1 above inside the test data returned in response to question 5. On that note, would the `test_env` also belong there as well, or does that parameter tend to be more global to a given test run?
Just an alternate way of organizing the testing... "cutting the cake..." LOL. In terms of granularity, I might tend to suggest that the granularity for this is the TestSuite - in that components might "own" a TestSuite, right (even if in some cases, there is duplication of test data in TestSuite data across components).
I guess that is actually what the legacy SRI_Testing test harness did to some extent, since the test data was resolved from KPs using their Translator SmartAPI `info.x-trapi.test_data_location` property. That said, the top-level test run specification decision of SRI_Testing was still to specify which component(s) were to be tested... OK, just a different way of cutting the cake.
In the context of the above list of questions, I should probably ask your opinion about question 4. In particular, I suppose that TestRunner specific parameters like TRAPI version and Biolink version (e.g. for a standards validation test runner) could be pushed down into the TestSuite level, if it is the case that specific TestSuite instances are run as checks on standards. Alternately, if one expects to generally run the standards validation as an overlay test using many of the test data sets used in other tests, then where is it best specified? The TestRunSession seemed to be the best location but if so, should these Translator level declarations be specified at the TestHarness level CLI (perhaps defaulting to the 'current' consortium releases, if not specified)?
Come to think of it, the TestRunner also currently seems to be specified at the TestCase level. This seems very granular. Is there a possible rationale for moving this up to the TestSuite level, or even the TestRunSession level? (Actually, there is already a `test_runner_name` slot in TestRunSession which reflects this idea...) More alternative ways of cutting the cake?
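The routing discussed above (each piece of test data naming its own target component, all handled by one shared TestRunner) could be sketched roughly as follows. The dict-shaped cases and the `component` key are illustrative assumptions, not the actual model classes:

```python
from collections import defaultdict

# Hypothetical sketch: each test case carries its own target component,
# and the harness groups cases by component before handing each group to
# the single TestRunner specified for the session.
def group_by_component(test_cases):
    grouped = defaultdict(list)
    for case in test_cases:
        grouped[case["component"]].append(case["id"])
    return dict(grouped)

grouped = group_by_component([
    {"id": "case-1", "component": "infores:arax"},
    {"id": "case-2", "component": "infores:molepro"},
    {"id": "case-3", "component": "infores:arax"},
])
```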
Any further thoughts @maximusunc and @sierra-moxon on any or all of the above?
@@ -53,6 +53,10 @@ class TestObjectiveEnum(str, Enum):
    BenchmarkTest = "BenchmarkTest"
    # Quantitative test
    QuantitativeTest = "QuantitativeTest"
    # One Hop Tests of knowledge graph navigation
    OneHopTests = "OneHopTests"
Should this be plural?
I guess not, for consistency. I'll change that. Done
# One Hop Tests of knowledge graph navigation
OneHopTests = "OneHopTests"
# TRAPI and Biolink Model ("reasoner-validator") validation
StandardsValidation = "StandardsValidation"
Should we slap a "Test" on the end of this?
Or remove "Test" from all?
OK. Done
dummy = "dummy"
What's the goal here?
No idea... the file was autogenerated (maybe @sierra-moxon knows?). This is an open-ended "reachable_from" Enum. Perhaps the LinkML code generator needs further work to spit out something more sensible?
@@ -558,6 +487,7 @@ class StandardsComplianceTestSuite(TestSuite):
    name: Optional[str] = Field(None, description="""A human-readable name for a Test Entity""")
    description: Optional[str] = Field(None, description="""A human-readable description for a Test Entity""")
    tags: Optional[List[str]] = Field(default_factory=list, description="""A human-readable tags for categorical memberships of a TestEntity (preferably a URI or CURIE). Typically used to aggregate instances of TestEntity into formally typed or ad hoc lists.""")
    test_runner_settings: Optional[List[str]] = Field(default_factory=list, description="""Scalar parameters for the TestRunner processing a given TestEntity.""")

class OneHopTestSuite(TestSuite):
Does the OneHopTestSuite not need a TestCase class?
Same for TRAPIValidation
@maximusunc, isn't that covered by the class attribute:
test_cases: Optional[Dict[str, TestCase]] = Field(default_factory=dict, description="""List of explicitly enumerated Test Cases.""")
That being said, I'm less convinced that these subclasses of TestSuite - i.e. AcceptanceTestSuite, BenchmarkTestSuite (this one looks weird...), StandardsComplianceTestSuite and OneHopTestSuite - are even necessary (anymore, if they ever were) and could perhaps be removed from the model. @sierra-moxon .. any thoughts about this?
I think we use TestCase for the runner to help collate assets into a single query. Happy to talk this through on a call.
I'm fine with removing these subclass Suite classes. In the Test Harness, I basically start with grabbing a single TestSuite, and then for each TestCase, I look at the specified test_case_objective to determine which Runner to send it to.
I would also argue that a Suite can have different types of TestCases within it, so these subclass Suites might not even actually fit well.
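The dispatch step described above (inspecting each TestCase's `test_case_objective` to decide which Runner receives it) might look something like this sketch. The runner registry, its keys, and the return values here are hypothetical, not the real Test Harness code:

```python
# Hypothetical sketch of objective-based dispatch in a test harness:
# each TestCase's test_case_objective selects the runner that processes it.
RUNNERS = {
    "AcceptanceTest": lambda case: f"acceptance:{case}",
    "StandardsValidationTest": lambda case: f"standards:{case}",
    "OneHopTest": lambda case: f"one-hop:{case}",
}

def dispatch(test_case_objective, case_id):
    runner = RUNNERS.get(test_case_objective)
    if runner is None:
        raise ValueError(f"No runner registered for {test_case_objective!r}")
    return runner(case_id)

result = dispatch("OneHopTest", "case-42")
```

This shape also illustrates Max's point: because dispatch keys off each case's objective, a single TestSuite can freely mix TestCase types without needing typed Suite subclasses.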
… project artifacts
…ing but with better definitions)

* main: (25 commits)
  - adding additional suites
  - enable dynamic versioning
  - remove support for pydantic 1
  - add new asset file
  - add new asset file
  - remove debugging statements
  - pass entire dict instead of just identifier to dump method
  - refactoring methods to make code cleaner
  - add categories to chemical gene
  - merge in work with updated schema, not using TestRun yet
  - merge in work with updated schema, not using TestRun yet
  - Rationalize and expand testing model (#24)
  - making a new release of TranslatorTesting Model
  - fix up parsing script, update pf file
  - strip whitespace
  - minor changes to semantics of the name
  - Update test names to include expected output
  - Fixed TestObjectiveEnum values as per feedback from Max
  - ditto
  - Repaired example data to fix unit tests
  - ...

# Conflicts:
#   src/translator_testing_model/schema/translator_testing_model.yaml
- move `test_env` and `test_runner_settings` upwards to TestRunSession
- add `test_run_parameters` 'tag=value' slot in TestRunSession to allow for global configuration of TestRunner
- rename `runner_settings` to `test_runner_settings` and move upwards to TestEntity, for fine-grained test-data driven local configuration of a TestRunner
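Under this split, a TestRunner would presumably resolve its effective configuration by layering per-entity `test_runner_settings` over the session-wide `test_run_parameters`. A minimal sketch of that precedence, assuming simple 'tag=value' strings (the parameter names shown are illustrative, not defined by the model):

```python
# Hypothetical sketch of layered TestRunner configuration: session-wide
# "tag=value" test_run_parameters provide defaults, and per-entity
# test_runner_settings override them.
def effective_settings(test_run_parameters, test_runner_settings):
    def parse(pairs):
        # "tag=value" -> {"tag": "value"}; split only on the first "="
        return dict(p.split("=", 1) for p in pairs)

    merged = parse(test_run_parameters)         # global defaults
    merged.update(parse(test_runner_settings))  # local overrides win
    return merged

settings = effective_settings(
    ["trapi_version=1.4", "biolink_version=3.5"],  # TestRunSession level
    ["trapi_version=1.5"],                         # TestEntity level
)
```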