Flight Evaluation Report explained
The 4.x.x release added a new level of detail in the telemetry data collected during test execution. Thanks to these enhancements, the report can contain the following details about each run:
- launchId - A unique identifier of each launch
- start time - The moment when the launch started
- end time - The moment when the launch completed
- class name - The name of the class or feature file which was launched
- method name - The name of the method or scenario launched
- stage name - Indicates whether the launch was a countdown (test instance preparation) or a mission (actual test execution)
- thread name - The name of the Java thread used for running the test
- matchers - The Abort-Mission matchers matching the run
- result - The result of the launch (success, failure, aborted, suppressed)
- display name - The name of the launch using the display name notation of the test runner (e.g. the name used by Jupiter, Cucumber or others)
- exception class - The type of the exception thrown if the launch ended with an exception
- exception message - The message of the aforementioned exception if there was any
- exception stack trace - The stack trace entries of the exception if there was any, filtered based on the global configuration
This can help with fine-tuning the matcher and evaluator configurations and with identifying which dependency was failing during the run, and it lets you see what your tests were doing and when. That, in turn, can help you spot issues related to poor test case isolation, or simply help you understand which actions contributed to the test failures you are facing.
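To make the structure of a single launch entry easier to picture, here is a minimal sketch of it as a plain Java record. The field names simply mirror the list above; this is an illustration only, not the library's actual classes or serialized format.

```java
import java.util.List;

// Hypothetical shape of a single launch entry; the fields mirror the list above,
// they are NOT the library's real types or report schema.
record LaunchTelemetryEntry(
        String launchId,            // unique identifier of the launch
        long startTimeEpochMillis,  // start time
        long endTimeEpochMillis,    // end time
        String className,           // name of the class or feature file
        String methodName,          // name of the method or scenario
        String stageName,           // "countdown" or "mission"
        String threadName,          // Java thread used for running the test
        List<String> matchers,      // Abort-Mission matchers matching the run
        String result,              // success, failure, aborted or suppressed
        String displayName,         // display name used by the test runner
        String exceptionClass,      // type of the thrown exception, if any
        String exceptionMessage,    // message of that exception, if any
        List<String> stackTrace     // filtered stack trace entries, if any
) {
}
```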
To help you get the most out of this tool, let us go through the contents of the report one by one.
The picture below provides a quick look at the new report look and feel.
If you want to try it in action, please feel free to check out this example generated by Jupiter tests, or this one using Cucumber.
The report can be split into smaller sections. Please see these on the following picture:
This section contains a quick summary of each test execution, showing how many test runs were observed with each outcome (Success, Failure, Abort, Suppression), as well as time statistics about them. Please keep in mind that these tests may be running on many threads, so the sum of their individual execution times can differ from the actual time we observe while waiting for our tests to finish.
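To illustrate that difference, the sketch below (reusing the hypothetical `LaunchTelemetryEntry` record from above, so not the report's actual code) contrasts the summed execution time with the wall-clock time measured from the first start to the last end.

```java
import java.util.List;

// Illustrative only: why the summed execution time can exceed the wall-clock time
// when tests run on multiple threads in parallel.
final class TimeStatistics {
    static void print(List<LaunchTelemetryEntry> runs) {
        // Sum of the individual run durations (what the summary adds up).
        final long summedMillis = runs.stream()
                .mapToLong(r -> r.endTimeEpochMillis() - r.startTimeEpochMillis())
                .sum();
        // Wall-clock time: from the earliest start to the latest end.
        final long lastEnd = runs.stream()
                .mapToLong(LaunchTelemetryEntry::endTimeEpochMillis).max().orElse(0L);
        final long firstStart = runs.stream()
                .mapToLong(LaunchTelemetryEntry::startTimeEpochMillis).min().orElse(0L);
        System.out.printf("summed=%d ms, wall-clock=%d ms%n", summedMillis, lastEnd - firstStart);
    }
}
```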
When filtering is active (i.e. filter rules were added using the filter inputs), a few rows appear right below the filter inputs. These indicate what the current filter criteria are and how they affect the rest of the report, looking like the items in the picture below.
As indicated in the picture, this section has three important groups:
- Time based filtering boundaries - Showing which timestamps are used for the `end after` or `start before` filtering criteria. These are evaluated as follows (see also the sketch after this list):
  - The `end after` timestamp means that only those runs are kept which completed their execution after the given moment. Despite the name, this is practically the starting boundary of the log records due to the semantics (if a run ends before the timestamp, it is filtered out).
  - The `start before` timestamp means that only those runs are kept which began their execution before the given moment. Same as before, this is practically the ending boundary of the log records due to the semantics (if a run starts after the timestamp, it is filtered out).
- Inclusion criteria - Defines which classes/features, methods/scenarios, matchers, threads, results and stages should be included in the event log. An event is included if there are no inclusion criteria, or if any of the criteria match the event.
- Exclusion criteria - Further refines the result of the inclusion by removing matching events. A previously included event is excluded if any of the exclusion criteria match the event in question.
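To make the evaluation order more tangible, the sketch below combines the time boundaries and the inclusion/exclusion criteria into a single filter decision. It reuses the hypothetical `LaunchTelemetryEntry` record from earlier and models the criteria as plain predicates; the exact boundary handling (inclusive vs. exclusive) and the types are assumptions, not the report's actual implementation.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical filter combining the rules described above.
final class EventLogFilter {
    static boolean isDisplayed(
            LaunchTelemetryEntry entry,
            long endAfterEpochMillis,    // "end after" boundary
            long startBeforeEpochMillis, // "start before" boundary
            List<Predicate<LaunchTelemetryEntry>> inclusionCriteria,
            List<Predicate<LaunchTelemetryEntry>> exclusionCriteria) {
        // Time boundaries: keep only runs ending after "end after"
        // and starting before "start before".
        if (entry.endTimeEpochMillis() <= endAfterEpochMillis
                || entry.startTimeEpochMillis() >= startBeforeEpochMillis) {
            return false;
        }
        // Inclusion: kept if there are no inclusion criteria, or any of them match.
        final boolean included = inclusionCriteria.isEmpty()
                || inclusionCriteria.stream().anyMatch(c -> c.test(entry));
        if (!included) {
            return false;
        }
        // Exclusion: a previously included event is dropped
        // if any exclusion criterion matches it.
        return exclusionCriteria.stream().noneMatch(c -> c.test(entry));
    }
}
```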
This section becomes visible when the execution overview link is clicked on any of the events; the overview then displays the class/feature the clicked event belongs to. Please see the different parts of the overview in the picture below:
As seen in the picture above, the overview focuses on exactly one class/feature. It can be closed using the relevant link in the top right corner of the section.
Regarding the contents, it displays the name and statistics of the class/feature in total, and of each method/scenario it contains, including the countdowns as well. Each of these has a name in the first meaningful cell of the row. These cells are color-coded by the worst outcome of the unit they display. The outcomes range from success (best) through suppressed and aborted to failure (worst). If the given unit has no matching run, the cell keeps the default/empty background.
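The "worst outcome wins" color-coding could be modelled with a simple severity ordering, roughly like the sketch below. The enum and its ordering are illustrative only and do not mirror the library's internal types.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.Optional;

// Hypothetical outcome ordering: constants declared later are considered "worse".
enum Outcome {
    SUCCESS, SUPPRESSED, ABORTED, FAILURE;

    // Picks the worst outcome of a unit, or empty if the unit had no matching run
    // (which would leave the cell with the default/empty background).
    static Optional<Outcome> worstOf(Collection<Outcome> outcomes) {
        return outcomes.stream().max(Comparator.naturalOrder());
    }
}
```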
There is a brief header in the section containing execution time information like:
- Start time - The start of the first execution related to the class/feature
- End time - The end of the last execution related to the class/feature
The following information is displayed for each unit (a sketch of how these statistics could be aggregated follows the list):
- Name
- Count - Number of runs
- Execution time - The total duration spent on the execution of the matching runs. If the execution is multi-threaded, the value is the sum of the durations measured on each thread.
- Min - The shortest duration measured for a single countdown or mission execution in the scope of the entry
- Average - The average duration measured for countdown or mission executions in the scope of the entry
- Max - The longest duration measured for a single countdown or mission execution in the scope of the entry
- Success - The number of successful countdown or mission executions in the scope of the entry
- Failure - The number of failed countdown or mission executions in the scope of the entry
- Abort - The number of aborted countdown or mission executions in the scope of the entry
- Suppressed - The number of countdown or mission executions in the scope of the entry when the abort was suppressed
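As a rough sketch of how these per-unit statistics could be derived from the collected entries (again assuming the hypothetical `LaunchTelemetryEntry` record from earlier, not the report generator's actual code):

```java
import java.util.List;
import java.util.LongSummaryStatistics;

// Hypothetical aggregation of the per-unit statistics listed above.
final class UnitStatistics {
    static void print(String unitName, List<LaunchTelemetryEntry> runs) {
        // One pass gives Count, Execution time (sum), Min, Average and Max.
        // (Assumes at least one run; an empty list would yield degenerate min/max.)
        final LongSummaryStatistics durations = runs.stream()
                .mapToLong(r -> r.endTimeEpochMillis() - r.startTimeEpochMillis())
                .summaryStatistics();
        System.out.printf(
                "%s: count=%d total=%dms min=%dms avg=%.1fms max=%dms "
                        + "success=%d failure=%d abort=%d suppressed=%d%n",
                unitName, durations.getCount(), durations.getSum(),
                durations.getMin(), durations.getAverage(), durations.getMax(),
                countByResult(runs, "success"), countByResult(runs, "failure"),
                countByResult(runs, "aborted"), countByResult(runs, "suppressed"));
    }

    private static long countByResult(List<LaunchTelemetryEntry> runs, String result) {
        return runs.stream().filter(r -> result.equalsIgnoreCase(r.result())).count();
    }
}
```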
Furthermore, each countdown/mission row reveals additional information when expanded, such as:
- Start time - The start of the first execution of a matching run
- End time - The end of the last execution of a matching run
- Relevant matchers - Lists all of the matchers which were used/matched during the execution of the test class/method it belongs to
The execution overview has two modes. By default, it only displays information about the rows which were not filtered out. This may be less informative at times, so you can switch to the alternative view, which displays full information about the class/feature by ignoring the filtering criteria.
In addition to the information provided, the execution overview plays an important role in the convenient exclusion of certain runs. Please notice the related links in the section.
The next section is the event log. As you will see in the next picture, it displays information in chronological order. Each row (or row group) represents a millisecond in which something happened during execution. This timestamp is placed in the first column, followed by an illustration showing the activity of each testing thread, and then by the relevant messages detailing what happened at that moment.
Each thread lifeline and log entry is clickable. Clicking them highlights the start and end entries as well as the relevant thread lifeline. The thread lifelines are color-coded based on their stage and outcome. The base colors are the same as in the execution overview section, with countdowns using darker shades than missions.
When time based filtering is active, the `end after` and `start before` boundaries are displayed as prominent blue lines, as seen in the example below:
Please note that some items which start or end outside of the boundaries can be displayed as well (as long as they had an active execution lifeline at any moment within the boundaries).