
Evaluation of test results: 5 steps

Marcel Wigman edited this page Jun 10, 2021 · 71 revisions

Step 1: check if the test run was successful

Evaluate the latest load test result called "productie"

Was test run successful
  • Check if the metric Transactions failed is <10%. This is the number of transactions that returned an HTTP error or whose content check did not meet expectations
  • Check if the metric Duration of the test run is as expected for this type of test, usually 1 hour
  • Visually check in the test progress graph that average response times (blue) are stable, errors (red) are minimal, and throughput (black) is stable during the whole test

Step 2: does the application meet the requirements?

Evaluate the latest load test result called "productie"

Does application meet requirements summary
Does application meet requirements transactions
  • Check if the metric Threshold violations is 0. This is the number of transactions that exceed their threshold; they are marked red in the column 'evaluated'. Yellow-marked transactions are only a warning for the developer; only red ones count as violations. Report to the developers the individual transactions that violated their requirement
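The red/yellow classification above could be sketched as follows. This is a hypothetical illustration: the 10%-below-threshold warning band and the data shapes are assumptions, not the tool's documented rules.

```python
# Hypothetical sketch of the threshold evaluation; the 10% warning band
# and the input shapes are assumptions, not the tool's actual rules.

def evaluate_transactions(results_ms, thresholds_ms):
    """Classify each transaction's average response time against its
    threshold: 'red' = violation, 'yellow' = warning only, 'green' = fine."""
    evaluated = {}
    for name, avg in results_ms.items():
        limit = thresholds_ms[name]
        if avg > limit:
            evaluated[name] = "red"     # exceeds threshold: a violation
        elif avg > 0.9 * limit:
            evaluated[name] = "yellow"  # within 10% of the limit: warning
        else:
            evaluated[name] = "green"
    violations = [n for n, c in evaluated.items() if c == "red"]
    return evaluated, violations
```

Only the `violations` list (the red transactions) would be reported back to the developers.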

Step 3: did the application speed improve or degrade?

Evaluate the latest load test result called "productie"

Does application improve summary
Does application improve transactions
  • Check if the metric Baseline warnings is 0. This is the number of transactions that degraded in performance by more than 15% compared to the baseline. Such response times are marked red in the column 'Delta'. Only transactions that are yellow or red are evaluated in the baseline delta comparison
Does application improve trend
  • If a transaction violates its threshold, check in the transaction response time history graph when the degradation started. If tests run on a daily basis, the scope of changes is limited and one of them is responsible for the performance penalty. Walking this path reduces the time needed to investigate a performance problem
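The baseline comparison above can be sketched as a simple delta check. The 15% limit comes from the text; the function name and the input shapes are illustrative assumptions.

```python
# Hypothetical sketch of the Baseline warnings check; the 15% limit is
# from the text, the helper name and data shapes are assumptions.

def baseline_warnings(current_ms, baseline_ms, max_degradation=0.15):
    """Names of transactions whose response time degraded more than 15%
    versus the baseline (the ones marked red in the 'Delta' column)."""
    warnings = []
    for name, avg in current_ms.items():
        base = baseline_ms.get(name)
        if base and (avg - base) / base > max_degradation:
            warnings.append(name)
    return warnings
```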

Step 4: will the application stay stable (endurance)?

Evaluate the latest endurance test result called "duurtest"

Does application improve trend
  • Check if the metric Trendbreak stability is 100% (a break at 100% progress means it does not break); see help on this metric.
  • Visually check that response times and throughput were stable during the full duration of the test run
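The idea behind the Trendbreak stability metric can be sketched as follows. The tool's actual algorithm is not documented here, so the detection rule used (response times exceeding twice the early-test median) is purely an assumption for illustration.

```python
import statistics

# Simplified sketch of the Trendbreak stability idea; the real metric's
# algorithm is not documented here, so the detection rule (response time
# exceeding twice the early-test median) is an assumption.

def trendbreak_stability(response_times_ms, factor=2.0):
    """Return the test progress percentage at which response times break
    upward; 100 means the run stayed stable to the end (no break)."""
    head = response_times_ms[: max(1, len(response_times_ms) // 4)]
    base = statistics.median(head)
    for i, rt in enumerate(response_times_ms):
        if rt > factor * base:
            return round(100 * i / len(response_times_ms))
    return 100
```

A run that breaks at 80% progress, for example, would score 80 instead of 100 and should be reported.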

Step 5: to what level can the application be overloaded (stress)?

Evaluate the latest stress test result called "stresstest"

Important: This test is meant to create an overload situation. During a stress test, the application will reach a state of saturation: response times and error rates will rise as a result. Ignore threshold violations and baseline warnings for this type of test; concentrate on the breaking point

What is the breaking point
  • Check how the value of the metric Trendbreak scalability compares to previous stress tests; see help on this metric. This metric reflects the percentage of the ultimate load level at which the throughput drops or errors start occurring. Although the test environment is not powered to the level of a production environment, it is still interesting to see what the margins are and whether the breaking point is moving. Be sure to apply load beyond what the application can handle in the environment used; if not, this metric will be 100% and will not be of much use.
  • Visually check if the metric value reflects the test outcome; in some cases the algorithm used to calculate the metric does not draw the right conclusion
  • Check if the trendbreak metric is stable compared to previous tests, using the trend graph. If Trendbreak scalability is decreasing over time, the maximum load your application can handle is decreasing. Your team should be alerted if margins become critical
  • Consider that maintenance jobs of any kind running simultaneously on the same hardware can affect this test. If this metric shows bad results, repeat the test or wait for the next automatically scheduled stress test to make sure the result was not an incident.
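The idea behind the Trendbreak scalability metric can be sketched as follows. The real algorithm is not documented here, so "throughput drops below the previous load step" is an assumed detection rule, and the function name is illustrative.

```python
# Simplified sketch of the Trendbreak scalability idea; the real metric's
# algorithm is not documented here, so "throughput drops below the
# previous load step" is an assumed detection rule.

def trendbreak_scalability(load_levels, throughput):
    """Return the percentage of the ultimate load level at which
    throughput first drops; 100 means no break was observed, i.e. the
    applied load did not exceed what the application can handle."""
    for i in range(1, len(throughput)):
        if throughput[i] < throughput[i - 1]:
            return round(100 * load_levels[i] / load_levels[-1])
    return 100
```

Note how a result of 100 is exactly the uninformative case described above: the applied load never pushed the application past its breaking point.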