GOING FOR THE NUMBERS
Failure Pattern summary
The automation team is pushed into automating as many tests as possible while disregarding test robustness.
This failure pattern has been added by Michael Stahl. Failure patterns are also called "anti-patterns", as they are things you shouldn't do. They are Issues in this wiki.
Look out for this failure pattern if your management requires that you meet a schedule but is not concerned with the quality of the automation.
A common practice with test automation planning is to define the number of tests that are to be automated, draw a timeline for when it will all be done, and track performance against schedule.
A goal for the test team is to meet the schedule; maybe even get ahead of schedule. This looks good in progress reports. Slowdowns are very visible and put the automation team and the test manager on the spot for not meeting commitments.
Writing a robust automated test is not trivial. Note the “robust” attribute: writing an automated test can be rather easy; it’s the robustness that’s hard to achieve. Robustness means that the test is not sensitive to small fluctuations in system and environment response time; that it copes gracefully when expected resources are not available; that it creates good logs, with a level of detail that fits the result (more for Fail, less for Pass); that it can run on multiple platforms and operating systems; that it is well commented and easy to maintain; and more.
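As an illustration of two of the robustness attributes listed above (tolerance of response-time fluctuations, and more log detail for Fail than for Pass), here is a minimal Python sketch. The helper name `wait_until`, its parameters, and the logger name are hypothetical and not taken from any particular automation framework; a real suite would probably need more than this.

```python
import logging
import time

log = logging.getLogger("robust_test")  # hypothetical logger name


def wait_until(condition, timeout_s=10.0, interval_s=0.5,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or timeout_s elapses.

    Polling replaces a fixed sleep, so the test does not fail just
    because the system answered a little slower than usual.
    """
    deadline = clock() + timeout_s
    attempts = 0
    while True:
        attempts += 1
        if condition():
            # Pass: a short log line is enough.
            log.info("condition met after %d attempt(s)", attempts)
            return True
        if clock() >= deadline:
            # Fail: log more detail to help the later analysis.
            log.error(
                "condition not met: %d attempt(s), timeout=%.1fs, interval=%.1fs",
                attempts, timeout_s, interval_s,
            )
            return False
        sleep(interval_s)
```

A test step would then call something like `wait_until(lambda: page_is_loaded(), timeout_s=30)` instead of sleeping for a fixed time, where `page_is_loaded` stands in for whatever check the test actually needs.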
Achieving this is not simple at all. Sometimes it means writing fault-tolerant library routines; sometimes it means adding features to the automation infrastructure or rewriting major parts of it. Even in the simplest cases, it means new tests need to be run on many different machines to ensure they work on all configurations.
When progress is measured by “number of completed tests”, there is a strong push to declare tests Done once their coding is completed and to defer the robustness work until later. In a way, this is analogous to developers declaring Done at the Code Complete point. We all know they are far from done, since Testing is far from done.
Once the timeline is drawn and committed to, it’s hard to explain schedule slips with “we are slower than planned because we are making sure the tests are robust”. It sounds about as credible as developers telling you “we are behind schedule because we are making an effort to write high-quality code”. You are expected to write robust tests, right?
The real issue is that it is hard to “sell” an automation project if the effort estimate includes the huge amount of work needed to make the tests robust. Worse: even when testers do try to add this effort to the timeline, they often miscalculate how much additional work is needed before a test is robust. So when automation plans are presented for planning or approval, there is a tendency to gloss over the effort needed to achieve robustness.
The end result is that many automated test suites are not robust and create a lot of noise: false failures, halted test runs, lost machine time, and mountains of useless logs. It can get to the point where more time is wasted on analyzing and debugging test results than would have been spent just running the tests manually.
Michael Stahl on his experiences:
There isn’t really a simple or smart way around the problem. In order to get robust automated tests you must invest in quality assurance of the tests and the test code.
What you should do, though, is systematize this work, so that you (a) know what you are aiming for and what you get when a test is declared Done, and (b) can start collecting metrics on the time and effort the robustness work takes, which is the first step toward optimizing this activity.
For one of the automation projects I was involved with, I developed a detailed “Test Acceptance checklist” which listed everything that needs to be checked for a new test. There were ~30 items to check, starting with documentation, and ending with running the test on all target OSs. Each line by itself made a lot of sense. The overall result was 2-3 days’ work… It was good on paper, but useless in reality.
The checklist needs to be trimmed down, to the most critical elements – and this may change according to the context. But you need to have such a checklist, and a test is not Done until the checklist is completed.
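The idea of a trimmed checklist that gates Done, combined with collecting effort metrics, could be sketched roughly as follows in Python. The item names in `CRITICAL_ITEMS`, the class `AcceptanceRecord`, and its methods are all hypothetical, chosen only for illustration; the actual critical items depend on the context.

```python
from dataclasses import dataclass, field

# Hypothetical trimmed-down checklist; the real items depend on context.
CRITICAL_ITEMS = ("documented", "logs_reviewed", "ran_on_all_target_os")


@dataclass
class AcceptanceRecord:
    """Tracks checklist completion and robustness effort for one test."""
    test_name: str
    completed: set = field(default_factory=set)
    hours_spent: float = 0.0

    def check(self, item, hours):
        """Mark a checklist item as completed and record the effort it took."""
        if item not in CRITICAL_ITEMS:
            raise ValueError(f"unknown checklist item: {item}")
        self.completed.add(item)
        self.hours_spent += hours

    def missing(self):
        """Return the checklist items still open, in checklist order."""
        return [i for i in CRITICAL_ITEMS if i not in self.completed]

    def is_done(self):
        """A test is Done only when every checklist item is completed."""
        return not self.missing()
```

Summing `hours_spent` over all records gives a first rough metric of what the robustness work costs, which can feed back into later estimates.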
The time spent on this activity will in most cases be saved later down the road, with tests that always run to completion and whose results are trustworthy.