What I dislike about “The Cucumber Book, Second Edition”: How it relates to software development

On page 97, there is a section called “Tester Apartheid”:

“Testers are too often unfairly regarded as second-class citizens on a software team.”

As someone who has done significant work in the developer, test, and QA roles, I have seen many cases where such a view is understandable. Sometimes it is because testers don’t have any more than a shallow understanding of what they’re doing and/or don’t have tools that are really capable for the very important, open-ended, and challenging task that they are faced with.

Personally, I would like to fix that “apartheid” that the authors write about. That is an important part of my work.

I’m unhappy with voices in the test/QA community who promote things that would unintentionally make the apartheid worse, rather than better. I think this book does that, which is ironic, because it talks so much about bringing the different roles on a software development team closer together.

For example, on page 195, the authors note that Gherkin keywords are supported in “Pirate” and “LOLCAT.” That could be fun, but to hard-working team members it must look like a big waste of time, and it could make the team’s work unreadable. It seems like a terrible idea even to mention such a capability, let alone support it in Cucumber.

Another example is on pages 72-74, in the section “Nesting Steps.” The section makes it look very awkward to have more than two levels of abstraction, yet nesting could still be useful for managing complexity. Then, on page 75: “Often it makes sense to use the Ruby methods directly rather than obscuring them with layers of nested steps.” Immediately following this sentence on page 75 is another section, “The Dangers of Nested Steps,” which gives good reasons not to use nested steps with Cucumber at all.

Deprecating the capability completely would be a good choice for the tool’s authors; with languages or tools that are used for serious product development, that is frequently done. Why doesn’t Cucumber do this?

For another example of making the “tester apartheid” worse, the authors use Ruby for glue code that implements the steps of the Gherkin definition, although they also point out that other languages work for implementing the steps. Ruby and Gherkin are both designed to make a software system under test (SUT) do things, but they are both poorly equipped to manage the complexity of a modern software system or the information that comes from driving and measuring it. For one thing, they both defer errors to runtime, which can slow people down much more than necessary. (Ruby once did this to me and caused me to waste 15 minutes, in a way that never would have happened if I’d had a compiler on my side.)

Then there is this problem: the authors don’t seem to understand that, although searching for and finding bugs is important, repeatable automation isn’t good at finding bugs. Repeatable automation is, however, essential for keeping quality moving forward. The team needs assurance that the system still does what the team needs it to do, despite the code churn that comes with adding new features or refactoring existing ones. This shows up on page 5: “This reveals the true value of acceptance tests: as a communication and collaboration tool” and on page 71: “Just remember that you’ll never be able to prove there are no bugs.” I know that the meme “test is all about finding bugs” started in 1979 with Glenford Myers’ book and that it’s highly influential, but it’s also wrong, and it tends to hide the value of repeatable automation for quality management as part of a software project. If the automation doesn’t find a lot of bugs, and the team believes Myers’ assertion (I write about this extensively elsewhere), then they will dismiss the QA role as underproductive.

This crops up on page 88 as an important missed opportunity for the authors: “… if the user-experience people decided to change the wording on the submit button from Login to Log in, the scenario will fail, for no good reason at all.” HTML supports the “id” attribute for most elements and “name” for many. These are good for client JavaScript, of course, but also for non-displayed identification of elements. Why would the authors assume that automation would navigate around a page by displayed text? Do they not know that this will fail with a simple superficial design change to the page, or with localizing the page to anything different from the original locale? Using displayed text to script a page is a weak strategy. If the pages are dynamically generated, the IDs are very useful for identifying elements as they show up in, e.g., a table where the number of elements varies. This is where edits to XSL on the server side are a very powerful way to generate IDs that client automation can find (as I did at Research In Motion years ago). I guess the authors don’t have any knowledge about this, because they don’t offer a better approach than searching for displayed text on the web page. The missed opportunity here tends to make the “tester apartheid” problem worse again.
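To make the difference concrete, here is a minimal sketch in Python, using only the standard library. The markup, the id value “login-submit”, and the helper names are my own assumptions for illustration; none of them come from the book. It looks up the same submit button once by displayed text and once by its id, before and after the wording change.

```python
from html.parser import HTMLParser


class ButtonFinder(HTMLParser):
    """Collects every <button> element with its attributes and displayed text."""

    def __init__(self):
        super().__init__()
        self.buttons = []     # one dict per button: {"attrs": {...}, "text": "..."}
        self._current = None  # the button being parsed, if any

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._current = {"attrs": dict(attrs), "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "button" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.buttons.append(self._current)
            self._current = None


def find_buttons(html):
    parser = ButtonFinder()
    parser.feed(html)
    parser.close()
    return parser.buttons


def find_by_text(html, text):
    """Fragile: depends on wording and locale."""
    return [b for b in find_buttons(html) if b["text"] == text]


def find_by_id(html, element_id):
    """Robust: depends only on the non-displayed id attribute."""
    return [b for b in find_buttons(html) if b["attrs"].get("id") == element_id]


# Hypothetical markup before and after the wording change described on page 88.
BEFORE = '<form><button id="login-submit" type="submit">Login</button></form>'
AFTER = '<form><button id="login-submit" type="submit">Log in</button></form>'

print(len(find_by_text(BEFORE, "Login")), len(find_by_text(AFTER, "Login")))
print(len(find_by_id(BEFORE, "login-submit")), len(find_by_id(AFTER, "login-submit")))
```

The text lookup stops finding the button as soon as the label changes, while the id lookup is unaffected. A real browser-automation tool would do the lookup differently, but the same principle applies.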

There’s another miss on page 99: “… some of the behavior you’ve specified in Cucumber scenarios could be pushed down and expressed in fast unit tests instead.” This ignores the important differences between unit tests and full-system tests for risk management: what about dependencies, internal and external? What about integration? The “push down” the authors recommend might or might not be a good idea, but it does create risks, so there’s a missed opportunity to describe those quality risks so that the reader knows about them and can manage them in different ways. This omission crops up again on page 111: “… we’ll probably move this check further down the testing stack into a unit test for the Account and take it out of the step definition.” There is nothing in the book about how integration tests and system tests differ from unit tests in terms of quality risk management.

I find this sentence on page 37 very strange: “Step definitions sit right on the boundary between the business’s domain and the programmer’s domain.” Aren’t programmers part of the business? We’re better off thinking about the business in the big picture, and programmers are an important part of it.

On the topic of business, the book and methodology use a simplistic model of requirements; apparently, requirements are only written down as scenarios in the Gherkin language. All requirements are functional requirements, and so are measurable by automation. Implementation-independent business requirements don’t seem to exist at all in the Cucumber approach. This simplistic view of requirements creates a risk of what Goldsmith (in his 2004 book on requirements) calls “requirements creep,” where design changes can lose sight of customer needs because design and customer needs were never connected very well in the first place.

On the other hand, maybe business requirements are what the authors dismissively refer to as “vague requirement statements” on page 27. They really need to pick up Goldsmith’s book!

The Cucumber method also puts all the product requirements in code owned by the QA role, which I’m guessing creates difficulties when any changes to what the product is expected to do (functional requirements) have to go through QA.

The authors write about requirements discovery on page 35, and note that this is “One of the greatest benefits of working with a tool like Cucumber…” completely missing (or intentionally dismissing) the fact that there are other and more effective ways of approaching requirements discovery.

There are other places in the book where the authors seem to make “tester apartheid” worse. On page 122, they are writing product code for an automated teller machine: “It’s odd that when we designed this method from the outside we thought we’d need the account parameter, but now we don’t seem to need it.” They just implemented (and instructed the reader how to implement) a security hole big enough to drive an armored money-truck through, but they don’t fix it until page 130 (in the next chapter), and even then, they can’t bother themselves enough to consider double-entry bookkeeping, which has been in use for centuries. Is it any wonder developers don’t respect the people doing “test automation”?

On page 153, they write to “… make our system more enterprisey…” which is IMO a bogus word. How about a more robust or extensible, scalable, cloud-enabled or segregated architecture? How about client-server? They’re just making the Rodney Dangerfield problem worse (“no respect”).

The example on pages 198-207 uses a web application with the JavaScript Object Notation (JSON) file format, but AFAICT there is no JavaScript involved, so they’d be better off with a custom schema of the much more powerful XML metalanguage. For example, that would enable them to use schemas, transforms, and XQuery, which are well-established standards with cross-platform implementations. XML is human-readable, too, and it compresses well for network transfer, so it’s performant. The authors missed an opportunity to, e.g., apply a schema to their fruit table to find issues sooner. This is what shift-left is all about. With XML, the authors could easily avoid the problem they mention on page 215: “You can use indented JSON documents in your scenarios, as long as you ignore whitespace when you compare them to documents produced by your API.”

Software developers know that data comparisons are most meaningful in the intended schema; text is not ideal for comparing JSON or XML. This was a missed education opportunity in the book.
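As a small illustration of comparing data in its parsed form rather than as text, here is a sketch in Python using the standard json module. The fruit records and the helper name same_json are assumptions of mine, not taken from the book.

```python
import json


def same_json(a: str, b: str) -> bool:
    """Compare two JSON documents by parsed value, not by their text."""
    return json.loads(a) == json.loads(b)


# Hypothetical data: the same records, once indented and once compact,
# with the object keys in a different order.
INDENTED = """
[
  {"name": "banana", "color": "yellow"},
  {"name": "strawberry", "color": "red"}
]
"""
COMPACT = '[{"color":"yellow","name":"banana"},{"color":"red","name":"strawberry"}]'

print(INDENTED == COMPACT)           # text comparison
print(same_json(INDENTED, COMPACT))  # comparison of the parsed data
```

Once the documents are parsed, whitespace and key order no longer matter, so there is nothing to “ignore” by hand. Array order still matters, as it should for ordered data.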
