Monday, November 10, 2014

James Bach and Dr. James Whittaker: An attempt to understand viewpoints

Introduction, Intentions and Limits

I don't have the political ties to many of the big thinkers of the industry. I have seen James Bach speak, had him address my blog, addressed his blog, read his book and I have even asked him a question at CAST, but I don't know him personally. I have attempted to address Dr. Whittaker in his blog, but have never gotten a response and I have never met him. I have read Dr. Whittaker's (in)famous exploratory testing book, and my post on it is currently the second most popular post I have written. I take that to mean I am not the only one who has interest in Whittaker's views and I suspect I'm not the only one to never have met him. In my consideration of both these individuals, I am not attempting to make it personal. As you read this, please keep in mind that this is my attempt to understand their viewpoints. This is an attempt to parse the words of each of them to gain insights.

I came back from CAST a few months ago, and while I'm not going to talk about CAST directly, I want to talk about a subject near and dear to my heart. Most of James Bach's talk was about the process of testing, but one piece was clearly controversial: Testing vs Checking and the ideas around it. In talking with Bach, I now know I hold disagreements with some of his model. I mostly kept silent regarding this, except for my consideration of Heuristics and Algorithms, where I pondered whether what humans and computers do is equal and it just happens that humans are complex and currently unknown compared to a computer. This leads down to the free will question, which is unanswerable, and that is why I didn't deep dive into that subject. I am still resisting that particular topic for now, but I may return to it at a later date.

Bach has interesting ways of subdividing his world, with the terms Testing and Checking, with the word Test meaning something different in the past, even for Bach. Some day I will do a word of the week on Test, but that isn't today's subject either. In fact I'm still dodging giving my detailed, full opinion on checking vs testing, but be assured it will come up. Today's topic actually is about my studies of James Whittaker and what appears in my view as James Bach's answer to that viewpoint. I chose these two in part because I was interested in why Whittaker left testing and, for Bach's part, because he is a prolific exemplar of a testing professional for me to consider. Bach has also addressed some areas that Whittaker has talked about, making him an easy-to-cite exemplar as well. This is not meant to be name calling; rather, I am trying to describe how I see these thinkers in test hold positions and viewpoints that are deeply held by many individuals. Certainly other people have variations of these viewpoints and certainly I may be misreading parts of these viewpoints. I struggle not to put words in people's mouths, but I have to admit some of this is speculation based upon the comments I've found that Whittaker has made between 2008ish and 2014. Those are limited, even though I have studied them over months, so my sourcing is limited. James Bach, on the other hand, is prolific, which means I might be missing data. With that said, on with the show!

Warning: This is a long-winded personal discovery post. You have been warned.

What Dr. Whittaker has said on Testing

In .NET Rocks 408 from 2009, Whittaker complained about only having guiding practices, with no solid knowledge, saying,
“I'll just be happy when testing works... because they have guiding practices, guiding theory....”   
Later he noted how more testers means, in his words, lazy developers:
“We claim in many groups a 1 : 1 dev/testing ratio [in Microsoft]... Google [has]... 1:10 tester ratio... 1:1, you think oh wow, that is a lot of testing, also a lot of lazy developers to be perfectly honest... At Google they have no choice but to be a little more careful in development.”
Whittaker wants to remove as much inventive thinking from the process as possible, saying things like “In Visual Studio 2010... much of that [change analysis] we will automate completely...” and
“What can you tell me about the new tools... we think you will be able to reproduce every bug.” That is a fairly wide and amazing claim. He even starts talking about intelligent code that will find bugs all by itself: “Only if it is the right sort of unit testing. If it isn't finding actionable bugs... throw it out. If the test cases themselves were feature aware, they know, they know what bugs they are good at finding... this is the future I want.” Why do you need to inject intelligent reasoning when all problems have been solved before, but are hard-coded to one particular program? “I'm convinced, on this planet, everything that can be tested, has been tested.” In fact, why not remove testing as an activity altogether? Just have the machine figure out what needs to be tested, even before you hit compile! “We [test] need to be invited in to be a full fledged member... we have to get to the point where we [test] are contributing more. Can we get software testing to be more like spell check is in word.... this needs to be the focus of software testers. These late cycle heroics and these testers who are married to late cycle heroes are Christmas Tester Past.”

Dr. Whittaker imagines a future where testers are more-or-less developers. They might have a few test skills but basically once the magic tools exist, then developers should be solving these problems.  But of course he might have changed his mind since then?  I mean that was in 2009!  What about modern times?

“Quality is no less important, of course, but achieving it requires a different focus than in the past. Hiring a bunch of testers will ensure that you need those testers. Testers are a crutch. A self-fulfilling prophesy. The more you hire the more you will need. Testing is much less an actual role now as it is an activity that has blended into the other activities developers perform every day. You can continue to test like its 1999, but why would you?” - March 25, 2014
Nope, Dr. Whittaker's viewpoint has gotten even more extreme. Dr. Whittaker might say I am a developer given the 'advanced' automation I write, but he certainly thinks most testers should just go away. He would rather have testers all fired, because [manual?] testers just enable developers to write bad code. Why does he feel this way? Well, first of all, he's in a super-large company that hires a great deal of the best talent there is. Microsoft has some amazing developers and clearly they can write some amazing code, so maybe he sees them and assumes everyone else is like that. Before he was at Microsoft, he worked for Google, so his perspective might be skewed. Perhaps Microsoft and Google can get away with fewer or no testers. Facebook seems to do it. He certainly seems to hint at it with his most recent blog post on the subject. His infamous 'traditional test is dead' talk also seems to indicate that you need enough people to dog food your product, which requires a large community of people willing to file bugs for you. That does not apply to everyone, but he might not realize that. Dr. Whittaker's history has, as best I can tell, been mostly about bigger companies and teaching. Perhaps there are other reasons I have not been able to discover. I'd love to hear from Dr. Whittaker! Clearly James Bach has a reply...

James Bach and Automation

James Bach, a man who consults with and trains hundreds (or more?) of testers each year, also has an interesting vantage point. He mostly approaches testing as an intelligent gathering of information through investigation. He argues, in essence, that any investigation a programmer does in order to write automated tests (i.e. "checks"), and the execution of those automated tests, is in fact not equal to testing done by a human being. All of Dr. Whittaker's magic tools make less sense in the mind of Bach. Not that Bach doesn't want these tools, but he would argue that without an intelligent human, the tool provides no additional value. He has gone so far as to start labeling automated testing as checking. He is not saying that checks have no value, but that their scope is limited compared to a human's. I think Bach would even argue that AI-oriented test methods, load testing, model driven testing and other "beyond human" testing could be done by a tester, given infinite amounts of time and/or humans. To be clear, James Bach seems not to view it as possible that there are tests that are beyond human, but I will address this later. Therefore, I could simplify the argument to this:

if(Human.Testing > Computer.Testing) then throw new DoNotOverloadWord("Computer Testing and Human Testing are not the same!");

Ironically, Dr. Whittaker might agree with that simplification if only you change the ">" to "<". While I might agree with James Bach in his basic point, I think that there are a few concerns. One is that trying to change a word used in many different cultures, not just software testing, to something new just makes another standard. He's just added another definition to an already complicated word, making it even harder to understand. Worse yet, it seems like in defining the word that way, Bach has accidentally made his own standard for the word test, which will compete with other standards. I don't know if Bach wants to be a sort of de facto ISO standard or not, but that's how it might turn out. I certainly sometimes see how his passionate arguments could be perceived as an attempt to be a sort of standard. On the other hand, I also hate the answer 'it depends on context' (without something to back it up), so at least when I read Bach's blogs, I know what he means when he says checking. Dr. Whittaker doesn't seem to put nearly as much rigor into his words, making his ideas a little more slippery. I don't know that there is an ideal way of handling this problem. Perhaps that is part of why schools of testing showed up, so you know whose standard definitions you use.

When you say "Test" which definition are you using?  The Bachian 2014 definition.  Oh, you've not moved on to the 2015 definition?  No, it's out now?

Back to the question of computers and whether they can go beyond humans in testing/checking. I take it from this graphic that Bach feels a computer's ability to test/check cannot go beyond that of people:

Machine Checking is inside Testing and James Bach defines Testing as:
Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc. 
So if you run a load test, the load that the machine generates allows you to observe the system under load, using machine generated metrics like page load times or database CPU usage. The question is where does the load test code belong? Is it 'checking' in that it navigates the pages? I suppose it does blow up if it can't load the page, which is a check, but the primary point of load tests is not about the functions of the page as much as the non-functional requirements. Is the data gathered by the load test a model? If I have a graph of users to page load times, isn't that a model... created by a machine? Bach does not seem to have a reply to this, other than to say that a machine embodies its script and thus is separate from humans. Whittaker, on the other hand, does not care as much about the distinction as he does about getting the technology good enough to get the data into engineers' hands without a middleman tester writing a specialized load test script. Better yet, in Whittaker's mind, get your users to do the testing by deploying the code.
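To make the load test question concrete, here is a minimal sketch in Python. Everything in it is my own invention for illustration: `simulated_page_load` is a hypothetical stand-in for a real page request, and the numbers (a 100 ms base time, 0.5 ms per user, failure past 400 users) are made up. The point is only that the same run produces two different things: a pass/fail verdict the machine can render by itself, and a users-to-load-time table that still needs a person to say whether it is acceptable.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    users: int
    load_time_ms: float
    ok: bool


def simulated_page_load(users: int) -> tuple[float, bool]:
    """Hypothetical stand-in for a real page request (all numbers invented).

    Load time grows with concurrent users; past 400 users the page fails.
    """
    load_time_ms = 100.0 + 0.5 * users
    ok = users <= 400
    return load_time_ms, ok


def run_load_test(user_steps):
    """Ramp through the given user counts and record one sample per step."""
    samples = []
    for users in user_steps:
        load_time_ms, ok = simulated_page_load(users)
        samples.append(Sample(users, load_time_ms, ok))
    return samples


def check(samples):
    """The 'check': did every page load succeed? A machine can decide this."""
    return all(s.ok for s in samples)


def model(samples):
    """The machine-made 'model': users -> page load time in milliseconds."""
    return {s.users: s.load_time_ms for s in samples}


samples = run_load_test([100, 200, 300, 400, 500])
print("check passed:", check(samples))
for users, ms in model(samples).items():
    print(f"{users:>4} users -> {ms:.0f} ms")
```

Whether the table printed at the end counts as a model in Bach's sense, or only as raw material for one, is exactly the question I am asking above. The code does not settle it.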

Another issue is that while the test executes, it is limited to the data and code written, even if you have a genetic algorithm for testing and use Google as a datasource. These are powerful algorithms and Google as a datasource outmatches my own knowledge base in a sense. While I agree this is perhaps different than a human who is perhaps in some way more advanced than said algorithms, it can also be beyond human. Perhaps because humans can reason and the algorithms are limited to the code written there is a distinction to be made, which is fair enough, even if I am still not 100% convinced. Is reasoning just another term for 'a process we don't fully understand' while code is something we can understand? Are 'tacit knowledge' and 'cultural understanding' really limited to just humans or will computers with enough data be able to generate similar models some day? I admit human brains seem different, maybe we have a soul, maybe we have free will, maybe.... But to be fair, even with my imagined super-advanced testing system, I know 98% of automated testing systems are not attempting to approach these sorts of advancements. Even if our automation systems aren't nearly as good as I think they can be, does that mean that testing vs checking will always be a dichotomy or will programs eventually get good enough that we can't tell the difference? And what if humans are always special and better, what about diminishing returns? Hand-washed clothing may be cleaner and possibly more water efficient, but a washing machine is preferred by nearly everyone in the modern world. Perhaps all of this doesn't matter right now, but Dr. Whittaker seems to think we're there. Maybe places like Microsoft are there? Maybe "Tester" as a role makes less sense as more of that role is automated away, and maybe when you throw enough developers at creating the automation, you can have most of your testing automated?

So then there is another question that comes to mind: if executing an "automated test" is checking, is the human writing the algorithm to do the 'checking' doing testing? In speaking with Bach, I learned that in his mind it is not. My brain, thinking about the problem, writing code, creating models in my head about the system under 'test' (system under check?) is not testing according to Bach. That is a very interesting viewpoint, and it makes my claim about this site being 98% about testing untrue in Bach's view. Dr. Whittaker would seem to say that of course that is testing, even if I personally shouldn't be doing it or at least I should also be writing production code.

To be fair, Dr. Whittaker also argues that users are better than machines or testers. This is a point both Whittaker and Bach might agree on, even if Bach might say only in some contexts. But in saying that, I think Dr. Whittaker loses the point that users of a product can change products quickly and are fickle. The mighty IE 6 failed and clearly IE is no longer top dog. Listening to users might have been part of that, but an insider advocating for the users also might have been helpful! Users often don't know what they want and what they want changes quickly. This has been known for years. What makes Dr. Whittaker think that all products have a single user base that only wants one set of features? One man's bug is another man's feature. For that matter, what about products that don't have an exploitable user base? How do you dog food software for shipping products (OMS) when it costs money every minute the software is broken? How about we dog food a pacemaker; that will work, right? James Bach would note that quality is not a thing you can build into software but rather a sort of relationship. I suspect that if you break the software too often, you harm the user's interest in that software and they will move on.

What can I see from here?

I very much appreciate that these giants of the industry allow me to stand on their shoulders and look out upon the sea of possibilities. I don't know exactly where I stand in an 'official' sense, but I am sure that both of them hold good ideas, ideas that should be investigated... nay, should be tested. To his credit, James Bach agrees and wants to continue to make his ideas better. The problem is that with different contexts there are different needs. I have pondered from time to time whether, if we could lock down context (e.g. E-Commerce website, X-Y income, X-Y products sold, Z types of browsers), we could say something about which style of testing would make more sense. In considering both arguments, I have to say that in my experience, automation can go beyond what a human can test, but the value provided often is different than that provided by manual testing. On the other hand, the results often require human analysis, meaning that human judgment is still required. For example, when a load test knocks over a system after throwing 500% of normal load at it, a human is required to do the analysis and render a value judgment. I have also seen some development shops that have gone without testers. At least in one case, they seemed on a superficial view to have almost pulled it off, but once you dig in, I think it became clear that some people or places do need testers. I don't think there is a way of giving detailed advice that is universal, and without some well controlled studies, it may be impossible to tell when a particular methodology is superior in a given context.

While I can't speak to what method is the best or which are superior to others, I can speak to what worked and what didn't for me in my context. Using only my own contexts and what I have seen I can make a few broad statements. Few people do automation well. The most successful automation does not involve a UI or, when it does, it is in a manumated process. Those who do automation well are treasures. More people do manual testing well, but still few. These people are also treasures. If you have only bench testing by the developers, adding either automated or manual testing (or checking) is likely to increase the quality of the product in small to medium size organizations. I am not sure how to logically organize test/development for large organizations (more than 500 people in software development).

Not everything is easy to test with code. Not everything is easy to test with people. If you can't get people to do the testing, either it might not matter enough or you should automate it and accept the limits of checking. If your automation keeps breaking, either your software is flaky or perhaps it shouldn't be automated (yet, if ever). Testing is hard and you need people who understand testing, be they developers or testers. For that matter, you also need people to understand software development, the business, the customers and probably the accounting. How you label these people may affect how they do the job, and maybe it matters, but what matters most is that the job is done as much as it needs to be done.

Ultimately, software development is hard. Dr. Whittaker and James Bach come at the problems from very different places, but both want to improve the quality of software. Even with their extremely different views, both have contributed towards their goals. These extremes do cause some polarization and sadly this post might sound negative towards both their views, but it is hard to find common ground between two extreme views. On the other hand, Whittaker helped make Visual Studio 2008 and 2010 better at testing while Bach has been working on perfecting RST! Both these men have worked towards their visions and have created useful artifacts from those disparate visions. While I have disagreements with both sides, I am glad they were around making my job of testing better and easier.

So if I haven't said it yet, thank you.

One last thing. If Dr. James Whittaker or James Bach respond to this with any corrections, I will update this post. I may be challenging my understanding of their ideas but I am not trying to put words in their mouths.
