Thursday, January 2, 2014

Exploratory Software Testing: Tips, Tricks, Tours, and Techniques to Guide Test Design

According to James A. Whittaker, “When the scripts are removed entirely…, the process is called exploratory testing.”  Taken at face value, this statement assumes that you cannot perform exploratory testing with a script, although he later revises it to include what I might call “partial scripting” (running a test up to a point and then exploring from there), which Whittaker calls exploratory testing in the small.  He also looks at exploratory testing in the large, which is what I typically consider exploratory testing (and which I will cover later on).  In my opinion, the problem with his initial statement and with exploratory testing in the small is that he is actually describing the testing of various states, just without written steps.  You could easily create a simple table of states and outcomes, and it would be just scripted testing.  That, to me, is not exploratory testing.  To be fair, he later acknowledges that you can in fact combine scripting and exploratory testing, but that does not seem to be covered in his definition of exploratory testing (which I find ironic, since the definition is his primary purpose for the book, and as a tester he should have noticed the incongruity).

Exploratory testing in the large, as I described earlier, is the act of testing one or more areas using non-scripted patterns.  Whittaker describes a system of patterns, which he calls Tours.  Tours are, according to Whittaker, a “…mix of structure and freedom.…”  His Tours are designed around attempts to change your point of view, the methods of testing, or the scope of the testing.  Personally, I find the names of the Tours to be very poor representatives of the concepts he is trying to put across.  For example, the Money Tour could easily have been called Customer Demo Testing, and the FedEx Tour (one of the better-named Tours) could be called Data Flow Testing.  I suspect the reason for the funky naming convention was either so he could use the word Tour (and thus the analogies), or so that he could make a book rather than a long blog post.  While I might object to his implementation, I can’t object to his goal of making better testers.

My other objection comes from the limits of what he ultimately came up with.  He describes, to a very limited degree, the attempt to see the application from other users’ perspectives.  He tells stories of tourists, but I didn’t notice him explicitly link the tourists to users, to what the users do, and to what they expect.  He uses these tourists to ask questions, but not to change one’s bias from one type of imagined user to another.  For me, when testing, I make a great deal of effort to simulate various types of users and to intuit what a given user would do.  If I see a button, I might see six different types of users: the elderly person who doesn’t realize it is a button because it doesn’t look like one, the expert who wants to middle-click the button to open it in a new window, and so on.  Sometimes I go further still and imagine not only the users but what their reactions might be to a given system: what they would want, where they would get confused, where they would want the system simpler rather than more complex, and so on.  The book does not seem to address this type of cross-user, non-functional thinking (exploratory testing?) or what outcomes might be generated from it.

Outside of the actual techniques addressed in this book, another question left unaddressed is how one manages the various forms of testing in harmony.  I realize this is a complex question, but that is why one pays good money for the book.  To be clear, I don’t mean which Tours to choose for a given project, which he does address through examples (ironically, not written by the author; how many pages did he actually write?).  I mean when to tour, when to script, when to automate, and when to do whatever other types of testing the author might have cared to mention.

NOTE: In this post, I use the term tester generically to include Test Engineers, SDETs, developers writing unit tests, hackers attempting to break in, and even customers on a beta product.  In all cases, the person is testing the system.  Only customers on production systems should not be considered testers, since at that point the product should have a fairly low number of bugs and the customer is not looking for defects.

Unstructured additional notes [miscellaneous minor comments on little bits and bobs]:
  • On page 120, memorylessness is described, which I think is a huge problem for testing outside of monolithic organizations, but not exactly in the way described.  The problem he describes includes testers forgetting what they tested (a reasonable problem) and testers forgetting which testing techniques find large sets of bugs (a mostly unreasonable problem, IMHO, and one too easily abused by management).
    • He should also mention how hard it is to “remember” what is already automated, as that too seems to be a serious issue.  I suppose his “Testipedia” is the closest thing he has to a solution, but that seems like a very manual effort, or so generic as to be of little value.
  • The WMP and VSTS testing comments (pages 97–111) were of some value.
  • The idea of tracking bugs by % of effort (time-wise) against % of bugs found seemed like a good idea (page 104).  The visual test tool falls in a similar area, letting testers see which areas have the most functionality, testing/automation focus, bugs, etc.
  • I think his commandments were worth reading, even if they are only 5 pages of the book.
  • I found his comments on measuring a tester by the improvements to the development team interesting, but rather hard to put into practice.  Do you look at bug trends, points completed by the team, or what?  To add to that, I think the quality of the design (both interface and code) can be part of a QA’s responsibility, but how do you grade those things?  Even worse, a manager of a large group of developers can’t be sure which tester caused which improvements, or whether it was indeed testers who did it.
  • Whittaker asserts that automation needs to be backed up by manual testing, and I agree: automation has a great many limitations and probably needs another 30 years (if ever) to mature before it really provides what has been sold to upper management.
  • The comments on correcting Microsoft’s quality problems were some of the better insights, yet the “how are we going to test this thing” question with regard to design doesn’t get addressed often enough in the book.


  1. Great post.
    Would be interested to discuss exploratory testing a bit further with you.
    I'm looking for ways to better document ET and come up with common TestObjectives so new testers can be more effective. Would like to get your view and hear more about your experiences.
    Please ping back if interested. You can also find me from

    1. One concern I have in replying is that you don't actually define what a TestObjective is. Allow me to quote from your site:
      "Frequently there is a list of common #TestObjectives for each element (in addition to project specific requirements) so it got me thinking, how about building a list of common #TestObjectives for those elements. This soon became my main goal; collate a list of common #TestObjectives that I can always refer my testers at."

      So is a TestObjective testing a single element of a UI? Can you have TestObjectives for databases? Would you call 'finding bugs' a TestObjective? Would you call helping to determine that a release is ready a TestObjective? For now I am going to assume you're talking about generating test cases around a particular area of a system, but the rest of this reply may be invalid if that is not what you meant.

      Your list of data for testing dates lacks some tests which I think could be helpful. I know these are technology specific, but the ideas still apply. I would look at:
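By way of illustration, here is a minimal sketch of the kind of date boundary data I have in mind. The specific values are my own suggestions for this reply, not a definitive checklist:

```python
from datetime import date

# Illustrative date edge cases to feed into date-handling code.
# These particular values are my own picks, not an exhaustive list.
edge_dates = [
    date(1999, 12, 31),  # Y2K rollover boundary
    date(2000, 2, 29),   # leap day in a century year (2000 IS a leap year)
    date(2038, 1, 19),   # 32-bit Unix time overflow day
    date(9999, 12, 31),  # maximum date many systems can represent
]

# Invalid combinations should be rejected, not silently "corrected".
def is_valid_date(year: int, month: int, day: int) -> bool:
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False
```

The interesting cases are the ones near the leap-year rules: 2000-02-29 is valid, while 1900-02-29 is not, and systems that assume "every fourth year is a leap year" get both wrong.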

      Also, your currency testing fails to take into account Arabic numerals; if you are going to consider euros, you might want to consider those too (see the Turkey Test above).
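As a quick sketch of why Arabic-Indic digits matter for currency input: many parsers accept any Unicode decimal digit, while validation code written against ASCII `[0-9]` rejects the very same string, so the two layers can disagree. The snippet is illustrative, not from your list:

```python
import re

# Eastern Arabic (Arabic-Indic) digits U+0660..U+0669 are Unicode
# category Nd, so Python's int() parses them as decimal digits...
amount = "١٢٣"       # Arabic-Indic digits for 123
parsed = int(amount)  # -> 123

# ...but an ASCII-only regex check rejects exactly the same input.
ascii_ok = re.fullmatch(r"[0-9]+", amount) is not None  # False
```

A currency field that mixes both behaviors (lenient parsing, strict validation, or vice versa) is fertile ground for bugs.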

      You don't have any "TestObjective" for strings (either unformatted or with a specific format), which seems like a big missing piece of the data. You should be aware that Elisabeth Hendrickson, who also wrote a book on ET, has a great list of tests to consider. I have not read her book, but her checklist has been useful to me on many occasions. You can find it here:
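To give a flavor of what a string TestObjective might cover, here are a few categories of string test data; these are my own illustrative picks, not Hendrickson's checklist:

```python
# A sampler of string test inputs, grouped by the failure mode
# each one tends to expose. My own illustrative picks.
string_cases = {
    "empty": "",
    "whitespace_only": "   \t\n",
    "very_long": "A" * 10_000,
    "sql_metachars": "Robert'); DROP TABLE Students;--",
    "html_markup": "<script>alert('x')</script>",
    "unicode_mixed": "naïve café 日本語",
    "turkish_dotless": "ı İ",  # the Turkey Test letters
}

# A basic expectation: each value should round-trip through the
# system unchanged (store, retrieve, display).
for name, value in string_cases.items():
    assert isinstance(value, str)
```

Each category targets a different layer: truncation and buffer limits, escaping, encoding, and locale-sensitive case conversion.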

      As for ET, I will be honest: that is not my expertise. I keep myself educated, but my specialty is automation. If you want to talk about Exploratory Automation, Doug Hoffman and Cem Kaner are the people who coined the term:

      Without some details on what you mean by TestObjectives, I think writing more around ET would be premature. I hope this helps. Feel free to come back with more specific questions.