About 98 Percent Done: January 2014

Thursday, January 30, 2014

Book Consideration: Rethinking Systems Analysis & Design

To be clear, this is my second book by Gerald M. Weinberg, and I’m not reading his books in the order he published them. I like the author’s style of thinking, but in the Introduction to Systems Thinking book he was very general in his descriptions, touching on a great many subjects. This of course is a big part of systems thinking: the idea that you can apply what you know in one field to many different fields using a form of logical thinking and general rules. While some of the design of the book could have used some refactoring, it was a good start.

In Rethinking Systems, the book focuses mostly on the development of software systems and how this thinking can be directly applied to software. While some of the ideas are fairly standard (and the book was published more than 10 years ago), it provides some interesting insights. He starts out by considering analysis vs slogans. He talks of how we sometimes overuse history as a method for predicting the future. While history is valuable, analysis will provide data not found in history. That is to say, if X is good, X+1 might not be better, depending upon the attributes that we want out of X+1. However, we might have a slogan “Bigger is better [for X]”, which might be true up to a given point, where the system then fails to scale for some reason. He warns of how software developers are particularly vulnerable to certain methodologies, because [in my reading of his opinion] we as an industry don’t have a long history, and even a 5% increase in productivity is great if we consider the number of failed projects in the past. In the physical world you have to build a new bridge every time, in spite of it being a very similar design, whereas we in software always build something new; otherwise we could just copy the previous build for “free”.

Chapter two talks of what actually makes up a [software] system, with a fairly reasonable picture, including the external world, the organization, training materials, training done on the job, wetware, existing files, test data, job control, (and all of this before), the program and hardware (Pg 34). I thought it was a good point that part of the investment in software is the people and the wetware (knowing what code is where because you built it, which is faster than reading code or documentation). He also talks about how we need to use our history more wisely, as we don’t often understand why something is the way it is. While not an example given, I would say the Unicode set(s) is a good example, as the encodings are designed for different forms of optimization, but are fairly ironic since they were in fact an attempt to solve a very real world issue ASCII had, as computers were originally designed for English. I also think of Joel’s amazing article on Martian Headsets (It is worth reading. Go now, I’ll wait).

Chapter 3 is about how the observer often fails to observe anything of value (including thinking they saw X but saw Y), or at least observes the wrong things (like how magicians have you look the wrong way). In QA this is an important point, and one which I have suffered through more than once. I actually talked on this a little in a presentation I gave about changing one’s perspective, a valuable thing to do. He talks about studying the existing system to understand rather than criticize. This is an important lesson, and one I hope to take more to heart. As a QA engineer, I have to be careful not to be hurtful to those who created the code, but to be more of an impartial observer, stating what I saw and why it seems wrong, not just to me, but from a broader perspective. In some ways this reminds me of a fair witness.

Chapter 4 talks of self-validating questions, that is to say, questions that require a response that will in fact validate an understanding of the question itself. He talks of the question “Does that contain special characters?” which gets the answer “no”. He assumes that having no special characters means alphanumeric characters and nothing else. In this day and age, it would mean something else, but asking a yes/no question without a follow-up can get you into trouble. He speaks of the problem that programmers tend to be defensive in what they do and how they act because users accuse them of deliberately causing trouble if the program says “don’t do X” but the user demands it anyway. On the other hand, if they just ignore the user and do what they want, they can stubbornly miss important details. This is a tricky problem I also talked about in my talk on dealing with users. I felt he had good insights and this chapter alone was worth the effort.

The last few chapters are on design of software, including the philosophy, trade-offs and the mind of a designer. I really don’t have much to say about these chapters. It is not that they are bad, but they were not the most exciting to me. He talks of being aware of your designing, neither over-designing nor ignoring the reality of the world. I find these not exactly contradictory, but certainly a narrow path to follow. To me, he could have done an entire book on design, but I suspect his strong point is system analysis, not design. He suggests designing for understanding (know your audience), striking a balance between variation and selection (don’t be too original or ground breaking for your audience), etc. He speaks of trade-offs and how to represent them as curves, which is a somewhat novel way of displaying the information, but fairly obvious to anyone who knows that famous triangle with fast, good, cost (pick two).

There are three other things I would like to mention, though in going back over the book I could not find which chapters they were in:

He speaks of spell check which is ‘obviously good’, but is it? It seems to him that the cost of the grievous errors it introduces might be greater than the cost of the minor typos it catches. He noted examples where this caused confusion as he had intentional mistakes that were fixed by editors. He actually complains about one editor using glue and razors to fix an intentional bug in a different book, which I found pretty funny.
He speaks of students in his class using systems analysis to gain ‘masters level’ knowledge in a subject and that on average some subjects took just 3 months, but others took longer (he didn’t say how long). He specifically said English was a subject that took longer and computer science was one of the shorter subjects time wise. I have my doubts, but it is interesting nonetheless, particularly when I recall an (apocryphal?) story of a man who said he could pass any test for his degree. He was studying something like psychology, but said he could pass any test and thus was given a test in dentistry, which he got a B in, getting his degree.
I found it interesting that in the author’s epilogue, he states: “basic human needs… air, water, food, sex… the need to judge other people.” What I find interesting about it is I have had several talks about this subject with various people, one of whom claims not to judge, when a judge is defined as “A person able or qualified to give an opinion on something.” The author continues, “To write a book… you have to have an uncontrollable urge to snoop and pass judgment… to study what people do and tell them how to redesign their activities.” It seems to me that judgment is of utmost importance in both the high and the low. In the high level, we build something we judge to be useful to others, in spite of never having perfect information (making one not qualified?). There is an egomaniacal belief here that we are good enough to play a sort of demigod, handing design down from on high, with the assumption that what we do will be good, or at least better than not having done it. We design without ever truly knowing what our users really want, and at best we can develop heuristics on this, but not hard and fast rules. On the low, we design the system’s structures with limitations based upon our best understanding of what will affect it, judging how the system will be used (e.g. computers will only need English characters) and who will use it, not really knowing what the future will bring. The trick in my opinion is to be wise enough not to intend harm with our judgments, nor to be too harsh; for oftentimes we end up blinded by the judgment, unable to let go of our assumptions. Judgments, it seems to me (based upon the author’s words), are just another heuristic, not a truth. In my own view, judgments are non-binding, unenforceable guesses (with a probability weight behind them) as to how something works, based upon previously observed factors.

In my opinion, I would read the systems thinking book first and then read this book as Systems thinking gives you a broader base of theory, but this book is more practical and does have value.

Thursday, January 16, 2014

Why can't anyone talk about frameworks?

In writing for WHOSE, I was dismayed at the total lack of valuable information regarding automation frameworks and developing them. I could find some work on the frameworks with names (data driven, model driven and keyword driven), but almost nothing on how to design a framework. I get that few people can claim to have written 5-10 frameworks like I have, but why is it we are stuck with only these 3 types of frameworks?

Let me define my terms a little (I feel like a word of the week might show up sometime soon for this). An architecture is a concept, the boxes you write on a board that are connected by lines, the UML diagram or the concepts locked in someone's head. Architecture never exists outside of the stuff of designs and isn't tied to anything, like a particular tool. Frameworks on the other hand have real stuff behind them. They have code, they do things. They still aren't the tests, but they are the pieces that assist the test and are called by the test. A test results datastore is framework, a file reading utility is framework, but the test along with its steps is not part of the framework.

Now let me talk about a few framework ideas I have had for the past 10 years. Some of them are old and some are relatively recent. I am going to pull from some of my presentations of old, but the ideas have at least been useful for one framework of mine, if not more.

Magic-Words

I'm sure I'm not the first one to come to this realization, but I have found no records of other automation engineers speaking of this before me. I have heard the term DSL (Domain Specific Language) which I think is generally too tied to Keyword-driven testing, but a close and reasonable label. The concept is to use the compiler and auto complete to assist in your writing of the framework. Some people like the keyword driven frameworks, but in my past experience, they don't give compile time checking nor do they help you via auto complete. So I write code using a few magic words. Example: Test.Steps.*, UI.Page.*, DBTest.Data, etc. These few words are all organizational and allow for a new user to 'discover' the functionality of the automation. It also forces your automation to separate out the testing from the framework. A simple example of that can be given:

@Test()
public void aTestOfGoogleSearch() {
 Test.Browser.OpenBrowser("www.google.com");
 Test.Steps.GoogleHome.Search("test");
 Test.Steps.GoogleSearch.VerifySearch("test");
}

//Example of how Test might work in C#, in Java it would have to be a method.
public class TestBase { //All tests inherit this
  private TestFramework test = new TestFramework();
  public TestFramework Test { get { return test; } }
}

Clearly the steps are somewhere else while the test is local to what you can see. The "Test.*" provides access to all the functionality and is the key to discoverability.
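
As the comment in the example says, Java has no properties, so the magic word would have to be a method there. Here is a minimal sketch of that, with everything in it being my own assumption for illustration: StepLog, BrowserActions, GoogleHomeSteps, Steps and the test() accessor are hypothetical names, and the steps only record what they were asked to do rather than driving a real browser.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a real driver: it only records what was asked. */
class StepLog {
    final List<String> entries = new ArrayList<>();
}

/** Hypothetical framework piece reached through test().browser */
class BrowserActions {
    private final StepLog log;
    BrowserActions(StepLog log) { this.log = log; }
    void openBrowser(String url) { log.entries.add("open " + url); }
}

/** Hypothetical steps for one page, reached through test().steps.googleHome */
class GoogleHomeSteps {
    private final StepLog log;
    GoogleHomeSteps(StepLog log) { this.log = log; }
    void search(String term) { log.entries.add("search " + term); }
}

/** Groups every page's steps under one discoverable word. */
class Steps {
    final GoogleHomeSteps googleHome;
    Steps(StepLog log) { this.googleHome = new GoogleHomeSteps(log); }
}

/** The single entry point; auto complete on test(). shows everything. */
class TestFramework {
    final StepLog log = new StepLog();
    final BrowserActions browser = new BrowserActions(log);
    final Steps steps = new Steps(log);
}

/** All tests inherit this; test() plays the role of the C# property. */
abstract class TestBase {
    private final TestFramework framework = new TestFramework();
    protected TestFramework test() { return framework; }
}

class GoogleSearchTest extends TestBase {
    /** Returns the recorded actions so the sketch can be checked without a browser. */
    List<String> aTestOfGoogleSearch() {
        test().browser.openBrowser("www.google.com");
        test().steps.googleHome.search("test");
        return test().log.entries;
    }

    public static void main(String[] args) {
        for (String entry : new GoogleSearchTest().aTestOfGoogleSearch()) {
            System.out.println(entry);
        }
    }
}
```

The compiler checks every word after test(), so a misspelled step fails at compile time instead of at run time, which is the part I miss in keyword-driven frameworks.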

Reflection-Oriented Data Generation

I have spoken of reflections a lot, and I think reflections are a wonderful tool for solving data-generation style problems. Using annotations/attributes to tell each piece of data how to generate and what sorts of expectations there are (success, failure with exception x, etc.), then filtering the values you allow to generate, picking a value and testing with it is great. I have a talk later this year where I will go in depth on the subject and I hope to have a solid code example to show. I will certainly post that up when I have it, but for now I will hold off on that.

...

Okay, fine, I'll give you a little preview of what it would look like (using Java):

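import java.util.ArrayList;
import java.util.List;

public class Address {

 //Each annotation names the class that knows how to generate values for the field.
 @FieldData(classes=NameGenerator.class)
 private String Name;
 @FieldData(classes=StateGenerator.class)
 private String State;
 //...

}
public class NameGenerator {

  //Returns every candidate name, each tagged with what the test should expect.
  //Data is assumed to take a value followed by one or more TestDetails tags.
  public List<Data> Generate() {
   List<Data> d = new ArrayList<Data>();
   d.add(new Data("Joe", TestDetails.Positive));
   d.add(new Data(RandomString.Unicode(10), TestDetails.Unicode, TestDetails.Negative));//Assume we don't support Unicode, shame on us.
   //TODO More test data to be added
   return d;
  }

}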
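
The preview above leaves out the part that actually reads the annotations. Here is a rough, self-contained sketch of how that might work. It is not the framework from my talk: GeneratedBy, Generator, Candidate, Expectation, DataPlanner and the Sample* classes are all hypothetical names I made up for this example, and the "empty name" and "ZZ" negative cases are assumptions about what a system might reject.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

enum Expectation { POSITIVE, NEGATIVE }

/** One candidate value plus what the test should expect when it is used. */
class Candidate {
    final String value;
    final Expectation expectation;
    Candidate(String value, Expectation expectation) {
        this.value = value;
        this.expectation = expectation;
    }
}

interface Generator {
    List<Candidate> generate();
}

/** Hypothetical annotation; it must be retained at runtime for reflection to see it. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface GeneratedBy {
    Class<? extends Generator> value();
}

class SampleNameGenerator implements Generator {
    public List<Candidate> generate() {
        List<Candidate> out = new ArrayList<>();
        out.add(new Candidate("Joe", Expectation.POSITIVE));
        out.add(new Candidate("", Expectation.NEGATIVE)); // assumption: empty names are rejected
        return out;
    }
}

class SampleStateGenerator implements Generator {
    public List<Candidate> generate() {
        List<Candidate> out = new ArrayList<>();
        out.add(new Candidate("WA", Expectation.POSITIVE));
        out.add(new Candidate("ZZ", Expectation.NEGATIVE)); // assumption: unknown state codes are rejected
        return out;
    }
}

class SampleAddress {
    @GeneratedBy(SampleNameGenerator.class) private String name;
    @GeneratedBy(SampleStateGenerator.class) private String state;
    private String notes; // no annotation, so the planner skips it
}

class DataPlanner {
    /** Collects, per annotated field, the candidates matching the wanted expectation. */
    static Map<String, List<Candidate>> candidatesFor(Class<?> type, Expectation wanted) {
        Map<String, List<Candidate>> byField = new LinkedHashMap<>();
        for (Field field : type.getDeclaredFields()) {
            GeneratedBy marker = field.getAnnotation(GeneratedBy.class);
            if (marker == null) {
                continue;
            }
            Generator generator;
            try {
                generator = marker.value().getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot create generator for " + field.getName(), e);
            }
            List<Candidate> kept = new ArrayList<>();
            for (Candidate candidate : generator.generate()) {
                if (candidate.expectation == wanted) {
                    kept.add(candidate);
                }
            }
            byField.put(field.getName(), kept);
        }
        return byField;
    }
}
```

From there a test would pick one candidate per field, build the object and check that the outcome matches the expectation attached to the values it picked.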

Details

Why is it that we as engineers who love the details fail to talk about them? I get that we have time limits and I don't want to write a book for every blog post, but rarely do I see anyone outside of James McCaffrey and sometimes Doug Hoffman talk on the details. Even if you don't have a framework, or a huge set of code, why can't you talk about your minor innovations? I come up with new and awesome ideas once in a while, but I come up with lots of little innovations all the time.

Let me give one example and maybe that will get your brain thinking. Maybe you'll write a little blog on the idea and even link to it in the comments. I once helped write a framework piece with my awesome co-author, Jeremy Reeder, to figure out the most likely reason a test would fail. How?

Well we took all the attributes we knew, mostly via reflections of the test and put them into a big bag. We knew all the words used in the test name, all the parameters passed in, the failures in the test, etc. We would look at all the failing tests and see which ones had similar attributes. Then we looked at the passing tests and looked to see which pieces of evidence could 'disprove' the likeliness of a cause.

For example, say 10 tests failed, all 10 involving a Brazilian page. 7 of those touched checkout and 5 of those ordered an item. If every test involving Brazilian failed, we would assume that the Brazilian language is the flaw, as it is the attribute the failures have most in common. However, if we had passing tests involving Brazilian, then that seems less likely, so we would next check whether any passing tests involved checkout. If none did, we would say there was a good chance that checkout was broken and notify manual testers to investigate that part of the system first. It worked really well and solved a lot of bugs quickly.

I do admit I am skipping some of the details in this example, like we did consider variables in concert, like Brazilian tests that involved checkout might be considered together rather than just as separate variables, but I hope this is enough that if you wanted to you could build your own solution.
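
To make the idea a bit more concrete, here is a simplified sketch of the ranking step. It is not the code Jeremy and I wrote: TestOutcome, Suspect and FailureTriage are hypothetical names, the attributes are passed in by hand rather than gathered via reflection, and it treats each attribute on its own instead of considering them in concert. An attribute counts as 'disproved' in proportion to how many passing tests carry it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** One finished test and the bag of attributes known about it. */
class TestOutcome {
    final String name;
    final boolean passed;
    final Set<String> attributes;
    TestOutcome(String name, boolean passed, String... attributes) {
        this.name = name;
        this.passed = passed;
        this.attributes = new HashSet<>(Arrays.asList(attributes));
    }
}

/** An attribute seen on at least one failing test, with its evidence counts. */
class Suspect {
    final String attribute;
    final int failing; // failing tests carrying the attribute
    final int passing; // passing tests carrying it, which count against it
    Suspect(String attribute, int failing, int passing) {
        this.attribute = attribute;
        this.failing = failing;
        this.passing = passing;
    }
}

class FailureTriage {
    /** Ranks suspects: fewest passing tests first, then most failing tests. */
    static List<Suspect> rank(List<TestOutcome> outcomes) {
        Map<String, int[]> counts = new LinkedHashMap<>(); // attribute -> {failing, passing}
        for (TestOutcome outcome : outcomes) {
            for (String attribute : outcome.attributes) {
                counts.computeIfAbsent(attribute, key -> new int[2])[outcome.passed ? 1 : 0]++;
            }
        }
        List<Suspect> suspects = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : counts.entrySet()) {
            if (entry.getValue()[0] > 0) { // only attributes that appear on a failure
                suspects.add(new Suspect(entry.getKey(), entry.getValue()[0], entry.getValue()[1]));
            }
        }
        suspects.sort(
            Comparator.comparingInt((Suspect s) -> s.passing)
                .thenComparing(Comparator.comparingInt((Suspect s) -> s.failing).reversed())
                .thenComparing((Suspect s) -> s.attribute)); // name as a stable tie-breaker
        return suspects;
    }

    public static void main(String[] args) {
        List<TestOutcome> run = Arrays.asList(
            new TestOutcome("brazilCheckoutOrder", false, "brazil", "checkout", "order"),
            new TestOutcome("brazilCheckout", false, "brazil", "checkout"),
            new TestOutcome("brazilSearchFails", false, "brazil", "search"),
            new TestOutcome("brazilSearchPasses", true, "brazil", "search"),
            new TestOutcome("usSearch", true, "us", "search"));
        for (Suspect s : rank(run)) {
            System.out.println(s.attribute + " failing=" + s.failing + " passing=" + s.passing);
        }
    }
}
```

In the sample run, checkout comes out on top because no passing test touched it, while Brazilian drops down the list because one Brazilian test passed. A real version would also want to weigh combinations of attributes, as mentioned above.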

Now your turn. Talk about your framework triumphs. Blog about them and, if you want to, put a link in the comments.

Friday, January 3, 2014

Heroes and Villains

Heroes

I have a host of people that I read and read and read. These are my literary Heroes. They constantly give me new gems and bright insights. I pale in my work before these folks. The sad news is that my heroes have slowly slipped away and all I have is their tremendous bodies of work. In the blog-arena, I really appreciate Jeff Atwood, Joel Spolsky and Steve Yegge (who has multiple blogs). Just as an example of my reading, I am all the way back to 2005 in Jeff's blog, reading it backwards entry by entry.

Now two things you'll note from that. One, I really enjoy what developers have to say and I really don't have a lot of test-related bloggers I hit on a regular basis. Even if I start widening my knowledge net, Martin Fowler, Scott Hanselman and those Dot Net Rocks podcast guys have probably beaten out other test-related heroes I have read. Now I do tend to focus on what they have to say on test when they have something on test, but I spend way more time on the development process than on capital-T Test. These guys are brilliant and have wonderful things to say. They come up with all sorts of clever coding patterns and practices. They consider the process as a whole and care about the craft. They are Heroes.

Then I should consider my list of more generic Heroes. Heroes of science like Richard Feynman, Heroes of science fiction like Robert Heinlein. Heroes of thought like René Descartes, Heroes of humanity like George Bernard Shaw. These are all men I have read and found to have enlightening things to say, although some are rather obscure.

Right now, this is how I feel about test and "Heroes". There are no Heroes. There are a few knowledgeable people, but the fractured nature makes it hard to pin down. Most people who come into test come into it by 'accident'. We are still forging paths and hitting dead ends. Yet we do have another aspect.

Villains

Perhaps our lack of Heroes is the nature of our business, so at best we get Villains. Well, Villains are cool, right? Who doesn't like a good Super Villain? The Riddler or Magneto come to mind. They have super powers, they do cool things and in the end, they often feel justified in their deeds. I hesitate to call anyone a "Villain" in a community I work in. Worse yet, CDT might be so accepting that Villains are considered good guys in their own way. Now if I was James Bach, I would have "“depraved” enemies". Now does that make someone like Rex Black a Villain in Bach's eyes? Is the reverse true for Black? Am I a sort of Villain of the CDT community for asking questions? In talking with one senior-level tester, I could be seen as a Villain. The all-mighty dollar of consultancy demands true purity of our testing Scotsmen, and my questions for CDT are dangerous for those dollars. I won't answer these questions regarding villainy for you, but I do want to be clear, I am not calling any of these gentlemen Villains.

The downfall of our method is that we don't have any golden boys of testing. Instead, we challenge each other to get better but don't have an easy way of showing off our skills. I can't even personally say if Kaner, Bach or Black are good testers. I can look at some of Atwood, Spolsky and Yegge's code and judge them, as that is often visible. Testing is often an invisible task, not one with an output easily examined. I think some of the information of the test community can be useful, but I don't use all of any practitioner's 'methods'.

I am left with two things. Are there any real Villains of testing and can we have Heroes in testing? How can I evaluate a given person's skill in "Test" compared to writing or management or bug writing or all the other skills that make up "Testing"? Even testing an application often requires parameters that are hard to control, like the versions of software, builds, time, memory, threading and all the other 'impossibility of complete testing' pieces. So even with bug reports, test cases, etc. it might be impossible for me to truly evaluate the output. If I can't know (there are plenty of heuristics, but actually knowing is much harder) that someone is a good tester, then I guess I'm just left with this: is a Villain just a perspective, or are there absolutes?

I don't know that I can answer that, but maybe I'll look into getting an outfit, just in case.

Thursday, January 2, 2014

Exploratory Software Testing: Tips, Tricks, Tours, and Techniques to Guide Test Design

According to James A. Whittaker, “When the scripts are removed entirely…, the process is called exploratory testing.” Whittaker’s statement taken at face value assumes that you cannot perform exploratory testing with a script, although he does later revise this to include what I might call “partial scripting” (running a test up to a point and then exploring from there), which Whittaker calls exploratory testing in the small. He also looks at exploratory testing in the large, which is what I typically consider exploratory testing (which I will cover later on). In my opinion, the problem with his initial statement and exploratory testing in the small is that he is actually describing testing various states, just without written steps. Really, you could easily create a simple table of states and outcomes and it would be just scripted testing. This to me is not exploratory testing. To be fair, he later on acknowledges that you can in fact combine scripting and exploratory testing, but that seems to not be covered in his definition of exploratory testing (which I find ironic since it is his primary purpose for the book and as a tester he should have noticed this incongruity).

Exploratory testing in the large, as I described earlier, is the act of testing one or more areas by using non-scripted patterns. Whittaker describes a system of patterns, which he calls Tours. Tours are, according to Whittaker, a “…mix of structure and freedom.…” His Tours are designed around attempts to change your point of view, the methods of testing or the scope of the testing. Personally, I find the names of the Tours to be very poor representations of the concepts he is trying to put across. For example, the Money Tour could have easily been called Customer Demo Testing or the FedEx Tour (one of the better-named Tours) could be called Data Flow Testing. I suspect that the reason for the funky naming convention was either so he could use the word Tour (and thus the analogies) or so that he could make a book rather than a long blog. While I might object to his implementation, I can’t object to his goal of making better testers.

My other objection comes from the limits of what he ultimately came up with. He describes only to a very limited degree the attempt to see the application from other users’ perspectives. He tells stories of tourists, but I didn’t notice him explicitly link the tourists to users, what the users do and what they expect. He uses these tourists to ask questions, but not to change one’s bias from one type of imagined user to another. For me, when testing, I make a great deal of effort to simulate various types of users and to intuit what a given user would do. If I see a button, I might see 6 different styles of users: the elderly person who doesn’t get that it is a button because it doesn’t look like a button, the expert who wants to middle click the button to open in a new window, etc. Even then I sometimes go further and imagine not only the users but what their reactions might be to a given system: what it is that they would want, where they would get confused, where they want the system simpler rather than more complex, etc. The book does not seem to address this type of cross-user non-functional style thinking (exploratory testing?) or what type of outcomes might be generated from it.

Outside of the actual techniques addressed in this book, another question left unaddressed is how one manages the various forms of testing in harmony. I realize this is a complex question, but that is why one pays good money for the book. To be clear, I don’t mean, what Tours to choose for a given project, which he does address through examples (ironically not written by the author; how many pages did he actually write?). I do mean, when Tours, when scripting, when automating, or when doing whatever other types of testing the author might have cared to mention.

NOTE: In this post, I use the term tester as a generic term including Test Engineers, SDETs, Developers writing unit tests, hackers attempting to break in or even customers on a beta product. In all cases, the person is testing the system. Only customers on production systems should not be considered testers, since at that point the product should have a fairly low number of bugs, and the customer is not looking for defects.

Unstructured Additional Notes [I wanted to include some unstructured misc thoughts on little bits and bobs that I wanted to make some minor comments.]:

On page 120, Memorylessness is described, which I think is a huge problem with testing outside of monolithic organizations, but not exactly in the way described. The problem he describes includes testers forgetting what they tested (a reasonable problem) and testers forgetting what testing techniques find large sets of bugs (a mostly unreasonable problem IMHO and too easily abused by management).

He should also mention how it is hard to “remember” what is automated already, as that too seems to be a serious issue. I suppose perhaps his “Testipedia” is the closest thing he had to a solution, but that seems like a very manual effort or so generic as to be of little value.

The WMP and VSTS (Page 97-111) testing comments were of some value.
The idea of tracking bugs by % of effort (time wise) to % of bugs found seemed like a good idea (Page 104). Also the visual test tool falls under a similar area, where testers are able to see where areas have the most functionality, testing/automated focus, bugs, etc.
I think his commandments were worth reading, even if only 5 pages of the book.
I found his comments on measuring a tester by the improvements to the development team interesting, but rather hard to measure. Do you look at bug trends, points completed by the team or what? To add to that, I think that quality of the design (both interface and code) can be part of a QA’s responsibility, but how do you grade those things? Even worse, a manager of a large group of developers can’t be sure which tester caused which improvements or if it was indeed testers who did it.
Whittaker asserts automation needs to be backed up by manual testing, and I agree that automation has a great many limitations, and probably needs another 30 years (if ever) to mature before it really provides what has been sold to upper management.
The comments on correcting Microsoft’s quality problems were some of the better insights, yet the “how are we going to test this thing” with regard to design, doesn’t get addressed often enough in the book.