Thursday, October 24, 2013

Software Testing is in my Genes (maybe).

In possibly good news, we may now be able to hire based upon a genetic test! I wonder how that will go wrong, as I'm sure it will.  As a personal aside, I find that the glass contains what appears to be roughly 50% atmosphere and 50% liquid, but I have not tested it to validate those observations.

Monday, October 21, 2013

The case for an Automation Developer

Disclaimers: As is always true, context matters. Your organization or needs may vary. This is only based upon my experience in the hardware, B2B, ecommerce and financial industries. Given the number of types of businesses and situations, I assume you can either translate this to your situation or see how it doesn't translate to your situation.

Automation


Automation within this context is long-lived, long-term testing somewhere between vertical integration testing (e.g., unit testing including all layers) and system testing (including load testing).  It includes some or all of the following activities:
  • Writing database queries to get system data and validate results (see the sketch just after this list).
  • Writing if statements to add logic about things like the results, or to change behavior based upon the environment.
  • Creating complex analysis of results such as reporting those to an external system, rerunning failed tests, assigning like reasons for failure, etc.
  • Capturing data about the system state when a failure occurs, such as introspection of the logs to detect what happened in the system.
  • Providing feedback to the manual testers or build engineers in regards to the stability of the build, areas to investigate manually, etc.
  • Documenting the state of automation, including what has been automated and what hasn't been.
  • Creating complex datasets to test many variations of the system, and investigating the internals of the system to see what areas can or should be automated.
  • Figuring out what should and shouldn't be automated.
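To make the first bullet concrete, here is a minimal JDBC sketch of the sort of query-and-validate code I mean (the connection string, table and column names are all made up for illustration):

import java.sql.*;

public class OrderCheck {
    public static void main(String[] args) throws SQLException {
        // Hypothetical connection details; yours will differ.
        try (Connection conn = DriverManager.getConnection("jdbc:mysql://testdb/orders", "tester", "secret");
             PreparedStatement ps = conn.prepareStatement("SELECT status FROM orders WHERE order_id = ?")) {
            ps.setLong(1, 12345L); // the order the UI test just created
            try (ResultSet rs = ps.executeQuery()) {
                // Validate that the system actually persisted what the UI reported.
                if (!rs.next() || !"COMPLETE".equals(rs.getString("status"))) {
                    throw new AssertionError("Order 12345 was not completed in the database");
                }
            }
        }
    }
}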
Developer

Developer within this context is a person who can design complex systems.  They need to have a strong grasp of the current technology sets and be able to speak to other developers at roughly the same level.  They need to be able to take very rough, high-level ideas and translate them into working code.  They should be able to do, or speak to, some or all of the following activities:
  • Design
  • Understand OOP
    • Organization
  • Database
    • Design
    • Query
  • Refactor
  • Debug
  • Reflections
Automation Developer

You will notice that the two lists are somewhat similar in nature.  I tried to make the first set feel more operational and the second set a little more skills-based, but in order to do those operations, you really have to have the skills of a developer.  In my experience, you need at least one developer-like person on a team of automators.  If you want automation to work, you have to have someone who can treat the automation as a software development project.  That of course assumes your automation is in fact a software development project.  Some people only need 10 automated tests, or record-playback is good enough for their work.  For those sorts of organizations, a 'manual' tester (that is to say, a person who has little programming knowledge) is fine for those needs.

Automation Failures

I know of many stories of automation failure.  Many of the reasons revolve around expectations, leadership and communication.  As those are issues everywhere, I don't want to consider them in too much depth, other than to say that a person who doesn't understand software development will have a hard time clearly stating what they can or can't do.

Technical reasons for failure range from things as simple as choosing the wrong tool to building the wrong infrastructure.  For example, if you are trying to build an automated test framework, have you got an organized structure defining the different elements and sub-elements?  These might be called "categories" and "pages", with multiple pages in a category and multiple web elements in a page.  How you organize the data is important.  Do you save the elements as variables, call getters, or embed that in actions in the page?  Do actions in the page return other pages, or is the flow more open?  What happens when the flow changes based upon the user type?  Do you verify that the data made it into the database, or just check the screen?  Are those verifications in the page layer or in a database layer?  Organization matters, and sticking to that organization, or refactoring it as need be, is a skill set most testers don't have initially.  This isn't the only technical development skill most testers lack, but I think it illustrates the idea. Maybe they can learn it, but if you have a team for automation, that team needs a leader.
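To make the organization question concrete, here is a minimal page-object sketch in Java using Selenium (the names LoginPage and HomePage, and the element IDs, are all hypothetical):

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

public class LoginPage {
    private final WebDriver driver;

    public LoginPage(WebDriver driver) {
        this.driver = driver;
    }

    // Elements behind getters rather than stored in variables, so lookups are never stale.
    private WebElement userNameField() { return driver.findElement(By.id("userName")); }
    private WebElement passwordField() { return driver.findElement(By.id("password")); }
    private WebElement loginButton()   { return driver.findElement(By.id("login")); }

    // The action returns the next page, which makes the expected flow explicit.
    public HomePage loginAs(String user, String password) {
        userNameField().sendKeys(user);
        passwordField().sendKeys(password);
        loginButton().click();
        return new HomePage(driver);
    }
}

Each of the questions above is a design decision hiding in those few lines, which is exactly why the framework needs someone who thinks like a developer.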

Real Failure

These sorts of problems aren't new (Elisabeth Hendrickson was writing about them in 1998), which is why I hesitate to enumerate them in much more detail.  The question is, how have we handled such failures as a community?  Like I said, Elisabeth Hendrickson said in 1998 (1998! Seriously!):
Treat test automation like program development: design the architecture, use source control, and look for ways to create reusable functions.
So if we knew this 15 years ago, then why have we as a community failed to do it?  I have seen suggestions that we should separate the activities into two camps, checking vs. testing, with checking being a tool to assist in testing but not actually testing.  This assumes that automation purely assists, because it doesn't have the ability to exercise judgment.  This may be insightful in trying to define roles, but it doesn't really tell you much about who should do the automating.  CDT doesn't help much either; it really only notes that it depends on external factors.

When automation fails, or at least seems to have limited value, who can suggest what we should do?  My assertion is that testers typically don't know enough about code to evaluate the situation, other than to say "Your software is broken" (as that is what testers do for a living).  And whenever developers doing testing comes up, it is typically noted that developers tend not to want to test.  Furthermore, what developer ever intentionally writes a bug (that is to say, we are often blind to our own bugs)?

A Solution?

I want to be clear: this is only one solution, and there may be others, which is why the subheading starts with "A".  That being said, I think a mixed approach is reasonable.  What you want is a developer-like person leading the approach, doing the design and enforcing the code reviews.  They 'lead' the project's framework while the testers 'lead' the actual automated tests.  This allows for the best of both worlds.  The Automation Developer mostly writes code, treating it as a software development project, while the testers do what they do best: develop tests.  Furthermore, the testers then have buy-in on the project and they know what actually is tested.

Thoughts?

Wednesday, October 16, 2013

Reflections


I have recently been reading over some of Steve Yegge's old posts and they reminded me of a theme I wanted to cover.  There is an idea we call meta-cognition, which testers often use to defocus and focus, to occasionally come back up for air and look for what we might have missed.  It is an important part of our awareness.  We try to literally figure out what we don't know and transfer that into coherent questions or comments.  Part of what we do is reflect on the past, using a learned sub-conscious routine, and attempt to gather data.

In the same way, programming has ways of doing this too, in some cases and in some frames of reference.  This is the subject I wish to visit and consider from a few different angles.  In some languages this is called reflections, which uses a reification of typing to introspect on the code.  Other languages allow other styles of the same concept and call them 'eval' statements.  No matter the name, the basic idea is brilliant: some computer languages literally can consider things in a meta sense, intelligently.

Reflections

So let's consider an example. Here is the class under consideration:
class HighScore {
 String playerName;
 int score;
 Date created;
 int placement;
 String gameName;
 String levelName;
 //...Who knows what else might belong here.
}

First, done poorly in pseudocode, here is a way to inject test values into HighScore:
function testVariableSetup(HighScore highScore) {
  highScore.playerName = RandomString();
  highScore.gameName = RandomString();
  highScore.levelName = RandomString();
  highScore.score = RandomNumber();
  highScore.created = RandomDate();
  //... I got tired.
}

Now here is a more ideal version:
function testVariableSetup(Object class) {
  for each variable in class.Variables {
    if(variable.type == String) then variable.value = RandomString();
    if(variable.type == Number) then variable.value = RandomNumber();
    if(variable.type == Date) then variable.value = RandomDate();
  }
}

Now what happens when you add a new variable to your class?  For that matter, what happens when you have 2 or more classes you need to do this in?  The second version can be applied to anything that has Strings, Dates and Numbers.  Perhaps we are missing some types, like Booleans, but it doesn't take too much effort to cover the majority of the simple types.  Once you have that, you only have to pass in a generic object and it will magically set all fields.  Perhaps you want filtering, but that too is just another feature in the method.
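Here is a minimal Java sketch of that more ideal version (my own illustration; the random-value helpers from the pseudocode are replaced with simple stand-ins):

import java.lang.reflect.Field;
import java.util.Date;
import java.util.Random;

public static void testVariableSetup(Object object) throws IllegalAccessException {
    Random random = new Random();
    for (Field f : object.getClass().getDeclaredFields()) {
        f.setAccessible(true); // reach private fields too
        if (f.getType() == String.class) {
            f.set(object, Long.toHexString(random.nextLong())); // stand-in for RandomString()
        } else if (f.getType() == int.class) {
            f.set(object, random.nextInt(1000)); // stand-in for RandomNumber()
        } else if (f.getType() == Date.class) {
            f.set(object, new Date(System.currentTimeMillis() - random.nextInt(1000000))); // stand-in for RandomDate()
        }
    }
}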

The cool thing is, this can also be used to get all the fields without knowing what the fields are. In fact, this one is so simple, I am going to show a real life example, done in Java:

import java.lang.reflect.Field;
import java.util.*;

public static List<String> getFieldNames(Object object) {
    List<String> names = new ArrayList<String>();
    // getDeclaredFields() returns the class's own fields, public or private.
    for (Field f : object.getClass().getDeclaredFields()) {
        f.setAccessible(true);
        names.add(f.getName());
    }
    return names;
}

public static Object getFieldValue(String fieldName, Object object) {
    try {
        Field f = object.getClass().getDeclaredField(fieldName);
        f.setAccessible(true);
        return f.get(object);
    } catch (Throwable t) {
        throw new Error(t);
    }
}

public static Map<String, Object> getFields(Object object) {
    Map<String, Object> map = new HashMap<String, Object>();
    for (String item : getFieldNames(object)) {
        map.put(item, getFieldValue(item, object));
    }
    return map;
}

Let's first define the term "Field."  This is the Java term for a member variable, be it public or private.  In this case, there is code to get all the field names, get any field value, and get a map of field names to values. This allows you to write really quick debug strings by simply reflecting over any object and spitting out name/value pairs. Furthermore, you can extend it to filter out private variables or variables with a name like X, or to read properties rather than fields.  Obviously this can be rather powerful.
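For example, a hypothetical debug dump using the methods above with the earlier HighScore class:

HighScore highScore = new HighScore();
// ... run the test, then spit out every field as a name/value pair:
for (Map.Entry<String, Object> entry : getFields(highScore).entrySet()) {
    System.out.println(entry.getKey() + " = " + entry.getValue());
}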


Eval

Let me give one other consideration of how reflection-like properties can work.  Consider the eval statement, a method of loading code dynamically.  Starting with a very simple piece of JavaScript, let me show you what eval can do:

  var x = 10;
  alert( eval('(x + 2) * 5'));

This would display an alert with the value 60. In fact, an eval can execute any amount of code, including very complex logic. This means you can generate logic using strings rather than hard code it.

While I believe the eval statement is rather slow (in some cases), it can be useful for running dynamically generated code.  I'm not going to write out an exact example for this, but I want to give you an idea of a problem:

for(int a = 0; a!=10; a++) {
  for(int b = 0; b!=10; b++) {
    //for ... {
      test(X[a][b][...]);
    //}
  }
}

First of all, I do know you could use recursion to deal with this problem, but that is actually a hard solution to write, hard to follow and hard to debug. If you were in a production environment, maybe that would be the way to go for performance reasons, but for testing, performance is often not as critical. Now imagine if you had something that generated the loops as dynamic strings. I will again attempt to pseudo code an example:

CreateOpeningsFor(nVariables, startValue, endValue) {
  String opening = "for(a{0} = {1}; a{0}!={2}; a{0}++) {";  
  String open = "";
  for(int i = 0; i!=nVariables; i++) {
    open = open + "\n" + String.Format(opening, i, startValue, endValue);
  }
  return open;
}


eval(CreateOpeningsFor(5, 0, 10) + CreateFunctionFor("test", "X", 5) + CreateClosingsFor(5));
//TODO write CreateFunctionFor, CreateClosingsFor...
//Should look like this: String function = "{0}({1}{2});" String closing = "}";

While I skipped some details, it should be obvious to someone who programs that this can be completed. Is this a great method? Well, it does the same thing as the hard-coded method, yet it is dynamically built, and thus easily changed. You can even log the generated function and paste it into code if you get worried about performance. Then if you need to change it, change the generator instead of the code. I don't believe this solves all problems, but it is a useful way of thinking about code. Another tool in the tool belt.
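To show the idea end to end, here is one way the generator could look, transposed into Java (my own sketch; Java has no built-in eval, so this leans on the javax.script engine that shipped with the JDK of the day):

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class LoopGenerator {
    static String createOpeningsFor(int nVariables, int startValue, int endValue) {
        StringBuilder open = new StringBuilder();
        for (int i = 0; i != nVariables; i++) {
            open.append(String.format("for (var a%1$d = %2$d; a%1$d != %3$d; a%1$d++) {\n",
                    i, startValue, endValue));
        }
        return open.toString();
    }

    static String createClosingsFor(int nVariables) {
        StringBuilder close = new StringBuilder();
        for (int i = 0; i != nVariables; i++) {
            close.append("}\n");
        }
        return close.toString();
    }

    public static void main(String[] args) throws ScriptException {
        ScriptEngine js = new ScriptEngineManager().getEngineByName("JavaScript");
        // The generated body just counts iterations; a real version would call test(...).
        String script = "var count = 0;\n"
                + createOpeningsFor(3, 0, 10)
                + "count++;\n"
                + createClosingsFor(3)
                + "count;";
        System.out.println(js.eval(script)); // 1000, i.e., 10 * 10 * 10
    }
}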

The next question you might ask is, do I in fact use these techniques? Yes, I find these tools to be invaluable for some problem sets. I have testing code that automatically reflects over objects, generates lots of different values and applies them to various fields or variables. I have used it in debugging. I have used it to create easily filterable dictionaries so I could get all variables with names like X. It turns the inflexible type system into a helpful system. I have even used it in creating a simple report database system, which used reflections to create insert/update statements where the field names in the code are the column names in the database. Is it right for every environment? No, of course not, but be warned: as a tool it can feel a little like the proverbial hammer which makes everything look like nails. You just have to keep in mind that it is often a good ad-hoc tool, but not always a production-worthy tool without significant design consideration. That being said, after a few uses and misuses, most good automators can learn when to use it and when not to.

Another reasonable question is, what other techniques are useful and similar to this? One closely related tool would be regular expressions, a tool I don't wish to go in depth on, as I feel there is a ton of material on it already. The other useful technique is known as annotations or attributes. These are used to define metadata about a field, method or class. As I think there are a lot more details to go over, I will try to write a second post on this topic in combination with reflections, as they are powerful together.
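As a small taste of that, here is a hypothetical annotation that the reflection code above could check before filling in a field:

import java.lang.annotation.*;

@Retention(RetentionPolicy.RUNTIME) // keep the metadata around at runtime so reflection can see it
@Target(ElementType.FIELD)
@interface DoNotRandomize {}

class AnnotatedHighScore {
    @DoNotRandomize // hypothetical marker: the filler should leave this field alone
    String playerName;
    int score;
}

// Inside the loop over getDeclaredFields():
//   if (f.isAnnotationPresent(DoNotRandomize.class)) continue;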

Tuesday, October 15, 2013

Interviewing, post 3

TLDR: brain dump.

These are my internal thoughts, presented for you to look at (cause I want to remove my internal stigma of having to have perfect / point-filled blogs before I post).

Little background: I used to interview people at a much larger company, where we had a team of roughly 50 testers / SDETs. This allowed me a certain amount of freedom in hiring people that might not perfectly fit any particular profile. Aka, they could be higher risk (in my mind) in certain key areas, cause they were offset by a large group of other people. Or we could move them to a team that offset any other risky areas while we saw if they panned out over a 2-3 month period, or were able to learn what we thought was necessary.
I now work for a company with a 4 person QA team, hiring a final person for this year to make it a team of 5. Personally I think this gives me significantly less leeway on who we can hire. I have less wiggle room for potential issues. I don't have the cycles I would like to devote to training someone with only some potential. Basically, what I think I'm looking for and what I need to fill are more tightly coupled for this current position.

So, in interviewing people, I look for certain key talents. One of those talents is self-motivation / drive / passion. In an effort to attempt to figure out what I'm looking for in good people to hire, one of my employees asked me to define what I meant by motivation or drive.

Before looking it up in a dictionary:
Motivation: A reason to do a task.
Drive: Internal motivation, requiring little to no outside force(s). This can be synonymous with self-motivated.
Passion: A strong drive that allows one to singularly focus on a task or set of tasks. Passion can be good and bad; bad when it keeps one from defocusing when necessary (OCD).

Dictionary (the definition I thought the most pertinent):
Motivation: a motivating force, stimulus, or influence : incentive
Drive: to press or force into an activity, course, or direction
Passion: a strong liking or desire for or devotion to some activity, object, or concept

What does all this mean to me? I think my ideas of what motivation / drive are…are pretty reasonable (perhaps self-motivated is the more appropriate word). Now the real question comes down to HOW do you find out if someone is self-motivated or driven?
I've been reading Hire with Your Head (1) recently to see how someone regarded as a great interviewer goes about it. Adler likes the idea of past proof of having done it. He asks, "What is your most significant accomplishment related to X?"
Personally, I'm not sure I want to just ask someone directly, "Are you motivated? Prove it." People can make up anything if they know what you are looking for.

Lately I've been saying something to the point of:
One of the key objectives for this position is the ability to write bug reports that are clear, concise, accurate and relevant. At a high level, what do you think about this and how would you go about this? What have you accomplished that's most similar to this?
One of the key objectives for this position is the conversion of component, functional, system and regression tests into automation. At a high level, what do you think about this and how would you go about this? What have you accomplished that's most similar to this?

And then I dig into the given answers to get specific details. This seems to give me the data I need to determine "is this person self-motivated?" But since I just changed up the interview questions I use to start taking this into account, I'll reserve judgment till later.

-------------------

(1) Adler, Lou. Hire with Your Head: Using Performance-based Hiring to Build Great Teams. 3rd ed. Hoboken, N.J: John Wiley & Sons, 2007. Print.
http://www.amazon.com/Hire-Your-Head-Performance-Based-ebook/dp/B000SEOVH6

Monday, October 7, 2013

Words of the Week: Heuristic [& Algorithm]

Isaac and I were debating the meaning of heuristics the other day, trying to come to some common ground.  We ended up going into some interesting places, but it led to a good question about what they are and how they are used.  Let me start with my off-the-cuff definition, using no wiki links; then we'll take a more formal look.  A heuristic, in my mind, is a way of getting a not-always-right answer, but an answer one hopes is good enough.  In comparison, an algorithm will always provide a 'correct' answer.  That is, the output is consistent given the input, and it will always provide the best answer it can. This leads into the question: can a computer give a non-algorithmic (heuristic) response?

Now when Isaac and I were talking through this, he noted that a computer always gives the same answer if you lock down all of the inputs.  Random.Int() uses a seed, and if you replay with that same seed it will do the same thing.  If you change the configuration or time of the computer and that has an effect, then that is an input too, and in theory you can lock it down as well.  If the number of CPU cycles available to the process is an input, that too would be locked down.  Now, given my informal definition, are all heuristics really algorithms, just with inputs that are hard to define?
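A minimal Java illustration of his point, using nothing beyond the standard library:

import java.util.Random;

public class SeedDemo {
    public static void main(String[] args) {
        Random r1 = new Random(42); // same seed...
        Random r2 = new Random(42);
        for (int i = 0; i != 5; i++) {
            // ...same "random" sequence, every time, on every run.
            System.out.println(r1.nextInt(100) + " == " + r2.nextInt(100));
        }
    }
}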

Let's flip this on its head.  What about people?  Isaac asked me, "What would you call that plant?"  I said, "Green," to his chagrin.   He said, "No, the name of the plant is?"  A co-worker interjected, "George."  Obviously it wasn't Isaac's day, but the point Isaac was going for was: if I didn't know the name and then he told it to me, would I then be able to give a closer-to-accurate answer?  Even if Isaac was wrong, I had some knowledge base to draw on and could use that to give a heuristically better answer, but I could still be wrong, given that the name might also be George.  The problem is...  what if we locked down all of my life experiences and genes, and repeated the experiment?  Obviously it can't be done, but am I really an algorithm, just a walking input/output device of large complexity?  We are hitting into the realm of determinism and free will.  Is everything an algorithm?  I don't really want to get too far into the weeds, but I believe a point will emerge from this.

Now it is time to look at more formal and test-oriented definitions of heuristics.  Wikipedia says:
In computer science, artificial intelligence, and mathematical optimization, a heuristic is a technique designed for solving a problem more quickly when classic methods are too slow, or for finding an approximate solution when classic methods fail to find any exact solution. This is achieved by trading optimality, completeness, accuracy, or precision for speed.

 Bach and Kaner say,
A heuristic is a fallible method of solving a problem or making a decision.
So now that we know that, I want to contrast it with algorithm, the other word incidentally under consideration.  Again, let us consider Wikipedia:
In mathematics and computer science, an algorithm is a step-by-step procedure for calculations. Algorithms are used for calculation, data processing, and automated reasoning.  ... While there is no generally accepted formal definition of "algorithm," an informal definition could be "a set of rules that precisely defines a sequence of operations."
Both definitions of heuristic acknowledge failure as a possibility.  On the other side, the algorithm definition notes that there is no formal definition, and the only 'failure' condition I found that was somewhat on topic was the question of whether an algorithm needs to stop.  If an algorithm does not care whether it gives back a correct value, just that it has a set of finite steps, then it too acknowledges that failure is allowed.  In theory, some might note that it should end eventually, depending on whether you think a program must have a halting state to be an algorithm, but this is the only constraint on outcome.  I suppose you could say an algorithm can also fail, as success is not a required condition, only halting.  In all the definitions, there is some method for finding a result.  The only difference appears to be that heuristics specifically acknowledge limits and stopping points, and algorithms don't.

So what is the difference between Heuristic and Algorithm?  One of the popular StackOverflow answers says:
  • An algorithm is typically deterministic and proven to yield an optimal result
  • A heuristic has no proof of correctness, often involves random elements, and may not yield optimal results.
So in this very formal world, an algorithm requires mathematical proof of correctness (within a given context, such as assuming our universe's constraints).  Heuristics, on the other hand, need no such formal proof.  In that case, most code is in fact heuristic in nature, and most of our testing is also heuristic in nature.  This starts to lead into the question of sapient testing vs. checking, but still, I don't want to get into that yet.  Well, not much.  I do want to address one other quote from Bach:
There are two main issues: something that helps you solve a problem without being a guarantee. This immediately suggests the next issue: that heuristics must be applied sapiently (meaning with skill and care).
The idea that heuristics require skill and care is an interesting one.  When I write an automated test or when I write a program, I use skill and care.  Am I using heuristics in my development, or is my algorithm the heuristic?  When I test, am I exploring a system using a heuristic, while when I write automation, the heuristic of my exploration is lost after the writing of the test, at which point it becomes something else: an algorithm to formally prove that the two systems can interact the same way, controlling for the majority of the inputs?  What happens when computers start getting vision and understand a little of the context?  Are they then sapient in a given context (meaning, are they skilled and do they take care to manage the more important aspects over the less important ones)?

I don't intend to go on in this questioning manner, but rather to hit you with a surprise left.  Sometimes words are so squirrelly that when one person attempts to pin them down, they end up creating an unintended chain of events.  They create just another version of the meaning of the word.  Next thing you know, no one really knows what the word means or what the difference between two words is.  I have done way more research on this than most do, and yet I don't think there is a good answer.  I too will attempt to put my finger in the dike, but I don't expect to stop the flow of definitions:

  • A Heuristic is an attempt to create a reasonable solution in a reasonable amount of time.  Heuristics are always Algorithmic in that they have a set of steps, even if those steps are not formal.
  • An Algorithm is a series of steps that, given control of all inputs, will consistently give back a result, without necessarily considering other external factors such as time or resources.  These steps have some formal rigor.
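To put my own definitions into code (a toy example of my own, not from any of the sources above): the exhaustive scan below is an Algorithm in the strict sense, provably optimal, while the sampler is a Heuristic that trades that guarantee for speed on enormous arrays.

import java.util.Random;

public class MaxFinder {
    // Algorithm: provably returns the true maximum, every time.
    static int exactMax(int[] data) {
        int max = data[0];
        for (int value : data) {
            if (value > max) max = value;
        }
        return max;
    }

    // Heuristic: checks 100 random positions; usually close, never guaranteed.
    static int sampledMax(int[] data, Random random) {
        int max = data[random.nextInt(data.length)];
        for (int i = 0; i != 100; i++) {
            int value = data[random.nextInt(data.length)];
            if (value > max) max = value;
        }
        return max;
    }
}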

Wednesday, October 2, 2013

Refactoring: Improving The Design Of Existing Code

Having read many books on the basics of programming, building up queries and the basics of design, I have found that almost none really talk about how to deal with designing code outside abstract forms. Most books present design at a high level, possibly where a UML diagram is shown, and revolve around most of the OOP connections. Some talk over how to use design patterns or create data structures of various types. All seem to be under the illusion that we as programmers can actually apply abstract concepts to real, everyday, practical methodologies. Almost always they use things like animals or shapes or other "simple" (but not practical) OOP examples. They are always well designed, and seem relatively simple. In spite of that, I think I have learned how to do proper design to some degree through years of trial and error.

Refactoring, on the other hand, starts out with a highly simple, yet more real-world, movie system, and then slowly but surely unwraps the design over 50 pages. It is one of the most wonderful sets of "this code works, but let us make it better" I have ever seen, and it is dead simple in implementation. The book starts with a few classes, including a method to render to a console. Then, blow by blow, the book shows the differences by bolding each change. The book talks through design choices, like how having a rendering method that contains what most would call "business logic" prevents you from easily having multiple methods of rendering (i.e., console and HTML) without duplicate code. The authors also make a somewhat convincing argument against temporary variables, although I am not 100% convinced. Their reasoning is that it is harder to refactor with temp variables, but sometimes temp variables (in my opinion) can provide clarity, not to mention they tend to show up in debugger watch windows. To be fair, later on the author notes that you can add temporary variables back in for clarity, but it is not emphasized nearly as much.
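To illustrate the temp-variable argument, here is a toy sketch in the spirit of the book (my own example, not the book's movie code):

class Order {
    double quantity;
    double itemPrice;

    // With a temporary variable (handy in a debugger watch window):
    double priceWithTemp() {
        double basePrice = quantity * itemPrice;
        if (basePrice > 1000) return basePrice * 0.95;
        return basePrice * 0.98;
    }

    // Refactored ("Replace Temp with Query"): the temp becomes a method any code can reuse.
    double price() {
        if (basePrice() > 1000) return basePrice() * 0.95;
        return basePrice() * 0.98;
    }

    double basePrice() {
        return quantity * itemPrice;
    }
}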

As I continued through the book, a second and important point was brought up. Refactoring requires reasonable amounts of unit testing in order to ensure that you do not accidentally break the system with the redesign. The argument continues that all coding falls into two categories: one is creating new functionality and the other is refactoring. When creating the initial code, you need testing around it, which connects into TDD. The point is to be careful with refactors because they can cause instability, and to work slowly through the refactoring. They talk about constantly hitting the compile button just to be sure you have not screwed something up.
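A sketch of that safety net, assuming JUnit and the toy Order class above; run it before and after every small step, and a refactoring that changes behavior gets caught immediately:

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class OrderTest {
    @Test
    public void largeOrdersGetTheBiggerDiscount() {
        Order order = new Order();
        order.quantity = 200;
        order.itemPrice = 10;
        // 2000 base price, over the 1000 threshold, so 5% off.
        assertEquals(1900.0, order.price(), 0.001);
    }
}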

Sometimes it is comforting to know that I am not the only one who runs into certain types of trouble. One of my favorite statements in the book is in chapter 7, where the author notes that after 10+ years of OOP, he still gets the placement of responsibilities in classes wrong. This to me is perhaps the entire key to refactoring. The point is, we as humans are flawed, and even if we are all flawed in unique ways, it is something we have in common. This book, in my opinion, is not ultimately about how to correctly design, but about how to deal with the mess that you will inevitably make. This is so important I think it bears repeating: this book says that no one is smart enough to build the correct software the first time.  Instead, it says that you must bravely push on with your current understanding and then, when you finally do have something of a concept of what you are doing, go back and refactor the work that you just created. That is hard to put into practice, since you just got it working, and now you have to go tear it up and build again. The beauty of the refactor is that you don't have to go it alone; you have techniques that allow you to make wise choices about what to change, and the changes should have no effect on the final product.

One final thing I think is worth mentioning. The way the book is laid out, you can easily revisit any given refactoring technique that you didn't "get" the first time, as it is fairly rationally structured, grouping refactors together but keeping each refactor individualized. It makes me wonder how many times they refactored the book before they felt they had it right?