Friday, June 27, 2014

My Current Test Framework: Testing large datasets

I recently wrote about how I feel few people talk about their framework design in any detail.  I think this is a shame and should be corrected as soon as possible.  Unfortunately, most companies don't allow software, including test automation, to be released to the public, so most of the code we do see comes from consultants, with companies occasionally okaying something.  In my case, I did something like a clean-room implementation of my code.  It is much simplified, does not demonstrate anything I did for my company, nor does it directly reference them.  It is open source and free to use.  Without further delay, here is the link: https://github.com/jc-d/Demos.

It is in Java and was intended as part of a demo for a two-hour presentation.  Because of the complexity of the system, I'll write some notes about it here.  I used IntelliJ to develop this code and recommend using it to view the code.  The project uses Maven for its libraries, including TestNG, which you can run from IntelliJ.  Many of the concepts could be translated to C# with little difficulty.

So what does it do?  It demonstrates a few different, simple examples of reflections and then a build-up of methods for generating test data in a reflective and possibly smarter fashion (depending on context).  I'm sure you're sick of hearing about reflections from me, so I'll try to make this my last talk on them for a while, unless I come up with something new and clever.

As a brief aside, Isaac claims that while this is a valiant attempt to write out a code walkthrough, it really needs to be a video or audio recording.  Perhaps so, but I don't want to devote the time unless people want it or will find it useful.  As I have my doubts, I'm going to let the text stand and see if I get anyone requesting a video.  If I do maybe I'll put some time into it.  Maybe. :)

Now on to the code...

First, the simple examples, which I do use in my framework, though not in so simple a form.  The two simple examples are DebugData and ExampleOfList, and both are under test/java/SimpleReflections.  DebugData shows how you can use reflections to print out a simple object, one level down.  It 'toStrings' each field in the object given.  Obviously, if you wanted sub-fields that would take more complex code, but this is often useful.  ExampleOfList takes a list of strings, runs a method on each item in the list, and returns the modified list.  Obviously this could be any method, but for simplicity of the demo I limited it to methods that take no arguments.
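For readers who don't want to open the repo, here is a minimal sketch of the DebugData idea, using names I made up rather than the ones in the demo:

import java.lang.reflect.Field;

public class DebugPrinter {
 // Prints each declared field of the given object, one level down, relying on toString().
 public static void printFields(Object obj) throws IllegalAccessException {
  for (Field field : obj.getClass().getDeclaredFields()) {
   field.setAccessible(true);            // reach private fields too
   System.out.println(field.getName() + " = " + field.get(obj));
  }
 }
}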

All the rest of the code is about different methods for generating data.  I will briefly describe each of them, and if you want, you can review the code.

The HardcodedNaiveApproach is where all of the values are hard coded using quoted strings, e.g. x.setValue("Hard coded");.  This is a fine method for 1-3 tests, but if you need more you probably don't want to copy and paste that data.  It is hard to maintain, so you might move to the HardcodedSmarterApproach.  This method uses functions to return objects with static data, so you can follow the DRY principle.  However, the data is the same each time, so you add some random value, maybe appending it to the end.  The problem then is: what are your equivalence-class values?  For example, do you want to generate all Unicode characters?  What about the error and 'null' characters?  Are negative numbers equally valid to positive numbers?  If not, then your methods are less DRY than you might want, as you will need different methods for each boundary, if that matters.  Also, you are writing setter calls for each value, which may not keep up when a new property is added.  We haven't even talked about validation yet, which would require custom validators based upon the success/failure criteria of the data your functions generate.  That is to say, if you generate a negative number and the operation should fail for that, not only does your generator have to handle that, but your validator does as well.   What to do?
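To make the contrast concrete, here is a rough sketch of the two hardcoded styles, assuming Address exposes ordinary setters (the names here are illustrative, not the ones in the repo):

public class HardcodedExamples {
 // Naive: each test repeats literal values inline (copy/paste grows with every new test).
 public static Address naiveAddress() {
  Address address = new Address();
  address.setName("Hard coded name");
  address.setAddress1("123 Hard Coded St.");
  return address;
 }

 // Smarter: one helper owns the data (DRY), perhaps with a crude random suffix for uniqueness,
 // but every boundary (negative, empty, Unicode, ...) still needs its own method.
 public static Address smarterAddress() {
  Address address = new Address();
  address.setName("Test Name " + System.nanoTime());
  address.setAddress1("123 Test St.");
  return address;
 }
}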

Perhaps reflections could help solve these problems?  The ReflectiveNaiveApproach instead uses the typing system to determine what to generate for each field in a given class.  An integer field gets a random integer and a string field gets a random string.  We know the field name and class type, so we could add if statements for each field/type, but that brings back the same maintenance problem with new properties that we had with the hardcoded approaches.  If we didn't do that, we could still handle new properties, assuming we knew how to set the type, but the generated value might not fit the rules of the business logic, and we have no way to know whether it should work or not.  For fuzz testing this is alright, but not for functional testing.  Are there any solutions?  Maybe.
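A minimal sketch of what a type-driven (naive) reflective filler can look like follows; this is the shape of the idea, not the repo's implementation:

import java.lang.reflect.Field;
import java.util.Random;
import java.util.UUID;

public class NaiveReflectiveFiller {
 private static final Random random = new Random();

 // Fills each String or int field with a random value chosen purely by type.
 // New properties are handled automatically, but business rules are ignored.
 public static <T> T fill(T obj) throws IllegalAccessException {
  for (Field field : obj.getClass().getDeclaredFields()) {
   field.setAccessible(true);
   if (field.getType() == String.class) {
    field.set(obj, UUID.randomUUID().toString());
   } else if (field.getType() == int.class || field.getType() == Integer.class) {
    field.set(obj, random.nextInt());
   }
   // Other types would need their own branches.
  }
  return obj;
 }
}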

The final answer I currently have is the ReflectiveSmarterApproach.  In effect, when you need to generate lots of different data for lots of different fields, you need custom generators per class of field.  What is needed is an annotation on each field telling the framework what to generate.  An example of that can be found in the Address class.  Here is a partial example:

public class Address {
 @FieldData(dataGenerators = AverageSizedStringGenerator.class)
 private String name;
 @FieldData(dataGenerators = AddressGenerator.class)
 private String address1;
 //...
}

Now let us look at an example generator:


public class AddressGenerator extends GenericGenerator {
 @Override
 public List<DynamicData> generateFields() {
  List<DynamicData> fields = new ArrayList<DynamicData>();
  fields.add(new DynamicData(RandomString.randomAddress1(), "Address", DynamicDataMetaData.PositiveTest));
  fields.add(new DynamicData("", "Empty",
   new DynamicDataMetaData[] {DynamicDataMetaData.NegativeTest, DynamicDataMetaData.EmptyValue}).
   setErrorClass(InvalidDataError.class));

  return fields;
 }
}


This generator produces a random address as well as an empty address.  One of these addresses is clearly valid, while the empty address is flagged as a negative test (expected to produce an InvalidDataError).

Through the power of reflections you can do something like this:


List<DynamicDataMetaData> exclude = new ArrayList<DynamicDataMetaData>();
exclude.add(DynamicDataMetaData.NegativeTest);
ReflectiveData<Address> shippingAddress = new CreateInstanceOfData<Address>().setObject(new Address(), exclude);


The exclude piece is where you might filter out generating certain values.  Say you want to run only positive tests (those expected to succeed); you might filter out the negative tests (those expected not to complete the task and possibly cause an error).  The third line generates an object in which every annotated property has been filled with a generated value.  Now, this does not handle new fields automatically, but it could certainly be designed to error out if it found any un-annotated fields (it is not at present designed to do this), and if you embed the code in your production code, it would be more obvious to the developer that they need to add a generator.
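The demo does not ship that check, but a sketch of what it could look like is below, assuming the FieldData annotation has runtime retention (which it must have for the demo to read it reflectively):

import java.lang.reflect.Field;

public class AnnotationAudit {
 // Possible extension: fail fast if any field of a class lacks the @FieldData annotation,
 // so a newly added property cannot silently go ungenerated.
 public static void requireAllFieldsAnnotated(Class<?> clazz) {
  for (Field field : clazz.getDeclaredFields()) {
   if (!field.isAnnotationPresent(FieldData.class)) {
    throw new IllegalStateException(clazz.getSimpleName() + "." + field.getName()
     + " has no @FieldData annotation; add a generator for it.");
   }
  }
 }
}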

Now, how do we pick which value to test with when we could use either the empty address or a real address?  At present it picks randomly because, according to James Bach, random selection only takes roughly 2x the runs to reach coverage equal to pairwise testing.  Since we know all about the reason for generating a particular value (what error it would cause, etc.), we can decide at run time how the validation should occur.  The example validation is probably a bit complex, but I was running out of time and got a bit slapdash on that part.  One issue with this method is that it is hard to know what your coverage is.  You can serialize the objects for later examination and even build statistical models around what you generated if needed.
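To illustrate the pick-one-at-random-and-validate-by-metadata idea without depending on the repo's exact DynamicData API, here is a self-contained sketch with my own throwaway types:

import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Random;

public class PickAndValidateSketch {
 enum Meta { PositiveTest, NegativeTest, EmptyValue }

 static class Candidate {
  final String value;
  final EnumSet<Meta> meta;
  Candidate(String value, EnumSet<Meta> meta) { this.value = value; this.meta = meta; }
 }

 private static final Random random = new Random();

 // Pick one generated candidate at random; its metadata decides how validation should occur.
 public static void runOne(List<Candidate> candidates) {
  Candidate chosen = candidates.get(random.nextInt(candidates.size()));
  if (chosen.meta.contains(Meta.NegativeTest)) {
   System.out.println("Expecting rejection for: '" + chosen.value + "'");
  } else {
   System.out.println("Expecting acceptance for: '" + chosen.value + "'");
  }
 }

 public static void main(String[] args) {
  runOne(Arrays.asList(
   new Candidate("123 Test St.", EnumSet.of(Meta.PositiveTest)),
   new Candidate("", EnumSet.of(Meta.NegativeTest, Meta.EmptyValue))));
 }
}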

Summary

Obviously this is a somewhat heavy framework for generating, say, 20 test data values.  But when you have a much larger search space, one that approaches infinite, it is a really valuable tool.  I have generated objects with as many as 50 properties/fields roughly 400,000 times in a 24-hour period; that is to say, roughly 400,000 generated tests.  I have found bugs that, even with our generator, would only be seen 1 in 40,000 runs and would probably never have been found in manual testing (but would likely be seen in production).  The version I use at work has more than a year's worth of development and research behind it, supporting a lot more complexity than exists in this example, but I also don't think it could easily be adapted, as it was built around our particular problems.

This simple version can be made to support other environments with relatively little code modification.  It takes much of the research and ideas I had and implements them in a simpler, more flexible fashion.  You should easily be able to hook up your own class, create annotations and generators, and have tests being generated within a day (once you understand how).  On the other hand, it might take a little longer to figure out how to do the validation, as that can be tricky.

One problem I have with what I have created is that there is no word or phrase to describe it.  In some sense it is designed to create exploratory data.  In another sense it is a little like model-driven testing, in that it generates data, has an understanding of what state the system should go to, and has a method to validate that it went to the correct state.  However, it doesn't traverse multiple states and isn't designed like a traditional MDT system.   Data-driven testing describes a method for testing using static data from a source like a CSV file or database; while similar, this creates dynamic tests that no tester may have imagined.  Like combinatorics, this creates combinations of values, but unlike pairwise testing, the goal isn't just generating the combinations (which can be impossible or impractical to enumerate) but to generate almost innumerable values and pick a few to test with, while enforcing organization of your test data.  This method also encourages ideas like random values, while combinatorics is designed around a more static set of values.  Yes, you can make combinatorial ideas work with non-static sets, but it requires more abstraction (e.g., create a combination of Alpha, Alpha-numeric, ... and this set of payment methods, then use the string type to choose which generator you use) and complexity.  Finally, combinatoric methods can have difficulties when you have too many variables, depending on the implementation.  This is a strange hybrid of multiple different techniques.  I suppose that means it is up to me to try to name it.  Let's call it: JCD's awesome code.  Or, more usefully: Reflective Test Data Model Generation.

I would say don't expect any major changes or additions to the design unless I start hearing that people are using it and need support.  That being said, I love feedback, both positive and negative.

While researching this article I came across this, which is cool, but I found no good place to cite it.  So here is a random freebie: http://en.wikipedia.org/wiki/Curse_of_dimensionality

Thursday, June 19, 2014

What is the Highest Level of Skill in Automation?

Thanks to Robert Sabourin for generating this topic.  Rob asked me, roughly, 'What in your opinion is the highest level of skill in automation?'  He asked this in the airport after WHOSE had ended, while we waited for our planes.  It gave me pause in considering the skills I have learned and helped generate this post.

Let me make clear a few possible issues and assumptions regarding what the highest level of skill in automation is.  First of all, I think there is an assumption of pure hierarchy, which may not exist.  That is to say, there might not be a 'top' skill at all, or the top skill might vary by context.  So I really am speaking mostly from a personal level and with my own personal set of automation problems I have faced.  When I answered Rob's question in person, I neglected to add that stipulation.  The other possible concern is that the answer I give is overloaded, so I will have to describe the details after I give the short answer.  Without making you wait, here is my rough answer: Reflections.

What are reflections?

In speaking of reflections, you might assume I am speaking of the technology, and for good reason: I have spoken about it many times in this blog.  However, that is just a technical trick, albeit a useful one.  I am not talking about that trick, even if the computer-science term 'reflections' is part of the answer.  In speaking of reflections, I mean something much broader.

The famous "Thinker," sitting on his rock just pondering, is much closer to what I had in mind.  But you might say, "Wait, isn't that human thinking?  Isn't that critical thinking or introspection?"  Yes, yes it is.  What I mean by reflections is the art of making a computer think.  While a computer's intelligence is not exactly human intelligence, the closer we come to crossing that vast gulf, the closer we are to generating better automation.

Most people might argue that this requires someone with in-depth knowledge of artificial intelligence, or at least a degree in computer science, or someone with a development-oriented background.  Perhaps that is the logical conclusion we will ultimately see in the automation field, but I don't think an in-depth knowledge of development or AI is required for now.  Nor do I think you need to go to that level to start understanding this concept.

Instead, I think you need to start thinking about automation the way you think about writing tests.  In some ways this relates to test design.  Why can't the automation ask what I am missing?  Why can't my automation tell me the most likely reason a failure occurred*?  Why can't the automation work around failures*?  Or, at the very least, ignore some failures so it isn't blocked by the first issue it runs into*?  (There's a small TestNG sketch of that last idea after the footnote below.)

* I've done some work around these, so don't say they are impossible.
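On that last point, TestNG (which the demo project already pulls in through Maven) has a simple built-in version of "don't stop at the first failure"; here is a minimal illustration, not the more elaborate work hinted at above:

import org.testng.annotations.Test;
import org.testng.asserts.SoftAssert;

public class SoftAssertExample {
 @Test
 public void collectsAllFailures() {
  SoftAssert softly = new SoftAssert();
  softly.assertEquals("actual".length(), 6, "length check");      // passes
  softly.assertTrue("actual".isEmpty(), "recorded, not fatal");   // fails but execution continues
  softly.assertEquals(1 + 1, 2, "still runs after a failure");    // passes
  softly.assertAll();  // reports every collected failure at the end
 }
}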

Now that I have walked around the definition, let me define reflections in the context of this article.

Reflections:  Developing new ideas based upon what is already known.

An example

A good example of the need for reflections is the brilliant talk given by Vishal Chowdhary, in which he notes that in translation (and search, etc.) you can't know what the correct answer is.  You have no oracle to determine if the results are correct.  Many words could be chosen for a translation, and it is hard to predict which ones are the 'best'.  Since computer language translations are adaptive, you can't just write "Assert.Equals(translation, expectedWord)" with hardcoded values.  Since these values are dynamic, the best you can do is use a "degree of closeness".  You see, they couldn't predict how the translation service would work, because it has dynamic data and the world changes quickly, including new words, proper titles, et cetera.

So how do you test with this?  Well, you can look at the rate of change between translations.  You can translate a sentence, translate it back, and record how close the result is to the original sentence.  Now track how close it is over time, across different code and data changes.  You could take translation string lengths and see how they vary over time, noting when large deviations occur.  There are lots of methods to validate a translation, but most of them require the code to reflect on past results, known sentences, and the like.  The automation 'thinks' about its past, and on that basis judges the current results.
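I have no access to the service Vishal describes, so the sketch below only shows the round-trip idea against a hypothetical Translator interface, with a crude word-overlap score standing in for "degree of closeness"; a real harness would track these scores across builds rather than assert against a fixed number:

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RoundTripCheck {
 // Hypothetical translation service; the real system has no simple oracle.
 public interface Translator {
  String translate(String text, String fromLang, String toLang);
 }

 // Round-trips a sentence and returns a crude closeness score between 0.0 and 1.0,
 // based on how many of the original words survive the trip.
 public static double roundTripCloseness(Translator t, String sentence) {
  String there = t.translate(sentence, "en", "fr");
  String back = t.translate(there, "fr", "en");
  Set<String> returned = new HashSet<String>(Arrays.asList(back.toLowerCase().split("\\s+")));
  String[] original = sentence.toLowerCase().split("\\s+");
  int kept = 0;
  for (String word : original) {
   if (returned.contains(word)) {
    kept++;
   }
  }
  return original.length == 0 ? 0.0 : (double) kept / original.length;
 }
}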

That is not to say all of the automation needs to be reflective.  For example, you could hardcode a sentence with "Bill Clinton" in it and check to make sure it did not, in fact, translate his name.  You could translate a number and check that the value didn't change.  You might translate a web page and check something not related to the translation, such as layout.

Not just the code

In reading my blog, you might assume that because I specialize in automation I think reflection is a code-oriented activity.  I do think that, but I also think it applies more broadly.  When I write a test, I should be reflecting on that activity.  That is to say, I should be asking, "Is that really the best design?", "Should I be copying and pasting?", "Should I really be automating this?", etc.  By always having part of my brain reflecting on the code, I too write better code.  Hopefully, between my writing better code and my code trying to do better testing using reflections, we do better testing overall.  This also applies to testing in general, with considerations like "That doesn't look like the rest of the UI." or "I don't recall that button being there in the last build."

I have only scratched the surface of this topic and made it apply more specifically to automation and testing, but I think it applies to life too.  For a broader look at this topic, I would highly recommend Steve Yegge's blog post Gödel-Escher-Blog.  It will make you smarter.  Then, the next time you go do some automation, reflect upon these ideas. :)  And if you are feeling really adventurous, please leave a comment with your reflections on this article here.