About 98 Percent Done: April 2015

Monday, April 20, 2015

Expectations...again...

This was written a long time ago...but finally edited to readability.

So I just finished the AST BBST Instructors course. In it, we started by introducing ourselves. I of course summarized my introduction from this blog because I was being lazy, and it said what I thought I wanted to say.

In response to that I was asked:

"It sounds like you have the confidence and experience to be a manager, but I wonder if this might make it hard for you to teach? What do you think?" - Another Student

I responded with:

"Yes I find that my mental model has a hard time bonding with people who aren't also self-starters and/or newbies. However I've been specifically working on how do we 'as a testing community' get the first level of people into testing. I still have a problem dealing with people who just want to be spoon-fed information.
My experience is mainly in taking people with 3-5 years of experience and tuning (yes tuning) them into the next skill they need to be the best at their current position. However so far this has been limited to where I work, as I usually understand the context enough to say "if you could do this" it's the next skill you should learn.
So part of my goal is to be able to teach less experienced people. But I still struggle with if I should TRY to deal with non-self-starters…"

I’d like to take a more detailed look at my personal thoughts on the matter. Specifically around, “Does your own attitude and personality cause you difficulty in teaching people?” I’m going to try and detail an internal dialogue I’ve been having about this lately. I first tried to sum up some feelings in an older blog about Expectations.

This is what I mean when I say "non-self-starters". There have always been people at places I've worked who are comfortable doing just the basics: show up at work, do their job, go home. These are the people who, when asked in an interview "What do you do to improve your skills", have an answer similar to "On the job training". These are the people who don't seem to want to take time outside of work to improve their skills (which to me translates to "I don't consider this job a career"). Some of these people seem bright, others are merely content at getting along and that's enough for them.

Then there are the self-starter people who will go out of their way to learn new things. These people ask you questions, and when you say, "tell me what you know about it so far", they blow your mind with the research they have done. Or alter your understanding of the subject with some new pieces that you haven't heard or seen of yet. Sometimes at a minimum they have clearly researched and/or understood the material, but don't change your level of knowledge.

I know mentors / leaders want to use terms like taught, educated or instructed. But when you have self-starters, that isn't the appropriate wording. The closest I've come is tuning them. When I chat with a self-starter because they are asking "What should I learn next?" (and them asking is one of the key points), there are a few ways to encourage them. And generally that is all you can do. Point in a direction, say 'that way', and then get out of their way.

Have them learn a new skill. Sometimes these are easy to find for beginning self-starters. Don't know SQL? Learn it; I have yet to meet any tester who doesn't benefit from knowing it. As the self-starters get more and more experience, a new skill gets harder and harder to find for each individual. Currently I've had serious success with just allowing them the freedom to find new things. This is the real power of the self-starter: they aren't okay with sitting idly by and surfing reddit, they WANT to provide value, they WANT to solve problems.
Have them level up a skill they are already strong in. This helps those who have just finished something rigorous, demanding or seriously mentally intensive. (This can be work related, like finishing a project or a mentally challenging class, or a non-work-related life-changing event.) It allows them to learn something, but with the permission to not be as intensive with it. An example would be having someone learn the singleton pattern, instead of a whole new language and paradigm like LISP.
Have them level up a skill they are weak in. I don't normally recommend this for very many people. I tend to follow the Good to Great idea that weakness isn't inherently bad. But there are genuinely some weaknesses you need to either compensate for, or bring up to a minimum bar.

Sometimes even that isn't enough. I recall one situation where I was attempting to tune someone I classify as a full-blown self-starter. We were going over the OSI model, and I was attempting to explain how each layer could be an attack point for testing. Unfortunately we were not speaking the same language. After about 30 minutes, they were frustrated and on the verge of crying. I was frustrated and not understanding why they couldn't get the 'simple concepts'. We parted that day, and I don’t recall us ever trying to train together again. I took this as a personal failure…how could I not explain such simple concepts (which OSI is and is not) to someone who tries really hard when they self-educate and generally succeeds? It took me a long time to understand that I had a base of knowledge about computers that they had never had exposure to. Bad on me for assuming what they knew. The problem was that I knew stuff they didn't. They also knew stuff I didn't. It takes time and effort to truly get to know people. I hadn't taken the time or maybe didn't have the perception to understand that they had no idea what I was talking about. They didn't have the trust to tell me 'What the hell are you talking about'.

High expectations are fine so long as you learn to temper what you tell people you want with realistic acceptance of what people can really accomplish. It all comes down to this: do you trust your people to work hard, and do they trust you to not hose them with unreasonable expectations? It takes some serious time to build that trust. It's why you see people follow a strong leader around to companies.

If you don't fall into the category of self-starter as I've laid out here, there is still hope: you can become one. Start today. Motivation breeds motivation, so find something interesting and learn it. When you're done learning that, find another. Then repeat ad nauseam.

I'm going to wrap up and caveat this entire article with a thinking exercise for you.
Can your expectations be too high? Is this a bad thing?
How else would we ever achieve something new?

Thursday, April 9, 2015

The Future of Load Testing

First of all, I won't pretend I actually know what the future of Load Testing will look like, but I want to describe some of the different ideas I have seen and done. Some of these things I have not seen or heard of by anyone else, certainly not on the internet. So hopefully these will expand your thinking around Load Testing.

What is the Purpose of Load Testing?

Functional testing is designed to exercise the functionality of the system. Frequently people talk about using Selenium or QTP to exercise a particular piece of functionality, often in the UI. With the Test Pyramid it is suggested that these sorts of tests should try to hit below the UI level for various reasons. No matter whether you test an API, a UI, a console app or some other hook below the UI, if your concern is around the Test Pyramid, it is likely you are trying to test the functionality. You are often interested in large sets of behaviors and the way the system responds. When you do Load Testing, in most cases, few of the broad tests are as important. You are not trying to see if the system will handle all the corner cases, and often Load Tests don't even check anything besides a response code. The raison d'etre for a Load Test, on the other hand, often has less to do with how the system functions and more to do with how the system reacts to many different inputs occurring in a relatively quick fashion. Granted, some Load Tests are less about the number of inputs and more about the style of the input (e.g. large files) or other types of constraints (e.g. with less RAM). To sum it up in a general statement, Wikipedia describes it this way:

Load testing is the process of putting demand on a system or device and measuring its response. - http://en.wikipedia.org/wiki/Load_testing

However, the majority of Load Tests simply want to understand how many inputs a system can handle given a certain profile of inputs.

All of that sounds rather abstract, but if you go reading Microsoft's very handy guide on types of performance tests, you will see that the underlying purpose of this sort of testing varies. They use terms like load test, performance test and stress test for different meanings. I think all of this data is really useful and valid. I, however, am going to use Load Test (in capitals) to describe any number of these different purposes. What I want to look at instead is what sorts of ways we can organize our testing to make for a better long term experience. These ideas could be applied to many of the various purposes of Load Testing, so assuming you understand your Load Testing goals you can tailor these ideas to your organization.

35-50% of the Internet Traffic

The first idea I think worth exploring is the Netflix model of Load Testing. Effectively, trying to have a production-like QA system is silly for Netflix because that would be like having a second set of the internet for QA. In fact, if you were like Netflix, you would then have to have an insane number of additional systems to generate a load anything like your customers... or you could just have production traffic mirrored between the two systems. Having a second prod-like environment is of course not going to work. So they came up with a radical set of strategies, but I think this sums it up nicely:

We have found that the best defense against major unexpected failures is to fail often. By frequently causing failures, we force our services to be built in a way that is more resilient. - http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html

The best method they found was basically to attack production and see how it would respond. This however still leaves the question of how they specifically test that their code will handle new loads. While they do some Load Testing, their biggest defenses are the scaled down production traffic mirrors they run, the ability to go back to old code and the fact that AWS allows you to spin up new instances all the time. In effect, they turn the problem on its head, but this really only is helpful if you can use AWS to grow quickly and if your traffic is relatively uniform. Also, at Netflix scale, you can hire a lot of engineering talent to build this up.

The Limits of Load Test Systems

When trying to leverage user traffic doesn't work, you have to start looking for other options. Using something like JMeter is interesting. I call out JMeter because that is what Netflix uses above and beyond their production traffic mirror. A tool like JMeter takes traffic from a proxy and feeds it into a script (I guess you could hand code it if you are crazy). Then you edit the script and parameterize it. You run the script over and over again using multiple threads and try to create load. These tools might instrument the systems under test or you might have to do that yourself. In either case, the data gathered is outputted and left for some poor soul to try to understand. After having been in this position several times, let me say that it truly is difficult to understand these results, in particular because you had to use record-playback from a proxy. Just as it is a bad idea to use record-playback in automating your functional tests, I think it is a bad idea to do so with such tools.

In one of my former companies, they had one specialist whose only job was to deal with these proxy-recorded scripts. They are a mess and I'm not going to pretend I know how to fix them. However, I do have a few ideas and all of them involve your already created friends, the functional tests.

Functional Tests == Load Tests: With or Without UI

When you have a functional test, you might have a complex setup and tear down. However, the test itself is often fairly simple. There are two major ways you can create functional tests: one is using a UI and the other is to hit just below the UI, perhaps at an API level. So logically you can do two things to create load. One is to scale up your UI tests. I have seen this done and know of others who have tried to do it. It is a fairly big engineering feat to create a Load Test using Selenium, but I know it can be done. Be warned, this can be very expensive, as it requires one OS per handful of threads you want to run plus overhead for Selenium hub nodes. The other option is to use your existing API tests and create a Load Test on top of them. You might have to simplify the data creation and you might have to remove the validation if those take too long, but this is a very easy method of Load Testing the system. I have personally built several Load Tests around this idea.
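To make the second option concrete, here is a minimal sketch of wrapping an existing API-level functional test in a thread pool. It is only an illustration under assumptions: `check_create_order` is a hypothetical stand-in for one of your own functional tests (the sleep takes the place of a real request), and `run_load` is a name invented for this example, not part of any tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def check_create_order():
    """Hypothetical functional test body; a real one would call your API."""
    time.sleep(0.01)  # stand-in for the request round trip
    return 201  # stand-in for the response status code


def run_load(test_fn, workers, calls_per_worker):
    """Run test_fn concurrently and return (status, seconds) for every call."""

    def worker(worker_id):
        results = []
        for _ in range(calls_per_worker):
            start = time.perf_counter()
            status = test_fn()
            results.append((status, time.perf_counter() - start))
        return results

    # One task per worker; each task calls the test repeatedly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        batches = list(pool.map(worker, range(workers)))
    return [result for batch in batches for result in batch]


if __name__ == "__main__":
    results = run_load(check_create_order, workers=5, calls_per_worker=4)
    slowest = max(seconds for _, seconds in results)
    print(len(results), "calls, slowest took", round(slowest, 3), "seconds")
```

Because the load runner only takes a function, a fix to the functional test is picked up by the Load Test without any extra work. Threads are enough here because the work is waiting on I/O; heavier client-side work would need a different approach.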

Now we have talked for years about how functional tests should be run in a CI environment. If you are writing your Load Tests like you write your functional tests, using the same systems, then why not run your Load Tests nightly? Obviously there are some questions you want to ask up front, like if someone will be alerted because of it or what impact that has on your functional tests. Another question you might consider is what sort of load do you want? If you need to actively watch to make sure the system stays up, a nightly Load Test would not make much sense. On the other hand, what if you did a small load for a short period of time? You could capture the resulting data and plot it on a chart. This way as you gathered more nightly data, you would have a rough understanding of what you expect. Now, you aren't running a Load Test once a sprint with little idea of what changes might cause impact, with specialized scripts that take a lot of effort to maintain. Instead, you have a trend line and will notice changes. Now this won't tell you when you will fall over or some other data points, but it does give you a change detector. Furthermore, when you have to fix your functional test, your fix automatically goes into the Load Test.
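As a rough illustration of the change detector idea, the sketch below compares the latest nightly result against the earlier runs. Everything in it is an assumption made for the example: the function name `is_regression`, the choice of mean plus three standard deviations as the threshold, and the minimum of five runs before it says anything.

```python
from statistics import mean, stdev


def is_regression(history, latest, sigmas=3.0, min_runs=5):
    """Return True if latest is slower than the trend in history suggests.

    history: response times (seconds) from previous nightly runs.
    latest: response time from tonight's run.
    """
    if len(history) < min_runs:
        return False  # not enough nightly data yet to call anything unusual
    baseline = mean(history)
    spread = stdev(history)
    return latest > baseline + sigmas * spread


if __name__ == "__main__":
    nightly = [1.0, 1.1, 0.9, 1.0, 1.05]  # made-up numbers for illustration
    print(is_regression(nightly, 1.5))
    print(is_regression(nightly, 1.1))
```

This will not tell you when the system falls over; it only flags a night that looks different from the nights before it. If every earlier run took exactly the same time the spread is zero, so a real version would probably want a minimum tolerance as well.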

The next piece is you can run multiple 'threads' at once, not all of which are load related. If your Load Test can't do validation, while running the Load Test you can run some functional tests to see if the system still appears to function. Since this is a job in your CI manager, it should be easy to kick off. You can even manually test while your Load Test runs.

Eventually, you might realize that your functional and load tests only vary in how much load there is and how complex the setup/validation is. You might realize that, like the Load Test, you can capture metrics for the functional tests so that you see if a functional test starts taking longer.
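One possible way to capture timing for functional tests is a small decorator that records how long each call takes. This is a sketch, not a recommendation of a specific tool: `timed`, `TIMINGS` and `check_login` are hypothetical names, and a real setup would send the numbers to whatever stores your nightly data rather than keeping them in a dictionary.

```python
import functools
import time

TIMINGS = {}  # test name -> list of durations in seconds


def timed(test_fn):
    """Record the duration of every call to test_fn, pass or fail."""

    @functools.wraps(test_fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return test_fn(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            TIMINGS.setdefault(test_fn.__name__, []).append(elapsed)

    return wrapper


@timed
def check_login():
    """Hypothetical functional test; the sleep stands in for real work."""
    time.sleep(0.01)
    assert True
```

The timing is recorded in a `finally` block so that a failing test still leaves a data point behind.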

Load Test? What is that?

If you notice my description, my efforts are to make the code base easier to maintain and have as much data as possible. This ultimately makes your Load Testing efforts and your functional efforts look very similar. They live in the same code base and call the same functions; effectively the concepts merge. The only differences are in the design, setup, cleanup and how heavy the validation is. I do think there is value in having different words to describe the intent, but merging the code allows you to get more done with fewer resources. One of the biggest advantages I have personally seen is that I actually understand what the test does, whereas when I was using tools like JMeter, I often had no idea how it worked. I have learned a ton about HTTP and HTTPS just by building my own tools. Now not everyone has time for that and I think ultimately we will want tooling to help make this easier. However, I am not sure if the costs of having a different tech stack and code base are worth the value the current tools provide, so you might have to make your own for now.

If you have found your Load Testing tools are working well today, then feel free to ignore this. I know not everyone has the same needs as we have. I know that the tools we have today do serve some purposes, but my experience hints that they frequently are as much trouble as the value they add.

"In the Year 2000"

- Conan O'Brien, et al.

In trying to predict the future it is really difficult to say what will or will not happen. Conan O'Brien has been predicting what will happen in the year 2000 for more than 15 years, but unlike him, I have not had the benefit of seeing the future. With that said, I suspect that in the future we will see more machine learning style systems that take our Load Test data and create load profiles based upon real time data. Such a system would be able to adaptively adjust based upon metrics in the systems under test and would also detect what changes happened and what code appears to have caused a particular slowdown. It will correlate this and help find what is causing systems to fail.

I also suspect that we will have a better set of load profiles that can push or break systems. Load Test systems of the future might go through those profiles on a daily or build-basis and inform you when a build/day has uncharacteristic results, again based upon some form of machine learning. It will start looking a lot more like functional tests which you only examine when something strange occurs or when you are auditing your tests.

Obviously all of this takes some fair amount of work and effort to produce. We presently don't have the tooling to do this and while bits and pieces have been worked at, I have heard of no one actually doing this.

What sorts of things would you like to see in future load testing systems? What areas have you struggled with? I have found very little experiential data around load testing. I'd love to hear from others on this topic, even if you just dump a link to your blog post.

Monday, April 6, 2015

Tainter's Composite

Introduction

First let me start by defining my terms. Dr. Joseph Tainter is an anthropologist who looked into the question of why societies collapse. A composite is a combination of multiple things into a single image. I believe that, using Tainter's mode of thinking, one can create a model or system, including for software and organizations. This model, using a lot of systems thinking, includes the work of the Dreyfus model, The Gervais Principle and general economics, but the framework it all rests on is from Tainter. I will explain Tainter's ideas; however, if you don't know about the Dreyfus model or Gervais Principle, you will need to read up on those before continuing further. In fact, without those you will be lost. I realize it is a huge amount of reading just to read this post, but I promise you it is worth your while if you want to understand the inner workings of business culture.

Tainter asserts that societies become more complex as the society's need for solutions to problems increases. That is to say, in order to solve a problem, the society adds complexity. For instance, when a tax loophole is discovered, society might create a rule around that loophole to end it. This complexity means that someone must create the law, the people who are affected must learn of it, there must be a means of enforcing the law, someone has to interpret the law and when it is violated someone must enforce the penalty. Most laws might add minor complexity, but at some point the laws become uncountable, making it very difficult to follow the law. Now this by itself might not be a problem, unless the value received from additional complexity declines compared to the cost. When enough of these sorts of mounting complexities cost more to the society than the society produces, eventually the society will fall.

A Corporate Example

Let me consider a different form of this, one that is very easy to understand. Let us suppose that you have a piece of software that requires 9 engineers but has an income stream sufficient to pay for 10 engineers. Customers keep demanding new features, and the complexity of the system rises. Each new feature has both cross-cutting concerns and interlaces with existing features. More bugs are found, but few new customers are added. Eventually, those who knew the project will leave, but the complexity of the project does not. The company finds itself needing more engineers in order to support the product. The new engineers conclude that the best thing to do is to rewrite the application. Even if you ignore Joel's words (from the last link about refactoring vs rewrite), you know that they then have to hire even more engineers to support the old application while the rewrite is in progress. The company had not been making that much money and didn't appear to be able to make much more, so the obvious thing to do is to close shop. A sociopath from The Gervais Principle would clearly do so. They would not feel bad for the engineers who lost their jobs, for the customers who lost the product or even the history of the company.

This scenario has not happened to me personally, but I know that at least twice my entire automation code set was abandoned because the person who replaced me was either not an expert or was at least less experienced than me in their programming skills, and the company had no one else besides me who knew the automation code base. Instead of trying to learn the code base, the less knowledgeable people threw it away and started over. Why do that? My guess is the person doesn't understand the complexity in the system and decides that the last guy or gal must have been an idiot. I too have inherited code bases, and not once have I started over from scratch. I have thrown away pieces and parts, but never the whole, so I can't directly answer it. I suppose there was one exception: on one project I discovered, about 6 months after I started, that there were former automation scripts written in a different programming language. In that particular case, the code I had developed had already outstripped all the functionality of the discovered code. Even in this case, I still reviewed the work to ensure we had captured the same set of solutions. This leads into an interesting question of losing complexity by forgetting; however, I am not going to cover that topic.

Company Growth & Avoiding Responsibility

This also applies to big companies and, as I alluded to earlier, governments. According to Tainter, complexity growth without equal or greater forms of production ends in an eventual collapse. Consider HR policies of different organizations. With self-employment, there is no HR, thus zero complexity. With a company of just a few people there is likely still no HR, but rather just a group of people all working towards a unified goal. Often these people are very close. Eventually there is a need for someone to manage the complexity and you hire someone into HR. As you keep growing, someone does a boneheaded thing, like not taking time off and not calling in sick for a week and then showing up expecting that they still have a job. There is no policy saying they will be fired and so they demand that they either keep their job or get unemployment. So a policy is added, no big deal...

Then you get to be a mega corp with 100,000 employees. You have divisions bigger than most medium sized companies and you have a handbook big enough to use as a deadly weapon. Big organizations are often not run nearly as kindly towards employees because (in part) no one is empowered enough to go do that or, if they are, it is spread unevenly, causing jealousy, which causes things to then be 'not allowed'. Having a pizza party for success now makes 500 others unhappy as they smell your reward. You made 50 people happy and 500 grumpy. An expert would have seen the problem ahead of time and planned on holding it outdoors or away from the other employees, but the person who had the party didn't know better. So the company decides to make a rule saying you can't do that. Eventually all these rules add up. They hurt morale because the Dreyfus expert knows it's stupid but has to think about how to get around the rule, making their 'expert' skill less valuable (as the point of the expert is they don't use rules). They, like most Gervais losers (economic losers), flee, making such companies lose their best employees. It also encourages people to not try to make work better, as that would break norms and likely add more rules to the already overly large set of rules they have to deal with. Finally it encourages people to quit thinking because there is a rule, making them even less effective.

There is yet another reason that rules are created. As The Gervais Principle describes, the sociopaths are always interested in getting the clueless to accept as much responsibility as possible while giving limited credit or authority. The clueless are a buffer for the sociopaths so that they get as much value as possible from the losers without having to deal with the losers. This is a form of protection that a zealous clueless person is willing to provide; however, the clueless often don't know how to actually run an organization. The clueless are in a position in which they use baby talk to pretend they know what they are doing. Since they don't actually know what they are doing, the sociopaths use rules to force the clueless to do what the sociopath thinks is right. These types of rules create a different type of complexity. This is a social complexity, where the sociopath must juggle personalities to get the most production. The clueless are actually a cost, but a small cost considering they are a sort of hedge: if things go wrong, the clueless can be blamed. In a larger sense, this means that the person with bad ideas but who is good at manipulation can maintain power for a long time in spite of making bad choices, because those choices aren't attributed to that person. This can eat away at an organization or society, building up costs from complex institutions with mild production value, further corrupting the society.

Cat & Mouse

Capitalism is an attempt to use competition to remove the ineffective manager who somehow remains employed. The problem with capitalism is that, like any system, complexity builds up to protect the system from being hacked. This can be anything from monopoly laws to defenses from regulatory capture. However, this sort of question is mostly captured by Tainter himself, so I won't dive into it.

Ultimately the Tainter Composite describes how complexity not only infects society but the institutions around society. The final piece I want to describe is how this leads to a cat and mouse set of activities. Often rules are made because there is a real and legitimate concern. Some of my examples might seem silly but are actually true (I have experienced some of them). However, even 'smart' rules can get you into trouble.

Consider black hat crackers and the white hat hackers. Code is in fact just a form of rules, often very useful rules. You are currently reading this using a system with more lines of rules than one person could memorize. The trouble is, there is always a group who enjoys finding ways around these rules for their own purposes, just like some of the troubles in capitalism. It might be someone who just adds everyone to your friends list or it might be someone who takes all the money out of your bank account. As security gets better, more rules have to be applied to protect the system. Will code ever get to the point where maintaining and adding rules is too expensive for one person? Will it ever get to the point where building up all the rules is more costly than accepting the attacks? To some degree that has happened where most people have traded freedom (complexity) for a walled garden (less value). Once again, this gets into reducing complexity, which is a topic for another day.

Rather than telling you how this applies to testing, effectively creating rules for your brain, I am going to try not to add complexity and let you interpret it for yourself. If you found this useful, please write a comment and I will write more on the topic, otherwise, I will leave this dense, difficult topic alone for a while.