Thursday, January 8, 2015

White Papers and Who Owns Your Education?

Let me start out by pointing you to this rather complex white paper, A Large Scale Study of Programming Languages and Code Quality in Github.  It inspired me to write a little about reading white papers and why they matter.  Trying to read, or perhaps decipher, the paper made me feel a little stupid.  What, never heard of the Chi-Square test?  How about Cramér's V?  Does that make you feel like, "Then why are you reading this?  Don't you feel dumb?  You should go back to Yahoo News or wherever you came from."  That is the voice in the back of my head, and yes, reading these sorts of papers certainly can make me feel dumb.  Fortunately, I know better.  Even if I am not an accomplished statistician, I'm sure there are bits of data worth gathering.  I personally believe we should try to improve our development* in whatever ways possible, and this study tries to do so by providing a very high-level view of computer languages.  Even though it is not particularly about finding bugs or testing, it does give some insights into bugs and their relationship to languages.
* Pun intended
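
If the two statistics named above are unfamiliar, a small sketch may take some of the mystery out of them.  The Chi-Square test asks whether the counts in a table differ from what you would expect if rows and columns were unrelated, and Cramér's V rescales that statistic to a 0-to-1 measure of how strong the association is.  The counts below are made up purely for illustration (two hypothetical languages, bug-fix commits versus other commits); they are not from the paper, and this is not necessarily how the authors computed their numbers.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(table):
    """Cramér's V: chi-square rescaled to the range 0 (none) to 1 (perfect)."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))  # smaller of row count and column count
    return math.sqrt(chi_square(table) / (n * (k - 1)))


# Hypothetical counts, invented for this example only.
# Rows: language A, language B.  Columns: bug-fix commits, other commits.
commits = [
    [30, 70],
    [20, 80],
]

print(round(chi_square(commits), 3))
print(round(cramers_v(commits), 3))
```

With these invented counts the Chi-Square statistic comes out near 2.67 and Cramér's V near 0.12, which would be read as a weak association.  That is the same flavor of result as the paper's warning: an effect can exist and still be small.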

Let me start with a word of warning before we dive deep into the analysis of this white paper.
"One should not overestimate the impact of language on defects.  While these relationships [in the chart in the study] are statistically significant, the effects are quite small.... activity [commits] in a project accounts for the majority of explained deviance [difference in bug rates]... the next closest predictor which accounts for less than one percent of the total deviance, is language."
So while language is important, code churn is more important.  That, however, does not end the story.  There are still important pieces to pick up.  I am going to leave the definitions of what constitutes a particular category, such as a 'functional language', to the actual paper, which you can look up if it is of interest.  Perhaps you know the language you are working with, so a list of languages might be of interest.
"...a greater number of defect fixes... include C, C++, JavaScript, Objective-C, Php, Python. ... Clojure, Haskell, Ruby, Scala, and TypeScript... are less likely than average to result in defect fixing commits."
For most companies this is bad news.  Few people write Scala or Haskell on a day-to-day basis.  Worse yet, I wonder if the experience level of those who write in such languages is greater than average, something the study could not control for.  From what I could tell, they didn't even consider looking at the length of time a user's name was associated with a repository as a proxy for experience.  Since experience was not considered, perhaps the broader categories such as 'functional languages' are more helpful.
"...strongly typed languages are less error prone than average while weakly typed languages are more error prone than average."
"This is strong evidence that functional static languages are less error prone than functional dynamic languages..."
"Functional languages have a smaller relationship to defects than either procedural or scripting languages."
So if you are using a strongly typed language you are better off and if you are using a functional language you are better off.  Does it matter what sort of application you are working on?
"...there is only a weak relationship between domain and language class."
Oh, well I guess then it doesn't really matter what you do, it is how you use the language.  Are there classes of defects that matter?
"Language matters more for specific categories than it does for defects overall."
"In fact, we noticed 28.89% of all the memory errors in Java come from a memory leak.  In terms of effect size, language has a larger impact on memory defects than all other cause categories."
Interestingly, even though Java has managed memory, it seemed to suffer more from memory problems than most other memory-managed languages.  This could be because of the applications considered, such as Elasticsearch, a Java-based search engine, compared to SparkleShare, a C# file sync tool.  I couldn't find anything language related that seemed an obvious cause of Java's problems with memory compared to C#.  On the other hand, Java is way better than any of the unmanaged-memory languages.
"Go has a lot more concurrency bugs related to race conditions due to its race condition detect tool..."
"TypeScript is intended to be used as a static, strongly typed language... However in practice, we noticed developers often [50%]... use any type, a catch-all union type, thus makes TypeScript dynamic and weak."
In looking at what may be wrong with this study, I found it interesting that tool support may be a reason for differences between languages.  Basically, because a language is only as good as the people who use it and the tools around it, it could be the study gives a false sense of knowledge.  It could be that the use cases are so complex that this one study and method will not tell us much about languages and when to use them.  To be fair, the researchers noted other studies, with their own limitations and similarities.  Confirmatory studies are useful, but humanity is really complex and their study wasn't completely confirmatory.  This is a sort of social science meets well-recorded data.  Git has perfectly recorded data, but even with it, it is hard to know because there is so much data.  There was a 'study' done a few years ago regarding swearing and check-ins on GitHub.  While amusing, it also is one more piece of data.  Keeping in mind the warning I started with at the beginning of the article, these were the authors' conclusions:
"The data indicates functional languages are better than procedural languages; it suggests that strong typing is better than weak typing; that static typing is better than dynamic; and that managed memory usage is better than unmanaged. Further, that the defect proneness of languages in general is not associated with software domains. Also, languages are more related to individual bug categories than bugs overall."
For the average tester, is there something to learn here?  I think the primary thing is that code churn has more to do with defects than anything else.  Fighting over the tool to use after a project is started is probably not worth it, unless you are rebuilding from scratch.  Keep in mind what else comes along with the language and what value that gives you.  If race conditions matter, then Go is likely a winner.  Just because you have managed memory does not mean you can't have memory issues.  Functional languages like SQL tend to have fewer defects, possibly because state is less of an issue.  Finally, no matter what language you are using and no matter what you are doing, you will have bugs.  What else did you pick up from the white paper?

Wait, wait... don't drop the curtains yet.

In violation of the "end an article with a question to prompt action" rule, I want to pause and consider that perhaps this subject is not to your liking.  Well, why not read up on all the data browsers leak out?  It will give you a good feel for just how complicated the seemingly simple underlying HTTP connections are.  Maybe you want to learn how to model learning, in which case the Dreyfus Model of Skill Acquisition might be up your alley.  Or perhaps you want something more on testing, such as the pairwise testing white paper.  Or perhaps the famous session-based test management paper.  Perhaps none of the ideas I've suggested have struck your fancy.  Cem Kaner developed hundreds of links in his citations for one of his classes, many of which are white papers.  And that is to say nothing of RFCs (Requests for Comments), which help define the standards we all use.  There are even books like The Reflective Practitioner, which I am slowly reading through but which has been well analyzed on Wikipedia.

Which brings me to the problem with Wikipedia and blogs like mine that analyze content.  People like myself pull out the conclusions for you and so you skip reading the white paper.  Do you really want to leave your entire education up to those who form conclusions for you?  Let me be honest: most of you won't click half the links I provided and just as many won't seriously read the cited white paper.  It is not unusual for a large percentage of people not even to read this far, complaining "TL;DR".  "So what?", you, the wise reader who has made it this far, might ask.  In my book review of An Introduction to General Systems Thinking I tried to pull out some of my favorite little pieces, some flavour, which was not directly related to the review.  Some of Gerald Weinberg's wisdom will be lost if all you do is read a short summary of his ideas.  Now you say you don't have time, and wiki is better than nothing.  Maybe that is true.  I won't argue that you shouldn't use wiki and speed through some of your education.  Just make sure to occasionally spend time looking at a subject deeply to help deepen your thinking and understanding.  I am sure you might come up with more reasons why you shouldn't, and I won't try to out-reason you in your quest to not read deeply.  It is your choice.  I will say it might be hard to do that sort of deep reading, but it will be worth it.

So once again, we are back to the cheesy "end with a question".  Well, I won't burden you with that sort of guilt-driven "work harder" question at the end of this post.  Instead I will give you an offer and a statement.  If anyone leaves a comment with a white paper or RFC for me to read, I will read it and I might even report back on it.  Book suggestions are a dime a dozen and sadly even I have finite amounts of patience for books and a stack of books I want to read, but if you give a recommendation I will at least seriously consider it.  And now for the statement:  I want to gain a deeper understanding of the world, and I will share some of what I learn, but you must do your own investigating as well and I want to say...  Good luck with your own education!

2 comments:

  1. Hi JCD, there is one debatable question I didn't find addressed in the GitHub white paper: how a language's support for writing, for instance, unit tests quickly and effectively influences the number of defects. Let me clarify. In my experience, writing a good unit test in Python is simpler than in C++. So if a developer writes unit tests, it can improve code quality and decrease the number of defects. Sometimes a developer doesn't use unit testing just because it's difficult (we won't discuss his professionalism).
    What do you think?

    Replies
    1. I would agree that unit tests might have a relationship to defects, although I don't know if all languages support unit testing at the same level. That in and of itself could explain some of the variations between languages. They noted that tools such as Go's race condition detector do have an effect on defects found in the code. They also did not evaluate if these fixes are good or not. That is to say, we don't know if these were bugs found before or after the code went to a customer.

      TDD says you write the tests first, but I know many devs who write the tests afterwards. Depending on the deploy strategy, that can be a reasonable method, particularly with git, where you can have multiple commits but a single push. The problem with trying to measure unit tests using their methodology is that we simply don't know if a unit test caught a bug which the developer then fixed, which, using their methodology, would count as a bug. And since they did not look at the bug tracking system, we don't know if there is any relationship between checked-in bug fixes and customer-found bugs.

      The important insight I got from the paper is that the toolset behind the language matters more than the language does. If you happen to know a particular toolset and will use its features, that is the language you should use. I would go further and say that because activity (commits) matters the most for defect count, your toolset needs to have some way of running on commit or at least nightly. Finally, I would say your toolset should support some method for helping to mitigate code churn (lots of commits). Unit testing is a tool that does that. So to me, a language that supports unit testing, where you know how to write unit tests, is more important than the actual language feature set, unless you need a particular feature for your development.

      As an aside, Isaac and I once worked with a man who was so careful that for the two years I worked with him and the two years Isaac worked with him, neither of us found any code-oriented defects. We found bad assumptions, we found unclear requirements, but never a defect related to the code. He was primarily a SQL developer, so he didn’t do a lot of unit testing, but he did a massive amount of testing before he handed his code off. He wasn’t the fastest developer, but he was the most careful, and when you had something that had to be done right, he was the man to do it. I suspect that being careful and slow makes for fewer defects. Unit tests may be just a means to an end, but I haven’t seen any studies around that.