Monday, January 24, 2011

Peer Reviews and the 50/50 Rule

Surprisingly, there is a fairly easy way to know if your peer reviews are effective. They should be finding about 50% of your defects, and testing should be finding the other 50%. In other words, the 50/50 rule for peer reviews is that they should find half of the defects you find before shipping the product. If peer reviews aren't finding that many defects, something is wrong.

I base this rule of thumb on some study data and my observations of a number of real projects. If you want to find out whether teams are really doing peer reviews (and doing ones that are effective), just ask what fraction of defects are found in peer reviews. If the answer is in the 40%-60% range, they're probably doing well. If the answer is below 40%, peer reviews are being skipped or are being done ineffectively. If the answer is "we do them but don't log them," then most of the time they are being done ineffectively, but you need to dig deeper to find out what is going on.
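
As a quick illustration of how a team might check itself against this rule, here is a minimal sketch. The defect-log format, field names, and record contents are made up for the example (not something from the post); the only real inputs needed are counts of pre-ship defects by where they were found.

```python
# Hypothetical sketch: check the 50/50 rule against a simple defect log.
# Each record notes where the defect was found before shipping.
defect_log = [
    {"id": 101, "found_in": "peer_review"},
    {"id": 102, "found_in": "test"},
    {"id": 103, "found_in": "peer_review"},
    {"id": 104, "found_in": "test"},
]

total = len(defect_log)
review_count = sum(1 for d in defect_log if d["found_in"] == "peer_review")
review_fraction = review_count / total if total else 0.0

print(f"Peer review found {review_fraction:.0%} of pre-ship defects")
if 0.40 <= review_fraction <= 0.60:
    print("Within the 40%-60% range -- peer reviews look effective.")
else:
    print("Outside that range -- dig deeper into how reviews are being done.")
```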

If you are trying to find all your defects in test (instead of letting peer review get half of them for you), you are taking some big risks. Testing is usually a more expensive way to find defects. More importantly, peer review tends to find many defects and poor design choices that are difficult to find in testing with any reasonable effort.

So, why make your testing expensive and your product more bug-prone? Try some peer reviews and see what they find.

2 comments:

  1. Interesting split, 50/50. Would you also analyze the type of defect being picked up? i.e., "The peer review picked up a couple of show-stoppers today, great review!" or "We picked up a bunch of coding violations today." (Still good, but effective?)

  2. Good question. It may not be easy to know which defects are important and which aren't. If I recall correctly, a big chunk of the US phone system went down at one point (years ago) due to a single conditional branch instruction with an inverted sense ("branch if" instead of "branch if not"; see the sketch after these comments for what that kind of defect looks like).

    But even if you could rank them, I'm not sure I would say logic defects are necessarily more important than coding violations. Coding violations indicate sloppy coding, and might be more indicative of pervasive problems than a single logic error. Sloppy coding practices are more likely to be found in peer review than in testing. So I'd expect peer review to find different things than testing does, and in large part that's the point of doing both.

    In other words, any defect can be just a random event or a symptom of an underlying problem, regardless of the severity of that particular defect. So I'd want to see some data before I said that the severity of defects found in peer review correlates with outcomes.

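For readers unfamiliar with the term, an "inverted sense" branch is one whose test is logically backwards. The function and variable names below are hypothetical, purely to illustrate the kind of one-character logic defect being described; this is not the actual code from that incident.

```python
def pick_trunk(primary_ok: bool) -> str:
    """Route a call: intended behavior is to use the primary trunk only when it is healthy."""
    # Inverted-sense bug: the test is backwards, so the failed primary is
    # chosen and a healthy primary is never used.
    if not primary_ok:  # BUG: should be "if primary_ok:"
        return "primary"
    return "fallback"
```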

