Monday, October 9, 2017

Top Five Embedded Software Management Misconceptions

Here are five common management-level misconceptions I run into when I do design reviews of embedded systems. How many of these have you seen recently?

(1) Getting to compiled code quickly indicates progress. (FALSE!)

Many projects are judged by "coding completed" to indicate progress.  Once the code has been written, compiles, and kind of runs for a few minutes without crashing, management figures that they are 90% there.  In reality, a variant of the 90/90 rule holds:  the first 90% of the project is in coding, and the second 90% is in debugging.

Measuring teams on code completion pressures them to skip design and peer reviews, ending up with buggy code. Take the time to do it right up front, and you'll more than make up for those "delays" with fewer problems later in the development cycle.  Rather than measure "code completed," do something more useful, like measuring the fraction of modules with "peer review completed" (and defects found in peer review corrected).  There are many reasonable ways to manage, but running a waterfall-ish project that treats "code completed" as the most critical milestone is not one of them.
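As a rough illustration of the "peer review completed" measure, here is a minimal Python sketch. The `ModuleStatus` record, its field names, and the module names are all hypothetical; the only idea taken from the text is that a module counts as done when its review has happened and the defects found in that review have been corrected.

```python
from dataclasses import dataclass


@dataclass
class ModuleStatus:
    # Hypothetical tracking record; field names are assumptions for illustration.
    name: str
    peer_reviewed: bool        # has the peer review been held?
    review_defects_open: int   # defects found in review that are not yet corrected


def review_complete_fraction(modules):
    """Fraction of modules that are reviewed AND have no open review defects."""
    if not modules:
        return 0.0
    done = sum(
        1 for m in modules
        if m.peer_reviewed and m.review_defects_open == 0
    )
    return done / len(modules)


# Made-up project snapshot.
modules = [
    ModuleStatus("adc_driver", True, 0),
    ModuleStatus("motor_control", True, 3),   # reviewed, but fixes still pending
    ModuleStatus("comms", False, 0),          # not reviewed yet
    ModuleStatus("watchdog", True, 0),
]

print(f"peer review completed: {review_complete_fraction(modules):.0%}")
```

Note that a module with open review defects does not count, which keeps the number from being gamed by holding reviews and ignoring the findings.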

(2) Smart developers can write production-quality code on a long weekend (FALSE!)

Alternate form: marketing sets both requirements and end date without engineering getting a chance to spend enough time on a preliminary design to figure out if it can actually be done.

The true bit is anyone can slap together some code that doesn't work.  Some folks can slap together code in a long weekend that almost works.  But even the best of us can only push so many lines of code in a short amount of time without making mistakes, much less producing something anyone else can understand.  Many of us remember putting together hundreds or thousands of lines on an all-nighter when we were students. That should not be mistaken for writing production embedded code.

Good embedded code tends to cost about an hour for every 1 or 2 lines of non-comment code all-in, including testing (on a really good day 3 lines/hr).  Some teams come from the Lake Wobegon school, where all the programmers are above average.  (Is that really true for your team?  Really?  Good for you!  But you still have to pay attention to the other four items on this list.)  And sure, you can game this metric if you try. Nonetheless, it is remarkable how often I see a number well above about 2 SLOC/hour of deeply embedded code corresponding to a project that is in trouble.
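To make the arithmetic concrete, here is a small Python sketch of the productivity check. The function names and the example project sizes are made up; the 1 to 2 SLOC/hour range is the rule of thumb stated above, not a precise model.

```python
def sloc_per_hour(non_comment_sloc, total_hours):
    """All-in productivity: non-comment lines divided by total project hours."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return non_comment_sloc / total_hours


def effort_range_hours(non_comment_sloc, low_rate=1.0, high_rate=2.0):
    """(optimistic, pessimistic) all-in hours for the given SLOC/hour range."""
    return non_comment_sloc / high_rate, non_comment_sloc / low_rate


# Hypothetical project: 10,000 non-comment lines of embedded code.
best_case, worst_case = effort_range_hours(10_000)
print(f"expected effort: {best_case:.0f} to {worst_case:.0f} hours")

# Hypothetical claim: 12,000 lines delivered in 2,000 total hours.
rate = sloc_per_hour(12_000, 2_000)
if rate > 2.0:
    print(f"{rate:.1f} SLOC/hour is well above 2: worth a closer look")
```

The hours must be all-in (design, reviews, testing, rework) for the comparison to mean anything; counting only coding time is one easy way to game the metric.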

Regardless of the precise productivity number, if you want your system to really work, you need to treat software development as a core competency.  You need an appropriately methodical and rigorous engineering process. Slapping together code quickly gives the illusion of progress, but it doesn't produce reliable products for full-scale production.

(3) A “mostly working,” undisciplined prototype can be deployed.  (FALSE!)

Quick and dirty prototypes provide value by giving stakeholders an idea of what to expect and allowing iterations to converge on the right product. They are invaluable for solidifying nebulous requirements. However, such a prototype should not be mistaken for an actual product!  If you've hacked together a prototype, in my experience it's always more expensive to clean up the mess than it is to take a step back and start the project over, either from scratch or from a stable production code base.

What the prototype gives you is a solid sense of requirements and some insight into pitfalls in design.

A well-executed incremental deployment strategy can be a compromise to iteratively add functionality if you don't know all your requirements up front. But a well-run Agile project is not what I'm talking about when I say "undisciplined prototype." A cool proof of concept can be very valuable.  It should not be mistaken for production code.

(4) Testing improves software quality (FALSE!)

If there are code quality problems (possibly caused by trying to bring an undisciplined prototype to market), the usual hammer that is brought to bear is more testing.  Nobody ever solved code quality problems by testing. All that testing does is make buggy code a little less buggy. If you've got spaghetti code that is full of bugs, testing can't possibly fix that. And testing will generally miss most subtle timing bugs and non-obvious edge cases.

If you're seeing lots of bugs in system test, your best bet is to use testing to find bug farms. The 90/10 rule applies: many times 90% of the bugs are in bug farms -- the worst 10% of the modules. That's only an approximate ratio, but regardless of the exact number, if you're seeing a lot of system test failures then there is a good chance some modules are especially bug-prone.  Generally the problem is not simply programming errors, but rather poor design of these bug-prone modules that makes bugs inevitable. When you identify a bug farm, throw the offending module away, redesign it clean, and write the code from scratch. It's tempting to think that each bug is the last one, but after you've found more than a handful of bugs in a module, who are you kidding? Especially if it's spaghetti code, bug farms will always be one bug away from being done, and you'll never get out of system test cleanly.
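One way to look for bug farms is to count system test bugs per module and see how much of the total lands in the worst 10% of modules. The Python sketch below assumes each bug report has already been attributed to a single module; the function name, module names, and bug counts are invented for illustration.

```python
from collections import Counter


def find_bug_farms(bug_modules, all_modules, worst_fraction=0.10):
    """Return (worst modules, their share of all bugs).

    bug_modules: one module name per bug found in system test.
    all_modules: every module in the system, including bug-free ones.
    """
    counts = Counter(bug_modules)
    # Rank modules from most to fewest bugs (missing modules count as 0).
    ranked = sorted(all_modules, key=lambda m: counts[m], reverse=True)
    # Size of the "worst" group: nearest whole number, but at least one module.
    n_worst = max(1, round(len(all_modules) * worst_fraction))
    worst = ranked[:n_worst]
    total = sum(counts[m] for m in all_modules)
    share = sum(counts[m] for m in worst) / total if total else 0.0
    return worst, share


# Made-up data: ten modules, twenty bugs from system test.
all_modules = ["sched", "uart", "adc", "pwm", "can",
               "flash", "ui", "timer", "spi", "i2c"]
bug_modules = ["sched"] * 18 + ["uart", "can"]

worst, share = find_bug_farms(bug_modules, all_modules)
print(f"worst modules: {worst}, share of bugs: {share:.0%}")
```

If the share comes out high, that points at candidates for a clean redesign rather than another round of bug-by-bug fixes. The ratio is only approximate in practice, so treat the output as a prompt to look at the module's design, not as a verdict.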

(5) Peer review is too expensive (FALSE!)

Many, many projects skip peer review to get to completed code (see item #1 above). They feel that they just don't have time to do peer reviews. However, good peer reviews are going to find 50-75% of your bugs before you ever get to testing, and do so for about 10% of your development budget.  How can you not afford peer reviews?   (Answer: you don't have time to do peer reviews because you're too busy writing bugs!)

Have you run into another management misconception on a par with these? Let me know what you think!

4 comments:

  1. Love the alternative form of #2. I can't count the number of times I've come up against that one.

  2. One I've been coming across more and more lately, especially in industries that are undergoing consolidation: there's a fundamental assumption that two (or more) disjoint firmware sets from different products can be merged with little effort. Because of course both products were designed and developed with good design and planning and testing, right? Otherwise we wouldn't be shipping them, right? So how much trouble could it be merging the two feature sets, or moving one product's features to the hardware of the other product?

  3. "Peer review is too expensive" - I've been through this lots of times.

  4. Although I have to agree with the statement "Nobody ever solved code quality problems by testing", I totally disagree with the conclusion that testing does not improve software quality. A rigorous, disciplined incorporation of unit testing, acceptance (end-to-end) testing, and regression testing within the context of a good CI/CD SDLC environment will absolutely, definitely, 100%, improve the quality of the software. How and why? Because bugs that are unwittingly programmed will trigger testing failures. The more tests there are that are automated, the better the software will become. We try to get a quality mark of 80% unit test coverage (which is easier said than done). Whenever a flaw is found in the real world on our software, not only do we need to fix the bug, we ALSO need to ADD a new unit test (and perhaps update our acceptance tests and regression tests) so that the use case becomes part of the normal testing cycle.
    I cannot imagine going back to the bad old days of coding without testing.


