Monday, October 9, 2017

Top Five Embedded Software Management Misconceptions

Here are five common management-level misconceptions I run into when I do design reviews of embedded systems. How many of these have you seen recently?

(1) Getting to compiled code quickly indicates progress. (FALSE!)

Many projects are judged by "coding completed" to indicate progress.  Once the code has been written, compiles, and kind of runs for a few minutes without crashing, management figures that they are 90% there.  In reality, a variant of the 90/90 rule holds:  the first 90% of the project is in coding, and the second 90% is in debugging.

Measuring teams on code completion pressures them to skip design and peer reviews, ending up with buggy code. Take the time to do it right up front, and you'll more than make up for those "delays" with fewer problems later in the development cycle.  Rather than measure "code completed" do something more useful, like measure the fraction of modules with "peer review completed" (and defects found in peer review corrected).  There are many reasonable ways to manage, but waterfall-ish projects that treat "code completed" as the most critical milestone is not one of them.

(2) Smart developers can write production-quality code on a long weekend (FALSE!)

Alternate form: marketing sets both requirements and end date without engineering getting a chance to spend enough time on a preliminary design to figure out if it can actually be done.

The true bit is anyone can slap together some code that doesn't work.  Some folks can slap together code in a long weekend that almost works.  But even the best of us can only push so many lines of code in a short amount of time without making mistakes, much less producing something anyone else can understand.  Many of us remember putting together hundreds or thousands of lines on an all-nighter when we were students. That should not be mistaken for writing production embedded code.

Good embedded code tends to cost about an hour for every 1 or 2 lines of non-comment code all-in, including testing (on a really good day 3 lines/hr).  Some teams come from the Lake Wobegone school, where all the programmers are above average.  (Is that really true for your team?  Really?  Good for you!  But you still have to pay attention to the other four items on this list.)  And sure, you can game this metric if you try. Nonetheless, it is remarkable how often I see a number well above about 2 SLOC/hour of deeply embedded code corresponding to a project that is in trouble.

Regardless of the precise productivity number, if you want your system to really work, you need to treat software development as a core competency.  You need an appropriately methodical and rigorous engineering process. Slapping together code quickly gives the illusion of progress, but it doesn't produce reliable products for full-scale production.

(3) A “mostly working,” undisciplined prototype can be deployed.  (FALSE!)

Quick and dirty prototypes provide value by giving stakeholders an idea of what to expect and allowing iterations to converge on the right product. They are invaluable for solidifying nebulous requirements. However, such a prototype should not be mistaken for an actual product!   If you've hacked together a prototype, in my experience it's always more expensive to clean up the mess than it is to take a step back and start a project from scratch or a stable production code base.

What the prototype gives you is a solid sense of requirements and some insight into pitfalls in design.

A well executed incremental deployment strategy can be a compromise to iteratively add functionality if you don't know all your requirements up front. But an well-run Agile project is not what I'm talking about when I say "undisciplined prototype." A cool proof of concept can be very valuable.  It should not be mistaken for production code.

(4) Testing improves software quality (FALSE!)

If there are code quality problems (possibly caused by trying to bring an undisciplined prototype to market), the usual hammer that is brought to bear is more testing.  Nobody ever solved code quality problems by testing. All that testing does is make buggy code a little less buggy. If you've got spaghetti code that is full of bugs, testing can't possibly fix that. And testing will generally miss most subtle timing bugs and non-obvious edge cases.

If you're seeing lots of bugs in system test, your best bet is to use testing to find bug farms. The 90/10 rule applies: many times 90% of the bugs are in bug farms -- the worst 10% of the modules. That's only an approximate ratio, but regardless of the exact number, if you're seeing a lot of system test failures then there is a good chance some modules are especially bug-prone.  Generally the problem is not simply programming errors, but rather poor design of these bug-prone modules that makes bugs inevitable. When you identify a bug farm, throw the offending module away, redesign it clean, and write the code from scratch. It's tempting to think that each bug is the last one, but after you've found more than a handful of bugs in a module, who are you kidding? Especially if it's spaghetti code, bug farms will always be one bug away from being done, and you'll never get out of system test cleanly.

(5) Peer review is too expensive (FALSE!)

Many, many projects skip peer review to get to completed code (see item #1 above). They feel that they just don't have time to do peer reviews. However, good peer reviews are going to find 50-75% of your bugs before you ever get to testing, and do so for about 10% of your development budget.  How can you not afford peer reviews?   (Answer: you don't have time to do peer reviews because you're too busy writing bugs!)

Have you run into another management misconception on a par with these? Let me know what you think!

2 comments:

  1. Love the alternative form of #2. I can't count the number of times I've come up against that one.

    ReplyDelete
  2. One I've been coming across more and more lately, especially in industries that are undergoing consolidation: there's a fundamental assumption that two (or more) disjoint firmware sets from different products can be merged with little effort. Because of course both products were designed and developed with good design and planning and testing, right? Otherwise we wouldn't be shipping them, right? So how much trouble could it be merging the two feature sets, or moving one product's features to the hardware of the other product?

    ReplyDelete

Please send me your comments. I read all of them, and I appreciate them. To control spam I manually approve comments before they show up. It might take a while to respond. I appreciate generic "I like this post" comments, but I don't publish non-substantive comments like that.

If you prefer, or want a personal response, you can send e-mail to comments@koopman.us.
If you want a personal response please make sure to include your e-mail reply address. Thanks!