Thursday, December 2, 2010

Embedded Software Risk Areas -- An Industry Study

I've had the opportunity to do many design reviews of real embedded software projects over the past decade or so.  About 95 reviews since 1996. For each review I usually had access to the project's source code and design documentation.  And in most cases I got to spend a day with the designers in person. The point of the reviews was usually identifying risk areas that they should address before the product went into production. Sometimes the reviews were post mortems -- trying to find out what caused a project failure so it could be fixed. And sometimes the reviews were more narrow (for example, just look at security or safety issues for a project). But in most cases I (sometimes with a co-reviewer) found one or more "red flag" issues that really needed to be addressed.

In other postings I'll summarize the red flag issues I found from all those reviews. Perhaps surprisingly, even engineers with no formal training in embedded systems tend to get the basics right. The books that are out there are good enough for a well trained non-computer engineer to pick up what they need to get basic functionality right most of the time. Where they have problems are in the areas of complex system integration (for example, real time scheduling) and  software process. I'm a hard-core lone cowboy techie at heart, and process is something I've learned gradually over the years as that sort of thing proved to be a problem for projects I observed. Well, based on a decade of design reviews I'm here to tell you that process and a solid design methodology matters. A lot. Even for small projects and small teams. Even for individuals. Details to follow in upcoming posts.

I'm giving a keynote talk at an embedded system education workshop at the end of October. But for non-academics, you'd probably just like me to summarize what I found:


(The green bar means it is things most embedded system experts think are the usual problems -- they were still red flags!) In other words, of all the red flag issues identified in these reviews, only about 1/3 were technical issues. The rest were either pieces missing from the software process or things that weren't being written down well enough.

Before you dismiss me as just another process guy who wants you to generate lots of useless paper, consider these points:
  • I absolutely hate useless paper. Seriously!  I'm a techie, not a process guy. So I got dragged kicking and screaming to the conclusion that more process and more paper help (sometimes).
  • These were not audits or best practice reviews. What I mean by that is we did NOT say "the rule book says you have to have paper of type X, and it is missing, so you get a red flag." What we DID say, probably most of the time, was "this paper is missing, but it's not going to kill you -- so not a red flag."  Only once in a while was a process or paperwork problem such high risk that it got a red flag.
  • Most reviews had only a handful of red flags, and every one was a time bomb waiting to explode. Most of the time bombs were in process and paperwork, not technology.
In other postings (click here to see the list as it grows) I'll summarize the 43 areas that had red flags (yes, including the technology red flags). They'll be organized, loosely, by phase in the design cycle. No doubt you have seen some project-killer risks that aren't on the lists, and I welcome brief war story postings about them.

No comments:

Post a Comment

Please send me your comments. I read all of them, and I appreciate them. To control spam I manually approve comments before they show up. It might take a while to respond. I appreciate generic "I like this post" comments, but I don't publish non-substantive comments like that.

If you prefer, or want a personal response, you can send e-mail to comments@koopman.us.
If you want a personal response please make sure to include your e-mail reply address. Thanks!