Tuesday, October 2, 2018

Cost of highly safety critical software

It's always interesting to see data on industry software costs. I recently came across a report on software costs for the aviation industry. The context was flight-critical radio communications, but the safety standards discussed were DO-178B and DO-254, which apply to flight controls as well.

Here's the most interesting picture from the report for my purposes:


(Source: Page 28 https://www.eurocontrol.int/sites/default/files/content/documents/communications/29012009-certification-cost-estimation-for-fci-platform.pdf.pdf )

Translating from DO-178B terminology, this means:

  • DAL A (failure would be "catastrophic"): 3-12 SLOC/day
  • DAL B (failure would be "hazardous"): 8-20 SLOC/day
  • DAL C (failure would be "major"): 15-40 SLOC/day
  • DAL D (failure would be "minor"): 25-64 SLOC/day
Worth noting: in my experience, really solid mission-critical (but NOT life-critical) embedded software can be produced at up to about 16 SLOC per day by well-run, experienced teams, which lines up with the DAL B numbers.


For interpretation, "DAL" expresses a criticality level (a "Development Assurance Level"), with more critical software requiring more rigorous processes.  The document has quite a lot to say about how the engineering process works, and it is worth a read if you want to see how the aviation folks do business.  (I'm aware that DO-178C is out, but this paper discusses the older "B" version.)  Note that the paper contains other cost models that are less pessimistic, but the one above is the one labeled "industry experience."
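
To get a feel for what those productivity numbers imply in effort terms, here is a back-of-the-envelope sketch. This is my own illustration rather than anything from the report, and the 50 KSLOC component size is hypothetical; the idea is simply that person-days = SLOC divided by SLOC/day, so the same component costs roughly 4,200-16,700 person-days at DAL A but only about 800-2,000 person-days at DAL D.

#include <stdio.h>

/* Back-of-the-envelope effort estimate: person-days = SLOC / (SLOC per person-day).
   Productivity ranges are the DO-178B "industry experience" numbers quoted above. */
typedef struct {
    const char *dal;
    double sloc_per_day_low;   /* pessimistic productivity */
    double sloc_per_day_high;  /* optimistic productivity  */
} productivity_t;

int main(void)
{
    const productivity_t levels[] = {
        { "DAL A",  3.0, 12.0 },
        { "DAL B",  8.0, 20.0 },
        { "DAL C", 15.0, 40.0 },
        { "DAL D", 25.0, 64.0 },
    };
    const double project_sloc = 50000.0;  /* hypothetical component size */

    for (int i = 0; i < 4; i++) {
        printf("%s: %6.0f to %6.0f person-days for %.0f SLOC\n",
               levels[i].dal,
               project_sloc / levels[i].sloc_per_day_high,  /* best case  */
               project_sloc / levels[i].sloc_per_day_low,   /* worst case */
               project_sloc);
    }
    return 0;
}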

Have you found other software cost data for embedded or mission-critical systems?

Saturday, September 8, 2018

Different types of risk analysis: ALARP, GAMAB, MEM, and more

When we talk about how much risk is acceptable, it is common to compare the risk to that of current systems, or to argue about whether something is more (or less) likely than events such as being killed by lightning. There are established ways to think about this topic, each with tradeoffs.

[Image: tightrope walker]

The next time you need to think about how much risk is appropriate in a safety-critical system, try these existing approaches on for size instead of making up something on your own:

ALARP: "As Low As Reasonably Practicable"  Some risks are acceptable. Some are unacceptable. In between are risks worth taking in exchange for a benefit, but only after they have been reduced as low as is reasonably practicable.

GAMAB: "Globalement Au Moins Aussi Bon"  (Globally at least as good)  Offer a level of risk at least as good as the risk offered by an equivalent existing system (i.e., no more dangerous than what we already have for a similar function).

MEM: "Minimum Endogenous Mortality"  The technical system must not add significant risk compared to the risks people already face; it should cause at most a minimal increase in overall death rates compared to the existing population death rate. (A rough numeric sketch follows the list of terms below.)

MGS: "Mindestens Gleiche Sicherheit"   (At least the same level of safety) Deviations from accepted practices must be supported by an explicit safety argument showing at least the same level of safety. This is more about waivers than whole-system evaluation.

NMAU: "Nicht Mehr Als Unvermeidbar"  (Not more than unavoidable)  Assuming there is a public benefit to the operation of the system, hazards should be avoided through reasonable safety measures implemented at reasonable cost.
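
For MEM in particular, the argument is usually made numerically. As a rough sketch of what that kind of check looks like, here is a toy example. The specific numbers are commonly cited values rather than anything from Kron's paper: a minimum endogenous mortality on the order of 2e-4 fatalities per person-year, with any single new technical system allocated only a small fraction of that, often around 1e-5 per person-year. Both constants and the function below are illustrative assumptions, not a standard API.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative MEM-style figures (assumptions, not normative values):
   minimum endogenous mortality ~2e-4 fatalities/person-year, with any one
   new technical system allowed only a small fraction of that (~1e-5). */
static const double kMinEndogenousMortality = 2.0e-4;  /* fatalities per person-year */
static const double kPerSystemAllocation    = 1.0e-5;  /* fatalities per person-year */

/* True if the estimated individual risk added by the system stays within
   the illustrative per-system allocation. */
static bool mem_style_check(double added_risk_per_person_year)
{
    return (added_risk_per_person_year <= kPerSystemAllocation);
}

int main(void)
{
    const double estimated_risk = 4.0e-6;  /* hypothetical added risk from the new system */

    printf("Added risk %.1e vs. allocation %.1e (%.1f%% of the endogenous floor): %s\n",
           estimated_risk, kPerSystemAllocation,
           100.0 * estimated_risk / kMinEndogenousMortality,
           mem_style_check(estimated_risk) ? "acceptable under this sketch"
                                           : "needs further risk reduction");
    return 0;
}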

Each of these approaches has pros and cons.  The above terms were paraphrased from this nice discussion:
Kron, On the evaluation of risk acceptance principles,
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.455.4506&rep=rep1&type=pdf

There is an interesting set of slides that covers similar ground and works some examples here. In particular, the graphs showing whether risks are taken voluntarily in different scenarios are thought provoking:
http://agse3.informatik.uni-kl.de/teaching/suze/ws2014/material/folien/SRES_03_Risk_Acceptance.pdf

In general, if you want to dig deeper into this area, a search on
    gamab mem alarp 
will bring up a number of hits.

Also note that legal and other types of considerations exist, especially regarding product liability.

Monday, March 12, 2018

Embedded Code Quality and Best Practices Training Videos (Full Length)

I've posted the full series of my available embedded system code quality and related best practices videos on YouTube.  These are full-length narrated slides of the core set of safety topics from my new course.  They concentrate on getting the big picture about code quality and good programming practices.
Each lecture is posted to YouTube as a playlist, with each video in the playlist covering a slide or two. Playing the entire playlist gives the full lecture, with most lectures running 5-7 videos in sequence. (The slide downloads have been updated for my CMU grad course, so in general they have a little more material than the original videos. They'll get synchronized eventually, but for now this is what I have.)

Obviously there is more to code quality and safety than just these topics. Additional topics are available as slides only.  You can see the full set of course slides, including those lectures and others, here:
  https://users.ece.cmu.edu/~koopman/lectures/index.html#642

Sunday, February 25, 2018

New Blog on Self-Driving Car Safety

I'm doing a lot more work on self-driving car (autonomous vehicle) safety, so I've decided to split that work off into a separate blog.  I'll still post more general embedded system topics here, perhaps with reduced frequency.

You can see my new blog on self-driving car safety here:
    https://safeautonomy.blogspot.com

Just to keep perspective, self-driving cars are still very complex embedded systems. You need to get the basics right (this blog) if you want them to be safe!

Friday, February 16, 2018

Robustness Testing of Autonomy Software (ASTAA Paper Published)

I'm very pleased that our research team will present a paper on Robustness Testing of Autonomy Software at the ICSE Software Engineering in Practice session in late May. You can see a preprint of the paper here:  https://goo.gl/Pkqxy6

The work summarizes what we've learned across several years of research stress testing many robots, including self-driving cars.

ABSTRACT
As robotic and autonomy systems become progressively more present in industrial and human-interactive applications, it is increasingly critical for them to behave safely in the presence of unexpected inputs. While robustness testing for traditional software systems is long-studied, robustness testing for autonomy systems is relatively uncharted territory. In our role as engineers, testers, and researchers we have observed that autonomy systems are importantly different from traditional systems, requiring novel approaches to effectively test them. We present Automated Stress Testing for Autonomy Architectures (ASTAA), a system that effectively, automatically robustness tests autonomy systems by building on classic principles, with important innovations to support this new domain. Over five years, we have used ASTAA to test 17 real-world autonomy systems, robots, and robotics-oriented libraries, across commercial and academic applications, discovering hundreds of bugs. We outline the ASTAA approach and analyze more than 150 bugs we found in real systems. We discuss what we discovered about testing autonomy systems, specifically focusing on how doing so differs from and is similar to traditional software robustness testing and other high-level lessons.
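
(The paper itself is the place to go for ASTAA. Purely to give a flavor of what "robustness testing with unexpected inputs" means in concrete terms, here is a toy C sketch in the spirit of classic dictionary-based robustness testing. The function under test, its signature, and the stand-in implementation are all hypothetical illustrations, not anything from the paper.)

#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in "function under test" so the sketch is self-contained; a real
   harness would exercise production code instead. Returns a nonnegative
   speed command, or a negative value to signal rejected input. */
static double compute_speed_command(const double *ranges, size_t count, double speed_limit)
{
    if ((ranges == NULL) || (count == 0u) || (!isfinite(speed_limit)) || (speed_limit <= 0.0)) {
        return -1.0;  /* reject exceptional input */
    }
    double nearest = ranges[0];
    for (size_t i = 1u; i < count; i++) {
        if (ranges[i] < nearest) { nearest = ranges[i]; }
    }
    return (nearest < 1.0) ? 0.0 : speed_limit;  /* toy behavior: stop when an obstacle is close */
}

int main(void)
{
    /* Robustness-testing idea: feed exceptional values at the interface and check
       that the code fails cleanly instead of crashing or returning garbage. */
    double readings[2] = { 1.5, 2.0 };
    const double bad_limits[] = { -1.0, 0.0, NAN, INFINITY };
    int failures = 0;

    if (compute_speed_command(NULL, 2u, 5.0) >= 0.0)     { failures++; }  /* NULL pointer */
    if (compute_speed_command(readings, 0u, 5.0) >= 0.0) { failures++; }  /* empty input  */
    for (size_t i = 0u; i < (sizeof(bad_limits) / sizeof(bad_limits[0])); i++) {
        if (compute_speed_command(readings, 2u, bad_limits[i]) >= 0.0) {
            failures++;                                                   /* exceptional limit */
        }
    }

    printf("%d robustness test failures\n", failures);
    return (failures == 0) ? 0 : 1;
}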

Authors:
Casidhe Hutchison
Milda Zizyte
Patrick Lanigan
David Guttendorf
Mike Wagner
Claire Le Goues
Philip Koopman


Monday, January 29, 2018

New Peer Review Checklist for Embedded C Code

Here's a new peer review checklist to help improve the quality of your embedded C code.

To use the checklist, you should do a sit-down meeting with, ideally, three reviewers not including the code author. Divide the checklist up into three portions as indicated.  Be sure to run decent static analysis before the review to save reviewer time -- let the tools find the easy stuff before spending human time on the review.

After an initial orientation to what the code is supposed to do and relevant background, the review process is:
  1. The review leader picks the next few lines of code to be reviewed and makes sure everyone is ONLY focused on those few lines.  Usually this is 5-10 lines encompassing a conditional structure, a basic block, or other generally unified small chunk within the code.
  2. Reviewers identify any code problems relevant to their part of the checklist.  It's OK if they notice others, but they should focus on individually considering each item in their part of the checklist and ask "do I see a violation of this item" in just the small chunk of code being considered.
  3. Reviewer comments should be recorded in the form: "Line X seems to violate Checklist Item Y for the following reason." Do NOT suggest a fix -- just record the issue.
  4. When all comments have been recorded, go back to step 1.  Continue to review up to a maximum of 2 hours. You should be covering about 100-200 lines of code per hour. Too fast and too slow are both a problem.
A text version of the checklist is below. You can also download an Acrobat (PDF) version here.  A small illustrative code snippet and additional pointers to support materials follow the checklist. If you have a static analysis tool that automates any of the checklist items, feel free to replace that item with something else that's important to you.

===============================================================
Peer Review Checklist: Embedded C Code
       
Before Review:
0    _____    Code compiles clean with extensive warning checks (e.g. MISRA C rules)
       
Reviewer #1:       
1    _____    Commenting:  top of file, start of function, code that needs an explanation
2    _____    Style is consistent and follows style guidelines
3    _____    Proper modularity, module size, use of .h files and #includes
4    _____    No orphans (redundant, dead, commented out, unused code & variables)
5    _____    Conditional expressions evaluate to a boolean value; no assignments
6    _____    Parentheses used to avoid operator precedence confusion
7    _____    All switch statements have a default clause; preferably an error trap
       
Reviewer #2:       
8    _____    Single point of exit from each function
9    _____    Loop entry and exit conditions correct; minimum continue/break complexity
10    _____    Conditionals should be minimally nested (generally only one or two deep)
11    _____    All functions can be unit tested; SCC or SF complexity less than 10 to 15
12    _____    Use const and inline instead of #define; minimize conditional compilation
13    _____    Avoid use of magic numbers (constant values embedded in code)
14    _____    Use strong typing (includes: sized types, structs for coupled data, const)
15    _____    Variables have well chosen names and are initialized at definition
       
Reviewer #3:       
16    _____    Minimum scope for all functions and variables; essentially no globals
17    _____    Concurrency issues? (locking, volatile keyword, minimize blocking time)
18    _____    Input parameter checking is done (style, completeness)
19    _____    Error handling for function returns is appropriate
20    _____    Null pointers, division by zero, null strings, boundary conditions handled
21    _____    Floating point use is OK (equality, NaN, INF, roundoff); use of fixed point
22    _____    Buffer overflow safety (bound checking, avoid unsafe string operations)
       
All Reviewers:     
23    _____    Does the code match the detailed design (correct functionality)?
24    _____    Is the code as simple, obvious, and easy to review as possible?
       
        For TWO Reviewers assign items:   Reviewer#1:  1-11; 23-24    Reviewer#2: 12-24
        Items that are covered with static analysis can be removed from checklist
        Template 1/28/2018:  Copyright CC BY 4.0, 2018, Philip Koopman
===============================================================
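
To make a few of the items above concrete, here is a small hypothetical snippet (my own illustration, not part of the checklist template) showing what passing items 5, 7, 8, 12, 13, and 18 can look like:

#include <stdint.h>

/* Items 12/13: named constant via const rather than a #define magic number */
static const uint16_t kMaxRpm = 6000u;

typedef enum { MOTOR_OK = 0, MOTOR_ERR_RANGE, MOTOR_ERR_MODE } motor_status_t;

/* Items 8 and 18: input parameters are checked, and there is a single exit point */
motor_status_t set_motor_speed(uint16_t rpm, uint8_t mode)
{
    motor_status_t status = MOTOR_OK;

    if (rpm > kMaxRpm) {         /* item 5: condition is a boolean test, no assignment */
        status = MOTOR_ERR_RANGE;
    } else {
        switch (mode) {
            case 0u:
                /* normal mode: command rpm here ... */
                break;
            case 1u:
                /* reduced-power mode: command limited rpm here ... */
                break;
            default:             /* item 7: default clause acts as an error trap */
                status = MOTOR_ERR_MODE;
                break;
        }
    }

    return status;               /* item 8: single point of exit */
}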

Additional material to help you with successful peer reviews:

