Sunday, November 27, 2011

Avoiding EEPROM corruption problems

EEPROM is invaluable for storing operating parameters, error codes, and other non-volatile data. But it can also be a source of problems if stored values get corrupted.  Here are some good practices for avoiding EEPROM problems and increasing EEPROM reliability.
  • Pay attention to error return codes when writing values, and make sure that the data you wanted to write actually gets written. Read it back after the write and check for a correct value.
  • It takes a while to write data to EEPROM. If your EEPROM supports a "finished" signal, use that instead of a fixed timeout that might or might not be long enough.
  • Don't access the EEPROM with marginal voltage. It is common for an external EEPROM chip to need a higher minimum voltage than your microcontroller. That means the microcontroller can be running happily while the EEPROM doesn't have enough voltage to operate properly, causing corrupted writes. In other words, you might have to set your brownout protection at a higher voltage than normal to protect EEPROM operation.
  • Have a plan for power loss during a write cycle. Will your hold-up cap keep things afloat long enough to finish the write? Do you re-initialize the EEPROM when the CPU powers up in case the EEPROM was left half-way through a write cycle when the CPU was reset?
  • Don't use address zero of the EEPROM. It is common for corruption problems to hit address zero, which can be the default address pointed to in the EEPROM if that chip is reset or otherwise has a problem during a write cycle.
  • Watch out for wearout. EEPROM can only be written a finite number of times. Make sure you stay well within the wearout rating for your EEPROM.
  • Consider using error coding such as CRC protection for critical data items or blocks of data items stored in the EEPROM. Make sure the CRC isn't re-written so often that it causes EEPROM wearout. (A sketch of a CRC-protected record follows this list.)
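Here is a minimal sketch of what a CRC-protected record with a read-back check might look like in C. The eeprom_write_byte() and eeprom_read_byte() routines are stand-ins for whatever your EEPROM driver provides, and the CRC-8 polynomial shown is just one workable choice; adapt both to your platform.

#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed platform-specific EEPROM primitives; substitute your driver's calls. */
extern void    eeprom_write_byte(uint16_t addr, uint8_t value);
extern uint8_t eeprom_read_byte(uint16_t addr);

/* Simple CRC-8 (polynomial 0x07); any standard CRC will do. */
static uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0xFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (uint8_t bit = 0; bit < 8; bit++) {
            crc = (crc & 0x80u) ? (uint8_t)((crc << 1) ^ 0x07u) : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

/* Write a record followed by its CRC, then read everything back to verify. */
bool eeprom_write_record(uint16_t addr, const uint8_t *data, size_t len)
{
    uint8_t crc = crc8(data, len);
    for (size_t i = 0; i < len; i++) {
        eeprom_write_byte(addr + (uint16_t)i, data[i]);
    }
    eeprom_write_byte(addr + (uint16_t)len, crc);

    for (size_t i = 0; i < len; i++) {
        if (eeprom_read_byte(addr + (uint16_t)i) != data[i]) { return false; }
    }
    return (eeprom_read_byte(addr + (uint16_t)len) == crc);
}

/* On power-up, reject the record if the stored CRC does not match. */
bool eeprom_read_record(uint16_t addr, uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        data[i] = eeprom_read_byte(addr + (uint16_t)i);
    }
    return (eeprom_read_byte(addr + (uint16_t)len) == crc8(data, len));
}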
If you've seen other problems or strategies that you'd like to share, let me know!

You might also be interested in my blog post on EEPROM and flash memory wearout.

Friday, November 4, 2011

Embedded System Code Review Checklist

This past summer I had the opportunity to work with my friend Gautam Khattak on several industry design reviews. By the time the dust had settled, we'd had a lot of time to think about patterns of common problems and the types of things that designers should be looking for in peer reviews for embedded system software.

As a result, we've written a two-page checklist for embedded system code reviews that you are free to use however you like.
The sections include: function, style, architecture, exception handling, timing, validation/test, and hardware. This is a pretty thorough starting point, but no doubt you will want to tailor these to your specific situation.

The checklists are broken into sections to make it easier to do perspective-based reviews. The idea is to divide up the sections among 3 or 4 reviewers so that each reviewer is thinking about somewhat different aspects of the code, resulting in better review coverage of potential issues.

Everyone has their own favorite thing to look for in a review. If you think we've missed one please let us know with a comment. One thing we have intentionally left out is system-level problems that are better done in a system-level review. These are primarily intended for module reviews that look at a couple hundred lines of code at a time.

Again, you can use these within your company or for any other purpose without having to ask our permission. But we'd be happy to hear from you if you've found them especially useful or if you have suggestions for things we've missed.

Friday, October 14, 2011

Top 16 Embedded Security Pitfalls

Every once in a while I'm asked to take a look at a scheme for ensuring that an embedded system is secure for a variety of reasons. Generally the issue isn't really secrecy so much as making sure that an embedded device is really what you think it is, or that a license fee has been paid for a service, or that some bad guy hasn't broken in to a system to issue bogus commands. These are hard problems for any system, but can be made especially difficult for embedded systems because severe cost constraints can discourage the use of bank-grade IT security solutions.

There is plenty out there on the web about security, so I'm not going to attempt a comprehensive tutorial here. (There is a chapter in my book that covers the basics as applied to embedded systems.) But one thing that can be difficult to find is a comprehensive list of high-level pitfalls for embedded security all in one compact list. So here it is. Ignore these hard-won lessons from the security community at your own peril.
  1. Security by obscurity never works against a serious adversary. Any sentence that contains the phrase "the bad guys will never figure out how" is false.
  2. Almost nobody is good enough to cook up their own cryptography. Use a good standard algorithm. If that costs too much, then realize you're just keeping out the clueless.
  3. Secret algorithms never remain secret. Any security that hinges upon the secrecy of an encryption algorithm is a bad idea. (Those sorts of systems are snake oil.)
  4. Tamper-proofing only raises the bar to steal the secrets inside a chip. Tamper-proofing is worth doing, but don't count on it as a definitive security strategy.
  5. Even if an algorithm is secret, it can be broken.
  6. Weak algorithms will be broken, and probably publicized for bragging rights.
  7. Even if an algorithm isn't broken, it might be subject to simple cloning or a replay attack that captures and repeats an encrypted message such as a door unlock command.
  8. Never use a back-door key, or manufacturer secret key. The information will get out. Place your faith only in a large, unique-per-device random secret key.
  9. Make sure you don't generate weak or non-random keys. All the above fallacies apply to the algorithm you use for generating random keys.
  10. Don't forget the possibility of a disgruntled employee or a rubber hose attack disclosing your algorithm or master key.
  11. Don't use encryption as your primary tool when what you really need is a secure hash, signature, or MAC for authentication. There are endless possibilities for getting it wrong if you use the wrong tool for the job. (A small authentication sketch follows this list.)
  12. Don't forget to have a plan for upgrading security or fixing bugs when they appear. Remember that update mechanisms can be attacked (for example, via a trojan horse update file).
  13. The system owner might be an attacker (heard of jailbreaking?).
  14. The attacker does not need to be sophisticated to break tough security (see "script kiddies").
  15. Figure out export control issues as you design your system, and adapt the design if required.
  16. Instances of poor security biting embedded system designers are ubiquitous. You just haven't heard most of them because nobody is allowed to talk about their own company's problems.
The point of the above list is the following: if you don't understand what a bullet is and why I'm saying it, you should probably dig deeper into security before you ship a product that needs to be secure. There are always exceptions, but they are rare. This is one of those cases where 80% of ordinary designers think they're the 1% exception case that is clever enough to get away with bending the rules -- and it's just not so.
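To make items 7 and 11 a little more concrete, below is a minimal sketch of nonce-based command authentication in C. The compute_mac() and get_random_nonce() routines are assumptions standing in for a vetted standard MAC (e.g., HMAC or CMAC from a real crypto library) and a good random source, per pitfalls 2 and 9; the point is the structure (MAC computed over the command plus a fresh challenge nonce, compared in constant time), not those particular function names.

#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define MAC_LEN   16u
#define NONCE_LEN 8u

/* Assumed to come from a vetted crypto library and a good entropy source. */
extern void compute_mac(const uint8_t *key, size_t key_len,
                        const uint8_t *msg, size_t msg_len,
                        uint8_t mac_out[MAC_LEN]);
extern void get_random_nonce(uint8_t nonce_out[NONCE_LEN]);

/* Receiver side: the nonce was freshly generated and sent as a challenge;
 * the command is only accepted if its MAC covers that exact nonce, so a
 * recorded message can't simply be replayed later. */
bool command_is_authentic(const uint8_t *key, size_t key_len,
                          const uint8_t *cmd, size_t cmd_len,
                          const uint8_t nonce[NONCE_LEN],
                          const uint8_t mac_received[MAC_LEN])
{
    uint8_t buf[64];
    uint8_t mac_expected[MAC_LEN];

    if (cmd_len > sizeof(buf) - NONCE_LEN) { return false; }

    memcpy(buf, cmd, cmd_len);
    memcpy(buf + cmd_len, nonce, NONCE_LEN);
    compute_mac(key, key_len, buf, cmd_len + NONCE_LEN, mac_expected);

    /* Constant-time compare so timing doesn't leak how many bytes matched. */
    uint8_t diff = 0u;
    for (size_t i = 0; i < MAC_LEN; i++) { diff |= (uint8_t)(mac_expected[i] ^ mac_received[i]); }
    return (diff == 0u);
}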


If you have suggestions for additional high-level pitfalls, I'd love to hear them.

Friday, September 30, 2011

The NOP Trick For Speed Profiling

An important principle in code optimization is to speed up the parts that matter. One interpretation of Amdahl's Law is that speedup is limited by the fraction of total time taken by a particular piece of code. If, for example, a particular snippet of code is 10% of overall execution time, then making it execute infinitely fast is only going to give you a 10% speedup. 10% might be worthwhile, but if instead a snippet of code only takes 0.001% of total execution time, then it is unlikely that spending time optimizing it is worth your while. In other words, there is no point wasting hours optimizing parts of the code that won't make any difference to overall execution speed.

But, how do you know what parts of the code matter? Intuition can help sometimes, but in practice it is hard to know your code well enough that you always guess the bottleneck locations correctly.

If you have a tool-rich development environment you can use a profiling tool (Wikipedia has a description and a list of common tools). But sometimes you are on a platform that is too small to use these tools, or you're in too much of a hurry to use them because you are just absolutely sure you are right about where all the time is being spent.

Here is a trick that can save you a lot of time doing optimization. I use it pretty much every time I'm going to optimize code as a sanity check (even if a tool tells me where to optimize, because tools aren't perfect and I've learned to be careful before investing valuable time optimizing). The trick is: insert some NOPs in the code you want to optimize and see how much slower it goes. If you don't see a speed difference, then you are wasting your time optimizing that part of the code to make it go faster. If you do see a speed difference, that difference can help you estimate how much speedup you are going to get if you do the optimization.

Here's some example C code before optimization:
  for(int i = 0; i < 1000; i++)
  { x[i] += y[i];
  }

Depending on your compiler it might well be that you can optimize this by switching to pointers instead of indexed arrays. Assuming that you've decided it is worth the trouble to optimize your code instead of buying a faster processor (that's a whole other discussion!) then probably this code can be made faster. But, if this code is only called once in a while and is only a small part of your overall program it might not give enough speedup to be worth optimizing.
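For reference, a pointer-based rewrite of that loop might look something like the sketch below (assuming x and y are int arrays); whether it is actually faster depends entirely on your compiler and target, which is exactly why the measurement trick that follows is worth doing first.

  int *px = x;   /* pointer versions of the indexed accesses above */
  int *py = y;
  for(int i = 0; i < 1000; i++)
  { *px++ += *py++;
  }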

Let's say you think you can squeeze 5 clock cycles per loop out of this code with a rewrite. Before you optimize it, use a timer (or an oscilloscope watching an output pulse on an I/O pin, or a stopwatch) to run the code as-is, and then time the following code to see if there is a speed difference:

  for(int i = 0; i < 1000; i++)
  { x[i] += y[i];
    asm("nop");
    asm("nop");
    asm("nop");
    asm("nop");
    asm("nop");
  }

This inserts 5 NOP do-nothing instructions that are executed on every pass through the loop. Assuming a NOP instruction takes 1 clock, this slows each loop iteration down by 5 clock cycles.

If the whole program with the NOPs in the loop runs 10% slower, then you know that optimizing the loop to save 5 clocks will likely cause the whole program to run about 10% faster. (The slowdown and speedup aren't quite the same percent if you think about the math, but for small percents it's close enough to not worry about this.)

If on the other hand adding a bunch of NOPs makes no difference to the overall program speed, then you are wasting your time optimizing that loop. If adding instructions doesn't make things noticeably slower, then removing them won't make things faster.
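In case it helps, here is a sketch of what the measurement itself might look like using a free-running hardware timer (TIMER_COUNT is a hypothetical register name; a pulse on an I/O pin watched with an oscilloscope works just as well). Run it once with the NOPs and once without, and compare.

  uint32_t start = TIMER_COUNT;       /* hypothetical free-running timer register */

  for(int i = 0; i < 1000; i++)
  { x[i] += y[i];
    asm("nop");
    asm("nop");
    asm("nop");
    asm("nop");
    asm("nop");
  }

  uint32_t elapsed = TIMER_COUNT - start;   /* in timer ticks; compare both versions */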


Monday, August 29, 2011

Compile-Time Constants For I/O Conversion

In several design reviews I've run across code that looks something like this.

unsigned int VoltLimitMin =  122;   /* 3.30 Volts */
unsigned int VoltLimitMax = 179;   /* 5.10 Volts */
unsigned int VoltLimitTrip =  205;   /* 5.55 Volts */

The purpose of this code is to set min, max, and emergency shutdown thresholds for voltage coming from some A/D converter. The programmer has manually converted each voltage value to the corresponding number of A/D integer steps.  (Similar code might be used to express time in terms of timer ticks as well.) Do you see the bug in the above code?  Would you have noticed if I hadn't told you a bug was there?

Here is a way to save some trouble and get rid of that type of bug. First, figure out how many steps there are per unit that you care about, such as:
// A/D converter has 37 steps per Volt ( 37 = 1 Volt, 74 = 2 Volts, etc.)
#define UNITS_PER_VOLT      37   

then, create a macro that does the conversion such as:
#define VOLTS(v)   ((unsigned int) (((v)*UNITS_PER_VOLT)+.5))

Here is an example to explain how that works.  If you say VOLTS(3.30) then "v" in the macro is 3.30.  It gets multiplied by 37 to give 122.1.   Adding 0.5 lets it round to the nearest integer, and the "(unsigned int)" cast ensures that the compiler knows you want an unsigned integer result instead of a floating point number.

Then you can use:


unsigned int VoltLimitMin =  VOLTS(3.30);
unsigned int VoltLimitMax = VOLTS(5.10);
unsigned int VoltLimitTrip =  VOLTS(5.55);

Because the macros are expanded in-line by the preprocessor, most compilers are able to compile exactly the same code as if you had hand-computed the number (i.e., they compile to a constant integer value). So this macro shouldn't cost you anything at run time, but will remove the risk of hand computation bugs or of someone forgetting to update comments if the integer value is changed. Give it a try with your favorite compiler and see how it works.


Notes:
  • The rounding trick of adding 0.5 only works with non-negative numbers. Usually A/D converters output non-negative integers, so this trick usually works.
  • This may not work on all compilers, but it works on all the ones I've tried on various microcontroller architectures. I saw one compiler do the float-to-int conversion at run-time if I left out the "(unsigned int)" in the macro, so make sure you put it in. Do a disassembly on your code to make sure it is working for you.
  • The "const" keyword available in some compilers can optimize things even further and possibly avoid the need for a macro if your compiler is smart enough, but I'll leave that up to you to play with if you use this trick in a real program.
  • As mentioned by one of the comments, you might also check for overflow with an ASSERT (a sketch follows below).
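For that last note, a minimal sketch of such a check is a startup assertion against the A/D full-scale count (ADC_MAX_COUNTS is a hypothetical name, shown here for a 10-bit converter):

#include <assert.h>

#define ADC_MAX_COUNTS  1023u   /* hypothetical full-scale count for a 10-bit A/D */

void check_volt_limits(void)
{
    /* Catches a converted constant that overflows the A/D range. */
    assert(VoltLimitMin  <= ADC_MAX_COUNTS);
    assert(VoltLimitMax  <= ADC_MAX_COUNTS);
    assert(VoltLimitTrip <= ADC_MAX_COUNTS);
}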


Monday, August 1, 2011

Proper use of .h and .c files

I recently worked with some embedded system teams who were struggling with the best way to use .c and .h files for their source code. As I was doing this, I remembered that back when I was learning how to program it took me quite a while to figure all this out too!  So, here are some guidelines on a reasonable way to use .c and .h files for organizing your source code. They are listed as a set of rules, but it helps to apply common sense too:

  • Use multiple .c files, not just one "main.c" file. Every .c file should have a set of variables and functions that are tightly related to each other, and only loosely related to the other .c files.  "Main.c" should only have the main loop in it.
  • .C files allocate storage and define executable code. Only .c files have functions defined in them.  Only .c files allocate storage for variables.
  • .H files declare external storage and function prototypes. The .h files give other modules the information they need to work with a particular .c file, but don't actually define storage and don't actually define code.  The keyword "extern" should generally be showing up in .h files only.
  • Every .c file should have a corresponding .h file. The .h file provides the external interface information to other modules for using the corresponding .c file.

Here is an example of how this works. Let's say you have files:  main.c    adc.c   output.c  process.c  watchdog.c

main.c would have the main loop that polls the A/D converter, processes values, sends outputs, and pets the watchdog timer.  This would look like a sequence of consecutive subroutine calls within an infinite loop.  There would be a corresponding main.h that might have global definitions in it  (you can also have "globals.h" although I prefer not to do that myself).   Main.c would #include  main.h, adc.h, output.h, process.h and watchdog.h because it needs to call functions from all the corresponding .c files.

adc.c would have the A/D converter code and functions to poll the A/D and store the most recent A/D values in a data structure.  The corresponding adc.h would have extern declarations and function prototypes for use by any other routine that makes A/D calls (for example, a call to look up a recent A/D value from polling).  Adc.c would #include adc.h and perhaps nothing else.
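To make that split concrete, here is a minimal sketch of what the pair might look like. The function names, channel count, and the low-level read_adc_channel() routine are all made-up placeholders, not a prescription:

/* adc.h -- the external interface, wrapped in an include guard */
#ifndef ADC_H
#define ADC_H

#include <stdint.h>

#define ADC_NUM_CHANNELS  8u

void     Adc_Init(void);
void     Adc_Poll(void);                 /* called from the main loop       */
uint16_t Adc_GetValue(uint8_t channel);  /* look up the most recent reading */

#endif /* ADC_H */


/* adc.c -- owns the storage and the code behind that interface */
#include "adc.h"

extern uint16_t read_adc_channel(uint8_t channel);  /* hypothetical hardware access */

/* Storage is allocated here, not in the header; keeping it static forces
 * other modules to go through Adc_GetValue() instead of touching it directly. */
static uint16_t adc_values[ADC_NUM_CHANNELS];

void Adc_Init(void)
{
    /* configure the converter hardware here */
}

void Adc_Poll(void)
{
    for (uint8_t ch = 0; ch < ADC_NUM_CHANNELS; ch++)
    {   adc_values[ch] = read_adc_channel(ch);
    }
}

uint16_t Adc_GetValue(uint8_t channel)
{
    return (channel < ADC_NUM_CHANNELS) ? adc_values[channel] : 0u;
}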

output.c would take values and send them to outputs.  The corresponding output.h would have information for calling output functions.  Output.c would probably just #include output.h.  (Whether main.c actually calls something in output.c depends on your particular code structure, but I'm assuming that outputs have two steps: process.c queues outputs, and the main.c call actually sends them.)

process.c would take A/D values, compute on them, and queue results for output. It might need to call functions in adc.c to get recent values, and functions in output.c to send results out. For that reason it would #include process.h, adc.h, output.h

watchdog.c would set up and service the watchdog timer. It would #include watchdog.h


One wrinkle is that some compilers can only optimize within a single .c file (e.g., only do "inline" within a single file). This is no reason to put everything in a single .c file!  Instead, you can just #include all the other .c files from within main.c. (You might have to make sure each .h file is only included once, depending on what's in them, but often that isn't necessary.) You should avoid #including a .c file from within another .c file unless there is a compelling reason to do so, such as getting your optimizer to actually work.
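If you do end up in that situation, the workaround might look like the sketch below. The file and function names are the hypothetical ones from the example above, and again, only do this if your optimizer forces the issue:

/* main.c -- single translation unit build, only if the optimizer needs it */
#include "main.h"

#include "adc.c"       /* .c files pulled in directly so the compiler sees everything */
#include "output.c"
#include "process.c"
#include "watchdog.c"

int main(void)
{
    Adc_Init();
    for (;;)
    {
        Adc_Poll();         /* poll the A/D converter              */
        Process_Values();   /* hypothetical function in process.c  */
        Output_Send();      /* hypothetical function in output.c   */
        Watchdog_Pet();     /* hypothetical function in watchdog.c */
    }
}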

This is only intended to convey the basics. There are many hairs to split depending on your situation, but if you follow the above guidelines you're off to a good start.

NOTE: Michael Barr published another, compatible, take on this topic in May.  I just saw it at:  http://www.embedded.com/electronics-blogs/barr-code/4215934/What-belongs-in-a-header-file

Friday, July 1, 2011

The Grand Challenge of Embedded System Dependability

The following is the extended version of a position statement I wrote for a panel session on Grand Challenges in Dependability for DSN 2011 in Hong Kong. Even if you are not a researcher in this area, it might provide some food for thought about the big picture, especially for embedded system security and safety.
You can find a printable version here.

-------------------------------------------

The Grand Challenge of Embedded System Dependability

Philip Koopman
ECE Department
Carnegie Mellon University
Pittsburgh, PA, United States
koopman@cmu.edu

Abstract: Four significant challenges in embedded system dependability are: embedded-specific security approaches, unifying security with safety, dealing with composable emergent properties, and enabling domain experts to use advanced dependability techniques.

Embedded systems permeate our everyday lives, including applications as diverse as cars, consumer electronics, thermostats, and industrial process controls. We have a surprising amount of reliance upon these systems, and we take their dependability almost for granted. Given extreme cost constraints, tremendous deployment scales, and the wide range of application domains, it is amazing that things more or less work well today. But, as application complexity increases, more applications become safety critical, and more embedded systems are attached to the Internet, we cannot expect business as usual with design approaches to maintain the level of dependability we want and need from such systems.

In my opinion the biggest challenges facing embedded systems lie in the areas of creating more suitable security techniques, finding a more unified approach to safety+security, dealing with composable emergent properties, and deploying dependability techniques to small product development teams.

Deeply embedded system security has significantly different constraints and requirements than enterprise and personal computing security. Embedded control systems often have severe resource constraints, limited development budgets, and stringent real time performance requirements. But an even more pressing security problem in many embedded systems is that the effects of a malicious fault can cause physical damage to people and the environment. It is less difficult to reverse or adequately insure against most malicious financial transactions than it is to reverse the release of toxins into the environment or “roll back” a multi-vehicle collision. Additionally, most embedded systems to date have been designed with near-zero security once an attacker has access to the internal control network. IT-based techniques for addressing that situation are unlikely to suffice due to matters of cost, real time dynamics, and lack of complete physical isolation from attackers.

Inevitably, embedded system safety and security will have to merge into a unified discipline, or at least a tightly-coordinated set of sub-disciplines. It is questionable to build safety cases for most everyday systems upon a faulty presumption of perfect security. At the same time, security techniques will need to take into account the safety implications of vulnerabilities and system outages. One element of a safety and security unification strategy might be to look at security faults as an attack on the assumptions of the safety case (e.g., an attacker negates the random independence assumption of fault arrival rates).

Due to the limitations and realities of embedded system development, workable dependability approaches will likely include some notion of cost-effective resilience in the face of inevitable faults, as well as a way to balance the tension among the often conflicting goals of safety, security, performance, and reliability. The good news will be that there are opportunities to exploit domain characteristics such as physical process inertia in ways not practical in desktop and enterprise computing.

A long-standing problem has been increasing the composability of emergent system properties. It is desirable to have building blocks that can be composed arbitrarily without surprises, and by the same token have an ability to decompose a system architecture so that predictable building blocks can be identified in a way that minimizes cross-coupled quality attributes. Much progress has been made on this in the area of real time systems, but much remains to be done in other areas such as safety and security. An additional challenge will be ensuring the composability of massively deployed distributed systems so that, for example, a city full of smart thermostats doesn’t display emergent aggregate behavior that takes down the power grid.

Finally, the serious challenges posed by creating a dependable system are made more difficult when the development teams are typically composed of a handful of domain experts who may have no formal computer training beyond an introductory programming course. The traditional way to deploy advanced knowledge is via synthesis and analysis tools, and this has been done with astonishing success in IC design. More recently, model-based design has been helping embedded system designers in some domains perform code synthesis from relatively high level system behavioral descriptions. But it will be a long time before tools can provide us with push-button automation that addresses the myriad aspects of embedded system dependability.
-------------------------------------------
