Saturday, November 9, 2013

CRC and Checksum Tutorial Webinar

I've completed my FAA-sponsored look at CRC and Checksum performance for aviation systems. While some of the material is aircraft-specific, it is all relevant to safety critical systems and contains quite a bit of information about CRC vs. checksum performance.

I'm pleased to be able to share a two-hour Webinar recording describing all the work, as well as the slides and draft report. There is a guide to the material below that you may find useful if you are looking for a specific topic.

The general flow of the presentation is as follows, by slide number:
  • 6 - CRC and checksum background and terminology
  • 10 - Why picking the right CRC isn't as easy as you might think
  • 20 - Summary of project technical approach
  • 21 - Project results overview
  • 24 - Checksum results (Checksum, Fletcher Checksum, ATN-32)
  • 30 - CRC-32 compared to checksums
  • 31 - It's all about the Hamming Distance
  • 33 - Table of checksum compared to CRC-32
  • 35 - Good CRC choices (8 & 16 bits)
  • 36 - Good CRC choices (24 & 32 bits)
  • 37 - Aviation industry results (might be enlightening for everyone)
  • 43 - Multiple CRC & memory-specific CRC results
  • 46 - System level effects and cross-layer interactions (e.g., unprotected length field, bit encoding)
  • 48 - ARINC-825 / CAN issues
  • 52 - 8b10b encoding Gbit Ethernet issue
  • 53 - Mapping to criticality levels (how do you know your CRC/checksum is good enough?)
  • 64 - Determining Bit Error Ratio (BER)
  • 71 - Error detection mechanism failure and scrubbing
  • 74 - A simple recipe for how to determine how big and capable a CRC/checksum you need
  • 76 - CRC/Checksum seven deadly sins (bad ideas)
  • 79 - Review of the study project subtasks (most readers can skip this part)

I owe significant thanks to my Co-Investigators Kevin Driscoll and Brendan Hall at Honeywell labs, without whom this work could not have been done. And also a warm and sincere thanks to our FAA contacts, and especially to Chuck Kilgore for his good spirits, kind words, and unfailing support.

NOTES:

"The provided links will give you the option of downloading the video file or view it through your web browser.   Some people have reported problems viewing the video with their web browser due to the 2 hour length.  Others have been able to view the video in a browser without any problems.   If you encounter problems, we recommend that you download the file for offline viewing if possible to avoid any streaming issues."

"This work was supported by the Federal Aviation Administration, Aircraft Certification Service, and Assistant Administration for NextGen, William J. Hughes Technical Center, Aviation Research Division, Atlantic City International Airport, New Jersey 08405. The findings and conclusions in this presentation are those of the author(s) and do not necessarily represent the views of the funding agency.  This presentation does not constitute FAA policy."

The webinar was originally scheduled for Oct 1, 2013, but was delayed by the government shutdown; it was actually held on Oct 29, 2013.

Monday, October 21, 2013

Secrecy vs. Integrity and Why Encryption Might Be the Wrong Choice


Summary: Encryption doesn't solve all security problems.  In many cases you need authentication and integrity, not secrecy, and encryption can be the wrong tool for the job.  In those cases you need a Message Authentication Code, not encryption.



It's pretty typical to see embedded system designers use encryption to solve security problems. And it's also common for that to be the wrong answer to the real problem.

To understand why, consider a simplistic security need. (This example is naive in many ways, but serves to illustrate a point.) Let's say you want to set a light bulb intensity to one of 256 levels, and you want to make sure that only an authorized person can set that level.  To do this with no security, you'd send a message on an embedded network to that light bulb:

Message =  BulbLevel               (where BulbLevel is an 8 bit unsigned integer)

OK, so now you want to encrypt things.  You compute  X = Encrypt(BulbLevel) with a shared secret key that only you and the light bulb know and send that in the message:

Encrypted Message = X              (still 8 bits)

Now there is no way for anyone to know what level you've sent -- you've accomplished secrecy.  (Ignore all those attacks that just came into your head, such as recording past messages and playing them back .. or peeking to see what the light bulb did when it received the message ... this is just an illustrative example.) But is secrecy what you really wanted?   Remember our goal in this example wasn't to keep it a secret what the level was, but rather to prevent someone unauthorized from setting the light bulb level.

What if an adversary just sent random garbage:

Encrypted Message = RANDOM_8_bits         (still 8 bits)

That would set the light bulb output to some value. Maybe not a desired value, but the attacker would be able to change the light level to something other than what you commanded without knowing the secret key (with probability 255/256 that it isn't simply a repeat of the value already there), which is exactly what you're trying to prevent.

The issue is that encryption is the wrong tool for the job.  What you really want is some combination of authentication (I know the right person sent it) and integrity (I know the contents have not been altered). Encryption isn't the best tool for this -- what you want is a Message Authentication Code (MAC). One way to get this is to compute an appropriately chosen secure hash function and append its results to the message:

Authenticated Message = BulbLevel  |  SecureHash         (8 bits concatenated with a hash value)

Both you and the light bulb still have a shared secret key with this approach. The light bulb receiving the message computes its own hash of the BulbLevel value and compares it to the received SecureHash value in the message. If they match, the message is authentic and the light bulb takes action. If they don't match, then it is a forgery and the message is ignored. Note that the BulbLevel isn't a secret -- it's transmitted "in the clear." That's because the point of this isn't secrecy; it's authentication (the sender knows the secret key to the cryptographic hash function) and integrity (the hash matches the value expected from the BulbLevel value, so the message hasn't been tampered with). If the hash value is sufficiently large, the chance of someone guessing the right hash value for a maliciously sent bulb level can be made as low as you like -- negligible over the lifetime of the light bulb -- providing an arbitrarily good probabilistic level of security.
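Here is a minimal sketch of what that looks like in C. It assumes a hypothetical ComputeMac() routine standing in for whatever keyed MAC you actually choose (for example, an HMAC built on a secure hash); the struct layout, MAC size, and function names are illustrative, not any particular standard or library:

#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#define MAC_SIZE 8   /* number of MAC bytes appended; pick based on your threat analysis */

/* Placeholder: compute a MAC over len bytes of data using the shared secret key. */
void ComputeMac(const uint8_t *key, const uint8_t *data, size_t len, uint8_t mac[MAC_SIZE]);

typedef struct {
    uint8_t bulbLevel;       /* sent in the clear */
    uint8_t mac[MAC_SIZE];   /* authenticates bulbLevel */
} AuthMessage;

/* Sender: build the authenticated message. */
void BuildMessage(const uint8_t *key, uint8_t bulbLevel, AuthMessage *msg)
{
    msg->bulbLevel = bulbLevel;
    ComputeMac(key, &msg->bulbLevel, sizeof(msg->bulbLevel), msg->mac);
}

/* Receiver (light bulb): act on the message only if the MAC checks out. */
bool MessageIsAuthentic(const uint8_t *key, const AuthMessage *msg)
{
    uint8_t expected[MAC_SIZE];
    ComputeMac(key, &msg->bulbLevel, sizeof(msg->bulbLevel), expected);
    return (memcmp(expected, msg->mac, MAC_SIZE) == 0);  /* a constant-time compare is better in practice */
}

In a real design the MAC would also cover a timestamp or sequence counter to block replay attacks, as discussed in the fine print below.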

There's another important benefit to using a MAC. Encryption for the purpose of keeping data secret tends to be export controlled. Message Authentication Codes tend not to be export controlled. So ditching encryption in favor of a MAC usually helps with export issues. (Read the rules and talk to your lawyer -- this is just a sweeping generalization.)

The overall message is: if you want to ensure authenticity and integrity, and secrecy isn't a big deal, then using encryption is barking up the wrong tree.

The fine print: OK, now for those who have begun composing comments about how naive the above schemes are ... yes, I know .. it was only an example. You don't actually do it that way for either approach. For example, you need a time stamp or something to prevent playback attacks. And that tends to help encryption do better because of the reduced chance of accidentally coming up with a plausible decrypted timestamp value (if the receiver is checking for plausible timestamps). And certainly encryption can be made to work if you are careful. But, when I've looked into this for real systems what I've found is that a MAC is often a better tradeoff for a number of reasons and tends to provide better authentication and integrity for a given computational cost and bandwidth cost in practical scenarios. And, I've found designs in the real world based on encryption that weren't going to work as well as the designers thought because they didn't get the details of authentication right. Also, just to make sure it's said ... a CRC is not cryptographically secure, so don't use it as a secure hash function.

Even after the fine print, the message remains: use a MAC if it does the job. Don't jump to a default strategy of encryption if secrecy isn't what you really need. And if you do decide to use encryption, make sure it is really providing authentication and integrity checking in addition to secrecy.


Monday, September 16, 2013

Getting Rid of Global Variables

Summary: Global variables are evil. Here is an example of how to get rid of many of them.



Global variables are well known to be evil -- and you can read all about why that is in my free sample book chapter by that name. This posting gives a running example of changes that fix a common type of global variable.

Let's start with a pretty typical situation in a C program. You have a "globals.c" file that defines a mess of globals, including:
   int g_ErrCount;
which might be used to tally the number of run-time errors seen by the system. I've used a "g_" naming convention to emphasize that it is a global, which means that every .c file in the program can read and write this variable with wild abandon.

Let's say you also have the following places this variable is referenced, including globals.c just mentioned:
globals.c:    int g_ErrCount;      // define the variable
globals.h:    extern int g_ErrCount; // other files include this
init.c:       g_ErrCount = 0; // init when program starts
moduleX.c:   g_ErrCount++;   // tally another error
moduleY.c:    XVar = g_ErrCount; // get current number of errors
moduleZ.c:    g_ErrCount = 0;  // clear number of reported errors

There are all sorts of risks with this approach... but let's concentrate on fixing them instead of diving into the Globals Are Evil discussion.

The first thing we're going to do is collect all the error counter functions into a single module, ErrCount.c, which would contain error counting, error reporting, and so on. This gets rid of the need to define g_ErrCount in globals.c, giving the result below. We've also changed to using ErrCount.h for the extern declaration:
globals.c:       // not needed any more for this variable
ErrCount.c:    int g_ErrCount;      // define the variable
ErrCount.h:    extern int g_ErrCount; // other files include this
init.c:       g_ErrCount = 0; // init when program starts
moduleX.c:   g_ErrCount++;   // tally another error
moduleY.c:    XVar = g_ErrCount; // get current number of errors
moduleZ.c:    g_ErrCount = 0;  // clear number of reported errors

Now let's get rid of the initialization. Having a central init.c is asking for problems if you forget to call an initialization function. Also, having a separate init.c forces variables to be global. So let's initialize the variable where it is defined:
globals.c:       // not needed any more for this variable
ErrCount.c:    int g_ErrCount = 0;    // define and init variable
ErrCount.h:    extern int g_ErrCount; // other files include this
init.c:       // no longer needed
moduleX.c:   g_ErrCount++;   // tally another error
moduleY.c:    XVar = g_ErrCount; // get current number of errors
moduleZ.c:    g_ErrCount = 0;  // clear number of reported errors

Instead of having the variable be global, let's hide it as a static variable inside ErrCount.c. Using the "static" keyword in defining a variable outside a function makes it invisible to other .c files. This step results in the program being broken, because other .c files can't get at the static variable. (We've also renamed the variable without the "g_" prefix because it's not global any more.)
ErrCount.c:    static int ErrCount = 0; // only visible in this file
ErrCount.h:    // static variables are invisible outside .c file
moduleX.c:   g_ErrCount++;   // tally another error
moduleY.c:    XVar = g_ErrCount; // get current number of errors
moduleZ.c:    g_ErrCount = 0;  // clear number of reported errors

To fix the problem with .c files seeing the static variable, we're going to add some access functions to ErrCount.c that provide the ability to touch the value without making the variable global:
ErrCount.c:    static int ErrCount = 0; // only visible in this file
           inline void ErrCount_Incr() { ErrCount++; }
           inline int ErrCount_Get() { return(ErrCount); }
           inline void ErrCount_Reset() { ErrCount = 0; }
ErrCount.h:    inline void ErrCount_Incr();  // increment the count
           inline int ErrCount_Get();   // get current count value
           inline void ErrCount_Reset(); // reset count
           // Note that there is NO access to ErrCount directly
moduleX.c:      ErrCount_Incr();   // tally another error
moduleY.c:     XVar = ErrCount_Get(); // get current number of errors
moduleZ.c:     ErrCount_Reset();  // clear number of reported errors

And that's it -- we're there.  ErrCount is no longer a global variable. It is visible only inside ErrCount.c, and any accesses to the variable are performed via access functions that increment, read, and reset the value. Note that the keyword "inline" should, with a good compiler, make this code just as fast and efficient as the global variable version of the code -- except without actually having a global variable. In fact, what we've been doing is a C-based approach for making ErrCount into an object (the variable) with access methods to increment, read, and reset the object. Not quite as clean as you might see in C++, but it gets the job done with C syntax.

Some folks might just say this is sleight of hand. If it generates the same code, why bother?  Here are some reasons that at least some developers find it useful to take this approach:
  • Software authors can only perform intended functions specific to an error counter: increment, read, and reset. Setting to an arbitrary value isn't allowed. If you don't want the value changed other than via incrementing, you can just delete the reset function. This prevents some types of bugs from ever happening.
  • If you need to change the data type or representation of the counter, that change all happens inside ErrCount.c with no effect on the rest of the code. For example, if you find a bug with error counts overflowing, it is a lot easier to fix that in one place than in every place that increments the counter! 
  • If you are debugging with a breakpoint debugger it is easier to know when the variable has been modified, because you can get rid of the "inline" keywords and put a breakpoint in the access functions. Otherwise, you need watchpoints, which aren't always available.
  • If different tasks in a multitasking system need to access the variable, then it is a lot easier to get the concurrency management right inside a few access functions than to remember to get it right everywhere the variable is read or written (get it right once, use those functions over and over). Don't forget to make the variable volatile and disable interrupts when accessing it if concurrency is an issue, as shown in the sketch just after this list.
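Here is a minimal sketch of what concurrency-safe access functions might look like, assuming hypothetical DISABLE_INTERRUPTS()/ENABLE_INTERRUPTS() macros that map onto whatever your compiler and processor provide (many real systems instead save and restore the interrupt state so that nested critical sections work correctly):
ErrCount.c:    static volatile int ErrCount = 0;  // volatile because it is shared across tasks/ISRs
            void ErrCount_Incr(void)
            { DISABLE_INTERRUPTS();  ErrCount++;  ENABLE_INTERRUPTS(); }
            int ErrCount_Get(void)
            { int value;
              DISABLE_INTERRUPTS();  value = ErrCount;  ENABLE_INTERRUPTS();
              return(value); }
            void ErrCount_Reset(void)
            { DISABLE_INTERRUPTS();  ErrCount = 0;  ENABLE_INTERRUPTS(); }
Note that the "inline" keyword has been dropped in this version; once each access wraps a critical section, paying for a real function call is usually fine and makes debugging easier.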
I'm sure different readers have different ways of approaching this problem, and some globals are harder to get rid of than others. But I've seen a lot of code that is structured just like the "before" code. (I'm sure I must have written things that way myself in my misspent youth!) This approach cleans up a large fraction of globals with minimal pain and often no speed penalty.




Monday, August 19, 2013

Peer Review Spreadsheet

Summary: I've found peer reviews are only effective if they have tangible paperwork. Here's a minimalist approach that has worked for me in a classroom situation.



Good peer reviews can be very effective at finding bugs ... but bad peer reviews can be nearly useless. I teach a course that involves a semester-long software project and found it really difficult to get students to do proper peer reviews.  Basically, they were going through the motions but didn't produce much (i.e., didn't find many bugs). The emphasis in the materials I provided was on checklists for what to look for, with not a lot of formality in reporting results, because I wanted to minimize paperwork.  That didn't work. Teams were only finding about 10% of their bugs in reviews, which is just not enough to be worth the effort.

So one year I changed things and made them use a variant of the spreadsheet below to report results. The results were dramatic. The first semester I used the spreadsheet, defects found via peer review went from about 10% to about 50% across the class, and have stayed there ever since. (Also, the effort profile changed dramatically for the better, but that's a topic for another posting.) In my experience, finding 50% or so of bugs in peer review is about right. So, while I'm not claiming ultimate academic rigor for this study, based on this experience it seems that adopting a spreadsheet like this can be effective at improving the quality of peer reviews.

Here is the spreadsheet (click to download the .XLS file):




The first few lines are to record the project name, date, artifact (e.g., file name of the code being reviewed), and names of the reviewers present.  # Bugs is filled out later. The artifact author is intentionally omitted because the records are about finding bugs, not blaming the author. The issue rows are a place to record issues found in the review in free text, usually only one or two sentences apiece.

The status fields are left blank during the review. However, within 24 hours after the review has taken place, the author of the item being reviewed needs to update each issue's status to indicate "Fixed," "Not Fixed," or "Not a Bug." The idea here is that if a bug is easy to fix, the burden to record it and fix it is minimal -- it stays in the spreadsheet. But if a bug is too hard to fix immediately, it is marked "Not Fixed" and must be entered as a formal bug report into the bug tracking system (Bugzilla or otherwise). Some items turn out not to be bugs, and it is OK to record them as such (e.g., a feature request or a misunderstanding by the reviewers). When the author has updated the status of every issue, the # Bugs line is updated to reflect the number of bugs actually found, and that number is rolled up to a project tracking spreadsheet.

This last piece about rolling up the # Bugs to a higher level of visibility is crucial. In my course I have the Teaching Assistants track the number of bugs found weekly for every project team and make sure they ask hard questions if the numbers are consistently zero. Really, that's all it took. Apparently if the teams know someone is watching, they'll look a bit harder to find bugs, and once they do, the process bootstraps itself into fairly effective reviews with minimal help from the course staff. It's worked several years in a row for me -- almost like a light switch being flipped for the students in my class. Results have been pretty consistent since we started using the spreadsheet, at about 50% of bugs found in peer reviews across dozens of teams. It should be noted that we peer review not only code, but also tests, statecharts, sequence diagrams, and other artifacts for our projects, and the payoff in finding bugs early has been unmistakable. Of course I need to make a "Your Mileage May Vary" disclaimer here, but it's worked for me.

I'd be interested in hearing stories from industry teams as well about simple ways to make reviews more effective. Ideally each team gets solid training on a review process along with help on soft skills for review leaders. But realistically, a lot of times a bunch of engineers are just tossed into a room and told to make it happen. Knowing tricks that help is not a bad thing.


Monday, July 22, 2013

Making Main Loop Scheduling Timing More Deterministic

Summary: Wasting time in a main loop scheduler can make testing system-level timing a whole lot easier.



It's common enough to see a main loop scheduler in an embedded system along the lines of the following:

for(;;)
{ Task1();
  Task2();
  Task3();
}

I've heard this referred to as a "static scheduler," "cyclic executive," "main loop scheduler," or "static non-preemptive scheduler" among other terms. Regardless of what you call it, the idea is simple: run all the tasks that need to be run, then go back and do it again until the system is shut down or reset. There might be interrupt service routines (ISRs) also running in the system.

The main appeal of this approach is that it is simple. You don't need a real time operating system and, even more importantly, it would appear to be difficult to get wrong. But, there's a little more to it than that...

The first potential problem is those pesky ISRs. They might have timing problems, cause concurrency problems, and so on. Those issues are beyond what I want to talk about today except for making the point that they can affect the execution speed of one iteration through the main loop in ways that may not be obvious. You should do timing analysis for the ISRs (chapter 14 of my book has gory details).  But for today's discussion we're going to assume that you have the ISRs taken care of.

The next problem is timing analysis of the main loop. The worst case response time for running a task is one trip through the loop. But how long that trip is might vary depending on the calculations each task performs and how much time ISRs steal. It can be difficult to figure out the absolute worst case execution time (you should try, but it might not be easy). But the really bad news is, even if you know the theoretical worst case timing you're unlikely to actually see it during testing.

Consider the tester trying to make sure the system will function with worst case timing. How do you get the above static scheduler to take the worst case path through the code a bunch of times in a row to see what breaks? It's a difficult task, and probably most testers don't have a way to pull that off. So what happens in practice is that you ship a product that has never been tested at its worst case main loop execution time. Will it work?  Who knows.  Do you want to take that chance with 10,000 or 100,000 units in the field? Eventually one of them will see worst case conditions, and you haven't actually tested what will happen.

Fortunately there is an easy way to mitigate this risk. Add a time-waster at the end of the main loop. The time waster should convert the above main loop, which runs as fast as it can, to a main loop that runs exactly once per a defined period (for example, once every 100 msec):

for(;;)
{ StartTimer(100);  // start a 100 msec countdown
  Task1();
  Task2();
  Task3();
  WaitForTimer(0);  // wait for the 100 msec countdown to reach 0
}

This is just a sketch of the code -- how you build it will depend upon your system. The idea is that you waste time in the WaitForTimer routine until 100 msec (as measured by a hardware timer) have elapsed since the loop started, then you run the loop again. Thus, the main loop runs exactly once every 100 msec.  If the tasks finish in less than 100 msec, you waste time at the end, waiting for the 100 msec period to be up before starting the next main loop iteration. If the tasks take exactly 100 msec, then you just start the main loop again immediately. If the tasks run longer than 100 msec, then you should log an error or perform some other action so you know something went wrong.
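Here is one possible way to build the timer routines, assuming a periodic 1 msec hardware timer interrupt; all the names (TimerTickISR, MainLoopCountdown, LogTimingOverrun) are placeholders rather than any particular vendor API:

#include <stdint.h>

static volatile uint16_t MainLoopCountdown;   // decremented once per msec by the timer ISR

void TimerTickISR(void)                       // hooked to a 1 msec periodic hardware timer
{ if (MainLoopCountdown > 0) { MainLoopCountdown--; } }

void StartTimer(uint16_t msec)                // begin the countdown for this loop iteration
{ MainLoopCountdown = msec; }

void WaitForTimer(uint16_t target)            // burn time until the countdown reaches target
{ if (MainLoopCountdown <= target)            // countdown already at/past target -- the tasks may have run long
  { LogTimingOverrun(); }                     // placeholder for whatever error logging you use
  while (MainLoopCountdown > target) { }      // idle (or put the CPU into a low power sleep here)
}

On an 8-bit processor you may also need to disable interrupts around reads and writes of MainLoopCountdown, since it is a multi-byte value shared with the ISR.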

The key benefit to doing this is to ensure that in testing the average timing behavior is identical to the worst case timing behavior. That way, if something works when the system is fast, but breaks when it actually takes 100 msec to complete the main loop, you'll see it right away in testing. A second benefit is that since you are actively managing the main loop timing, you have a way to know the timing ran a little long on some loops even if it isn't bad enough to cause a watchdog reset.






Monday, June 10, 2013

Seven Deadly Sins of CRCs and Checksums

We're wrapping up the final report for an FAA-sponsored study of CRC and Checksum performance for aviation applications, although the results in general apply to all uses of those error detection codes.

As part of our results we came up with an informal list of "Seven Deadly Sins" (bad ideas):
  1. Picking a CRC based on a popularity contest instead of analysis
    • This includes using “standard” polynomials such as IEEE 802.3
  2. Saying that a good checksum is as good as a bad CRC
    • Many “standard” polynomials have poor HD at long lengths
  3. Evaluating with randomly corrupted data instead of BER fault model
    • Any useful error code looks good against purely random data corruption; a BER (random independent bit flip) fault model is needed to expose the differences
  4. Blindly using polynomial factorization to choose a CRC
    • It works for long dataword special cases, but not beyond that
    • Divisibility by (x+1) doubles undetected fraction on even # bit errors
  5. Failing to protect message length field
    • Results in pointing to data as FCS, giving HD=1
  6. Failing to pick an accurate fault model and apply it
    • “We added a checksum, so ‘all’ errors will be caught” (untrue!)
    • Assuming a particular standard BER without checking the actual system
  7. Ignoring interaction with bit encoding
    • E.g., bit stuffing compromises HD to give HD=2
    • E.g., 8b10b encoding – seems to be OK, but depends on specific CRC polynomial
(I haven't tried to map it onto the more traditional sin list... if someone comes up with a clever mapping I'll post it!)

Thanks to Kevin Driscoll and Brendan Hall at Honeywell for their work as co-investigators. You can read more about the research on my CRC and Checksum Blog.  That blog has more detailed postings, slide sets, and will have the final research report when it is made publicly available.


Saturday, May 25, 2013

Adding Prioritization to a Single Level Interrupt Priority System


Summary of technique: Add a software structure that executes only the highest priority pending interrupt within the ISR polling loop. Then start again at the top of the polling loop instead of polling all possible ISRs. This gives you a prioritized non-preemptive interrupt service routine scheduler.

- - - - - - - - - - - - - - - - - - - -

With some microcontrollers, all of your interrupts come in at the same priority level (for example, via an external interrupt request pin). The usual thing to do in that case is create a polling loop to check all the sources of interrupts and see which one needs to be serviced by looking at peripheral status registers.  For example:
if(HWTimerTick)  { ... ISR to service hardware timer tick ... }
if(ADCReady)  { ... ISR to service A to D converter ... }
if(SerialPortDataInReady ) { ... ISR to read a serial port byte... }
if(SerialPortDataOutReady) { ... ISR to write a serial port byte ... }
...
(Of course this isn't real code ... I'm just sketching a flow that you've seen before if you've written this type of ISR that polls all the devices that can cause interrupts to see which one actually needs to be serviced.)

If only one of these devices is active, then this approach should work pretty well. And if you do system-level testing, things will probably work fine -- at least most of the time.

But the way you can get into trouble is if one of the interrupts has a short deadline for being serviced. Let's say you have the above code and are seeing serial input bytes being dropped once in a while.  What could be happening?

One cause of dropped bytes might be that the HW Timer Tick and/or the ADC Ready interrupts are active at the same time that the serial port data input interrupt is ready. Because they appear earlier in the polling order, you execute them before you can get data from the serial port. If the sum of their two execution times is longer than the time between serial byte arrivals, you're going to take too long to get to the serial port input ISR and will drop bytes.
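To put rough numbers on it: suppose the serial port runs at 115,200 bits/sec with 8-N-1 framing, so each frame is 10 bits and a new byte can arrive roughly every 87 microseconds. If the receiver has only a one-byte hardware buffer (no FIFO) and the timer tick and ADC ISRs together take longer than that, the next byte will eventually overwrite the one you haven't read yet. (These numbers are just an illustration; plug in your own baud rate and measured ISR execution times.)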

You might buy a faster processor (which might be unnecessary as we'll see), but before doing that you might reorganize the code to put the serial input first in the list of ISRs so you can get to it faster when an interrupt comes in:
if(SerialPortDataInReady ) { ... read a serial port byte... }
if(HWTimerTick)  { ... service hardware timer tick ... }
if(ADCReady)  { ... service A to D converter ... }
if(SerialPortDataOutReady) { ... write a serial port byte ... }

And that will *almost* work. Things might get a little better, but it won't cure the problem. (Or, MUCH worse, it will cure the problem in testing only to have the problem reappear in the field after you've shipped a lot of systems!)  Now when you get an interrupt you'll service the serial port input ISR first. But, then you'll go off and do the other ISRs. If those other ISRs take enough time, you will be stuck in those other ISRs too long and will miss the next byte -- you won't get back to the top of the list of ISRs in time.

You might try re-enabling interrupts inside any long ISRs to let the serial port get processed sooner. But resist the temptation -- that probably won't work, and will likely result in stack overflows due to recursive interrupt processing. (Simple rule: NEVER re-enable interrupts from inside an ISR.)

What we really need here is prioritization. And it's pretty easy to get even though we don't have hardware interrupt prioritization. All you have to do is (1) put the checks for each ISR in priority order, and (2) only execute the first one in the list each time you process interrupts. This can be done as follows:

if(SerialPortDataInReady ) { ... read a serial port byte... }
else if(HWTimerTick)  { ... service hardware timer tick ... }
else if(ADCReady)  { ... service A to D converter ... }
else if(SerialPortDataOutReady) { ... write a serial port byte ... }

Now only the first active interrupt will be serviced and the rest ignored. When you drop out of this structure and exit, any pending interrupt will re-trigger the checks from the beginning, again executing the highest priority interrupt that is still active (i.e., the first active one in the list). This will continue until all pending interrupts have been processed. You can use a "while" loop around the code above, or in many systems it may make sense just to exit interrupt processing and let the hardware interrupts re-trigger to re-run the polling code as a new interrupt.
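For example, here is a sketch of the "while loop" variant: keep servicing the single highest priority pending source until nothing is pending, then exit. As with the earlier fragments this isn't real code -- the xxxReady checks and Service_xxx() routines are placeholders for reading your hardware status registers and calling the corresponding ISR functions:

void PolledInterruptDispatcher(void)
{ for(;;)
  { if      (SerialPortDataInReady ) { Service_SerialIn();  }
    else if (HWTimerTick)            { Service_TimerTick(); }
    else if (ADCReady)               { Service_ADC();       }
    else if (SerialPortDataOutReady) { Service_SerialOut(); }
    else                             { break; }   // nothing pending; exit the dispatcher
  }
}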


This approach means that the worst case delay before processing a serial input byte is no longer all the ISRs running back to back (if all interrupts are active). Rather, the worst case is that the single longest other ISR has just started running; once it completes, the serial port input ISR runs next. This happens because each pass through the list runs at most one ISR rather than all of them. If that one ISR runs too long for you to meet your deadlines, then it's probably too "fat" and should be simplified or its job moved out of ISRs and into the main loop.

There is no free lunch. The lowest priority ISR (the one at the end of the list) might starve. Making sure you meet all your ISR deadlines is trickier with this structure. Without the "else if" approach the worst case timing is easy to compute -- it is the run time of all the ISRs put together. But that might be too slow to live with. With this structure you have a non-preemptive prioritized scheduling system for ISRs, and you need to use suitable math and a suitable scheduling approach. Generally you'd want to use rate monotonic analysis (RMA), suitably adapted for the ISRs being non-preemptive. The analysis may be a little more complex, but this approach might help you salvage a situation in which you're missing deadlines and have already committed to a certain speed of microcontroller.


(Note on terminology: technically the whole thing is one big ISR that calls a different function depending upon what's active. But I'm calling each such function an ISR because that is really what it does ... you're using a software dispatcher to pick which ISR to run instead of hardware prioritization logic.)
