Better Embedded System SW: Do Not Re-Enable Interrupts In An ISR

Monday, January 6, 2014

Do Not Re-Enable Interrupts In An ISR

Summary: It might sound intuitive to re-enable interrupts within an ISR so higher priority interrupts can run without delay. But, in fact, this is probably the worst thing you can do for delay, and might even make the code unsafe due to stack overflow.

A previous post on rules for using interrupts included the rule:

"Don't re-enable interrupts within an Interrupt Service Routine (ISR). That's just asking for subtle race condition and stack overflow problems."

Some developers take a different point of view, and feel that it is best to re-enable interrupts to let higher priority ISRs run without delay. The idea is that high priority interrupts should run as soon as possible, without having to wait for low priority interrupts to complete. Re-enabling interrupts within an ISR seems to let that happen. BUT, while there might be some intuitive appeal to this notion, this is a dangerous practice that makes things worse instead of better.

First, to recap, when an interrupt triggers an ISR one of the first things that happens is that further interrupts get masked by the interrupt handling hardware mechanisms. Once the ISR starts running, at some point (best is at the beginning) it acknowledges the interrupt source, clearing the interrupt request that triggered the ISR. At that point the ISR can re-enable interrupts if it wants to, or leave them masked until the ISR completes execution. (The "return from interrupt" instruction will typically restore interrupt flags, re-enabling interrupts as appropriate when the ISR completes.) If interrupts are re-enabled within the ISR, then another interrupt can suspend the ISR and run some other, second ISR. This means that if a higher priority interrupt comes along, it can run its ISR right away.

The problem is that other, bad, things can also happen once interrupts are re-enabled:

If a lower priority interrupt comes along, it also gets to run, suspending the currently running, higher priority ISR. Interrupt priority hardware does not keep track of interrupt history after an ISR starts and acknowledges its interrupt source, and so loses track of how high the priority is for the running ISR. Worse, if a high and low priority interrupt happen at the same time, this approach guarantees that the high priority ISR waits for the low priority ISR. This happens because the high priority ISR runs first, then gets preempted by the lower priority ISR as soon as interrupts are re-enabled. In the case where no other interrupts are pending, the low priority ISR runs to completion before the high priority ISR gets to finish.

The ISRs nest as above rather than running one at a time, filling up the stack. You might be able to account for this by allocating enough stack for all ISRs to be active at the same time. But if you leave interrupts masked in ISRs. the worst case is only the single biggest ISR stack use. (Some hardware has multiple tiers/levels/classes ... pick your favorite term ... of interrupts, but in that case it is still only one ISR of stack use per tier rather than one per ISR source.)

The same ISR might run more than once at a time, especially if it got unlucky and was preempted by other ISRs, delaying its completion time. For example, if you get a burst of noise on an ISR hardware line you might kick of a half dozen or so copies of the same ISR. Or once in a while hardware events happen close together and re-trigger the ISR. This will lead to trouble if your ISR code is not re-entrant. It also could overflow the stack, ending up in memory corruption, etc. unless you can accurately predict or limit how many times ISRs can be re-triggered in absolute worst-case conditions.

You could say "the highest priority ISR doesn't re-enable interrupts" -- but what about the second-highest priority ISR? Once you get more than a couple ISRs involved this gets hopeless to untangle. You could try to write some sort of ISR handler to mitigate some of these risks, but it's going to be difficult to get right, and add overhead to every ISR. In all, the situation sounds pretty messy and prone to problems .. and it is. You might get away with this on some systems some of the time if you are really good (and never make mistakes). But, getting concurrency-related tricky code right is notoriously difficult. Re-enabling interrupts is just asking for problems.

So let's look at the alternative. What is the true cost you might be trying to avoid in terms of delaying that oh-so-urgent high priority ISR because you're not re-enabling interrupts in an ISR?

The worst case is that the longest-running low priority ISR runs to completion, making all the higher priority ISRs wait for it to complete before they can start. But after that all the remaining ISRs that are pending will complete in priority order -- highest to lowest priority. That's exactly what you want except for the low priority ISR clogging up the works. So if you have an obnoxiously long low priority ISR that's a problem. But if none of your ISRs run for very long (which is how you're supposed to write ISRs), you're fine. Put into scheduling terms, you want to make sure none of your ISRs runs long, because a long-running ISR gives you a high blocking time, and blocking time delays high priority tasks from completing.

Let's compare outcomes for the two alternative strategies. If you re-enable interrupts, the worst case latency for the highest priority ISR in the system is that it arrives, and then gets preempted by every other ISR in the system (including the longest-running ISR if it comes in later, but before the high priority ISR has a chance to complete). If you leave interrupts masked, the worst case is that the longest-running ISR has to complete, but then the high priority ISR goes immediately afterward. So, leaving interrupts disabled (masked) during every ISR is clearly a win for the worst case, in that you only have to wait for the longest-running ISR to complete before running the highest priority ISR, instead of waiting for all ISRs to complete. The worst case is typically what you care about in a real time embedded system, so you should leave interrupts disabled in ISRs to ensure the fastest worst-case completion time of high priority ISRs. And, leaving interrupts disabled in ISRs also gets rid of the risks of stack overflow and re-triggered ISRs we mentioned.

UPDATE: To avoid confusion, it's important to note that the above is talking about what happens at ONE level of interrupts, such that when one ISR is running no other interrupts run until the ISR completes or ISRs at that level complete. Many architectures have multiple levels, in which one ISR can interrupt another ISR at a lower level even if that lower level has interrupts masked. This corresponds to the comment about one ISR per level being active in the worst case. Also, note that if an architecture can change the priorities of interrupts within a single level that's irrelevant -- it is the existence of levels that are each individually maskable and that are prioritized as groups of interrupts per level that gives a way around some of these problems. So, going back to the title says, do not RE-enable the same level of interrupts within an ISR.

15 comments:

Phil KoopmanJanuary 6, 2014 at 9:29 AM
You are correct in the strictly worst timing case. But you're going to overflow your stack before you hit an infinitely long time. My summary was assuming that you can make assurances that each ISR will execute at most once before the highest priority ISR finally gets to go (e.g., as one might do using Rate Monotonic Analysis).
ReplyDelete
Replies
Lubomir DobsikJanuary 6, 2014 at 9:56 AM
I see your point, though I am not sure if RMA would give any meaningful result in such setting, where priorities are practically non-existent.
ReplyDelete
Replies
Markus FrankeJanuary 12, 2014 at 1:16 PM
Thanks for the interesting post. Another option would be to just re-enable handling of higher priority interrupts within the ISR, i.e. modify the interupt mask accordingly. What do you think about this?
ReplyDelete
Replies
vnz0rJanuary 13, 2014 at 9:54 AM
I am indeed using a system with different "levels" or "priorities". For us it is mandatory that a high priority ISR can interrupt a lower priority one ASAP (very critical timing, only for the highest prio ISRs). Re-enabling interrupts (globally, therefor allow higher prio ISRs only) is absolutely possible and safe. Your remarks on stack usage are correct, but in this case are limited to the number of ISR priorities that you use.
Anyway, thanks for sharing this!
ReplyDelete
Replies
AnonymousJune 17, 2014 at 12:32 AM
what if you clear the stack after every interrupt call? like:
1.interrupt call
2.ISR initiation
3.clear stack
4. run rest of the ISR
5. end ISR
6. Back to main program
ReplyDelete
Replies
AnonymousJune 23, 2014 at 8:23 AM
I really dont need the information in the stacks. No assumptions will be violated. So help me maybe?
ReplyDelete
Replies
AnonymousNovember 19, 2015 at 12:04 PM
Hi, Suppose when the control enters an ISR and then we disable the interrupts. After that when that ISR is still getting exexuted, a high priority interrupt occurs TWICE (same interrupt). Then after current ISR is executed, how many times will the blocked high priority interrupt will get executed?
ReplyDelete
Replies

Add comment

Please send me your comments. I read all of them, and I appreciate them. To control spam I manually approve comments before they show up. It might take a while to respond. I appreciate generic "I like this post" comments, but I don't publish non-substantive comments like that.

If you prefer, or want a personal response, you can send e-mail to comments@koopman.us.
If you want a personal response please make sure to include your e-mail reply address. Thanks!

Better Embedded System SW

Monday, January 6, 2014

Do Not Re-Enable Interrupts In An ISR

15 comments:

Static Analysis Ranked Defect List

Pages

Search This Blog