Recently I was looking for an example of a prioritized, cooperative multitasking main loop and realized it is surprisingly difficult to find one that is (1) readily available and (2) comprehensible.
Sure, once you understand the concept maybe you can dig through an industrial-strength implementation with all sorts of complexity. But good luck getting up the learning curve! So, here is a simple (I hope!) description of a multi-rate main loop scheduler.
First of all, we're talking about non-preemptive multitasking. This is variously called main loop scheduling, a main loop tasker, a prioritized cooperative multitasker, a cyclic executive, a non-preemptive scheduler, and no doubt a bunch of other terms. (Not all of these terms are precisely descriptive, but as search terms they'll get you in the ballpark.) The general idea is to be able to schedule multiple tasks that operate at different periods, without having to use an RTOS and without resorting to the generally bad idea of stuffing everything into timer-based interrupts.
Single Rate Main Loop
The starting point is a single-rate main loop. This is a classical main loop schedule in which all tasks are run to completion over and over again:
void main(void)
{
  ... initialization ...

  while (1)
  {
    Task0();
    Task1();
    Task2();
    Task3();
  }
}
The good news about this is that you don't need an RTOS, and it's really hard to get things wrong. (In other words reviews and testing are easy to do well.)
The bad news is that all tasks have to run at the same period. That means that if one task needs to run much more frequently than the others, it will miss its deadlines.
I've seen way too many hacks that use conditional execution based on CPU load to sometimes skip tasks. But ad hoc approaches have the problem that you can't really know if you'll miss deadlines. Instead, you should use a methodical approach that can be analyzed mathematically to make sure you'll meet deadlines. The way people generally go is with some variation of a multi-rate main loop.
Multi-Rate Main Loop
The idea behind a multi-rate main loop is that you can run each task at a different periodic rate. Each task (which is just a subroutine) still runs to completion, so this is not a full-up preemptive multitasking system. But it is relatively simple to build, and flexible enough for many embedded systems.
Here is some example code of the main loop itself, with some explanation to follow.
void main(void)
{
  ... initialization ...

  while (1)
  {
    if      (flag0) { flag0 = 0; Task0(); }
    else if (flag1) { flag1 = 0; Task1(); }
    else if (flag2) { flag2 = 0; Task2(); }
    else if (flag3) { flag3 = 0; Task3(); }
  }
}
The way this code works is as follows. All the tasks that need to be run have an associated flag set to 1. So if Task1 and Task2 are the only tasks that need to run, flag0 is 0, flag1 is 1, flag2 is 1, and flag3 is 0. The code crawls through an "else if" cascade looking for the first non-zero flag. When it finds a non-zero flag it executes that task, and only that task.
Note that each flag is cleared just before its task runs, so the task runs exactly once each time it is activated by its flag. If all flags are zero then no task is executed, and the while(1) loop simply spins until a flag finally becomes non-zero. More about how flags get set to 1 in a moment.

After executing at most one task, the loop goes back to the beginning. Because at most one task is executed per iteration of the main while(1) loop, the tasks are prioritized: Task0 has the highest priority, and Task3 the lowest priority.

Yes, this prioritization means that if your CPU is overloaded, Task0 may execute many times while Task3 never gets a turn. That's why it's important to get scheduling right (this will be a topic in a later blog posting).
Multi-Rate Timers
The main loop wasn't so bad, except we swept under the rug the messy business of getting the flags set properly. Trying to do that in the main loop generally leads to problems, because a long task will cause many milliseconds to go by between timer checks, and it is too easy to have a bug that misses setting a flag some of the time. Thus, in general you tend to see flag maintenance in the timer interrupt service routine.
Conceptually the code looks like this, and for our example lives in a timer interrupt service routine (ISR) that is called every 1 msec. A variable called "timer" keeps track of system time and is incremented once every msec.
// in a timer ISR that is called once every msec
timer++;
if ((timer % 5) == 0) { flag0 = 1; } // enable 5 msec task0
if ((timer % 10) == 0) { flag1 = 1; } // enable 10 msec task1
if ((timer % 20) == 0) { flag2 = 1; } // enable 20 msec task2
if ((timer % 100) == 0) { flag3 = 1; } // enable 100 msec task3
if (timer >= 100) { timer = 0; } // avoid timer overflow bug
Every 5 msec the timer will be zero modulo 5, every 10 msec it will be zero modulo 10, and so on. So this gives us tasks with periods of 5, 10, 20, and 100 msec. Division is slow, and on many low-end microcontrollers it should be avoided in an ISR. So it is common instead to see a set of countdown counters (one per flag): each counter is initialized to the period of a particular task and counts down toward zero once per msec. When a counter reaches zero, the associated flag is set to 1 and the counter is reloaded with the task's period. This takes a little more RAM, but avoids division. Which approach to use depends upon your system tradeoffs.
The last line of this code avoids weird things happening when the timer overflows. The reset to zero should run at the least common multiple of all periods, which in this case happens to be equal to the longest period.
Concurrency Issues
As with any scheduling system there are potential concurrency issues. One subtle one is that the timer ISR can fire while the main loop is part-way down the else-if cascade. This could cause a low-priority task to run before a high-priority task if both have their flags set to 1 on the same timer tick. It turns out that this doesn't make the worst-case latency much worse. You could try to synchronize things, but that adds complexity. Another way to handle this is to copy the current time into a temporary variable and do the checks for when to run each task in the main loop instead of the timer ISR.
It's also important to note that there is a potential concurrency problem in writing the flags from the main loop, since both the ISR and the main loop write the flag variables. In practice the concurrency bug will only bite when you're already missing deadlines, but good coding style dictates disabling interrupts while you update the flag values in the main loop. That isn't shown in the main loop code above, in an attempt to keep things simple for the purpose of explanation.
The Big Picture
OK, that's pretty much it. We have a main loop that runs each task when its ready-to-run flag is set, and a timer ISR that sets a ready-to-run flag for each task at the desired period. The result is a system that has the following properties:
- Each task runs once during its assigned period
- The tasks are prioritized, so for example task 2 only runs when task 0 and task 1 do not need to run
In terms of practical application this is quite similar to what I often see in commercial systems. Sometimes developers use arrays of counters, arrays of flags, and sometimes even arrays of pointers to functions if they have a whole lot of functions, allowing the code to be a generic loop rather than spelling out each flag name and each task name. This might be necessary, but I recommend keeping things simple and avoiding arrays and pointers if it is practical for your system.
Coming soon ... real time analysis of a multi-rate main loop
Phil,
One other point that might be worth mentioning -- often in systems like this, you'll see logic in the timer ISR to avoid the alignment of multiple tasks being "readied" at the same tick (harmonics -- for example, in the code above, every 100 ms you'll have all four tasks wanting to run at the same time).
For example, something like (not adjusting for initial/boundary conditions):
if ((timer % 5) == 0) { flag0 = 1; } // enable at 5, 10, 15...
if (((timer+1) % 10) == 0) { flag1 = 1; } // enable at 9, 19, ...
if (((timer+2) % 20) == 0) { flag2 = 1; } // enable at 18, 38...
Granted, the designer has to do the schedulability analysis and see what the processing demands are, but hopefully the idea comes through.
Dan -- that's a great comment. I've seen that type of code too.
This technique cuts both ways. If you can statically make sure things are schedulable then it is a great way to reduce the latency of running each task once it is scheduled. (By "static" schedulability, think of a spreadsheet that lays out exactly when everything runs, and then use the offsets to avoid things piling up, just as you say. If you can make the puzzle work, you're all set.)
However, this does complicate analysis if you're doing rate monotonic scheduling, which is based on an assumption of zero offset to make the mathematical proofs work. (Basically, what the +1 and +2 are doing is adding an offset to those periods.) So I'd recommend against doing this if you're using RMA/RMS. The short version is that RMA/RMS guarantees you'll run by the end of your period. Having all the tasks pile up at the beginning of the period gives the system the flexibility it needs to keep the CPU busy and squeeze everything in as promised. If you add offsets, tasks will spend less time queued to run, but every once in a while you could miss a deadline because you let some idle CPU time get lost while waiting for a task's offset to arrive. Whether it matters in your system depends on how you implement the offset and how you look at deadlines. I think the exact code you propose might work out OK, but it is easy to write code that doesn't. Adding offsets is a tricky area, and my first instinct is always to avoid tricky analysis. If you know of a place where someone has worked out the math on this technique one way or the other, I'd love to hear about it.
Hi Phil,
First of all I would like to tell you that your book has finally arrived at my desk here in Italy, and I'm sure it will be of great help in my job.
Second, in your post you suggest not using arrays of pointers. Why is that? Is there a particular reason beyond complexity in debugging and management?
Thanks in advance,
Daniele
The primary reason is complexity in debugging and managing the arrays of pointers. Any time you jump through a pointer there is an increased chance of getting it wrong, especially with non-const pointer values. Even more important, it is that much more difficult to do a peer review of the system if you are using pointers -- it is harder to be sure you did not make a mistake.
Hi Phil,
A few years have passed since your post, but I still think it is worth noting that it is possible to get rid of the division or modulo operations completely. Most MCUs have timer compare registers which can be used to generate interrupts at different frequencies using just one timer!
For example, with a 1 MHz timer clock you can set your compare registers to 5000, 10000, 15000, and 20000 respectively, restarting the timer on the last one. In each of the four interrupts you set the 5 ms tick flag; in the first and third, the 10 ms tick flag; and in the first one, the 20 ms flag.
If your tick periods are too far apart from each other, then you can use a prescaler (counter). E.g., in the first interrupt you increment a counter, and when it reaches five you reset it and set the 100 ms flag.
Cheers, Emil
Hi Emil,
You're right -- the computation can avoid a mod operation if you are clever. A common optimization is to make the periods multiples of powers of two and check that the low bits of the timer are all zero, and so on.
In general, you usually want periods to be harmonic multiples for better schedulability, so that would be 5000, 10000, 20000, but not 15000.