In the past couple posts I've talked about how to build a multi-rate main loop scheduler and the two biggest mistakes I tend to see with timing analysis. In this posting I'll wrap up the topic (at least for now) by describing how to determine whether a particular task in a non-preemptive multi-rate system is going to meet its deadlines. The math gets pretty complex and is pretty painful, so I'll work an example and give the general rules without trying to show the iterative math behind the method. At that, this posting will take more work to understand completely than my usual posts, but it's just a complex topic and this isn't the place for an entire book chapter worth of math. On real systems you can generally do the checking with a spreadsheet once you understand the technique.

Consider the following task set:

Task Period Compute CPU Load

0 7 2 28.6%

1 10 2 20%

2 20 3 15%

3 101 5 5%

4 199 3

__1.5%__

Total CPU Load 70.1% (numbers rounded to nearest 0.1%)

If we want to find out the worst case response time for Task 2, we need to look at the following worst case:

- All the tasks in the task set become ready to run simultaneously. In practice this means that the system timer is at zero or equal to the least common multiple of all periods
- The most obnoxious task with priority lower than the task we care about sneaks in and starts running just before the highest priority task gets to run. (Generally this happens because that task got a chance to start right at the last instant before the counter ticked.)
- All the tasks with higher priority run one or more times (based on their periods) before the task we care about runs. In general some tasks will run multiple times before our task gets a chance to run.

So for the example case of Task #2, that means:

- Identify Task #3 as the task with the largest compute time from those tasks with a priority higher than Task #2. Assume it starts running at time zero because it snuck in ahead of all the other tasks.
- Because it is time zero, all other tasks are ready to run starting at time zero. But because Task #3 snuck in first, all the other tasks are in the run queue at time zero.

- Start at time 0 with Task 3 running.

- Tasks 0, 1, 2, 4 queue to run at time zero.

- Running Task 3 takes us to time 5 when Task 3 completes.

- At time 5 tasks still queued are: 0, 1, 2, 4 - Time 5: run highest priority queued task: Task 0.

- This runs for 2 msec, bringing us to time 7.

- At time 7 we reach the Task 0 period, so another copy of Task 0 queues to run.

- At time 7 tasks still queued are: 0, 1, 2, 4 - At time 7 run highest priority queued task: Task 0.

- This runs for 2 msec, bringing us to time 9.

- At time 9 tasks still queued are: 1, 2, 4 - At time 9 run highest priority queued task: Task 1.

- This runs for 2 msec, bringing us to time 11.

- At time 10 another copy of Task 1 is queued (note that we have missed the deadline for Task 1 since it should have completed at time 10, but ran until time 11).

- At time 11 tasks still queued are: 1, 2, 4 - At time 11 run highest priority queued task: Task 1.

- This runs for 2 msec, bringing us to time 13.

- At time 13 tasks still queued are: 2, 4 - At time 13 run highest priority queued task: Task 2.

- This runs for 3 msec, bringing us to time 16.

- At time 14 another copy of Task 0 is queued.

- At time 14 tasks still queued are: 0, 4. - At time 16 Task #2 has completed, ending our analysis

We've computed the worst case time to complete Task 2 is 16 msec, which is less than its 20 msec period. So Task 2 will meet its deadline. However, along the way we noticed that even though the CPU is only loaded to 70.1% Task 1 is going to miss its deadline.

Graphically, the above scenario looks like this:

The blue arrows show where tasks become queued to run, and the boxes show which task is running when.

To determine if your system is schedulable you need to repeat this analysis for every task in the system. In this case, repeating it for Task 1 will reveal that Task 1 misses its deadlines even though the CPU is only loaded at 70.1%.

In general the technique is to assume the longest-running task (biggest compute time) with priority lower than yours starts running and all other tasks queue at that same time. Then play out the sequence to see if you meet your deadline. There are a few notes on special cases. The longest task with lower priority may vary depending on which task you are evaluating. For example, the longest lower priority task for Task #3 is not Task #3, but rather Task #4. And the lowest priority task doesn't have a lower priority blocking task, so when evaluating Task #4 you can just assume it starts at time zero (there is no initial blocker). This can be expressed as an iterative equation that has to be cycled until it converges. If you really want a math-based approach take a look at the CAN performance lecture slides in my grad course, which is pretty much the same math. But trying to explain the math takes a lot more words and complicated typography than are practical in a blog entry.

If you find out you are missing deadlines and want to fix it, there are two general techniques that can help:

- Make any really long-running low priority task run shorter, possibly by breaking it up into multiple shorter tasks that work together, or that can "yield" control back to the main loop every once in a while during their execution and pick up the next time they run where they left off. This will reduce the length of the initial blocking effect for all higher priority tasks.
- Schedule tasks so their periods evenly divide each other (this will result in the Least Common Multiple of all periods equals the largest period). This corresponds to the the approach of harmonic task periods discussed in the real time scheduling chapter of my book. For non-preemptive tasking it will NOT support 100% CPU usage, but probably it will make worst case latency better.

Notes for readers who want to be thorough:

- If you have interrupts you have to include the ISR contribution to CPU workload when doing this analysis. Doing so can get a bit tricky. The real time scheduling chapter of my book shows how to do this for a single-rate main loop scheduler.
- It may be that you want to resolve a situation in which the fastest task gets blocked for too long by putting that faster task into an ISR. If you do that, keep it short and don't forget that it still affects CPU workload. This approach is discussed for a single-rate main loop scheduler, also in the real time scheduling chapter of my book.
- We don't include the current task in the "most obnoxious" consideration because we start by assuming the system will meet deadlines where deadline = task period. Any particular task has to have finished running before it can run again unless it misses its deadline. So a task can't self-block in a system that is schedulable.

Can you write a post for debugging embedded software , i'm stuck on a bug : the problem is that the application when using debug mode works fine because it envolves reading CAN frames and writing to the FLASH Memory, thanks a lot

ReplyDeleteSorry to hear you're having trouble in that situation. There is no simple answer other than the usual bug hunting heuristic of divide and conquer -- get the CAN frames working on their own. Get writing to flash memory working on its own. Only then should you try to combine the two functions.

Delete