The Problem:
The Boeing 787 aircraft's electrical power control units shut down if powered without interruption for 248 days (a bit over 8 months). In the likely case that all the control units were turned on at about the same time, that means they all shut down at the same time -- potentially in the middle of a flight. Fortunately, the power is usually not left on for 8 continuous months, so apparently this has not actually happened in flight. But the problem was seen in a long-duration simulation and could happen in a real aircraft. (There are backup power supplies, but do you really want to be relying on them over the middle of an ocean? I thought not.) The fix is turning off the power and turning it back on every 120 days.
That's right -- the FAA is telling the airlines they have to do a maintenance reboot of their planes every 120 days.
(Sources: NY Times; FAA)
Analysis:
Just for fun, let's do the math and figure out what's going on.
248 days * 24 hours/day * 60 minutes/hour * 60 seconds/minute = 21,427,200 seconds
Hmmm ... what if those systems keep time as a 32-bit signed integer in hundredths of a second? The maximum positive value for such a counter would give:
0x7FFFFFFF = 2,147,483,647 hundredths of a second
2,147,483,647 / 100 = 21,474,836.47 seconds
21,474,836.47 / (24*60*60) = 248.55 days
Bingo!
If they had used a 32-bit unsigned integer instead, the counter would still overflow, just after twice as long: 497.1 days.
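The arithmetic above is easy to check in a few lines of Python. This is only a sketch of the guess made here: the function name days_until_rollover and its parameters are made up for illustration, and the 100 ticks-per-second rate is the hypothesis from the analysis, not something confirmed about the actual 787 software.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_rollover(bits, ticks_per_second, signed=True):
    """Days of continuous uptime before a tick counter reaches its maximum value."""
    # A signed counter loses one bit to the sign, so its maximum is 2^(bits-1) - 1.
    max_value = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return max_value / ticks_per_second / SECONDS_PER_DAY


# Hypothesis from the text: 32-bit counter, hundredths of a second (100 ticks/s).
signed_days = days_until_rollover(32, 100, signed=True)
unsigned_days = days_until_rollover(32, 100, signed=False)

print(f"signed:   {signed_days:.2f} days")    # signed:   248.55 days
print(f"unsigned: {unsigned_days:.2f} days")  # unsigned: 497.10 days
```

The signed result lands right on the 248-day figure, which is what makes the hundredths-of-a-second guess so plausible.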
Other Examples:
This is not the first time a counter rollover has caused a problem. Some examples are:
- IBM: Interface adapters hang after 497 days of uptime [IBM]
- Windows 95: hang after 49.7 days without reboot, counting in milliseconds [Microsoft]
- Hong Kong rail service outage [Blog]
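The uptime figures in the first two examples fall out of the same kind of arithmetic. Here is a rough check, assuming both counters are 32-bit unsigned values. The millisecond rate for Windows 95 comes from the list above; the 10 ms tick for the 497-day case is my assumption, chosen because it matches the number.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MAX_UINT32 = 2**32 - 1  # largest value a 32-bit unsigned counter can hold

# (label, ticks per second). The 100 ticks/s rate is an assumption, not a quoted spec.
cases = [
    ("1 ms ticks", 1000),
    ("10 ms ticks (assumed)", 100),
]

uptime_days = {label: MAX_UINT32 / rate / SECONDS_PER_DAY for label, rate in cases}

for label, days in uptime_days.items():
    print(f"{label}: {days:.1f} days")
# 1 ms ticks: 49.7 days
# 10 ms ticks (assumed): 497.1 days
```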
There are also plenty of date roll-over bugs:
- Y2K: on 1 January 2000 (overflow of 2-digit year from 99 to 00) [Wikipedia]
- GPS: 1024 week rollover on 22 August 1999 [USCG]
- Year 2038: Unix time will roll over on 19 January 2038 [Wikipedia]
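The dates in this list can be reproduced with Python's standard datetime module. This sketch assumes Unix time is stored as a signed 32-bit count of seconds since 1 January 1970 UTC, and treats the GPS week number as a 10-bit field counting from 6 January 1980. It ignores leap seconds, so the GPS result is only good to the day. The variable names are mine.

```python
from datetime import datetime, timedelta, timezone

# Unix time: last second representable in a signed 32-bit counter (assumed width).
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
unix_last_second = unix_epoch + timedelta(seconds=2**31 - 1)

# GPS: a 10-bit week number wraps after 2^10 = 1024 weeks (leap seconds ignored).
gps_epoch = datetime(1980, 1, 6, tzinfo=timezone.utc)
gps_first_rollover = gps_epoch + timedelta(weeks=2**10)

# Y2K: a two-digit year wraps from 99 back to 00.
year_after_99 = (99 + 1) % 100

print(unix_last_second.isoformat())  # 2038-01-19T03:14:07+00:00
print(gps_first_rollover.date())     # 1999-08-22
print(f"{year_after_99:02d}")        # 00
```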
There are also somewhat related capacity overflow issues such as 512K day for IPv4 routers.
If you want to dig further, there is a "zoo" of related problems on Wikipedia: "Time formatting and storage bugs"