Tuesday, September 25, 2018

Potentially deadly automotive software defects

Here's a list of potentially deadly automotive software defects, mostly from NHTSA Recall notices.

There is still a lot of resistance to the idea that car software can have fatal defects that result in deaths not due to driver error. In fact such defects do exist, and for many of them we've just gotten lucky that few or no people have died as a result. Recently we've been seeing more deadly software defects being reported. This posting is intended to give a taste of what's been going on in automotive software quality. This is a very partial list of bad software that was deployed on production vehicles in the US.

This list covers failures across a variety of subsystems, including unintended acceleration, steering failures, brake assist failures, headlights going out while driving, and quite a lot of air bag failures. The causes include software defects, configuration management errors, modules left in "factory mode" when shipped, and even EEPROM wearout. Overall this paints a picture of an industry that is shipping a lot of safety critical software defects.  In fairness, yes, these are all defects that are being fixed, and there are certainly other causes of fatal accidents. (Presumably there are other defects not yet being fixed, if for no other reason than that the cars are still new on the road.) But at least some of these recalls sure look like mistakes that simply should not be happening in life critical software.

The list is almost certainly much, much longer, and I simply ran out of time trying to go through the full NHTSA database.  And even that doesn't include everything that happens. The list is heavy in 2013-2015 mostly because that was the most convenient source material I found. There is no reason whatsoever to believe things have gotten dramatically better since then.

Remember that a NHTSA recall is by definition a safety defect that matters. That's the whole point of having a recall.

The purpose of this list is not to call out any particular company or software defect. Rather, the point is that safety critical software defects are both pervasive and persistent across the automotive industry.  Yes, we can have discussions about how many vehicles vs. how many defects. But it still does not instill confidence about life critical software in a self-certifying industry that in the US is not required to follow international software safety standards.

Updated April 2021:
  • "Tesla asked to recall 158,000 cars for failing displays" / Jan 2021
    • "The failures of the so-called “media control units” in these vehicles can sever the owner’s access to their vehicle’s backup camera, climate controls, and Tesla’s Autopilot driver assistance system, increasing the risk of a crash, the safety agency says"
    • "The problem at the heart of the defect that NHTSA wants Tesla to fix involves worn-out flash memory chips used in the displays"
    • "Tesla confirmed to NHTSA that all units with this chip “will inevitably fail,” according to the agency, and also provided a statistical model showing projected weekly repairs lasting from 2020 to 2028, with the most failures happening in 2022."
    • https://www.theverge.com/2021/1/13/22229854/tesla-recall-model-s-x-touchscreens-bricked-failure-nhtsa
  • ECM incorrectly reduces engine power (Infiniti) / Apr 2021
    • "After detecting rapid acceleration, the Engine Control Module (ECM) may incorrectly reduce engine power and reduce fuel supply to the engine."
    • NHTSA Recall 21V-234
  • ESC does not stay in lane (Mack Trucks) / April 2021
    • "The vehicles may not stay in their lane at certain speeds." (FMVSS 136 violation)
    • "A vehicle that drifts out of its lane increases the risk of a crash."
    • "dealers will reprogram the vehicle control unit"
    • NHTSA Recall 21V-233
  • Backup camera failure (Ford Lincoln) / March 2021
    • "The image processing module may be unable to provide video feed to the display, which could result in a loss of the backup camera image."
    • "start-up anomaly between the serializer component in the image processing module and the deserializer in the accessory protocol interface module"
    • "dealers will update the image processing module software with the latest level"
    • NHTSA Recall 21V-223
  • ABS and DSC disabled due to diagnostic check issue (Jaguar) / March 2021
    • "The diagnostic check for the Anti-Lock Brake System (ABS) that runs at vehicle startup may not complete in the time required, which could disable the ABS and the Dynamic Stability Control (DSC) system during that drive cycle."
    • "on occasion, the CCF read cycles by the ABS module were not being completed in the time expected, with the diagnostic checks taking up to 25 seconds. After 15 seconds, the ABS stops transmitting and this terminates the ABS and DSC systems and a Malfunction Indicator Lamp illuminates on the instrument cluster to warn the driver the systems are not available."
    • "dealers will update the vehicle software"
    • NHTSA Recall 21V-167
  • Loss of high voltage system (Volvo) / Feb 2021
    • For recharge vehicles: "The Battery Energy Control Module (BECM) microprocessor may reset and cause the high voltage system to disconnect."
    • "A disconnected high voltage system can cause a loss of drive power, increasing the risk of a crash."
    • NHTSA Recall 21V-109    (See also 21V-110 for Polestar equivalent)
  • ESC causes vehicle pull (Mercedes) / Feb 2021
    • "During certain evasive driving maneuvers, the Electronic Stability Program (ESP) software may apply torque to one of the front wheels, pulling the vehicle to one side."
    • "If the vehicle unexpectedly pulls to one side during an evasive maneuver, it can increase the risk of a crash."
    • "Due to a deviation in the supplier development process, the ESP software might not meet current production specifications."
    • "The customer will not receive an advance warning due to the nature of the failure mechanism."
    • NHTSA Recall 21V-071
  • Rearview image blanking / failed OTA update (Subaru) / Dec 2020
    • "The August 2020 over-the-air software update may have timed out without completing the installation, corrupting the data, and causing the rearview display to shutoff intermittently." 
    • "a FOTA update was made available for certain vehicles. If the software download was initiated and if there was a delay in the data writing speed of the flash memory, the installation process could timeout. A timeout failure during the data writing sequence could cause the data to be corrupted, and if corrupted, may result in the CID going blank. In this blank condition, the backup camera and display will continue to function with the shift selector in Reverse, however, the CCM may continuously reboot approximately every three (3) minutes. If a reboot occurs while the vehicle is reversing, the rearview display may disappear during the reboot process, which takes approximately six (6) seconds to complete"
    • NHTSA Recall 20V-766
  • Body Control Communication Error (Honda) / Dec 2020
    • "A software error may cause intermittent or continuous disruptions in communication between the Body Control Module (BCM) and other components.  This may result in malfunctions of various systems such as the windshield wipers and defroster, rearview camera, exterior lights, audible warning of a stopped vehicle, and power window operation."
    • NHTSA Recall 20V-771
  • Alternating MPH and KM/H display (Jaguar) / Dec 2020
    • "due to an error in a service software update, the Instrument Cluster (IC) randomly displays alternating speedometer and odometer units between MPH and KM/H while the vehicle is in motion without the driver making any selection of display units"
    • The changing displays may cause driver distraction or confusion and possibly result in excessive speed, which can increase the risk of a crash.
    • NHTSA Recall 20V-751
  • Reduced braking performance (Hyundai) / Dec 2020
    • "The Integrated Electronic Brake (IEB) system may detect an abnormal sensor signal and as a result, may significantly reduce braking performance."
    • "HMC identified a condition within the IEB motor control software that, in absence of proper “fail-safe” logic, would disable the IEB motor upon detection of an abnormal sensor signal thus reducing foundational brake performance."
    • NHTSA Recall 20V-748
  • Blank backup camera (VW Jetta) / Nov 2020
    • "The rear view camera could malfunction during an ignition cycle, leading to a black screen or infotainment system freeze. "
    • "A black or frozen rear view image reduces the driver's visibility when reversing, increasing the risk of a crash."
    • "The condition for the error is a specified course of action with fixed time segments and a small time window. When the vehicle is "woken up" (door open), the MIB is also woken up and goes into standby. If the vehicle is then not started the MIB goes back to sleep mode after a waiting time of 30 seconds. This is necessary to minimize the energy consumption in the vehicle. During this state the video line cannot be diagnosed and any activation requests from the camera system are lost or are no longer taken into account."
    • NHTSA Recall 20V-716

  • "Automatic braking systems in some Nissan Rogues are going rogue, safety group says" / Mar 2019
  • "Alfa Romeo recalling 60,000 vehicles to repair cruise management fault" / Mar 2019
  • "Ford recalls 1.5 million Ford Focus cars that could stall with fuel tank problem" / Oct 2018
  • "Toyota recalls trucks, SUVs and cars to fix air bag problem" / Oct 2018
    • "Toyota says the air bag control computer can erroneously detect a fault when the vehicles are started. With a fault, the air bags may not deploy in a crash. The company wouldn't say if the problem has caused any injuries."
    • https://www.abc57.com/news/toyota-recalls-trucks-suvs-and-cars-to-fix-air-bag-problem
  • "Toyota issues second Prius recall in a month on crash risk" / Oct 2018
  • "Safety systems may be disabled when in use" (Mitsubishi) / Sept. 2018
    • "Inappropriate" software in the hydraulic ECU causes the pump to generate electrical noise that resets the ECU. That reset can cause automatic braking to be cancelled, the wheels to lock momentarily, stability control to be momentarily cancelled, or the brake to release if brake auto-hold is active.
    • NHTSA recall 18V-621
  • "GM recalls more than 1M pickups, SUVs for power steering problem" / Sept. 2018
    • 30 crashes; two injuries, no deaths attributed
    • Voltage drop and return causes momentary power steering failure; fixed via software update
    • https://www.freep.com/story/money/cars/general-motors/2018/09/13/gm-recall-pickups-suvs-power-steering/1287911002/
  • "Expert investigation says BMW software to blame" / Aug 2018
  • "Fiat Chrysler recalls 5.3 million vehicles for cruise control defect" / May 2018
  • Incorrect Speed Limitation Software (Mercedes-Benz) / 2018
    •  These vehicles may be equipped with the incorrect reverse speed limitation software. While in reverse, any abrupt changes in steering while exceeding 16 MPH may cause the vehicle to become unstable.
    • NHTSA recall 18V-457
  • Cruise control may not disengage (Mercedes-Benz) / 2017
    • ESP software malfunction may cause engine not to reduce power regardless of speed, driving situation, or brake application.
    • NHTSA recall 17V-713
  • "Fiat Chrysler recalls 1.25 million trucks over software error" / 2017
  • Unintended vehicle movement (Ford) / 2017
    • Quick movement of gear shift can cause up to 1 second selection of reverse gear when shifting into intended drive (forward) gear.
    • NHTSA recall 17V-669
  • Air bags may not deploy in a crash (Mitsubishi) / 2017
    • SRS ECU misinterprets vibrations, disabling air bags from deploying in a crash
    • NHTSA recall 17V-686
  • Unintended acceleration failsafes "missing" (Dodge) / 2016
  • Inadvertent Side Air Bag Deployment (Chrysler) / 2015
    • Side airbags may unexpectedly deploy due to incorrect software calibration; may result in crash or injury
    • NHTSA Recall 15V-460 and 15V-467
  • Radio Software Security Vulnerabilities (Chrysler) / 2015
    • Exploitation of the software vulnerability may result in unauthorized remote modification and control of certain vehicle systems, increasing the risk of a crash.
    • NHTSA Recall 15V-461, 15V-508
  • "Toyota recalls 625,000 hybrids: Software bug kills engines dead with thermal overload" / July 2015
    • Software settings for motor/generator ECU cause thermal damage, then propulsion shutdown
    • https://www.theregister.co.uk/2015/07/15/toyota_recalls_625000_hybrids_over_enginekilling_software_glitch/
    • Note previous recall 14V-053 for similar sounding problem
  • Tire pressure monitoring system message (Ferrari) / 2015
    • TPMS displays 50 mph speed limit warning instead of "do not proceed" warning due to software defect. Driving on punctured tire would cause loss of vehicle control and crash.
    • NHTSA Recall 15V-306
  • Airbag Incorrect Deployment Timing (BMW) / 2015
    • Driver front air bag timing incorrect / fails to meet FMVSS 208 due to programming error
    • NHTSA Recall 15V-148 
  • Passenger Air Bag may be disabled (Jaguar) / 2015
    • Light weight adult may be misclassified, disabling air bag
    • NHTSA Recall 15V-093
  • Unintended side air bag deployment (Chrysler) / 2015
    • Unintended side curtain and seat air bag deployment during operation / software reflash
    • NHTSA Recall 15V-041
  • Brake controller might not activate trailer brakes (Ford) / 2015
    • Trailer brakes not activated when towing, lengthening stopping distance, increasing risk of crash. Fixed via powertrain control module reflash.
    • NHTSA Recall 15V-710
  • On but unattended vehicle may cause CO poisoning (GM) / 2015
    • Vehicle may turn on gasoline engine to recharge hybrid battery, causing carbon monoxide poisoning (e.g., if car is in garage)
    • NHTSA Recall 15V-145
  • Incorrect electric power steering software setting (Jaguar) / 2015
    • Power steering set in factory operating mode. Vehicle can experience additional steering inputs from EPS causing driver to lose ability to control the vehicle.
    • NHTSA Recall 15V-569
  • Air bag may not detect passenger in seat (Nissan) / 2015
    • Configuration management error: incorrect occupant classification software version installed, resulting in no air bag deployment
    • NHTSA Recall 15V-681
  • "Honda admits software problem, recalls 175,000 hybrids" / July 2014
  • Transmission calibration error (Ford) / 2014
    • Due to software calibration error vehicle may be in and display "drive" but engage "reverse" for 1.5 seconds.
    • NHTSA Recall 14V-204
  • Headlights may unintentionally turn off (Motor Coach Industries) / 2014
    • A mux controller may unintentionally turn off headlights while vehicle is in gear
    • NHTSA Recall 14V-370
  • Brake vacuum pump may stop functioning (Mitsubishi) / 2014
    • Software defect causes false detection of stuck relay, disabling brake power assist
    • NHTSA Recall 14V-522
  • Loss of brake vacuum assist (GM) / 2014
    • Loss of power brake assist; fixed with software reflash
    • NHTSA Recall 14V-247
  • Reprogram sensing and diagnostics module (GM) / 2014
    • Module left in "manufacturing mode" when shipped, disabling airbags
    • NHTSA Recall 14V-247
  • Passenger airbag may be disabled (Jaguar) / 2014
    • EEPROM wearout (which is due to a software defect) causes airbag to be partially or totally disabled
    • NHTSA Recall 14V-395
  • Hybrid transmission software (Champion Bus) / 2014
    • Software may improperly raise vehicle's engine speed during downshifts without the driver's input. The increase in speed may result in unintended acceleration.
    • NHTSA Recall 14V-303  (See also 14V-043; 14V-043 Navistar; 14V-026 Kenworth)
  • Cruise control unintended continued acceleration (Chrysler) / 2014
    • Unintended continued acceleration after releasing accelerator due to adaptive cruise control software; may increase risk of crash
    • NHTSA Recall 14V-293
  • Side-curtain rollover airbag deployment delay (Ford) / 2014
    • Errors in the programming software which may result in delayed deployment of side-curtain rollover airbag
    • NHTSA Recall 14V-237
  • Improper seat belt restraint software (Toyota) / 2014
    • Improper software can use insufficient force in crash (e.g., 110 pound passenger force for larger passenger)
    • NHTSA Recall 14V-272
  • Air bag may not detect passenger in seat (Nissan) / 2014
    • Software may incorrectly classify passenger seat as empty; airbag will not deploy
    • NHTSA Recall 14V-138
  • Vehicle may gradually accelerate unexpectedly (Nissan) / 2014
    • If a lost signal from the throttle position sensor is regained (intermittent fault), fail-safe mode is deactivated, opening the throttle and resulting in "gradual" acceleration due to software error.
    • NHTSA Recall 14V-583
  • Inadvertent Air Bag deployment (Ram) / 2014
    • Side air bags deploy when hitting potholes; fixed via software update
    • NHTSA Recall 14V-528
  • Side airbags may deploy on the incorrect side (Chrysler) / 2013
    • Airbag on the wrong side of the vehicle could deploy, leaving occupants with no airbag protection at point of impact due to a software defect
    • NHTSA Recall 13V-283
  • Delayed deployment or non-deployment of airbags (Chrysler/Jeep) / 2013
    • Airbag deployment delayed or no airbag deployment in rollover due to software defect
    • NHTSA Recall 13V-233
  • Airbag deployment software (Chrysler) / 2013
    • Incorrect software installed; air bags may not deploy or might deploy improperly
    • NHTSA Recall 13V-291
  • Improper occupant classification / 2012
    • Incorrect software installed that misclassifies passengers; airbag might not deploy when it should, deploys incorrectly, or deploys when it should not
    • NHTSA Recall 12V-198
  • Occupant classification system (Hyundai) / 2012
    • Software might miss small stature adults and not deploy airbag.
    • NHTSA Recall 12V-354 
  • Cruise Control System/Brake Switch Failure (Mercedes-Benz) / 2011
    • Brake pedal may not automatically disengage cruise control as expected. (Other methods still work.)  If driver pumps brakes it will take unusually high force to stop vehicle.
    • NHTSA Recall 11V-208
  • Engine stall prevention assist software (Honda) / 2011
    • Unexpected vehicle movement from ECU software providing hybrid electric power and unexpectedly moving vehicle in reverse direction if the engine stalls.
    • NHTSA Recall 11V-458
  • Loss of steering power assist (Toyota) / 2010
  • "Toyota: software to blame for Prius brake problems" / 2010
  • ABS ECU Programming (Toyota) / 2010
    • Inconsistent brake feel; increased stopping distances for a given pedal force due to ABS programming, raising the possibility of a crash.
    • NHTSA Recall 10V-039
  • Restraint control module (Land Rover) / 2009
    • Passenger airbag disabled as a result of temporary loss of CAN network messages and a software defect
    • NHTSA Recall 09V-467
  • Double Clutch Gearbox (BMW) / 2008
    • Engine stall increasing risk of a crash due to software multistage downshift defect
    • NHTSA Recall 08V-595
  • Passenger sensing system (GM) / 2008
    • Software condition within passenger sensing system may disable passenger air bag (or enable when it should be disabled).
    • NHTSA Recall 08V-582
  • Passenger air bag fail to deploy (Nissan) / 2008
    • Passenger air bag might not deploy due to low battery voltage combined with software defect
    • NHTSA Recall 08V-066
  • Engine Control Module Software Update (VW) / 2008
    • Software defect can cause unexpected engine surge that can "result in a crash without warning."
    • NHTSA Recall 08V-235
  • SRS Electronic control unit software (Maserati) / 2007
    • Passenger air bag might not deploy if car battery is not fully charged due to software defect
    • NHTSA Recall 07V-550
  • SRS control unit software (Volvo) / 2007
    • Two software errors result in late deployment of side airbags
    • NHTSA Recall 07V-500
  • Passenger side airbag does not deploy (Volkswagen) / 2006
    • A weak battery could cause air bag control unit to deactivate due to a software defect; airbag will not deploy in a crash
    • NHTSA Recall 06V-454
  • Electronic Throttle Control (GM) / 2006
    • ETC torque monitoring failsafe disabled, permitting throttle opening greater than commanded (i.e., UA) due to a software defect
    • NHTSA Recall 06V-007
  • Powertrain control module (DaimlerChrysler) / 2006
    • Software can cause momentary lock up of drive wheels at speeds over 40 mph if operator shifts from drive to neutral and back.
    • NHTSA Recall 06V-341
  • BMW/Driver's seat occupant detection system / 2004
    • Software can't reliably determine if driver seat is occupied; airbag may not deploy.
    • NHTSA Recall 04V-379
  • Jaguar/Forward drive gear / 2004
    • Selecting forward drive gear could select reverse while in forward motion, without indication. (Apparent limp home mode logic defect.)
    • NHTSA Recall 04-024
  • BMW/Engine Idle Speed/DME Idle Control / 2003
    • Increase of idle speed up to 1,300 RPM. If a gear is selected, the driver may feel as if the vehicle is being pushed.
    • NHTSA Recall 03V-124
  • KIA/ABS Electronic Control Module / 2003
    • A programming error in ABS causes reduced braking force at speeds below 25 mph, extending stopping distances
    • NHTSA Recall 03V-158
  • "GM Admits Brake Flaws After Inquiry" / July 1999
  • Chrysler/Interior systems: air bag / 1996
    • Air bag software error which can delay air bag deployment
    • NHTSA Recall 96V-060

Noteworthy: These are software-related problems with cars that are worth knowing about, but less black and white because, for example, there has been no general recall issued.
  • To access NHTSA recalls you need to visit https://www.nhtsa.gov/recalls then select Vehicle then select "search by NHTSA ID" which can take a few mouse clicks to find on the indicated NHTSA web site.  (It might be the interface has changed since I posted this; you might need to poke around to find the lookup function.)
  • This is a work in progress and a VERY incomplete list.  I thought this would be a one-day exercise, but, well, no. If you know of something really important I've missed, please let me know!  More importantly, if you know of someone who is interested in maintaining a list like this, especially as a more rigorous academic study, I'd be happy to collaborate.  I simply don't have the time to keep up with this.
  • Reasonable people can perhaps disagree about the inclusion or exclusion of some items. But the point is really more about the volume rather than any individual item. By definition each recall is a defect that should not have been shipped, because it resulted in a recall.  I've paraphrased the recall reports. If you want to know more be sure to look at the supporting documents on the NHTSA web site, which often have more details than the summaries.
  • To be "deadly" these defects have to be software faults that either have caused, could reasonably cause, or should have reasonably prevented significant injury or death. (This includes defects in failsafes, for example.) A partial list includes: un-commanded acceleration (UA), stalling at speed (dangerous when merging onto a highway), failure to deactivate cruise control, extended braking distances, airbag disablement, and incorrect airbag deployment.  What happens in practice depends upon the circumstances.
  • This should not be construed to be an expert opinion of root cause of any particular mishap. I am summarizing publicly available information and have not independently verified the technical facts in each case. Those public sources might be incorrect, or I might not have fully understood the implications of the statements in those sources. Again, this is more about the overall trend and not any particular incident report.
  • There are plenty of commenters who say things about unintended acceleration like "just apply the brakes, because brakes always overcome the engine." First, this is simply not true in many situations due to loss of vacuum assist, drivers with weak leg strength, etc. Second, a single point fault or sufficiently likely multi-point fault should not be trying to kill the occupants in the first place, so it's still a defect.
  • The air bag software problems were found in: https://www.autosafety.org/staging/wp-content/uploads/import/Historical%20Airbag%20Recalls_1.pdf  I independently verified them on the NHTSA database.
  • I independently verified on the NHTSA database some drivetrain recalls found here: https://www.autosafety.org/sites/default/files/imce_staff_uploads/Exemplary%20Vehicle%20Software%20Recalls.pdf
    and here: https://www.autosafety.org/wp-content/uploads/2016/04/2014-15-Software-Recalls.pdf
  • If you want to go exploring, you can download a copy of the raw database here that I used for some of the other defects: https://www-odi.nhtsa.dot.gov/downloads/
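
As a starting point for that kind of exploration, here is a minimal Python sketch that scans tab-delimited recall records for a keyword and tallies the matching campaigns by year. It is only a sketch under stated assumptions: I am assuming the raw download is tab-delimited text, that campaign IDs in it may carry extra trailing digits (so "15V460000" is treated as 15V-460), and that the first two digits of an ID give the year. The function names are hypothetical, and a keyword hit is only a candidate for manual review, not proof of a software defect.

```python
import csv
import re
from collections import Counter

# Campaign IDs in the list above look like "15V-460" or "03V124".
# Assumption: the raw file may pad the ID with trailing digits.
CAMPAIGN_ID = re.compile(r"\b(\d{2})V-?(\d{3})\d*\b")


def software_campaigns(lines, keyword="software"):
    """Return the set of campaign IDs on rows that mention `keyword`.

    `lines` is any iterable of tab-delimited text lines (for example an
    open file). No column layout is assumed: every field is searched and
    the campaign ID is located by pattern, not by position.
    """
    found = set()
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        text = " ".join(row)
        if keyword.lower() not in text.lower():
            continue
        match = CAMPAIGN_ID.search(text)
        if match:
            # Normalize to the "15V-460" style used in the list above.
            found.add(f"{match.group(1)}V-{match.group(2)}")
    return found


def tally_by_year(campaign_ids):
    """Count campaigns per calendar year, inferred from the ID prefix."""
    counts = Counter()
    for cid in campaign_ids:
        yy = int(cid[:2])
        # Assumption: two-digit years 50-99 are 19xx, 00-49 are 20xx.
        year = 1900 + yy if yy >= 50 else 2000 + yy
        counts[year] += 1
    return counts
```

Passing an open file object as `lines` should work the same way as passing a list of strings. Each hit still needs to be read by a person, since plenty of recall descriptions mention software only as the remedy for a non-software problem.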

Saturday, September 8, 2018

Different types of risk analysis: ALARP, GAMAB, MEM and more

When we talk about how much risk is enough, it is common to do things like compare the risk to current systems, or argue about whether something is more (or less) likely than events such as being killed by lightning. There are established ways to think about this topic, each with tradeoffs.

The next time you need to think about how much risk is appropriate in a safety-critical system, try these existing approaches on for size instead of making up something on your own:

ALARP: "As Low As Reasonably Practicable"  Some risks are acceptable. Some are unacceptable. Some are worth taking in exchange for benefit, but if that is done the risk must be reduced to be ALARP.

GAMAB: "Globalement Au Moins Aussi Bon"  Offer a level of risk at least as good as the risk offered by an equivalent existing system. (i.e., no more dangerous than what we have already for a similar function)

MEM: "Minimum Endogenous Mortality"  The technical system must not create a significant risk compared to globally existing risks. For example, this should cause a minimal increase in overall death rates compared to the existing population death rates.

MGS: "Mindestens Gleiche Sicherheit"   (At least the same level of safety) Deviations from accepted practices must be supported by an explicit safety argument showing at least the same level of safety. This is more about waivers than whole-system evaluation.

NMAU: "Nicht Mehr Als Unvermeidbar"  (Not more than unavoidable)  Assuming there is a public benefit to the operation of the system, hazards should be avoided by reasonable safety measures implemented with reasonable cost.

Each of these approaches has pros and cons.  The above terms were paraphrased from this nice discussion:
Kron, On the evaluation of risk acceptance principles,
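
To make two of these principles a bit more concrete, here is a minimal Python sketch of GAMAB and MEM as simple threshold checks. Everything in it is an assumption for illustration: the `RiskEstimate` type, the function names, the choice of fatalities per person-year as the unit, and the numbers in the example are placeholders rather than values taken from the paper above. ALARP, MGS, and NMAU are left out because they depend on cost/benefit judgment and safety arguments that do not reduce to a single comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEstimate:
    """Estimated fatalities per person-year attributable to one system."""
    fatalities_per_person_year: float


def meets_gamab(new: RiskEstimate, existing: RiskEstimate) -> bool:
    """GAMAB: the new system is at least as good as the equivalent existing one."""
    return new.fatalities_per_person_year <= existing.fatalities_per_person_year


def meets_mem(new: RiskEstimate,
              background_mortality: float,
              allowed_fraction: float) -> bool:
    """MEM: the added risk is a small fraction of the mortality people already face."""
    return new.fatalities_per_person_year <= background_mortality * allowed_fraction


if __name__ == "__main__":
    # Placeholder numbers, chosen only to exercise the checks.
    proposed = RiskEstimate(5e-6)
    incumbent = RiskEstimate(2e-5)
    print("GAMAB satisfied:", meets_gamab(proposed, incumbent))
    print("MEM satisfied:", meets_mem(proposed,
                                      background_mortality=2e-4,
                                      allowed_fraction=0.05))
```

Note that the two checks can disagree: a system can beat a dangerous incumbent (GAMAB) while still adding more risk than MEM would tolerate, which is one reason the choice of principle matters.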

There is an interesting set of slides that covers similar ground here, and works some examples. In particular, the graphs showing whether risks are taken voluntarily in different scenarios are thought provoking:

In general, if you want to dig deeper into this area, a search on
    gamab mem alarp 
will bring up a number of hits.

Also note that legal and other types of considerations exist, especially regarding product liability.