NOTE: there is an update here:
https://users.ece.cmu.edu/~koopman/lectures/index.html#642
which includes newer course notes and quite a few YouTube videos of these lectures.
You should use that URL instead of this blog post, but I've left this post as-is for Fall 2017.
18-642 Embedded System Software Engineering
Prof. Philip Koopman, Carnegie Mellon University, Fall 2017
# | Slides | Topics |
1 | Course Introduction | Software is eating the world; embedded applications and markets; bad code is a problem; coding is 0% of software; truths and management misconceptions |
2 | Software Development Processes | Waterfall; swiss cheese model; lessons learned in software; V model; design vs. code; agile methods; agile for embedded |
3 | Global Variables | Global vs. static variables; avoiding and removing globals |
4 | Spaghetti Code | McCabe Cyclomatic Complexity (MCC); SCC; Spaghetti Factor (SF) |
5 | Unit Testing | Black box testing; white box testing; unit testing strategies; MC/DC coverage; unit testing frameworks (CUnit) |
6 | Modal Code/Statecharts | Statechart elements; statechart example; statechart implementation |
7 | Peer Reviews | Effective code quality practices; peer review efficiency and effectiveness; Fagan inspections; rules for peer review; review report; perspective-based reviews; review checklist; case study; economics of peer review |
8 | Code Style/Humans | Making code easy to read; good code hygiene; avoiding premature optimization; coding style |
9 | Code Style/Language | Pitfalls and problems with C; language use guidelines and analysis tools; using language wisely (strong typing); Mars Climate Orbiter; deviations & legacy code |
10 | Testing Quality | Smoke testing; exploratory testing; methodical test coverage; types of testing; testing philosophy; coverage; testing resources |
11 | Requirements | Ariane 5 flight 501; rules for good requirements; problematic requirements; extra-functional requirements; requirements approaches; ambiguity |
12 | System-Level Test | First bug story; effective test plans; testing won't find all bugs; F-22 Raptor date line bug; bug farms; risks of bad software |
13 | SW Architecture | High Level Design (HLD); boxes and arrows; sequence diagrams (SD); statechart to SD relationship; 2011 Health Plan chart |
14 | Integration Testing | Integration test approaches; tracing integration tests to SDs; network message testing; using SDs to generate unit tests |
15 | Traceability | Traceability across the V; examples; best practices |
16 | SQA isn't testing | SQA elements; audits; SQA as coaching staff; cost of defect fixes over project cycle |
17 | Lifecycle CM | A400M crash; version control; configuration management; long lifecycles |
18 | Maintenance | Bug fix cycle; bug prioritization; maintenance as a large cost driver; technical debt |
19 | Process Key Metrics | Tester to developer ratio; code productivity; peer review effectiveness |
33 | Date Time Management | Keeping time; time terminology; clock synchronization; time zones; DST; local time; sunrise/sunset; mobility and time; date line; GMT/UTC; leap years; leap seconds; time rollovers; Zune leap year bug; internationalization |
21 | Floating Point Pitfalls | Floating point formats; special values; NaN and robots; roundoff errors; Patriot Missile mishap |
23 | Stack Overflow | Stack overflow mechanics; memory corruption; stack sentinels; static analysis; memory protection; avoid recursion |
25 | Race Conditions | Therac 25; race condition example; disabling interrupts; mutex; blocking time; priority inversion; priority inheritance; Mars Pathfinder |
27 | Data Integrity | Sources of faults; soft errors; Hamming distance; parity; mirroring; SECDED; checksum; CRC |
20 | Safety+Security Overview | Challenges of embedded code; it only takes one line of bad code; problems with large scale production; your products live or die by their software; considering the worst case; designing for safety; security matters; industrial controls as targets; designing for security; testing isn't enough; Fiat Chrysler Jeep hack; Ford Mytouch update; Toyota UA code quality; Heartbleed; Nest thermostats; Honda UA recall; Samsung keyboard bug; hospital infusion pumps; LIFX smart lightbulbs; German steel mill hack; Ukraine power hack; SCADA attack data; Shodan; traffic light control vulnerability; hydroelectric plant vulnerability; zero-day shopping list |
22 | Dependability | Dependability; availability; Windows 2000 server crash; reliability; serial and parallel reliability; example reliability calculation; other aspects of dependability |
24 | Critical Systems | Safety critical vs. mission critical; worst case and safety; HVAC malfunction hazard; Safety Integrity Levels (SIL); Bhopal; IEC 61508; fleet exposure |
26 | Safety Plan | Safety plan elements; functional safety approaches; hazards & risks; safety goals & safety requirements; FMEA; FTA; safety case (GSN) |
28 | Safety Requirements | Identifying safety-related requirements; safety envelope; Doer/Checker pattern |
29 | Single Points of Failure | Fault containment regions (FCR); Toyota UA single point failure; multi-channel pattern; monitor pattern; safety gate pattern; correlated & accumulated faults |
30 | SIL Isolation | Isolating different SILs; mixed-SIL interference sources; mitigating cross-SIL interference; isolation and security; CarShark hack |
31 | Redundancy Management | Bellingham WA gasoline pipeline mishap; redundancy for availability; redundancy for fault detection; Ariane 5 Flight 501; fail operational; triplex modular redundancy (TMR) 2-of-3 pattern; dual 2-of-2 pattern; high-SIL Doer/Checker pattern; diagnostic effectiveness and proof tests |
32 | Safety Architecture Patterns | Supplemental lecture with more detail on patterns: low SIL; self-diagnosis; partitioning; fail operational; voting; fail silent; dual 2-of-2; Ariane 5 Flight 501; fail silent patterns (low, high, mixed SIL); high availability mixed SIL pattern |
34 | Security Plan | Security plan elements; Target Attack; security requirements; threats; vulnerabilities; mitigation; validation |
35 | Cryptography | Confusion & diffusion; Caesar cipher; frequency analysis; Enigma; Lorenz & Colossus; DES; AES; public key cryptography; secure hashing; digital signatures; certificates; PKI; encrypting vs. signing for firmware update |
36 | Security Threats | Stuxnet; attack motivation; attacker threat levels; DirecTV piracy; operational environment; porous firewalls; Davis-Besse incident; BlueSniper rifle; integrity; authentication; secrecy; privacy; LG Smart TV privacy; DoS/DDoS; feature activation; St. Jude pacemaker recall |
37 | Security Vulnerabilities | Exploit vs. attack; Kettle spambot; weak passwords; master passwords; crypto key length; Mirai botnet attack; crypto mistakes; LIFX revisited; CarShark revisited; chip peels; hidden functionality; counterfeit systems; cloud connected devices; embedded-specific attacks |
38 | Security Mitigation Validation | Password strength; storing passwords & salt/pepper/key stretching; Adobe password hack; least privilege; Jeep firewall hack; secure update; secure boot; encryption vs. signing revisited; penetration testing; code analysis; other security approaches; rubber hose attack |
39 | Security Pitfalls | Konami code; security via obscurity; hotel lock USB hack; Kerckhoffs's principle; hospital WPA setup hack; DeCSS; Lodz tram attack; proper use of cryptography; zero day exploits; security snake oil; realities of in-system firewalls; aircraft infotainment and firewalls; zombie road sign hack |
Note that in Spring 2018 these are likely to be updated, so if you want to see the latest, also check the main course page: https://www.ece.cmu.edu/~ece642/ For other lectures and copyright notes, please see my general lecture notes & video page: https://users.ece.cmu.edu/~koopman/lectures/index.html
Thanks for providing the material.
As a working embedded engineering professional, I'm impressed that the course covers the real engineering process with requirements, quality etc. I wish I had known that before graduating and entering the field.
Wonderful. Thank you so much for this material. I found the Toyota case study to be a particularly interesting read. I remember reading all the news surrounding the story and the handout gives a lot more context. I am impressed with the breadth of topics that are being covered in this course. Just curious, in terms of feedback, were there some suggestions from students about other topics to include in this course?
Thanks for the kind words. The topics are based on problems I've seen doing embedded system design reviews in industry. There are always more topics to cover than time available, but this covers a lot of the topics that practicing engineers tell me they wish they'd seen in college.
Phil,
thanks for making this material available. From a European/Formalist perspective, I am concerned that your students are leaving the course with the impression that embedded systems are programmed in, well, C and nothing else. That's not the way we do things over here. I would be encouraged if your students at least knew of the existence of other approaches such as SPARK (the Ada subset), the Frama-C toolset (as used by Airbus), and so on. How do you think the Typhoon aircraft stays in the air? :-)
Yours,
Rod Chapman, Director, Protean Code Limited
Rod, thanks for the thoughtful comment. This being a US-taught course, they need C to be prepared for the job market :). I agree it's a good idea to give them exposure to more capable languages, especially in highly critical systems. I'll consider how to do this next semester when I teach the course. (In fairness I did spend a little time talking about SPARK in class, but it doesn't appear in the lecture notes.)
I've been over most of the other modules - I really like the Code Review material... your guidance and results seem to be completely consistent with the PSP/TSP reviewing disciplines, other than that PSP recommends both Personal AND Peer review (i.e. do both!) rather than full-blown Fagan Inspections. Jim Over's team at SEI have a lot of data on this from TSP teams and PSP training classes...
- Rod
My experience has been that full-blown Fagan Inspections have a dramatically better result in a typical embedded system project. The typical failure mode is that if you don't treat a review as a formal, almost ceremonial, event the quality degrades rapidly and it becomes a 30-second checkbox with no value. If teams get proficient enough at reviews that they can track a consistent high defect removal rate (above 50% of all defects) and they can then relax formality, great for them. But when first implemented, my experience is that anything less than very formal reviews is very likely to fail. Since most of the students in my class have never done a peer review of any kind before (beyond perhaps a quick over-the-shoulder check), this is the best way to get their head in the game.
If you are doing full-up PSP/TSP it might be that the benefits of a Fagan-level of formality review are subsumed by other practices, but to a first approximation none of the teams I see in industry are using PSP/TSP. (Probably this is because the PSP/TSP teams aren't suffering near-death experiences on projects that prompt them to call me in to help.) And the majority of students taking my class are ECE students who will never take another software engineering course besides mine in their life. So I stand by my recommendation for the majority of practitioners (not using PSP/TSP and looking for some way to improve things.) If you're using PSP/TSP and the metrics say less formal reviews are working for you, then that's fine.
I suspect the difference in our observations is due to the underlying maturity and rigor of the engineering process beyond peer reviews. Thanks for your useful comments!