Monday, June 2, 2014

Avoid High Cyclomatic Complexity

Cyclomatic complexity is a measure of the number of paths through a particular piece of code (a higher number means the software is more complex). Critical software functions with high cyclomatic complexity are one type of "spaghetti" code that should be avoided. One can argue an exception for single-level switch statements, so long as there are no nested conditionals within the switch cases themselves.

Consequences: A high cyclomatic complexity for a particular function means that the function will be difficult to understand, and more difficult to test. That in turn means functions with a cyclomatic complexity above 10 or 15 can be reasonably expected to have more undetected defects than simpler functions. In practice, “spaghetti code” with a high cyclomatic complexity can reasonably be expected to have a poor design and poor unit test coverage, resulting in the reasonable expectation that it will have an elevated rate of software defects.

Accepted Practices:
  • Cyclomatic complexity should be limited for almost all software functions to a relatively small, predetermined threshold defined by the project software development plan. A range of 10 to 15 for a significant majority of functions in a project is typical.
  • Functions with a cyclomatic complexity above the defined threshold of 10 to 15 should be subject to special scrutiny, and simplified if practical. Functions with a complexity significantly above 15 (perhaps a predefined threshold in the range of 30-50 and above, depending on the particular system involved) should be strictly prohibited.
  • In some cases, functions with a single switch statement having many cases may be acceptable because of the regular structure of a switch statement, but should not contain nested switch statements, nor contain conditional statements within the first-level switch statement.
Discussion:

McCabe Cyclomatic Complexity (e.g., as discussed in NIST 500-235) is a measure of the complexity of a piece of software. Rather than counting the number of lines of code, cyclomatic complexity (roughly speaking) counts the number of decision points within the software. For example, a piece of software with 1000 lines of code but no “if” statements and no loops has a cyclomatic complexity of 1. But a piece of software 100 lines long might have a much higher cyclomatic complexity if it contains many levels of nested “if” statements and loops. Thus, cyclomatic complexity measures how complicated a piece of software is in terms of flow control structure rather than simply how big it is.

An important use of cyclomatic complexity is in understanding how difficult a piece of software will be to test. A common assessment of unit testing thoroughness is the “branch coverage” of a set of tests. Branch coverage measures whether tests that have been run have exercised both sides of a branch. For example, if a piece of code has a single “if … else” construct, attaining 100% branch coverage requires one test case to exercise the “if” part, and another test case to exercise the “else” part, since a program can’t take both paths at once. But, if there are nested conditionals the number of tests required can increase dramatically due to all the different paths that might have to be taken to exercise all paths through the code. Cyclomatic complexity provides a way to understand the number of tests required for these paths. In particular, depending upon the specifics of the software, it can take up to M unit tests to achieve 100% branch coverage, where M is the McCabe Cyclomatic Complexity of the software under test. (The actual number depends upon the structure of the software, but for a higher complexity number one expects that the effort required to test the software will increase.)

Below are examples of program control flow graphs from NIST publication 500-235. The point of these figures is not the particulars of the program, but rather the visible increase in number of possible paths (from top to bottom in each graph) as the cyclomatic complexity increases. Ensuring that tests exercise each and every path through the graph is going to be a lot harder for a complexity of 17 or 28 than for a complexity of 5.


Cyclomatic complexity examples from NIST 500-235, 1996.

A generally accepted practice is to limit cyclomatic complexity for each software function within a system to a range of 10-15 (NIST 500-235 pg. 25). Functions having a complexity above 10 or 15 require special attention and consideration as to why the deviation is required, and can reasonably be expected to be much more difficult to test thoroughly than functions with a complexity of 10 or below. This means that the effort required to achieve high branch coverage during unit test of each function is reduced with low cyclomatic complexity.

Code with a high cyclomatic complexity normally ends up as what is referred to as “spaghetti code,” because the control flow structure is a tangled mess like a plate of spaghetti. When you pull at one end of a piece of spaghetti on a dinner plate, you have no idea which of the other ends will also move. So it is with trying to understand the details of spaghetti software. Such code has the properties of being difficult to understand, difficult to test, and more prone to defects. (Historical note: "spaghetti code" classically referred to code with many goto statements, but more recently the term has come to encompass poorly structured code in general.)

The primary exception to this rule in practice is when a “switch” statement is used to deal with a large number of alternatives in a very regular pattern, especially if the switch is being used to dispatch different command bytes from a communication protocol parser or the like. But, test effort is still significant for a switch statement with many alternative possible paths, and some compilers have trouble with switch statements having more than 256 cases, so care should be taken with large switch statements. In any event, nested conditionals should not be buried inside cases of switch statements.

Selected Sources:

US National Institute of Standards and Technology (NIST) Special Publication 500-235 (Walsh 1996) explains the relationship between cyclomatic complexity and testing effort. Page 25 of that publication discusses a complexity limit of 10 being an accepted practice, and says that while limits as high as 15 have been successful, even higher levels of complexity require additional testing effort.

The Reliability Information Analysis Center (“RIAC” – formerly known as “RAC”) is the focus point of the US Department of Defense for reliability information of all types. They summarize the effects of McCabe Cyclomatic Complexity (MCC) as follows: “Empirically, numbers less than ten imply reasonable structure, numbers higher than 30 are of questionable structure. Very high cyclomatic numbers of more than 50 imply the application cannot be tested, while even higher numbers of more than 75 imply that every change may trigger a ‘bad fix.’. This metric is widely used for Quality Assurance and test planning purposes.” (RAC 1996, p. 124) In context, the term “bad fix” refers in general to a situation in which every attempt to fix a software defect introduces one or more new software defects, making it impossible to remove all defects regardless of how much effort is expended.



Jack Ganssle, a respected expert on embedded systems, makes it clear that “spaghetti” code must be avoided. He writes “A sure sign of disaster is the programmer who doesn’t seem to have a firm grasp of the program’s data and control flow, or who writes spaghetti code.” (Ganssle 1992, pg. 64)

Walsh found that defect rates increased significantly for modules with a McCabe Cyclomatic Complexity value above 10 on an embedded computer system (Walsh 1979, pg. 766).

Scholarly data on code complexity metrics is mixed because of the difficulty of performing controlled studies across different application areas, different programming team skill levels, different languages, and other confounding factors. For example, different projects start “counting” defects at different points in the development process, and defect counts are responsive to whether testing is thorough or not (inadequate testing may not find bugs, but that doesn’t mean the software is truly defect-free). Safety critical system developers tend to be a conservative lot (for good reason), and prefer to have less complexity because it makes it easier to inspect and test systems to ensure low defect rates.

I can also draw on my own personal experience conducting more than 130 embedded software design reviews. In essentially all cases when I see a complexity warning sign, I can find a defect within an hour (usually just minutes) of looking at code. But, more importantly, it is relatively straightforward to make the code less complex and more understandable (and therefore less defect prone) with very little effort in almost every case of high complexity that I have encountered. And, usually, once I show the programming team how to simplify, they agree it is better and start the practice themselves on other code. So, in my own extensive consulting and teaching experience, what I have found is that code with complexity warning signs is indicative of a software team that needs to improve in some significant way to create acceptable software – which is an undesirable state for developers of safety critical systems. When I find code that is simple, understandable, and devoid of unnecessary complexity, I know that I have found programmers who in general know what they are doing.

The developers of safety critical software standards feel the same way. They make it clear that complexity must be kept low in safety critical systems. Simplicity is one of the tools that is essential for creating safe systems. All of MISRA Report 5 is devoted to the topic of Software Metrics. It describes MCC (MISRA Report 5, pp. 37-38, p. 60) and gives a maximum acceptable complexity of 15 for safety critical systems.

The FAA recommends using a complexity metric such as MCC (FAA 2000, pg. J-17, J-23). NASA recommends reducing complexity using MCC and design for modularity via having weak coupling (NASA 2004, p. 95).

There are many possible complexity metrics for code. I have found that cyclomatic complexity is a useful predictor of likely code quality.

References:
  • FAA, System Safety Handbook, Appendix J: Software Safety, Federal Aviation Administration, Dec. 2000
  • Ganssle, J., The Art of Programming Embedded Systems, Academic Press, 1992.
  • MISRA, Report 5: Software Metrics, February 1995.
  • NASA-GB-8719.13, NASA Software Safety Guidebook, NASA Technical Standard, March 31, 2004.
  • RAC, Introduction to Software Reliability: a state of the art review, Reliability Analysis Center, 1996.
  • Watson, A. & McCabe, T., (NIST 500-235), Structured Testing: a testing methodology using the cyclomatic complexity metric, Special Publication 500-235, National Institute of Standards and Technology, Sept. 1996




2 comments:

  1. The graph given above have their respective complexities mentioned, but to which program do they belongs to?

    ReplyDelete
  2. These are from a NIST publication, so I don't know exactly how they were generated. In this context they are merely meant to be examples. Note that the complexity=28 example looks like it might be a switch statement, which is a special case when considering complexit.

    ReplyDelete

Please send me your comments. I read all of them, and I appreciate them. To control spam I manually approve comments before they show up. It might take a while to respond. I appreciate generic "I like this post" comments, but I don't publish non-substantive comments like that.

If you prefer, or want a personal response, you can send e-mail to comments@koopman.us.
If you want a personal response please make sure to include your e-mail reply address. Thanks!