Monday, March 7, 2016

Multiple Returns and Error Checking

Summary: Whether or not to allow multiple returns from a function is a controversial matter, but I recommend having a single return statement AND avoiding use of goto.

Discussion:

If you've been programming robust embedded systems for a while, you've seen code that looks something like this:

int MyRoutine(...)
{ ...
  if(..something fails..) { ret = 0; }
  else 
  { .. do something ...
    if(..somethingelse fails..) {ret = 0;}
    else 
    { .. do something ..
      if(...yetanotherfail..) {ret = 0;}
      else 
      { .. do the computation ...
        ret = value; 
      }
    }
  }
  // perform default function
  return(ret);
}

(Note: the "ret = 0" and "ret = value" parts are usually more complex; I'm keeping this simple for the sake of the discussion.)

The general pattern to this code is doing a bunch of validity checks and then, finally, in the most deeply nested "else" doing the desired action. Generally the code in the most deeply nested "else" is actually the point of the entire routine -- that's the action that you really want to take if all the exception checks pass.  This type of code shows up a lot in a robust embedded system, but can be difficult to understand.

This problem has been dealt with a number of times and most likely every reasonable idea has been re-invented many times. But when I ran up against it in a recent discussion I did some digging and found out I mostly disagree with a lot of the common wisdom, so here is what I think.

A common way to simplify things is the following, which is the guard clause pattern described by Martin Fowler in his refactoring catalog (an interesting resource with accompanying book if you haven't run into it before.)

int MyRoutine(...)
{ ...
  if(..something fails..)     {return(0);}
  if(..somethingelse fails..) {return(0);}
  if(...yetanotherfail..)     {return(0);}
  // perform default function
  return(value);
}

Reasonable people can (and do!) disagree about how to handle this.   Steve McConnell devotes a few pages of his book Code Complete (Section 17.1, and 17.3 in the second edition) that covers this territory, with several alternate suggestions, including the guard clause pattern and a discussion that says maybe a goto is OK sometimes if used carefully.

I have a somewhat different take, possibly because I specialize in high-dependability systems where being absolutely sure you are not getting things wrong can be more important than subjective code elegance. (The brief point of view is: getting code just a little klunky but obviously right and easy to check for correctness is likely to be safe. Getting code to look nice, but with a residual small chance of missing a subtle bug might kill someone. More this line of thought can be found in a different posting.)

Here is a sketch of some of the approaches I've seen and my thoughts on them:

(1) Another Language. Use another language that does a better job at this.  Sure, but I'm going to assume that you're stuck with just C.

(2) Guard Clauses.  Use the guard class pattern with early returns as shown above . The downside of this is that it breaks the single return per function rule that is often imposed on safety critical code. While a very regular structure might work out OK in many cases, things get tricky if the returns are part of code that is more complex. Rather than deal with shades of gray about whether it is OK to have more than one return, I prefer a strict single-return-per-function rule. The reason is that it avoids having to waste a lot of time hashing out when multiple returns are OK and when they aren't -- and the accompanying risk of getting that judgment call wrong. In other words black-and-white rules take less interpretation.

If you are building a safety critical system, every subjective judgement call as to what is OK and what is not OK is a chance to get it wrong.  (Could you get airtight design rules in place? Perhaps. But you still need to check them accurately after every code modification. If you can't automate the checks, realistically the code quality will likely degrade over time, so it's a path I prefer not to go down.)

(3) Else If Chain.  Use an if/else chain and put the return at the end:

int MyRoutine(...)
{ ...
  if(..something fails..)          {ret=0;}
  else if(..somethingelse fails..) {ret=0;}
  else if(...yetanotherfail..)     {ret=0;}
  else { // perform default function
         ret = value;
        }
  return(ret);
}

Wait, isn't this the same code with flat indenting?  Not quite.  It uses "else if" rather than "else { if". That means that the if statements aren't really nested -- there is really only a single main path through the code (a bunch of checks and the main action in the final code segment), without the possibility of lots of complex conditions in the "if" sidetrack branches.

If your code is flat enough that this works then this is a reasonable way to go. Think of it as guard clauses without the multiple returns.  The regular "else if" structure makes it pretty clear that this is a sequence of alternatives, or often a set of "check that everything is OK and in the end take the action." However, there may be cases in which the code logic is difficult to flatten this way, and in those cases you need something else (keep reading for more ideas).  The trivial case is:

(3a) Simplify To An Else If Chain.  Require that all code be simplified enough that an "else if" chain works.  This is a nice goal, and my first choice of style approaches.  But sometimes you might need a more capable and flexible approach.  Which brings us to the rest of the techniques...

(4) GOTO. Use a "goto" to break out of the flow and jump to the end of the routine.  I have seen many discussion postings saying this is the right thing to do because the code is cleaner than lots of messy if/else structures. At the risk of being flamed for this, I respectfully disagree. The usual argument is that a good programmer exhibiting discipline will do just fine.  But that entirely misses what I consider to be the bigger picture.  I don't care how smart the programmer who wrote the code is (or thinks he is).  I care a lot about whoever has to check and maintain the code. Sure, a good programmer on a good day can use a "goto" and basically emulate a try/throw/catch structure.  But not all programmers are top 10 percentile, and a lot of code is written by newbies who simply don't have enough experience to have acquired mature judgment on such matters. Beyond that, nobody has all good days.

The big issue isn't whether a programmer is likely to get it right. The issue is how hard (and error-prone) it is for a code reviewer and static analysis tools to make sure the programmer got it right (not almost right, or subtly wrong). Using a goto is like pointing a loaded gun at your foot. If you are careful it won't go off. But even a single goto shoves you onto a heavily greased slippery slope. Better not to go there in the first place. Better to find a technique that might seem a little more klunky, but that gets the job done with minimum fuss, low overhead, and minimal chance to make a mistake. Again, see my posting on not getting things wrong.

(Note: this is with respect to unrestricted "goto" commands used by human programmers in C. Generated code might be a different matter, as might be a language where "goto" is restricted.)

(5) Longjmp.  Use setjmp/longjmp to set a target and then jump to that target if the list of "if" error checks wants to return early. In the final analysis this is really the moral equivalent of a "goto," although it is a bit better controlled. Moreover, it uses a pointer to code (the setjmp/longjmp variable), so it is an indirect goto, and that in general can be hazardous. I've used setjmp/longjmp and it can be made to work (or at least seem to work), but dealing with pointers to code in your source code is always a dicey proposition. Jumping to a corrupted or uninitialized pointer can easily crash your system (or sometimes worse).  I'd say avoid using this approach.

I've seen discussion forum posts that wrap longjmp-based approaches up in macros to approximate Try/Throw/Catch. I can certainly appreciate the appeal of this approach, and I could see it being made to work. But I worry about things such as whether it will work if the macros get nested, whether reviewers will be aware of any implementation assumptions, and what will happen with static analysis tools on those structures. In high-dependability code if you can't be sure it will work, you shouldn't do it.

(6) Do..While..Break. Out of the classical C approaches I've seen, the one I like the most (beyond else..if chains) is using a "do..while..break" structure:

int MyRoutine(...)
{ ...
  do { //  start error handling code sequence
    if(..something fails..)     {ret=0; break;}
    if(..somethingelse fails..) {ret=0; break;}
    if(...yetanotherfail..)     {ret=0; break;}
    // perform default function
    ret = value;
  } while (0); // end error handling code sequence
  return(ret);
}

This code is a hybrid of the guard pattern and the "else if" block pattern. It uses a "break" to skip the rest of the code in the code block (jumping to the while(0), which always exits the loop). The while(0) converts this structure from a loop into just a structured block of code with the ability to branch to the end of the code block using the "break" keyword. This code ought to compile to an executable that is has essentially identical efficiency to code using goto or code using an else..if chain. But, it puts an important restriction on the goto-like capability -- all the jumps have to point to the end of the do..while without exception.

What this means in practice is that when reviewing the code (or a change to the code) there is no question as to whether a goto is well behaved or not. There are no gotos, so you can't make an unstructured goto mistake with this approach.

While this toy example looks pretty much the same as the "else if" structure, an important point is that the "break" can be placed anywhere -- even deeply within a nested if statement -- without raising questions as to what happens when the "break" is hit. If in doubt or there is some reason why this technique won't work for you, I'd suggest falling back on restructuring the code so "else if" or this technique works if the code gets too complex to handle. The main problem to keep in mind is that if you nest do..while structures the break will un-nest only one level.

I recognize that this area falls a little bit into a matter of taste and context. My taste is for code that is easy to review and unlikely to have bugs. If that is at odds with subjective notions of elegance, so be it. In part my preference is to outlaw the routine use of techniques that require manual analysis to determine of a potentially unsafe structure is being used the "right" way. Every such judgement call is a chance to get it wrong, and a distraction of human reviewer attention away from more important things. And I dislike arguments of the form that a "good" and experienced programmer won't make a mistake. It is just too easy to miss a subtle bug, especially when you're modifying code in a hurry.

If you've run into another way to handle this problem let me know.

16 comments:

  1. Great post! What about using switch statements combined with ret=value and break inside each case? I think it avoids the trouble of being stuck in an infinite loop if someone forgets to put a break inside one of the branches in a do-while loop.

    ReplyDelete
    Replies
    1. I'm sure there are a lot of variations; thanks for this suggestion.
      In the way I suggested this points out that it is VERY important to have a while(0) or while(FALSE) to make sure you don't get stuck in a loop

      Delete
  2. This is a great post on a problem that I run into more and more. Minor comment: Heading 3a has a typo. It should probably read "Simplify to an else if chain."

    The "else if chain" solution is my favorite because it seems like the simplest construct that a less-experienced reviewer/maintainer can understand. Also, the process of simplifying down to that construct is likely to create cleaner code on its own.

    I appreciate the cleverness of the "do...while...break" solution, but it feels like it has the same negative consequences as having multiple returns. I suppose that the static analysis tools will be happy, and you will be in full-compliance with the rule against multiple returns. However, I thought that the main reason for having exactly one return is to make it harder to miss a deeply-nested return statement. The "break" has the same problem, and inadvertently breaking inside another while or for loop has gotten me into trouble before.

    What are your thoughts on using a single "validate_assumptions()" function call at the beginning? You could put all of the individual assumption checks in that function, and then if it passes, the main meat of your code would be in the single else block that follows. If the assumptions really are complex enough to cause more than a few checks, then they probably deserve to be separated out semantically.

    ReplyDelete
    Replies
    1. I also like the else if chain (#3) if it works. Fair point about the do...while...break. I'm inclined to use it in moderation, and definitely with the rule that you are not allowed to nest the while loops or you will have exactly the type of problems you describe.

      I thought about a single validate_assumptions() function but I'm not sure how much it helps, because you are basically punting the problem down into a subroutine where you'll still need to use an else if chain or something else. So yes, if it is a tricky validation it should be put into a subroutine, but you still need to deal with the multiple checks in a structure.

      Typo fixed; thanks for pointing it out.

      Delete
  3. Thanks for this post. I never saw the do/while/break structure before.
    After trying many different solutions including several that your presented here, I ended up using a nested action/rollback structure like this:

    ------------------
    int MyRoutine(...)
    {
    int ret = 0;
    ...
    ret = actionA();
    if(ret == success)
    {
    ret = actionB();
    if(ret == success)
    {
    ret = actionC();
    if(ret == success)
    {
    .. do something ...
    }

    if(ret != success)
    {
    .. rollback actionB ...
    }
    }

    if(ret != success)
    {
    .. rollback actionA ...
    }
    }

    return ret;
    }
    ------------------

    It is more verbose but what I like is being able to cancel previous actions in
    only one location in the function. This avoids repeating rollback actions at
    several places in the function. Typical examples are locking/unlocking a
    mutex or opening/closing a file.

    This is also easier to rollback actions that cannot be easilly put at the end of
    the function without using a dedicated status variable: if ActionB adds an item
    to a list and this item must be removed from the list in case of error, then it is
    easy to do in "rollback actionB". Doing this in a common cleanup code block is
    usually more difficult.

    Unless you have another way to release resources in a error handling block ?

    Romain.

    ReplyDelete
    Replies
    1. This is an interesting approach I haven't seen before. The rollback strategy is interesting. Thanks for sharing!

      Delete
  4. Nice summary, but the only option allowed by MisraC rules is 3 which might be a concern for some people.

    ReplyDelete
    Replies
    1. Excellent point. If you are developing mission-critical code then either you'd need to follow this rule or document a formal deviation for a specific pattern like this with justification (which might or might not be the way to go depending upon your situation).

      I'd say if method 3 works that is a good way to go. (And 3a says make it work -- which is what you need to do if you are following Misra C). Thanks for the comment!

      Delete
  5. In the first example, I think it should be:

    // perform default function
    return(ret);

    instead of:

    // perform default function
    return(value);

    ReplyDelete
  6. Although I like the elegance of the do { ... } while/break approach, one of the problems I've seen with such an approach is the one you mention: "The main problem to keep in mind is that if you nest do..while structures the break will un-nest only one level". If the do/while loop itself contains loops or switch statements, for a reviewer/maintainer it can be very confusing to determine how much any particular break unwinds the nesting. I think this a very important point that should be emphasized more. I've seen more than my share of bugs where the original programmer or a maintainer failed to notice that a break inside a nested loop didn't completely unwind the error catching.

    ReplyDelete
    Replies
    1. For sure there should never be a nested do {} structure within the subroutine this is used in or you are susceptible to this type of bug.

      Good point. This discussion is pushing me more toward Option #3. Elgance is nice, but bug-free is better!

      Delete
  7. First of all I want to thank you for your great blog!

    I think the "break", "continue", "goto" and (multiple) "return" keywords should be avoided altogether, because they all do the same thing: they break the regular program structure by jumping from one code position to another one. They only differ in their severity/craziness, e.g. "return" just lets you jump to the end of a function, while "goto" lets you jump anywhere.
    The only exception where "break" should be allowed IMHO is in a "switch" expression.


    In all the error cases of your examples, you do the same thing: setting the return value to zero.
    So why not just bundle the conditions and set the error indicator only once?
    This is how I would do it:


    int MyRoutine(...)
    {
    ...

    if (
    (..something fails..) ||
    (..somethingelse fails..) ||
    (...yetanotherfail..)
    )
    {
    ret = 0;
    }
    else
    {
    // do the computation
    ret = value;
    }

    return(ret);
    }


    This approach avoids the redundancy of typing "ret = 0" multiple times, it does not use any of the evil keywords and it does not have any if/else nesting. Writing "if (...) {...} else if (...) {...}" is also nesting, it is just hidden a little bit better by not writing the surrounding "{" and "}".

    Of course, the disadvantage here is that you cannot use different error indicator return values for different errors.
    But apart from that - do you see any problem with that solution?

    ReplyDelete
    Replies
    1. Thanks for suggesting an alternate pattern. I suspect this pattern will be useful sometimes but not others. When it works it definitely has the advantages you state.
      The disadvantages are: (1) sometimes you need to do some setup work for determining a condition (e.g., read an I/O device) and you would need to do all of that before the if in this structure and (2) compilers generally have a lot of freedom in reordering conditional if structures, so this won't work for situations in which ordering matters, such as checking for a pointer being non-null before deferencing it to do the next check.
      Thanks for your contribution!

      Delete
  8. At work we had the same discussion about the short circuit behavior of "||" and "&&" a few months ago.
    This is what came up back then:
    http://stackoverflow.com/questions/628526/is-short-circuiting-boolean-operators-mandated-in-c-c-and-evaluation-order

    In both C and C++, the short circuit operators guarantee left-to-right evaluation.
    Only the bitwise operators "|" and "&" do not guarantee this (in that case all checks are done, so order doesn't matter).

    So it is actually valid to write: "if ( (ptr == NULL) || (*ptr != 42) ) { /* error! */ }".

    And this method conforms with MISRA C, by the way.


    You are right that sometimes you need to do additional work between the checks. In that case I would fall back to suggestion (3).

    ReplyDelete
    Replies
    1. Thanks for pointing this out. I checked Harbison & Steele to be sure and they say it is left to right for || and && for C as well (pages 180 & 181 of 2nd ed.) There is a caveat that sometimes folks don't get this right for && and || for other than built-in data types.

      It still rubs my natural paranoia the wrong way, mostly because whether people can keep straight things like "works for && but not for &" is something I'm leery of. But I suppose at some point it becomes a matter of taste. Thanks for the contribution.
      Delete

      Delete

Please send me your comments. I read all of them, and I appreciate them. To control spam I manually approve comments before they show up. It might take a while to respond. I appreciate generic "I like this post" comments, but I don't publish non-substantive comments like that.

If you prefer, or want a personal response, you can send e-mail to comments@koopman.us.
If you want a personal response please make sure to include your e-mail reply address. Thanks!