Monday, July 24, 2017

Don't use macros for MIN and MAX



It is common to see small helper functions implemented as macros, especially in older C code. Everyone seems to do it.  But you should avoid macros, and instead use inline functions.

The motivation for using macros was originally that you needed to use a small function in many places but were worried about the overhead of doing a subroutine call. So instead, you used a macro, which expands into source code in the preprocessor phase.  That was a reasonable tradeoff 40 years ago. Not such a great idea now, because macros cause problems for no good reason.

For example, you might look on the Web and find these common macros
    #define MAX(a,b) ((a) > (b) ? a : b)
    #define MIN(a,b) ((a) < (b) ? a : b)

And you might find that it seems to work for a while.  You might get bitten by the missing "()" guards around the second copy of a and b in the above -- which version you get depends on which cut & paste code site you visit. 

But then you'll find that there are still weird situations where you get unexpected behavior. For example, what does this do?
    c = MAX(a++, b);
If a is greater than b executing the code will increment a twice, but if a is less than or equal to b it will only increment a once.  And if you start mixing types or putting complicated expressions into the macro things can get weird and buggy in a hurry.

Another related problem is that the macro will expand, increasing the cyclomatic complexity of your code. That's because a macro is equivalent to you having put the conditional branch into the source code. (Remember, macro expansion is done by the preprocessor, the so compiler itself acts as if you'd typed the conditional assignment expression every place you use the macro.) This complexity rating is justified, because there is no actual procedure that can be unit tested independently.

As it turns out, macros are evil. See the C++ FAQ: https://isocpp.org/wiki/faq/misc-technical-issues#macros-with-if  which lists 4 different types of evil behavior.  There are fancy hacks to try to get any particular macros such as MIN and MAX to be better behaved, but no matter how hard you try you're really just making a deal with the devil. 

What's the fix?

The fix is: don't use macros. Instead use inline procedure calls.

You should already have access to built-in functions for floating point such as fmin() and fmax().  If it's there, use the stuff from your compiler vendor instead of writing it yourself!

If your compiler doesn't have integer min and max, or you are worried about breaking existing macro code, convert the macros into inline functions with minimal changes to your code base:

inline int32_t MAX(int32_t a, int32_t b) { return((a) > (b) ? a : b); }
inline int32_t MIN(int32_t a, int32_t b) { return((a) < (b) ? a : b); }

If you have other types to deal with you might need different variants depending on the types, but often a piece of code uses predominantly one data type for its calculations, so in practice this is usually not a big deal. And don't forget, if your build environment has a built in min or max you can just set up the macro to call that directly.

What about performance?

The motivation for using macros back in the bad old days was efficiency. A subroutine call involved a lot of overhead. But the inline keyword tells the compiler to expand the code in-place while retaining all the advantages of a subroutine call.  Compilers are pretty good at optimization these days. So there is no overhead at run-time.  I've also seen advice to put the inline function in a header file so it will be visible to any procedure needing it, and the macro was already there anyway.

Strictly speaking, "inline" is a suggestion to the compiler. However, if you have a decent compiler it will follow the suggestion unless the inline function is so big the call overhead just doesn't matter. Some compilers have a warning flag that will let you know when the inline didn't happen.  For example, use -Winline for gcc.  If your compiler ignores "inline" for something as straightforward as MIN or MAX, get a different compiler.

What about multiple types?

A perceived advantage of the macro approach is that you can play fast and loose with types.  But playing fast and loose with types is a BAD IDEA because you'll get bugs.  

If you really hate having to match the function name to the data types then what you need is to switch to a language that can handle this by automatically picking the right function based on the operator types. In other words, switch from a to a language that is 45 years old (C) to one that is only about 35 years old (C++).  There's a paper from 1995 that explains this in the context of min and max implemented with templates:  http://www.aristeia.com/Papers/C++ReportColumns/jan95.pdf
As it turns out the rabbit hole goes a lot deeper than you might think for a generic solution.

But you don't have to go down the rabbit hole.  For most code the best answer is simply to use inline functions and pick the function name that matches your data types. You shouldn't lose any performance at all, and you'll be likely to save a lot of time chasing obscure bugs.

3 comments:

  1. Nowadays, compiler becomes smart enough to produce best machine code. I would like to add that most smart-compiler comes with an user guide that includes a chapter titled "compiler coding practices" or something like that. I found it worthy to read first before starting to code.

    ReplyDelete
  2. I find it diminishes the case for a coding convention etc when you use examples that would themselves breach generally accepted conventions.
    MISRA, for example, would preclude
    c = MAX(a++, b);
    These days I would be questioning the legitimacy of having an increment (or decrement) inside of a function, before I started down the rabbit hole of "what if I write this unclear code"?

    ReplyDelete
  3. Not all macros are evil. If you use GCC for C (and if C++ is not an option), this implementation of MAX gives you the best of both worlds: one macro works for all data types, and it only evaluates its arguments once:

    #define max(a,b) \
    ({ __typeof__ (a) _a = (a); \
    __typeof__ (b) _b = (b); \
    _a > _b ? _a : _b; })

    ReplyDelete

Please send me your comments. I read all of them, and I appreciate them. To control spam I manually approve comments before they show up. It might take a while to respond. I appreciate generic "I like this post" comments, but I don't publish non-substantive comments like that.

If you prefer, or want a personal response, you can send e-mail to comments@koopman.us.
If you want a personal response please make sure to include your e-mail reply address. Thanks!

Static Analysis Ranked Defect List

  Crazy idea of the day: Static Analysis Ranked Defect List. Here is a software analysis tool feature request/product idea: So many times we...