Force Inline in C++
Non-inlined function calls can be expensive. The compiler cannot optimize the bodies of the caller and the callee together, so it misses certain optimizations. This is usually not an issue, as the compiler mostly does a pretty good job at inlining, but if you are calling a function in a big loop you might want to ensure that the compiler always inlines it. `inline`, `always_inline` and `__forceinline` are just hints. They don't always inline [1] [2].
Examples of libraries that do this: mesa, fastfloat.
Trust the compiler some say. Profile your code say the others. Use macros say the old and wise.
But what if you are developing a library and need to ensure that your method gets inlined? You can't say trust the compiler, because the users want to trust your library. You can't profile the code, because the application isn't your code.
We want a `FORCE_INLINE` keyword that just works, at least on the three major compilers: `g++`, `clang` and `msvc`. If it can't inline, it should error in a meaningful way.
The example:
int decrement(int x) { return x - 1; }
int factorial(int x) { return (x == 0) ? 1 : x * factorial(x - 1); }
Putting our `FORCE_INLINE` keyword in front of `decrement` should inline it.
It should produce either a compile-time error or a link-time error if we put it in front of `factorial` (as the recursion depth / input argument is not known at compile time).
#if defined(__clang__)
#define FORCE_INLINE [[gnu::always_inline]] [[gnu::gnu_inline]] extern inline
#elif defined(__GNUC__)
#define FORCE_INLINE [[gnu::always_inline]] inline
#elif defined(_MSC_VER)
#pragma warning(error: 4714)
#define FORCE_INLINE __forceinline
#else
#error Unsupported compiler
#endif
Note that the Clang linker-error part is a bit shady, and most people would not want to adopt it unless working in somewhat close collaboration. You can remove the `[[gnu::gnu_inline]] extern` part.
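As a rough sketch of what that looks like in use, the block below defines the simplified variant (without the `[[gnu::gnu_inline]] extern` part, so Clang shares the GCC branch and loses its link-time check) and applies it to `decrement`. The `sum_decremented` function is a hypothetical hot loop added only for illustration; it is not part of the original example.

```cpp
// Simplified FORCE_INLINE: the variant without the Clang linker trick.
// Clang normally defines __GNUC__ as well, so it takes the first branch.
#if defined(__GNUC__)
#define FORCE_INLINE [[gnu::always_inline]] inline
#elif defined(_MSC_VER)
#pragma warning(error: 4714)  // promote "not inlined" warning to an error
#define FORCE_INLINE __forceinline
#else
#error Unsupported compiler
#endif

// The working case from the example above.
FORCE_INLINE int decrement(int x) { return x - 1; }

// Hypothetical hot loop: every iteration calls decrement, which is the
// situation where a guaranteed inline is most useful.
int sum_decremented(const int* values, int count) {
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += decrement(values[i]);
    }
    return total;
}
```

Putting the same macro in front of the recursive `factorial` is the error case listed in the table below.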
Now let's check it in action:
Compiler | Working case | Error case | Error type |
---|---|---|---|
Clang | clang_working | clang_error | Linker error |
GCC | gcc_working | gcc_error | Compile error |
MSVC | msvc_working | msvc_error | Compile error (requires optimization flag) |
How it works:
- GCC: generates an error if it can't `always_inline` [3]
- Clang: a function that could not be inlined surfaces as a linker error, through the `[[gnu::gnu_inline]] extern` part of the macro [4]
- MSVC:
  - Generates a warning for non-inlinable `__forceinline` functions
  - But only if compiled with any "inline expansion" optimization (`/Ob<n>`) [2:1]
  - This is present with `/O1` or `/O2`
  - We promote this warning to an error
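If a function like `factorial` really has to be force-inlined, one option is to rewrite it without recursion so that there is no self-call left to block inlining. The following is only a sketch under that assumption: `factorial_iterative` is a hypothetical name, and the macro is the simplified variant without the Clang linker trick.

```cpp
// Simplified FORCE_INLINE (assumption: variant without gnu_inline/extern).
#ifndef FORCE_INLINE
#if defined(__GNUC__)  // Clang normally defines this too
#define FORCE_INLINE [[gnu::always_inline]] inline
#elif defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#else
#error Unsupported compiler
#endif
#endif

// Hypothetical loop-based replacement for the recursive factorial.
// The body contains no call to itself, so it can be expanded at the call site.
FORCE_INLINE int factorial_iterative(int x) {
    int result = 1;
    for (int i = 2; i <= x; ++i) {
        result *= i;
    }
    return result;
}
```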
Note: do not use this with virtual functions. You can't "force inline" them as they need to be pointed to at runtime.
GCC summarizes this as "An Inline Function is As Fast As a Macro" [5:1]. Zig provides something similar as its `callconv(.Inline)`.
Thus we can and should build syntactic sugar as functions instead of weird macros. Without any worries of performance.
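To make the "functions instead of macros" point concrete, here is a small sketch comparing the two. All names in it (`SQUARE_MACRO`, `square`, `next_value` and the two counting helpers) are made up for illustration, and the macro is again the simplified variant. The macro pastes its argument expression twice, so a side effect in the argument runs twice; the function evaluates it once.

```cpp
// Simplified FORCE_INLINE (assumption: variant without gnu_inline/extern).
#ifndef FORCE_INLINE
#if defined(__GNUC__)  // Clang normally defines this too
#define FORCE_INLINE [[gnu::always_inline]] inline
#elif defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#else
#error Unsupported compiler
#endif
#endif

// Macro version: the argument text appears twice in the expansion.
#define SQUARE_MACRO(x) ((x) * (x))

// Function version: the argument is evaluated exactly once.
FORCE_INLINE int square(int x) { return x * x; }

// Hypothetical helper with a visible side effect, used to count evaluations.
inline int next_value(int& counter) { return ++counter; }

// Returns how many times the argument expression ran under the macro.
int evaluations_with_macro() {
    int counter = 0;
    (void)SQUARE_MACRO(next_value(counter));
    return counter;
}

// Returns how many times the argument expression ran under the function.
int evaluations_with_function() {
    int counter = 0;
    (void)square(next_value(counter));
    return counter;
}
```

The function keeps ordinary call semantics and type checking, while the forced inlining removes the call overhead that usually motivates the macro.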
[1] https://clang.llvm.org/docs/AttributeReference.html#always-inline-force-inline
[2] https://docs.microsoft.com/en-us/cpp/cpp/inline-functions-cpp
[3] https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#always_inline
[4] https://clang.llvm.org/docs/AttributeReference.html#gnu-inline