C/C++ "inline" keyword in CUDA device-side code


I am a total "newbie" when it comes to CUDA, so if this question is trivial, pardon me.

Does nvcc understand the meaning of the inline C keyword?
I know of __forceinline__ and similar nvcc "macros", therefore I am not asking how to write inline CUDA device-side code.
I also know that the code is "split" between nvcc and the C/C++ compiler (I am using the Visual Studio IDE).
Does this mean that the inline keyword is ignored by nvcc when it "stands next to" __device__ functions or __global__ kernels?

Edit:
P.S. I searched the CUDA programming guide. I did not find anything useful under the inline entry, and similar "tags" did not help either.

CUDA is a programming language in the C++ family. Therefore, the CUDA documentation does not duplicate the standard C++ documentation; it merely points out differences and extensions. If you can't find a description of the use of the inline specifier for functions in the CUDA documentation, that is an indication that it is processed in standard C++ fashion.

Interpolating between the various parts of your question, it seems you are concerned about how the use of inline affects the actual inlining of functions in the generated code.

The ISO C++11 standard specifies the inline function specifier in section 7.1.2. Besides provisions for linkage and duplicate definitions, it states the following about the actual inlining of functions declared with the inline specifier:

"The inline specifier indicates to the implementation that inline substitution of the function body at the point of call is to be preferred to the usual function call mechanism. An implementation is not required to perform this inline substitution at the point of call;"

So inline is merely a suggestion to the compiler, which it is free to ignore. Since the CUDA compiler inlines functions aggressively in device code by default (for performance reasons), use of inline seems quite redundant for device code, but programmers are free to use it.
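As a minimal sketch of the point above, the following shows inline placed next to the CUDA function qualifiers. The names HOST_DEVICE, square, square_all, and check_square are hypothetical and chosen only for illustration; the __CUDACC__ guard is an assumption about the build setup, added so that the same file also compiles with a plain C++ compiler.

```cpp
// Sketch only: HOST_DEVICE, square(), square_all() and check_square() are
// hypothetical names used for illustration, not part of the CUDA API.
#include <cassert>

// nvcc defines __CUDACC__; a plain C++ compiler sees an empty macro instead,
// so the helper below builds either way.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// 'inline' keeps its standard C++ meaning here: a hint for inline
// substitution, plus permission for the definition to appear in several
// translation units (for example, in a shared header).
HOST_DEVICE inline float square(float x) {
    return x * x;
}

#ifdef __CUDACC__
// A kernel that calls the helper. Whether the call is actually inlined is
// up to the compiler's heuristics, with or without the 'inline' keyword.
__global__ void square_all(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = square(data[i]);
    }
}
#endif

// Host-side sanity check that works with or without nvcc.
inline void check_square() {
    assert(square(3.0f) == 9.0f);
}
```

In this sketch the keyword mostly matters for its linkage rules (a definition that can live in a header), not for performance.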

The inlining heuristics used by the CUDA compiler may prevent the inlining of a particular function that the programmer would like to have inlined under all circumstances. For this purpose, CUDA provides the non-standard __forceinline__ function attribute. This specifier affects both device code and host code, because nvcc translates it to the equivalent host-compiler-specific attribute for host code, such as __forceinline for MSVC. This can be verified by dumping and inspecting the intermediate C++ files that nvcc sends to the host compiler.
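A hedged sketch of how __forceinline__ might be used in code that should also build without nvcc. The FORCE_INLINE and HOST_DEVICE macros and the clamp01 and check_clamp01 functions are hypothetical; the non-CUDA branches are only an approximation of the mapping nvcc performs for host code, not a copy of it.

```cpp
// Sketch only: FORCE_INLINE, HOST_DEVICE, clamp01() and check_clamp01() are
// hypothetical names. The fallback branches are an assumption about how one
// might mirror nvcc's host-side translation by hand.
#include <cassert>

#if defined(__CUDACC__)
#define HOST_DEVICE __host__ __device__
#define FORCE_INLINE __forceinline__                        // CUDA attribute
#elif defined(_MSC_VER)
#define HOST_DEVICE
#define FORCE_INLINE __forceinline                          // MSVC keyword
#elif defined(__GNUC__)
#define HOST_DEVICE
#define FORCE_INLINE inline __attribute__((always_inline))  // GCC/Clang
#else
#define HOST_DEVICE
#define FORCE_INLINE inline                                 // plain hint only
#endif

// Clamp a value to the range [0, 1]; small enough that forced inlining
// is plausible on every branch of the macro above.
HOST_DEVICE FORCE_INLINE float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

// Host-side sanity check that works with or without nvcc.
inline void check_clamp01() {
    assert(clamp01(-1.0f) == 0.0f);
    assert(clamp01(2.0f) == 1.0f);
    assert(clamp01(0.5f) == 0.5f);
}
```

To inspect what nvcc actually hands to the host compiler, one option is to compile with nvcc's --keep flag, which retains the intermediate files from the internal compilation steps.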

