Performance cost of passing by value vs. by reference or by pointer?
Let's consider an object `foo` (which may be an `int`, a `double`, a custom `struct`, a `class`, whatever). My understanding is that passing `foo` by reference to a function (or just passing a pointer to `foo`) leads to higher performance, since we avoid making a local copy (which could be expensive if `foo` is large).
However, from the answer here it seems that pointers on a 64-bit system can be expected in practice to have a size of 8 bytes, regardless of what is being pointed to. On my system, a `float` is 4 bytes. Does that mean that if `foo` is of type `float`, it is more efficient to just pass `foo` by value rather than give a pointer to it (assuming no other constraints that would make one more efficient than the other inside the function)?
Recommended answer
It depends on what you mean by "cost", and on the properties of the host system (hardware, operating system) with respect to the operations involved.
If your cost measure is memory usage, then the calculation is obvious: add up the sizes of whatever is being copied.
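As a rough illustration of "adding up the sizes", the sketch below compares the bytes copied for a by-value argument with the bytes copied for a pointer argument. The `Sample` type and both helper names are hypothetical, introduced only for this example, and the numbers they yield depend on the platform.

```cpp
#include <cstddef>

// Hypothetical aggregate used only for illustration (not from the question).
struct Sample {
    double values[16];
};

// Bytes copied when a T is passed by value.
template <typename T>
constexpr std::size_t bytes_by_value() {
    return sizeof(T);
}

// Bytes copied when only the address of a T is passed.
template <typename T>
constexpr std::size_t bytes_by_pointer() {
    return sizeof(T*);
}
```

On a system like the one described in the question, `bytes_by_value<float>()` would be 4 and `bytes_by_pointer<float>()` would be 8, while for `Sample` the relationship is reversed. This counts memory only and says nothing about speed.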
If your measure is execution speed (or "efficiency"), then the game is different. Hardware (and operating systems and compilers) tends to be optimised for copying things of particular sizes, by virtue of dedicated circuits (machine registers, and how they are used).
It is common, for example, for a machine to have an architecture (machine registers, memory architecture, etc.) that results in a "sweet spot": copying variables of some size is most "efficient", but copying larger OR SMALLER variables is less so. Larger variables cost more to copy, because the copy may have to be done as several copies of smaller chunks. Smaller ones may also cost more, because the compiler needs to copy the smaller value into a larger variable (or register), do the operations on it, then copy the value back.
Examples with floating point include some Cray supercomputers, which natively support double-precision floating point (aka `double` in C++), with all operations on single precision (aka `float` in C++) emulated in software. Some older 32-bit x86 CPUs also worked internally with 32-bit integers, and operations on 16-bit integers required more clock cycles because of the translation to/from 32-bit (this is not true of more modern 32-bit or 64-bit x86 processors, which allow copying 16-bit integers to/from 32-bit registers, and operating on them, with fewer such penalties).
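The "widen, operate, narrow back" pattern also has a language-level counterpart in C++: arithmetic on `short` operands is carried out in `int`. The sketch below shows only that language rule. It is not a claim about any particular CPU, and whether the conversions cost extra instructions depends on the compiler and target. The function name `add_shorts` is made up for this example.

```cpp
#include <type_traits>

// The sum of two shorts has type int: both operands are promoted first.
static_assert(std::is_same<decltype(short{} + short{}), int>::value,
              "short + short is evaluated as int");

// Adds two shorts; the int result is explicitly narrowed back to short.
constexpr short add_shorts(short a, short b) {
    return static_cast<short>(a + b);
}
```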
It is a bit of a no-brainer that copying a very large structure by value will be less efficient than creating and copying its address. But, because of factors like those above, the cross-over point between "best to copy something of that size by value" and "best to pass its address" is less clear.
Pointers and references tend to be implemented in a similar manner (e.g. pass by reference can be implemented in the same way as passing a pointer), but that is not guaranteed.
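For comparison, here are the two forms side by side. Both function names are hypothetical. A compiler may well generate the same code for both, passing an address in each case, but as noted that is an implementation choice rather than a guarantee.

```cpp
// Reads a float through a pointer; the caller passes an address explicitly.
// The pointer must not be null.
constexpr float read_via_pointer(const float* p) {
    return *p;
}

// Reads a float through a reference; the call site looks like pass-by-value.
constexpr float read_via_reference(const float& r) {
    return r;
}
```

A caller would write `read_via_pointer(&x)` in the first case and `read_via_reference(x)` in the second.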
The only way to be sure is to measure it, and to realise that the measurements will vary between systems.
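A minimal sketch of such a measurement, assuming `std::chrono::steady_clock` is an adequate timer for the purpose. The `Payload` type, its size, and all function names are hypothetical. This is a starting point under those assumptions, not a rigorous benchmark.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical payload type; adjust the array length to probe other sizes.
struct Payload {
    double data[64];
};

// Receives its own copy of the whole Payload.
constexpr double ends_by_value(Payload p) {
    return p.data[0] + p.data[63];
}

// Receives only a reference to the caller's Payload.
constexpr double ends_by_reference(const Payload& p) {
    return p.data[0] + p.data[63];
}

// Calls f(arg) `iterations` times and returns the elapsed nanoseconds.
// Each result is accumulated into `sink` so that it is actually used.
template <typename F>
long long time_calls(F f, const Payload& arg, std::size_t iterations, double& sink) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        sink += f(arg);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

One would call `time_calls(ends_by_value, payload, n, sink)` and `time_calls(ends_by_reference, payload, n, sink)` and compare the two results. An optimising compiler may inline both functions and produce identical code, so the comparison is only meaningful at the optimisation level and on the hardware you actually care about.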