Struct variable passed by value vs. passed by pointer to a function

Question

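On the performance difference and the pros and cons: passing by value and passing by pointer involve trade-offs. If the struct is relatively small and the function is not called very often, the difference is unlikely to be noticeable. In a compute-intensive application such as a ray tracer, however, it can matter.

The main advantage of passing by value (tuple_sub_values) is simplicity: there are no pointers to worry about. The cost is that the whole struct is copied, which can add memory traffic and hurt performance, especially for large structs.

The main advantage of passing by pointer (tuple_sub_pointers) is that it avoids copying the struct, which saves memory and some time. In exchange, the code has to deal with pointers, which adds some complexity, and an invalid pointer leads to errors.

In this specific case, since the functions are called very often and the structs may be relatively large, passing by pointer may be more efficient. To settle which approach is actually better, benchmark both and decide based on the results.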
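A benchmark along those lines could start from the rough sketch below. It is only an illustration: `bench` and `report` are hypothetical helpers, not part of the original question, and `clock()` gives coarse CPU-time readings. Because both functions are `static` here, an optimizing compiler will probably inline them and emit near-identical code for both paths; to measure real call overhead, the functions would have to live in a separate translation unit (or be marked non-inlinable with a compiler-specific attribute).

```c
#include <stdio.h>
#include <time.h>

typedef struct s_tuple{
	double	x;
	double	y;
	double	z;
	double	w;
}	t_tuple;

static t_tuple	tuple_sub_values(t_tuple a, t_tuple b)
{
	a.x -= b.x;
	a.y -= b.y;
	a.z -= b.z;
	a.w -= b.w;
	return (a);
}

static t_tuple	tuple_sub_pointers(const t_tuple *a, const t_tuple *b)
{
	t_tuple	c;

	c.x = a->x - b->x;
	c.y = a->y - b->y;
	c.z = a->z - b->z;
	c.w = a->w - b->w;
	return (c);
}

/* Hypothetical helper: performs `iters` subtractions with the chosen variant
 * and returns the elapsed CPU time in seconds. The final value is stored in
 * *sink so the compiler cannot discard the loop as dead code. */
double	bench(int use_pointers, long iters, t_tuple *sink)
{
	t_tuple	acc = {1e6, 2e6, 3e6, 4e6};
	t_tuple	step = {1.0, 2.0, 3.0, 4.0};
	clock_t	start;
	long	i;

	start = clock();
	for (i = 0; i < iters; i++)
	{
		if (use_pointers)
			acc = tuple_sub_pointers(&acc, &step);
		else
			acc = tuple_sub_values(acc, step);
	}
	*sink = acc;
	return ((double)(clock() - start) / CLOCKS_PER_SEC);
}

/* Hypothetical helper: prints the timings of both variants. */
void	report(long iters)
{
	t_tuple	sink;

	printf("by value:   %.3f s\n", bench(0, iters, &sink));
	printf("by pointer: %.3f s\n", bench(1, iters, &sink));
}
```

Calling `report(100000000L)` from `main` at several optimization levels (for example `-O0` and `-O2`) gives a first impression; inspecting the generated assembly is usually more telling than the timings alone.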

Let's say I have the following structure:

typedef struct s_tuple{
	double	x;
	double	y;
	double	z;
	double	w;
}	t_tuple;

Let's say I have the two following functions:

t_tuple	tuple_sub_values(t_tuple a, t_tuple b)
{
	a.x -= b.x;
	a.y -= b.y;
	a.z -= b.z;
	a.w -= b.w;
	return (a);
}

t_tuple	tuple_sub_pointers(t_tuple *a, t_tuple *b)
{
    t_tuple c;

	c.x = a->x - b->x;
	c.y = a->y - b->y;
	c.z = a->z - b->z;
	c.w = a->w - b->w;
	return (c);
}

Will there be a performance difference between the two functions? Is one of them better than the other?
Basically, what are the pros and cons of passing by value vs. passing by pointer when all of the structure's members are accessed?

Edit: Completely changed my structure and functions to give a more precise example
I found this post that is related to my question but is for C++: https://stackoverflow.com/questions/40185665/performance-cost-of-passing-by-value-vs-by-reference-or-by-pointer#:~:text=In%20short%3A%20It%20is%20almost,reference%20parameters%20than%20value%20parameters.

Context: My structures are not huge in this example, but I am coding a ray tracer, and some structs of around 100 bytes can be passed millions of times, so I'd like to optimize these calls. My structs are nested, so it would be a mess to copy them here; this is why I asked my question using a more general example.

Answer 1

Score: 1

Getting to the core of the question: for optimal argument-passing and value-returning performance, you basically want to follow your platform's ABI and try to make sure that things are in registers and stay in registers. If they aren't in registers, or cannot stay there, then passing larger-than-pointer-size data by pointer will likely save some copying (unless the copy would have to be made in the callee anyway: void pass_copy(struct large x){ use(&x); } could actually be slightly better for codegen than void pass_copy2(struct large const*x){ struct large cpy=*x; use(&cpy); }).
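The two variants from the parenthetical can be written out as the following sketch. `struct large` and `use` are placeholders the answer leaves undefined, so the 128-byte layout and the body of `use` are assumptions made only so that the example compiles; the functions also return a `double` here (the originals return `void`) to make the result observable.

```c
#include <stddef.h>

/* Hypothetical 128-byte struct standing in for "struct large". */
struct large {
	double	v[16];
};

/* Hypothetical consumer that modifies whatever it is handed. */
static double	use(struct large *p)
{
	double	sum = 0.0;
	size_t	i;

	for (i = 0; i < sizeof p->v / sizeof p->v[0]; i++)
	{
		p->v[i] *= 2.0;
		sum += p->v[i];
	}
	return (sum);
}

/* Variant 1: the by-value parameter already is the callee's private copy. */
double	pass_copy(struct large x)
{
	return (use(&x));
}

/* Variant 2: a pointer comes in, then the callee makes its own copy. */
double	pass_copy2(struct large const *x)
{
	struct large	cpy = *x;

	return (use(&cpy));
}
```

Both variants leave the caller's struct untouched; the difference is only where the copy is made, which is why the by-value form may give the compiler slightly less work when a private copy is needed anyway.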

The concrete rules for, e.g., the x86-64 System V ABI are a bit complicated (see the chapter on calling conventions).
But a short version might be: arguments and return values go through registers as long as their type is "simple enough" and appropriate argument-passing registers are available (6 for integer values and 8 for doubles). Structs of up to two eightbytes (16 bytes) can go through registers (as arguments or as a return value) provided they're "simple enough".
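To make the two-eightbyte limit concrete, here is a small sketch. `t_twodoubles` and `fits_in_two_eightbytes` are hypothetical names, and the size check is a deliberate simplification: the real classification in the ABI also looks at member types and alignment, so treat it as a rule of thumb and not as the ABI's algorithm.

```c
#include <stddef.h>

/* Hypothetical 16-byte struct: two eightbytes. */
typedef struct s_twodoubles{
	double	a;
	double	b;
}	t_twodoubles;

/* The 32-byte struct from the question: four eightbytes. */
typedef struct s_tuple{
	double	x;
	double	y;
	double	z;
	double	w;
}	t_tuple;

/* Simplified rule of thumb (assumption): only the size is checked here,
 * whereas the ABI also classifies each eightbyte by the types it holds. */
int	fits_in_two_eightbytes(size_t size)
{
	return (size <= 2 * 8);
}

/* With 16 bytes of doubles, both arguments and the return value are
 * candidates for SSE registers on x86-64 System V. */
t_twodoubles	twodoubles_sub(t_twodoubles a, t_twodoubles b)
{
	a.a -= b.a;
	a.b -= b.b;
	return (a);
}
```

Under this rule of thumb `t_twodoubles` qualifies for register passing while `t_tuple` does not, which matches the short version of the rules given above.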

Supposing your doubles are already loaded in registers (or aren't aggregated into t_tuples that you could point the callee to), the most efficient way to pass them on the x86-64 SysV ABI would be individually or via structs of two doubles each, but you'd still need to return them via memory, because the ABI can only accommodate two-double return values in registers, not four-double ones. If you returned a four-double struct, the compiler would allocate stack memory in the caller, pass a pointer to it as a hidden first argument, and then return a pointer to that memory (under the covers). A more flexible approach would be to not return such a large aggregate but instead explicitly pass a pointer to a struct to be filled. That way the struct can be anywhere you want it (rather than automatically allocated on the stack by the compiler).

So something like

void tuple_sub_values(t_tuple *retval, 
      t_twodoubles a0, t_twodoubles a1, 
      t_twodoubles b0, t_twodoubles b1);

would be a better API for avoiding memory spills on the x86-64 SysV ABI (Linux, macOS, BSDs...).

If your measurements show the codesize savings / performance boost to be worth it for you, you could wrap it in an inline function that'd do the struct-splitting.
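A wrapper of that kind might look like the sketch below. The answer does not define `t_twodoubles`, so its layout is an assumption, and the wrapper name `tuple_sub` is hypothetical; the split function keeps the prototype shown above.

```c
typedef struct s_tuple{
	double	x;
	double	y;
	double	z;
	double	w;
}	t_tuple;

/* Assumed layout: the answer uses t_twodoubles without defining it. */
typedef struct s_twodoubles{
	double	lo;
	double	hi;
}	t_twodoubles;

/* Out-parameter API from the answer: four 16-byte structs by value
 * (8 doubles in total), result written through retval. */
void	tuple_sub_values(t_tuple *retval,
			t_twodoubles a0, t_twodoubles a1,
			t_twodoubles b0, t_twodoubles b1)
{
	retval->x = a0.lo - b0.lo;
	retval->y = a0.hi - b0.hi;
	retval->z = a1.lo - b1.lo;
	retval->w = a1.hi - b1.hi;
}

/* Hypothetical inline wrapper that does the struct-splitting, so call
 * sites can keep working with whole t_tuple values. */
static inline void	tuple_sub(t_tuple *retval,
			const t_tuple *a, const t_tuple *b)
{
	t_twodoubles	a0 = {a->x, a->y};
	t_twodoubles	a1 = {a->z, a->w};
	t_twodoubles	b0 = {b->x, b->y};
	t_twodoubles	b1 = {b->z, b->w};

	tuple_sub_values(retval, a0, a1, b0, b1);
}
```

A call then reads `tuple_sub(&result, &a, &b);`. Whether the split actually pays off depends on where the doubles live at the call site, so it is worth confirming with measurements on the target platform.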

Answer 2

Score: 0

When it comes to performance, the answer is most likely implementation-specific, for reasons that go well beyond the scope of this post, but we're most likely talking about microseconds in the worst case. Now, for the pros and cons:

  • Passing by value will only give you a copy of that struct, and modifications will be local only. In other words, your function will receive an entirely new copy of the struct, and it will only be able to modify that copy.

  • In contrast, passing by pointer (C has no references, so "by reference" here means passing a pointer) lets the function modify the caller's struct directly, and is often used when a function needs to return multiple values.

It's entirely up to you to choose which one works for your case. But to add some extra help:

  • Passing by pointer can reduce function-call overhead because the 32 bytes of the struct don't have to be copied for the callee. It also helps keep the memory footprint low if you call the function many times: instead of creating a separate struct for each call, every call reuses the same one. This is mainly seen in games, where structs may be thousands of bytes large.
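The difference in semantics described in the bullets above can be seen in a short sketch. `scale_copy` and `scale_in_place` are hypothetical functions introduced only for illustration.

```c
typedef struct s_tuple{
	double	x;
	double	y;
	double	z;
	double	w;
}	t_tuple;

/* By value: t is a private copy, so the caller's struct is not changed.
 * The scaled x is returned only so the effect can be observed. */
double	scale_copy(t_tuple t, double k)
{
	t.x *= k;
	return (t.x);
}

/* By pointer: the caller's struct is modified directly, and the same
 * struct can be reused across many calls without further copies. */
void	scale_in_place(t_tuple *t, double k)
{
	t->x *= k;
	t->y *= k;
	t->z *= k;
	t->w *= k;
}
```

After `scale_copy(t, 10.0)` the caller's `t` is unchanged, whereas `scale_in_place(&t, 10.0)` updates all four members of the caller's struct.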

huangapple
  • Published on 2023-02-06 05:41:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75355716.html