Custom cache alignment in Rust

Question

How can I optimize the performance of my RowMatrix struct in Rust for a large number of rows?

I have a matrix stored in row-major form, defined with the following struct in Rust:


pub struct RowMatrix {
    data: Vec<[usize; 8]>,
    width: usize,
}

Each row is split into arrays of 8 elements, and these arrays are stored one after another in the data vector. For example, if the width is 64, the first 8 elements of the vector represent the first row, the next 8 elements represent the second row, and so on.

I need to perform operations on individual arrays that belong to two separate rows of this matrix and sit at the same position within their rows. For example, to operate on the 2nd array segment of the 1st and 10th rows, I would pick the 2nd and 74th elements of the data vector respectively. The two arrays always come from the same segment position.
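The index arithmetic in this example can be sketched as follows. This is only an illustration, not code from the question: `SEGMENT_LEN`, `segments_per_row`, and `segment_index` are hypothetical names, and rows and segments are zero-based here, so the 1-based "2nd and 74th elements" become indices 1 and 73.

```rust
/// Number of `usize` values in one array segment (taken from the question).
const SEGMENT_LEN: usize = 8;

/// Number of segments needed to store one row of `width` elements.
/// Assumes `width` is a multiple of `SEGMENT_LEN`, as in the width-64 example.
fn segments_per_row(width: usize) -> usize {
    width / SEGMENT_LEN
}

/// Zero-based index into `data` of segment `seg` within row `row`.
fn segment_index(width: usize, row: usize, seg: usize) -> usize {
    row * segments_per_row(width) + seg
}
```

With a width of 64, `segment_index(64, 0, 1)` is 1 and `segment_index(64, 9, 1)` is 73. On a 64-bit target each `[usize; 8]` occupies 64 bytes, so the same segment in two consecutive rows is 512 bytes apart, and the distance grows with the gap between the two rows.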

This operation is performed many times with different row pairs. When the matrix has few rows, I don't see any performance issues. However, when the number of rows is large, performance degrades significantly, which I attribute to frequent cache misses.

Is there a way to align my struct to the cache line to reduce cache misses without changing the struct definition? I want fine-grained control over how elements are laid out in memory, for example keeping elements that are 8 elements apart together in cache (if the width of the matrix is 64).

I tried the repr(align(x)) attribute to specify the alignment of the struct, but I don't think it helps: as far as I can tell, the array elements are still stored sequentially, so in a big matrix the corresponding elements might not be in the cache.
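One way to check what the attribute actually controls is to compare alignments directly. The sketch below is an illustration only: `AlignedRowMatrix`, `header_alignment`, and `element_alignment` are hypothetical names, and 64 is an assumed cache-line size.

```rust
use std::mem::align_of;

/// The struct from the question with `repr(align(64))` applied to it.
#[repr(align(64))]
pub struct AlignedRowMatrix {
    data: Vec<[usize; 8]>,
    width: usize,
}

/// Alignment of the struct value itself (the Vec's pointer, length,
/// and capacity, plus `width`).
fn header_alignment() -> usize {
    align_of::<AlignedRowMatrix>()
}

/// Alignment of the element type stored in the heap buffer,
/// which the attribute on the outer struct does not change.
fn element_alignment() -> usize {
    align_of::<[usize; 8]>()
}
```

Here `header_alignment()` returns 64, while `element_alignment()` is still the alignment of a plain `usize`. The attribute changes where the struct value is placed, not how the heap buffer behind the `Vec` is laid out.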

Answer 1

Score: 1

#[repr(align)] can only affect the items stored in the struct itself (the Vec's pointer, length, and capacity, plus your width). Since a Vec is little more than a pointer to the data, the layout behind it is dictated entirely by its implementation, and there is no way for you to affect it directly. So it is not possible to change the layout "without changing the struct definition". You can, however, create a custom Vec-like type or manage the memory yourself directly in the RowMatrix.
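As a rough sketch of what "managing the layout yourself" could look like, the following changes the definition in two ways: each segment is wrapped in a 64-byte-aligned type, and segments are stored in segment-major order so that the same segment of different rows is contiguous. All names here (`Segment`, `SegmentMajorMatrix`, `segment`, `segment_mut`) are hypothetical, 64 bytes is an assumed cache-line size, and a 64-bit target is assumed so that `[usize; 8]` fills exactly one such line. Whether this reduces cache misses depends on the actual access pattern and should be measured.

```rust
/// One array segment, aligned to 64 bytes (an assumed cache-line size).
#[repr(align(64))]
#[derive(Clone, Copy, Default)]
pub struct Segment(pub [usize; 8]);

/// Hypothetical variant of `RowMatrix` that stores segments in
/// segment-major order: segment `s` of every row is contiguous.
pub struct SegmentMajorMatrix {
    data: Vec<Segment>,
    rows: usize,
    segments_per_row: usize,
}

impl SegmentMajorMatrix {
    /// Creates a zero-filled matrix; `width` must be a multiple of 8.
    pub fn new(rows: usize, width: usize) -> Self {
        assert!(width % 8 == 0, "width must be a multiple of 8");
        let segments_per_row = width / 8;
        Self {
            data: vec![Segment::default(); rows * segments_per_row],
            rows,
            segments_per_row,
        }
    }

    /// Zero-based position of segment `seg` of row `row` in `data`.
    fn index(&self, row: usize, seg: usize) -> usize {
        assert!(row < self.rows && seg < self.segments_per_row);
        seg * self.rows + row
    }

    /// Shared access to one segment.
    pub fn segment(&self, row: usize, seg: usize) -> &Segment {
        &self.data[self.index(row, seg)]
    }

    /// Mutable access to one segment.
    pub fn segment_mut(&mut self, row: usize, seg: usize) -> &mut Segment {
        let i = self.index(row, seg);
        &mut self.data[i]
    }
}
```

Because `Vec<T>` allocates its buffer with the alignment of `T`, every `Segment` starts on a 64-byte boundary. With 16 rows, segment 1 of rows 0 and 9 sits at positions 16 and 25 in `data`, rather than 72 elements apart as in the row-major layout.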

huangapple
  • Posted on February 6, 2023, 19:03:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75360484.html