Wrapper for `__m256` produces a segmentation fault with its constructor - Windows 64 + MinGW + AVX issue
I have a union like this
union bareVec8f {
    __m256 m256; // AVX vector of 8 floats
    float floats[8];
    int ints[8];
    inline bareVec8f() {
    }
    inline bareVec8f(__m256 vec) {
        this->m256 = vec;
    }
    inline bareVec8f &operator=(__m256 m256) {
        this->m256 = m256;
        return *this;
    }
    inline operator __m256 &() {
        return m256;
    }
};
The __m256 needs to be aligned on a 32-byte boundary to be used with the AVX intrinsics, and it should be aligned automatically, even within the union.
When I do this
bareVec8f test = _mm256_set1_ps(1.0f);
I get a segmentation fault. This code should work because of the constructor I wrote. However, when I do this
bareVec8f test;
test.m256 = _mm256_set1_ps(8.f);
I don't get a segmentation fault.
Since that works fine, the union is probably aligned properly, and the segmentation fault seems to be caused by the constructor.
I'm using the 64-bit gcc compiler for Windows.
EDIT: Matt managed to produce the simplest example of the error that seems to be happening here.
#include <immintrin.h>
void foo(__m256 x) {}
int main()
{
    __m256 r = _mm256_set1_ps(0.0f);
    foo(r);
}
I'm compiling with -std=c++11 -mavx
Recommended answer
This is a bug in g++ for Windows. It does not perform 32-byte stack alignment when it should. See GCC bug 49001 and bug 54412.
In a related SO thread, someone wrote a Python script that post-processes the assembly output of g++ to fix the problem, so that would be one option.
Otherwise, to avoid this in your union, you could change the functions that take __m256 by value so that they take it by reference instead. This shouldn't carry any performance penalty unless optimization is low or off.
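As a rough sketch of that suggestion, the union from the question could take its __m256 arguments by const reference. The name bareVec8fRef is hypothetical, and the #pragma line is a GCC-specific assumption that stands in for passing -mavx on the command line. Whether this sidesteps the crash on a given MinGW build has to be checked there.

```cpp
// GCC-specific assumption: enables AVX for this file, like passing -mavx.
#pragma GCC target("avx")
#include <immintrin.h>

// Hypothetical variant of the question's union: same members, but every
// __m256 parameter is a const reference instead of a by-value argument.
union bareVec8fRef {
    __m256 m256;      // AVX vector of 8 floats
    float floats[8];
    int ints[8];

    bareVec8fRef() {}
    bareVec8fRef(const __m256 &vec) : m256(vec) {}
    bareVec8fRef &operator=(const __m256 &vec) {
        m256 = vec;
        return *this;
    }
    operator __m256 &() { return m256; }
};

// Matt's reduced test case with the parameter changed to a const reference.
inline void foo(const __m256 &x) { (void)x; }
```

With this variant, `bareVec8fRef test = _mm256_set1_ps(1.0f);` and `foo(test);` no longer pass a __m256 by value at the source level.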
In case you are unaware: union aliasing causes undefined behaviour in C++. It is not permitted to write m256 and then read floats or ints, for example. So perhaps there is a different solution to your problem.
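One possible alternative, sketched here rather than taken from the answer, is to copy the lanes out of the vector explicitly instead of reading an inactive union member. The helper names toFloats and toIntBits are hypothetical, and the #pragma line is a GCC-specific assumption equivalent to -mavx.

```cpp
#include <cstring>
// GCC-specific assumption: enables AVX for this file, like passing -mavx.
#pragma GCC target("avx")
#include <immintrin.h>

// Hypothetical helper: stores the 8 float lanes into a plain array.
// The unaligned store places no alignment requirement on `out`.
inline void toFloats(const __m256 &v, float (&out)[8]) {
    _mm256_storeu_ps(out, v);
}

// Hypothetical helper: copies the raw bit patterns of the 8 lanes into ints.
// std::memcpy is the well-defined way to reinterpret object bytes in C++.
inline void toIntBits(const __m256 &v, int (&out)[8]) {
    static_assert(sizeof(out) == sizeof(v), "expects 8 four-byte ints");
    std::memcpy(out, &v, sizeof(v));
}
```

Compilers typically lower both helpers to plain vector stores when optimization is on, so the copies should not cost more than the union read they replace.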