Output of an arbitrarily dereferenced pointer

2022-01-17 00:00:00 pointers arm c++ memory-alignment x86

I fill the memory as follows:

char buf[8] = { 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88};

Then I point an unsigned long pointer at each of the first 5 bytes in turn and print the result:

char *c_ptr;
unsigned long *u_ptr;

c_ptr = buf;
for (int i=0;i<5;i++)
{
    u_ptr = (unsigned long *)c_ptr;
    printf("%X\n",*u_ptr);
    c_ptr++;
}

When I execute this code on my x64 platform, I get what I expected:

44332211
55443322
66554433
77665544
88776655

But when I execute the same code on an ARM platform, I get the following:

44332211
11443322
22114433
33221144
88776655

I.e. it wraps around at every 4-byte boundary and dereferences only the 4 bytes within those bounds.

So I want to ask: is this behavior (when pointer_value%4 != 0) erroneous or implementation-specific?

UPD: I know about endianness. I want to know whether it is correct that I am getting

11443322

instead of

55443322

I.e. when I have a pointer with the value 0x10000001, for example, it builds the unsigned long from the bytes at addresses 0x10000001, 0x10000002, 0x10000003 and then 0x10000000, instead of 0x10000004.

Recommended answer

After suspecting memory alignment, I did a quick Google search =)

http://awayitworks.blogspot.co.nz/2010/02/arm-memory-alignment.html

That article says:

Till ARMv4 architecture, it’s assumed that address given for fetching contents is memory aligned...a 32-bit data fetch should have address aligned to 32-bit and so on. As guessed correctly the problem is only for 32-bit and 16-bit data fetching. ARM ignores lower 2-bits of address if the data fetch is 32-bit, and ignores lower 1-bit if data fetch is 16-bit. So, in all if the address is not properly aligned then data fetch will be erroneous.

Note the last sentence =)
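
The rotated values in the question are consistent with how older ARM cores (before ARMv6, as far as I know) handle an unaligned word load: the low two address bits are dropped for the memory access, and the aligned word is then rotated right by 8 bits for each byte of misalignment. The sketch below only simulates that behaviour in portable code; `legacy_arm_load` is a hypothetical name, and a little-endian memory layout is assumed.

```cpp
#include <cstdint>

// Hypothetical helper: models a legacy (pre-ARMv6) unaligned 32-bit load.
// `aligned_base` is assumed to point at a 4-byte-aligned buffer, and
// `offset` is the byte offset the misaligned pointer would have had.
std::uint32_t legacy_arm_load(const unsigned char *aligned_base, unsigned offset) {
    // Drop the low two bits: the access always hits the aligned word.
    const unsigned char *w = aligned_base + (offset & ~3u);

    // Assemble that word as little-endian.
    std::uint32_t word = std::uint32_t(w[0])
                       | std::uint32_t(w[1]) << 8
                       | std::uint32_t(w[2]) << 16
                       | std::uint32_t(w[3]) << 24;

    // Rotate right by 8 bits per byte of misalignment.
    unsigned rot = 8 * (offset & 3u);
    return rot == 0 ? word : (word >> rot) | (word << (32 - rot));
}
```

Calling it with offsets 0 through 4 on the buffer from the question gives 44332211, 11443322, 22114433, 33221144 and 88776655, which matches the ARM output.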

If you require the behaviour that you expected on x86, you'll have to explicitly build the integers from chars, i.e. (assuming little-endian):

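// Endian-specific: assembles the value as little-endian.
// Each byte goes through unsigned char first so that a signed char
// such as 0x88 is not sign-extended into the upper bits.
inline unsigned long ulong_at( const char *p ) {
    return ((unsigned long)(unsigned char)p[0])
         | (((unsigned long)(unsigned char)p[1]) << 8)
         | (((unsigned long)(unsigned char)p[2]) << 16)
         | (((unsigned long)(unsigned char)p[3]) << 24);
}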

Or perhaps:

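// Architecture-specific: copies the bytes in the host's byte order
inline unsigned long ulong_at( const char *p ) {
    unsigned long val = 0;  // zero first: unsigned long may be wider than 4 bytes
    char *v = (char*)&val;
    v[0] = p[0];
    v[1] = p[1];
    v[2] = p[2];
    v[3] = p[3];
    return val;
}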
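
A common alternative to copying the bytes by hand is `std::memcpy`, which has no alignment requirement on its source. The following is a minimal sketch rather than a drop-in replacement: `load_u32` is a hypothetical name, and it uses the fixed-width `std::uint32_t` instead of `unsigned long` so that exactly 4 bytes are read on every platform.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads 4 bytes starting at p, at any alignment.
// The result is in the host's byte order, so like the hand-written
// byte copy it is architecture-specific.
inline std::uint32_t load_u32(const char *p) {
    std::uint32_t val;
    std::memcpy(&val, p, sizeof val);  // byte-wise copy, safe for unaligned p
    return val;
}
```

Compilers typically reduce a fixed-size `memcpy` like this to a single load instruction on targets where unaligned access is safe, so it usually costs nothing extra on x86.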
