Understanding Java bytes

2022-01-09 00:00:00 binary byte java

Yesterday at work I had to write an application to count the pages in an AFP file. I dusted off my MO:DCA spec PDF and found the structured field BPG (Begin Page) and its 3-byte identifier. The app needs to run on an AIX box, so I decided to write it in Java.

For maximum efficiency, I decided to read the first 6 bytes of each structured field and then skip the remaining bytes in the field. This would get me:

0: Start of field byte
1-2: 2-byte length of field
3-5: 3-byte sequence identifying the type of field

So I check the field type and increment a page counter if it's BPG; otherwise I leave the counter alone. Then I skip the remaining bytes in the field rather than reading through them. And here, in the skipping (and really in the field length), is where I discovered that Java uses signed bytes.
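The read-six-bytes-then-skip loop described above might look roughly like the sketch below. The class and method names are hypothetical, the identifier bytes are passed in rather than hard-coded, and the skip count rests on an assumption that should be checked against the spec: that the 2-byte length counts every byte after the start byte, including the length and identifier bytes already read.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not taken from the original post.
final class FieldCounter {

    /** Counts the fields whose 3-byte identifier equals id. */
    static int countFields(InputStream in, byte[] id) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] header = new byte[6];
        int count = 0;
        while (true) {
            int first = data.read();          // 0..255, or -1 at end of stream
            if (first < 0) {
                break;                        // clean end of file between fields
            }
            header[0] = (byte) first;
            data.readFully(header, 1, 5);     // throws EOFException if truncated

            // Mask each byte so values of 0x80 and above are not sign-extended.
            int length = ((header[1] & 0xff) << 8) | (header[2] & 0xff);

            if (header[3] == id[0] && header[4] == id[1] && header[5] == id[2]) {
                count++;
            }

            // ASSUMPTION: length covers the 2 length bytes and 3 identifier
            // bytes already consumed, so 5 bytes of it are behind us.
            int remaining = length - 5;
            if (remaining < 0) {
                throw new IOException("Field length too small: " + length);
            }
            while (remaining > 0) {
                int skipped = data.skipBytes(remaining); // may skip fewer than asked
                if (skipped <= 0) {
                    throw new EOFException("Stream ended inside a field");
                }
                remaining -= skipped;
            }
        }
        return count;
    }

    /** Convenience overload for in-memory data. */
    static int countFields(byte[] bytes, byte[] id) {
        try {
            return countFields(new ByteArrayInputStream(bytes), id);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a real file, the stream would typically be a BufferedInputStream wrapped around a FileInputStream, and id would hold the three identifier bytes for BPG as listed in the spec.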

I did some googling and found quite a bit of useful information. Most useful, of course, was the advice to do a bitwise & with 0xff to get the unsigned value as an int. I needed this to get a length that could be used to calculate the number of bytes to skip.
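To illustrate why the mask matters for a 2-byte length, here is a small sketch comparing an unmasked and a masked combination of the two length bytes. The class and method names are made up for the example; Byte.toUnsignedInt is a standard library alternative to the mask, available since Java 8.

```java
// Hypothetical demo class, not taken from the original post.
final class UnsignedLength {

    /** Buggy: both bytes are sign-extended before being combined. */
    static int signedLength(byte hi, byte lo) {
        return (hi << 8) | lo;
    }

    /** Correct: each byte is reduced to 0..255 before being combined. */
    static int unsignedLength(byte hi, byte lo) {
        return ((hi & 0xff) << 8) | (lo & 0xff);
    }

    public static void main(String[] args) {
        byte hi = 0x01;
        byte lo = (byte) 0x90;                        // 144 unsigned, -112 as a Java byte

        System.out.println(signedLength(hi, lo));     // -112
        System.out.println(unsignedLength(hi, lo));   // 400
        System.out.println(Byte.toUnsignedInt(lo));   // 144
    }
}
```

Without the mask, the sign-extended low byte sets every upper bit of the result, so the length comes out negative and the skip calculation breaks.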

I now know that the unsigned value 128 shows up as -128 in a Java byte, 129 as -127, and so on up to 255 as -1. What I want to know is how the bitwise operation works here; more specifically, how I arrive at the binary representation of a negative number.

If I understand the bitwise & properly, the result is a number in which only the bits set in both operands are set. So assuming byte b = -128, we would have:

b & 0xff // 128

1000 0000   -128
1111 1111    255
---------
1000 0000    128

So how would I arrive at 1000 0000 for -128? How would I get the binary representation of something less obvious, like -72 or -64?

Recommended answer

To obtain the binary representation of a negative number, you calculate its two's complement:

  • Take the binary representation of the positive number
  • Invert all bits
  • Add one

Let's take -72 as an example:

0100 1000    72
1011 0111    All bits inverted
1011 1000    Add one

So the binary (8-bit) representation of -72 is 10111000.
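The invert-and-add-one steps can be checked in Java with a short sketch like the one below. The class and helper names are invented for the example; Integer.toBinaryString is the standard library call, and the & 0xff keeps only the low 8 bits so the output is always one byte wide.

```java
// Hypothetical demo class, not taken from the original post.
final class TwosComplement {

    /** Returns the low 8 bits of value as a zero-padded binary string. */
    static String bits8(int value) {
        String s = Integer.toBinaryString(value & 0xff);
        return String.format("%8s", s).replace(' ', '0');
    }

    /** 8-bit two's complement of a positive number: invert all bits, add one. */
    static int negate8(int positive) {
        return (~positive + 1) & 0xff;
    }

    public static void main(String[] args) {
        System.out.println(bits8(72));            // 01001000
        System.out.println(bits8(~72));           // 10110111  (all bits inverted)
        System.out.println(bits8(negate8(72)));   // 10111000  (add one)

        System.out.println(bits8(-72));           // 10111000
        System.out.println(bits8(-64));           // 11000000
        System.out.println(bits8(-128));          // 10000000
    }
}
```

Casting the result back to a byte, as in (byte) negate8(72), yields -72 again, which confirms that the bit pattern and the negative value are the same thing seen two ways.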

What is actually happening to you is the following: your file has a byte with the value 10111000. When interpreted as an unsigned byte (which is probably what you want), this is 184.

In Java, when this byte is held in a byte variable (for example, an element of the array filled by read(byte[])) and is then used as an int through implicit promotion, it is interpreted as a signed byte and sign-extended to 11111111 11111111 11111111 10111000. This is an integer with the value -72. (The single-byte InputStream.read() is different: it already returns an int in the range 0 to 255, or -1 at end of stream.)

By ANDing with 0xff you retain only the lowest 8 bits, so your integer is now 00000000 00000000 00000000 10111000, which is 184.
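The difference between plain promotion and promotion followed by the mask can be seen in a minimal sketch such as this one. The class and method names are illustrative only; the behaviour shown is the standard widening of byte to int in Java.

```java
// Hypothetical demo class, not taken from the original post.
final class SignExtension {

    /** Implicit widening: the sign bit is copied into the upper 24 bits. */
    static int widen(byte b) {
        return b;
    }

    /** Widening followed by a mask: the upper 24 bits are cleared. */
    static int mask(byte b) {
        return b & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xB8;   // bit pattern 10111000

        System.out.println(widen(b));                          // -72
        // Prints 32 bits: 24 ones followed by 10111000.
        System.out.println(Integer.toBinaryString(widen(b)));

        System.out.println(mask(b));                           // 184
        System.out.println(Integer.toBinaryString(mask(b)));   // 10111000
    }
}
```

Integer.toBinaryString omits leading zeros, which is why the masked value prints as 8 bits while the sign-extended one prints as the full 32.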
