What is the role of the operand stack in the JVM?

2022-01-16 00:00:00 jvm bytecode java jvm-bytecode

The JVM run-time data areas include a separate stack frame for each method being executed. Each frame contains an operand stack and local variables. Every time you assign a value to a variable, the value first has to be pushed onto the operand stack (e.g. with a const instruction) and then stored into the local variable. Why not operate on the local variable table directly instead of doing this seemingly redundant work?

Recommended answer

An instruction set with direct operands has to encode the operands in each instruction. In contrast, with an instruction set using an operand stack, the operands are implicit.

The advantage of implicit arguments is not obvious when looking at a small, trivial operation like loading a constant into a variable. That example compares an "opcode, constant, opcode, variable index" sequence with an "opcode, constant, variable index" sequence, so direct addressing seems simpler and more compact.

But let's look at, for example, return Math.sqrt(a * a + b * b);

Assuming the variable indices start at zero, the bytecode looks like

   0: dload_0
   1: dload_0
   2: dmul
   3: dload_2
   4: dload_2
   5: dmul
   6: dadd
   7: invokestatic  #2                  // Method java/lang/Math.sqrt:(D)D
  10: dreturn
  11 bytes total
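
To make the implicit operands concrete, the listing above can be traced by hand with a small stack machine. This is only a sketch: the class name OperandStackDemo and the method run are hypothetical, and a real JVM does not box values on a Deque. Each comment names the bytecode instruction the statement stands for.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackDemo {
    // Hypothetical mini-interpreter for the listing above.
    // a lives in slot 0 and b in slot 2, because a double occupies two slots.
    static double run(double a, double b) {
        double[] locals = new double[4];
        locals[0] = a;
        locals[2] = b;
        Deque<Double> stack = new ArrayDeque<>();

        stack.push(locals[0]);                  // dload_0
        stack.push(locals[0]);                  // dload_0
        stack.push(stack.pop() * stack.pop());  // dmul
        stack.push(locals[2]);                  // dload_2
        stack.push(locals[2]);                  // dload_2
        stack.push(stack.pop() * stack.pop());  // dmul
        stack.push(stack.pop() + stack.pop());  // dadd
        stack.push(Math.sqrt(stack.pop()));     // invokestatic Math.sqrt
        return stack.pop();                     // dreturn
    }

    public static void main(String[] args) {
        System.out.println(run(3.0, 4.0)); // prints 5.0
    }
}
```

None of the arithmetic steps names a source or a target; the stack position alone determines which values are consumed and where the result goes.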

For a directly addressing architecture, we would need something like

dmul a,a → tmp1
dmul b,b → tmp2
dadd tmp1,tmp2 → tmp1
invokestatic #2 tmp1 → tmp1
dreturn tmp1

where we have to replace the names with indices.

While this sequence consists of fewer instructions, each instruction has to encode its operands. If we want to be able to address 256 local variables, we need one byte per operand, so each arithmetic instruction needs three bytes plus the opcode, the invocation needs two bytes plus the opcode and the method address, and the return needs one byte plus the opcode. So for instructions aligned at byte boundaries, this sequence needs 19 bytes, significantly more than the equivalent Java bytecode, while being limited to 256 local variables, whereas the bytecode supports up to 65536 local variables.
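
The two byte counts can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: the class EncodingSize and its constants are hypothetical, and the direct-addressing instruction set is the imaginary one from this answer, assumed to use one-byte opcodes, one-byte variable indices, and the same two-byte method reference as invokestatic.

```java
public class EncodingSize {
    static final int OPCODE = 1;      // one byte per opcode
    static final int VAR_INDEX = 1;   // one byte can address 256 local variables
    static final int METHOD_REF = 2;  // two-byte method reference, as in invokestatic

    // Stack form: 4 x dload_n, 2 x dmul, 1 x dadd, then invokestatic and dreturn.
    static int stackFormBytes() {
        int loadsAndArithmetic = 7 * OPCODE;
        int invoke = OPCODE + METHOD_REF;
        int ret = OPCODE;
        return loadsAndArithmetic + invoke + ret;
    }

    // Direct form: 3 three-operand arithmetic instructions, a call with
    // one argument and one result variable, and a return naming its variable.
    static int directFormBytes() {
        int arithmetic = 3 * (OPCODE + 3 * VAR_INDEX);
        int invoke = OPCODE + METHOD_REF + 2 * VAR_INDEX;
        int ret = OPCODE + VAR_INDEX;
        return arithmetic + invoke + ret;
    }

    public static void main(String[] args) {
        System.out.println(stackFormBytes());  // prints 11
        System.out.println(directFormBytes()); // prints 19
    }
}
```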

This demonstrates another strength of the operand stack concept. Java bytecode allows combining different, optimized instructions: for loading an integer constant there are iconst_n, bipush, sipush, and ldc, and for storing it into a variable there are istore_n, istore n, and wide istore n. An instruction set with direct variable addressing would need a distinct instruction for each combination if it is supposed to support a wide range of constants and variable counts while still offering compact instructions. Likewise, it would need multiple versions of all arithmetic instructions.
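
How the load and store variants combine independently can be sketched as two small selection functions. The class CompactInstructions and its method names are hypothetical; the value ranges and instruction lengths in the comments follow the JVM instruction set, with the simplifying assumption that ldc suffices for large constants (a constant pool index above 255 would require the three-byte ldc_w).

```java
public class CompactInstructions {
    // Mnemonic of the shortest instruction that pushes the given int constant.
    static String loadConstant(int value) {
        if (value == -1) return "iconst_m1";                     // 1 byte
        if (value >= 0 && value <= 5) return "iconst_" + value;  // 1 byte
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush";                                     // 2 bytes
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush";                                     // 3 bytes
        }
        return "ldc";                                            // 2 bytes, value from the constant pool
    }

    // Mnemonic of the shortest instruction that stores an int into the given slot.
    static String storeInt(int index) {
        if (index < 0 || index > 0xFFFF) {
            throw new IllegalArgumentException("index out of range: " + index);
        }
        if (index <= 3) return "istore_" + index;  // 1 byte
        if (index <= 0xFF) return "istore";        // 2 bytes
        return "wide istore";                      // 4 bytes
    }

    public static void main(String[] args) {
        // Any load variant can be paired with any store variant.
        System.out.println(loadConstant(5) + " ; " + storeInt(1));
        System.out.println(loadConstant(1000) + " ; " + storeInt(300));
    }
}
```

With four load variants and three store variants, the stack form needs seven instructions in total, while a single combined "load constant into variable" instruction would need up to twelve distinct encodings to stay equally compact.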

Instead of a three-operand form, you could use a two-operand form, where one of the source variables also serves as the target variable. This results in more compact instructions but creates the need for additional transfer instructions if the operand's original value is still needed afterwards. The operand stack form is still more compact.

Keep in mind that this only describes the operations. An execution environment is not required to follow this logic strictly when executing the code. Apart from the simplest interpreters, all JVM implementations convert the bytecode into a different form before executing it, so the stored form doesn't matter for actual execution performance. It only affects the space requirements and loading time, both of which benefit from a more compact representation. This especially applies to code transferred over potentially slow network connections, one of the use cases Java was originally designed for.
