What is a stack map frame?
I've recently been looking at the Java Virtual Machine Specification (JVMS) to try to better understand what makes my programs work, but I've found a section that I'm not quite getting...
Section 4.7.4 describes the StackMapTable attribute, and in that section the document goes into detail about stack map frames. The issue is that it's a little wordy, and I learn best by example, not by reading.
I understand that the first stack map frame is derived from the method descriptor, but I don't understand how (which is supposedly explained here). Also, I don't entirely understand what stack map frames do. I would assume they're similar to blocks in Java, but it appears that you can't nest stack map frames inside each other.
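Anyway, I have two specific questions:
- What do stack map frames do?
- How is the first stack map frame created?
And one general question:
- Can someone provide an explanation that is more concise and easier to understand than the one given in the JVMS?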
Recommended answer
Java requires all classes that are loaded to be verified, in order to maintain the security of the sandbox and ensure that the code is safe to optimize. Note that this is done at the bytecode level, so verification does not check invariants of the Java language. It only checks that the bytecode makes sense according to the rules for bytecode.
Among other things, bytecode verification makes sure that instructions are well formed, that all jumps are to valid instructions within the method, and that all instructions operate on values of the correct type. The last of these is where the stack map comes in.
The thing is that bytecode by itself contains no explicit type information. Types are determined implicitly through dataflow analysis. For example, an iconst instruction creates an integer value. If you store it in slot 1, that slot now holds an int. If control flow merges with code that stores a float there instead, the slot is now considered to have an invalid type, meaning that you can't do anything more with that value until you overwrite it.
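The slot example above can be sketched in a few lines of Java. This is only an illustration, not the JVM's verifier: the class SlotTypes, the enum VType, and the simplified iconst/fconst/store methods are names made up for this sketch, and store skips the type check that a real istore or fstore would perform. TOP stands for the "invalid, unusable until overwritten" type.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class SlotTypes {
    // Hypothetical, simplified set of verification types.
    enum VType { INT, FLOAT, TOP }

    final Deque<VType> stack = new ArrayDeque<>();
    final VType[] locals;

    SlotTypes(int maxLocals) {
        locals = new VType[maxLocals];
        Arrays.fill(locals, VType.TOP);   // nothing usable has been stored yet
    }

    void iconst() { stack.push(VType.INT); }     // pushes an int onto the operand stack
    void fconst() { stack.push(VType.FLOAT); }   // pushes a float onto the operand stack
    void store(int slot) { locals[slot] = stack.pop(); }  // simplified: no type check

    // Where two control-flow paths join, a slot keeps its type only if both agree.
    static VType merge(VType a, VType b) {
        return a == b ? a : VType.TOP;
    }

    public static void main(String[] args) {
        SlotTypes pathA = new SlotTypes(2);
        pathA.iconst();
        pathA.store(1);                   // slot 1 holds an int on this path

        SlotTypes pathB = new SlotTypes(2);
        pathB.fconst();
        pathB.store(1);                   // slot 1 holds a float on this path

        System.out.println(pathA.locals[1]);                          // INT
        System.out.println(merge(pathA.locals[1], pathB.locals[1]));  // TOP
    }
}
```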
Historically, the bytecode verifier inferred all the types using these dataflow rules. Unfortunately, it is impossible to infer all the types in a single linear pass through the bytecode, because a backward jump might invalidate types that have already been inferred. The classic verifier solved this by iterating over the code until nothing changed any more, potentially requiring multiple passes.
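To make the "iterate until nothing changes" idea concrete, here is a rough sketch using a made-up five-opcode instruction set that tracks the type of a single local slot. It is not how HotSpot's old verifier is actually written; FixpointSketch, Op, Insn, and infer are all hypothetical names, and a worklist stands in for repeated passes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class FixpointSketch {
    enum VType { INT, FLOAT, TOP }                       // TOP = unusable
    enum Op { STORE_INT, STORE_FLOAT, NOP, IF, GOTO }    // toy instruction set

    static final class Insn {
        final Op op;
        final int target;                                // jump target, or -1
        Insn(Op op, int target) { this.op = op; this.target = target; }
        Insn(Op op) { this(op, -1); }
    }

    static VType merge(VType old, VType incoming) {
        if (old == null) return incoming;                // first time we reach this point
        return old == incoming ? old : VType.TOP;
    }

    // Infers the type of the single slot on entry to each instruction.
    static VType[] infer(Insn[] code) {
        VType[] in = new VType[code.length];
        in[0] = VType.TOP;                               // slot starts out unusable
        Deque<Integer> work = new ArrayDeque<>();
        work.add(0);
        while (!work.isEmpty()) {
            int pc = work.poll();
            Insn insn = code[pc];
            VType out = in[pc];
            if (insn.op == Op.STORE_INT) out = VType.INT;
            if (insn.op == Op.STORE_FLOAT) out = VType.FLOAT;

            int[] successors;
            if (insn.op == Op.GOTO) successors = new int[] { insn.target };
            else if (insn.op == Op.IF) successors = new int[] { pc + 1, insn.target };
            else successors = new int[] { pc + 1 };

            for (int next : successors) {
                if (next >= code.length) continue;       // running off the end acts as a return
                VType merged = merge(in[next], out);
                if (merged != in[next]) {                // state changed: revisit that instruction
                    in[next] = merged;
                    work.add(next);
                }
            }
        }
        return in;
    }

    public static void main(String[] args) {
        Insn[] code = {
            new Insn(Op.STORE_INT),      // 0
            new Insn(Op.NOP),            // 1: loop head, first inferred as INT
            new Insn(Op.STORE_FLOAT),    // 2
            new Insn(Op.IF, 1),          // 3: backward jump carrying FLOAT
            new Insn(Op.NOP),            // 4
        };
        System.out.println(Arrays.toString(infer(code)));
        // [TOP, TOP, TOP, FLOAT, FLOAT]
    }
}
```

Instruction 1 is first inferred as INT, then downgraded to TOP once the backward jump from instruction 3 is processed, which forces instructions 1 and 2 to be revisited. That revisiting is the extra work the newer verifier avoids.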
However, verification makes class loading slow. Oracle decided to solve this by adding a new, faster verifier that can verify bytecode in a single pass. To do this, they required all new classes starting in Java 7 (with Java 6 in a transitional state) to carry metadata about their types. Since the bytecode format itself can't be changed, this type information is stored separately in an attribute called StackMapTable.
Simply storing the type of every single value at every single point in the code would obviously take up a lot of space and be very wasteful. To make the metadata smaller and more efficient, they decided to list the types only at positions that are targets of jumps. If you think about it, these are the only places where you need the extra information to do single-pass verification. In between jump targets, all control flow is linear, so you can infer the types at the intermediate positions using the old inference rules.
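As a small illustration of where jump targets show up in ordinary code, consider the sketch below. The class and method names are invented for this example, and the exact bytecode (and therefore the exact frames) depends on the compiler; running javap -v on the compiled class prints the StackMapTable attribute so you can check what your compiler actually emitted.

```java
public class FrameSites {
    // Straight-line code: no jumps, so no position needs explicit type information.
    static int twice(int x) {
        return x + x;
    }

    static int sumBelow(int n) {
        int total = 0;
        int i = 0;
        // The loop condition is the target of the backward jump at the end of
        // the body, so the types of n, total and i must be recorded here.
        while (i < n) {
            total += i;
            i++;
        }
        // The code after the loop is the target of the conditional jump that
        // leaves the loop, so it also needs recorded types.
        return total;
    }

    public static void main(String[] args) {
        System.out.println(twice(21));     // 42
        System.out.println(sumBelow(5));   // 10
    }
}
```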
Each position where types are explicitly listed is known as a stack map frame. The StackMapTable attribute contains a list of frames in order, though they are usually expressed as a difference from the previous frame to reduce data size. If there are no frames in the method, which happens when control flow never joins (i.e. the CFG is a tree), then the StackMapTable attribute can be omitted entirely.
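One part of that difference encoding is the position of each frame. As I read JVMS 4.7.4, each frame stores an offset_delta rather than an absolute bytecode offset: the first frame applies at offset_delta, and every later frame applies at the previous frame's offset plus offset_delta plus 1. The sketch below only decodes those positions; the class name, method name, and the sample deltas are made up for illustration, and the encoding of the locals and stack (same, append, chop, full frames) is not covered.

```java
import java.util.Arrays;

public class FrameOffsets {
    // Turns a sequence of offset_delta values into absolute bytecode offsets.
    static int[] absoluteOffsets(int[] offsetDeltas) {
        int[] offsets = new int[offsetDeltas.length];
        int previous = -1;   // so that the first frame lands exactly at its offset_delta
        for (int k = 0; k < offsetDeltas.length; k++) {
            offsets[k] = previous + offsetDeltas[k] + 1;
            previous = offsets[k];
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Hypothetical deltas for a method with two frames.
        System.out.println(Arrays.toString(absoluteOffsets(new int[] { 4, 14 })));
        // [4, 19]
    }
}
```

The extra 1 means two frames can never be declared at the same offset, and it lets a delta of 0 describe a frame at the very next bytecode position.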
So this is the basic idea of how StackMapTable works and why it was added. The last question is how the implicit initial frame is created. The answer is that at the beginning of the method, the operand stack is empty and the local variable slots have the types of the method parameters, which are determined from the method descriptor.
If you're used to Java, there are a few minor differences in how method parameter types work at the bytecode level. First off, instance (non-static) methods have an implicit this as their first parameter. Second, boolean, byte, char, and short do not exist at the bytecode level. Instead, they are all implemented as ints behind the scenes.
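Putting the last two paragraphs together, here is a minimal sketch of how the initial locals could be derived from a method descriptor. InitialFrame and initialLocals are hypothetical names, the result is just a list of readable type names rather than the class-file encoding, and the sketch assumes a well-formed descriptor. It also ignores two details: long and double each occupy two local variable slots, and inside a constructor this starts out as an uninitialized object rather than a normal reference.

```java
import java.util.ArrayList;
import java.util.List;

public class InitialFrame {
    // owner is the internal name of the declaring class, e.g. "com/example/Foo".
    static List<String> initialLocals(String owner, boolean isStatic, String descriptor) {
        List<String> locals = new ArrayList<>();
        if (!isStatic) {
            locals.add(owner);                         // implicit 'this'
        }
        int i = 1;                                     // skip the opening '('
        while (descriptor.charAt(i) != ')') {
            int start = i;
            char kind = descriptor.charAt(i);
            while (descriptor.charAt(i) == '[') i++;   // array dimensions
            if (descriptor.charAt(i) == 'L') {
                i = descriptor.indexOf(';', i);        // end of the class name
            }
            i++;
            String field = descriptor.substring(start, i);
            switch (kind) {
                case 'Z': case 'B': case 'C': case 'S': case 'I':
                    locals.add("int");                 // the small types are all ints
                    break;
                case 'F': locals.add("float"); break;
                case 'J': locals.add("long"); break;
                case 'D': locals.add("double"); break;
                case 'L':
                    locals.add(field.substring(1, field.length() - 1));
                    break;
                default:
                    locals.add(field);                 // arrays keep their descriptor
            }
        }
        return locals;                                 // the operand stack starts out empty
    }

    public static void main(String[] args) {
        System.out.println(initialLocals("Example", false, "(IZLjava/lang/String;[BJ)V"));
        // [Example, int, int, java/lang/String, [B, long]
    }
}
```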