How are closures and scopes represented at run time in JavaScript

2022-01-16 00:00:00 garbage-collection closures javascript

This is mostly an out-of-curiosity question. Consider the following functions:

var closure ;
function f0() {
    var x = new BigObject() ;
    var y = 0 ;
    closure = function(){ return 7; } ;
}
function f1() {
    var x = new BigObject() ;
    closure =  (function(y) { return function(){return y++;} ; })(0) ;
}
function f2() {
    var x = new BigObject() ;
    var y = 0 ;
    closure = function(){ return y++ ; } ;
}

In every case, once the function has run there is (I think) no way to reach x, so the BigObject can be garbage collected, provided x held the last reference to it. A simple-minded interpreter would capture the whole scope chain whenever a function expression is evaluated. (For one thing, it needs to do this to make calls to eval work; see the example below.) A smarter implementation might avoid this in f0 and f1. An even smarter implementation would retain y but not x, which is what f2 needs in order to be efficient.
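To make the two strategies concrete, here is a toy model of them, not how any real engine is implemented. The names `makeEnv`, `captureAll`, and `captureFree` are hypothetical and exist only for illustration; an "environment" is modelled as a plain object mapping each variable name to a mutable cell.

```javascript
// Toy model only: each variable lives in a mutable cell so that a closure
// and its defining scope share updates, as they do in real JavaScript.
function makeEnv(bindings) {
  const env = {};
  for (const [name, value] of Object.entries(bindings)) {
    env[name] = { value };
  }
  return env;
}

// Simple-minded strategy: the closure keeps the whole environment alive.
function captureAll(env) {
  return env;
}

// Smarter strategy: keep only the cells of variables the body mentions.
function captureFree(env, freeNames) {
  const kept = {};
  for (const name of freeNames) {
    kept[name] = env[name]; // same cell object, not a copy of the value
  }
  return kept;
}

// Model of f2: x stands in for the BigObject, y is the counter.
const env = makeEnv({ x: "stand-in for BigObject", y: 0 });
const naive = captureAll(env);
const smart = captureFree(env, ["y"]);

smart.y.value++; // the closure body `y++` acting through the shared cell

console.log(Object.keys(naive)); // x and y are both still reachable
console.log(Object.keys(smart)); // only y is reachable
console.log(env.y.value);        // 1: the update is visible in the original scope
```

Under the second strategy nothing reachable from the closure refers to the cell for x, so a collector would be free to reclaim it.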

My question is: how do modern JavaScript engines (JaegerMonkey, V8, etc.) deal with these situations?

Finally, here is an example showing that variables may need to be retained even if they are never mentioned in the nested function.

var f = (function(x, y){ return function(str) { return eval(str) ; } } )(4, 5) ;
f("1+2") ; // 3
f("x+y") ; // 9
f("x=6") ;
f("x+y") ; // 11

However, there are restrictions that prevent one from sneaking in a call to eval in ways that the compiler might miss.

Recommended answer

It's not true that there are restrictions preventing you from calling eval in ways that static analysis would miss; it's just that such indirect references to eval run in the global scope. Note that this is a change in ES5 from ES3, where indirect and direct references to eval both ran in the local scope, so I'm unsure whether any engine actually optimizes based on this fact.
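The ES5 distinction between direct and indirect eval can be seen in a short sketch. The names `scopeProbe`, `viaDirectEval`, `viaIndirectEval`, and `viaFunctionConstructor` are hypothetical; the global is set on `globalThis` so that it is a true global even when the code runs inside a module.

```javascript
// Hypothetical global binding used only for this demonstration.
globalThis.scopeProbe = "global";

function viaDirectEval() {
  var scopeProbe = "local";
  // Direct eval: the code string is evaluated in the caller's local scope.
  return eval("scopeProbe");
}

function viaIndirectEval() {
  var scopeProbe = "local";
  // Indirect eval: (0, eval) yields the function value, so the call is no
  // longer a direct eval and the code is evaluated in the global scope.
  return (0, eval)("scopeProbe");
}

function viaFunctionConstructor() {
  var scopeProbe = "local";
  // The Function constructor also compiles its body against the global scope.
  return new Function("return scopeProbe;")();
}

console.log(viaDirectEval());          // local
console.log(viaIndirectEval());        // global
console.log(viaFunctionConstructor()); // global
```

Only the first form can observe the enclosing function's variables, which is why only a syntactically visible direct eval obliges an engine to keep the whole scope alive.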

An obvious way to test this is to make BigObject a really big object and force a GC after running each of f0–f2. (Because, hey, as much as I think I know the answer, testing is always better!)

So…

var closure;
function BigObject() {
  var a = '';
  for (var i = 0; i <= 0xFFFF; i++) a += String.fromCharCode(i);
  return new String(a); // Turn this into an actual object
}
function f0() {
  var x = new BigObject();
  var y = 0;
  closure = function(){ return 7; };
}
function f1() {
  var x = new BigObject();
  closure =  (function(y) { return function(){return y++;}; })(0);
}
function f2() {
  var x = new BigObject();
  var y = 0;
  closure = function(){ return y++; };
}
function f3() {
  var x = new BigObject();
  var y = 0;
  closure = eval("(function(){ return 7; })"); // direct eval
}
function f4() {
  var x = new BigObject();
  var y = 0;
  closure = (1,eval)("(function(){ return 7; })"); // indirect eval (evaluates in global scope)
}
function f5() {
  var x = new BigObject();
  var y = 0;
  closure = (function(){ return eval("(function(){ return 7; })"); })();
}
function f6() {
  var x = new BigObject();
  var y = 0;
  closure = function(){ return eval("(function(){ return 7; })"); };
}
function f7() {
  var x = new BigObject();
  var y = 0;
  closure = (function(){ return (1,eval)("(function(){ return 7; })"); })();
}
function f8() {
  var x = new BigObject();
  var y = 0;
  closure = function(){ return (1,eval)("(function(){ return 7; })"); };
}
function f9() {
  var x = new BigObject();
  var y = 0;
  closure = new Function("return 7;"); // creates function in global scope
}

I've added tests for eval/Function, since these are also interesting cases. The difference between f5 and f6 is interesting: f5 is effectively identical to f3, because the function assigned to closure is the same in both; f6 merely returns something that gives that function once evaluated, and as the eval hasn't been evaluated yet, the compiler can't know that it contains no reference to x.

js> gc();
"before 73728, after 69632, break 01d91000
"
js> f0();
js> gc(); 
"before 6455296, after 73728, break 01d91000
"
js> f1(); 
js> gc(); 
"before 6455296, after 77824, break 01d91000
"
js> f2(); 
js> gc(); 
"before 6455296, after 77824, break 01d91000
"
js> f3(); 
js> gc(); 
"before 6455296, after 6455296, break 01db1000
"
js> f4(); 
js> gc(); 
"before 12828672, after 73728, break 01da2000
"
js> f5(); 
js> gc(); 
"before 6455296, after 6455296, break 01da2000
"
js> f6(); 
js> gc(); 
"before 12828672, after 6467584, break 01da2000
"
js> f7(); 
js> gc(); 
"before 12828672, after 73728, break 01da2000
"
js> f8(); 
js> gc(); 
"before 6455296, after 73728, break 01da2000
"
js> f9(); 
js> gc(); 
"before 6455296, after 73728, break 01da2000
"

SpiderMonkey appears to GC x on everything except f3, f5, and f6.

It appears to collect as much as possible (i.e., y as well as x, when possible) unless there is a direct eval call within the scope chain of any function that still exists. (This holds even if the function object containing the eval has itself been GC'd and no longer exists, as is the case in f5, which theoretically means x and y could be collected.)

gsnedders@dolores:~$ v8 --expose-gc --trace_gc --shell foo.js
V8 version 3.0.7
> gc();
Mark-sweep 0.8 -> 0.7 MB, 1 ms.
> f0();
Scavenge 1.7 -> 1.7 MB, 2 ms.
Scavenge 2.4 -> 2.4 MB, 2 ms.
Scavenge 3.9 -> 3.9 MB, 4 ms.
> gc();   
Mark-sweep 5.2 -> 0.7 MB, 3 ms.
> f1();
Scavenge 4.7 -> 4.7 MB, 9 ms.
> gc();
Mark-sweep 5.2 -> 0.7 MB, 3 ms.
> f2();
Scavenge 4.8 -> 4.8 MB, 6 ms.
> gc();
Mark-sweep 5.3 -> 0.8 MB, 3 ms.
> f3();
> gc();
Mark-sweep 5.3 -> 5.2 MB, 17 ms.
> f4();
> gc();
Mark-sweep 9.7 -> 0.7 MB, 5 ms.
> f5();
> gc();
Mark-sweep 5.3 -> 5.2 MB, 12 ms.
> f6();
> gc();
Mark-sweep 9.7 -> 5.2 MB, 14 ms.
> f7();
> gc();
Mark-sweep 9.7 -> 0.7 MB, 5 ms.
> f8();
> gc();
Mark-sweep 5.2 -> 0.7 MB, 2 ms.
> f9();
> gc();
Mark-sweep 5.2 -> 0.7 MB, 2 ms.

V8 appears to GC x on everything apart from f3, f5, and f6. This is identical to SpiderMonkey; see the analysis above. (Note, however, that the numbers aren't detailed enough to tell whether y is being GC'd when x is not; I haven't investigated this.)
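For readers who want to repeat this kind of measurement on a current V8, the following is a minimal sketch, assuming Node.js started with the `--expose-gc` flag (which makes `globalThis.gc` available). The helpers `heapAfterGc` and `retainedBy`, and the function `f2Like`, are hypothetical names, and the array is only a stand-in for BigObject. Heap figures vary between versions and runs, so no expected numbers are given.

```javascript
// Assumes: node --expose-gc measure.js
// Without the flag, globalThis.gc is undefined and the helpers return null.
function heapAfterGc() {
  if (typeof globalThis.gc !== "function") return null;
  globalThis.gc(); // request a full collection
  return process.memoryUsage().heapUsed; // bytes of live JS heap, per Node.js
}

// Hypothetical helper: heap growth that survives a forced GC after fn runs.
function retainedBy(fn) {
  const before = heapAfterGc();
  fn();
  const after = heapAfterGc();
  return before === null || after === null ? null : after - before;
}

let closure;

// Same shape as f2 above: the closure mentions y but never x.
function f2Like() {
  const x = new Array(1 << 20).fill(0); // stand-in for BigObject
  let y = 0;
  closure = function () { return y++; };
}

// Prints null when --expose-gc was not given, otherwise a byte count.
console.log(retainedBy(f2Like));
```

A result close to zero would suggest x was collected; a result in the megabytes would suggest it was retained. Swapping the closure body for one containing a direct eval gives the f6-style comparison.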

I'm not going to bother running this again, but needless to say the behaviour is identical to SpiderMonkey and V8. It is harder to test without a JS shell, but doable given time.

Building JSC is a pain on Linux, and Chakra doesn't run on Linux. I believe JSC behaves the same as the engines above, and I'd be surprised if Chakra didn't too. (Doing anything better quickly becomes very complex; doing anything worse would mean almost never collecting anything and having serious memory issues…)
