Best practices for reducing garbage collector activity in JavaScript

2022-01-16 00:00:00 garbage-collection javascript

I have a fairly complex JavaScript app with a main loop that is called 60 times per second. There seems to be a lot of garbage collection going on (based on the 'sawtooth' output from the Memory timeline in the Chrome dev tools), and this often impacts the performance of the application.

So, I'm trying to research best practices for reducing the amount of work that the garbage collector has to do. (Most of the information I've been able to find on the web regards avoiding memory leaks, which is a slightly different question - my memory is getting freed up, it's just that there's too much garbage collection going on.) I'm assuming that this mostly comes down to reusing objects as much as possible, but of course the devil is in the details.

The app is structured in 'classes' along the lines of John Resig's Simple JavaScript Inheritance.

I think one issue is that some functions can be called thousands of times per second (as they are used hundreds of times during each iteration of the main loop), and perhaps the local working variables in these functions (strings, arrays, etc.) might be the issue.

I'm aware of object pooling for larger/heavier objects (and we use this to a degree), but I'm looking for techniques that can be applied across the board, especially relating to functions that are called very many times in tight loops.

What techniques can I use to reduce the amount of work that the garbage collector must do?

And, perhaps also: what techniques can be employed to identify which objects are being garbage collected the most? (It's a fairly large codebase, so comparing snapshots of the heap has not been very fruitful.)

Solution

A lot of the things you need to do to minimize GC churn go against what is considered idiomatic JS in most other scenarios, so please keep in mind the context when judging the advice I give.

Allocation happens in modern interpreters in several places:

  1. When you create an object via new or via literal syntax, [...] or {}.
  2. When you concatenate strings.
  3. When you enter a scope that contains function declarations.
  4. When you perform an action that triggers an exception.
  5. When you evaluate a function expression: (function (...) { ... }).
  6. When you perform an operation that coerces to Object, like Object(myNumber) or Number.prototype.toString.call(42).
  7. When you call a builtin that does any of these under the hood, like Array.prototype.slice.
  8. When you use arguments to reflect over the parameter list.
  9. When you split a string or match with a regular expression.

Avoid doing those, and pool and reuse objects where possible.
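As a rough illustration of pooling small temporaries in a hot loop, here is a minimal sketch. Vec2Pool, acquire, release, and step are hypothetical names invented for this example, and it assumes every acquired object is released exactly once.

```javascript
// Hypothetical fixed-size pool of small {x, y} objects (illustrative only).
class Vec2Pool {
  constructor(size) {
    this.items = new Array(size);
    for (let i = 0; i < size; i++) this.items[i] = { x: 0, y: 0 };
    this.free = size;    // slots [0, free) hold objects that are available
    this.created = size; // total objects ever allocated, for diagnostics
  }

  acquire() {
    if (this.free === 0) {
      // Pool exhausted: fall back to a normal allocation.
      this.created++;
      return { x: 0, y: 0 };
    }
    return this.items[--this.free];
  }

  release(v) {
    v.x = 0;
    v.y = 0;
    // Objects beyond the pool's capacity are simply dropped.
    if (this.free < this.items.length) this.items[this.free++] = v;
  }
}

const pool = new Vec2Pool(4); // allocated once, outside the main loop

// Called many times per frame; uses a pooled temporary instead of a literal.
function step(px, py, vx, vy, dt) {
  const tmp = pool.acquire();
  tmp.x = px + vx * dt;
  tmp.y = py + vy * dt;
  const distance = Math.sqrt(tmp.x * tmp.x + tmp.y * tmp.y);
  pool.release(tmp);
  return distance;
}
```

If pool.created keeps growing while the app runs, the pool is probably too small or some objects are never released.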

Specifically, look out for opportunities to:

  1. Pull inner functions that have no or few dependencies on closed-over state out into a higher, longer-lived scope. (Some code minifiers like Closure compiler can inline inner functions and might improve your GC performance.)
  2. Avoid using strings to represent structured data or for dynamic addressing. Especially avoid repeatedly parsing using split or regular expression matches since each requires multiple object allocations. This frequently happens with keys into lookup tables and dynamic DOM node IDs. For example, lookupTable['foo-' + x] and document.getElementById('foo-' + x) both involve an allocation since there is a string concatenation. Often you can attach keys to long-lived objects instead of re-concatenating. Depending on the browsers you need to support, you might be able to use Map to use objects as keys directly.
  3. Avoid catching exceptions on normal code-paths. Instead of try { op(x) } catch (e) { ... }, do if (!opCouldFailOn(x)) { op(x); } else { ... }.
  4. When you can't avoid creating strings, e.g. to pass a message to a server, use a builtin like JSON.stringify which uses an internal native buffer to accumulate content instead of allocating multiple objects.
  5. Avoid using callbacks for high-frequency events, and where you can, pass as a callback a long-lived function (see 1) that recreates state from the message content.
  6. Avoid using arguments since functions that use that have to create an array-like object when called.
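Points 1 and 2 might look something like the following sketch. The names makeEntity, updateAllSlow, updateAllFast, table, and scores are hypothetical, and how much each change actually saves depends on the engine, so measure before and after.

```javascript
// Before: every call creates a new function object for the callback,
// and every entity lookup concatenates a new key string.
function updateAllSlow(entities, table) {
  entities.forEach(function (e) {
    e.value = table['foo-' + e.id];
  });
}

// After: the key is built once when the entity is created and stored on
// the long-lived object, and the loop body needs no closure.
function makeEntity(id) {
  return { id: id, key: 'foo-' + id, value: 0 };
}

function updateAllFast(entities, table) {
  for (let i = 0; i < entities.length; i++) {
    const e = entities[i];
    e.value = table[e.key];
  }
}

// Where Map is available, an object can serve as the key directly,
// so no string key is needed at all.
const scores = new Map();
const player = makeEntity(1);
scores.set(player, 42);
const playerScore = scores.get(player); // looked up by object identity
```

The "fast" version trades a little memory per entity (the cached key) for fewer short-lived strings and closures in the loop.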

I suggested using JSON.stringify to create outgoing network messages. Parsing input messages using JSON.parse obviously involves allocation, and lots of it for large messages. If you can represent your incoming messages as arrays of primitives, then you can save a lot of allocations. The only other builtin around which you can build a parser that does not allocate is String.prototype.charCodeAt. A parser for a complex format that only uses that is going to be hellish to read though.
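To give a sense of what a charCodeAt-based parser looks like, here is a minimal sketch for a deliberately simple format: non-negative decimal integers separated by commas. The function name parseInts, the message format, and the reusable Int32Array buffer are assumptions made for this example, not something from the original app.

```javascript
// Parses comma-separated non-negative integers from `text` into the
// preallocated `out` array and returns how many values were written.
// Characters other than digits and commas are ignored in this sketch,
// and values that do not fit in `out`'s element type will wrap.
function parseInts(text, out) {
  let count = 0;
  let value = 0;
  let inNumber = false;
  for (let i = 0; i < text.length; i++) {
    const c = text.charCodeAt(i); // returns a number, no string allocated
    if (c >= 48 && c <= 57) {     // '0'..'9'
      value = value * 10 + (c - 48);
      inNumber = true;
    } else if (c === 44) {        // ','
      if (inNumber && count < out.length) out[count++] = value;
      value = 0;
      inNumber = false;
    }
  }
  // Flush the final number, which has no trailing comma.
  if (inNumber && count < out.length) out[count++] = value;
  return count;
}

const buffer = new Int32Array(16); // allocated once, reused for every message
const n = parseInts('12,7,300', buffer);
```

Compared with text.split(',').map(Number), this avoids creating an intermediate array and one substring per field, at the cost of readability.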
