Can you "dumb down" ES6 template strings to normal strings?

I have to work around gettext's inability to recognise ES6 template strings. I am thinking of extracting the "non-interpolated value" of each template string as a compilation step, so that the code contains only "normal" strings.

Basically, what I would like to achieve is to transform this

const adjective = 'wonderful'
const something = `Look, I am a ${adjective} string`

console.log(something)
> "Look, I am a wonderful string"

into this

const adjective = 'wonderful'
const something = 'Look, I am a ${adjective} string'

console.log(something)
> "Look, I am a ${adjective} string"

One brutal way of achieving this is to use sed, but it is certainly not the most elegant approach (and it is probably error-prone):

sed "s/\`/'/g" FILENAME

Does any better, cleaner idea come to mind?

Recommended answer

Great question. Four solutions come to mind:

A brute-force replacement of backticks with quote marks before scanning for translatable strings, as you suggested, is not a horrible idea, as long as you understand the risks. For instance, consider a regular string that itself contains backticks:

"hello, this word is in `backticks`"

Another edge case is a nested template string:

`${`I am nested`}`

This approach will also break multi-line template strings, because a regular quoted string cannot contain a raw line break.
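To make these risks concrete, here is a minimal sketch that applies the naive replacement to the three cases above and checks whether the result is still valid JavaScript. The helper names `bruteForce` and `parses` are made up for this illustration, and `new Function` is used only as a quick syntax check (the code is never executed).

```javascript
// Naive rewrite: every backtick becomes a single quote.
function bruteForce(source) {
  return source.replace(/`/g, "'");
}

// Returns true when `source` is syntactically valid as a function body.
function parses(source) {
  try {
    new Function(source); // compiles the code without running it
    return true;
  } catch (err) {
    return false;
  }
}

const safe = 'const s = `Look, I am a ${adjective} string`;';
const insideString = 'const s = "this word is in `backticks`";';
const nested = 'const s = `${`I am nested`}`;';
const multiLine = 'const s = `line one\nline two`;';

console.log(parses(bruteForce(safe)));      // true
console.log(bruteForce(insideString));      // const s = "this word is in 'backticks'";
console.log(parses(bruteForce(nested)));    // false
console.log(parses(bruteForce(multiLine))); // false
```

The first case is the one the replacement is meant for. The second still parses but silently changes the text of the string, while the last two no longer parse at all.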

Of course, the "correct" solution is to write a fork of xgettext that deals with template strings. Then you could just write

const something = _(`Look, I am a ${adjective} string`);

Unfortunately, this could be harder than it seems. There is a bunch of hard-wired logic inside xgettext related to strings. If you were to undertake this project, many would thank you.

A more robust alternative is to use a JavaScript parser such as Esprima. Such parsers let you pick out specific constructs, including template strings. As you can see at http://esprima.org/demo/parse.html, the relevant node type to look for is TemplateLiteral.
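The benefit of working at the token level can be sketched without any dependency. The function below is not Esprima; it is a hand-rolled, quote-aware scanner standing in for a real parser, and `templatesToStrings` is a name invented for this example. It assumes there are no nested templates, no comments or regex literals containing quote characters, and no backslash-escaped quotes inside templates. A real parser would handle all of those.

```javascript
// Hypothetical helper: rewrite un-nested template literals as single-quoted
// strings while leaving ordinary string literals untouched.
function templatesToStrings(source) {
  let out = '';
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (ch === '"' || ch === "'" || ch === '`') {
      // Find the matching closing delimiter, skipping escaped characters.
      let j = i + 1;
      while (j < source.length && source[j] !== ch) {
        j += source[j] === '\\' ? 2 : 1;
      }
      const body = source.slice(i + 1, j);
      if (ch === '`') {
        // Escape single quotes and line breaks so the result is a valid string.
        out += "'" + body.replace(/'/g, "\\'").replace(/\r?\n/g, '\\n') + "'";
      } else {
        out += ch + body + ch; // ordinary string: copy verbatim
      }
      i = j + 1;
    } else {
      out += ch;
      i += 1;
    }
  }
  return out;
}

const input = 'const a = "in `backticks`"; const b = `Look, I am a ${adjective} string`;';
console.log(templatesToStrings(input));
// const a = "in `backticks`"; const b = 'Look, I am a ${adjective} string';
```

Because the scanner knows when it is inside an ordinary string, the backticks in the first literal survive, which is exactly what the plain search-and-replace cannot guarantee.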

Another (bad?) idea is to write template strings as regular strings to start with, then treat them as template strings at run time. We define a function eval_template and use it like this:

const template = _("Look, I am a ${adjective} string");
const something = eval_template(template, {adjective});

eval_template converts a string into an evaluated template. Any local variable used in the template string must be supplied to eval_template in the object passed as the second parameter: functions created with Function live in the global scope and cannot access local variables, so we have to pass them in. It is implemented as follows:

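function eval_template(s, params) {
  // Parameter names and their values, in matching order.
  var keys = Object.keys(params);
  var vals = keys.map(key => params[key]);

  // Build a function whose body evaluates s as a template literal.
  var f = Function(...keys, "return `" + s + "`");
  return f(...vals);
}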

Granted, this is a bit awkward. The only advantage of this approach is that it requires no pre-scan rewriting.
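To see the pieces working together, here is a small self-contained sketch. The `catalog` object and the `_` lookup are hypothetical stand-ins for a real gettext binding, and the Spanish string is only sample data; `eval_template` is the function described above. Bear in mind that the translated text is evaluated as code, so this is only reasonable when the message catalog is trusted.

```javascript
// Hypothetical stand-in for a gettext catalog: msgid -> translation.
const catalog = {
  'Look, I am a ${adjective} string': 'Mira, soy una cadena ${adjective}',
};

// Hypothetical stand-in for gettext's _(): fall back to the msgid itself.
const _ = (msgid) => catalog[msgid] || msgid;

function eval_template(s, params) {
  const keys = Object.keys(params);
  const vals = keys.map((key) => params[key]);
  // The function body evaluates s as a template literal.
  const f = Function(...keys, 'return `' + s + '`');
  return f(...vals);
}

const adjective = 'wonderful';

const something = eval_template(_('Look, I am a ${adjective} string'), { adjective });
console.log(something); // Mira, soy una cadena wonderful

const untranslated = eval_template(_('I am ${adjective}'), { adjective });
console.log(untranslated); // I am wonderful
```

The msgid passed to `_` is an ordinary quoted string, so a scanner that does not understand template strings can still extract it, and interpolation happens only after the translation has been looked up.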

Minor point, but if the original template string is multi-line, you cannot directly rewrite it as a regular string. In that case, you can leave it as a back-ticked template string but escape the $ as \$, and all will be well.
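A minimal sketch of the escaped multi-line form, assuming the same `eval_template` function as above and an identity `_` used here purely as a placeholder for the real gettext lookup:

```javascript
// Identity stand-in for gettext's _() (hypothetical; no catalog lookup).
const _ = (msgid) => msgid;

function eval_template(s, params) {
  const keys = Object.keys(params);
  const vals = keys.map((key) => params[key]);
  const f = Function(...keys, 'return `' + s + '`');
  return f(...vals);
}

// \${ keeps the placeholder literal, so nothing is interpolated here.
const template = _(`Look, I am a
\${adjective} string`);

console.log(template === 'Look, I am a\n${adjective} string'); // true

// Interpolation happens later, when the string is evaluated as a template.
const adjective = 'wonderful';
const result = eval_template(template, { adjective });
console.log(result === 'Look, I am a\nwonderful string'); // true
```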

Bottom line: unless you want to rewrite xgettext, use a parser, or engage in other hackery, do the brute-force replacement.
