C++ 中的抽象语法树表示

2022-01-23 00:00:00 g++ c++ c++11

我已经有了创建标记列表的标记器接口.我有解析器的工作机制.它真的很独特，就像一个魅力.我唯一想念的是 AST 的基本结构.树、节点和语句应如何在抽象级别表示.我不需要任何实现，只需快速了解它在类层次结构中的外观如何?我正在研究一种面向对象的语言.是的，我已经意识到我需要两种类型的陈述.一些返回值的表达式"类型语句和一个不返回的指令流控制类型语句.非常感谢.

I already have the tokenizer interface which creates a list of tokens. I have the working mechanism for the parser. It's really unique and works like a charm. The only thing that i miss is the basic structure of the AST. How the tree, the nodes and the statements should be represented on abstraction level. I don't need any implementation only a quick idea how should it look like in class hierarchy? I'm working on an object-oriented language. Yeah, i already realized that i will need two types of statements. Some value returning "expression" type statement and a non-returning, instruction flow controlling type statement. Thank you very much.

推荐答案

如果你的语言是命令式/c-like，常见的场景是从顶层结构被分成 2 个超类型开始:

If your language is imperative/c-like, the common scenario begins with top-hierarchy being split in 2 supertypes:

表达式
声明

程序是一个语句列表，它本身就是一个语句.

The program is a list of statement, which is a statement itself.

您可能希望拥有一个扩展语句基类的语句类型类.

You will probably want to have one class for type of statement that extends the statement base class.

一个典型的场景是这样的:

A typical scenario looks like:

语句块(语句列表)
ite (if then else)
for(一个 for 循环及其初始化语句列表、检查表达式、增量语句和块
while(类似，但只检查表达式
变量声明
赋值(包括 += -= ++ --，您可以使用运算符字段、lval 和 rval 将所有内容包装在一个类中)
函数调用(空一)

对于表达式:

Bop(二元运算，任何有 2 个操作数和 1 个运算符，即 + - */% | & && || == <
Uop(一元运算，任何有 1 个操作数和 1 个运算符，即 ~ ！)
函数调用(非空函数)
条件表达式(exp ? true val : false val)

拥有这 2 个抽象(表达式和语句)的好处是，在您的所有类中，您将拥有抽象类型，并且将能够使用访问者模式访问 AST.

The good thing of having this 2 abstractions (expressions and statements) is that inside all your classes, you will have abstract types, and will be able to visit the AST with the visitor pattern, for example.

例如，一些类看起来像这样(伪代码):

For example, some classes would look like this (pseudocode):

class Ite extends Statement { Expression condition; Statement ifBranch; Statement elseBranch; } class Bop extends Expression { BOperator operator; // +, -. * or whatever Expression left; // Left operand Expression right; // Right operand } class StatementBlock extends Statement { List<Statement> statements; } class Assignment extends Statement { AOperator assignOp; // = += -= etc. LVal lvalue; // The lvalue cannot be an arbitrary expression, you will usually have a specific type for it Expression rvalue; // Right value }

此外，您还需要某种方式来表示类型(对于 AST，仅静态类型就足够了，如果您的项目还需要实现一些后端，那么您还需要一些动态类型).

Also, you will need some way to represent types (for the AST, just static types are enough, if you project to implement some back-end as well, you will need some dynamic types too).

如果您不打算支持需要大小信息的固定大小数组，通常可以使用一些枚举指定静态类型.如果你想要固定大小的数组，大小，你可以实现一个类型的类，并让数组类型保存额外的大小信息.

Static types can usually be specified with some enumerations, if you don't plan to support fixed-size arrays which need a size information. If you want fixed-size arrays with, size, you can implement one class for type and have the array type hold additional size information.

enum Type { CHAR, SHORT, INT, LONG, FLOAT, DOUBLE, ARRAY } class Float extends StaticType { final Type type = Type.FLOAT; } class Array extends StaticArray { final Type type = Type.ARRAY; int size; }

然后，您将为 AST 中的每种类型实例化一个 StaticType 实例，例如当用户声明一个变量时.如果您计划在将来进行静态类型检查，您也可以使用相同的层次结构.

You will then instantiate one StaticType instance for every type in the AST, for example when the user declares a variable. You will be able to use the same hierarchy if you plan to do static type-checking in the future, also.

至于以 AST 形式运行/解释代码，您将需要一个 Memory 来保存包含有关运行时内存信息的堆栈/堆.此时，您需要将值与其类型信息一起存储.

As for running/interpreting the code in the AST form, you will need a Memory which will hold a stack/heap containing information about runtime memory. At that point you will need to store values together with their type information.

相关文章