Why does C++ not have reflection?

2021-12-26 00:00:00 reflection c++

This is a somewhat unusual question. My goals are to understand the language design decision and to identify what is possible for reflection in C++.

  1. Why did the C++ language committee not pursue implementing reflection in the language? Is reflection too difficult in a language that, unlike Java, does not run on a virtual machine?

  2. If one were to implement reflection for C++, what would the challenges be?

I guess the uses of reflection are well known: editors can be written more easily, program code gets smaller, mocks can be generated for unit tests, and so on. But it would be great if you could comment on the uses of reflection too.

Recommended answer

There are several problems with reflection in C++.

  • It's a lot of work to add, and the C++ committee is fairly conservative: it doesn't spend time on radical new features unless it's sure they'll pay off. (A module system similar to .NET assemblies has been proposed, and while I think there's general consensus that it'd be nice to have, it's not the committee's top priority at the moment and has been pushed back until well after C++0x. The motivation for that feature is to get rid of the #include system, but it would also enable at least some metadata.)

You don't pay for what you don't use. That's one of the most basic design philosophies underlying C++. Why should my code carry around metadata if I may never need it? Moreover, the presence of metadata may inhibit the compiler from optimizing, and that is a cost I would pay even if I never touch the metadata.

Which leads us to another big point: C++ makes very few guarantees about the compiled code. The compiler is allowed to do pretty much anything it likes, as long as the resulting functionality is what is expected. For example, your classes aren't required to actually be there. The compiler can optimize them away and inline everything they do, and it frequently does just that, because even simple template code tends to create quite a few template instantiations. The C++ standard library relies on this aggressive optimization. Functors are only performant if the overhead of creating and destroying the object can be optimized away. operator[] on a vector is only comparable to raw array indexing in performance because the entire operator can be inlined and thus removed entirely from the compiled code.

C# and Java make a lot of guarantees about the output of the compiler. If I define a class in C#, then that class will exist in the resulting assembly, even if I never use it, and even if all calls to its member functions could be inlined. The class has to be there so that reflection can find it. Part of this is alleviated by C# compiling to bytecode, which means that the JIT compiler can remove class definitions and inline functions if it likes, even if the initial C# compiler can't.

In C++, you only have one compiler, and it has to output efficient code. If you were allowed to inspect the metadata of a C++ executable, you'd expect to see every class it defined, which means that the compiler would have to preserve all the defined classes, even if they're not necessary.
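To make the functor and operator[] point concrete, here is a minimal sketch. The names AddSquare, sum_of_squares_functor and sum_of_squares_raw are hypothetical and exist only for this illustration. Both functions compute the same value. With optimizations enabled, mainstream compilers will typically inline the functor call and the iterator operations so that the first function compiles to roughly the same loop as the second, although the standard does not promise this.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// A small function object (functor). It has no data members, so creating
// and destroying it costs nothing once the compiler inlines operator().
struct AddSquare {
    int operator()(int acc, int x) const { return acc + x * x; }
};

// Sums the squares through a standard algorithm and the functor.
int sum_of_squares_functor(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0, AddSquare{});
}

// The same computation as a hand-written loop over a raw array.
int sum_of_squares_raw(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);  // precondition of this sketch
    int acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += data[i] * data[i];
    }
    return acc;
}
```

If the compiler does inline everything, no trace of AddSquare needs to remain in the executable, which is exactly the kind of type a runtime reflection system would have trouble reporting.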

And then there are templates. Templates in C++ are nothing like generics in other languages. Every template instantiation creates a new type. std::vector<int> is a completely separate class from std::vector<float>. That adds up to a lot of different types in an entire program. What should our reflection see? The template std::vector? But how can it, since that's a source-code construct which has no meaning at runtime? It'd have to see the separate classes std::vector<int> and std::vector<float>. And std::vector<int>::iterator and std::vector<float>::iterator, and the same for const_iterator and so on.

Once you step into template metaprogramming, you quickly end up instantiating hundreds of templates, all of which get inlined and removed again by the compiler. They have no meaning except as part of a compile-time metaprogram. Should all these hundreds of classes be visible to reflection? They'd have to be, because otherwise our reflection would be useless: it wouldn't even guarantee that the classes I defined will actually be there.

A side problem is that a template class doesn't exist until it is instantiated. Imagine a program which uses std::vector<int>. Should our reflection system be able to see std::vector<int>::iterator? On one hand, you'd certainly expect so. It's an important class, and it's defined in terms of std::vector<int>, which does exist in the metadata. On the other hand, if the program never actually uses this iterator type, it will never have been instantiated, and so the compiler won't have generated the class in the first place. And it's too late to create it at runtime, since that requires access to the source code.
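The two template points above can be checked with a short sketch. Wrapper, use_wrapper and distinct_at_runtime are hypothetical names introduced only for this example. The static_asserts show that each instantiation is its own type, and Wrapper<int>::size shows that a member of a class template is only instantiated when it is actually used: its body would not compile for int, yet the program builds because nothing calls it.

```cpp
#include <cstddef>
#include <type_traits>
#include <typeinfo>
#include <vector>

// Each instantiation is a distinct, unrelated type.
static_assert(!std::is_same<std::vector<int>, std::vector<float>>::value,
              "vector<int> and vector<float> are different classes");
static_assert(!std::is_same<std::vector<int>::iterator,
                            std::vector<float>::iterator>::value,
              "their iterator types differ as well");

// Hypothetical class template used only for illustration.
template <typename T>
struct Wrapper {
    T value;

    T get() const { return value; }

    // For T = int this body is ill-formed (int has no size() member),
    // but it is only instantiated if some code actually calls it.
    std::size_t size() const { return value.size(); }
};

// Instantiates Wrapper<int> and Wrapper<int>::get, never Wrapper<int>::size.
int use_wrapper() {
    Wrapper<int> w{42};
    return w.get();
}

// The standard RTTI facility also treats the instantiations as different types.
bool distinct_at_runtime() {
    return typeid(std::vector<int>) != typeid(std::vector<float>);
}
```

A reflection system asked to list the members of Wrapper<int> would have to decide what to say about size(): it appears in the source, but no code for it was ever generated.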

Responses to comments:

cdleary: Yes, debug symbols do something similar, in that they store metadata about the types used in the executable. But they also suffer from the problems I described. If you've ever tried debugging a release build, you'll know what I mean. There are large logical gaps where you created a class in the source code and it has been inlined away in the final code. If you were to use reflection for anything useful, you'd need it to be more reliable and consistent. As it is, types would be appearing and vanishing almost every time you compile. You change a tiny little detail, and in response the compiler decides to change which types get inlined and which ones don't. How do you extract anything useful from that, when you're not even guaranteed that the most relevant types will be represented in your metadata? The type you were looking for may have been there in the last build, but now it's gone. And tomorrow, someone will check in a small innocent change to a small innocent function, which makes the type just big enough that it won't get completely inlined, so it'll be back again. That's still useful for debug symbols, but not for much more than that. I'd hate trying to generate serialization code for a class under those terms.

Evan Teran: Of course these issues could be resolved. But that falls back to my point #1. It'd take a lot of work, and the C++ committee has plenty of things they feel are more important. Is the benefit of getting some limited reflection (and it would be limited) in C++ really big enough to justify focusing on that at the expense of other features? Is there really a huge benefit in adding features to the core language which can already (mostly) be done through libraries and preprocessors like Qt's? Perhaps, but the need is a lot less urgent than if such libraries didn't exist. For your specific suggestions though, I believe disallowing reflection on templates would make it completely useless. You'd be unable to use reflection on the standard library, for example. What kind of reflection wouldn't let you see a std::vector? Templates are a huge part of C++. A feature that doesn't work on templates is basically useless.
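As a rough illustration of what "done through libraries" can look like, here is a minimal sketch of opt-in metadata written by hand. Point, FieldInfo, kPointFields and describe are all hypothetical names, and this is not how Qt or any particular library does it. Qt generates its metadata with a separate tool (moc), whereas this sketch simply lists the fields manually and, to stay short, supports int members only.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical type we want to inspect and serialize.
struct Point {
    int x;
    int y;
};

// One entry of hand-written metadata: a field name plus a pointer to member.
template <typename Class>
struct FieldInfo {
    const char* name;
    int Class::* member;  // assumption of this sketch: int fields only
};

// Opt-in metadata table. Only types that need it get one, so code that
// never uses this kind of reflection pays nothing for it.
static const FieldInfo<Point> kPointFields[] = {
    {"x", &Point::x},
    {"y", &Point::y},
};

// Generic serializer that knows nothing about Point except its table.
template <typename Class, std::size_t N>
std::string describe(const Class& obj, const FieldInfo<Class> (&fields)[N]) {
    std::ostringstream out;
    out << "{";
    for (std::size_t i = 0; i < N; ++i) {
        if (i != 0) out << ", ";
        out << fields[i].name << "=" << obj.*(fields[i].member);
    }
    out << "}";
    return out.str();
}
```

Calling describe(Point{3, 4}, kPointFields) yields the string {x=3, y=4}. The table has to be kept in sync with the struct by hand, which is the maintenance cost that code generators and a language-level feature would aim to remove.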

But you're right, some form of reflection could be implemented. It would be a major change to the language, though. As it is now, types are exclusively a compile-time construct. They exist for the benefit of the compiler, and nothing else. Once the code has been compiled, there are no classes. If you stretch things, you could argue that functions still exist, but really, all there is is a bunch of jump instructions and a lot of stack pushes and pops. That leaves very little to build on when adding such metadata.

But like I said, there is a proposal for changes to the compilation model: adding self-contained modules that store metadata for select types, allowing other modules to reference them without having to mess with #includes. That's a good start, and to be honest, I'm surprised the standard committee didn't just throw the proposal out for being too big a change. So perhaps in 5-10 years? :)
