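A Look at How Swift and OC Store Class Data, from the Mach-O Perspective
Introduction
Background
Dynamic invocation
Before getting into the main topic, let's look at an example that is not directly related to it.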
class MyClass {
var p:Int = 0
init() {
print("init")
}
func helloSwift() -> Int {
print("helloSwift")
return 100
}
func helloSwift1() -> Int {
print("helloSwift1")
return 100
}
func helloSwift2() -> Int {
print("helloSwift2")
return 100
}
}
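Can we call the methods of this class dynamically at runtime? If the class were written in OC, I believe most iOS developers would know how to do it.
Take the following code as an example:
/**
Suppose MyClass is implemented in OC
*/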
@interface MyClass : NSObject
@property(nonatomic,assign)int p;
@end
@implementation MyClass
- (instancetype)init{
if (self = [super init]) {
NSLog(@"init");
}
return self;
}
- (int)helloSwift{
NSLog(@"helloSwift");
return 100;
}
- (int)helloSwift1{
NSLog(@"helloSwift1");
return 100;
}
- (int)helloSwift2{
NSLog(@"helloSwift2");
return 100;
}
@end
/**
Through the runtime we can then obtain the IMP of any method
*/
Class class = NSClassFromString(@"MyClass");
unsigned int count = 0;
Method *list = class_copyMethodList(class,&count);
for (int i = 0; i < count; i++) {
Method method = list[i];
NSLog(@"- [%@ %@]",class,NSStringFromSelector(method_getName(method)));
}
NSLog(@"%@ count = %u",class,count);
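// Calling through the IMP directly makes the point more clearly.
// helloSwift never touches self, so passing nil is enough for this demo.
Method method = class_getInstanceMethod(class, @selector(helloSwift));
IMP imp = method_getImplementation(method);
int (*fn)(id, SEL) = (int (*)(id, SEL))imp;
fn(nil, @selector(helloSwift));
free(list); // class_copyMethodList returns a buffer the caller must free
The output is as follows: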
2020-11-19 17:16:08.885763+0800 SwiftToolDemo[45037:17709798] - [MyClass init]
2020-11-19 17:16:08.886219+0800 SwiftToolDemo[45037:17709798] - [MyClass helloSwift]
2020-11-19 17:16:08.886389+0800 SwiftToolDemo[45037:17709798] - [MyClass helloSwift1]
2020-11-19 17:16:08.886537+0800 SwiftToolDemo[45037:17709798] - [MyClass helloSwift2]
2020-11-19 17:16:08.886680+0800 SwiftToolDemo[45037:17709798] - [MyClass p]
2020-11-19 17:16:08.886823+0800 SwiftToolDemo[45037:17709798] - [MyClass setP:]
2020-11-19 17:16:08.886932+0800 SwiftToolDemo[45037:17709798] MyClass count = 6
2020-11-19 17:16:08.887166+0800 SwiftToolDemo[45037:17709798] helloSwift
Now run the same enumeration against the Swift version of MyClass:
Class class = NSClassFromString(@"SwiftDynamicRun.MyClass");
unsigned int count = 0;
Method *list = class_copyMethodList(class,&count);
for (int i = 0; i < count; i++) {
Method method = list[i];
NSLog(@"- [%@ %@]",class,NSStringFromSelector(method_getName(method)));
}
NSLog(@"%@ count = %u",class,count);
The output is as follows:
2020-11-11 16:08:30.714057+0800 SwiftDynamic[71869:13232511] SwiftDynamic.MyClass count = 0
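OC storage
Using the addresses stored in __objc_classlist, we can find the details of every class. This article uses the arm64 architecture as an example. Once we have located file offset 0x11820, the class information can be read by overlaying the following struct on the bytes at that offset.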
struct class64
{
unsigned long long isa;
unsigned long long superClass;
unsigned long long cache;
unsigned long long vtable;
unsigned long long data;
};
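Some readers may be confused by the conversion between addresses and file offsets. For example, if the 8 bytes hold 0x100011820, why do we look at file offset 0x11820? In a Mach-O file we first have to determine which segment 0x100011820 falls in; the Load Commands record each segment's starting virtual address, its size, and its file offset.
if (address >= segmentCommand.vmaddr && address < segmentCommand.vmaddr + segmentCommand.vmsize) {
return address - (segmentCommand.vmaddr - segmentCommand.fileoff);
}
To keep things easy to follow, in this article you can simply treat virtual address - 0x100000000 as the file offset.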
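The conversion above can be sketched as a small C helper. This is only a minimal sketch: the Segment struct is a hypothetical, trimmed-down stand-in for the fields of an LC_SEGMENT_64 load command, and file_offset_for_address is a name made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a 64-bit segment load command:
   only the three fields the conversion needs. */
typedef struct {
    uint64_t vmaddr;   /* virtual address where the segment starts */
    uint64_t vmsize;   /* size of the segment in virtual memory */
    uint64_t fileoff;  /* offset of the segment within the file */
} Segment;

/* Returns the file offset for `address`, or UINT64_MAX when no segment
   contains it. */
uint64_t file_offset_for_address(const Segment *segments, size_t count,
                                 uint64_t address) {
    assert(segments != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        const Segment *s = &segments[i];
        /* Written as a subtraction so vmaddr + vmsize cannot overflow. */
        if (address >= s->vmaddr && address - s->vmaddr < s->vmsize) {
            return address - s->vmaddr + s->fileoff;
        }
    }
    return UINT64_MAX;
}
```

With a segment that starts at virtual address 0x100008000 and file offset 0x8000, the address 0x100011820 maps to file offset 0x11820, which matches the shortcut of subtracting 0x100000000.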
So the isa field of class64 occupies the 8 bytes starting at 0x11820, and data is the fifth 8-byte field, starting at 0x11820 + 0x20 = 0x11840.
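As a rough illustration of reading those fields, the sketch below copies a class64 record out of a file image held in memory. The helper names read_class64 and class_info_offset are hypothetical. Masking the low three bits of data is an assumption based on the Objective-C runtime keeping flag bits there, and the code assumes a little-endian host reading a little-endian arm64 file.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct class64 {
    unsigned long long isa;
    unsigned long long superClass;
    unsigned long long cache;
    unsigned long long vtable;
    unsigned long long data;
};

/* Copies the class64 record found at `offset` in the file image.
   Returns 0 on success, -1 if the record would run past the buffer. */
int read_class64(const uint8_t *file, size_t size, uint64_t offset,
                 struct class64 *out) {
    assert(file != NULL && out != NULL);
    if (offset > size || size - offset < sizeof *out) {
        return -1;
    }
    memcpy(out, file + offset, sizeof *out);
    return 0;
}

/* File offset of the class info record, using the article's
   "virtual address - 0x100000000" shortcut.
   Assumption: the low 3 bits of `data` are flag bits, not address bits. */
uint64_t class_info_offset(const struct class64 *cls) {
    return (cls->data & ~7ULL) - 0x100000000ULL;
}
```

Copying with memcpy instead of casting the buffer pointer avoids unaligned reads when the offset is not a multiple of 8.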
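The data field of struct class64 above holds the address of a class64Info struct. From class64Info we can easily find the class name and the list of instance methods, and through the IMP of each entry in the method list we can find the starting address of every function.
The above is a brief demonstration of how OC class information is traversed, that is, how to find every method of every class and the address of its first instruction. Besides instance methods there are class methods, category methods, and so on. For the full process and code, see WBBlades, open-sourced by 58 (https://github.com/wuba/WBBlades); the code covers the details, so they are not repeated here.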
Swift
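Although Swift fully retains the struct class64 and struct class64Info data structures, MyClass does not store its method list in struct class64Info. This raises two questions:
Why do Swift classes keep the OC class structure?
Where are MyClass's methods stored?
Swift classes keep the OC class structure for compatibility with OC: some Swift classes inherit from OC classes and need to expose interfaces to OC, so they inevitably have to rely on OC's messaging mechanism.
So where are MyClass's methods stored? According to the summary in "Swift5.0的Runtime机制浅析" (https://www.jianshu.com/p/158574ab8809), some methods may be inlined during compiler optimization. Setting inlining aside for now, how do we find the method table of MyClass?
Besides remaining compatible with OC's storage layout, Swift has a layout of its own. MachOView shows that the Mach-O file contains many sections whose names start with __swift5 (Swift 5 is used as the example here).
Among these sections, __swift5_types stores the addresses of Class, Struct, and Enum descriptors. What each section stores is described in some detail in the article Swift metadata (https://knight.sc/reverse%20engineering/2019/07/17/swift-metadata.html).
If you open MachOView and inspect the binary data of __swift5_types, you will find it quite different from OC's storage. In OC, an address is usually stored directly as an 8-byte value. The entries in __swift5_types are 4 bytes instead of 8, and the stored values are clearly relative rather than direct addresses. So how do we get the address of MyClass? Add the value stored in the 4 bytes to the file offset of those 4 bytes.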
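The relative-address rule can be sketched in C as follows. The function name resolve_relative is hypothetical, the stored value is treated as a signed 32-bit integer so that it can point backwards as well as forwards, and the code assumes a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads the signed 32-bit value stored at `offset` and adds it to
   `offset` itself. Returns the resolved file offset, or -1 when the
   4 bytes do not fit inside the buffer. */
int64_t resolve_relative(const uint8_t *file, size_t size, uint64_t offset) {
    assert(file != NULL);
    if (offset > size || size - offset < sizeof(int32_t)) {
        return -1;
    }
    int32_t rel;
    memcpy(&rel, file + offset, sizeof rel);
    return (int64_t)offset + rel;
}
```

Applied to each 4-byte entry of __swift5_types, this yields the file offset of the corresponding type descriptor. A negative result should be treated as invalid.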
After the calculation, we find that the offset of MyClass falls in __TEXT,__const. We can interpret the data there with the structure compiled by Scott Knight (https://knight.sc/):
type ClassDescriptor struct {
Flags uint32
Parent int32
Name int32
AccessFunction int32
FieldDescriptor int32
SuperclassType int32
MetadataNegativeSizeInWords uint32
MetadataPositiveSizeInWords uint32
NumImmediateMembers uint32
NumFields uint32
}
or with the structure used by HandyJSON (https://github.com/alibaba/HandyJSON):
struct _ClassContextDescriptor: _ContextDescriptorProtocol {
var flags: Int32
var parent: Int32
var mangledNameOffset: Int32
var fieldTypesAccessor: Int32
var reflectionFieldDescriptor: Int32
var superClsRef: Int32
var metadataNegativeSizeInWords: Int32
var metadataPositiveSizeInWords: Int32
var numImmediateMembers: Int32
var numberOfFields: Int32
var fieldOffsetVector: Int32
}
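Neither structure contains a method table. So does ClassDescriptor hold more content beyond these fields? The only place to find the answer is the source code.
First, look at the layout method of ClassContextDescriptorBuilder. It seems to contain exactly what we are after: the VTable.
class ClassContextDescriptorBuilder {
// overrides addLayoutInfo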
void layout() {
super::layout();
addVTable();
addOverrideTable();
addObjCResilientClassStubInfo();
maybeAddCanonicalMetadataPrespecializations();
}
}
ClassContextDescriptorBuilder // overrides the parent's addLayoutInfo to add SuperclassType, MetadataNegativeSizeInWords, MetadataPositiveSizeInWords, NumImmediateMembers, NumFields, FieldOffsetVectorOffset, and then VTable, OverrideTable, etc.
^
|
TypeContextDescriptorBuilderBase // adds Name, AccessFunction, FieldDescriptor, NumFields, FieldOffsetVectorOffset
^
|
ContextDescriptorBuilderBase // adds Flags, Parent
class TypeContextDescriptorBuilderBase {
void layout() {
asImpl().computeIdentity();
super::layout();
asImpl().addName();
asImpl().addAccessFunction();
asImpl().addReflectionFieldDescriptor();
asImpl().addLayoutInfo();
asImpl().addGenericSignature();
asImpl().maybeAddResilientSuperclass();
asImpl().maybeAddMetadataInitialization();
}
void addLayoutInfo() {
auto properties = getType()->getStoredProperties();
// uint32_t NumFields;
B.addInt32(properties.size());
// uint32_t FieldOffsetVectorOffset;
B.addInt32(FieldVectorOffset / IGM.getPointerSize());
}
}
class ContextDescriptorBuilderBase {
void layout() {
asImpl().addFlags();
asImpl().addParent();
}
}
void addVTable() {
...
B.addInt32(VTableEntries.size());
for (auto fn : VTableEntries)
emitMethodDescriptor(fn);
}
The addVTable function shows that, before the methods are stored one after another, 4 bytes are used to store the size of the method table.
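A reader for that layout might look like the sketch below. It starts at the 4-byte count and is not the demo's actual code. The MethodEntry shape (4 bytes of flags followed by a 4-byte relative offset to the implementation) is an assumption about what emitMethodDescriptor writes, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed on-disk shape of one method descriptor. */
typedef struct {
    uint32_t flags;  /* method flags */
    int32_t  impl;   /* offset to the first instruction, relative to this field */
} MethodEntry;

/* Reads the 4-byte entry count stored at `offset`. Returns the count,
   or -1 when the count or the entries would not fit in the buffer. */
int64_t vtable_count(const uint8_t *file, size_t size, uint64_t offset) {
    assert(file != NULL);
    if (offset > size || size - offset < sizeof(uint32_t)) {
        return -1;
    }
    uint32_t count;
    memcpy(&count, file + offset, sizeof count);
    if ((size - offset - sizeof count) / sizeof(MethodEntry) < count) {
        return -1;
    }
    return count;
}

/* Resolves the file offset of the implementation of entry `index`,
   or returns -1 when the index is out of range. */
int64_t vtable_impl_offset(const uint8_t *file, size_t size,
                           uint64_t offset, uint32_t index) {
    int64_t count = vtable_count(file, size, offset);
    if (count < 0 || index >= count) {
        return -1;
    }
    uint64_t entry_off = offset + sizeof(uint32_t)
                       + (uint64_t)index * sizeof(MethodEntry);
    MethodEntry entry;
    memcpy(&entry, file + entry_off, sizeof entry);
    /* The relative value is based at the impl field, not the entry start. */
    return (int64_t)(entry_off + offsetof(MethodEntry, impl)) + entry.impl;
}
```

Checking the count against the remaining buffer size before touching any entry keeps a corrupted or misparsed count from causing out-of-bounds reads.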
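Judging from the code above, a VTable is not always present. How can we tell whether one exists? Parsing a descriptor that has no VTable as if it had one would produce garbage.
As is customary in Mach-O, fields such as Kind and Flags usually act as markers: one or a few bytes tell us what kind of content follows.
After sorting it out, the layout of Flags is as follows: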
-------------------------------------------------------------------------------------------------
| TypeFlag(16bit) | version(8bit) | generic(1bit) | unique(1bit) | unknown (1bit) | Kind(5bit) |
-------------------------------------------------------------------------------------------------
Let's first look at two enums:
// Kinds of context descriptor.
enum class ContextDescriptorKind : uint8_t {
/// This context descriptor represents a module.
Module = 0,
/// This context descriptor represents an extension.
Extension = 1,
/// This context descriptor represents an anonymous possibly-generic context
/// such as a function body.
Anonymous = 2,
/// This context descriptor represents a protocol context.
Protocol = 3,
/// This context descriptor represents an opaque type alias.
OpaqueType = 4,
/// First kind that represents a type of any sort.
Type_First = 16,
/// This context descriptor represents a class.
Class = Type_First,
/// This context descriptor represents a struct.
Struct = Type_First + 1,
/// This context descriptor represents an enum.
Enum = Type_First + 2,
/// Last kind that represents a type of any sort.
Type_Last = 31,
};
/// Flags for nominal type context descriptors. These values are used as the
/// kindSpecificFlags of the ContextDescriptorFlags for the type.
class TypeContextDescriptorFlags : public FlagSet<uint16_t> {
enum {
// All of these values are bit offsets or widths.
// Generic flags build upwards from 0.
// Type-specific flags build downwards from 15.
/// Whether there's something unusual about how the metadata is
/// initialized.
///
/// Meaningful for all type-descriptor kinds.
MetadataInitialization = 0,
MetadataInitialization_width = 2,
/// Set if the type has extended import information.
///
/// If true, a sequence of strings follow the null terminator in the
/// descriptor, terminated by an empty string (i.e. by two null
/// terminators in a row). See TypeImportInfo for the details of
/// these strings and the order in which they appear.
///
/// Meaningful for all type-descriptor kinds.
HasImportInfo = 2,
/// Set if the type descriptor has a pointer to a list of canonical
/// prespecializations.
HasCanonicalMetadataPrespecializations = 3,
// Type-specific flags:
/// The kind of reference that this class makes to its resilient superclass
/// descriptor. A TypeReferenceKind.
///
/// Only meaningful for class descriptors.
Class_ResilientSuperclassReferenceKind = 9,
Class_ResilientSuperclassReferenceKind_width = 3,
/// Whether the immediate class members in this metadata are allocated
/// at negative offsets. For now, we don't use this.
Class_AreImmediateMembersNegative = 12,
/// Set if the context descriptor is for a class with resilient ancestry.
///
/// Only meaningful for class descriptors.
Class_HasResilientSuperclass = 13,
/// Set if the context descriptor includes metadata for dynamically
/// installing method overrides at metadata instantiation time.
Class_HasOverrideTable = 14,
/// Set if the context descriptor includes metadata for dynamically
/// constructing a class's vtables at metadata instantiation time.
///
/// Only meaningful for class descriptors.
Class_HasVTable = 15,
};
The low 5 bits identify the kind being described: Class, Struct, Enum, Protocol, and so on. The high 16 bits indicate whether Class_HasVTable, Class_HasOverrideTable, Class_HasResilientSuperclass, and so on are set.
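Decoding the 32-bit Flags value according to the bit layout and the two enums above could look like this in C. It is a sketch with hypothetical helper names; the bit positions come from the layout diagram (Kind in bits 0-4, unique in bit 6, generic in bit 7, version in bits 8-15, type flags in bits 16-31).

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    KIND_CLASS  = 16,  /* ContextDescriptorKind::Class  */
    KIND_STRUCT = 17,  /* ContextDescriptorKind::Struct */
    KIND_ENUM   = 18   /* ContextDescriptorKind::Enum   */
};

/* Low 5 bits: what kind of context this descriptor represents. */
uint32_t descriptor_kind(uint32_t flags) { return flags & 0x1F; }

bool descriptor_is_unique(uint32_t flags)  { return (flags >> 6) & 1; }
bool descriptor_is_generic(uint32_t flags) { return (flags >> 7) & 1; }
uint32_t descriptor_version(uint32_t flags) { return (flags >> 8) & 0xFF; }

/* High 16 bits: the kind-specific TypeContextDescriptorFlags. */
uint16_t descriptor_type_flags(uint32_t flags) {
    return (uint16_t)(flags >> 16);
}

/* Class_HasVTable is bit 15 of the type flags, only meaningful for classes. */
bool class_has_vtable(uint32_t flags) {
    return descriptor_kind(flags) == KIND_CLASS
        && ((descriptor_type_flags(flags) >> 15) & 1);
}

/* Class_HasOverrideTable is bit 14 of the type flags. */
bool class_has_override_table(uint32_t flags) {
    return descriptor_kind(flags) == KIND_CLASS
        && ((descriptor_type_flags(flags) >> 14) & 1);
}
```

For example, a Flags value of 0x80000050 decodes as a unique, non-generic class descriptor with a VTable and no OverrideTable.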
How to implement dynamic invocation
The flags of each method are laid out as follows, for those interested:
/**
------------------------------------------------------------------------------------
| ExtraDiscriminator(16bit) | .. | isDynamic(1bit) | isInstance(1bit) | Kind(4bit) |
------------------------------------------------------------------------------------
enum class Kind {
Method,
Init,
Getter,
Setter,
ModifyCoroutine,
ReadCoroutine,
};
*/
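The method flags in the comment above can be unpacked with a few shifts and masks. The sketch below uses hypothetical names and assumes the Kind enum values are numbered from 0 in the order listed.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    METHOD_KIND_METHOD = 0,
    METHOD_KIND_INIT,
    METHOD_KIND_GETTER,
    METHOD_KIND_SETTER,
    METHOD_KIND_MODIFY_COROUTINE,
    METHOD_KIND_READ_COROUTINE
} MethodKind;

/* Low 4 bits: the kind of method. */
MethodKind method_kind(uint32_t flags) { return (MethodKind)(flags & 0x0F); }

/* Bit 4: set for instance methods. */
bool method_is_instance(uint32_t flags) { return (flags >> 4) & 1; }

/* Bit 5: set for dynamic methods. */
bool method_is_dynamic(uint32_t flags) { return (flags >> 5) & 1; }

/* High 16 bits: the extra discriminator. */
uint16_t method_extra_discriminator(uint32_t flags) {
    return (uint16_t)(flags >> 16);
}
```

Under these assumptions a flags value of 0x10 describes a plain instance method, and 0x12 an instance getter.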
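In addition, the OverrideTable is not implemented in the demo, but its structure and storage location are marked in comments in the code; if you are interested, you can parse it yourself.
// The OverrideTable is laid out as follows: the 4 bytes right after the VTable hold the number of OverrideTable entries, followed by an array of this struct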
struct SwiftOverrideMethod {
struct SwiftClassType *OverrideClass;
struct SwiftMethod *OverrideMethod;
struct SwiftMethod *Method;
};
Summary