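Before getting into this article, I'll assume we are all comfortable with C pointers and have already built Apple's open-source objc4-756. For the lldb commands used here, see the post "Xcode调试LLDB" (Debugging with LLDB in Xcode). Two additions:
p/x prints the value of an expression in hexadecimal
x/4xg reads four 8-byte words in hexadecimal, starting at the given address (for an object, its first address)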
I. Viewing the C++ source
First we create a project and declare a FLYPerson class:

```objc
// FLYPerson.h
#import <Foundation/Foundation.h>

NS_ASSUME_NONNULL_BEGIN

@interface FLYPerson : NSObject {
    NSString *hobby;
}

@property (nonatomic, copy) NSString *nickName;

- (void)sayHello;
- (void)sayByeBye;
- (void)sayGoGo;
+ (void)sayHappy;

@end

NS_ASSUME_NONNULL_END
```

```objc
// FLYPerson.m
#import "FLYPerson.h"

@implementation FLYPerson

- (void)sayHello {
    NSLog(@"FLYPerson say : Hello!!!");
}

- (void)sayByeBye {
    NSLog(@"FLYPerson say : ByeBye!!!");
}

- (void)sayGoGo {
    NSLog(@"FLYPerson say : GoGo!!!");
}

+ (void)sayHappy {
    NSLog(@"FLYPerson say : Happy!!!");
}

@end
```
In main we create an object:

```objc
// main.m
#import <Foundation/Foundation.h>
#import <objc/runtime.h>
#import "FLYPerson.h"

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSLog(@"Hello, World!");

        FLYPerson *person = [[FLYPerson alloc] init];
        Class pClass = object_getClass(person);   // the class object we will inspect

        [person sayHello];
        [person sayByeBye];
        [person sayGoGo];

        NSLog(@"%@ - %p", person, pClass);
    }
    return 0;
}
```
Use clang to rewrite it into C++ source (cd into the file's directory first):
clang -rewrite-objc main.m -o main.cpp
When the file depends on UIKit or other frameworks:
clang -rewrite-objc -fobjc-arc -fobjc-runtime=ios-13.0.0 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator13.0.sdk main.m
The same thing through xcrun.
Simulator: xcrun -sdk iphonesimulator clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp
Device: xcrun -sdk iphoneos clang -arch arm64 -rewrite-objc main.m -o main-arm64.cpp
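II. Finding the FLYPerson class in the generated .cpp source
Since what we are exploring is the class, i.e. Class, we look for it in the cpp file:

```cpp
typedef struct objc_class *Class;
```

So Class is simply a pointer to the objc_class struct. Searching further turns up the declaration of objc_class:

```cpp
struct objc_class {
    Class _Nonnull isa __attribute__((deprecated));
} __attribute__((unavailable));
```

It is marked deprecated and unavailable. Is that a dead end? Remember that we already have the 756 source ready. Searching it for objc_class turns up several important pieces of information, and they confirm that Class really is a pointer to objc_class.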
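III. Analysing the objc_class source
1. The main data structures
Start with objc_class itself, declared in objc-runtime-new.h as struct objc_class : objc_object:

```cpp
struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;
    class_data_bits_t bits;
    ...
}
```

Now the data structure of each member. The isa is hidden here (note the commented-out // Class ISA;) because objc_class inherits it from objc_object:

```cpp
struct objc_object {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;
};
```

Note: isa is declared with type Class here, i.e. as a pointer to another objc_class. An instance's isa leads to its class, and a class's isa leads to its metaclass. Inside the runtime the field is stored as an isa_t union that packs extra bits around the class pointer, which is why the raw value has to be masked before it can be used as an address later on.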
cache_t (caches sel/imp pairs; it grows according to a 3/4 occupancy rule):

```cpp
struct cache_t {
    struct bucket_t *_buckets;   // hash table of cached entries
    mask_t _mask;                // capacity - 1
    mask_t _occupied;            // number of slots in use
    ...
}
```

class_data_bits_t (it has a single member, bits; the actual contents have to be reached through data(), i.e. class_rw_t *data()):

```cpp
struct class_data_bits_t {
    uintptr_t bits;
private:
    ...
public:
    // Strip the flag bits and return the class_rw_t pointer.
    class_rw_t* data() {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }

    // Store a new class_rw_t pointer while keeping the flag bits.
    void setData(class_rw_t *newData)
    {
        assert(!data()  ||  (newData->flags & (RW_REALIZING | RW_FUTURE)));
        uintptr_t newBits = (bits & ~FAST_DATA_MASK) | (uintptr_t)newData;
        atomic_thread_fence(memory_order_release);
        bits = newBits;
    }
    ...
}
```
The class_rw_t data structure:

```cpp
struct class_rw_t {
    uint32_t flags;
    uint32_t version;

    const class_ro_t *ro;

    method_array_t methods;
    property_array_t properties;
    protocol_array_t protocols;

    Class firstSubclass;
    Class nextSiblingClass;

    char *demangledName;

#if SUPPORT_INDEXED_ISA
    uint32_t index;
#endif

    void setFlags(uint32_t set)
    {
        OSAtomicOr32Barrier(set, &flags);
    }

    void clearFlags(uint32_t clear)
    {
        OSAtomicXor32Barrier(clear, &flags);
    }

    // set and clear must not overlap
    void changeFlags(uint32_t set, uint32_t clear)
    {
        assert((set & clear) == 0);

        uint32_t oldf, newf;
        do {
            oldf = flags;
            newf = (oldf | set) & ~clear;
        } while (!OSAtomicCompareAndSwap32Barrier(oldf, newf, (volatile int32_t *)&flags));
    }
};
```

The class_ro_t data structure:

```cpp
struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;

    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;

    method_list_t *baseMethods() const {
        return baseMethodList;
    }
};
```
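2. Computing the size in bytes of each member of objc_class
Pulling out the members of objc_class (64-bit build):
Class ISA; // 8 bytes
Class superclass; // 8 bytes
cache_t cache; // 16 bytes: an 8-byte pointer plus two 4-byte mask_t values
class_data_bits_t bits; // class_rw_t * plus custom rr/alloc flags; it therefore starts at offset 8 + 8 + 16 = 32 (0x20)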
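The byte counts above can be double-checked with a small stand-alone model. This is only a sketch: the `mock_*` types and the `*_offset` helpers are hypothetical stand-ins, not the runtime's own declarations, and a 64-bit build (8-byte pointers, 32-bit `mask_t`) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the runtime types: only field order and sizes matter.
typedef uint32_t mask_t;            // assumption: 64-bit build, mask_t is 32 bits wide

struct mock_object { void *isa; };  // the inherited isa comes first

struct mock_cache {
    void   *buckets;                // stands in for bucket_t *
    mask_t  mask;
    mask_t  occupied;
};

struct mock_bits { uintptr_t bits; };

struct mock_class : mock_object {
    void       *superclass;
    mock_cache  cache;
    mock_bits   bits;
};

// Byte distance from the start of the class to one of its members.
static size_t offset_in(const mock_class &c, const void *member) {
    return (size_t)((const char *)member - (const char *)&c);
}

size_t superclass_offset() { mock_class c{}; return offset_in(c, &c.superclass); }
size_t cache_offset()      { mock_class c{}; return offset_in(c, &c.cache); }
size_t bits_offset()       { mock_class c{}; return offset_in(c, &c.bits); }
```

On a typical 64-bit compiler the three helpers give 8, 16 and 32, which are the offsets used in the lldb session that follows.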
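3. Getting the class's ivars, properties and methods
Back in our main function, use lldb to dump the memory of the FLYPerson class object (pClass):

```
(lldb) x/4gx pClass
0x1000021f8: 0x001d8001000021d1 0x0000000100b37140
0x100002208: 0x00000001003da290 0x0000000000000000
```

Offsetting 0x1000021f8 by 8 bytes gives 0x100002200, the address of the superclass member. po treats that address as an object whose first word (0x0000000100b37140) is its isa, so it reports NSObject, the superclass:

```
(lldb) po 0x100002200
<NSObject: 0x100002200>
```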
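With the same address arithmetic, 0x100002208 is exactly 16 bytes past 0x1000021f8 and is the first address of cache in objc_class:

```
(lldb) po 0x100002208
4294976008
```

Printing the address of cache only gives a number, and that number is the address itself in decimal (4294976008 = 0x100002208): cache is a struct, not an Objective-C object, so po has nothing to describe. We set it aside for now and come back to it later. Since cache is 16 bytes long, moving another 16 bytes from its first address takes us to the first address of bits, 0x100002218:

```
(lldb) po 0x100002218
objc[14430]: Attempt to use unknown class 0x101e09dd0.
4294976024
(lldb) p 0x100002218
(long) $5 = 4294976024
```

Neither form gives anything useful (po even reports an error), so we cast the address instead. From here on we are no longer dealing with OC objects, so everything below is printed with p:

```
(lldb) p (class_data_bits_t *)0x100002218
(class_data_bits_t *) $7 = 0x0000000100002218
```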
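The address arithmetic in this session is easy to get wrong by hand, so here is a minimal sketch that reproduces it. The struct and function names are made up for illustration, and the offsets 8, 16 and 32 assume the 64-bit layout worked out in section 2.

```cpp
#include <cassert>
#include <cstdint>

// Addresses of the four members of a class object (hypothetical helper type).
struct class_field_addresses {
    uint64_t isa;
    uint64_t superclass;
    uint64_t cache;
    uint64_t bits;
};

// Assumption: 64-bit layout, isa(8) + superclass(8) + cache(16) precede bits.
class_field_addresses field_addresses(uint64_t class_address) {
    assert(class_address % 8 == 0);   // class objects are pointer-aligned
    return { class_address,
             class_address + 8,
             class_address + 16,
             class_address + 32 };
}
```

For pClass at 0x1000021f8 this yields 0x100002200, 0x100002208 and 0x100002218, the three addresses inspected above.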
Now we use the method in class_data_bits_t. It extracts the data with a bit mask; FAST_DATA_MASK is that mask:

```cpp
class_rw_t* data() {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
```
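To make the masking concrete, the sketch below packs an address together with a few low flag bits and then recovers it, in the spirit of setData() and data(). The function names are hypothetical, the flag value 0x5 is arbitrary, and the mask constant is the 64-bit FAST_DATA_MASK value as I understand it for objc4-756; treat it as an assumption for other runtime versions.

```cpp
#include <cassert>
#include <cstdint>

// Assumed value of FAST_DATA_MASK on 64-bit builds: it keeps bits 3..46.
static const uint64_t kDataMask = 0x00007ffffffffff8ULL;

// Keep the existing flag bits and store a new address, as setData() does.
uint64_t pack_bits(uint64_t old_bits, uint64_t data_address) {
    assert((data_address & ~kDataMask) == 0);   // the address must fit inside the mask
    return (old_bits & ~kDataMask) | data_address;
}

// Drop the flag bits and return the address, as data() does.
uint64_t unpack_data(uint64_t bits) {
    return bits & kDataMask;
}
```

Because heap allocations are at least 8-byte aligned, the three low bits of the address are always zero, which is what leaves room for flags in the same word.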
Carry on:

```
(lldb) p $7->data()
(class_rw_t *) $9 = 0x0000000101e09dd0
(lldb) p *$9
(class_rw_t) $10 = {
  flags = 2148139008
  version = 0
  ro = 0x0000000100002170
  methods = {
    list_array_tt<method_t, method_list_t> = {
       = {
        list = 0x00000001000020a8
        arrayAndFlag = 4294975656
      }
    }
  }
  properties = {
    list_array_tt<property_t, property_list_t> = {
       = {
        list = 0x0000000100002158
        arrayAndFlag = 4294975832
      }
    }
  }
  protocols = {
    list_array_tt<unsigned long, protocol_list_t> = {
       = {
        list = 0x0000000000000000
        arrayAndFlag = 0
      }
    }
  }
  firstSubclass = nil
  nextSiblingClass = NSUUID
  demangledName = 0x0000000000000000
}
```

And on into ro:

```
(lldb) p $10.ro
(const class_ro_t *) $11 = 0x0000000100002170
(lldb) p *$11
(const class_ro_t) $12 = {
  flags = 388
  instanceStart = 8
  instanceSize = 24
  reserved = 0
  ivarLayout = 0x0000000100000f45 "\x02"
  name = 0x0000000100000f3b "FLYPerson"
  baseMethodList = 0x00000001000020a8
  baseProtocols = 0x0000000000000000
  ivars = 0x0000000100002110
  weakIvarLayout = 0x0000000000000000
  baseProperties = 0x0000000100002158
}
```
Properties:

```
(lldb) p $12.baseProperties
(property_list_t *const) $13 = 0x0000000100002158
(lldb) p *$13
(property_list_t) $14 = {
  entsize_list_tt<property_t, property_list_t, 0> = {
    entsizeAndFlags = 16
    count = 1
    first = (name = "nickName", attributes = "T@\"NSString\",C,N,V_nickName")
  }
}
```

Ivars:

```
(lldb) p $12.ivars
(const ivar_list_t *const) $15 = 0x0000000100002110
(lldb) p *$15
(const ivar_list_t) $16 = {
  entsize_list_tt<ivar_t, ivar_list_t, 0> = {
    entsizeAndFlags = 32
    count = 2
    first = {
      offset = 0x00000001000021c0
      name = 0x0000000100000f7d "hobby"
      type = 0x0000000100000fa8 "@\"NSString\""
      alignment_raw = 3
      size = 8
    }
  }
}
```
Method list:

```
(lldb) p $12.baseMethodList
(method_list_t *const) $17 = 0x00000001000020a8
(lldb) p *$17
(method_list_t) $18 = {
  entsize_list_tt<method_t, method_list_t, 3> = {
    entsizeAndFlags = 26
    count = 4
    first = {
      name = "sayHello"
      types = 0x0000000100000f8d "v16@0:8"
      imp = 0x0000000100000c90 (FLYTest`-[FLYPerson sayHello] at FLYPerson.m:12)
    }
  }
}
```

count = 4 tells us there are 4 methods. Let's print them one by one (get is a member function of the struct; see the struct definition for details):

```
(lldb) p $18.get(0)
(method_t) $19 = {
  name = "sayHello"
  types = 0x0000000100000f8d "v16@0:8"
  imp = 0x0000000100000c90 (FLYTest`-[FLYPerson sayHello] at FLYPerson.m:12)
}
(lldb) p $18.get(1)
(method_t) $20 = {
  name = ".cxx_destruct"
  types = 0x0000000100000f8d "v16@0:8"
  imp = 0x0000000100000d60 (FLYTest`-[FLYPerson .cxx_destruct] at FLYPerson.m:10)
}
(lldb) p $18.get(2)
(method_t) $21 = {
  name = "setNickName:"
  types = 0x0000000100000f9d "v24@0:8@16"
  imp = 0x0000000100000d20 (FLYTest`-[FLYPerson setNickName:] at FLYPerson.h:16)
}
(lldb) p $18.get(3)
(method_t) $22 = {
  name = "nickName"
  types = 0x0000000100000f95 "@16@0:8"
  imp = 0x0000000100000cf0 (FLYTest`-[FLYPerson nickName] at FLYPerson.h:16)
}
```
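The get(i) calls work because every element of the list has the same size, which is stored (together with flag bits) in entsizeAndFlags. The sketch below shows how the element size and the address of element i can be derived from the values printed above. The helper names are hypothetical; the flag mask 3 is taken from the third template argument shown in the lldb output (`entsize_list_tt<method_t, method_list_t, 3>`), and the 8-byte list header (two uint32_t fields before `first`) is an assumption based on that same output.

```cpp
#include <cassert>
#include <cstdint>

// Size of one element: entsizeAndFlags with the flag bits cleared.
uint32_t entsize_of(uint32_t entsizeAndFlags, uint32_t flagMask) {
    return entsizeAndFlags & ~flagMask;
}

// The flag bits stored alongside the element size.
uint32_t flags_of(uint32_t entsizeAndFlags, uint32_t flagMask) {
    return entsizeAndFlags & flagMask;
}

// Address of element i, counted from the address of the first element.
uint64_t element_address(uint64_t first_address, uint32_t entsizeAndFlags,
                         uint32_t flagMask, uint32_t i) {
    return first_address + (uint64_t)i * entsize_of(entsizeAndFlags, flagMask);
}
```

For the method list, entsizeAndFlags = 26 with mask 3 gives an element size of 24 bytes (three pointers: name, types, imp) and flags of 2. With the list at 0x1000020a8, the first element sits 8 bytes later at 0x1000020b0, so element 1 is at 0x1000020c8.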
None of the above is a class method, because class methods are stored in the metaclass. So let's follow the class's ISA to get its metaclass:

```
(lldb) x/4xg pClass
0x1000021f8: 0x001d8001000021d1 0x0000000100b37140
0x100002208: 0x00000001003da290 0x0000000000000000
(lldb) po 0x001d8001000021d1 & 0x00007ffffffffff8
FLYPerson

(lldb) p/x 0x001d8001000021d1 & 0x00007ffffffffff8
(long) $3 = 0x00000001000021d0
(lldb) x/4gx 0x00000001000021d0
0x1000021d0: 0x001d800100b370f1 0x0000000100b370f0
0x1000021e0: 0x000000010112bcb0 0x0000000100000003
(lldb) p (class_data_bits_t *)0x1000021f0
(class_data_bits_t *) $5 = 0x00000001000021f0
(lldb) p $5->data()
(class_rw_t *) $7 = 0x0000000101111ac0
(lldb) p $7->ro
(const class_ro_t *) $8 = 0x0000000100002060
(lldb) p *$8
(const class_ro_t) $9 = {
  flags = 389
  instanceStart = 40
  instanceSize = 40
  reserved = 0
  ivarLayout = 0x0000000000000000
  name = 0x0000000100000f3b "FLYPerson"
  baseMethodList = 0x0000000100002040
  baseProtocols = 0x0000000000000000
  ivars = 0x0000000000000000
  weakIvarLayout = 0x0000000000000000
  baseProperties = 0x0000000000000000
}
(lldb) p $9.baseMethodList
(method_list_t *const) $11 = 0x0000000100002040
(lldb) p *$11
(method_list_t) $12 = {
  entsize_list_tt<method_t, method_list_t, 3> = {
    entsizeAndFlags = 26
    count = 1
    first = {
      name = "sayHappy"
      types = 0x0000000100000f8d "v16@0:8"
      imp = 0x0000000100000cc0 (FLYTest`+[FLYPerson sayHappy] at FLYPerson.m:16)
    }
  }
}
```

After this round of steps we arrive at the same kind of structure as before, this time in the metaclass, and its method list holds the class method sayHappy. (Where 0x00007ffffffffff8 comes from will be covered in another article.)
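The isa hops in this session can be replayed on paper. The sketch below follows isa words through a small snapshot of memory, using only the two raw values printed above and the same mask the session used. The function names and the snapshot map are hypothetical, and no claim is made here about why the mask has this value.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// The mask applied to raw isa words in the lldb session above.
static const uint64_t kIsaMask = 0x00007ffffffffff8ULL;

uint64_t class_from_isa(uint64_t raw_isa) {
    return raw_isa & kIsaMask;
}

// Follow isa words through a snapshot (address -> raw isa word) until an
// address is not in the snapshot. The length guard stops self-referencing chains.
std::vector<uint64_t> isa_chain(uint64_t start,
                                const std::map<uint64_t, uint64_t> &memory) {
    std::vector<uint64_t> chain;
    uint64_t address = start;
    while (chain.size() < 8) {
        chain.push_back(address);
        std::map<uint64_t, uint64_t>::const_iterator it = memory.find(address);
        if (it == memory.end()) break;
        address = class_from_isa(it->second);
    }
    return chain;
}
```

Starting from pClass (0x1000021f8) the chain is 0x1000021f8, then the metaclass 0x1000021d0, then 0x100b370f0, which is also the value in the metaclass's superclass slot.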
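Summary: to get at ivars, properties and methods, the path is objc_class -> bits -> class_rw_t *data() -> class_ro_t *ro -> the corresponding list. One thing to note is that class_rw_t also contains
method_array_t methods;
property_array_t properties;
protocol_array_t protocols;
I have not looked into what these three members are for yet and will add that when I have time. (For reference: in objc4-756 they are the lists the runtime itself consults. They start from the base lists in class_ro_t and also receive whatever categories attach at runtime, which fits the output above, where methods.list and ro->baseMethodList are the same address.)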
4. Exploring how cache works
Here is the complete cache_t struct:

```cpp
struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;

public:
    struct bucket_t *buckets();
    mask_t mask();
    mask_t occupied();
    void incrementOccupied();
    void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
    void initializeToEmpty();

    mask_t capacity();
    bool isConstantEmptyCache();
    bool canBeFreed();

    static size_t bytesForCapacity(uint32_t cap);
    static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap);

    void expand();
    void reallocate(mask_t oldCapacity, mask_t newCapacity);
    struct bucket_t * find(cache_key_t key, id receiver);

    static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn));
};
```

The structure of bucket_t:

```cpp
struct bucket_t {
private:
    // The field order differs by architecture.
#if __arm64__
    MethodCacheIMP _imp;
    cache_key_t _key;
#else
    cache_key_t _key;
    MethodCacheIMP _imp;
#endif

public:
    inline cache_key_t key() const { return _key; }
    inline IMP imp() const { return (IMP)_imp; }
    inline void setKey(cache_key_t newKey) { _key = newKey; }
    inline void setImp(IMP newImp) { _imp = newImp; }

    void set(cache_key_t newKey, IMP newImp);
};
```
So both the sel (as a key) and the imp are stored in bucket_t. Run the project with a breakpoint on [person sayGoGo] and inspect memory with lldb:

```
(lldb) x person
0x102100020: 4d 22 00 00 01 80 1d 00 00 00 00 00 00 00 00 00  M"..............
0x102100030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
(lldb) p (cache_t *)0x102100030
(cache_t *) $1 = 0x0000000102100030
(lldb) p *$1
(cache_t) $2 = {
  _buckets = 0x0000000000000000
  _mask = 0
  _occupied = 0
}
(lldb) x pClass
0x100002248: 21 22 00 00 01 80 1d 00 40 71 b3 00 01 00 00 00  !"......@q......
0x100002258: 80 01 10 02 01 00 00 00 03 00 00 00 03 00 00 00  ................
(lldb) p (cache_t *)0x100002258
(cache_t *) $4 = 0x0000000100002258
(lldb) p *$4
(cache_t) $5 = {
  _buckets = 0x0000000102100180
  _mask = 3
  _occupied = 3
}
(lldb) p $5._buckets
(bucket_t *) $6 = 0x0000000102100180
(lldb) p *$6
(bucket_t) $7 = {
  _key = 4294971196
  _imp = 0x0000000100000ba0 (FLYTest`-[FLYPerson sayHello] at FLYPerson.m:12)
}
```

We first tried to read a cache_t out of person and found nothing but zeros, then read the cache_t of its Class and found real values. The conclusion: an object's method cache lives in its class object, that is, in the class its isa points to. (An instance has no cache_t member at all; the memory after its isa holds its ivars, which are still nil here.) Now that we know methods are cached in the class, we can go straight to the caching strategy used there.
After a lot of experimenting we worked out roughly which steps an object goes through when a method is called. The important one for caching is cache_fill_nolock:

```cpp
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
    cacheUpdateLock.assertLocked();

    // Do not cache anything until the class has been initialized.
    if (!cls->isInitialized()) return;

    // The entry may already be cached; nothing to do then.
    if (cache_getImp(cls, sel)) return;

    cache_t *cache = getCache(cls);
    cache_key_t key = getKey(sel);

    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // Empty cache: allocate real buckets.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // At most 3/4 full after this insert: use the cache as it is.
    }
    else {
        // Too full: grow it.
        cache->expand();
    }

    // Find the slot for this key (or the first free one) and store the entry.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
}
```

```cpp
cache_t *getCache(Class cls)
{
    assert(cls);
    return &cls->cache;
}

cache_key_t getKey(SEL sel)
{
    assert(sel);
    return (cache_key_t)sel;
}
```

There is not much to say about these two functions: one returns the class's cache, the other returns the key for a sel. getKey simply casts the SEL, which is a pointer, to cache_key_t, an unsigned pointer-sized integer, because integers are easy to hash and quick to compare.
cache->capacity() method:
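```cpp
mask_t cache_t::capacity()
{
    return mask() ? mask()+1 : 0;
}
```

capacity() returns the current number of buckets: mask() + 1, or 0 when the mask is 0 (an empty cache). It does not grow anything by itself; the mask is always stored as capacity - 1.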
cache->reallocate method:

```cpp
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();

    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    // Install the new, empty buckets; the mask is capacity - 1.
    setBucketsAndMask(newBuckets, newCapacity - 1);

    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}
```

reallocate allocates a new bucket array and, when canBeFreed() allows it, hands the old memory over to be freed. Nothing is copied across, so everything that was cached before is discarded.
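cache->expand method:

```cpp
void cache_t::expand()
{
    cacheUpdateLock.assertLocked();

    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    // If the doubled capacity no longer fits in mask_t, keep the old capacity.
    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}
```

expand is the growth function. From the code: if the old capacity is 0, a new capacity of 4 is created; otherwise the capacity doubles (unless the doubled value would overflow mask_t, in which case the old capacity is kept). When the new buckets are allocated, an entry with key 1 is written into the last slot, and its value differs by device architecture. The mask is then set to the new capacity minus 1, i.e. 4 - 1 or 2 × old - 1.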
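The relationship between mask and capacity, and the growth sequence, can be written out as a minimal sketch. The function names here are made up, the initial size of 4 is the value stated above, and the mask_t overflow check is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

static const uint32_t kInitialCapacity = 4;   // the "new capacity of 4" described above

// Same rule as cache_t::capacity(): a mask of 0 means "no buckets yet".
uint32_t capacity_from_mask(uint32_t mask) {
    return mask ? mask + 1 : 0;
}

// Same rule as the first lines of expand(): start at 4, then double.
uint32_t expanded_capacity(uint32_t old_capacity) {
    return old_capacity ? old_capacity * 2 : kInitialCapacity;
}

// Mask stored after one expansion, given the mask stored before it.
uint32_t mask_after_expand(uint32_t old_mask) {
    uint32_t grown = expanded_capacity(capacity_from_mask(old_mask));
    assert(grown > 0);
    return grown - 1;
}
```

Starting from an empty cache, repeated expansion produces masks 3, 7, 15 and so on. Since capacities stay powers of two, `key & mask` always lands inside the table.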
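```cpp
bucket_t * cache_t::find(cache_key_t k, id receiver)
{
    assert(k != 0);

    bucket_t *b = buckets();
    mask_t m = mask();
    mask_t begin = cache_hash(k, m);   // starting slot comes from a hash of the key
    mask_t i = begin;
    do {
        // Stop at the matching entry or at the first empty slot.
        if (b[i].key() == 0  ||  b[i].key() == k) {
            return &b[i];
        }
    } while ((i = cache_next(i, m)) != begin);

    // Neither found: the cache is corrupt.
    Class cls = (Class)((uintptr_t)this - offsetof(objc_class, cache));
    cache_t::bad_cache(receiver, (SEL)k, cls);
}
```

find looks for the bucket that belongs to the current sel. If there is none, it returns an empty bucket; if there is no empty bucket either, it reports an error through bad_cache. The starting index comes from a hash of the key (cache_hash), so the scan does not start at 0, and cached methods are not stored in call order but wherever the hash and the probing place them.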
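The lookup described above is ordinary open addressing. Below is a minimal sketch on a plain vector of keys. It is not the runtime's code: `hash_slot` and `next_slot` are assumptions standing in for cache_hash and cache_next (here "key & mask" and "step forward by one, wrapping around"; the real step direction may differ by architecture), and -1 stands in for the bad_cache call.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef uint32_t  mask_t;
typedef uintptr_t cache_key_t;

// Assumed hash and probe functions (see the lead-in).
static mask_t hash_slot(cache_key_t key, mask_t mask) { return (mask_t)(key & mask); }
static mask_t next_slot(mask_t i, mask_t mask)        { return (i + 1) & mask; }

// Returns the slot that holds `key`, or the first empty slot (key 0) on its
// probe path, or -1 when the table is full and the key is absent.
int find_slot(const std::vector<cache_key_t> &keys, cache_key_t key) {
    assert(key != 0);
    assert(!keys.empty() && (keys.size() & (keys.size() - 1)) == 0);  // power of two
    mask_t mask  = (mask_t)keys.size() - 1;
    mask_t begin = hash_slot(key, mask);
    mask_t i     = begin;
    do {
        if (keys[i] == 0 || keys[i] == key) return (int)i;
    } while ((i = next_slot(i, mask)) != begin);
    return -1;
}
```

In a 4-slot table, key 0x1006 hashes to slot 2. If that slot is already taken by another key, a second key that also hashes to 2 (say 0x100a) ends up in slot 3, which is why storage order follows the hash rather than call order.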
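Summary: once we know what each function does, the cache mechanism is easy to see. On entry it first checks whether the cache is empty; if so, it allocates a cache with a capacity of 4. If the cache is not empty and the occupancy including the new entry is at most 3/4 of the capacity, the entry is cached directly. If it would exceed 3/4, the cache is expanded first; expansion throws away everything cached so far, and then the current method is cached.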
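The whole policy fits in a toy model. The class below is a sketch under stated assumptions, not the runtime's implementation: `MiniCache` and its members are hypothetical names, keys are plain integers standing in for selectors, imps are left out, and probing steps forward by one slot.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class MiniCache {
public:
    MiniCache() : occupied_(0) {}

    uint32_t capacity() const { return (uint32_t)slots_.size(); }
    uint32_t occupied() const { return occupied_; }

    bool contains(uint64_t key) const {
        if (slots_.empty()) return false;
        int i = find(key);
        return i >= 0 && slots_[i] == key;
    }

    // Follows the branch structure of cache_fill_nolock.
    void fill(uint64_t key) {
        assert(key != 0);
        if (contains(key)) return;                    // already cached
        uint32_t newOccupied = occupied_ + 1;
        if (slots_.empty()) {
            reallocate(4);                            // first use: 4 slots
        } else if (newOccupied <= capacity() / 4 * 3) {
            // at most 3/4 full after this insert: keep the table
        } else {
            reallocate(capacity() * 2);               // grow, dropping old entries
        }
        int i = find(key);
        assert(i >= 0);                               // a free slot always exists here
        if (slots_[i] == 0) ++occupied_;
        slots_[i] = key;
    }

private:
    std::vector<uint64_t> slots_;                     // 0 marks an empty slot
    uint32_t occupied_;

    void reallocate(uint32_t newCapacity) {
        slots_.assign(newCapacity, (uint64_t)0);      // nothing is copied over
        occupied_ = 0;
    }

    int find(uint64_t key) const {
        uint32_t mask = capacity() - 1;
        uint32_t begin = (uint32_t)(key & mask), i = begin;
        do {
            if (slots_[i] == 0 || slots_[i] == key) return (int)i;
        } while ((i = (i + 1) & mask) != begin);
        return -1;
    }
};
```

Filling keys 1, 2 and 3 leaves a 4-slot table with 3 entries, the same _mask = 3, _occupied = 3 state seen in the lldb session. Filling a fourth key doubles the table to 8 slots, after which only that fourth key is cached and the first three would have to be looked up and cached again.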
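IV. Finally
The above is just one way of approaching a class's data structures: reading memory by address, applying offsets, and analysing the structs. What matters most is the approach and the process. The result matters too, of course; after all, it gives us something to show off!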