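[Flowchart: object allocation path]
if (ac->avail): the local cache has objects available. Take the hottest object from the end of ac, objp = ac->entry[--ac->avail], and the allocation from the cache succeeds.
Otherwise cache_alloc_refill() refills the local cache. if (l3->shared): the shared local cache has objects available, so transfer_objects() moves objects from the shared local cache into the local cache. If at least one usable object was moved into ac, the allocation succeeds.
Otherwise, retry: search the three slab lists for an allocatable object. If the three lists contain a free object, it is used.
Otherwise cache_grow() obtains memory from the buddy system through kmem_getpages() and alloc_slabmgmt(), and then retry is executed again.

Now let's look at the implementation of kmem_cache_create():

/**
 * kmem_cache_create - Create a cache.
 * @name: A string which is used in /proc/slabinfo to identify this cache.
 * @size: The size of objects to be created in this cache.
 * @align: The required alignment for the objects.
 * @flags: SLAB flags
 * @ctor: A constructor for the objects.
 * @dtor: A destructor for the objects (not implemented anymore).
 *
 * Returns a ptr to the cache on success, NULL on failure.
 *	// returns the cache pointer on success, NULL on failure
 * Cannot be called within a int, but can be interrupted.
 *	// must not be called from interrupt context, but may itself be interrupted
 * The @ctor is run when new pages are allocated by the cache
 * and the @dtor is run before the pages are handed back.
 *
 * @name must be valid until the cache is destroyed. This implies that
 * the module calling this has to destroy the cache before getting unloaded.
 *
 * The flags are
 *	// fill and debug flags
 *
 * %SLAB_POISON - Poison the slab with a known test pattern (a5a5a5a5)
 * to catch references to uninitialised memory.
 *	// fills the uninitialised region with a5a5a5a5
 *
 * %SLAB_RED_ZONE - Insert `Red' zones around the allocated memory to check
 * for buffer overruns.
 *	// adds red guard zones to detect out-of-bounds accesses
 *
 * %SLAB_HWCACHE_ALIGN - Align the objects in this cache to a hardware
 * cacheline.  This can be beneficial if you're counting cycles as closely
 * as davem.
 *	// alignment to the hardware cache line
 */
// Create the cache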
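The @align parameter and the SLAB_HWCACHE_ALIGN flag described in the comment above come down to a little rounding arithmetic. The following is a minimal user-space sketch of that arithmetic, not the kernel code itself: the names round_up_pow2 and pick_align, the WORD_BYTES macro and the 64-byte CACHE_LINE value are all assumptions made for illustration.

```c
#include <stddef.h>

/* Assumed values for illustration only; the kernel uses BYTES_PER_WORD
 * and cache_line_size(), which depend on the architecture. */
#define WORD_BYTES  sizeof(void *)
#define CACHE_LINE  ((size_t)64)

/* Round size up to the next multiple of align.
 * align must be a power of two. */
size_t round_up_pow2(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* Choose an object alignment (hypothetical helper):
 * - start from the word size, or from the cache line if requested;
 * - for cache-line alignment, halve the alignment while the object
 *   still fits in half of it, so small objects can share a line;
 * - a larger caller-requested alignment always wins. */
size_t pick_align(size_t size, int hwcache_align, size_t caller_align)
{
	size_t ralign = WORD_BYTES;

	if (hwcache_align) {
		ralign = CACHE_LINE;
		while (size <= ralign / 2)
			ralign /= 2;
	}
	if (ralign < caller_align)
		ralign = caller_align;
	return ralign;
}
```

Under these assumptions, a 24-byte object with cache-line alignment requested gets a 32-byte alignment, because 24 fits in half of a 64-byte line but not in a quarter of it, so two such objects can share one line. An object of 100 bytes keeps the full 64-byte alignment.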
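/*
 * gfporder:    takes the values 0 to 11 in turn until the number of objects in the cache has been computed, then the loop exits; a slab consists of 2^gfporder pages
 * buffer_size: the size of an object in the current cache after alignment to cache_line_size
 * align:       cache_line_size, the alignment that is applied
 * flags:       0 here; used to distinguish an on-slab from an off-slab slab
 * left_over:   output value, records how much space is wasted in the slab
 * num:         output value, records the number of objects the current cache may hold
 */
struct kmem_cache *
kmem_cache_create (const char *name, size_t size, size_t align,
	unsigned long flags,
	void (*ctor)(void*, struct kmem_cache *, unsigned long),
	void (*dtor)(void*, struct kmem_cache *, unsigned long))
{
	size_t left_over, slab_size, ralign;
	struct kmem_cache *cachep = NULL, *pc;

	/*
	 * Sanity checks... these are all serious usage bugs.
	 */
	// Parameter checks: name must not be NULL, and this function must not
	// be called from interrupt context (it may sleep).
	// The size must not be smaller than 4 bytes (the CPU word size) and
	// must not exceed the maximum (2^25 in the version analysed here,
	// possibly 2^22 in others).
	if (!name || in_interrupt() || (size < BYTES_PER_WORD) ||
	    size > KMALLOC_MAX_SIZE || dtor) {
		printk(KERN_ERR "%s: Early error in slab %s\n", __FUNCTION__,
				name);
		BUG();
	}

	/*
	 * We use cache_chain_mutex to ensure a consistent view of
	 * cpu_online_map as well.  Please see cpuup_callback
	 */
	mutex_lock(&cache_chain_mutex);

#if 0	// I commented out the DEBUG part so that it does not get in the way
	// some checking machinery, not of interest here
	...
#endif

	/*
	 * Check that size is in terms of words.  This is needed to avoid
	 * unaligned accesses for some archs when redzoning is used, and makes
	 * sure any on-slab bufctl's are also correctly aligned.
	 */
	// When red zones are in use, this prevents unaligned accesses from
	// touching the red zone, and it also keeps any on-slab bufctl
	// correctly aligned.
	//
	// Why compute size and align again when kmem_cache_init() has already
	// computed them? Because this function creates a new cache and only
	// borrows cache_cache. kmem_cache_init() initialised the size, align
	// and related members of cache_cache itself, so the two are unrelated.
	//
	// First check whether the object size is word (32-bit) aligned, and
	// round it up if it is not.
	if (size & (BYTES_PER_WORD - 1)) {
		size += (BYTES_PER_WORD - 1);
		size &= ~(BYTES_PER_WORD - 1);
	}

	/* calculate the final buffer alignment: */

	/* 1) arch recommendation: can be overridden for debug */
	// Next check whether the object must be aligned to the cache line.
	if (flags & SLAB_HWCACHE_ALIGN) {
		/*
		 * Default alignment: as specified by the arch code.  Except if
		 * an object is really small, then squeeze multiple objects into
		 * one cacheline.
		 */
		ralign = cache_line_size();
		// Adjust the alignment: halve it until the object is larger
		// than half of the alignment.
		while (size <= ralign / 2)
			ralign /= 2;
	} else {
		// No cache-line alignment required: default to 4 bytes, i.e. 32 bits.
		ralign = BYTES_PER_WORD;
	}

	/*
	 * Redzoning and user store require word alignment or possibly larger.
	 * Note this will be overridden by architecture or caller mandated
	 * alignment if either is greater than BYTES_PER_WORD.
	 */
	// If DEBUG features are enabled, align as they require.
	if (flags & SLAB_STORE_USER)
		ralign = BYTES_PER_WORD;

	if (flags & SLAB_RED_ZONE) {
		ralign = REDZONE_ALIGN;
		/* If redzoning, ensure that the second redzone is suitably
		 * aligned, by adjusting the object size accordingly. */
		size += REDZONE_ALIGN - 1;
		size &= ~(REDZONE_ALIGN - 1);
	}

	/* 2) arch mandated alignment */
	if (ralign < ARCH_SLAB_MINALIGN) {
		ralign = ARCH_SLAB_MINALIGN;
	}
	/* 3) caller mandated alignment */
	if (ralign < align) {
		ralign = align;
	}
	/* disable debug if necessary */
	if (ralign > __alignof__(unsigned long long))
		flags &= ~(SLAB_RED_ZONE | SLAB_STORE_USER);
	/*
	 * 4) Store it.
	 */
	align = ralign;		// all the computation above yields the final align value

	/* Get cache's description obj. */
	// Allocate a new kmem_cache instance with the object size recorded in
	// cache_cache. Once kernel initialisation has finished, cache_cache is
	// effectively the cache for kmem_cache objects. So that kmalloc can be
	// used during kernel initialisation, cache_cache has to be used here.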
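	cachep = kmem_cache_zalloc(&cache_cache, GFP_KERNEL);	// this is where cache_cache gets used
	// The call returns a clean, zeroed block of memory.
	if (!cachep)
		goto oops;

#if DEBUG
#endif

	/*
	 * Determine if the slab management is 'on' or 'off' slab.
	 * (bootstrapping cannot cope with offslab caches so don't do
	 * it too early on.)
	 */
	// The first condition uses PAGE_SIZE to decide how the slab management
	// object is stored: on-slab or off-slab.
	// During initialisation the on-slab form is used (kmem_cache_init()
	// sets slab_early_init to 0 after creating two general caches).
	if ((size >= (PAGE_SIZE >> 3)) && !slab_early_init)
		/*
		 * Size is large, assume best to place the slab management obj
		 * off-slab (should allow better packing of objs).
		 */
		flags |= CFLGS_OFF_SLAB;

	size = ALIGN(size, align);
	// This step shows that the slab allocator first aligns the object to
	// the machine word size and then, on top of that, to the hardware
	// cache line. Every later alignment follows this overall value.

	// Compute the leftover (fragment) size, the number of pages (order)
	// per slab, and the number of objects in each slab.
	left_over = calculate_slab_order(cachep, size, align, flags);	// this time the calculation is not for cache_cache

	if (!cachep->num) {
		printk(KERN_ERR
		       "kmem_cache_create: couldn't create cache %s.\n", name);
		kmem_cache_free(&cache_cache, cachep);
		cachep = NULL;
		goto oops;
	}
	// Compute the size of the slab management object, which consists of
	// the struct slab object and the kmem_bufctl_t array.
	slab_size = ALIGN(cachep->num * sizeof(kmem_bufctl_t)
			  + sizeof(struct slab), align);

	/*
	 * If the slab has been placed off-slab, and we have enough space then
	 * move it on-slab. This is at the expense of any extra colouring.
	 */
	// If this is an off-slab slab and the leftover space is at least as
	// large as the slab management object, the management object can be
	// moved into the slab, turning it into an on-slab slab.
	if (flags & CFLGS_OFF_SLAB && left_over >= slab_size) {
		flags &= ~CFLGS_OFF_SLAB;
		left_over -= slab_size;		// slab_size is the size of the slab management object
	}

	if (flags & CFLGS_OFF_SLAB) {
		// align applies to the slab objects. If the slab management
		// object is stored off-slab, it cannot shift the position of
		// the objects that follow it the way an on-slab one does, so
		// it does not need to be aligned.
		/* really off slab. No need for manual alignment */
		slab_size =
		    cachep->num * sizeof(kmem_bufctl_t) + sizeof(struct slab);
	}

	// The colouring unit is L1_CACHE_BYTES, i.e. 32 bytes.
	cachep->colour_off = cache_line_size();
	/* Offset must be a multiple of the alignment. */
	// The colouring unit must be an integer multiple of the alignment.
	if (cachep->colour_off < align)
		cachep->colour_off = align;
	cachep->colour = left_over / cachep->colour_off;
	cachep->slab_size = slab_size;	// size of the management object
	cachep->flags = flags;
	cachep->gfpflags = 0;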
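	if (CONFIG_ZONE_DMA_FLAG && (flags & SLAB_CACHE_DMA))
		cachep->gfpflags |= GFP_DMA;	// DMA flag used when talking to the buddy system
	cachep->buffer_size = size;	// size of a slab object
	cachep->reciprocal_buffer_size = reciprocal_value(size);	// reciprocal

	// For an off-slab slab, a cache for the management object is looked up
	// here and stored in slabp_cache. For an on-slab slab this pointer is NULL.
	// The array_cache, cache_cache and three-list caches are always on-slab
	// and never enter this branch.
	if (flags & CFLGS_OFF_SLAB) {
		cachep->slabp_cache = kmem_find_general_cachep(slab_size, 0u);
		/*
		 * This is a possibility for one of the malloc_sizes caches.
		 * But since we go off slab only for object size greater than
		 * PAGE_SIZE/8, and malloc_sizes gets created in ascending order,
		 * this should not happen at all.
		 * But leave a BUG_ON for some lucky dude.
		 */
		BUG_ON(!cachep->slabp_cache);
	}
	// The name of the kmem_cache and the constructor of the objects it manages.
	cachep->ctor = ctor;
	cachep->name = name;

	// Set up the local cache on every CPU: configure the local caches and
	// the three slab lists.
	if (setup_cpu_cache(cachep)) {
		__kmem_cache_destroy(cachep);
		cachep = NULL;
		goto oops;
	}

	// Add the kmem_cache to the kmem_cache list headed by cache_chain.
	/* cache setup completed, link it into the list */
	list_add(&cachep->next, &cache_chain);	// cache_chain is used again
oops:
	if (!cachep && (flags & SLAB_PANIC))
		panic("kmem_cache_create(): failed to create slab `%s'\n",
		      name);
	mutex_unlock(&cache_chain_mutex);	// mutex
	return cachep;	// return this kmem_cache
}

In this function, the first job is to compute a few alignment values. The first is memory alignment: data has to be aligned to the CPU word size (for example, a char member in the middle of a structure may occupy 4 bytes on a 32-bit machine), because that makes CPU accesses to the data more efficient. Second, if SLAB_HWCACHE_ALIGN is set, the object must also be aligned to the cache line. Beyond that, the DEBUG features may require some further alignment, but that is not our focus here. In short, the long stretch of code at the start of the function computes the slab alignment value.

Next, kmem_cache_zalloc() is called to allocate memory for the cache being created. Recall that we defined cache_cache, the cache of caches, earlier. This is where it comes in handy: it is passed as the argument here.

/**
 * kmem_cache_zalloc - Allocate an object. The memory is set to zero.
 * @cache: The cache to allocate from.
 * @flags: See kmalloc().
 *
 * Allocate an object from this cache and set the allocated memory to zero.
 * The flags are only relevant if the cache has no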