A Read-Through of the malloc-Related Source Code

Posted by l0tus on 2023-03-17
Estimated Reading Time 21 Minutes
Words 4.5k In Total

Solving harder heap challenges requires a solid grasp of the heap data structures and of how the various glibc functions are implemented, so I decided to read through the source in detail and build a reasonably complete picture of the heap.
The glibc version analyzed in this post is glibc 2.33.
Note: this post was written by l0tus alone, with reference to ctf-wiki and the blogs of A1ex, chuj and other masters. If you find any mistakes, feel free to contact me so I can correct them.

Starting from the structure of a chunk

In the glibc source we can find the definition of the malloc_chunk struct:

struct malloc_chunk {

  INTERNAL_SIZE_T      mchunk_prev_size;  /* Size of previous chunk (if free).  */
  INTERNAL_SIZE_T      mchunk_size;       /* Size in bytes, including overhead. */

  struct malloc_chunk* fd;                /* double links -- used only if free. */
  struct malloc_chunk* bk;

  /* Only used for large blocks: pointer to next larger size.  */
  struct malloc_chunk* fd_nextsize;       /* double links -- used only if free. */
  struct malloc_chunk* bk_nextsize;
};

This is the standard chunk layout; let us analyze it piece by piece. The first 0x10 bytes are the mchunk_prev_size and mchunk_size fields, commonly called prev_size and size, which record the size of the previous chunk and of the current chunk respectively. This part is what we usually call the chunk header. Note, however, that prev_size is only meaningful when the physically adjacent previous chunk is in the freed state.
The next 0x10 bytes are the fd and bk pointers, which record the next and previous chunk nodes in the linked lists (the bins a chunk is placed on after being freed). The 0x10 bytes after that are fd_nextsize and bk_nextsize, which point to the next/previous chunk of a different size in the list; like fd and bk, they are only used while the chunk is free, and only for large chunks.
Everything from fd up to the next chunk's prev_size can be regarded as the user_data portion, i.e. the part of the chunk that the user can write to.
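In other words, the pointer returned to the user sits 0x10 bytes past the chunk header on 64-bit. A simplified sketch of the conversion between the two (the real 2.33 macros are slightly more involved because of memory tagging):

/* Simplified sketch of the chunk <-> user-pointer conversion.
   The header is 2 * SIZE_SZ bytes, i.e. 0x10 on 64-bit. */
#define chunk2mem(p)   ((void *) ((char *) (p) + 2 * SIZE_SZ))      /* header -> user data */
#define mem2chunk(mem) ((mchunkptr) ((char *) (mem) - 2 * SIZE_SZ)) /* user data -> header */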
It is worth noting that the three lowest bits of the size field are used as flags; they are defined in the source as follows:

/* size field is or'ed with PREV_INUSE when previous adjacent chunk in use */
#define PREV_INUSE 0x1

/* extract inuse bit of previous chunk */
#define prev_inuse(p) ((p)->mchunk_size & PREV_INUSE)


/* size field is or'ed with IS_MMAPPED if the chunk was obtained with mmap() */
#define IS_MMAPPED 0x2

/* check for mmap()'ed chunk */
#define chunk_is_mmapped(p) ((p)->mchunk_size & IS_MMAPPED)


/* size field is or'ed with NON_MAIN_ARENA if the chunk was obtained
from a non-main arena. This is only set immediately before handing
the chunk to the user, if necessary. */
#define NON_MAIN_ARENA 0x4

/*
Bits to mask off when extracting size

Note: IS_MMAPPED is intentionally not masked off from size field in
macros for which mmapped chunks should never be seen. This should
cause helpful core dumps to occur if it is tried by accident by
people extending or adapting this malloc.
*/
#define SIZE_BITS (PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA)

From this we can see that PREV_INUSE, IS_MMAPPED and NON_MAIN_ARENA are flag bits occupying 0x1, 0x2 and 0x4 respectively. PREV_INUSE indicates whether the physically adjacent previous chunk is allocated: 1 if it is, 0 if it is not. IS_MMAPPED records whether the current chunk was obtained via mmap, and NON_MAIN_ARENA records whether the current chunk does not belong to the main arena: 1 means it does not, 0 means it does.
Together these three bits are masked off by the SIZE_BITS macro. Why does glibc do this? Because a chunk's size is always a multiple of 2 * SIZE_SZ, i.e. 0x10-aligned on 64-bit, the low bits of size are always 0, so the low three bits can be reused as bit flags, namely the three above.
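For example, using the macros above, a raw size field of 0x91 decodes as follows (a quick worked example):

/* Worked example: decoding a raw mchunk_size value of 0x91. */
size_t field = 0x91;
size_t size  = field & ~SIZE_BITS;       /* 0x90 -- the real chunk size       */
int p_bit    = field & PREV_INUSE;       /* 0x1  -- previous chunk is in use  */
int m_bit    = field & IS_MMAPPED;       /* 0x0  -- not allocated by mmap     */
int a_bit    = field & NON_MAIN_ARENA;   /* 0x0  -- belongs to the main arena */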
At this point the reader may well wonder: isn't a chunk's size determined by the argument passed to malloc? How can it always be 0x10-aligned?
As for how large a chunk really ends up being and where the alignment comes from, I found a link to an article on senior cj's blog that explains it very clearly; I suggest reading that article first, and most of the confusion should clear up.
We can then draw a diagram to visualize a chunk. Below is what I personally consider the most detailed chunk-layout diagram (drawn by myself):
pic0
Note that the prev_size field is only used when the previous chunk is free, so when the previous chunk is allocated, the next chunk's prev_size field is treated as part of the previous chunk's user_data. This is the so-called space reuse, and it is worth remembering, because it is involved in the later off-by-one/off-by-null bugs and the related unlink and chunk-consolidation operations.
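Space reuse is also why a request of 0x18 bytes fits into a 0x20 chunk: the last 8 bytes of user data overlap the next chunk's prev_size. The request-to-chunk-size conversion looks roughly like glibc's request2size macro (as in malloc.c):

/* Roughly glibc's request2size: add SIZE_SZ bytes of overhead (not 2*SIZE_SZ,
   because the next chunk's prev_size is reused as user data), round up to
   MALLOC_ALIGNMENT, and never go below MINSIZE. */
#define request2size(req)                                    \
  (((req) + SIZE_SZ + MALLOC_ALIGN_MASK < MINSIZE)  ?        \
   MINSIZE :                                                 \
   ((req) + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK)

/* e.g. on 64-bit: request2size(0x18) == 0x20, request2size(0x19) == 0x30 */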
The remaining fd and bk fields are simpler: they record the previous and next chunks in a bin's linked list, and since they are list neighbors they are not necessarily physically adjacent. When does a chunk get put on such a list? When it is freed.
fd_nextsize and bk_nextsize only appear in large bins; they are what the large bin attack relies on.
By now our picture of the chunk layout is fairly complete. I will also paste glibc's own explanatory comment here; it is quite detailed, though not as intuitive as my explanation:

/*
malloc_chunk details:

(The following includes lightly edited explanations by Colin Plumb.)

Chunks of memory are maintained using a `boundary tag' method as
described in e.g., Knuth or Standish. (See the paper by Paul
Wilson ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for a
survey of such techniques.) Sizes of free chunks are stored both
in the front of each chunk and at the end. This makes
consolidating fragmented chunks into bigger chunks very fast. The
size fields also hold bits representing whether chunks are free or
in use.

An allocated chunk looks like this:


chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if unallocated (P clear) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of chunk, in bytes |A|M|P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| User data starts here... .
. .
. (malloc_usable_size() bytes) .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| (size of chunk, but used for application data) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of next chunk, in bytes |A|0|1|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Where "chunk" is the front of the chunk for the purpose of most of
the malloc code, but "mem" is the pointer that is returned to the
user. "Nextchunk" is the beginning of the next contiguous chunk.

Chunks always begin on even word boundaries, so the mem portion
(which is returned to the user) is also on an even word boundary, and
thus at least double-word aligned.

Free chunks are stored in circular doubly-linked lists, and look like this:

chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of previous chunk, if unallocated (P clear) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`head:' | Size of chunk, in bytes |A|0|P|
mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Forward pointer to next chunk in list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Back pointer to previous chunk in list |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Unused space (may be 0 bytes long) .
. .
. |
nextchunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
`foot:' | Size of chunk, in bytes |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Size of next chunk, in bytes |A|0|0|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The P (PREV_INUSE) bit, stored in the unused low-order bit of the
chunk size (which is always a multiple of two words), is an in-use
bit for the *previous* chunk. If that bit is *clear*, then the
word before the current chunk size contains the previous chunk
size, and can be used to find the front of the previous chunk.
The very first chunk allocated always has this bit set,
preventing access to non-existent (or non-owned) memory. If
prev_inuse is set for any given chunk, then you CANNOT determine
the size of the previous chunk, and might even get a memory
addressing fault when trying to do so.

The A (NON_MAIN_ARENA) bit is cleared for chunks on the initial,
main arena, described by the main_arena variable. When additional
threads are spawned, each thread receives its own arena (up to a
configurable limit, after which arenas are reused for multiple
threads), and the chunks in these arenas have the A bit set. To
find the arena for a chunk on such a non-main arena, heap_for_ptr
performs a bit mask operation and indirection through the ar_ptr
member of the per-heap header heap_info (see arena.c).

Note that the `foot' of the current chunk is actually represented
as the prev_size of the NEXT chunk. This makes it easier to
deal with alignments etc but can be very confusing when trying
to extend or adapt this code.

The three exceptions to all this are:

1. The special chunk `top' doesn't bother using the
trailing size field since there is no next contiguous chunk
that would have to index off it. After initialization, `top'
is forced to always exist. If it would become less than
MINSIZE bytes long, it is replenished.

2. Chunks allocated via mmap, which have the second-lowest-order
bit M (IS_MMAPPED) set in their size fields. Because they are
allocated one-by-one, each must contain its own trailing size
field. If the M bit is set, the other bits are ignored
(because mmapped chunks are neither in an arena, nor adjacent
to a freed chunk). The M bit is also used for chunks which
originally came from a dumped heap via malloc_set_state in
hooks.c.

3. Chunks in fastbins are treated as allocated chunks from the
point of view of the chunk allocator. They are consolidated
with their neighbors only in bulk, in malloc_consolidate.
*/

/*
---------- Size and alignment checks and conversions ----------
*/

/* Conversion from malloc headers to user pointers, and back. When
using memory tagging the user data and the malloc data structure
headers have distinct tags. Converting fully from one to the other
involves extracting the tag at the other address and creating a
suitable pointer using it. That can be quite expensive. There are
cases when the pointers are not dereferenced (for example only used
for alignment check) so the tags are not relevant, and there are
cases when user data is not tagged distinctly from malloc headers
(user data is untagged because tagging is done late in malloc and
early in free). User memory tagging across internal interfaces:

sysmalloc: Returns untagged memory.
_int_malloc: Returns untagged memory.
_int_free: Takes untagged memory.
_int_memalign: Returns untagged memory.
_int_memalign: Returns untagged memory.
_mid_memalign: Returns tagged memory.
_int_realloc: Takes and returns tagged memory.
*/

malloc_state

The bins are defined inside the malloc_state struct with the types mfastbinptr and mchunkptr; tracing both back to their original definitions shows that they are in fact just pointers to malloc_chunk:

typedef struct malloc_chunk *mfastbinptr;
typedef struct malloc_chunk* mchunkptr;

#define MAX_FAST_SIZE (80 * SIZE_SZ / 4)

#define NFASTBINS (fastbin_index (request2size (MAX_FAST_SIZE)) + 1)

#define NBINS 128

struct malloc_state
{
/* Serialize access. */
__libc_lock_define (, mutex);

/* Flags (formerly in max_fast). */
int flags;

/* Set if the fastbin chunks contain recently inserted free blocks. */
/* Note this is a bool but not all targets support atomics on booleans. */
int have_fastchunks;

/* Fastbins */
mfastbinptr fastbinsY[NFASTBINS];

/* Base of the topmost chunk -- not otherwise kept in a bin */
mchunkptr top;

/* The remainder from the most recent split of a small request */
mchunkptr last_remainder;

/* Normal bins packed as described above */
mchunkptr bins[NBINS * 2 - 2];

/* Bitmap of bins */
unsigned int binmap[BINMAPSIZE];

/* Linked list */
struct malloc_state *next;

/* Linked list for free arenas. Access to this field is serialized
by free_list_lock in arena.c. */
struct malloc_state *next_free;

/* Number of threads attached to this arena. 0 if the arena is on
the free list. Access to this field is serialized by
free_list_lock in arena.c. */
INTERNAL_SIZE_T attached_threads;

/* Memory allocated from the system in this arena. */
INTERNAL_SIZE_T system_mem;
INTERNAL_SIZE_T max_system_mem;
};

In the source above, the flags field indicates things such as whether the current arena contains fastbin chunks or whether its memory is contiguous; we can ignore it for now.
The fields that follow are the important ones. There are two main arrays, fastbinsY[] and bins[]; the number of fastbins is NFASTBINS, which the macro definitions work out to 10.
In the bins array, the first bin is the unsorted bin (1 bin), bins 2 through 63 are the small bins (62 bins), and bins 64 through 126 are the large bins (63 bins).
This raises a question: the macros say NBINS is 128 and the bins array has NBINS * 2 - 2 = 254 elements, so why do the bins listed above add up to barely half of that?
The answer is that the unsorted bin, small bins and large bins are all doubly linked lists, so each bin needs two adjacent array elements, which serve as the fd and bk of the bin header.
So it may be more accurate to put it this way: each bin in bins occupies a pair of elements; pair 1 is the unsorted bin, pairs 2 to 63 are the small bins, and pairs 64 to 126 are the large bins.
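This pairing is exactly what glibc's bin_at macro encodes (roughly as defined in malloc.c): bin i lives at &bins[(i - 1) * 2], and the address is shifted back by offsetof (struct malloc_chunk, fd) so that the two slots overlay the fd/bk fields of a pseudo chunk header:

typedef struct malloc_chunk *mbinptr;

/* addressing -- note that bin_at(0) does not exist */
#define bin_at(m, i) \
  (mbinptr) (((char *) &((m)->bins[((i) - 1) * 2]))                          \
             - offsetof (struct malloc_chunk, fd))

/* Reminders about list directionality within bins */
#define first(b)     ((b)->fd)
#define last(b)      ((b)->bk)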
Each bit of the binmap array indicates whether the corresponding bin contains any free chunks, which lets malloc skip empty bins quickly.

/*
Indexing

Bins for sizes < 512 bytes contain chunks of all the same size, spaced
8 bytes apart. Larger bins are approximately logarithmically spaced:

64 bins of size 8
32 bins of size 64
16 bins of size 512
8 bins of size 4096
4 bins of size 32768
2 bins of size 262144
1 bin of size what's left

There is actually a little bit of slop in the numbers in bin_index
for the sake of speed. This makes no difference elsewhere.

The bins top out around 1MB because we expect to service large
requests via mmap.

Bin 0 does not exist. Bin 1 is the unordered list; if that would be
a valid chunk size the small bins are bumped up one.
*/

#define NBINS 128
#define NSMALLBINS 64
#define SMALLBIN_WIDTH MALLOC_ALIGNMENT
#define SMALLBIN_CORRECTION (MALLOC_ALIGNMENT > CHUNK_HDR_SZ)
#define MIN_LARGE_SIZE ((NSMALLBINS - SMALLBIN_CORRECTION) * SMALLBIN_WIDTH)

#define in_smallbin_range(sz) \
((unsigned long) (sz) < (unsigned long) MIN_LARGE_SIZE)

#define smallbin_index(sz) \
((SMALLBIN_WIDTH == 16 ? (((unsigned) (sz)) >> 4) : (((unsigned) (sz)) >> 3))\
+ SMALLBIN_CORRECTION)

#define largebin_index_32(sz) \
(((((unsigned long) (sz)) >> 6) <= 38) ? 56 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ? 91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)

#define largebin_index_32_big(sz) \
(((((unsigned long) (sz)) >> 6) <= 45) ? 49 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ? 91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)

// XXX It remains to be seen whether it is good to keep the widths of
// XXX the buckets the same or whether it should be scaled by a factor
// XXX of two as well.
#define largebin_index_64(sz) \
(((((unsigned long) (sz)) >> 6) <= 48) ? 48 + (((unsigned long) (sz)) >> 6) :\
((((unsigned long) (sz)) >> 9) <= 20) ? 91 + (((unsigned long) (sz)) >> 9) :\
((((unsigned long) (sz)) >> 12) <= 10) ? 110 + (((unsigned long) (sz)) >> 12) :\
((((unsigned long) (sz)) >> 15) <= 4) ? 119 + (((unsigned long) (sz)) >> 15) :\
((((unsigned long) (sz)) >> 18) <= 2) ? 124 + (((unsigned long) (sz)) >> 18) :\
126)

#define largebin_index(sz) \
(SIZE_SZ == 8 ? largebin_index_64 (sz) \
: MALLOC_ALIGNMENT == 16 ? largebin_index_32_big (sz) \
: largebin_index_32 (sz))

#define bin_index(sz) \
((in_smallbin_range (sz)) ? smallbin_index (sz) : largebin_index (sz))

Small bin: for a given chunk size, the small bin index satisfies chunk_size = MALLOC_ALIGNMENT * index, where MALLOC_ALIGNMENT = 2 * SIZE_SZ. So the largest small-bin chunk is 504 bytes on 32-bit and 1008 bytes on 64-bit.
Large bin: there are 63 large bins in total. Chunks within one bin are not all the same size but fall within a range; the 63 bins are split into 6 groups, and within each group the bins have the same spacing between their size ranges.
Within a large bin the chunks are ordered from largest to smallest, and the bin header's bk pointer points to the smallest chunk.
The unsorted bin and the small bins behave FIFO (first in, first out: inserted at the head, taken from the tail), whereas the fastbins and tcache are LIFO; large bins are kept sorted by size rather than by insertion order.
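As a quick sanity check of the index macros above (a sketch assuming 64-bit, i.e. SIZE_SZ == 8, MALLOC_ALIGNMENT == 16, SMALLBIN_CORRECTION == 0):

/* smallbin_index: index == size >> 4, so a 0x90 chunk lands in small bin 9. */
size_t idx1 = smallbin_index (0x90);   /* == 9 */

/* largebin_index_64: 0x420 >> 6 == 16 <= 48, so index == 48 + 16 == 64,
   which is exactly the first large bin (MIN_LARGE_SIZE == 0x400 on 64-bit). */
size_t idx2 = largebin_index (0x420);  /* == 64 */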

_heap_info

typedef struct _heap_info
{
  mstate ar_ptr;            /* Arena for this heap. */
  struct _heap_info *prev;  /* Previous heap. */
  size_t size;              /* Current size in bytes. */
  size_t mprotect_size;     /* Size in bytes that has been mprotected
                               PROT_READ|PROT_WRITE.  */
  size_t pagesize;          /* Page size used when allocating the arena.  */
  /* Make sure the following data is properly aligned, particularly
     that sizeof (heap_info) + 2 * SIZE_SZ is a multiple of
     MALLOC_ALIGNMENT. */
  char pad[-3 * SIZE_SZ & MALLOC_ALIGN_MASK];
} heap_info;

This struct exists mainly for allocations by non-main threads, i.e. allocations from a non-main arena. The main arena can extend its memory directly with sbrk and therefore has a single heap, whereas a non-main arena's heap is mapped in advance; when that heap is exhausted, a new region has to be requested, and since the new region is generally not contiguous with the old one, the different heaps of such an arena must be linked together, which is exactly what this structure records.
ar_ptr: the arena this heap belongs to.
prev: the previous heap of this arena; note that all heaps of an arena are chained as a singly linked list through this field.
size: the current size of the heap.
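This is also where the heap_for_ptr trick mentioned in glibc's big comment above comes from: every non-main heap is aligned to HEAP_MAX_SIZE, so masking a chunk address yields the heap_info header and, through ar_ptr, the owning arena (roughly as defined in arena.c):

/* find the heap and corresponding arena for a given ptr */
#define heap_for_ptr(ptr) \
  ((heap_info *) ((unsigned long) (ptr) & ~(HEAP_MAX_SIZE - 1)))
#define arena_for_chunk(ptr) \
  (chunk_main_arena (ptr) ? &main_arena : heap_for_ptr (ptr)->ar_ptr)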

malloc

In malloc.c, glibc uses strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc) to make malloc an alias of __libc_malloc, so when malloc is called, the function that actually runs is __libc_malloc.
Versions 2.34 and later have removed the hook mechanism, so their malloc source differs from the older versions. That said, plenty of heap challenges in domestic CTFs still test the hooks, and the IO_FILE-based house-of techniques probably have not pushed on to 2.37 yet either, so I will analyze the old version first and then go through the hook-free new version as well.

2.33 (with hooks)

Initialization

void *
__libc_malloc (size_t bytes)
{
  mstate ar_ptr;
  void *victim;

  _Static_assert (PTRDIFF_MAX <= SIZE_MAX / 2,
                  "PTRDIFF_MAX is not more than half of SIZE_MAX");

  void *(*hook) (size_t, const void *)
    = atomic_forced_read (__malloc_hook);
  if (__builtin_expect (hook != NULL, 0))
    return (*hook)(bytes, RETURN_ADDRESS (0));
}

We can ignore the static assertion part, which is something for the compiler to verify, and go straight to the malloc flow.
We see this piece of code:

if (__builtin_expect (hook != NULL, 0))
return (*hook)(bytes, RETURN_ADDRESS (0));

When __libc_malloc is called, it first checks whether __malloc_hook is non-NULL; if it is, the function it points to is called directly instead.
The first time malloc is called, __malloc_hook points to the function malloc_hook_ini, which is defined in hooks.c:

static void *
malloc_hook_ini (size_t sz, const void *caller)
{
  __malloc_hook = NULL;
  ptmalloc_init ();
  return __libc_malloc (sz);
}

Executing this function sets __malloc_hook to NULL
and at the same time calls ptmalloc_init.
That function is defined in arena.c and is mainly responsible for initializing the heap and main_arena; we will not analyze it in detail for now.
Once initialization is finished, __libc_malloc is called again to actually allocate the chunk.
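To see the hook mechanism in action, here is a minimal sketch for glibc <= 2.33 (the hook symbols were removed later) that installs a logging hook; this entry point is exactly what hook-hijacking exploits abuse:

#include <malloc.h>
#include <stdio.h>

static void *(*old_hook) (size_t, const void *);

static void *
my_hook (size_t size, const void *caller)
{
  void *p;
  __malloc_hook = old_hook;      /* temporarily restore the old hook so malloc works normally */
  p = malloc (size);
  fprintf (stderr, "malloc(%zu) = %p (called from %p)\n", size, p, caller);
  old_hook = __malloc_hook;
  __malloc_hook = my_hook;       /* re-install our hook */
  return p;
}

int
main (void)
{
  old_hook = __malloc_hook;
  __malloc_hook = my_hook;       /* from now on every malloc goes through my_hook */
  free (malloc (0x18));
  return 0;
}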

#if USE_TCACHE

#if USE_TCACHE
  /* int_free also calls request2size, be careful to not pad twice.  */
  size_t tbytes;
  if (!checked_request2size (bytes, &tbytes))
    {
      __set_errno (ENOMEM);
      return NULL;
    }
  size_t tc_idx = csize2tidx (tbytes);

  MAYBE_INIT_TCACHE ();

  DIAG_PUSH_NEEDS_COMMENT;
  if (tc_idx < mp_.tcache_bins
      && tcache
      && tcache->counts[tc_idx] > 0)
    {
      victim = tcache_get (tc_idx);
      return TAG_NEW_USABLE (victim);
    }
  DIAG_POP_NEEDS_COMMENT;
#endif

On entering this branch, checked_request2size first converts the requested bytes into the padded chunk size tbytes, and csize2tidx then turns that size into the corresponding tcache index.
csize2tidx is a macro: # define csize2tidx(x) (((x) - MINSIZE + MALLOC_ALIGNMENT - 1) / MALLOC_ALIGNMENT)
On 64-bit, MINSIZE and MALLOC_ALIGNMENT are 0x20 and 0x10 respectively, so the formula is easy to understand: it measures how far the chunk size is above the smallest tcache size and converts that into an index.
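A few concrete values on 64-bit, as a quick check of the formula:

/* csize2tidx on 64-bit: MINSIZE == 0x20, MALLOC_ALIGNMENT == 0x10 */
csize2tidx (0x20);    /* (0x20  - 0x20 + 0xf) / 0x10 == 0  -> first tcache bin */
csize2tidx (0x30);    /* (0x30  - 0x20 + 0xf) / 0x10 == 1                      */
csize2tidx (0x410);   /* (0x410 - 0x20 + 0xf) / 0x10 == 63 -> last tcache bin  */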
Next comes the MAYBE_INIT_TCACHE macro, which is very simple: it checks whether tcache has been initialized and, if not, calls tcache_init to initialize it.
tcache_init initializes the thread cache, a mechanism glibc uses to make memory allocation more efficient.
The thread cache is a thread-local memory pool: every thread has its own cache holding small chunks, and when a thread needs a small allocation it looks in its own cache first, falling back to the shared arena when the cache has nothing suitable.
Concretely, tcache_init allocates the calling thread's tcache management structure (via _int_malloc, as shown below), so that later allocations and frees of small chunks can go through the cache.

static __thread tcache_perthread_struct *tcache = NULL;

The MAYBE_INIT_TCACHE macro refers to the tcache variable; looking up its declaration, its type is a pointer to the tcache_perthread_struct struct, qualified with the __thread keyword. __thread is GCC's built-in thread-local storage facility, with access performance comparable to a global variable: each thread has its own independent instance of a __thread variable, and the values in different threads do not interfere with one another. It is suitable for variables that are global in nature and whose values may change, but which are not worth protecting like a shared global; effectively every thread gets its own copy.
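A tiny standalone illustration of __thread semantics (unrelated to glibc internals, just to show the per-thread copy):

#include <pthread.h>
#include <stdio.h>

static __thread int counter = 0;        /* each thread gets its own copy */

static void *
worker (void *arg)
{
  for (int i = 0; i < 3; i++)
    counter++;                          /* only touches this thread's copy */
  printf ("worker: counter = %d\n", counter);
  return NULL;
}

int
main (void)
{
  pthread_t t1, t2;
  pthread_create (&t1, NULL, worker, NULL);
  pthread_create (&t2, NULL, worker, NULL);
  pthread_join (t1, NULL);
  pthread_join (t2, NULL);
  printf ("main: counter = %d\n", counter);   /* still 0 in the main thread */
  return 0;
}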
Its type, the tcache_perthread_struct struct, is defined as follows:

# define TCACHE_MAX_BINS	64
# define MAX_TCACHE_SIZE	tidx2usize (TCACHE_MAX_BINS-1)

typedef struct tcache_perthread_struct
{
  uint16_t counts[TCACHE_MAX_BINS];
  tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;

TCACHE_MAX_BINS is defined as 64, which means tcache maintains 64 mutually isolated singly linked lists, one per size; a quick calculation shows the covered chunk sizes range from 0x20 to 0x410.
entries is an array of tcache_entry pointers, each pointing to the first chunk of the corresponding list, while counts records how many chunks each list currently holds.
The tcache_entry struct is defined as follows:

typedef struct tcache_entry
{
  struct tcache_entry *next;
  /* This field exists to detect double frees.  */
  struct tcache_perthread_struct *key;
} tcache_entry;

The next pointer points to the next chunk in the list. Versions after 2.28 added the key field to detect tcache double frees: roughly, when a chunk is freed into tcache, if the key stored at what would be its bk position already equals the current thread's tcache pointer, the tcache bin of that size is traversed, and if a chunk with the same address is found, a double-free error is raised. We will not analyze the exact implementation here.
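For reference, a paraphrased sketch of that check as it appears in _int_free in 2.33 (simplified; safe-linking pointer mangling and the probe are omitted):

/* Paraphrased sketch of the tcache double-free check in _int_free (2.33). */
tcache_entry *e = (tcache_entry *) chunk2mem (p);
if (__glibc_unlikely (e->key == tcache))   /* chunk claims to already be in tcache */
  {
    size_t cnt = 0;
    for (tcache_entry *tmp = tcache->entries[tc_idx]; tmp != NULL; tmp = tmp->next)
      {
        if (tmp == e)
          malloc_printerr ("free(): double free detected in tcache 2");
        if (++cnt > mp_.tcache_count)
          malloc_printerr ("free(): too many chunks detected in tcache");
      }
  }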

The tcache initialization function, tcache_init:

static void
tcache_init(void)
{
mstate ar_ptr;
void *victim = 0;
const size_t bytes = sizeof (tcache_perthread_struct);

if (tcache_shutting_down)
return;

arena_get (ar_ptr, bytes);
victim = _int_malloc (ar_ptr, bytes);
if (!victim && ar_ptr != NULL)
{
ar_ptr = arena_get_retry (ar_ptr, bytes);
victim = _int_malloc (ar_ptr, bytes);
}


if (ar_ptr != NULL)
__libc_lock_unlock (ar_ptr->mutex);

/* In a low memory situation, we may not be able to allocate memory
- in which case, we just keep trying later. However, we
typically do this very early, so either there is sufficient
memory, or there isn't enough memory to do non-trivial
allocations anyway. */
if (victim)
{
tcache = (tcache_perthread_struct *) victim;
memset (tcache, 0, sizeof (tcache_perthread_struct));
}
}
