(Take this article with a grain of salt: the details differ a lot from one platform and implementation to another!)
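The original question was:
When memory obtained from malloc is released with free, does the operating system take that memory back immediately?
One explanation:
1) The C library just sits quietly on disk and has no authority over this. free() is a library function, true, but that gives many people the false impression that C handles the whole thing by itself. Reclaiming space, like allocating it, involves two kinds of space: virtual and physical. Virtual address space is private to each process, so a process can deal with it on its own. But there is only one physical memory, and every process shares it. If each process did its own garbage collection on it, wouldn't that be chaos? So I imagine that the library function free() merely announces "this can be reclaimed now" without actually reclaiming the space.
2) If the system reclaimed physical memory the moment virtual memory was released, it would work itself to death, and overall efficiency would drop sharply. So modern systems use algorithms to decide when to reclaim. It's like the garbage bag outside your door: the cleaner doesn't take it away the instant you put it out. No cleaner could offer that service to every resident, unless she were Superman or Doraemon. And if the system were clever enough to notice that a process had become a zombie and to reclaim the space it occupied, why would its PCB (process control block) still be left behind?
The question came with the following code, which crashes when run: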
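#include <cstdlib>   // malloc, free
#include <cstring>   // memset, strcpy_s (MSVC)
#include <iostream>

int main()
{
    unsigned char *p = (unsigned char*)malloc(4 * sizeof(unsigned char));
    memset(p, 0, 4);
    strcpy_s((char*)p, 9, "abcdabcd"); // **deliberately storing 8 chars + '\0' = 9 bytes into a 4-byte block**
    std::cout << p;
    free(p); // Obvious Crash, but I need how it works and why crash.
    //std::cout << p;
    return 0;
}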
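I tried it, and it really does stop running. (The reason is at the end of this article.)
First, a quick introduction to the memset function:
memset is a C/C++ standard library function. It sets each of the first num bytes of the memory block pointed to by ptr to the given value (converted to unsigned char). The first argument is the address of the block and the third is the number of bytes to set. It is typically used to initialize freshly allocated memory, and it returns ptr.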
Prototype:
void * memset ( void * ptr, int value, size_t num );
Purpose:
Fill block of memory
Sets the first num bytes of the block of memory pointed by ptr to the specified value (interpreted as an unsigned char).
The parameters in detail:
Parameters
- ptr
Pointer to the block of memory to fill.
- value
Value to be set. The value is passed as an int, but the function fills the block of memory using the unsigned char conversion of this value.
- num
Number of bytes to be set to the value. size_t is an unsigned integral type.
So the lines above:
unsigned char *p = (unsigned char*)malloc(4 * sizeof(unsigned char));
memset(p, 0, 4);
allocate a 4-byte block and initialize it to 0.
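To make the "interpreted as an unsigned char" part concrete, here is a small sketch. The helper name fill_and_read is made up for this example; it just allocates a block, fills it with memset, and hands back the first byte so you can see what was actually stored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from the original post): allocate n bytes,
// fill them with `value` via memset, and return the first stored byte.
// Returns 0 if n is 0 or the allocation fails.
unsigned char fill_and_read(std::size_t n, int value) {
    if (n == 0) return 0;
    unsigned char *p = static_cast<unsigned char *>(std::malloc(n));
    if (p == nullptr) return 0;
    std::memset(p, value, n);      // value is converted to unsigned char
    unsigned char first = p[0];
    std::free(p);
    return first;
}
```

For instance, fill_and_read(4, 0x141) gives back 0x41, because only the low byte of the int survives the conversion.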
Now let's look at the explanation:
In many malloc/free implementations, free does normally not return the memory to the operating system (or at least only in rare cases). The reason is, that you will get gaps in your heap and thus it can happen, that you just finish off your 2 or 4 GB of virtual memory with gaps. This should be avoided of course, since as soon as the virtual memory is finished, you will be in really big trouble. The other reason of course is, that the OS can only handle memory chunks that are of a specific size and alignment. To be specific: Normally the OS can only handle blocks that the virtual memory manager can handle (most often multiples of 512 bytes e.g. 4KB).
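- free usually does not hand the memory straight back to the operating system. If it did, the heap would end up full of gaps, and you could use up your few GB of virtual address space on those gaps alone (I was a bit lost here at first). Obviously that has to be avoided. The other reason is that the OS can only deal with regions of a specific size and alignment: it works in units the virtual memory manager can handle, usually multiples of 512 bytes, for example 4 KB.
(Update: "gap" here is probably best read as "fragmentation", i.e. holes left in the heap.)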
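About the gap:
The last member of the block header, gap, also called NoMansLand, is a region of 4 bytes (nNoMansLandSize = 4). The last few lines of comments make this clear: the structure is followed by the 10 bytes of data the user actually asked for, and those are followed by another 4-byte gap. In other words, the region the user allocated is wrapped between a header structure and a 4-byte gap. When those 10 bytes are freed, this information is checked. After allocation the gap is filled with 0xFD, and if the check finds that the values in the gap have changed, an assertion failure is reported. (Note that this "gap", which matches the guard bytes used by the MSVC debug heap, is a different thing from the "gaps" in the quoted answer, which are holes in the heap.)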
// Buffer just before (lower than) the user's memory:
unsigned char gap[nNoMansLandSize];
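The guard-byte idea can be sketched in a few lines. This is only my own illustration, not the real debug heap: the names guarded_alloc and guarded_free, the layout (a size field, a front guard, the user data, a back guard) and the constants are assumptions modeled on the 4-byte, 0xFD-filled gap described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

constexpr std::size_t kGuardSize = 4;       // mirrors nNoMansLandSize = 4
constexpr unsigned char kGuardByte = 0xFD;  // fill pattern mentioned above

// Hypothetical layout: [size_t size][guard][user data][guard].
// Only meant for byte buffers; alignment for larger types is not preserved.
unsigned char *guarded_alloc(std::size_t size) {
    unsigned char *raw = static_cast<unsigned char *>(
        std::malloc(sizeof(std::size_t) + kGuardSize + size + kGuardSize));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, sizeof size);                    // remember the size
    unsigned char *user = raw + sizeof(std::size_t) + kGuardSize;
    std::memset(user - kGuardSize, kGuardByte, kGuardSize);  // front guard
    std::memset(user + size, kGuardByte, kGuardSize);        // back guard
    return user;
}

// Checks both guards, frees the block either way, and reports whether
// the guards were still intact (false means something wrote out of bounds).
bool guarded_free(unsigned char *user) {
    unsigned char *raw = user - kGuardSize - sizeof(std::size_t);
    std::size_t size = 0;
    std::memcpy(&size, raw, sizeof size);
    const unsigned char *front = user - kGuardSize;
    const unsigned char *back = user + size;
    bool intact = true;
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (front[i] != kGuardByte || back[i] != kGuardByte) intact = false;
    }
    std::free(raw);
    return intact;
}
```

A real debug heap would raise an assertion instead of returning false, but the check itself works the same way: write past the end of the user area and the 0xFD pattern is no longer there when the block is freed.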
So returning 40 Bytes to the OS will just not work. So what does free do?
- So freeing a mere 40 bytes is not something the OS will deal with! Then what does free actually do?
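A quick bit of arithmetic shows why 40 bytes is hopeless as a unit. The sketch below assumes a 4096-byte page, which is common but not universal (the real value comes from the OS, for example via sysconf(_SC_PAGESIZE) on POSIX or GetSystemInfo on Windows); round_up_to_page is a name I made up.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round `bytes` up to a whole number of pages.
// The default page size of 4096 is an assumption, not a guarantee.
constexpr std::size_t round_up_to_page(std::size_t bytes,
                                       std::size_t page_size = 4096) {
    return (bytes + page_size - 1) / page_size * page_size;
}
```

So a 40-byte block lives inside a 4096-byte page, and that page can only go back to the OS once nothing else in it is still in use.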
Free will put the memory block in its own free block list. Normally it also tries to meld together adjacent blocks in the address space. The free block list is just a circular list of memory chunks which have of course some administrative data in the beginning. This is also the reason, why managing very small memory elements with the standard malloc/free is not efficient. Every memory chunk needs additional data and with smaller sizes more fragmentation happens.
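- free puts the block on its own list of free blocks, and it usually also tries to merge adjacent blocks together.
The free list is just a circular linked list of memory chunks, each of which carries some administrative data at its start. That is also why handling lots of small pieces of data with the standard malloc/free is inefficient.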
The free-list is also the first place that malloc looks at when a new chunk of memory is needed. It is scanned before it calls for new memory from the OS. When a chunk is found that is bigger than the needed memory, it is just divided into two parts. One is returned to caller, the other is put back into the free list.
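- The free list is also the first place malloc searches when a new block is needed. If it finds a chunk that is big enough, it uses it, and whatever is left over goes back on the free list.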
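Here is a toy version of that behaviour, just to make the mechanics visible. It is a simplified sketch and not how any real malloc is written: it uses a fixed 1024-byte arena instead of asking the OS for memory, and one address-ordered list of all blocks rather than a circular list of free ones. All names (Block, toy_init, toy_malloc, toy_free) are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Administrative data stored in front of every chunk.
struct alignas(std::max_align_t) Block {
    std::size_t size;  // usable bytes that follow this header
    bool free;         // is the chunk currently available?
    Block *next;       // next block in address order
};

alignas(std::max_align_t) static unsigned char arena[1024];
static Block *head = nullptr;

// Start with one big free block covering the whole arena.
void toy_init() {
    head = new (arena) Block{sizeof(arena) - sizeof(Block), true, nullptr};
}

// First fit: take the first free block that is large enough,
// split off the unused tail and leave it on the list as a free block.
void *toy_malloc(std::size_t n) {
    const std::size_t a = alignof(std::max_align_t);
    n = (n + a - 1) / a * a;  // keep later headers aligned
    for (Block *b = head; b != nullptr; b = b->next) {
        if (!b->free || b->size < n) continue;
        if (b->size >= n + sizeof(Block) + a) {  // worth splitting?
            unsigned char *tail = reinterpret_cast<unsigned char *>(b + 1) + n;
            Block *rest =
                new (tail) Block{b->size - n - sizeof(Block), true, b->next};
            b->size = n;
            b->next = rest;
        }
        b->free = false;
        return b + 1;  // user memory starts right after the header
    }
    return nullptr;  // a real malloc would now ask the OS for more
}

// Mark the block free, then merge neighbouring free blocks.
void toy_free(void *p) {
    if (p == nullptr) return;
    (static_cast<Block *>(p) - 1)->free = true;
    for (Block *c = head; c != nullptr;) {
        if (c->free && c->next != nullptr && c->next->free) {
            c->size += sizeof(Block) + c->next->size;  // absorb neighbour
            c->next = c->next->next;
        } else {
            c = c->next;
        }
    }
}
```

Notice that every chunk pays for a whole Block header, which is the overhead for small allocations that the quoted answer complains about.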
There are many different optimizations to this standard behaviour (for example for small chunks of memory). But since malloc and free must be so universal, the standard behaviour is always the fallback when alternatives are not usable. There are also optimizations in handling the free-list — for example storing the chunks in lists sorted by sizes. But all optimizations also have their own limitations.
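But really, the main reason the program above crashes is that only 4 bytes were allocated while 9 bytes (8 characters plus the terminating '\0') were written!
That overwrites the bookkeeping area that follows the block, which is what causes the problem!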
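One way to avoid this kind of overrun is to compare the space a string needs with the space the buffer really has before copying. The helper below is only an illustration (copy_if_fits is a made-up name); the point is that the capacity passed in must be the true size of the destination, not the size you wish it had.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `src` into `dst` only if the whole string,
// including the terminating '\0', fits into `dst_capacity` bytes.
bool copy_if_fits(char *dst, std::size_t dst_capacity, const char *src) {
    const std::size_t needed = std::strlen(src) + 1;  // characters + '\0'
    if (needed > dst_capacity) return false;          // would overrun
    std::memcpy(dst, src, needed);
    return true;
}
```

With a 4-byte buffer, "abcdabcd" needs 9 bytes and the copy is refused; with a 10-byte buffer it goes through.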
Based on the above, I modified the program slightly:
unsigned char *p = (unsigned char*)malloc(10 * sizeof(unsigned char));
memset(p, 0, 10);
strcpy_s((char*)p, 9,"abcdabcd"); // 8 chars + '\0' = 9 bytes, which now fits in the 10-byte block
std::cout << p;
free(p);
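Now it runs fine!
Someone also tried printing the string again after the free. Apparently on some machines it still prints, but when I tried it I got garbage. Reading memory after free is undefined behavior, so garbage does not prove the OS took the memory back: the allocator (or a debug heap that fills freed blocks with a pattern) may simply have overwritten it.
Again, take this article with a grain of salt, because the details differ a lot from one platform to another!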