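Unless otherwise stated, all C++ code in this article is compiled with VC.
Generating assembly code with the VC compiler: run "cl filename.cpp /Fa" to produce an assembly listing for "filename.cpp". This code has not been optimized by the compiler, so it is easier to read than what you get by disassembling the final EXE. Even better, the compiler writes comments into the asm file that map C++ line numbers to the corresponding asm code. Before running cl.exe, you must first run "C:\Program Files\Microsoft Visual Studio\VC98\Bin\VCVARS32.BAT" to set up the environment variables.
What does a C++ object actually look like in memory? Start with the following code: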
#include <stdio.h>

class test
{
public:
    int m1;
    int m2;
    int m3;
    virtual int function1() { return 1; }
    virtual int function2() { return 2; }
    int function3() { return 3; }
    int function4() { return 4; }
};

int main()
{
    test *ptr1 = new test();
    test *ptr2 = new test();

    printf("Size of test:\t%d\n", sizeof(test));

    // Addresses of the first object and its members
    printf("Addr of ptr1:\t0x%08X\n", ptr1);
    printf("Addr of m1:\t0x%08X\n", &(ptr1->m1));
    printf("Addr of m2:\t0x%08X\n", &(ptr1->m2));
    printf("Addr of m3:\t0x%08X\n", &(ptr1->m3));
    // The first 4 bytes of the object hold the vtable pointer
    printf("Addr of vtable:\t0x%08X\n", *(unsigned int *)((void *)ptr1));
    printf("\n");

    // Same for the second object
    printf("Addr of ptr2:\t0x%08X\n", ptr2);
    printf("Addr of m1:\t0x%08X\n", &(ptr2->m1));
    printf("Addr of m2:\t0x%08X\n", &(ptr2->m2));
    printf("Addr of m3:\t0x%08X\n", &(ptr2->m3));
    printf("Addr of vtable:\t0x%08X\n", *(unsigned int *)((void *)ptr2));
    printf("\n");

    // Entries stored in the vtable
    printf("Addr of vtable[0]:\t0x%08X\n", **((int**)ptr1));
    printf("Addr of vtable[1]:\t0x%08X\n", *(*((int**)ptr1)+1));

    // VC-specific: standard C++ spells these &test::function1, and passing
    // a member function pointer to printf is not portable.
    printf("Addr of function1:\t0x%08X\n", (test::function1));
    printf("Addr of function2:\t0x%08X\n", (test::function2));
    printf("Addr of function3:\t0x%08X\n", (test::function3));
    printf("Addr of function4:\t0x%08X\n", (test::function4));

    delete ptr1;
    delete ptr2;
    return 0;
}
Compiled and run under VC, the program prints:
Size of test:       16
Addr of ptr1:       0x00340758
Addr of m1:         0x0034075C
Addr of m2:         0x00340760
Addr of m3:         0x00340764
Addr of vtable:     0x004060B0

Addr of ptr2:       0x00340770
Addr of m1:         0x00340774
Addr of m2:         0x00340778
Addr of m3:         0x0034077C
Addr of vtable:     0x004060B0

Addr of vtable[0]:  0x00401210
Addr of vtable[1]:  0x00401220
Addr of function1:  0x00401230
Addr of function2:  0x00401240
Addr of function3:  0x004011E0
Addr of function4:  0x004011F0
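The output shows that a test object occupies 16 bytes in memory, laid out as follows:

Offset  Size  Field
+0      4     pvtable
+4      4     m1
+8      4     m2
+12     4     m3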
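Here pvtable is a pointer to the virtual function table. C++ relies on the vtable to implement dynamic binding: at run time, the function pointers stored in the vtable are used to call the appropriate virtual function. The program's output, however, does not quite fit this model: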
Addr of vtable[0]:  0x00401210
Addr of vtable[1]:  0x00401220
Addr of function1:  0x00401230
Addr of function2:  0x00401240
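vtable[0] and vtable[1] do not match function1 and function2, although the addresses are very close to each other. To find out what is going on, disassemble the code at the function1 and function2 addresses: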
:00401230 8B01      mov eax, dword ptr [ecx]   ; load the vtable address into eax
:00401232 FF20      jmp dword ptr [eax]        ; jump to function1 through vtable[0]
:00401234 CC        int 03
... ...
:0040123F CC        int 03
:00401240 8B01      mov eax, dword ptr [ecx]   ; load the vtable address into eax
:00401242 FF6004    jmp [eax+04]               ; jump to function2 through vtable[1]
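Note: in a thiscall function call, the ecx register holds the object's this pointer. The two stubs in the disassembly follow the same pattern: each one loads the vtable pointer from the object, looks up the corresponding virtual function address in the vtable, and jumps to it. So the expressions test::function1 and test::function2 evaluate to the addresses of these stubs, not to the addresses of the function bodies stored in the vtable. VC does not hand out the real virtual function address here in order to preserve polymorphism: which implementation runs depends on the dynamic type of the object the call is made on, so it cannot, and should not, be fixed before the program runs.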
(To be continued)
http://www.donews.net/tabris17/archive/2005/02/13/275979.aspx
