Inheriting From a Native C++ Class in C#
Hi, this is Jim Springfield, an architect on the Visual C++ team. I have blogged in the past
about our IDE and IntelliSense work. I am still heavily focused on that and we
are working hard to deliver an improved experience, but this post is about a
completely different topic. A few months ago, I started thinking about how to
access C++ classes from managed code and came up with this technique, which I
haven’t seen mentioned anywhere else.
There are many ways
that native code and managed code can interact and call each other. If you have
native code that you want to call from C# you have several choices depending on
the nature of the API. If you have a flat “C” API, you can use P/Invoke to
directly call the API. If the native code is exposed using COM, the CLR’s COM
Interop can provide access. If you have a C++ class, you could go add COM
support, or write a custom wrapper using C++/CLI and expose a new managed
class.
I really wanted something more direct than these.
Initially, I was just trying to see if I could call a native C++ class from C#,
but as I started playing with it, I realized that I could actually “inherit”
from the native class. I put “inherit” in quotes, because you could make an
argument that it isn’t truly inheritance, but I will let the reader make the
final decision.
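For contrast, here is roughly what the flat “C” API route mentioned above tends to look like. This is only a sketch: SimpleClass is a trimmed stand-in for a native class, and the names SimpleClass_Create, SimpleClass_GetValue, SimpleClass_Destroy, and DemoFlatApi are hypothetical and not part of any real DLL. In an actual DLL the three functions would also need to be exported.

```cpp
#include <cassert>

// Trimmed stand-in for a native class (an assumption for this sketch).
class SimpleClass {
public:
    int value;
    explicit SimpleClass(int v) : value(v) {}
    int GetValue() const { return value; }
};

// Hypothetical flat API: an opaque handle plus free functions.
// extern "C" keeps the names unmangled, which is what makes them
// easy targets for P/Invoke.
extern "C" {
    void* SimpleClass_Create(int value) { return new SimpleClass(value); }

    int SimpleClass_GetValue(void* handle) {
        return static_cast<SimpleClass*>(handle)->GetValue();
    }

    void SimpleClass_Destroy(void* handle) {
        delete static_cast<SimpleClass*>(handle);
    }
}

// Round trip through the flat API, in the order a managed caller would use it.
int DemoFlatApi() {
    void* handle = SimpleClass_Create(10);
    assert(handle != nullptr);
    int result = SimpleClass_GetValue(handle);
    SimpleClass_Destroy(handle);
    return result;
}
```

The cost of this route is one hand-written forwarding function per member, which is the kind of indirection I was hoping to avoid.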
Let’s say I have a C++ class exposed from a DLL that I want to consume in C#.
The class looks like the following.
#include <stdio.h>

class __declspec(dllexport) CSimpleClass
{
public:
    int value;

    CSimpleClass(int value) : value(value)
    {
    }

    ~CSimpleClass()
    {
        printf("~CSimpleClass\n");
    }

    // Non-virtual method that calls each of the virtual functions.
    void M1()
    {
        printf("C++/CSimpleClass::M1()\n");
        V0();
        V1(value);
        V2();
    }

    virtual void V0()
    {
        printf("C++/CSimpleClass::V0()\n");
    }

    virtual void V1(int x)
    {
        printf("C++/CSimpleClass::V1(%d)\n", x);
    }

    virtual void V2()
    {
        printf("C++/CSimpleClass::V2()\n");
    }
};
The __declspec(dllexport) means that the class is exported from the DLL. What this
really means is that all of the class methods are exported from the DLL. If I
look at the list of exports using dumpbin.exe or depends.exe, I see the
following list of exports.
??0CSimpleClass@@QAE@ABV0@@Z
??0CSimpleClass@@QAE@H@Z
??1CSimpleClass@@QAE@XZ
??4CSimpleClass@@QAEAAV0@ABV0@@Z
??_7CSimpleClass@@6B@
?M1@CSimpleClass@@QAEXXZ
?V0@CSimpleClass@@UAEXXZ
?V1@CSimpleClass@@UAEXH@Z
?V2@CSimpleClass@@UAEXXZ
These
are decorated (i.e. “mangled”) names. For most of these, you can probably guess
what the name is actually referring to.
(Note:
Name mangling may change between versions of C++ and mangling is different
between x86, x64, and Itanium platforms. The example here works on both VS2008
and the CTP release of VS2010.)
There is a nifty tool called
undname.exe that ships with Visual Studio, which can take a mangled name and
undecorate it. Running it on each of the names above gives the corresponding
output.
public: __thiscall CSimpleClass::CSimpleClass(class CSimpleClass const &)
public: __thiscall CSimpleClass::CSimpleClass(int)
public: __thiscall CSimpleClass::~CSimpleClass(void)
public: class CSimpleClass & __thiscall CSimpleClass::operator=(class CSimpleClass const &)
const CSimpleClass::`vftable'
public: void __thiscall CSimpleClass::M1(void)
public: virtual void __thiscall CSimpleClass::V0(void)
public: virtual void __thiscall CSimpleClass::V1(int)
public: virtual void __thiscall CSimpleClass::V2(void)
Other than the methods we explicitly defined, there are also a compiler-generated
copy constructor and assignment operator, and a reference to the vtable for this
class. OK, so I know that using P/Invoke, C# can call into native DLL entry
points, and I just happen to have a list of native entry points.
First, however, we
need to define a structure in C# that corresponds to the native class. Our
native class only has one field: an int. However, it does have virtual methods,
so there is also a vtable pointer at the beginning of the class.
(Note: I am only dealing with single
inheritance here. With multiple inheritance, there are multiple vtables and
vtable pointers.)
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public unsafe struct __CSimpleClass
{
    public IntPtr* _vtable;
    public int value;
}
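The layout assumption behind this struct can be sanity-checked from the C++ side. The sketch below is not from the original code: SimpleClass is a trimmed stand-in, and SimpleClassMirror and ReadValueThroughMirror are hypothetical names. It assumes what common compilers do for a single-inheritance class (vtable pointer first, then the fields). The C++ standard does not guarantee that layout, so treat this as a probe rather than a promise.

```cpp
#include <cassert>
#include <cstring>

// Trimmed stand-in for the native class: one int plus virtual functions.
class SimpleClass {
public:
    int value;
    explicit SimpleClass(int v) : value(v) {}
    virtual void V0() {}
    virtual void V1(int) {}
};

// Hand-written mirror of the ASSUMED layout: vtable pointer, then the int.
struct SimpleClassMirror {
    void* vtable;
    int value;
};

// If the sizes differ, the mirror is certainly wrong for this compiler.
static_assert(sizeof(SimpleClassMirror) == sizeof(SimpleClass),
              "layout assumption does not hold on this compiler");

// Copies the object's bytes into the mirror and reads 'value' from there,
// which is essentially what the C# struct does through a pointer.
int ReadValueThroughMirror(const SimpleClass& obj) {
    SimpleClassMirror mirror;
    std::memcpy(&mirror, static_cast<const void*>(&obj), sizeof(mirror));
    assert(mirror.vtable != nullptr); // an object with virtuals has a vtable pointer
    return mirror.value;
}
```

If the value read through the mirror matches the value stored in the object, the struct definition is at least consistent with the compiler in use.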
Next,
I am going to define a C# class that wraps the native class and mimics it. I
want to expose synchronous destruction, so the C# equivalent of that is
implementing IDisposable, which I do here. I also create a matching constructor
and the “M1” method of CSimpleClass. I use “DllImport” to specify the DLL name,
entrypoint, and calling convention. The “ThisCall” convention is the default for
C++ member functions.
(Note: to be safer, I should
explicitly specify calling conventions and structure packing in my native code,
but that is left out for brevity. If they aren’t explicitly specified, compiler
options can change the defaults.)
There are calls in the code below to Memory.Alloc and Memory.Free. I implemented
these myself; they just forward to HeapAlloc and HeapFree in kernel32.dll.
public unsafe class CSimpleClass : IDisposable
{
    private __CSimpleClass* _cpp;

    // CSimpleClass constructor and destructor
    [DllImport("cppexp.dll", EntryPoint = "??0CSimpleClass@@QAE@H@Z",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Constructor(__CSimpleClass* ths, int value);

    [DllImport("cppexp.dll", EntryPoint = "??1CSimpleClass@@QAE@XZ",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Destructor(__CSimpleClass* ths);

    // void M1();
    [DllImport("cppexp.dll", EntryPoint = "?M1@CSimpleClass@@QAEXXZ",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern void _M1(__CSimpleClass* ths);

    public CSimpleClass(int value)
    {
        //Allocate storage for object
        _cpp = (__CSimpleClass*)Memory.Alloc(sizeof(__CSimpleClass));
        //Call constructor
        _CSimpleClass_Constructor(_cpp, value);
    }

    public void Dispose()
    {
        //call destructor
        _CSimpleClass_Destructor(_cpp);
        //release memory
        Memory.Free(_cpp);
        _cpp = null;
    }

    public void M1()
    {
        _M1(_cpp);
    }
}
So, at
this point I can create a CSimpleClass in C# and call the “M1” method like this.
The “using” statement defines a scope. At the end of the scope, Dispose() will
automatically be called on sc.
static void Main(string[] args)
{
    CSimpleClass sc = new CSimpleClass(10);
    using (sc)
    {
        //M1 calls all of the virtual functions V0,V1,V2
        sc.M1();
    }
}
Running this code gives me the following output on the console. M1 calls each of the
virtual functions V0, V1, and V2.
C++/CSimpleClass::M1()
C++/CSimpleClass::V0()
C++/CSimpleClass::V1(10)
C++/CSimpleClass::V2()
OK,
this is pretty cool, right? That’s what I was thinking anyway. A couple of days
later, I picked up this code again and started thinking that it would be really
cool if I could override a virtual function. I’ve already got the vtable pointer
in my __CSimpleClass structure. I know that the vtable pointer points to an
array of function pointers, at least in the simple single inheritance case.
(Multiple inheritance and virtual inheritance can add some significant wrinkles
to this.) If I can change a function in the vtable, then I’ve overridden it. The
vtables themselves are shared by all instances of a class, so I can’t just go
pound a slot in the vtable with my own function pointer. I need to actually
create my own vtable.
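To make the idea concrete, here is a small model of it in plain C++. It deliberately does not touch a real compiler-generated vtable. Object, Slot, kBaseVtable, and DemoOverride are all made up for this sketch, and the "vtable" is just an ordinary array of function pointers. What it shows is the copy-and-patch step: the shared table is left alone, one instance gets a private copy with a single slot replaced, and the original pointer is saved so it can be put back.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A hand-rolled model: the "vtable" here is an ordinary array, not the real thing.
struct Object;
using Slot = void (*)(Object* self);

struct Object {
    const Slot* vtable;  // first field: pointer to an array of function pointers
    int value;
    std::string log;     // records which implementations ran
};

void BaseV0(Object* self) { self->log += "Base::V0;"; }
void BaseV2(Object* self) { self->log += "Base::V2;"; }
void OverrideV2(Object* self) { self->log += "Override::V2;"; }

// Shared by every instance, like a real class vtable.
const Slot kBaseVtable[2] = { BaseV0, BaseV2 };

// Non-virtual method that dispatches through whatever table the object points at.
void M1(Object* self) {
    self->vtable[0](self);
    self->vtable[1](self);
}

std::string DemoOverride() {
    Object plain{ kBaseVtable, 10, "" };
    Object patched{ kBaseVtable, 10, "" };

    // Copy the shared table, replace one slot, and repoint only this instance.
    std::vector<Slot> privateTable(kBaseVtable, kBaseVtable + 2);
    privateTable[1] = OverrideV2;
    const Slot* saved = patched.vtable;
    assert(saved == kBaseVtable);
    patched.vtable = privateTable.data();

    M1(&plain);    // still uses the shared table
    M1(&patched);  // uses the private copy

    patched.vtable = saved;  // restore before the private table goes away
    return plain.log + "|" + patched.log;
}
```

The untouched instance still runs both base functions, while the patched one runs the replacement in its second slot.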
I need to construct an array of native pointers to my virtual method overrides and
replace the vtable pointer with a pointer to my vtable. As it turns out, the .NET
libraries provide a mechanism to implement callbacks from native code. This is
Marshal.GetFunctionPointerForDelegate, and it works just fine for our needs.
First of all, we need to use DllImport to get
access to the virtual functions we are overriding. This is just like what we did
to access the M1 method above. The example below only shows the code for V1, but
we actually need it for V0 and V2 as well. I chose V1 for the example as it is
the only virtual that takes a parameter. The others take no
arguments.
[DllImport("cppexp.dll", EntryPoint = "?V1@CSimpleClass@@UAEXH@Z",
    CallingConvention = CallingConvention.ThisCall)]
private static extern void _V1(__CSimpleClass* ths, int i);
Now, we need to implement our override in the managed version of CSimpleClass. It
simply forwards to the _V1 that we defined above, which is a direct call to the
native version in cppexp.dll.
public virtual void V1(int i)
{
    _V1(_cpp, i);
}
The tricky part is to get our new virtual function V1 into the vtable. This can be
done by creating a delegate in our class. We declare a delegate and specify an
instance of it. Again, we need to do this for V0 and V2 as well.
public delegate void V1_Delegate(int i);
public V1_Delegate _v1_Delegate;
In our C# CSimpleClass constructor, we need to create the delegates, use
Marshal.GetFunctionPointerForDelegate for each delegate, put them into an array,
and override the vtable pointer in the native class. We remember the old vtable
pointer as well, so that we can reset it in the Dispose method to the old value.
C++ differs from C# in this regard in that as an object is constructed, its
vtable pointer will change to match the level in the inheritance hierarchy.
If you look closely, you will see two other helper functions that I’ve defined:
InitVtable and ResetVtable. InitVtable does the work of copying the function
pointers from the managed array into some native memory and then patching the
vtable of the object. ResetVtable puts the old vtable pointer back and frees the
memory of the created vtable.
In C++, a single copy of the vtable is shared by all instances of a class, but
here we create a unique vtable for each instance. This is needed as the delegates
encompass the actual managed object itself rather than just a pointer to a
method that takes a “this” pointer. We don’t actually use the “this” pointer
that is passed to us from native code as the delegate implicitly knows the
managed object and the managed object contains a pointer to the native
object. Here is what the final class looks like.
public unsafe class CSimpleClass : IDisposable
{
    private __CSimpleClass* _cpp;
    private IntPtr* _oldvtbl;

    private void InitVtable(__CSimpleClass* ths, IntPtr[] arr, int len)
    {
        IntPtr* newvtable = (IntPtr*)Memory.Alloc(len * sizeof(IntPtr));
        for (int i = 0; i < len; i++)
            newvtable[i] = arr[i];
        _oldvtbl = ths->_vtable;
        ths->_vtable = newvtable;
    }

    private void ResetVtable(__CSimpleClass* ths)
    {
        IntPtr* oldvtbl = ths->_vtable;
        ths->_vtable = _oldvtbl;
        Memory.Free(oldvtbl);
    }

    // CSimpleClass constructor and destructor
    [DllImport("cppexp.dll", EntryPoint = "??0CSimpleClass@@QAE@H@Z",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Constructor(__CSimpleClass* ths, int value);

    [DllImport("cppexp.dll", EntryPoint = "??1CSimpleClass@@QAE@XZ",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Destructor(__CSimpleClass* ths);

    // void M1();
    // virtual void V0();
    // virtual void V1(int x);
    // virtual void V2();
    [DllImport("cppexp.dll", EntryPoint = "?M1@CSimpleClass@@QAEXXZ",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern void _M1(__CSimpleClass* ths);

    [DllImport("cppexp.dll", EntryPoint = "?V0@CSimpleClass@@UAEXXZ",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern void _V0(__CSimpleClass* ths);

    [DllImport("cppexp.dll", EntryPoint = "?V1@CSimpleClass@@UAEXH@Z",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern void _V1(__CSimpleClass* ths, int i);

    [DllImport("cppexp.dll", EntryPoint = "?V2@CSimpleClass@@UAEXXZ",
        CallingConvention = CallingConvention.ThisCall)]
    private static extern void _V2(__CSimpleClass* ths);

    public delegate void V0_Delegate();
    public delegate void V1_Delegate(int i);
    public delegate void V2_Delegate();

    public V0_Delegate _v0_Delegate;
    public V1_Delegate _v1_Delegate;
    public V2_Delegate _v2_Delegate;

    public CSimpleClass(int value)
    {
        //Allocate storage for object
        _cpp = (__CSimpleClass*)Memory.Alloc(sizeof(__CSimpleClass));
        //Call constructor
        _CSimpleClass_Constructor(_cpp, value);

        //Create delegates for the virtual functions
        _v0_Delegate = new V0_Delegate(V0);
        _v1_Delegate = new V1_Delegate(V1);
        _v2_Delegate = new V2_Delegate(V2);

        IntPtr[] arr = new IntPtr[3];
        arr[0] = Marshal.GetFunctionPointerForDelegate(_v0_Delegate);
        arr[1] = Marshal.GetFunctionPointerForDelegate(_v1_Delegate);
        arr[2] = Marshal.GetFunctionPointerForDelegate(_v2_Delegate);

        //Create a new vtable and replace it in the object
        InitVtable(_cpp, arr, 3);
    }

    public void Dispose()
    {
        //reset old vtable pointer
        ResetVtable(_cpp);
        //call destructor
        _CSimpleClass_Destructor(_cpp);
        //release memory
        Memory.Free(_cpp);
        _cpp = null;
    }

    public void M1()
    {
        _M1(_cpp);
    }

    public virtual void V0()
    {
        _V0(_cpp);
    }

    public virtual void V1(int i)
    {
        _V1(_cpp, i);
    }

    public virtual void V2()
    {
        _V2(_cpp);
    }
}
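The remark above about a C++ object's vtable pointer changing during construction is easy to observe in plain C++. The sketch below is independent of the DLL example: Base, Derived, g_trace, and TraceLifetime are hypothetical names used only here. A virtual call made inside a constructor or destructor resolves to the class whose constructor or destructor is currently running, not to the most derived class.

```cpp
#include <cassert>
#include <string>

std::string g_trace;  // collects what each virtual call resolved to

struct Base {
    Base() { g_trace += "ctor:" + Name() + ";"; }
    virtual ~Base() { g_trace += "dtor:" + Name() + ";"; }
    virtual std::string Name() const { return "Base"; }
};

struct Derived : Base {
    Derived() { g_trace += "ctor:" + Name() + ";"; }
    ~Derived() override { g_trace += "dtor:" + Name() + ";"; }
    std::string Name() const override { return "Derived"; }
};

// Constructs and destroys one Derived, returning the recorded trace.
std::string TraceLifetime() {
    g_trace.clear();
    { Derived d; }  // scope ends here, so both destructors run
    assert(!g_trace.empty());
    return g_trace;
}
```

While the Base constructor and destructor run, Name() resolves to the Base version, which is the "level in the inheritance" behavior described above. That is the reason I put the original vtable pointer back before calling the native destructor.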
We have a
managed CSimpleClass with virtual methods that can be overridden in a derived
class. If we create a new C# class that inherits from CSimpleClass, we can
override any virtual functions. In CSimpleClassEx, we are overriding V2 and
writing out some text.
class CSimpleClassEx : CSimpleClass
{
    public CSimpleClassEx(int value)
        : base(value)
    {
    }

    public override void V2()
    {
        Console.WriteLine("C#/CSimpleClassEx.V2()");
    }
}
If we create an instance of CSimpleClassEx and call M1, we now get the following
output.
C++/CSimpleClass::M1()
C++/CSimpleClass::V0()
C++/CSimpleClass::V1(10)
C#/CSimpleClassEx.V2()
So,
what do you think? Is it really inheritance? Is it just a stupid trick? It
definitely requires a lot of manual code writing to make this work, but let’s do
some blue sky thinking for a bit. It is easy to get a list of exports from the
DLL, and the mangled names encapsulate a good bit of information including
calling convention, name, return type, parameters. I could probably write a tool
to generate this. And if I have the PDB, I could get the structure of the class
including data members, structure packing, etc.
Now, back to working on C++ IDE performance and scalability for Visual Studio 2010.