Thursday, September 07, 2006 2:59 PM bart

Going unsafe - An AddressOf operator in C#

For demonstration purposes, I've been writing some small functions that print the address of a variable in managed code without the aid of a debugger such as WinDbg with SOS (Son of Strike; Strike was the debugger for Lightning, the first codename of the .NET Framework). It's not so straightforward after all: one part (built-in value types) can easily be done using C# only, whereas another part (reference types) requires some Managed C++ work. Let's take a look at the code.

A built-in type: int (System.Int32)

//By-value semantics
static unsafe void AddressOf(int n)
{
   Console.WriteLine("void AddressOf(int n)");

   int* pn = &n;
   byte* p = (byte*)&pn;

   for (int i = sizeof(int*) - 1; i >= 0; i--)
      Console.Write("{0:X2}", p[i]);
   Console.WriteLine();

//<EDIT date="09/10/06" comment="Alternative suggested by Anton">
   string format = "{0:X" + IntPtr.Size * 2 + "}";
   Console.WriteLine(format, (long)&n);
//</EDIT>

   AddressInspector.AddressOf(pn); //MC++
}

//Grow the stack
static unsafe void AddressOf2(int n)
{
   Console.WriteLine("void AddressOf2(int n)");

   AddressOf(n);
}

//By-reference semantics
static unsafe void AddressOf(int* pn)
{
   Console.WriteLine("void AddressOf(int* pn)");

   byte* p = (byte*)&pn;

   for (int i = sizeof(int*) - 1; i >= 0; i--)
      Console.Write("{0:X2}", p[i]);
   Console.WriteLine();

   AddressInspector.AddressOf(pn); //MC++
}

//By-reference semantics
static unsafe void AddressOf(ref int n)
{
   Console.WriteLine("void AddressOf(ref int n)");

   fixed (int* pn = &n)
   {
      //byte* p = (byte*)&pn; //CS0459
      AddressInspector.AddressOf(pn); //MC++
   }
}

Don't worry about the AddressInspector yet; it's written in Managed C++ and will be covered a little further on. Notice the following things:

  • Methods are marked as unsafe. This is required to get the *, & and sizeof operators to work as well as the fixed statement.
  • In order to get this thing compiled, you'll need to enable unsafe code via the project properties, tab Build, Allow unsafe code (command line compilers use the flag /unsafe).
  • Watch the number of address-of operations (& operator) carefully. We need to get the contents of the pointer to the integer, so that's a pointer to a pointer to an integer (or int**). We use the type byte* to get the individual bytes of which the pointer consists.
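To make the pointer-to-pointer idea a bit more tangible, here is a minimal sketch in plain standard C++ (not the Managed C++ used later in this post). The function names pointer_bytes_hex and pointer_value_hex are made up for this illustration. The first one views the pointer variable itself as raw bytes, just like the byte* loop in the listings; the second one formats the pointer as a single integer. The byte-wise version assumes a little-endian machine, which is why it walks the bytes from the highest index down.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Byte-wise dump (hypothetical helper): take the address of the pointer
// variable, view it as raw bytes and emit them from the highest index down.
// Assumes little-endian storage, where the last byte is the most significant.
std::string pointer_bytes_hex(const void* ptr)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&ptr);
    std::string out;
    char buf[3];
    for (std::size_t i = sizeof(ptr); i-- > 0; ) {
        std::snprintf(buf, sizeof(buf), "%02X", static_cast<unsigned>(p[i]));
        out += buf;
    }
    return out;
}

// Integer-wise dump (hypothetical helper): convert the pointer to an integer
// and print it as zero-padded hex, two digits per byte of pointer size.
std::string pointer_value_hex(const void* ptr)
{
    char buf[2 * sizeof(std::uintptr_t) + 1];
    std::snprintf(buf, sizeof(buf), "%0*llX",
                  static_cast<int>(2 * sizeof(void*)),
                  static_cast<unsigned long long>(
                      reinterpret_cast<std::uintptr_t>(ptr)));
    return std::string(buf);
}
```

On a little-endian machine both helpers should produce the same string for the same pointer, for example when called with the address of a local int.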

Notice there is a method called AddressOf2 as well. Its sole purpose is to grow the stack and see the impact of it: because int is a value type, it will get copied every time you invoke a method and pass it as an argument (unless you're using a pointer int* or by-reference semantics ref int).
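The copy-versus-no-copy distinction can be sketched in plain standard C++ as well. The three functions below are hypothetical helpers, not part of the code in this post; each one reports whether the variable it sees is the very same variable as the caller's.

```cpp
// By value: n is a fresh copy, so it cannot share an address with the original.
bool same_variable_by_value(int n, const int* original)
{
    return &n == original;
}

// By pointer: the callee receives the caller's address as-is.
bool same_variable_by_pointer(int* pn, const int* original)
{
    return pn == original;
}

// By reference: n is an alias for the caller's variable, so &n is its address.
bool same_variable_by_reference(int& n, const int* original)
{
    return &n == original;
}
```

Given int x = 1234, the by-value helper called as same_variable_by_value(x, &x) returns false, while the pointer and reference variants return true. This is only an analogy for the C# behavior: a C++ reference never needs pinning, because nothing moves the variable behind your back.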

As you can see, this approach works great for all these scenarios except one: the ref int case. If you were to write the following:

static unsafe void AddressOf(ref int n)
{
   byte* p = (byte*)&n;

   for (int i = sizeof(int*) - 1; i >= 0; i--)
      Console.Write("{0:X2}", p[i]);
   Console.WriteLine();
}

the compiler would complain: "You can only take the address of an unfixed expression in a fixed statement initializer" (CS0212). Basically, this means that n is moveable (a ref int may refer to a field of an object on the managed heap), so we need to "fix" the variable so that the garbage collector won't move it to another location while we're holding on to the address of the old location. See section 18.3 of the C# specification for more info.
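Standard C++ has no garbage collector that relocates objects, so the following is only a loose analogy, but it shows why an address taken from moveable storage is dangerous. A std::vector that grows past its capacity moves its elements to a new buffer, and any address remembered from before the move becomes stale. The function name is made up for this sketch.

```cpp
#include <cstdint>
#include <vector>

// Analogy only: returns true when the first element of a vector ends up at a
// different address after the vector had to reallocate its buffer.
bool first_element_moved_after_growth()
{
    std::vector<int> values(1, 1234);

    // Remember where the element lives now, stored as a plain integer so the
    // stale pointer value is never used as a pointer afterwards.
    const std::uintptr_t before =
        reinterpret_cast<std::uintptr_t>(values.data());

    // Growing past the current capacity forces a new buffer; the elements are
    // moved there, which is the rough counterpart of the GC moving an object.
    values.resize(values.capacity() + 1);

    const std::uintptr_t after =
        reinterpret_cast<std::uintptr_t>(values.data());
    return before != after;
}
```

The fixed statement in C# exists to rule out exactly this kind of surprise for the duration of its block.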

So in order to get this fixed, we need to do something like this:

static unsafe void AddressOf(ref int n)
{
   fixed (int* pn = &n)
   {
      byte* p = (byte*)&pn;

      for (int i = sizeof(int*) - 1; i >= 0; i--)
         Console.Write("{0:X2}", p[i]);
      Console.WriteLine();
   }
}

Again, the C# compiler is unhappy with our piece of code and complains: "Cannot take the address of a read-only local variable" (CS0459). This time the problem is that a pointer declared in a fixed statement is read-only; this makes sense because changing it (pn = somethingElse) would leave it unclear whether pn still refers to fixed memory (it wouldn't).

Time to go unmanaged, so this is the outcome:

//By-reference semantics
static unsafe void AddressOf(ref int n)
{
   Console.WriteLine("void AddressOf(ref int n)");

   fixed (int* pn = &n)
   {
      //byte* p = (byte*)&pn; //CS0459
      AddressInspector.AddressOf(pn); //MC++
   }
}

Let's take a look at the AddressInspector.AddressOf method written in Managed C++.

The managed C++ portion

First of all the header file (AddressInspector.h):

// AddressInspector.h

#pragma once

using namespace System;

public ref class AddressInspector
{
public:
   static void AddressOf(Object^ o);
   static void AddressOf(void* ptr);
};

On to the implementation now (AddressInspector.cpp):

// This is the main DLL file.

#include "stdafx.h"
#include "AddressInspector.h"

void AddressInspector::AddressOf(void* ptr)
{
   char* p = (char*)&ptr;
   for (int i = sizeof(void*) - 1; i >= 0; i--)
      System::Console::Write("{0:X2}", p[i]);
   System::Console::WriteLine();
}

void AddressInspector::AddressOf(Object^ o)
{
   char* p = (char*)&o;
   for (int i = sizeof(void*) - 1; i >= 0; i--)
      System::Console::Write("{0:X2}", p[i]);
   System::Console::WriteLine();
}

Don't worry about the Object^ overload yet, just take a look at the void* one. The idea is pretty simple again, but now we can get the address of the pointer. We've left the strict and pragmatic world of C# and can mess around as much as we want :o.

A reference type: System.String

What about strings? System.String is a reference type as you know and things will look different:

//Copy-by-value or copy-by-reference?
static unsafe void AddressOf(object o)
{
   Console.WriteLine("void AddressOf(object o)");
   //string* pn = &n; //CS0208
   AddressInspector.AddressOf(o); //MC++
}

//Check stack impact
static unsafe void AddressOf2(object o)
{
   Console.WriteLine("void AddressOf2(object o)");
   AddressOf(o);
}

Notice we use the mother of all types, System.Object, to have more flexibility further on: we'll pass in another, non-built-in value type, System.TimeSpan, to see the copy-by-value behavior. Don't forget that System.Object does not imply reference types, which is a common misunderstanding. The int overloads of the AddressOf* methods will still be chosen by the C# compiler as the best overload when we pass an int argument.

As you can see, we need to rely on Managed C++ code once again. This time the compiler refuses to compile the &n portion of the AddressOf code with CS0208 ("Cannot take the address of, get size of, or declare a pointer to a managed type ('string')"). Why? See section 18.2 of the C# spec, which says that "the referent type of a pointer must be an unmanaged-type" (= "not a reference-type and ..."). This is where our Managed C++ AddressInspector::AddressOf(Object^ o) method enters the scene:

void AddressInspector::AddressOf(Object^ o)
{
   char* p = (char*)&o;
   for (int i = sizeof(void*) - 1; i >= 0; i--)
      System::Console::Write("{0:X2}", p[i]);
   System::Console::WriteLine();
}

Again, think carefully about the address-of & usage: o is a reference type and therefore is a pointer; however, we need a pointer to that pointer in order to write out the pointer (simple, isn't it?).
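The difference between where a reference points and where the reference itself lives can be sketched in plain standard C++ too. In the illustration below, StringHandle is a made-up stand-in for a managed reference (an ordinary pointer to a std::string), and both functions are hypothetical helpers. Keep in mind that a real managed reference can be updated by the garbage collector, which a raw pointer never is.

```cpp
#include <string>

// Hypothetical stand-in for a managed reference: a plain pointer to a string.
using StringHandle = const std::string*;

// The handle is passed by value, so `handle` is a copy of the caller's handle.
// Its value, the address of the object it points at, is unchanged by the copy.
bool handle_points_at(StringHandle handle, const std::string* object)
{
    return handle == object;
}

// The copy itself is a separate variable, so its own location differs from
// the location of the caller's handle.
bool handle_copy_lives_at(StringHandle handle, const StringHandle* callers_handle)
{
    return &handle == callers_handle;
}
```

With std::string s = "Bart" and StringHandle h = &s, handle_points_at(h, &s) is true while handle_copy_lives_at(h, &h) is false: copying a handle copies the pointer, not the object.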

Testing it

Time for some testing; let's enter our (C#) Main method:

static unsafe void Main(string[] args)
{
   Console.WriteLine("C# int\n======");
   int n = 1234; //Value type (structure)
   AddressOf(n);
   AddressOf2(n);
   AddressOf(&n);
   AddressOf(ref n);
   Console.WriteLine();

   Console.WriteLine("String\n======");
   string s = "Bart"; //Reference type
   AddressOf(s);
   AddressOf2(s);
   Console.WriteLine();

   Console.WriteLine("TimeSpan\n========");
   TimeSpan ts = TimeSpan.FromSeconds(1.0); //Value type (structure)
   AddressOf(ts);
   AddressOf2(ts);
   Console.WriteLine();
}

The challenge for the readers of this blog is to predict the output of this listing (okay, you can't know the addresses up front, but reason about which AddressOf* calls will yield the same output). Therefore, I won't put the output over here; just think about it and give it a try if you want (create a C# console app and an MC++ CLR class library, add a reference to the MC++ library in the C# console app, copy-paste a few times and compile).

Enjoy!



Comments

# re: Going unsafe - An AddressOf operator in C#

Saturday, September 09, 2006 3:28 AM by Anton Lapounov

There is an easier way to output a pointer value. Instead of

  int* pn = &n;
  byte* p = (byte*)&pn;

  for (int i = sizeof(int*) - 1; i >= 0; i--)
     Console.Write("{0:X2}", p[i]);
  Console.WriteLine();

I would use

  string format = "{0:X" + IntPtr.Size * 2 + "}";
  Console.WriteLine(format, (long)&n);

# re: Going unsafe - An AddressOf operator in C#

Sunday, September 10, 2006 10:56 PM by bart

Great tip Anton! Thanks for reading my blog and providing the valuable feedback.

# .NET 2.0 string interning inside out

Wednesday, September 27, 2006 11:19 AM by B# .NET Blog

Introduction
Time for some cool .NET 2.0 feature that might prove useful in some scenarios: string interning....

# re: Going unsafe - An AddressOf operator in C#

Thursday, October 05, 2006 2:53 AM by Scott

Try String.Intern  :-)

# re: Going unsafe - An AddressOf operator in C#

Thursday, October 05, 2006 12:34 PM by bart

Hi Scott,

That's exactly what was covered in my post on string interning a couple of days ago:

http://community.bartdesmet.net/blogs/bart/archive/2006/09/27/4472.aspx

-Bart
