Wednesday, August 10, 2005 2:17 AM bart

About functional programming in C# 2.0, anonymous delegates, generics and even more

Sriram Krishnan blogs about using the anonymous delegates in C# 2.0 to implement a form of currying in C#. You can find it over here: http://blogs.msdn.com/sriram/archive/2005/08/07/448722.aspx. I guess this deserves a little (???) explanation...

So first, what are these anonymous delegates? Let's show a little example of this. We'll start with C# 1.x code:

using System;
using System.Threading;

class Demo
{
   public void Do()
   {
      //thread's background work code
   }

   public static void Main()
   {
      Thread t = new Thread(new ThreadStart(Do));
      t.Start();
      //...
   }
}

This piece of code uses the ThreadStart delegate to point to function Do in order to create a thread that can be started for background processing of the method Do. You can think of similar examples with event handlers for various kinds of objects (e.g. a FileSystemWatcher, a Button control, etc). The point I want to make is that you really need an existing method to point at in order to instantiate the delegate. So, we need to have a method Do that matches the signature of the Do delegate, which is: void return type + no parameters.

In C# 2.0 it's allowed to do the following:

using System;
using System.Threading;

class Demo
{
   public static void Main()
   {
      Thread t = new Thread(
        
new ThreadStart(
            delegate () {
                //thread's background work code
            }
         )
      );

      t.Start();
      //...
   }
}

or even this

using System;
using System.Threading;

class Demo
{
   public static void Main()
   {
      Thread t = new Thread(
         delegate () {
             //thread's background work code
         }
      );

      t.Start();
      //...
   }
}

This is just a very simple example to show the basic idea of these anonymous delegates. For the full syntactical sugar, check out the C# 2.0 Specification documents. More interesting for us is the underlying IL to see how this works. We'll kick off with the classic Thread t = new Thread(new ThreadStart(Do)); approach's IL compilation result.

<Note>
Yesterday I received a comment of an interested reader on my IL adventures asking to dive a little deeper into the meaning of the various instructions. In this post I aim to do so, but for geeks I want to recommend the book
"The Common Language Infrastructure Annotated Standard" part "6. Partition III: CIL instruction Set" pages 473-600.
</Note>

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       24 (0x18)
  .maxstack  8
  IL_0000:  ldnull                                                                                                           //load null, used for further newobj call that requires two parameters of which we only use one
  IL_0001:  ldftn      void AnonDel::Do()                                                                                    //Thread t = new Thread(new ThreadStart(Do));
  IL_0007:  newobj     instance void [mscorlib]System.Threading.ThreadStart::.ctor(object,
                                                                                   native int)                               //Thread t = new Thread(new ThreadStart(Do));
  IL_000c:  newobj     instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart)  //Thread t = new Thread(new ThreadStart(Do));
  IL_0011:  call       instance void [mscorlib]System.Threading.Thread::Start()                                              //t.Start();
  IL_0016:  nop
  IL_0017:  ret
} // end of method AnonDel::Main

Next, consider the alternative C# approach using the new Thread(delegate() {Console.WriteLine("Test");}).Start(); syntax:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       43 (0x2b)
  .maxstack  8
  IL_0000:  ldsfld     class [mscorlib]System.Threading.ThreadStart AnonDel::'<>9__CachedAnonymousMethodDelegate1'          //load ThreadStart delegate from field <>9_CachedAnonymousDelegate1
  IL_0005:  brtrue.s   IL_001a                                                                                              //do we have the delegate instance already? if yes, continue at IL_001a; otherwise, fall through
  IL_0007:  ldnull                                                                                                          //null value needed for further constructor call with non-used parameter
  IL_0008:  ldftn      void AnonDel::'b__0'()                                                                               //load function pointer to b__0 (aha, there still is a function pointer)
  IL_000e:  newobj     instance void [mscorlib]System.Threading.ThreadStart::.ctor(object,
                                                                                   native int)                              //instantiate the ThreadStart delegate using the function pointer
  IL_0013:  stsfld     class [mscorlib]System.Threading.ThreadStart AnonDel::'<>9__CachedAnonymousMethodDelegate1'          //store the instance in the <>9_CachedAnonymousDelegate1 field
  IL_0018:  br.s       IL_001a                                                                                              //up to the next statement
  IL_001a:  ldsfld     class [mscorlib]System.Threading.ThreadStart AnonDel::'<>9__CachedAnonymousMethodDelegate1'          //load ThreadStart delegate from field <>9_CachedAnonymousDelegate1
  IL_001f:  newobj     instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart) //create a new Thread instance using our ThreadStart instance
  IL_0024:  call       instance void [mscorlib]System.Threading.Thread::Start()                                             //and finally, start the thread
  IL_0029:  nop
  IL_002a:  ret
} // end of method AnonDel::Main

The stsfld and ldsfld instructions are used to store to and load from a field respectively. The field being used in the code above is the following one:

.field private static class [mscorlib]System.Threading.ThreadStart '<>9__CachedAnonymousMethodDelegate1'

As you can see, there still is a method called b__0 that's being passed to the ThreadStart constructor. This method's IL is the following:

.method private hidebysig static void  'b__0'() cil managed
{
  // Code size       12 (0xc)
  .maxstack  8
  IL_0000:  ldstr      "Test"
  IL_0005:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_000a:  nop
  IL_000b:  ret
} // end of method AnonDel::'b__0'

No tricks involved over here...

Now, what's a closure? Allow me to point to some useful resources for this, being Martin Fowler's blog and Wikipedia. Simply stated, a closure is a way to pass a piece of code as an argument to a function in order to use it over there for further processing. But there's more, as a closure in its strict meaning can be parameterized, meaning that you can do something like this (however, see further - also marked in red -, anonymous methods are not closures but to some extent they have the feeling to be closures):

Adder CreateAdder(int a)
{
   return delegate(int b) { return a + b; };
}

This way, you can call the CreateAdder method and construct a parameterized piece of code that's returned through a delegate for further usage. For example:

Adder f = CreateAdder(2);
for (int i = 0; i < 10; i++)
   Console.Write("{0} ", f(i));

will print:

2 3 4 5 6 7 8 9 10 11

This is still pretty straightforward, but assume the following:

Accumulator CreateAccumulator(int start)
{
   int sum = start;
   return delegate(int b) { sum += b; return sum; };
}

Now, without compiling, try to guess what will appear on the screen using this code:

Accumulator g = CreateAccumulator(2);
for (int i = 0; i < 10; i++)
   Console.Write("{0} ", g(i));

If you think you've found out, here's the complete code which you can compile with C# 2.0:

using System;

class Accu
{
   delegate RETURN Adder(PARAM a);
   delegate RETURN Accumulator(PARAM a);

   static Adder CreateAdder(int a)
   {
      return delegate(int b) { return a + b; };
   }

   static Accumulator CreateAccumulator(int start)
   {
      int sum = start;
      return delegate(int b) { sum += b; return sum; };
   }

   public static void Main()
   {
      Adder f = CreateAdder(2);
      for (int i = 0; i < 10; i++)
         Console.Write("{0} ", f(i));

      Console.WriteLine();

      Accumulator g = CreateAccumulator(2);
      for (int i = 0; i < 10; i++)
         Console.Write("{0} ", g(i));
   }
}

As an addition, you can use generics on the methods CreateAdder and CreateAccumulator too in order to specify the type of the underlying Adder and Accumulator. Try this as a little exercise :-). The application as shown above should produce the following output:

C:\temp>accu
2 3 4 5 6 7 8 9 10 11
2 3 5 8 12 17 23 30 38 47

So, as you can see, the second row of numbers is really accumulating (an even better example would be a demo of the mathematical faculty operator "!", which just requires a little change in the code, try it yourself).

What's import to see however is that we have returned an anonymous delegate but we need to keep track of the sum variable, so where is this hold? The answer is that at runtime a heap-allocated frame is constructed to hold all of the data needed for that frame (called an "activation").

To keep good habits going on, let's dive into the IL behind this little sample application to see what's going on. We'll start with the Main method, shown below an annotated in green:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       101 (0x65)
  .maxstack  3
  .locals init (class Accu/'Adder`2' V_0,                                         //place to hold local variable for the adder
           int32 V_1,                                                             //counter value for loop --> for (int i = 0; i < 10; i++)
           class Accu/'Accumulator`2' V_2,                                        //place to hold local variable for the accumulator
           bool V_3)                                                              //condition check boolean value for loop --> for (int i = 0; i < 10; i++)
  IL_0000:  ldc.i4.2                                                              //Adder = CreateAdder(2)
  IL_0001:  call       class Accu/'Adder`2' Accu::CreateAdder(int32)              //Adder = CreateAdder(2)
  IL_0006:  stloc.0                                                               //Adder = CreateAdder(2)
  IL_0007:  ldc.i4.0                                                              //for (int i = 0; i < 10; i++)
  IL_0008:  stloc.1                                                               //for (int i = 0; i < 10; i++)
  IL_0009:  br.s       IL_0026                                                    //jump to loop condition check
  IL_000b:  ldstr      "{0} "                                                     //Console.WriteLine("{0} ", f(i));
  IL_0010:  ldloc.0                                                               //Console.WriteLine("{0} ", f(i));
  IL_0011:  ldloc.1                                                               //Console.WriteLine("{0} ", f(i));
  IL_0012:  callvirt   instance !1 class Accu/'Adder`2'::Invoke(!0)               //Console.WriteLine("{0} ", f(i));
  IL_0017:  box        [mscorlib]System.Int32                                     //boxing of int parameter
  IL_001c:  call       void [mscorlib]System.Console::Write(string,
                                                            object)               //Console.WriteLine("{0} ", f(i));
  IL_0021:  nop
  IL_0022:  ldloc.1                                                               //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
  IL_0023:  ldc.i4.1                                                              //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
  IL_0024:  add                                                                   //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
  IL_0025:  stloc.1                                                               //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
  IL_0026:  ldloc.1                                                               //for (int i = 0; i < 10; i++)
  IL_0027:  ldc.i4.s   10                                                         //for (int i = 0; i < 10; i++)
  IL_0029:  clt                                                                   //for (int i = 0; i < 10; i++)
  IL_002b:  stloc.3                                                               //store boolean value for i < 10 --> for (int i = 0; i < 10; i++)
  IL_002c:  ldloc.3                                                               //load boolean value for i < 10 --> for (int i = 0; i < 10; i++)
  IL_002d:  brtrue.s   IL_000b                                                    //loop, goto next round
  IL_002f:  call       void [mscorlib]System.Console::WriteLine()                 //Console.WriteLine();
  IL_0034:  nop
  IL_0035:  ldc.i4.2
  IL_0036:  call       class Accu/'Accumulator`2' Accu::CreateAccumulator(int32)
  IL_003b:  stloc.2
  IL_003c:  ldc.i4.0
  IL_003d:  stloc.1
  IL_003e:  br.s       IL_005b
  IL_0040:  ldstr      "{0} "
  IL_0045:  ldloc.2
  IL_0046:  ldloc.1
  IL_0047:  callvirt   instance !1 class Accu/'Accumulator`2'::Invoke(!0)
  IL_004c:  box        [mscorlib]System.Int32
  IL_0051:  call       void [mscorlib]System.Console::Write(string,
                                                            object)
  IL_0056:  nop
  IL_0057:  ldloc.1
  IL_0058:  ldc.i4.1
  IL_0059:  add
  IL_005a:  stloc.1
  IL_005b:  ldloc.1
  IL_005c:  ldc.i4.s   10
  IL_005e:  clt
  IL_0060:  stloc.3
  IL_0061:  ldloc.3
  IL_0062:  brtrue.s   IL_0040
  IL_0064:  ret
} // end of method Accu::Main

No magic going on in here, we're just calling the Create* methods with the right parameters (the explanation for lines IL_0035 to IL_0062 is completely analogue to the explanation of lines IL_0000 to IL_002f). Let's concentrate on the Accumulator now, so open up the CreateAccumulator static method. Remember that on entry of this method we have one argument passed in (see IL_0000 of Main's IL).

.method private hidebysig static class Accu/'Accumulator`2'
        CreateAccumulator(int32 start) cil managed
{
  // Code size       30 (0x1e)
  .maxstack  3
  .locals init (class Accu/'<>c__DisplayClass4' V_0,                                              //place to keep the instance of <>c__DisplayClass4 (see further)
           class Accu/'Accumulator`2' V_1)                                                        //place to keep the instance of Accumulator`2 (see further)
  IL_0000:  newobj     instance void Accu/'<>c__DisplayClass4'::.ctor()                           //create an instance of <>c__DisplayClass4; no parameters needed
  IL_0005:  stloc.0                                                                               //store the instance of <>c__DisplayClass4 in the .locals' V_0 reserved space
  IL_0006:  ldloc.0                                                                               //load the instance of <>c__DisplayClass4 from the .local' V_0 reserved space
  IL_0007:  ldarg.0                                                                               //load the argument passed in to the CreateAccumulator method (see IL_0000 of Main)
  IL_0008:  stfld      int32 Accu/'<>c__DisplayClass4'::sum                                       //store the argument (on top of the stack, see IL_0007) to the <>c__DisplayClass4 class' sum field
  IL_000d:  ldloc.0                                                                               //load the instance of <>c__DisplayClass4 from the .local' V_0 reserved space
  IL_000e:  ldftn      instance int32 Accu/'<>c__DisplayClass4'::'b__3'(int32)                    //load function pointer to the b__3 method of <>c__DisplayClass4
  IL_0014:  newobj     instance void class Accu/'Accumulator`2'::.ctor(object,
                                                                                    native int)   //create an instance of Accumulator`2 which is a delegate
  IL_0019:  stloc.1                                                                               //store the instance of Accumulator`2 in the .local' V_1 reserved space
  IL_001a:  br.s       IL_001c                                                                    //just go on to the next statement
  IL_001c:  ldloc.1                                                                               //load the instance of Accumulator`2 from the .local' V_1 reserved space
  IL_001d:  ret
} // end of method Accu::CreateAccumulator

So, which code do we recognize in here? Remember the original code was the following:

   static Accumulator CreateAccumulator(int start)
   {
      int sum = start;
      return delegate(int b) { sum += b; return sum; };
   }

On line IL_0008 we find the sum variable field that's being hold by the <>c_DisplayClass4 class:

.field public int32 sum

The code on line IL_0007 simply stores the argument (start) to that location, which clearly is the translation of the line

      int sum = start;

In IL_001d we return what was loaded in line IL_001c, i.e. the V_1 local of type Accumulator`2'. The initialization of this object happens in line IL_0014 based on two parameters: one is the <>c__DisplayClass4 instance and the other one is a pointer to the function b__3 on that same instance. So, what we have is a class called <>c__DisplayClass4 that wraps the sum variable and the logic that's found in:

      return delegate(int b) { sum += b; return sum; };

The rest of CreateAccumulator's code is straightforward. So, what's up in that strange <>c__DisplayClass4 thing? First of all we have a field:

.field public int32 sum

Second, there is a simple constructor without parameters (not displayed in here). Third, there's the method IL for the anonymous delegate used in our code, stored in b__3:

.method public hidebysig instance int32  'b__3'(int32 b) cil managed
{
  // Code size       25 (0x19)
  .maxstack  3
  .locals init (int32 V_0)                                   //place to hold return value for the function temporarily
  IL_0000:  ldarg.0                                          //load the argument of the function call --> delegate(int b)
  IL_0001:  dup                                              //duplicate the argument; we'll need it more than once
  IL_0002:  ldfld      int32 Accu/'<>c__DisplayClass4'::sum  //load the current value of the sum field --> sum += b ==> sum = sum + b
  IL_0007:  ldarg.1                                          //get our (duplicated) argument --> sum += b ==> sum = sum + b
  IL_0008:  add                                              //add the argument and the sum --> sum += b ==> sum = sum + b
  IL_0009:  stfld      int32 Accu/'<>c__DisplayClass4'::sum  //store the result in the sum field --> sum += b ==> sum = sum + b
  IL_000e:  ldarg.0                                          //get our argument
  IL_000f:  ldfld      int32 Accu/'<>c__DisplayClass4'::sum  //load the current value of the sum field (contains accumulated value)
  IL_0014:  stloc.0                                          //store the sum to .local V_0
  IL_0015:  br.s       IL_0017                               //just continue
  IL_0017:  ldloc.0                                          //load the V_0 value --> return sum;
  IL_0018:  ret                                              //return sum;
} // end of method '<>c__DisplayClass4'::'b__3'

Pretty clear I hope? Last but not least, what about the Accumulator`2' class? To show this, open up the MetaInfo view for the assembly using CTRL-M.

<Intermezzo>
Readers of "Applied .NET Framework Programming" (J. Richter) will remember the ildasm.exe /Adv hook to show the View, MetaInfo, Show (CTRL-M) menu in ildasm.exe. When you did launch ildasm.exe without the /Adv flag (which was not mentioned in the ildasm.exe /? output btw :o), there was no way to get the meta info of the assembly. Being one of the crucial contents of an assembly, you might be interested in the assembly's metadata, so this (hidden) option might be intersting. However, in .NET v2.0 the /Adv flag is gone and the View, MetaInfo, Show (CTRL-M) menu appears always. A little mysterious has been demystified :-). Notice that the Help included with ildasm still mentiones an "advanced mode" thingie:

    Statistics (advanced mode only)    - show PE file statistics in a disassembly window
    MetaInfo (advanced mode only)
</Intermezzo>

You should find the following:

TypeDef #3 (02000004)
-------------------------------------------------------
 TypDefName: Accumulator`2  (02000004)
 Flags     : [NestedPrivate] [AutoLayout] [Class] [Sealed] [AnsiClass]  (00000103)
 Extends   : 01000002 [TypeRef] System.MulticastDelegate
 EnclosingClass : Accu (02000002)
 2 Generic Parameters
  (0) GenericParamToken : (2a000003) Name : PARAM flags: 00000000 Owner: 02000004 Kind: 02000000
   1 Constraint(s)
   01000001 
  (1) GenericParamToken : (2a000004) Name : RETURN flags: 00000000 Owner: 02000004 Kind: 02000000
   1 Constraint(s)
   01000001 

 Method #1 (06000009)
 -------------------------------------------------------
  MethodName: .ctor (06000009)
  Flags     : [Public] [HideBySig] [ReuseSlot] [SpecialName] [RTSpecialName] [.ctor]  (00001886)
  RVA       : 0x00000000
  ImplFlags : [Runtime] [Managed]  (00000003)
  CallCnvntn: [DEFAULT]
  hasThis
  ReturnType: Void
  2 Arguments
   Argument #1:  Object
   Argument #2:  I
  2 Parameters
   (1) ParamToken : (0800000a) Name : object flags: [none] (00000000)
   (2) ParamToken : (0800000b) Name : method flags: [none] (00000000)

 Method #2 (0600000a)
 -------------------------------------------------------
  MethodName: Invoke (0600000A)
  Flags     : [Public] [Virtual] [HideBySig] [NewSlot]  (000001c6)
  RVA       : 0x00000000
  ImplFlags : [Runtime] [Managed]  (00000003)
  CallCnvntn: [DEFAULT]
  hasThis
  ReturnType: Var!1
  1 Arguments
   Argument #1:  Var!0
  1 Parameters
   (1) ParamToken : (0800000c) Name : a flags: [none] (00000000)

 Method #3 (0600000b)
 -------------------------------------------------------
  MethodName: BeginInvoke (0600000B)
  Flags     : [Public] [Virtual] [HideBySig] [NewSlot]  (000001c6)
  RVA       : 0x00000000
  ImplFlags : [Runtime] [Managed]  (00000003)
  CallCnvntn: [DEFAULT]
  hasThis
  ReturnType: Class System.IAsyncResult
  3 Arguments
   Argument #1:  Var!0
   Argument #2:  Class System.AsyncCallback
   Argument #3:  Object
  3 Parameters
   (1) ParamToken : (0800000d) Name : a flags: [none] (00000000)
   (2) ParamToken : (0800000e) Name : callback flags: [none] (00000000)
   (3) ParamToken : (0800000f) Name : object flags: [none] (00000000)

 Method #4 (0600000c)
 -------------------------------------------------------
  MethodName: EndInvoke (0600000C)
  Flags     : [Public] [Virtual] [HideBySig] [NewSlot]  (000001c6)
  RVA       : 0x00000000
  ImplFlags : [Runtime] [Managed]  (00000003)
  CallCnvntn: [DEFAULT]
  hasThis
  ReturnType: Var!1
  1 Arguments
   Argument #1:  Class System.IAsyncResult
  1 Parameters
   (1) ParamToken : (08000010) Name : result flags: [none] (00000000)

The parts indicated in blog show what's relevant to us. Basically, this class is just a subclass of the MulticastDelegate BCL class, which is hidden as a private class in the Accu class and is sealed (no further inheritance possible. Note that the generic parameters are listed as well (2a000003 and 2a000004) as a translation of

   delegate RETURN Accumulator(PARAM a);

This also explains the backquote `2 inside the method name, which indicates the number of generic parameters.

Another example that can be interesting is the use of delegates to filter a list using the FindAll method. The System.Collections.Generic.List class has a method FindAll that accepts a delegate like this:

class List
{
   public List FindAll(Predicate match);
}

The generic Predicate type is a delegate with the following signature:

public delegate bool Predicate (T obj);

In the following demo we'll also use the ForEach method to do an action for every element in the list, i.e. printing the value to the screen. The signature of this delegate (Action) looks as follows:

public delegate void Action (T obj);

Let's do something cool. Assume a List filled with ints that we want to filter for prime numbers. The following code will do:

using System;
using System.Collections.Generic;

class PrimeFilter
{
   public static void Main()
   {
      List<int> l = new List<int>();

      for (int i = 2; i < 100; i++)
         l.Add(i);

      PrintList("Original values", l);

      List<int> p = l.FindAll(
         delegate (int a)
         {
            for (int i = 2; i <= Math.Sqrt(a); i++)
               if (a % i == 0)
                  return false;
            return true;
         }
      );

      PrintList("Primes", p);
   }

   private static void PrintList(string msg, List<int> l)
   {
      Console.WriteLine(msg);
      l.ForEach(
         delegate (int a)
         {
            Console.Write("{0}\t", a);
         }
      );
      Console.WriteLine();
   }
}

Note that this is a very naive implementation of prime search but it will do for the sake of the demo. Compile and run. You should see the following output:

C:\temp>primefilter
Original values
2       3       4       5       6       7       8       9       10      11
12      13      14      15      16      17      18      19      20      21
22      23      24      25      26      27      28      29      30      31
32      33      34      35      36      37      38      39      40      41
42      43      44      45      46      47      48      49      50      51
52      53      54      55      56      57      58      59      60      61
62      63      64      65      66      67      68      69      70      71
72      73      74      75      76      77      78      79      80      81
82      83      84      85      86      87      88      89      90      91
92      93      94      95      96      97      98      99
Primes
2       3       5       7       11      13      17      19      23      29
31      37      41      43      47      53      59      61      67      71
73      79      83      89      97

Well, in my very opinion this is really cool.

To be complete one quote of Brad Adams on anonymous methods versus closures (source):

Anonymous methods are not closures or completions. They are anonymous methods. They allow sharing of variables with the outer method (not copy in, or copy out, or copy in-out, or even by ref).

And indeed, that's all I've been showing in this post, but nevertheless it's cool.

Now, after writing this post for a couple of hours now (you can't believe how fast the time goes by) I won't cover the currying stuff anymore, but you can find information on http://blogs.msdn.com/sriram/archive/2005/08/07/448722.aspx as aforementioned. Currying, named after Haskell Curry (cf. Haskell), allows to transform a function with multiple parameters in a function with only the first parameter left, by kind of "instantiating the function" with the other parameters and returning the "curried function" to the caller. More information can be found on Wikepedia. Have fun with it!

Del.icio.us | Digg It | Technorati | Blinklist | Furl | reddit | DotNetKicks

Filed under: ,

Comments

No Comments