Wednesday, August 10, 2005 2:17 AM
bart
About functional programming in C# 2.0, anonymous delegates, generics and even more
Sriram Krishnan blogs about using the anonymous delegates in C# 2.0 to implement a form of currying in C#. You can find it over here: http://blogs.msdn.com/sriram/archive/2005/08/07/448722.aspx. I guess this deserves a little (???) explanation...
So first, what are these anonymous delegates? Let's show a little example of this. We'll start with C# 1.x code:
using System;
using System.Threading;
class Demo
{
public void Do()
{
//thread's background work code
}
public static void Main()
{
Thread t = new Thread(new ThreadStart(Do));
t.Start();
//...
}
}
This piece of code uses the ThreadStart delegate to point to function Do in order to create a thread that can be started for background processing of the method Do. You can think of similar examples with event handlers for various kinds of objects (e.g. a FileSystemWatcher, a Button control, etc). The point I want to make is that you really need an existing method to point at in order to instantiate the delegate. So, we need to have a method Do that matches the signature of the Do delegate, which is: void return type + no parameters.
In C# 2.0 it's allowed to do the following:
using System;
using System.Threading;
class Demo
{
public static void Main()
{
Thread t = new Thread(
new ThreadStart(
delegate () {
//thread's background work code
}
)
);
t.Start();
//...
}
}
or even this
using System;
using System.Threading;
class Demo
{
public static void Main()
{
Thread t = new Thread(
delegate () {
//thread's background work code
}
);
t.Start();
//...
}
}
This is just a very simple example to show the basic idea of these anonymous delegates. For the full syntactical sugar, check out the C# 2.0 Specification documents. More interesting for us is the underlying IL to see how this works. We'll kick off with the classic Thread t = new Thread(new ThreadStart(Do)); approach's IL compilation result.
<Note>
Yesterday I received a comment of an interested reader on my IL adventures asking to dive a little deeper into the meaning of the various instructions. In this post I aim to do so, but for geeks I want to recommend the book "The Common Language Infrastructure Annotated Standard" part "6. Partition III: CIL instruction Set" pages 473-600.
</Note>
.method public hidebysig static void Main() cil managed
{
.entrypoint
// Code size 24 (0x18)
.maxstack 8
IL_0000: ldnull //load null, used for further newobj call that requires two parameters of which we only use one
IL_0001: ldftn void AnonDel::Do() //Thread t = new Thread(new ThreadStart(Do));
IL_0007: newobj instance void [mscorlib]System.Threading.ThreadStart::.ctor(object,
native int) //Thread t = new Thread(new ThreadStart(Do));
IL_000c: newobj instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart) //Thread t = new Thread(new ThreadStart(Do));
IL_0011: call instance void [mscorlib]System.Threading.Thread::Start() //t.Start();
IL_0016: nop
IL_0017: ret
} // end of method AnonDel::Main
Next, consider the alternative C# approach using the new Thread(delegate() {Console.WriteLine("Test");}).Start(); syntax:
.method public hidebysig static void Main() cil managed
{
.entrypoint
// Code size 43 (0x2b)
.maxstack 8
IL_0000: ldsfld class [mscorlib]System.Threading.ThreadStart AnonDel::'<>9__CachedAnonymousMethodDelegate1' //load ThreadStart delegate from field <>9_CachedAnonymousDelegate1
IL_0005: brtrue.s IL_001a //do we have the delegate instance already? if yes, continue at IL_001a; otherwise, fall through
IL_0007: ldnull //null value needed for further constructor call with non-used parameter
IL_0008: ldftn void AnonDel::'b__0'() //load function pointer to b__0 (aha, there still is a function pointer)
IL_000e: newobj instance void [mscorlib]System.Threading.ThreadStart::.ctor(object,
native int) //instantiate the ThreadStart delegate using the function pointer
IL_0013: stsfld class [mscorlib]System.Threading.ThreadStart AnonDel::'<>9__CachedAnonymousMethodDelegate1' //store the instance in the <>9_CachedAnonymousDelegate1 field
IL_0018: br.s IL_001a //up to the next statement
IL_001a: ldsfld class [mscorlib]System.Threading.ThreadStart AnonDel::'<>9__CachedAnonymousMethodDelegate1' //load ThreadStart delegate from field <>9_CachedAnonymousDelegate1
IL_001f: newobj instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart) //create a new Thread instance using our ThreadStart instance
IL_0024: call instance void [mscorlib]System.Threading.Thread::Start() //and finally, start the thread
IL_0029: nop
IL_002a: ret
} // end of method AnonDel::Main
The stsfld and ldsfld instructions are used to store to and load from a field respectively. The field being used in the code above is the following one:
.field private static class [mscorlib]System.Threading.ThreadStart '<>9__CachedAnonymousMethodDelegate1'
As you can see, there still is a method called b__0 that's being passed to the ThreadStart constructor. This method's IL is the following:
.method private hidebysig static void 'b__0'() cil managed
{
// Code size 12 (0xc)
.maxstack 8
IL_0000: ldstr "Test"
IL_0005: call void [mscorlib]System.Console::WriteLine(string)
IL_000a: nop
IL_000b: ret
} // end of method AnonDel::'b__0'
No tricks involved over here...
Now, what's a closure? Allow me to point to some useful resources for this, being Martin Fowler's blog and Wikipedia. Simply stated, a closure is a way to pass a piece of code as an argument to a function in order to use it over there for further processing. But there's more, as a closure in its strict meaning can be parameterized, meaning that you can do something like this (however, see further - also marked in red -, anonymous methods are not closures but to some extent they have the feeling to be closures):
Adder CreateAdder(int a)
{
return delegate(int b) { return a + b; };
}
This way, you can call the CreateAdder method and construct a parameterized piece of code that's returned through a delegate for further usage. For example:
Adder f = CreateAdder(2);
for (int i = 0; i < 10; i++)
Console.Write("{0} ", f(i));
will print:
2 3 4 5 6 7 8 9 10 11
This is still pretty straightforward, but assume the following:
Accumulator CreateAccumulator(int start)
{
int sum = start;
return delegate(int b) { sum += b; return sum; };
}
Now, without compiling, try to guess what will appear on the screen using this code:
Accumulator g = CreateAccumulator(2);
for (int i = 0; i < 10; i++)
Console.Write("{0} ", g(i));
If you think you've found out, here's the complete code which you can compile with C# 2.0:
using System;
class Accu
{
delegate RETURN Adder(PARAM a);
delegate RETURN Accumulator(PARAM a);
static Adder CreateAdder(int a)
{
return delegate(int b) { return a + b; };
}
static Accumulator CreateAccumulator(int start)
{
int sum = start;
return delegate(int b) { sum += b; return sum; };
}
public static void Main()
{
Adder f = CreateAdder(2);
for (int i = 0; i < 10; i++)
Console.Write("{0} ", f(i));
Console.WriteLine();
Accumulator g = CreateAccumulator(2);
for (int i = 0; i < 10; i++)
Console.Write("{0} ", g(i));
}
}
As an addition, you can use generics on the methods CreateAdder and CreateAccumulator too in order to specify the type of the underlying Adder and Accumulator. Try this as a little exercise :-). The application as shown above should produce the following output:
C:\temp>accu
2 3 4 5 6 7 8 9 10 11
2 3 5 8 12 17 23 30 38 47
So, as you can see, the second row of numbers is really accumulating (an even better example would be a demo of the mathematical faculty operator "!", which just requires a little change in the code, try it yourself).
What's import to see however is that we have returned an anonymous delegate but we need to keep track of the sum variable, so where is this hold? The answer is that at runtime a heap-allocated frame is constructed to hold all of the data needed for that frame (called an "activation").
To keep good habits going on, let's dive into the IL behind this little sample application to see what's going on. We'll start with the Main method, shown below an annotated in green:
.method public hidebysig static void Main() cil managed
{
.entrypoint
// Code size 101 (0x65)
.maxstack 3
.locals init (class Accu/'Adder`2' V_0, //place to hold local variable for the adder
int32 V_1, //counter value for loop --> for (int i = 0; i < 10; i++)
class Accu/'Accumulator`2' V_2, //place to hold local variable for the accumulator
bool V_3) //condition check boolean value for loop --> for (int i = 0; i < 10; i++)
IL_0000: ldc.i4.2 //Adder = CreateAdder(2)
IL_0001: call class Accu/'Adder`2' Accu::CreateAdder(int32) //Adder = CreateAdder(2)
IL_0006: stloc.0 //Adder = CreateAdder(2)
IL_0007: ldc.i4.0 //for (int i = 0; i < 10; i++)
IL_0008: stloc.1 //for (int i = 0; i < 10; i++)
IL_0009: br.s IL_0026 //jump to loop condition check
IL_000b: ldstr "{0} " //Console.WriteLine("{0} ", f(i));
IL_0010: ldloc.0 //Console.WriteLine("{0} ", f(i));
IL_0011: ldloc.1 //Console.WriteLine("{0} ", f(i));
IL_0012: callvirt instance !1 class Accu/'Adder`2'::Invoke(!0) //Console.WriteLine("{0} ", f(i));
IL_0017: box [mscorlib]System.Int32 //boxing of int parameter
IL_001c: call void [mscorlib]System.Console::Write(string,
object) //Console.WriteLine("{0} ", f(i));
IL_0021: nop
IL_0022: ldloc.1 //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
IL_0023: ldc.i4.1 //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
IL_0024: add //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
IL_0025: stloc.1 //for (int i = 0; i < 10; i++) ==> for (int i = 0; i < 10; i = i + 1)
IL_0026: ldloc.1 //for (int i = 0; i < 10; i++)
IL_0027: ldc.i4.s 10 //for (int i = 0; i < 10; i++)
IL_0029: clt //for (int i = 0; i < 10; i++)
IL_002b: stloc.3 //store boolean value for i < 10 --> for (int i = 0; i < 10; i++)
IL_002c: ldloc.3 //load boolean value for i < 10 --> for (int i = 0; i < 10; i++)
IL_002d: brtrue.s IL_000b //loop, goto next round
IL_002f: call void [mscorlib]System.Console::WriteLine() //Console.WriteLine();
IL_0034: nop
IL_0035: ldc.i4.2
IL_0036: call class Accu/'Accumulator`2' Accu::CreateAccumulator(int32)
IL_003b: stloc.2
IL_003c: ldc.i4.0
IL_003d: stloc.1
IL_003e: br.s IL_005b
IL_0040: ldstr "{0} "
IL_0045: ldloc.2
IL_0046: ldloc.1
IL_0047: callvirt instance !1 class Accu/'Accumulator`2'::Invoke(!0)
IL_004c: box [mscorlib]System.Int32
IL_0051: call void [mscorlib]System.Console::Write(string,
object)
IL_0056: nop
IL_0057: ldloc.1
IL_0058: ldc.i4.1
IL_0059: add
IL_005a: stloc.1
IL_005b: ldloc.1
IL_005c: ldc.i4.s 10
IL_005e: clt
IL_0060: stloc.3
IL_0061: ldloc.3
IL_0062: brtrue.s IL_0040
IL_0064: ret
} // end of method Accu::Main
No magic going on in here, we're just calling the Create* methods with the right parameters (the explanation for lines IL_0035 to IL_0062 is completely analogue to the explanation of lines IL_0000 to IL_002f). Let's concentrate on the Accumulator now, so open up the CreateAccumulator static method. Remember that on entry of this method we have one argument passed in (see IL_0000 of Main's IL).
.method private hidebysig static class Accu/'Accumulator`2'
CreateAccumulator(int32 start) cil managed
{
// Code size 30 (0x1e)
.maxstack 3
.locals init (class Accu/'<>c__DisplayClass4' V_0, //place to keep the instance of <>c__DisplayClass4 (see further)
class Accu/'Accumulator`2' V_1) //place to keep the instance of Accumulator`2 (see further)
IL_0000: newobj instance void Accu/'<>c__DisplayClass4'::.ctor() //create an instance of <>c__DisplayClass4; no parameters needed
IL_0005: stloc.0 //store the instance of <>c__DisplayClass4 in the .locals' V_0 reserved space
IL_0006: ldloc.0 //load the instance of <>c__DisplayClass4 from the .local' V_0 reserved space
IL_0007: ldarg.0 //load the argument passed in to the CreateAccumulator method (see IL_0000 of Main)
IL_0008: stfld int32 Accu/'<>c__DisplayClass4'::sum //store the argument (on top of the stack, see IL_0007) to the <>c__DisplayClass4 class' sum field
IL_000d: ldloc.0 //load the instance of <>c__DisplayClass4 from the .local' V_0 reserved space
IL_000e: ldftn instance int32 Accu/'<>c__DisplayClass4'::'b__3'(int32) //load function pointer to the b__3 method of <>c__DisplayClass4
IL_0014: newobj instance void class Accu/'Accumulator`2'::.ctor(object,
native int) //create an instance of Accumulator`2 which is a delegate
IL_0019: stloc.1 //store the instance of Accumulator`2 in the .local' V_1 reserved space
IL_001a: br.s IL_001c //just go on to the next statement
IL_001c: ldloc.1 //load the instance of Accumulator`2 from the .local' V_1 reserved space
IL_001d: ret
} // end of method Accu::CreateAccumulator
So, which code do we recognize in here? Remember the original code was the following:
static Accumulator CreateAccumulator(int start)
{
int sum = start;
return delegate(int b) { sum += b; return sum; };
}
On line IL_0008 we find the sum variable field that's being hold by the <>c_DisplayClass4 class:
.field public int32 sum
The code on line IL_0007 simply stores the argument (start) to that location, which clearly is the translation of the line
int sum = start;
In IL_001d we return what was loaded in line IL_001c, i.e. the V_1 local of type Accumulator`2'. The initialization of this object happens in line IL_0014 based on two parameters: one is the <>c__DisplayClass4 instance and the other one is a pointer to the function b__3 on that same instance. So, what we have is a class called <>c__DisplayClass4 that wraps the sum variable and the logic that's found in:
return delegate(int b) { sum += b; return sum; };
The rest of CreateAccumulator's code is straightforward. So, what's up in that strange <>c__DisplayClass4 thing? First of all we have a field:
.field public int32 sum
Second, there is a simple constructor without parameters (not displayed in here). Third, there's the method IL for the anonymous delegate used in our code, stored in b__3:
.method public hidebysig instance int32 'b__3'(int32 b) cil managed
{
// Code size 25 (0x19)
.maxstack 3
.locals init (int32 V_0) //place to hold return value for the function temporarily
IL_0000: ldarg.0 //load the argument of the function call --> delegate(int b)
IL_0001: dup //duplicate the argument; we'll need it more than once
IL_0002: ldfld int32 Accu/'<>c__DisplayClass4'::sum //load the current value of the sum field --> sum += b ==> sum = sum + b
IL_0007: ldarg.1 //get our (duplicated) argument --> sum += b ==> sum = sum + b
IL_0008: add //add the argument and the sum --> sum += b ==> sum = sum + b
IL_0009: stfld int32 Accu/'<>c__DisplayClass4'::sum //store the result in the sum field --> sum += b ==> sum = sum + b
IL_000e: ldarg.0 //get our argument
IL_000f: ldfld int32 Accu/'<>c__DisplayClass4'::sum //load the current value of the sum field (contains accumulated value)
IL_0014: stloc.0 //store the sum to .local V_0
IL_0015: br.s IL_0017 //just continue
IL_0017: ldloc.0 //load the V_0 value --> return sum;
IL_0018: ret //return sum;
} // end of method '<>c__DisplayClass4'::'b__3'
Pretty clear I hope? Last but not least, what about the Accumulator`2' class? To show this, open up the MetaInfo view for the assembly using CTRL-M.
<Intermezzo>
Readers of "Applied .NET Framework Programming" (J. Richter) will remember the ildasm.exe /Adv hook to show the View, MetaInfo, Show (CTRL-M) menu in ildasm.exe. When you did launch ildasm.exe without the /Adv flag (which was not mentioned in the ildasm.exe /? output btw :o), there was no way to get the meta info of the assembly. Being one of the crucial contents of an assembly, you might be interested in the assembly's metadata, so this (hidden) option might be intersting. However, in .NET v2.0 the /Adv flag is gone and the View, MetaInfo, Show (CTRL-M) menu appears always. A little mysterious has been demystified :-). Notice that the Help included with ildasm still mentiones an "advanced mode" thingie:
Statistics (advanced mode only) - show PE file statistics in a disassembly window
MetaInfo (advanced mode only)
</Intermezzo>
You should find the following:
TypeDef #3 (02000004)
-------------------------------------------------------
TypDefName: Accumulator`2 (02000004)
Flags : [NestedPrivate] [AutoLayout] [Class] [Sealed] [AnsiClass] (00000103)
Extends : 01000002 [TypeRef] System.MulticastDelegate
EnclosingClass : Accu (02000002)
2 Generic Parameters
(0) GenericParamToken : (2a000003) Name : PARAM flags: 00000000 Owner: 02000004 Kind: 02000000
1 Constraint(s)
01000001
(1) GenericParamToken : (2a000004) Name : RETURN flags: 00000000 Owner: 02000004 Kind: 02000000
1 Constraint(s)
01000001
Method #1 (06000009)
-------------------------------------------------------
MethodName: .ctor (06000009)
Flags : [Public] [HideBySig] [ReuseSlot] [SpecialName] [RTSpecialName] [.ctor] (00001886)
RVA : 0x00000000
ImplFlags : [Runtime] [Managed] (00000003)
CallCnvntn: [DEFAULT]
hasThis
ReturnType: Void
2 Arguments
Argument #1: Object
Argument #2: I
2 Parameters
(1) ParamToken : (0800000a) Name : object flags: [none] (00000000)
(2) ParamToken : (0800000b) Name : method flags: [none] (00000000)
Method #2 (0600000a)
-------------------------------------------------------
MethodName: Invoke (0600000A)
Flags : [Public] [Virtual] [HideBySig] [NewSlot] (000001c6)
RVA : 0x00000000
ImplFlags : [Runtime] [Managed] (00000003)
CallCnvntn: [DEFAULT]
hasThis
ReturnType: Var!1
1 Arguments
Argument #1: Var!0
1 Parameters
(1) ParamToken : (0800000c) Name : a flags: [none] (00000000)
Method #3 (0600000b)
-------------------------------------------------------
MethodName: BeginInvoke (0600000B)
Flags : [Public] [Virtual] [HideBySig] [NewSlot] (000001c6)
RVA : 0x00000000
ImplFlags : [Runtime] [Managed] (00000003)
CallCnvntn: [DEFAULT]
hasThis
ReturnType: Class System.IAsyncResult
3 Arguments
Argument #1: Var!0
Argument #2: Class System.AsyncCallback
Argument #3: Object
3 Parameters
(1) ParamToken : (0800000d) Name : a flags: [none] (00000000)
(2) ParamToken : (0800000e) Name : callback flags: [none] (00000000)
(3) ParamToken : (0800000f) Name : object flags: [none] (00000000)
Method #4 (0600000c)
-------------------------------------------------------
MethodName: EndInvoke (0600000C)
Flags : [Public] [Virtual] [HideBySig] [NewSlot] (000001c6)
RVA : 0x00000000
ImplFlags : [Runtime] [Managed] (00000003)
CallCnvntn: [DEFAULT]
hasThis
ReturnType: Var!1
1 Arguments
Argument #1: Class System.IAsyncResult
1 Parameters
(1) ParamToken : (08000010) Name : result flags: [none] (00000000)
The parts indicated in blog show what's relevant to us. Basically, this class is just a subclass of the MulticastDelegate BCL class, which is hidden as a private class in the Accu class and is sealed (no further inheritance possible. Note that the generic parameters are listed as well (2a000003 and 2a000004) as a translation of
delegate RETURN Accumulator(PARAM a);
This also explains the backquote `2 inside the method name, which indicates the number of generic parameters.
Another example that can be interesting is the use of delegates to filter a list using the FindAll method. The System.Collections.Generic.List class has a method FindAll that accepts a delegate like this:
class List
{
public List FindAll(Predicate match);
}
The generic Predicate type is a delegate with the following signature:
public delegate bool Predicate (T obj);
In the following demo we'll also use the ForEach method to do an action for every element in the list, i.e. printing the value to the screen. The signature of this delegate (Action) looks as follows:
public delegate void Action (T obj);
Let's do something cool. Assume a List filled with ints that we want to filter for prime numbers. The following code will do:
using System;
using System.Collections.Generic;
class PrimeFilter
{
public static void Main()
{
List<int> l = new List<int>();
for (int i = 2; i < 100; i++)
l.Add(i);
PrintList("Original values", l);
List<int> p = l.FindAll(
delegate (int a)
{
for (int i = 2; i <= Math.Sqrt(a); i++)
if (a % i == 0)
return false;
return true;
}
);
PrintList("Primes", p);
}
private static void PrintList(string msg, List<int> l)
{
Console.WriteLine(msg);
l.ForEach(
delegate (int a)
{
Console.Write("{0}\t", a);
}
);
Console.WriteLine();
}
}
Note that this is a very naive implementation of prime search but it will do for the sake of the demo. Compile and run. You should see the following output:
C:\temp>primefilter
Original values
2 3 4 5 6 7 8 9 10 11
12 13 14 15 16 17 18 19 20 21
22 23 24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39 40 41
42 43 44 45 46 47 48 49 50 51
52 53 54 55 56 57 58 59 60 61
62 63 64 65 66 67 68 69 70 71
72 73 74 75 76 77 78 79 80 81
82 83 84 85 86 87 88 89 90 91
92 93 94 95 96 97 98 99
Primes
2 3 5 7 11 13 17 19 23 29
31 37 41 43 47 53 59 61 67 71
73 79 83 89 97
Well, in my very opinion this is really cool.
To be complete one quote of Brad Adams on anonymous methods versus closures (source):
Anonymous methods are not closures or completions. They are anonymous methods. They allow sharing of variables with the outer method (not copy in, or copy out, or copy in-out, or even by ref).
And indeed, that's all I've been showing in this post, but nevertheless it's cool.
Now, after writing this post for a couple of hours now (you can't believe how fast the time goes by) I won't cover the currying stuff anymore, but you can find information on http://blogs.msdn.com/sriram/archive/2005/08/07/448722.aspx as aforementioned. Currying, named after Haskell Curry (cf. Haskell), allows to transform a function with multiple parameters in a function with only the first parameter left, by kind of "instantiating the function" with the other parameters and returning the "curried function" to the caller. More information can be found on Wikepedia. Have fun with it!
Del.icio.us |
Digg It |
Technorati |
Blinklist |
Furl |
reddit |
DotNetKicks
Filed under: .NET Framework v2.0, C# 2.0