Thursday, July 06, 2006 10:15 PM bart

C# 2.0 Iterators

Introduction

In my post about LINQ a couple of days ago, I promised to do a deep-dive post on iterators in C# 2.0 (and later). Promise kept, here it is.

So, what’s in a name? Iterators are defined in the C# 2.0 specification, section 22. From the spec we learn the following:

An iterator block is a block (§8.2) that yields an ordered sequence of values. An iterator block is distinguished from a normal statement block by the presence of one or more yield statements.

  • The yield return statement produces the next value of the iteration.
  • The yield break statement indicates that the iteration is complete.

An iterator block may be used as a method-body, operator-body or accessor-body as long as the return type of the corresponding function member is one of the enumerator interfaces (§22.1.1) or one of the enumerable interfaces (§22.1.2).

A few keywords have been marked in bold. We’ll focus on each of these individually in a minute. But let’s make this spec definition concrete with a little example:

using System;
using System.Collections.Generic;

class Test
{
     public static void Main()
     {
          foreach (string s in GetItems())
               Console.WriteLine(s);
     }

     private static IEnumerable<string> GetItems()
     {
          yield return "Hello yield 1";
          yield return "Hello yield 2";
          yield return "Hello yield 3";
          yield return "Hello yield 4";
          yield return "Hello yield 5";
     }
}

In here, the iterator is the method GetItems. Two elements indicate this:

  1. The presence of the yield keyword in the method body.

  2. The use of an IEnumerable<T> return type.

But hang on, where is the object of a type that implements IEnumerable<T>, the one that is supposed to be returned? Enter the powerful world of iterators!

Exercise

What does the following code fragment put on the screen? Think about it for a while and move on to the next section.

using System;
using System.Collections.Generic;

class Test
{
    public static void Main()
    {
        foreach (int i in EvenNumbers())
            Console.WriteLine(i);
    }

    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }
}

It's all about laziness

I’d like to summarize iterators with one simple statement: “iterators are sequence generators”. As such, an iterator is lazy and just sits there idle till a consumer asks it to provide the next element of a sequence. The sample above (exercise) shows this. The method EvenNumbers is a generator for even numbers, that’s clear. At first glance it might look like a method that never stops executing due to the endless for loop (which I made explicit by means of the true condition; there are of course dirtier ways of creating endless loops). But what’s really going on?
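
To make the laziness visible, here is a minimal sketch that records who does what and when. The Log list and the ConsumeUpTo helper are names I made up for this illustration; the iterator body is the one from the exercise with one extra line.

```csharp
using System;
using System.Collections.Generic;

class LazyDemo
{
    // Hypothetical trace of producer and consumer activity.
    public static readonly List<string> Log = new List<string>();

    // The generator from the exercise, plus a trace entry per element.
    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
        {
            Log.Add("generate " + i);
            yield return i;
        }
    }

    // Hypothetical consumer: stops asking once 'limit' has been reached.
    public static void ConsumeUpTo(int limit)
    {
        foreach (int n in EvenNumbers())
        {
            Log.Add("consume " + n);
            if (n >= limit)
                break;
        }
    }

    public static void Main()
    {
        ConsumeUpTo(4);
        foreach (string line in Log)
            Console.WriteLine(line);
        // generate 0, consume 0, generate 2, consume 2, generate 4, consume 4
    }
}
```

The producer and the consumer take turns, and the “endless” loop only runs as far as the consumer pulls it.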

1. Dissecting the foreach construction

Let’s start at the consumer side:

        foreach (int i in EvenNumbers())
            Console.WriteLine(i);

As a matter of fact, the foreach loop construction is built around the IEnumerable and IEnumerable<T> interfaces (C# spec, sections 8.8.4 (non-generic) and 20.8.10 (generic)). The spec states that the foreach statement shown above is the equivalent of the following (cf. 20.8.10):

        IEnumerator<int> enumerator = ((IEnumerable<int>)(EvenNumbers())).GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int element = (int)enumerator.Current; //notice the cast isn’t required
                Console.WriteLine(element);
            }
        }
        finally
        {
            enumerator.Dispose(); //see note below on the absence of a null-check
        }

Of course you can validate this by using the much beloved ildasm tool (source compiled with /o flag):

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  .locals init (int32 V_0,
           class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> V_1)
  IL_0000:  call       class [mscorlib]System.Collections.Generic.IEnumerable`1<int32> Test::EvenNumbers()
  IL_0005:  callvirt   instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
  IL_000a:  stloc.1
  .try
  {
    IL_000b:  br.s       IL_001a
    IL_000d:  ldloc.1
    IL_000e:  callvirt   instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
    IL_0013:  stloc.0
    IL_0014:  ldloc.0
    IL_0015:  call       void [mscorlib]System.Console::WriteLine(int32)
    IL_001a:  ldloc.1
    IL_001b:  callvirt   instance bool [mscorlib]System.Collections.IEnumerator::MoveNext()
    IL_0020:  brtrue.s   IL_000d
    IL_0022:  leave.s    IL_002e
  }
  finally
  {
    IL_0024:  ldloc.1
    IL_0025:  brfalse.s  IL_002d
    IL_0027:  ldloc.1
    IL_0028:  callvirt   instance void [mscorlib]System.IDisposable::Dispose()
    IL_002d:  endfinally
  }
  IL_002e:  ret
}

Note: There’s a little more in this code than specified by the official spec. The finally-block contains an additional check to see whether the local V_1 (the IEnumerator<int>) isn’t null (see line IL_0025). You can see this even better when compiling the source without the /o compiler flag (you’ll notice a ldnull – ceq series of statements to perform the null check in that build).

The most important thing to remember for now is that the foreach statement is simply a short form to deal with an IEnumerator to iterate over the sequence (notice I don’t say collection) in a forward-only manner. 
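
As a runnable variant of the expansion above, the following sketch drives the enumerator by hand. The Take helper is a name of my own; it stops after a given number of elements, something the plain foreach over our endless sequence would never do by itself.

```csharp
using System;
using System.Collections.Generic;

class ManualForeach
{
    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }

    // Hypothetical helper: the hand-written foreach equivalent,
    // limited to 'count' elements.
    public static List<int> Take(int count)
    {
        List<int> seen = new List<int>();
        IEnumerator<int> enumerator = EvenNumbers().GetEnumerator();
        try
        {
            // MoveNext advances the sequence, Current reads the element.
            while (seen.Count < count && enumerator.MoveNext())
                seen.Add(enumerator.Current);
        }
        finally
        {
            enumerator.Dispose();
        }
        return seen;
    }

    public static void Main()
    {
        foreach (int n in Take(3))
            Console.WriteLine(n);   // prints 0, 2 and 4
    }
}
```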

2. Behind the scenes of the iterator

Now jump to the iterator’s definition itself:

    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }

When you take a look at the IL of this method, you’ll find the following:

.method public hidebysig static class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>
        EvenNumbers() cil managed
{
  .locals init (class Test/'<EvenNumbers>d__0' V_0)
  IL_0000:  ldc.i4.s   -2
  IL_0002:  newobj     instance void Test/'<EvenNumbers>d__0'::.ctor(int32)
  IL_0007:  stloc.0
  IL_0008:  ldloc.0
  IL_0009:  ret
}

A big unknown type appears – Test/'<EvenNumbers>d__0' – which can be found in the same assembly of course. (Don’t worry about the mysterious -2 parameter passed to the constructor of this unknown type.)

<Intermezzo>

As you can already feel, compilers today are doing much more than compilers a decade ago. More and more easy-to-learn productivity constructs in a language require a complex mapping under the covers.

Other examples include:

  • using-statement: translation into a try-finally block with IDisposable

  • lock-statement:  translation to a try-finally block with Monitor.Enter and Monitor.Exit calls

  • anonymous methods: creation of a “cached anonymous delegate” and another private method

  • events: add and remove handler stuff

  • properties: getter and setter methods

I’m sure you can think of many more (not to speak about late-bound languages such as VB).
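
For the first two items in the list, a rough sketch of what the compiler writes on your behalf could look like this. Resource, Gate and the method names are made up for the illustration, and the lock expansion shown is the C# 2.0 shape (newer compilers emit a slightly different pattern).

```csharp
using System;
using System.Threading;

// Hypothetical disposable type, only here to observe the Dispose call.
class Resource : IDisposable
{
    public bool Disposed;
    public int Uses;
    public void Dispose() { Disposed = true; }
}

class SugarDemo
{
    private static readonly object Gate = new object();
    private static int counter;

    // Roughly what "using (Resource r = new Resource()) { r.Uses++; }" becomes.
    public static Resource UsingExpanded()
    {
        Resource r = new Resource();
        try
        {
            r.Uses++;
        }
        finally
        {
            if (r != null)
                ((IDisposable)r).Dispose();
        }
        return r;
    }

    // Roughly what "lock (Gate) { counter++; }" becomes.
    public static int LockExpanded()
    {
        Monitor.Enter(Gate);
        try
        {
            counter++;
        }
        finally
        {
            Monitor.Exit(Gate);
        }
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(UsingExpanded().Disposed);   // True
        Console.WriteLine(LockExpanded());             // 1
    }
}
```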

</Intermezzo>

Back to the unknown type I was referring to. What’s in a name?

Let’s start with the class definition

.class auto ansi sealed nested private beforefieldinit '<EvenNumbers>d__0'
       extends [mscorlib]System.Object
       implements class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>,
                  [mscorlib]System.Collections.IEnumerable,
                  class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>,
                  [mscorlib]System.Collections.IEnumerator,
                  [mscorlib]System.IDisposable
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 )
}

Notice the type implements two generic interfaces, IEnumerable<T> and IEnumerator<T>. That makes it possible to be used in combination with the foreach statement (see above).

Next, what are the fields of the type?

.field private int32 '<>1__state'
.field private int32 '<>2__current'
.field public int32 '<i>5__1'

The first two, the state and the current field, are the most important ones for now.

On to the constructor

.method public hidebysig specialname rtspecialname
        instance void  .ctor(int32 '<>1__state') cil managed
{
  IL_0000:  ldarg.0
  IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
  IL_0006:  ldarg.0
  IL_0007:  ldarg.1
  IL_0008:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_000d:  ret
}

Nothing special again: the base constructor of System.Object is called and the '<>1__state' field is populated with the value of the constructor’s first (and only) argument.

What about the properties?

Because of the implementation of IEnumerator<int32> and IEnumerator we expect two properties (a generic and a non-generic one), a getter for the current item during the iteration over the sequence:

.property instance int32 'System.Collections.Generic.IEnumerator<System.Int32>.Current'()
{
  .get instance int32 Test/'<EvenNumbers>d__0'::'System.Collections.Generic.IEnumerator<System.Int32>.get_Current'()
}

.property instance object System.Collections.IEnumerator.Current()
{
  .get instance object Test/'<EvenNumbers>d__0'::System.Collections.IEnumerator.get_Current()
}

The corresponding methods look as follows:

.method private hidebysig newslot specialname virtual final
        instance int32  'System.Collections.Generic.IEnumerator<System.Int32>.get_Current'() cil managed
{
  .override  method instance !0 class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>::get_Current()
  IL_0000:  ldarg.0
  IL_0001:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<>2__current'
  IL_0006:  ret
}


.method private hidebysig newslot specialname virtual final
        instance object  System.Collections.IEnumerator.get_Current() cil managed
{
  .override [mscorlib]System.Collections.IEnumerator::get_Current
  IL_0000:  ldarg.0
  IL_0001:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<>2__current'
  IL_0006:  box        [mscorlib]System.Int32
  IL_000b:  ret
}

Basically, the Current property getter returns the '<>2__current' field, which holds the current element of the sequence (see further). Notice the boxing in the non-generic case, as we need to return an object of the “big mother” type System.Object.

<Intermezzo>

Just in case you wonder why there’s an ldarg.0 instruction at the beginning of all these methods… This first (hidden) parameter is the current instance pointer (“this”).

</Intermezzo>

Some less interesting methods

In order to save the best for last, first some less interesting methods:

  • The Reset method (.override [mscorlib]System.Collections.IEnumerator::Reset) just throws a NotSupportedException. Once the iteration has started, you can’t go back to the initial state unless you create a new instance of the enumerator, either by calling the iterator (method) again, or by calling GetEnumerator (see the foreach statement explanation).

  • The Dispose method (.override [mscorlib]System.IDisposable::Dispose) is empty.
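
You can observe the Reset behaviour from the outside without reading any IL. A small sketch (ResetThrows is a helper name of my own):

```csharp
using System;
using System.Collections.Generic;

class ResetDemo
{
    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }

    // Hypothetical helper: reports whether Reset on the
    // compiler-generated enumerator throws NotSupportedException.
    public static bool ResetThrows()
    {
        IEnumerator<int> e = EvenNumbers().GetEnumerator();
        try
        {
            e.MoveNext();   // start the iteration
            e.Reset();      // not supported by iterator-based enumerators
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
        finally
        {
            e.Dispose();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ResetThrows());   // True
    }
}
```

To start over, simply ask for a fresh enumerator instead.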

Getting the enumerator

Now it becomes more and more interesting: the interface IEnumerable has a method called GetEnumerator used to return the corresponding IEnumerator (read this sentence both non-generic and generic please).

<Intermezzo>

Why a getter method and not a property? Because the GetEnumerator – as you’ll see in just a couple of seconds – can be time-consuming. For time-consuming getter actions, you should use a method instead (cf. D.2.1.1 in the Common Language Infrastructure Annotated Standard which states: “Do use a method in the following situations. (…) The operation is expensive (orders of magnitude slower than a field set would be).”).

</Intermezzo>

Time to investigate what’s happening in here… The non-generic one is the least sexy one of both:

.method private hidebysig newslot virtual final
        instance class [mscorlib]System.Collections.IEnumerator
        System.Collections.IEnumerable.GetEnumerator() cil managed
{
  .override [mscorlib]System.Collections.IEnumerable::GetEnumerator
  IL_0000:  ldarg.0
  IL_0001:  call       instance class [mscorlib]System.Collections.Generic.IEnumerator`1<int32> Test/'<EvenNumbers>d__0'::'System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator'()
  IL_0006:  ret
}

So, the hunted secret (of the GetEnumerator anyway) should be in the generic brother method. Before you continue, make sure you understand the crucial role this method plays in respect to the “consumer” (foreach-statement equivalent).

.method private hidebysig newslot virtual final
        instance class [mscorlib]System.Collections.Generic.IEnumerator`1<int32>
        'System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator'() cil managed
{
  .override  method instance class [mscorlib]System.Collections.Generic.IEnumerator`1<!0> class [mscorlib]System.Collections.Generic.IEnumerable`1<int32>::GetEnumerator()
  .locals init (class Test/'<EvenNumbers>d__0' V_0)
  IL_0000:  ldarg.0
  IL_0001:  ldflda     int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0006:  ldc.i4.0
  IL_0007:  ldc.i4.s   -2
  IL_0009:  call       int32 [mscorlib]System.Threading.Interlocked::CompareExchange(int32&,int32,int32)
  IL_000e:  ldc.i4.s   -2
  IL_0010:  bne.un.s   IL_0016
  IL_0012:  ldarg.0
  IL_0013:  stloc.0
  IL_0014:  br.s       IL_001d
  IL_0016:  ldc.i4.0
  IL_0017:  newobj     instance void Test/'<EvenNumbers>d__0'::.ctor(int32)
  IL_001c:  stloc.0
  IL_001d:  ldloc.0
  IL_001e:  ret
}

Wow, pretty complex at first sight isn’t it? Let’s analyze what’s happening:

  IL_0001:  ldflda     int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0006:  ldc.i4.0
  IL_0007:  ldc.i4.s   -2
  IL_0009:  call       int32 [mscorlib]System.Threading.Interlocked::CompareExchange(int32&,int32,int32)

Note: ldflda stands for “Load Field Address” (cf. 4.10 in the Common Language Infrastructure Annotated Standard).

Stack = ..., '<>1__state'&, 0, -2
Call =
System.Threading.Interlocked::CompareExchange

public static int CompareExchange (
    ref int location1,
    int value,
    int comparand
)

Conceptually, these four instructions update the field as follows (C#):

'<>1__state' = ('<>1__state' == -2 ? 0 : '<>1__state')

A threading library method is used because of the need for an atomic compare and exchange operation to ensure correctness. The CompareExchange method always returns the original value of the first operand (in this case '<>1__state'), so the original (state) value ends up on top of the stack:

Stack = ..., '<>1__state'
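
If the semantics of CompareExchange are new to you, this little sketch replays the call with a local variable standing in for the state field (the Try helper is just for illustration):

```csharp
using System;
using System.Threading;

class CompareExchangeDemo
{
    // Returns { value returned by CompareExchange, value of state afterwards }.
    public static int[] Try(int initialState)
    {
        int state = initialState;
        // If state == -2, store 0 in it. Either way, return the old value.
        int original = Interlocked.CompareExchange(ref state, 0, -2);
        return new int[] { original, state };
    }

    public static void Main()
    {
        int[] fresh = Try(-2);
        Console.WriteLine("{0} -> {1}", fresh[0], fresh[1]);   // -2 -> 0
        int[] used = Try(1);
        Console.WriteLine("{0} -> {1}", used[0], used[1]);     // 1 -> 1
    }
}
```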

Let’s continue:

  IL_000e:  ldc.i4.s   -2
  IL_0010:  bne.un.s   IL_0016

  IL_0012:  ldarg.0
  IL_0013:  stloc.0
  IL_0014:  br.s       IL_001d

  IL_0016:  ldc.i4.0
  IL_0017:  newobj     instance void Test/'<EvenNumbers>d__0'::.ctor(int32)
  IL_001c:  stloc.0

The result of the CompareExchange (which is the original value of '<>1__state', see above) is compared to -2 (read the bne.un.s instruction as “if the result of CompareExchange is not equal to -2, then jump to IL_0016”).

  • In case of equality, nothing has happened (see further) since the constructor of the object was called, and the statements IL_0012 and IL_0013 are executed, after which control is transferred to IL_001d with the local V_0 set to the current instance (recall that ldarg.0 stands for “this”).

  • In case of inequality, the current instance is already being used (i.e. an iteration has started, see further). In order to answer the call to GetEnumerator() we have to create a brand new instance of our '<EvenNumbers>d__0' type and return that. This is done in the instructions IL_0016 to IL_001c, after which execution falls through to IL_001d with the local V_0 set to the newly created instance with an internal state '<>1__state' set to 0 (cf. IL_0016 and the constructor’s IL code, see above).

To recap:

  • Check the instance’s internal state:

    • If the internal state still (see IL_0000 in the EvenNumbers method above) equals -2, make it 0 and return the current instance.

    • If the internal state doesn’t equal -2 anymore, create a new instance with internal state set to 0 and return that new instance.

A call to GetEnumerator therefore results in a ready-to-be-used enumerator object. This means that, whenever you launch a foreach loop over the iterator (which implicitly calls GetEnumerator, see foreach-statement explanation), you end up with a unique instance of our internal class, with internal (initial) state set to 0.
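
The practical consequence is that two enumerators obtained from the same iterator object don’t disturb each other. A minimal sketch (FirstOfEach is a helper name of my own); whether the first enumerator is the very same object as the sequence is an implementation detail of the compiler, so the program just prints what it finds:

```csharp
using System;
using System.Collections.Generic;

class EnumeratorIdentity
{
    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
            yield return i;
    }

    // Hypothetical helper: advances two enumerators of one sequence
    // by different amounts and returns their current elements.
    public static int[] FirstOfEach()
    {
        IEnumerable<int> sequence = EvenNumbers();
        IEnumerator<int> first = sequence.GetEnumerator();
        IEnumerator<int> second = sequence.GetEnumerator();

        first.MoveNext();
        first.MoveNext();    // first has moved on to 2
        second.MoveNext();   // second still starts at 0

        int[] result = new int[] { first.Current, second.Current };
        first.Dispose();
        second.Dispose();
        return result;
    }

    public static void Main()
    {
        int[] currents = FirstOfEach();
        Console.WriteLine("{0} and {1}", currents[0], currents[1]);   // 2 and 0

        IEnumerable<int> sequence = EvenNumbers();
        IEnumerator<int> a = sequence.GetEnumerator();
        IEnumerator<int> b = sequence.GetEnumerator();
        Console.WriteLine("first call returned the sequence object itself: {0}",
            object.ReferenceEquals(sequence, a));
        Console.WriteLine("second call returned the same enumerator: {0}",
            object.ReferenceEquals(a, b));
    }
}
```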

Iterating over the sequence – MoveNext

One crucial method remains in order to be able to iterate over the sequence, the MoveNext method (of the IEnumerator). This method is implemented as a state machine, explanation of this in a minute:

.method private hidebysig newslot virtual final
        instance bool  MoveNext() cil managed
{
  .override [mscorlib]System.Collections.IEnumerator::MoveNext
  .locals init (int32 V_0)
  IL_0000:  ldarg.0
  IL_0001:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  switch     (
                        IL_0017,
                        IL_003a)
  IL_0015:  br.s       IL_0051
  IL_0017:  ldarg.0
  IL_0018:  ldc.i4.m1
  IL_0019:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_001e:  ldarg.0
  IL_001f:  ldc.i4.0
  IL_0020:  stfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_0025:  ldarg.0
  IL_0026:  ldarg.0
  IL_0027:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_002c:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>2__current'
  IL_0031:  ldarg.0
  IL_0032:  ldc.i4.1
  IL_0033:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0038:  ldc.i4.1
  IL_0039:  ret
  IL_003a:  ldarg.0
  IL_003b:  ldc.i4.m1
  IL_003c:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0041:  ldarg.0
  IL_0042:  dup
  IL_0043:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_0048:  ldc.i4.2
  IL_0049:  add
  IL_004a:  stfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_004f:  br.s       IL_0025
  IL_0051:  ldc.i4.0
  IL_0052:  ret
}

Again, pretty scary at first sight. Starting at the top:

  IL_0000:  ldarg.0
  IL_0001:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0006:  stloc.0
  IL_0007:  ldloc.0
  IL_0008:  switch     (
                        IL_0017,
                        IL_003a)
  IL_0015:  br.s       IL_0051

This is a switch statement (which also contains a switch instruction) and works as follows:

  • IL_0000 loads the “this” instance.

  • IL_0001 loads the current internal state from the current (“this”) instance.

  • IL_0006 puts the state (which is now on top of the stack) in the local variable V_0 (of type int32).

  • IL_0007 loads this local V_0 again (to the top of the stack).

  • IL_0008 is a switch instruction (cf. 3.66 in the Common Language Infrastructure Annotated Standard):

    • If the variable on top of the stack equals 0, go to IL_0017.

    • If the variable on top of the stack equals 1, go to IL_003a.

  • IL_0015 is the fall-through after the switch instruction and can be seen as the “default” case (go to IL_0051).

We end up with two blocks: IL_0017 to IL_0039 and IL_003a to IL_004f.

Let’s start with the first one (IL_0017 to IL_0039):

  IL_0017:  ldarg.0
  IL_0018:  ldc.i4.m1
  IL_0019:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'

IL_0017 to IL_0019 set the internal state to -1 (ldc.i4.m1). This is not a final state yet, but it’s already different from the mystery number -2, which causes another call to GetEnumerator to return a new instance of the enumerator class (see explanation above). Now the real work starts:

  IL_001e:  ldarg.0
  IL_001f:  ldc.i4.0
  IL_0020:  stfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_0025:  ldarg.0
  IL_0026:  ldarg.0
  IL_0027:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_002c:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>2__current'

We can translate this as follows:

this.'<i>5__1' = 0; //IL_001e to IL_0020
this.'<>2__current' = this.'<i>5__1'; //LHS: IL_0025, IL_002c | RHS: IL_0026, IL_0027

Then the magic continues by setting the internal state to 1:

  IL_0031:  ldarg.0
  IL_0032:  ldc.i4.1
  IL_0033:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'

So, the next time we call MoveNext, the switch statement (IL_0008) will jump to IL_003a. Finally, true is returned for the MoveNext call, indicating there is more to be yielded (or stated otherwise: foreach can continue to run).

  IL_0038:  ldc.i4.1 //1 == true
  IL_0039:  ret

You can trace this all the way back to the initializer (int i = 0) and the yield return statement in:

        for (int i = 0; true; i += 2)
            yield return i;

which you should (of course) translate to the IEnumerator-based equivalent (see above) to get the clearest possible view on the code (i.e. the loop variable i is not returned by the ret instruction in IL_0039, rather it’s returned through the Current property, which we did examine earlier and returns '<>2__current' which was set in IL_002c).

Time for the second block (IL_003a to IL_004f):

  IL_003a:  ldarg.0
  IL_003b:  ldc.i4.m1
  IL_003c:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'

Again the story starts by setting the internal state to -1.

  IL_0041:  ldarg.0
  IL_0042:  dup
  IL_0043:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'

Next, the current value of i (which is stored in the helper field '<i>5__1') is retrieved. Note: The dup instruction in IL_0042 duplicates the value on top of the stack. In fact, IL_0026 could be replaced by a dup instruction as well, so there seems to be a little discrepancy in the C# compiler’s IL generation (although both methodologies, i.e. IL_0041+IL_0042 and IL_0025+IL_0026, have the same result).

  IL_0048:  ldc.i4.2
  IL_0049:  add

Now two (2) is added to the value on top of the stack…

  IL_004a:  stfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'

…and the result is stored in the value for i ('<i>5__1'). So far, we’ve done nothing more than:

'<i>5__1' += 2;

which is traced back to the increment (i += 2) in:

        for (int i = 0; true; i += 2)
            yield return i;

Finally, the system jumps to IL_0025,

  IL_004f:  br.s       IL_0025

which triggers the execution of instructions IL_0025 to IL_0039:

  IL_0025:  ldarg.0
  IL_0026:  ldarg.0
  IL_0027:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_002c:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>2__current'
  IL_0031:  ldarg.0
  IL_0032:  ldc.i4.1
  IL_0033:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0038:  ldc.i4.1
  IL_0039:  ret

I explained these when talking about the first (switch-case) block. Basically, the helper field for i ('<i>5__1') becomes the current value (this.'<>2__current'), the internal state is set to 1 and the method returns true (“there is more to find in this sequence, foreach is allowed to continue”).

The “default case” block just returns 0, which means “nothing more to find over here” but should never occur. The only state transitions we saw are -2 to 0, 0 to 1 and 1 to 1 (plus the intermediary state change from/to -1).

  IL_0051:  ldc.i4.0
  IL_0052:  ret
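
Putting all the pieces of this section together, a hand-written C# approximation of the generated class could look like the sketch below. The names EvenNumbersIterator and HandWrittenDemo are mine (the real type is called '<EvenNumbers>d__0', which isn’t a legal C# identifier), and the code follows the IL we just read rather than any official template.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Approximation of the compiler-generated '<EvenNumbers>d__0' class.
sealed class EvenNumbersIterator : IEnumerable<int>, IEnumerator<int>
{
    private int state;     // '<>1__state'
    private int current;   // '<>2__current'
    private int i;         // '<i>5__1'

    public EvenNumbersIterator(int state)
    {
        this.state = state;
    }

    public IEnumerator<int> GetEnumerator()
    {
        // -2 means "created but not handed out yet": reuse this instance.
        if (Interlocked.CompareExchange(ref state, 0, -2) == -2)
            return this;
        return new EvenNumbersIterator(0);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:            // first call: run the loop initializer
                state = -1;
                i = 0;
                break;
            case 1:            // later calls: resume after yield return
                state = -1;
                i += 2;
                break;
            default:
                return false;
        }
        current = i;           // yield return i
        state = 1;
        return true;
    }

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }   // boxes the int
    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

class HandWrittenDemo
{
    // What the compiled EvenNumbers method boils down to.
    public static IEnumerable<int> EvenNumbers()
    {
        return new EvenNumbersIterator(-2);
    }

    public static void Main()
    {
        foreach (int n in EvenNumbers())
        {
            if (n > 6)
                break;
            Console.WriteLine(n);   // prints 0, 2, 4 and 6
        }
    }
}
```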

3. What about yield break?

Besides the yield return statement, there’s also yield break to indicate that the sequence ends (no further yielding can be done, no further iteration should be done, MoveNext returns false). The only thing this causes is a more complex state machine (same number of states, but additional conditional logic to check whether yielding should stop). Consider the following (trivial) example:

    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
        {
            if (i == 100)
                yield break;
            yield return i;
        }
    }

Now MoveNext has the following look:

.method private hidebysig newslot virtual final
        instance bool  MoveNext() cil managed
{
  .override [mscorlib]System.Collections.IEnumerator::MoveNext
  .locals init (bool V_0,
           int32 V_1,
           bool V_2)
  IL_0000:  ldarg.0
  IL_0001:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0006:  stloc.1
  IL_0007:  ldloc.1
  IL_0008:  switch     (
                        IL_0019,
                        IL_0017)
  IL_0015:  br.s       IL_001b
  IL_0017:  br.s       IL_0059
  IL_0019:  br.s       IL_001d
  IL_001b:  br.s       IL_0073
  IL_001d:  ldarg.0
  IL_001e:  ldc.i4.m1
  IL_001f:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0024:  nop
  IL_0025:  ldarg.0
  IL_0026:  ldc.i4.0
  IL_0027:  stfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_002c:  br.s       IL_006f
  IL_002e:  nop
  IL_002f:  ldarg.0
  IL_0030:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_0035:  ldc.i4.s   100
  IL_0037:  ceq
  IL_0039:  ldc.i4.0
  IL_003a:  ceq
  IL_003c:  stloc.2
  IL_003d:  ldloc.2
  IL_003e:  brtrue.s   IL_0042
  IL_0040:  br.s       IL_0073
  IL_0042:  ldarg.0
  IL_0043:  ldarg.0
  IL_0044:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_0049:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>2__current'
  IL_004e:  ldarg.0
  IL_004f:  ldc.i4.1
  IL_0050:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0055:  ldc.i4.1
  IL_0056:  stloc.0
  IL_0057:  br.s       IL_0077
  IL_0059:  ldarg.0
  IL_005a:  ldc.i4.m1
  IL_005b:  stfld      int32 Test/'<EvenNumbers>d__0'::'<>1__state'
  IL_0060:  nop
  IL_0061:  ldarg.0
  IL_0062:  dup
  IL_0063:  ldfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_0068:  ldc.i4.2
  IL_0069:  add
  IL_006a:  stfld      int32 Test/'<EvenNumbers>d__0'::'<i>5__1'
  IL_006f:  ldc.i4.1
  IL_0070:  stloc.2
  IL_0071:  br.s       IL_002e
  IL_0073:  ldc.i4.0
  IL_0074:  stloc.0
  IL_0075:  br.s       IL_0077
  IL_0077:  ldloc.0
  IL_0078:  ret
}

The path of the iteration with i equal to 100 runs through the comparison in IL_002f to IL_003e, the branch in IL_0040 and the exit block IL_0073 to IL_0078, and causes the method to return false (IL_0073). In case the current value isn’t 100, the branch instruction on IL_003e jumps to IL_0042 and true (IL_0055) is returned.
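
Translated back to C#, the MoveNext above could be approximated as in the following sketch. The Machine class and the two Count helpers are names of my own, and only the enumerator half of the generated class is reproduced; the real iterator sits next to it so both can be compared.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class YieldBreakDemo
{
    // The iterator from the text.
    public static IEnumerable<int> EvenNumbers()
    {
        for (int i = 0; true; i += 2)
        {
            if (i == 100)
                yield break;
            yield return i;
        }
    }

    // Hand-written approximation of the generated state machine.
    public sealed class Machine : IEnumerator<int>
    {
        private int state = 0;
        private int current;
        private int i;

        public bool MoveNext()
        {
            switch (state)
            {
                case 0: state = -1; i = 0; break;    // loop initializer
                case 1: state = -1; i += 2; break;   // resume after yield return
                default: return false;
            }
            if (i == 100)
                return false;   // yield break: state stays -1 from here on
            current = i;        // yield return i
            state = 1;
            return true;
        }

        public int Current { get { return current; } }
        object IEnumerator.Current { get { return current; } }
        public void Reset() { throw new NotSupportedException(); }
        public void Dispose() { }
    }

    public static int CountMachine()
    {
        int count = 0;
        Machine m = new Machine();
        while (m.MoveNext())
            count++;
        return count;
    }

    public static int CountIterator()
    {
        int count = 0;
        foreach (int n in EvenNumbers())
            count++;
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountIterator());   // 50 (the values 0, 2, ..., 98)
        Console.WriteLine(CountMachine());    // 50
    }
}
```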

Homework

Try to find out what the following enumerator translates to in IL (without CTRL-C, WIN-R, notepad, ENTER, CTRL-V, …, csc, …, ildasm, … you know what I mean):

     private static IEnumerable<string> GetItems()
     {
          yield return "Hello yield 1";
          yield return "Hello yield 2";
          yield return "Hello yield 3";
          yield return "Hello yield 4";
          yield return "Hello yield 5";
     }

Conclusion

The iterator feature of C# is far more complex than it might seem at first glance. It hides a complete state machine taking care of state transitions to keep the current “cursor” position in the sequence. This is in sharp contrast to the methodology where one returns a pre-populated collection (say List<SomeType>) that can be traversed using foreach as well (because it’s of course IEnumerable<SomeType>). Iterators provide a lazy pattern where stuff can be calculated when it’s needed. It’s up to the consumer to decide how much of the sequence to consume effectively.
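
To put a number on that difference, here is a small sketch that counts how much work is done in both approaches. Everything in it (the Computed counter, the Squares methods, FirstAtLeast) is made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

class EagerVsLazy
{
    public static int Computed;   // how many squares were calculated

    private static int Square(int n)
    {
        Computed++;
        return n * n;
    }

    // Pre-populated collection: everything is calculated up front.
    public static List<int> SquaresEager(int count)
    {
        List<int> result = new List<int>();
        for (int n = 0; n < count; n++)
            result.Add(Square(n));
        return result;
    }

    // Iterator: a square is calculated only when the consumer asks for it.
    public static IEnumerable<int> SquaresLazy(int count)
    {
        for (int n = 0; n < count; n++)
            yield return Square(n);
    }

    // Consumer that stops at the first element >= threshold (-1 if none).
    public static int FirstAtLeast(IEnumerable<int> source, int threshold)
    {
        foreach (int value in source)
            if (value >= threshold)
                return value;
        return -1;
    }

    public static void Main()
    {
        Computed = 0;
        int eager = FirstAtLeast(SquaresEager(1000), 10);
        Console.WriteLine("{0} after {1} computations", eager, Computed);   // 16 after 1000 computations

        Computed = 0;
        int lazy = FirstAtLeast(SquaresLazy(1000), 10);
        Console.WriteLine("{0} after {1} computations", lazy, Computed);    // 16 after 5 computations
    }
}
```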

This brings us to the world of so-called “continuations”, where the execution of a piece of code is virtually suspended till the consumer decides he wants to get more stuff, which imposes a stateful approach under the covers (as we investigated in this post). Call it a small (procedurally defined) Windows Workflow Foundation state machine if you want to take it so far and if that helps you to understand it (maybe it just makes things more complex, my apologies if that’s the case)… Take a look at Don Box’ post on http://pluralsight.com/blogs/dbox/archive/2005/04/17/7467.aspx as well.

Maybe one last thing: why should I bother about this? One answer is LINQ; take a look at my previous LINQ post to get more information on the Standard Query Operators and download the source to count the number of yield statements encountered. (Tip: My FindString Windows PowerShell cmdlet might be useful to perform the count). 

