Wednesday, April 09, 2008 2:00 AM bart

Pattern Matching in C# - Part 3

Welcome back beloved pattern matchers! In our last encounter we moved from closed to open pattern match objects in order to allow for reuse, improving the performance of our pattern matcher quite a bit. But we're not done yet: today we'll build upon our open pattern match and provide a way to pre-compile it into highly efficient code. Such pre-compilation is often seen in similar scenarios such as regular expression matching (see System.Text.RegEx) and can offer huge improvements.

 

Revisiting the MatchEntry

Previously, we accumulated information about match clauses and bodies in a list of MatchEntry objects which simply held the clause and the body for later use in TryMatch and Evaluate respectively. This very MatchEntry class will serve as our hook to optimize each individual match entry but turning it from an interpreter-based evaluation into a pre-compiled based evaluation. By doing so, we'll get red of the huge costs seen before to evaluate a match, largely overshadowed by reflection costs:

image

Remaining question is how. Here's the plan: add a Compile method to the MatchEntry class that turns the match clause into a delegate that takes in an object and returns a bool. At the same time, turn the match body into a delegate that takes in an object and returns an object array containing all extracted bindings. When, during pattern match evaluation, we encounter such a pre-compiled MatchEntry, we can simply invoke the delegates to check the clause and produce the result. Here's the original Execute method:

public T Execute(object o)
{
    if (Object != null)
        throw new InvalidOperationException("Execution is only valid on unbound pattern match objects.");

    if (_else == null)
        throw new InvalidOperationException("Can't evaluate the match. Incomplete match object. Are you missing an Else clause?");

    foreach (MatchEntry entry in _matches)
    {
        Dictionary<ParameterExpression, PropertyInfo> bindings;
        if (TryMatch(o, entry.Match, out bindings))
        {
            return Evaluate(o, entry.Match, entry.Action, bindings);
        }
    }

    return _else();
}

All we like to do is extend it as follows:

public T Execute(object o)
{
    if (Object != null)
        throw new InvalidOperationException("Execution is only valid on unbound pattern match objects.");

    if (_else == null)
        throw new InvalidOperationException("Can't evaluate the match. Incomplete match object. Are you missing an Else clause?");

    foreach (MatchEntry entry in _matches)
    {
        if (entry.CompiledMatch != null)
        {
            if (entry.CompiledMatch(o))
            {
                return (T)entry.Action.DynamicInvoke(entry.CompiledMapper(o));
            }
        }
        else
        {

            Dictionary<ParameterExpression, PropertyInfo> bindings;
            if (TryMatch(o, entry.Match, out bindings))
            {
                return Evaluate(o, entry.Match, entry.Action, bindings);
            }
        }
    }

    return _else();
}

Let's take a look at the flow of operations. CompiledMatch and CompiledMapper aren't methods, they are simply delegates:

internal class MatchEntry
{
   public LambdaExpression Match { get; set; }
   public Delegate Action { get; set; }

   public Func<object, bool> CompiledMatch { get; set; }
   public Func<object, object[]> <object, object[]> CompiledMapper { get; set; }


   ...
}

Basically we test whether or not the supplied object passes the match clause test which incorporates the unification algorithm including the type check and matching of individual property values. If it passes, we extract the unmatched parameters into an object array which is produced by the compiler mapper and fed into the match body delegate through DynamicInvoke. Nothing mind-blowing.

 

Match clause compilation

The match compilation follows the same skeleton as the original unification algorithm but instead of doing the evaluation straight away, it simply builds the plan to do so later when executing the open pattern match on a given object. The code will therefore look familiar and ideally it gets unified with the eager evaluation loop. Indeed, in the original code that lies at the foundation of this blog series, the (much larger) unification algorithm is shared with different back-ends plugging in to it: an eager interpreter and a couple of code-emitters of which this is one.

How does a match clause look like in its compiled form? Somewhat similar to this:

Person p;
if ((p = o as Person) != null && p.Age == 25)
   // body

Of course the number of equality checks (or in a broader sense, unification checks) is unbounded but what's important here is:

  1. The type check
  2. Fail-early semantics due to the use of && short-circuiting

This gives us the pattern for a match clause compilation, so let's present this one first:

internal void Compile()
{
    //
    // Only support for NewExpression.
    //

    NewExpression ne = Match.Body as NewExpression;
    if (ne == null)
        throw new NotSupportedException("Match clause can only contain NewExpression.");

    //
    // Lambda input parameter. 
    //
    ParameterExpression o = Expression.Parameter(typeof(object), "o");


    //
    // Created target type.
    //
    Type target = ne.Constructor.DeclaringType;

    //
    // Check the type of the object against the expected type.
    //
    Expression check = GetTypeCheck(o, target);


    //
   // List of bindings.
   //
   var bindings = new Dictionary<ParameterExpression,PropertyInfo>();

    //
    // Resolve parameters.
    //
    int i = 0;
    foreach (ParameterInfo param in ne.Constructor.GetParameters())
    {
        //
        // Mapping information from constructor parameters to properties required.
        //
        PropertyAttribute pa = param.GetCustomAttributes(typeof(PropertyAttribute), false).Cast<PropertyAttribute>().SingleOrDefault();
        if (pa == null)
            throw new InvalidOperationException(&ampquotInput object doesn't have required mapping information."); 

        //
        // Find the property.
        //
        PropertyInfo property = target.GetProperty(pa.Name);
        if (property == null)
            throw new InvalidOperationException(String.Format(&ampquotProperty {0} on type {1} not found.", pa.Name, target.Name)); 

        ConstantExpression ce;
        ParameterExpression pe; 

        //
        // Expression should either be a constant or a parameter reference.
        //
        Expression arg = ne.Arguments[i++];
        if ((ce = arg as ConstantExpression) != null)
        {
            //
            // Check the object's property matches the constant.
            //
            check = Expression.AndAlso(
                check,
                GetEqualityCheck(o, target, property, ce)
            );
        }
        else if ((pe = arg as ParameterExpression) != null)
        {
            //
            // Keep track of the loose parameter.
            //
            bindings[pe] = property;
        }
        else
            throw new NotSupportedException("Can only match constants.");
    }

    //
    // Compile match predicate.
    //
    CompiledMatch = Expression.Lambda<Func<object, bool>>(check, o).Compile();

Only the bold portions have changed, so most is familiar. Let's analyze the new approach step-by-step, starting from the bottom:

    //
    // Compile match predicate.
    //
    CompiledMatch = Expression.Lambda<Func<object, bool> >(check, o).Compile();

Here we use maybe the single most powerful method in .NET Framework 3.5's System.Core.dll assembly: LambdaExpression.Compile. We simply have an expression tree representing a check that produces a Boolean, wrap it inside a lambda expression with one parameter called o, and ask the runtime to compile it into efficient IL code. Actually we take the lazy route and let the framework do all the heavy lifting to get our check compiled. The only remaining thing for us to do is to create the check expression tree corresponding to the match clause by means of (simplified) unification. It all starts here, by declaring the input object lambda parameter:

    //
    // Lambda input parameter. 
    //
    ParameterExpression o = Expression.Parameter(typeof(object), "o");

This needs to happen upfront since we're going to refer to it quite often, e.g. to emit the type check (along the lines of &ampampampampquoto is Person") or to emit the matching conditions (e.g. ((Person)o).Age == 25). First is our type-check:

    //
    // Check the type of the object against the expected type.
    //
    Expression check = GetTypeCheck(o, target);

which gets implemented as follows:

private static Expression GetTypeCheck(ParameterExpression o, Type t)
{
    return Expression.TypeIs(p, t);
}

TypeIs corresponds to the is keyword in C#, comparing the passed in object's type against the type of the constructor of the match clause. For example, take this match clause:

(string name) => new Person(name, 25)

The corresponding type check becomes:

o is Person

Obviously, the code above shouldn't live in its own method if it's that short but in case you want stricter type checks (e.g. ignoring inheritance), this gives you the room to play. If we were to implement just a type switch we would be done here with the "unification". However, we need to emit more equality checks:

            //
            // Check the object's property matches the constant.
            //
            check = Expression.AndAlso(
                check,
                GetEqualityCheck(o, target, property, ce)
            );

As mentioned before, we only consider constants as comparable to keep the unification simple. What's interesting about the code above is the fact we accumulate all of the required conditions into one (giant) expression tree using AndAlso. This type of expression corresponds to the && short-circuiting Boolean and operator which is exactly what we want. Here's the GetEqualityCheck method:

private static MethodCallExpression GetEqualityCheck(ParameterExpression o, Type target, PropertyInfo property, ConstantExpression ce)
{
    return Expression.Call(
        typeof(object),
        "Equals",
        new Type[0],
        Expression.Convert(
            Expression.Property(
                Expression.Convert(
                    o,
                    target
                ),
                property
            ),
            typeof(object)
        ),
        Expression.Convert(
            ce,
            typeof(object)
        )
    );
}

Basically we call Object.Equals, passing in the value from the property on the passed-in object and the constant expression we found in the original expression tree. The Convert nodes are required to make the method binding work in the expression compiler, so we simply cast both inputs to System.Object. For our running sample, the resulting expression tree is:

image

 

Match clause mapper compilation

We're not done with the compilation yet. As the bridge between the match clause and match body, we need a mapper that extracts and binds the values for the match parameters:

(string name) => new Person(name, 25)

in order to feed those values in to the match clause. We had this piece of magic before in our unifier's Evaluate method. Here we just add it to the Compile method:

    //
    // Create list of unbound parameters.
    //
    Expression[] args = new Expression[Match.Parameters.Count];
    int j = 0;
    foreach (ParameterExpression param in Match.Parameters)
    {
        //
        // Parameters need to be bound in the match expression.
        //
        PropertyInfo property;
        if (!bindings.TryGetValue(param, out property))
            throw new InvalidOperationException("Parameter " + param.Name + " was not bound in the pattern match.");

        //
        // Expression to grab value from property.
        //
        args[j++] = Expression.Convert(
                        Expression.Property(
                            Expression.Convert(
                                o,
                                target
                            ),
                            property
                        ),
                        typeof(object)
                    );
    }

    //
    // Compile mapper function.
    //
    CompiledMapper = Expression.Lambda<Func<object, object[]>>(Expression.NewArrayInit(typeof(object), args), o).Compile();
}

This code is equally straight-forward and creates a lambda expression that maps the parameter 'o' representing the input object on an array of extracted property values, represented by a NewArrayInit node consisting of a series of Property nodes (which again need some conversion).

Again for our running sample, this is what happens:

image

 

Exposing compilation on the pattern object

Trivial plumbing in fact. Just add the following to Pattern<T>:

public void Compile()
{
    if (Object != null)
        throw new InvalidOperationException("Compilation is only valid on unbound pattern match objects.");

    if (_else == null)
        throw new InvalidOperationException("Can't compile the match. Incomplete match object. Are you missing an Else clause?");

    foreach (MatchEntry entry in _matches)
        if (entry.CompiledMatch == null)
            entry.Compile();
}

Alternatively, we could do the compilation as we add the match entries to the list. This would definitely be a benefit but we'll keep it as shown above. When having multiple back-ends in the compilation mechanism, the approach used above has its benefits since this Compile method has the capability of having a global view across the individual matches. For example, if you have multiple clauses testing for a Person object, we could get away with one single Person type check and rather turn the code into a TypeAs that attempts the conversion to a Person object only once, followed by null checks.

That's it: we've succeeded to compile the match clause and match body, hopefully giving us a mindblowing performance boost.

 

Drums please ... Please welcome Mr. Performance

Fasten your seatbelts and add the following to the benchmark:

pattern.Compile();

var res3 = persons.Select(p => pattern.Execute(p));

watch.Reset();
watch.Start();

foreach (var p in res3)
    ;

watch.Stop();
Console.WriteLine(watch.Elapsed + " " + watch.ElapsedMilliseconds / baseMs);

Debug.Assert(baseline.SequenceEqual(res3));

One line the difference, but what to expect from the performance... Here are the results:

image

Can't believe it? You should! One order of magnitude slower, and given the absolute numbers, this is workable. Some measurements again in CLR Profiler. We should be at easy with the memory footprint (exercise: confirm) so let's skip that one but rather look at the allocation graph:

image

Important to realize here is that the Compile phase is the one-time cost which should be O(n) in the number of match clauses. The remaining portion is clearly the iteration itself and the allocations happening here should be purely related to the delegate invocations. Indeed, here it is:

image

But what about the call tree? Tada...

image

Look at that - 14 12 (UPDATE: I missed by two lines) humble calls per iteration, allocating three objects. This is obviously just one iteration and more specifically one that didn't produce a match. In case of a match, we pay the price of going through the mapper which involves a quite expensive DynamicInvoke call:

image

This is a steady-state DynamicInvoke call, the initial invocation through the delegate is more expensive but information gets cached in the RuntimeTypeCache. However, compare to where we come from we've reduced the number of calls per iteration from 485 to 14 (non-match). The probability of having a match (and obviously, each match clause has a cost) will largely influence the overall cost of the execution since this requires the mapper to kick in through the reflection call. In our run, we had a chance of 11% to find a match (random numbers between 1 and 9, of which only Age == 5 was considered a match), so if you boost the probability you'll see those costs influence the performance negatively. For example, change the benchmark to generate random numbers between 4 and 5 (remember the second parameter to Random.Next takes the exclusive upper bound, so specify 4 and 6) - which yields a probability for a match of 50% - and you'll see the overall execution time drop to 60 ms (about 4 times slower).

A detailed performance analysis would drift us off track but you get the idea of how to measure and interpret. We've found the target for our next optimization.

 

Optimizing the call

So, is there something we can do about this DynamicInvoke call? Sure! Make it a plain delegate call. How? InvokeExpression comes to the rescue. We'll change the MatchEntry.Compile method's last part as follows:

    //
    // Create list of unbound parameters.
    //
    Expression[] args = new Expression[Match.Parameters.Count];
    int j = 0;
    foreach (ParameterExpression param in Match.Parameters)
    {
        //
        // Parameters need to be bound in the match expression.
        //
        PropertyInfo property;
        if (!bindings.TryGetValue(param, out property))
            throw new InvalidOperationException("Parameter " + param.Name + " was not bound in the pattern match.");

        //
        // Expression to grab value from property.
        //
        args[j++] = Expression.Convert(
      
                 Expression.Property(
                            Expression.Convert(
                                o,
                                target
                            ),
                            property
                        ),
                        typeof(object)
      
             );
    }

    //
    // Compile mapper function.
    //
    CompiledMapper = Expression.Lambda<Func<object, object[]>>(Expression.NewArrayInit(typeof(object), args), o).Compile();

    //
    // Compile invoker function.
    //

    Expression invoker = Expression.Invoke(Expression.Constant(Action), args);
    CompiledInvoker = Expression.Lambda<Func<object, T>>(invoker, o).Compile();
}

Make sure MatchEntry is a nested class in Pattern<T> since we're referencing generic parameter T in here to denote the result type. But man, how simply this fix is once more :-). First of all, we don't longer cast the parameters to System.Object. Second, we got rid of the array initialization. Third, we simply invoke the action delegate (which is kept in the MatchEntry as a property).

The only thing left to change is the Execute method on Pattern<T>:

public T Execute(object o)
{
    if (Object != null)
        throw new InvalidOperationException("Execution is only valid on unbound pattern match objects.");

    if (_else == null)
        throw new InvalidOperationException("Can't evaluate the match. Incomplete match object. Are you missing an Else clause?");

    foreach (MatchEntry entry in _matches)
    {
        if (entry.CompiledMatch != null)
        {
            if (entry.CompiledMatch(o))
            {
                return (T)entry.Action.DynamicInvoke(entry.CompiledMapper(o));
                return entry.CompiledInvoker(o);
            }
        }
        else
        {
            Dictionary<ParameterExpression, PropertyInfo> bindings;
            if (TryMatch(o, entry.Match, out bindings))
            {
                return Evaluate(o, entry.Match, entry.Action, bindings);
            }
        }
    }

    return _else();
}

That's it. You can guess by now: benchmark time!

image

Doesn't need further comments I guess... Below are the results of Call Tree analysis using CLR Profiler:

image

In red, you can see a match: <Benchmark>b__3(String) is the anonymous method for the Match clause, taking in the bound parameter and returning a bool. Indicated in blue is a mismatch: <Benchmark>b__2() is the anonymous method for the Else clause. The mismatch case is still the same compared to the previous run: 12 calls made, but the match case has improved dramatically, all the way down from 127 to 11. The reason the mismatch case takes one call more that the match is the additional MoveNext call caused by the foreach in the Execute method.

Could we improve even more? Yes, for example calls to Object::Equals can be eliminated in certain cases, replacing them by efficient equality checks for built-in types like Int32 and Boolean. Is it worth it? No.

 

Coming up in this series: extending the unification engine, a detour through Reflection.Emit and who knows what more :-).

Del.icio.us | Digg It | Technorati | Blinklist | Furl | reddit | DotNetKicks

Filed under: , ,

Comments

# re: Pattern Matching in C# - Part 3

Wednesday, April 09, 2008 2:32 AM by Steve

This is just cool, could you post your complete solution in a .zip file or something? :)

# re: Pattern Matching in C# - Part 3

Wednesday, April 09, 2008 3:07 AM by Koen

So are you gonna make it a Codeplex project now?

# re: Pattern Matching in C# - Part 3

Wednesday, April 09, 2008 10:53 AM by bart

Hi Steve, Koen,

I plan to make the code available at the end of the series. Currently those are snippets from a larger project I need to trim down and clean up first, but once I get to that I'll post the code.

Thanks,

-Bart

# re: Pattern Matching in C# - Part 3

Wednesday, April 09, 2008 1:43 PM by Willy Denoyette

Great posts Bart, they keep me busy these cold and rainy days. Groetjes van het 'thuisland. Willy.

# c bool

Saturday, April 12, 2008 6:27 PM by c bool

Pingback from  c bool

# Adventures in F# - F# 101 Part 8 (Mutables and Reference Cells)

Tuesday, April 15, 2008 4:09 PM by Matthew Podwysocki's Blog

Time for another adventure in F#, covering some of the basics of functional programming and F# in particular

# Pattern Matching in C# - Part 8

Wednesday, April 16, 2008 2:27 AM by B# .NET Blog

In the last handful of posts in this series we&#39;ve been looking at ways to match a whole set of patterns