Tuesday, April 15, 2008 2:00 AM bart

Pattern Matching in C# - Part 7

In our last encounter on the pattern matching mission we covered techniques to match T[] and List<T>. Today we cover another type that's being use a lot: dictionaries (the generic brothers of Hashtable which you could match too, exercise). I've already shown an example of such a match:

.Match((int bartAge, int johnAge) => new Dictionary<string, int>() {
    { "Bart", bartAge },
    { "John", johnAge }
}, (bartAge, johnAge) => { ... })

In our discussion we noticed it would be best not to constrain the number of entries in the dictionary based on the number of entries in the match clause. This is fairly similar to matching objects through object initializers where we just match the specified properties (comparable to keys) to have certain properties (comparable to values). If the reader feels the matching should be more restrictive, she should feel free to implement so. In addition to this, we do mandate the specified keys to be present since otherwise we don't know what value to extract and additionally, the matches can only be put on the keys while the extractions take place on the values (keys are unique, values aren't necessarily). Putting everything together, this is the skeleton of the sample's translation:

IDictionary<string, int> dictionary = o as IDictionary<string, int>;
int bartAge, johnAge;
if (dictionary != null && dictionary.TryGetValue("Bart", out bartAge) && dictionary.TryGetValue("John", out johnAge))
{
   ...
}

I'm using IDictionary over here to boost flexibility (I could have done the same in the previous post with IList<T> instead of List<T>) but that's just a minor detail. However, the above can't be translated one-on-one in our pattern matcher: the out keywords are problematic in the context of expression trees. Instead we'll stick with the more redundant equivalent:

IDictionary<string, int> dictionary = o as IDictionary<string, int>;
if (dictionary != null && dictionary.ContainsKey("Bart") && dictionary.ContainsKey("John"))
{
   int bartAge = dictionary["Bart"];
   int johnAge = dictionary["John"];
   ...
}

From a language point of view, initializing a Dictionary using collection initializers is not at all different from doing the same on say a List or an array; the core difference is the fact an Add method with two operandi is called. In other words, we're facing another of those ListInitExpressions, so we can fork the code introduced last time:

private Expression CompileListInitExpression(ParameterExpression o, Type target, ListInitExpression lie, Dictionary<ParameterExpression, Expression> bindings)
{
    if (lie.NewExpression.Arguments.Count != 0)
        throw new NotSupportedException("Collection initializers in match expressions should use the default constructor.");

    Type t = lie.Type.GetGenericTypeDefinition();
    if (t == typeof(List<>))
        return CompileListInitExpressionForListOfT(o, target, lie, bindings);
    else if (t == typeof(Dictionary<,>))
        return CompileListInitExpressionForDictionaryKonV(o, target, lie, bindings);
    else
        throw new NotSupportedException("Only List<T> and Dictionary<K,V> are supported.");
}

Don't bother about my rather verbose method naming, the inside of the method is much more appealing. The pattern is basically the same as before:

  1. Emit a type check
  2. Unify and extract all list entries
  3. Coalesce bindings to the same parameters and aggregate those with the check

Let's go there step by step. First the type check (trivial):

private Expression CompileListInitExpressionForDictionaryKonV(ParameterExpression o, Type target, ListInitExpression lie, Dictionary<ParameterExpression, Expression> bindings)
{
    Expression check = Expression.TypeIs(
        o,
        target
    );

Followed by the iteration over all initializers, which is where things get interesting. The Add method in Dictionary<K,V> takes two parameters: one of type K and one of type V. The compiler will have done all the type checking already to make sure types are compatible and such, and let's assume this overload of Add is the only one in reach. Here's the first part of the loop:

    int i = 0;
    foreach (var e in lie.Initializers)
    {
        //
        // Note: We know we're processing Dictionary<K,V>, which only has one Add overload,
        //       so we omit additional checks for now.
        //
        Expression keyArg = e.Arguments[0];
        Expression valArg = e.Arguments[1];

        ConstantExpression key = keyArg as ConstantExpression;
        if (key == null)
            throw new NotSupportedException("Keys for dictionaries used in pattern matches should be constant expressions.");

        check = Expression.AndAlso(
            check,
            Expression.Call(
                Expression.Convert(
                    o,
                    target
                ),
                target.GetMethod("ContainsKey"),
                key
            )
        );

First we make sure the expression used as the first parameter, which represents the key, is a constant. This restriction simplifies matters and makes much sense too since we don't want to get in the nasty business of reverse mapping (which key belongs to a given value?). Once we've done this sanity check, we extend the check with a call to ContainsKey on the dictionary (watch out for the cast, which is safe since we've already done a TypeIs before). The call is simply taking in the key expression as we got it.

After the key comes the lock value. Here we allow more flexibility in our usual style. In fact, this is one of those places where recursive unification would come in handy, allowing you to write matches like:

.Match((string name, int age) => new Dictionary<int, Person>() {
    { 12345, new Person() { Name = name, Age = age} }
}, (name, age) => { ... })

Conservative as we are, let's stick with ParameterExpression and ConstantExpression for the scope of this post:

        ParameterExpression pe;
        ConstantExpression ce;

        if ((pe = valArg as ParameterExpression) != null)
        {
            if (bindings.ContainsKey(pe))
            {
                check = Expression.AndAlso(
                    check,
                    GetEqualityCheckForDictionaryEntry(o, target, key, bindings[pe])
                );
            }
            else
                bindings[pe] = Expression.Call(
                    Expression.Convert(
                        o,
                        target
                    ),
                    target.GetMethod("get_Item"),
                    key
                );
        }
        else if ((ce = valArg as ConstantExpression) != null)
        {
            check = Expression.AndAlso(
                check,
                GetEqualityCheckForDictionaryEntry(o, target, key, ce)
            );
        }
        else
            throw new
NotSupportedException("Can only match constants.");

        i++;
    }

    return check;
}

The core differences with the previous post are outlined in bold. It's becoming a pattern... and yes, in a recursive descent-ish parser approach this kind of match would just occur once while the leaf nodes would be supplied based on carried type data (e.g. the extraction expression for an array is different from that for a property or a dictionary entry). Again, we can use get_Item to call the dictionary object's indexer, passing in the key and giving us a binding. The remainder cases need some equality checks to match constants and parameters respectively. Matching constants in this context doesn't make very much sense but that's a matter of taste:

.Match((int johnAge) => new Dictionary<string, int>() {
    { "Bart", 25 },
    { "John", johnAge }
}, (johnAge) => { ... })

would try to find an entry "Bart" with age set to 25 in order to allow a match for "John" to be found. If this looks awkward and too flexible, one can just comment out that part of the parser of course. Anyhow, here's the GetEqualityCheckForDictionaryEntry code:

private Expression GetEqualityCheckForDictionaryEntry(ParameterExpression o, Type target, ConstantExpression key, Expression e)
{
    return Expression.Call(
        typeof(object),
        "Equals",
        new Type[0],
        Expression.Convert(
            Expression.Call(
                Expression.Convert(
                    o,
                    target
                ),
                target.GetMethod("get_Item"),
                key
            ),
            typeof(object)
        ),
        Expression.Convert(
            e,
            typeof(object)
        )
    );
}

Again, straightforward: it just emits a call to Object.Equals, passing in the value retrieved through the indexer and the value specified as the expression. Need a sample?

var onBisectXY = new Pattern<bool>()
    .Match((int x) => new Dictionary<string, int>() { { "x", x }, { "y", x } }, x => true)
    .Else(() => false);

onBisectXY.Compile();

Random rand = new Random();
for (int i = 0; i < 100; i++)
{
    var p = new Dictionary<string, int> { { "x", rand.Next(0, 5) }, { "y", rand.Next(0, 5) } };
    if (onBisectXY.Execute(p))
        Console.WriteLine(p["x"] + "," + p["y"]);
}

producing:

image

A small matching exercise today. More stuff next time - stay tuned!

Del.icio.us | Digg It | Technorati | Blinklist | Furl | reddit | DotNetKicks

Filed under: , ,

Comments

# Adventures in F# - F# 101 Part 8 (Mutables and Reference Cells)

Tuesday, April 15, 2008 4:09 PM by Matthew Podwysocki's Blog

Time for another adventure in F#, covering some of the basics of functional programming and F# in particular

# Adventures in F# - F# 101 Part 8 (Mutables and Reference Cells)

Tuesday, April 15, 2008 4:17 PM by Community Blogs

Time for another adventure in F#, covering some of the basics of functional programming and F# in particular

# Reflective Perspective - Chris Alcock &raquo; The Morning Brew #75

Pingback from  Reflective Perspective - Chris Alcock  &raquo; The Morning Brew #75