Sunday, December 10, 2006 1:21 PM bart

C# 3.0 Extension Method Versioning Troubles - Some thoughts and random ideas

A few days ago I blogged about extension methods in C# 3.0 as a piece of glue to support LINQ. Extension methods aren't strictly required to enable LINQ, but they're the most natural way to "extend" existing collections with query operators, and they allow a pretty straightforward translation from query operator syntax to chained extension method calls as an intermediate compilation step.

Somewhat later I got the following question from a reader of mine, John Rusk:

Bart,

What do you think about the possible versioning problems with extension methods? I blogged about it here, http://dotnet.agilekiwi.com/blog/2006/04/extension-methods-problem.html . Everyone is talking about segregating extension methods into their own namespaces, but I'm not sure that completely solves the problem.

In this post I'd like to discuss the problem mentioned on the following pages which John is referring to:

Warning: The ideas below are just some personal random thoughts that haven't been revised thoroughly; no guarantees are made and none of the ideas below are a reflection of the C# team's ideas. The only official source for the C# 3.0 language specification is the one you can find on the Microsoft website. All syntax propositions are imaginary and won't compile at all. I'm also sure to have overlooked various issues with the proposals being made, so don't start throwing rotten tomatoes at me :-). Suggestions and feedback are welcome (as usual) but it might be more advisable to direct your concerns to the C# team directly.

 

Problem statement

Basically there are different problems to be considered:

  1. You have defined an extension method Foo, for class Bar in library SomeLib version 1.0. In version 2.0 of SomeLib however there's an instance method called Foo on class Bar. When you recompile the code, the compiler will now favor the instance method instead of the extension method and behavior has changed.

    //v1.0
    namespace SomeLib
    {
       public class Bar
       {
       }
    }

    //Extension for v1.0
    public static class MyExtensions
    {
       public static void Foo(this SomeLib.Bar bar)
       {
          // #@*
       }
    }

    //v2.0
    namespace SomeLib
    {
       public class Bar
       {
          public void Foo()
          {
             //... (different behavior than #@*)
          }
       }
    }

    //usage
    using SomeLib;
    //MyExtensions is a class in the global namespace, so no using directive is needed for it

    class Program
    {
       public static void Main()
       {
          //behavior will change when compiling against SomeLib v2.0
          Bar b = new Bar();
          b.Foo();
       }
    }

  2. Multiple namespace imports can introduce conflicts if both namespaces have an extension method with the same name defined for the same target type:

    //v1.0
    namespace SomeLib
    {
       public class Bar
       {
       }
    }

    //First set of extensions
    namespace Ext1
    {
       public static class MyExtensions
       {
          public static void Foo(this SomeLib.Bar bar)
          {
             // #@*
          }
       }

       //...
    }

    //Second set of extensions
    namespace Ext2
    {
       public static class MyExtensions
       {
          public static void Foo(this SomeLib.Bar bar)
          {
             // $µ!
          }
       }

       //...
    }

    //usage
    using SomeLib;
    using Ext1;
    using Ext2;

    class Program
    {
       public static void Main()
       {
          //Which Foo will be called?
          Bar b = new Bar();
          b.Foo();
       }
    }
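Both problems can be reproduced in one small program. The sketch below is mine, not part of the original samples: the Baz method, the string return values and the Demo namespace are assumptions added so the result can be printed. It reflects the behavior of the C# compiler as it eventually shipped, where an applicable instance method always wins and an ambiguous extension call is rejected at compile time.

```csharp
using System;

namespace SomeLib
{
    // Stands in for "v2.0" of the library: Bar now has its own Foo.
    public class Bar
    {
        public string Foo() { return "instance Foo"; }
    }
}

namespace Ext1
{
    public static class MyExtensions
    {
        public static string Foo(this SomeLib.Bar bar) { return "Ext1 Foo"; }
        public static string Baz(this SomeLib.Bar bar) { return "Ext1 Baz"; }
    }
}

namespace Ext2
{
    public static class MyExtensions
    {
        public static string Foo(this SomeLib.Bar bar) { return "Ext2 Foo"; }
        public static string Baz(this SomeLib.Bar bar) { return "Ext2 Baz"; }
    }
}

namespace Demo
{
    using SomeLib;
    using Ext1;
    using Ext2;

    public static class Program
    {
        public static void Main()
        {
            Bar b = new Bar();

            // Problem 1: the instance method silently wins over both extensions.
            Console.WriteLine(b.Foo());                  // prints "instance Foo"

            // Problem 2: b.Baz() would not compile here because the call is
            // ambiguous between Ext1 and Ext2, so the caller has to fall back
            // to the explicit static form.
            Console.WriteLine(Ext1.MyExtensions.Baz(b)); // prints "Ext1 Baz"
            Console.WriteLine(Ext2.MyExtensions.Baz(b)); // prints "Ext2 Baz"
        }
    }
}
```

So the answer to "Which Foo will be called?" depends on what else is in scope: with an instance Foo present it is always the instance method, and without it the two extensions collide until one of the imports is dropped or the call is written in static form.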

To summarize, the "minimality" of the current extension method implementation might be its major drawback too. It's basically built around an extended meaning for the using directive and an additional use case for the this keyword. Einstein's words might be applicable:

"Everything should be made as simple as possible, but not simpler."

 

A few thoughts

As a matter of fact, method resolution precedence is a double-edged sword in this problem. If instance methods take precedence over extension methods, problem 1 can occur. If the reverse were true, i.e. extension methods took precedence over instance methods, adding a new using directive to the code could change behavior because some extension is brought into scope. A few remarks:

  • It's important to realize that behavior won't change without a recompilation: the IL code remains a static method call till the compiler comes around and recompiles the code with the choice for an instance method instead of a static method on a static class (the extension method). So, having a new version of a library installed on a machine doesn't change the behavior all of a sudden: the extension method will still be called instead of the newly available instance method in the new version of the library.
  • Recompilation can be the trigger to some intelligent analysis: if the method being called was an extension method (i.e. a static method call in IL) and now an instance method is available, it could transform the original extension method instance-method-call-style code into the equivalent static-method-on-static-class-call code to resolve the conflict. Nevertheless, this introduces additional analysis of the existing assembly which is a rather cumbersome idea at first glance.
  • The ultimate solution would be not to have precedence at all: if both an instance method and an extension method are available, there's a conflict and one has to make one's intention explicit. For extension method calls this would come down to using a classic static-method-on-static-class call, but for instance method calls things would be more difficult because additional syntax is required: ((string) s).Reverse() might be an explicit but "noise introducing" indication to use an instance method, not yet taking possible subtleties with virtual method calls etc. into consideration.
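The first remark can be checked with a little reflection. The following is only a sketch (Bar, MyExtensions and the string return value are stand-ins I made up for illustration): it shows that an extension method is an ordinary static method carrying a marker attribute, and that the instance-style call and the explicit static call reach the same method.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public class Bar { }

public static class MyExtensions
{
    public static string Foo(this Bar bar) { return "extension Foo"; }
}

public static class Program
{
    public static void Main()
    {
        Bar b = new Bar();

        // Both forms bind to the same static method at compile time.
        Console.WriteLine(b.Foo());             // prints "extension Foo"
        Console.WriteLine(MyExtensions.Foo(b)); // prints "extension Foo"

        MethodInfo foo = typeof(MyExtensions).GetMethod("Foo");

        // The method is a plain static method...
        Console.WriteLine(foo.IsStatic);        // prints "True"

        // ...that the compiler tags with ExtensionAttribute.
        Console.WriteLine(foo.IsDefined(typeof(ExtensionAttribute), false)); // prints "True"
    }
}
```

Because the binding is fixed in the compiled assembly, writing MyExtensions.Foo(b) explicitly is also the one call form that keeps its meaning when a later library version adds an instance Foo.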

Another question one might ask is whether or not the use of using as the "extension methods importing operator" is a good choice. By doing so, using gets three meanings, i.e. importing namespaces, importing extensions through namespaces and working with IDisposable objects. This can introduce the side effect that importing a namespace (for convenient use of certain types, i.e. "abbreviating long names") also imports extension methods. A few remarks:

  • You might desire more fine-grained control, e.g. by writing stuff like using MyNamespace.TheStaticClassWithExtensions. As an example, one would replace using System.Query by using System.Query.Sequence. Obviously, the name choice for the static class is key to readability of the consumer code.
  • It's a good idea to segregate extension methods into a separate namespace which is exclusively used for extension methods. The compiler might enforce this rule. However, one might like to add other classes (static or not) to the same namespace as "helpers" to provide the functionality; additional rules for existence of those classes would be required: these might be nested or declared as internal or ...
  • A separate statement to import extension methods might be more expressive to indicate the use of extensions. It'd make the compiler's life a little easier by having an explicit list of "extension method sources". Possibilities include using extensions ... (too long?) or extend from ... (from is a token already) or extend with ... or whatever else.
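For what it's worth, a later version of the language (C# 6) added a using static directive that comes close to the per-class import from the first remark: it brings the extension methods of one static class into scope without importing the rest of its namespace. A minimal sketch, reusing the names MyNamespace and TheStaticClassWithExtensions from above; OtherExtensions, Shout, Whisper and the Consumer namespace are hypothetical additions for the demo:

```csharp
using System;

namespace MyNamespace
{
    public static class TheStaticClassWithExtensions
    {
        public static string Shout(this string s) { return s.ToUpperInvariant() + "!"; }
    }

    public static class OtherExtensions
    {
        public static string Whisper(this string s) { return s.ToLowerInvariant() + "..."; }
    }
}

namespace Consumer
{
    // Imports the members of this one static class only, not the namespace.
    using static MyNamespace.TheStaticClassWithExtensions;

    public static class Program
    {
        public static void Main()
        {
            string s = "Hello world";

            // Shout is in scope as an extension method.
            Console.WriteLine(s.Shout()); // prints "HELLO WORLD!"

            // s.Whisper() would not compile here because OtherExtensions was
            // not imported; the explicit static call still works.
            Console.WriteLine(MyNamespace.OtherExtensions.Whisper(s)); // prints "hello world..."
        }
    }
}
```

This gives class-level control only; there is still no way to import or exclude individual extension methods.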

What about multiple extension sources available for the same type? For example: namespace Bar contains an extension method called Reverse for System.String, while namespace Foo does the same. Namespace segregation might help but won't eliminate all possible problems. Again, more fine-grained control might be desirable.

  • The last bullet from the previous paragraph could be taken a little step further, again introducing more control but more writing as well: extend <target type> with <extension type>. For example: extend System.String with Bar.StringExtension where StringExtension is the static class with extension methods.
  • This idea would be optimal when extension methods are made more explicit concerning type-based separation. For example, you might want to extend type System.String, which you might do in a separate "extension class" for that type: class StringExtension extends System.String. It's not inheritance (the : operator) but the keyword choice might be confusing for people who know Java. It eliminates the need to use the this keyword on the first parameter, because extensions are now defined on a type-by-type basis: public string Reverse() { // use 'this' somewhere }. Therefore, the this keyword can be used as if you were really inside the target type itself, except that it only exposes public members. If you ever get the chance to change the target type itself, you could simply copy-paste the code written in the extension method to the target type's definition. The question that remains: where are static modifiers required for "clarity" (this and static are quite orthogonal)? Let's just leave the static modifiers in place for the sample below.
  • One might argue that with this level of fine-grained control, extension methods could take precedence under all circumstances: if you don't like a particular extension, you could just fall back on the old-style static method calls where you need the extensions. By doing this, extension methods hide instance methods; syntax coloring might indicate the use of extension methods.

 

Some random ideas

An example would be this:

using System;

namespace Bar
{
   public static class StringExtensions extends String // or public extension class StringExtensions : String
   {
      public static string Reverse()
      {
         char[] c = this.ToCharArray();
         Array.Reverse(c);
         return new string(c);
      }
   }
}

namespace Foo
{
   extend String with Bar.StringExtensions;

   class Program
   {
      public static void Main()
      {
         string s = "Hello world";
         Console.WriteLine(s.Reverse());
      }
   }  
}
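For comparison, here is the same sample in the extension method syntax that actually compiles. This is just a sketch to put next to the imaginary version: the receiver becomes an explicit first parameter marked with this, the import happens per namespace through using, and the namespace names Bar and Foo are kept from the sample above.

```csharp
using System;

namespace Bar
{
    public static class StringExtensions
    {
        // The receiver is an explicit first parameter marked with 'this'.
        public static string Reverse(this string s)
        {
            char[] c = s.ToCharArray();
            Array.Reverse(c);
            return new string(c);
        }
    }
}

namespace Foo
{
    using Bar; // brings every extension method declared in namespace Bar into scope

    public static class Program
    {
        public static void Main()
        {
            string s = "Hello world";
            Console.WriteLine(s.Reverse());                 // prints "dlrow olleH"
            Console.WriteLine(StringExtensions.Reverse(s)); // same call in explicit static form
        }
    }
}
```

The differences with the proposal are exactly the two points discussed above: the body refers to the parameter s instead of this, and the consumer cannot say which target type or which extension class it wants to import.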

Pros:

  • Versioning problem solved because extension methods take precedence (i.e. hide) over instance methods. Fine-grained control over the extension imports can be used as an argument for this principle. Explicit instance member call syntax isn't available; in such a case one has to remove the extend ... with ... statement.
    • Note: extend ... with ... could be extended even further to allow extension imports on a method-per-method basis or with 'except for' lists to reduce a conflict (heavy!):
      • extend <target type> with <extension class>
      • extend <target type> with <extension class>.<extension method>
        - or - extend <target type> with <extension method comma-separated list> in <extension class>
        - or - extend <target type> with <extension class> methods <extension method comma-separated list>
      • extend <target type> with <extension class> excluding <extension method comma-separated list>
        - or - extend <target type> with <extension class> methods excluding <extension method comma-separated list>
    • Samples:
      • extend String with Bar.StringExtensions;
      • extend String with Bar.StringExtensions.Reverse;
        extend String with Reverse in Bar.StringExtensions;
        extend String with Bar.StringExtensions methods Reverse;
      • extend String with Bar.StringExtensions excluding TrimEnd; //extension class has defined TrimEnd but we want to use the instance method defined on System.String
        extend String with Bar.StringExtensions methods excluding TrimEnd;
  • Can deal with multiple extensions for the same class thanks to fine-grained "extension imports". Multiple extensions for the same type could be imported as long as they don't overlap with each other (i.e. as long as no two of them define a method with the same signature).
  • Porting extension methods to instance methods is a matter of copy-paste if one has access to the target type's source code (e.g. in a later stage of a product's evolution when a target type itself can be extended with functionality).

Cons:

  • More complex than the original idea of extension methods, for the end user and the compiler. In the end, there's more (but more explicit) syntactical sugar involved.
    • Sample:

      using System.Query;

      becomes

      extend IEnumerable<T> with System.Query.Sequence;
  • Requires a separate extension class for each type extended (needed for fine-grained control).
  • An explicit static method call to an extension method is less intuitive because of the hidden parameter:
    • Sample:

      Calling

         public static class StringExtensions extends String
         {
            public static string Reverse()

      is done like this

         string reverse = StringExtensions.Reverse(somestring);
    • This shouldn't be a problem if you
      • consider the extend ... with ... excluding ... syntax which can be used to reduce static method calls;
      • consider possible IntelliSense support in Visual Studio which shows the hidden parameter when making an explicit static method call;
      • compare it with the current situation where the "this <class> parameter" gets hidden when calling an extension method (read: things are just turned upside down).
    • Observe that compilation of the extension method itself is not that much more difficult: all occurrences of this should be replaced by the (hidden) first parameter's value/reference.

The solution of John using contractual explicitness seems pretty nice too, but syntax looks quite awkward at first glance with the "method casting style":

s.(IContainable)Contains(t);

Nevertheless, it provides a good solution to the precedence and versioning-related problems mentioned above by allowing the caller to be explicit about the choice of target method. However, on the definition side things look a little weird for the moment:

public static class Extender : IFooable
{
   public static void Foo(this Thing t)
   {
      //...
   }
}

public interface IFooable
{
   void Foo();
}

For example, interfaces normally describe instance members, yet here it's a static class that "implements" IFooable. Also, looking at the interface in isolation, you can't see the correlation with the extension of the "Thing" class.

 

Conclusion

To summarize, I do understand the concerns of people like John with the current implementation of extension methods and I also agree that things need to be changed a bit to make their use safer and to eliminate possible versioning issues. In this post, I've shed some light on those issues and proposed ideas for possible solutions, which might be a little 'dramatic' because of the 'makeover character' of these propositions. Less disruptive solutions might work out pretty well too, such as more fine-grained control over which extension types (and/or methods) have to be imported. In the end, I'd like to say that extension methods should be used with care, like most language features, given the risky versioning business we're faced with when writing reusable libraries.



Comments

# re: C# 3.0 Extension Method Versioning Troubles - Some thoughts and random ideas

Sunday, December 10, 2006 3:05 PM by John Rusk

Hi Bart, Good point about "my solution" looking a bit weird. I do like the way it means that neither instance nor extension methods have precedence by default, but rather that if there is a conflict the user will be called on to resolve it. Also, if there is intentional implementation of an instance method to take precedence over an extension method (e.g. a LINQ method is implemented as an instance method for some type which needs special handling) then the instance method will be automatically, and _safely_, chosen by the compiler. I think this is important in any solution to these issues. (See http://dotnet.agilekiwi.com/blog/2006/04/extension-methods-more-than-sugar.html for the pattern which I think the LINQ team wants to support for LINQ methods.) I'll re-read the rest of your post again in a couple of days, when I have more time... Regards, John

# re: C# 3.0 Extension Method Versioning Troubles - Some thoughts and random ideas

Sunday, December 10, 2006 9:28 PM by bart

Hi John,

You're absolutely right about the possibility to have an intentional instance method precedence for LINQ stuff. For the moment a happy marriage between the mentioned problems and all the problem constraints seems a bit unlikely; changing it one way or another causes other "parties" to be unhappy. Whenever I have some free time left, I'll think further about possible implications.

-Bart

# Windows Vista - Exploring the Windows System Assessment Tool (WinSAT) API in C# (some reactions)

Wednesday, December 13, 2006 2:59 PM by B# .NET Blog

Yesterday I published a blog post about the WinSAT API in Windows Vista . It's always great to see others

# C# 3.0: Why shouldn't you use extension methods?

Thursday, December 06, 2007 3:34 AM by Code is poetry

Following on from my post "Why you should no longer use class inheritance", here is the sequel.