Saturday, March 26, 2005 11:36 PM bart

Adventures in Comega - part 4 (Possibly-null values)

Introduction

In the 4th episode of my "Adventures in Comega" series I'll be talking about a (smaller) language feature called "possibly-null values". As an example, consider a boolean, which can take only two values (binary logic): true or false. But what if you want to express that the logical value is unknown? As bool is a value type whose domain is only {0,1}, we can't express this. Possibly-null values solve this problem by allowing you to assign null to the variable (or, which amounts to the same thing, not assigning anything to it at all) to indicate that the value is unknown, but only if the variable is marked as being possibly-null.

 

Reference types and null

Today, you can use the "null" value for reference types. Basically, null indicates that the variable has not been assigned an object. As a matter of fact, behind the scenes that's the same as having a NULL pointer in the variable, because of the nature of reference types: the memory location of a reference-type variable contains just an address (that is, a pointer) to another place in memory where the real data is stored (which is dynamically allocated memory, cf. malloc in C). In C# today the null keyword is used for this purpose: indicating whether or not a variable refers to an object. Quite often you'll see code like this:

if (null != someVar)
  //do this
else
  //ow, there is some problem, handle it

C#-programming style tip: As a side remark, allow me to explain the programming style of putting the constant first in an equality comparison (null != someVar). When you compare things in C#, you always need to type two characters: == for equality and != for inequality. It's easy to drop one of these characters (not through lack of language knowledge I hope, but because of a typo). When you write a == 5, there's no problem. But if you forget one of the two equals signs, you get a = 5, an assignment. In C/C++ an assignment expression evaluates to the assigned value, which counts as true when it is not 0 and as false when it is 0, so if (a = 5) compiles, normally without warnings. C# is stricter: an int is not implicitly converted to bool, so if (a = 5) is a compile-time error, and the mistake only compiles when the variable is itself a bool (if (b = true)), in which case the compiler warns you about this risky construction. With the constant and the variable reversed, as in 5 == a, it's still possible to make the same typo (5 = a), but now you always get an error because a constant cannot be assigned to.

Another place where null values are used is in databases (you probably know the DBNull value). Today, in O/R mapping you can't directly express that a boolean, an integer or another field of a basic type has null as its value, because null is not in the domain of those types. Comega will help to solve this problem too.

 

NullReferenceException, casts and "as"

Another thing that is related to the concept of null is the NullReferenceException. Take a look at the following code:

SomeClass c = null;
c.DoSomething();

Although this compiles, the CLR will throw a NullReferenceException at runtime because you can't perform an operation on a null-valued variable. Or, in C/C++ terms, you can't dereference a null pointer:

SomeClass *c = NULL; //or "SomeClass* c = NULL", anyway c is a pointer (indicated by the asterisk)
(*c).DoSomething(); //the same as c->DoSomething(), but the *c syntax tells a little more in this demo for C-newbies :-)

The way to avoid this problem is to test the value of c before using it, or to put the whole thing in a try...catch block. In the same way, the next piece of code, which reads a property, will fail:

SomeClass c = null;
string s = c.SomeStringValuedProperty;

Yet another place where the null value is present is when you're using the keyword "as" in C# to perform a cast that can possibly fail:

//assume you got some variable o of type System.Object (e.g. through a method parameter)
MyClass c = (MyClass) o; //will throw an exception if o is not a (subtype of) MyClass instance
MyClass cbis = o as MyClass; //won't throw an exception but will assign null to cbis if the type constraints are not fulfilled
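
To make the difference between the two casts concrete, here is a minimal C# sketch. MyClass is an empty placeholder class and CastDemo with its two helper methods are names made up for this illustration; they are not part of any library.

```csharp
using System;

// Placeholder type for the illustration.
public class MyClass { }

public static class CastDemo
{
    // Explicit cast: throws InvalidCastException when o is not a MyClass.
    public static string DescribeExplicitCast(object o)
    {
        try
        {
            MyClass c = (MyClass)o;
            return c == null ? "null" : "MyClass instance";
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    // "as" cast: never throws, yields null when o is not a MyClass.
    public static string DescribeAsCast(object o)
    {
        MyClass c = o as MyClass;
        return c == null ? "null" : "MyClass instance";
    }

    public static void Main()
    {
        object text = "not a MyClass";
        Console.WriteLine(DescribeExplicitCast(text)); // InvalidCastException
        Console.WriteLine(DescribeAsCast(text));       // null
        Console.WriteLine(DescribeExplicitCast(null)); // null
    }
}
```

Note that an explicit cast of a null reference succeeds and simply yields null; only a non-null object of the wrong type makes it throw.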

 

Introducing possibly-null values

In Comega, this problem is solved using possibly-null values, as shown in the next example:

bool? b = null; //you can perform the test (null == b)

This piece of code declares a boolean variable that can possibly be null (indicated by the ?). So you can assign null, true or false to it, and you can test it for null. In the case of a boolean, this gives you a kind of three-valued (ternary) logic. Now, if you're using such a variable, you can even cast a null-valued variable without encountering an exception:

MyClass? c = null;
MyClass d = (MyClass) c; //d is not possibly-null but as c is possibly null, this cast does not throw an exception

Notice that this cast doesn't work with a value type such as bool. If you write this:

bool? b = null;
bool a = (bool) b; //will throw an exception, value types without a ? are never nullable
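
The three states of a bool? and the failing cast can also be tried out in present-day C#, whose nullable value types use the same ? notation. This is only a sketch under that assumption: it shows the behaviour of C#'s Nullable<T>, not of the Comega compiler, and the names TernaryDemo, Describe and TryCast are invented for the example.

```csharp
using System;

public static class TernaryDemo
{
    // Maps the three possible states of a bool? to a label.
    public static string Describe(bool? b)
    {
        if (b == null) return "unknown";
        return b.Value ? "true" : "false";
    }

    // Casting a null bool? to bool throws InvalidOperationException in C#.
    public static string TryCast(bool? b)
    {
        try
        {
            return ((bool)b).ToString();
        }
        catch (InvalidOperationException)
        {
            return "InvalidOperationException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(null));  // unknown
        Console.WriteLine(Describe(true));  // true
        Console.WriteLine(TryCast(null));   // InvalidOperationException
        Console.WriteLine(TryCast(false));  // False
    }
}
```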

 

Transitivity of null values

One of the things Comega wants to solve by introducing the concept of possibly-null values is the (infamous?) NullReferenceException. The idea is to make null transitive, so that a property getter call on a null-valued variable can return null too:

MyClass? c = null;
bool? b = c.SomeBooleanValuedProperty; //b will be null; no exception should be thrown
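
For comparison, later versions of C# (6 and up) offer a similar idea through the explicit null-conditional operator ?. rather than through the type of the variable. The sketch below assumes such a compiler; MyClass, its property and the TransitivityDemo helper are stand-ins written for this example, and it says nothing about how Comega itself implements the feature.

```csharp
using System;

// Stand-in class with a boolean property, as in the snippet above.
public class MyClass
{
    public bool SomeBooleanValuedProperty { get { return true; } }
}

public static class TransitivityDemo
{
    // c?.X evaluates to null when c is null instead of throwing
    // a NullReferenceException; the result type becomes bool?.
    public static bool? ReadProperty(MyClass c)
    {
        return c?.SomeBooleanValuedProperty;
    }

    public static void Main()
    {
        Console.WriteLine(ReadProperty(null) == null);          // True
        Console.WriteLine(ReadProperty(new MyClass()) == true); // True
    }
}
```

The difference in design is that C# asks for the ?. at every access, whereas the Comega idea described above makes the propagation follow from the variable being declared possibly-null.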

"Homework": what will be the result of the following code snippets?

MyClass? c = null;
string s = c.SomeStringValuedProperty;

and of this:

MyClass? c = null;
string? s = c.SomeStringValuedProperty;

and this:

MyClass? c = null;
MyClass child = c.SomeChildMyClassValue;

 

And once again ... we'll dive into the IL stuff

Let's keep it as simple as possible this time :-). Consider this piece of code:

public static void Main()
{
  bool? b;
  Console.WriteLine(b);
  b = true;
  Console.WriteLine(b);
}

In compiled format, the IL of Main is this:

.method public hidebysig static void  Main() cil managed
{
  .entrypoint
  // Code size       32 (0x20)
  .maxstack  5
  .locals init (valuetype StructuralTypes.'Boxed' V_0)
  IL_0000:  ldloca.s   V_0
  IL_0002:  call       instance object StructuralTypes.'Boxed'::ToObject()
  IL_0007:  call       void [mscorlib]System.Console::WriteLine(object)
  IL_000c:  ldc.i4.1
  IL_000d:  newobj     instance void StructuralTypes.'Boxed'::.ctor(bool)
  IL_0012:  stloc.0
  IL_0013:  ldloca.s   V_0
  IL_0015:  call       instance object StructuralTypes.'Boxed'::ToObject()
  IL_001a:  call       void [mscorlib]System.Console::WriteLine(object)
  IL_001f:  ret
} // end of method Test::Main

Clearly, the type bool? is translated into a type called Boxed in the StructuralTypes namespace, a generic type constructed with the type of the target variable (in this case System.Boolean). We already saw the Boxed type in the previous post (there ? was used as well, but to indicate the number of occurrences inside the struct definition of a content class). Now you can take a closer look at the Boxed type.

One of the first things you'll see is the IsNull method:

.method public hidebysig instance bool  IsNull() cil managed
{
  // Code size       10 (0xa)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldfld      bool[] StructuralTypes.'Boxed'::'box'
  IL_0006:  ldnull
  IL_0007:  ceq
  IL_0009:  ret
} // end of method 'Boxed'::IsNull

This is the one being used to determine the "null-ness" of the variable. Furthermore, there is a getter (GetValue) and a setter (SetValue), which are both self-explanatory (the same statement holds for the constructor and the Equals method).
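
To picture what such a type might look like in source form, here is a hypothetical C# reconstruction based only on the IL shown above: a value type with an array-typed field called box, where a null array means "no value". BoxedSketch is a made-up name, and the bodies of GetValue, SetValue and ToObject (including the exception type) are assumptions, not the actual Comega runtime code.

```csharp
using System;

// Hypothetical reconstruction, not the real StructuralTypes.Boxed source.
public struct BoxedSketch<T>
{
    // A null array means "no value", mirroring the 'box' field in the IL.
    private T[] box;

    public BoxedSketch(T value)
    {
        box = new T[] { value };
    }

    // Same test as the IL: load the field, compare it with null.
    public bool IsNull()
    {
        return box == null;
    }

    // Assumption: reading a null value is reported with an exception.
    public T GetValue()
    {
        if (box == null) throw new InvalidOperationException("Value is null.");
        return box[0];
    }

    public void SetValue(T value)
    {
        box = new T[] { value };
    }

    // Assumption: a null box is handed out as a null object reference.
    public object ToObject()
    {
        return box == null ? null : (object)box[0];
    }
}

public static class BoxedSketchDemo
{
    public static void Main()
    {
        BoxedSketch<bool> b = default(BoxedSketch<bool>);
        Console.WriteLine(b.IsNull()); // True
        b = new BoxedSketch<bool>(true);
        Console.WriteLine(b.IsNull()); // False
        Console.WriteLine(b.ToObject()); // True
    }
}
```

Under these assumptions the zero-initialized struct (the .locals init in the IL of Main) has a null box, so a variable that was never assigned behaves exactly like one that was assigned null.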

Also, you'll find a couple of static methods for the operator overloads for equality, inequality and casting (both explicit and implicit). These are pretty simple to understand too if you know the nature of the comparison overloads: there are three overloads of == that differ in their operand types (apparently boxed with boxed, boxed with plain value and plain value with boxed) and three matching overloads of !=, thus in total 6 comparison operator static methods.

Notice you'll also find a class called BoxedEnumerator (generic too, and constructed with System.Boolean in our example), which was not used directly in our sample.
