Saturday, December 8, 2007

Testing For Collections

Where I work we developed some foundation classes to assist with our unit testing. We make heavy use of NHibernate on our product (We work one very large application) and wanted a way to ensure that we were not breaking our mapping files over time.

In order to accomplish this goal we developed a mechanism to manage multiple sessions and then compare object graphs between the two sessions. Basically we would call a creation method for the specified object type we are testing, assign all properties with random values, and then assign many-to-one associations using the child object's creation method.

The creation method has the option of automatically saving the constructed objects to an NHibernate session or not (these creation methods are also used for non NHibernate based tests).

After the object graph has been constructed we then flush the session, construct a new session and then load the same parent object view the new session. We then wrote a base method for our tests which compares the two objects to make sure they are identical.

Our comparison method uses reflection on objects which are being compared. We reflect over all properties and ensure they are all equal. When we reach a property which is not a simple type we recursively call our comparison method on that property to ensure all child properties of these objects are compared. To ensure there are no infinite loops we keep a collection of already verified objects to make sure we don't compare the same object twice.

The one catch here is that currently we ignore collections. Collections are far more difficult to test, especially because the majority of our collections are Sets. We can't simply check the number of positions in the collection and each position of the collection since every load of the object could result in a new order (Sets are not ordered after all).

During re-factoring we accidentally broke our comparison tests. While reading the below note that all collection properties of our business objects are interfaces which are later replaced with the appropriate collection implementation. The original implementer who checked to see if the property was a collection used the following code:


if (type.ToString().ToUpper().StartsWith("SYSTEM.COLLECTIONS")
|| type.ToString().ToUpper().StartsWith("IESI.COLLECTIONS"))


There are several things I don't like about that code. To be fair, the code actually worked for our implementations, but it is extremely dirty. There are several things wrong with this.

First, you shouldn't be checking for a type based on a name. Hard coding string based namespaces is unreliable and a bad idea. There is no way to get any sort of compile time checking from this.

Second, I'm typically opposed to ToUpper(), it's almost never what anyone wants, it's just been accepted from historical practices. People either do it because they don't want to think about what it should really be, or they do not know about using a CaseInsensitiveComparer.

Third, What if we chose to use a custom collection for one of our implementations? In that case it wouldn't actually have a name which matched either of those conditions.

So obviously I thought this needed to be re-factored. My attempt is shown below:

if (type is IEnumerable)

Seems really simple right? My thinking behind this was that every collection interface we use or have even known about has mandated that it also implement the IEnumerable interface. Take a look at all of your collection interfaces, no other interface is common between them. ICollection is not enforced by the generic interfaces. It may be enforced by the non-generic interfaces, but not the generic ones. As such I determined it wasn't safe to use ICollection.

Well, a co-worker of mine found out that I actually broke our mapping tests. Worse than that they were broken in a way where we would receive false positives. This simple change resulted in our mapping tests no longer testing strings. The System.String class is IEnumerable. This makes sense upon further inspection (enumerable list of characters). However, it was not something I first thought of.

So, the IEnumerable was bad, we have learned that. We could have checked for IEnumerable but not a String, but that seems dirty.

The sad thing here is that checking for ICollection actually works. Every single collection in the .NET 2.0+ framework and the Iesi.Collections package implements the ICollection interface, our problem with that is that it is not enforced by the interfaces we are using. Does anyone know why IList does not mandate ICollection? It mandates ICollection<T> and every implementation uses ICollection, so why not? It seems like it would have been beneficial to everyone. I have yet to find the drawback. After all, it does enforce IEnumerable.

My next attempt turned out to not actually work. I really was thinking this was the answer for us, but I didn't have my thinking cap on. My code can be found below. Note that the formatting is horrible, but needed to ensure that the blogger template does not cut off my code sample. Also note that obj in the example is the object which is being checked for a collection. Also note this is not a direct copy and paste, there are optimizations in our real code which are removed for simplicity here.

Type type = obj.GetType();
Type genericCollectionType = typeof(ICollection<>);

if (obj is ICollection
|| (type.IsGenericType
&& type.GetGenericArguments().Length
== genericCollectionType.GetGenericArguments().Length
&& genericCollectionType
.MakeGenericType(type.GetGenericArguments())
.IsAssignableFrom(type)
)
)


If you see here I check for ICollection first, and then if the item being tested does not match that interface I fall back on a generic collection test. Note that I compare the length of the generic arguments since attempting to MakeGenericType with the improper number of arguments would throw an exception.

This generic check works for collections like List<t>, Iesi.Collections.Generic.HashedSet<T> but it doesn't work for Dictionary<K,V>. Basically ICollection<T> has one generic type, but IDictionary<K,V> has two and therefore won't be tested. I would need to determine a way to see if it is assignable from ICollection<NameValuePair<K,V>> since that is how the Dictionary<> impelments ICollection<>.

I'm not sure the cleanest way to check for this. Plus, it won't actually be called since there doesn't exist a collection out there which doesn't implement ICollection.

So basically I'm jumping through a number of hoops which aren't totally needed, but I really wish the IList<> and IDictionary<> enforced ICollection<> then I would feel a whole lot more comfortable with this.

Does anyone know why they were left out? Is there a really good reason I am missing?

--John Chapman

No comments:

Blogger Syntax Highliter