Sunday, December 30, 2007

Sudoku Part 0: Introduction


It's been a while since my last post here. Part of the reason is that I traveled for the holidays, which made it harder to find time to update this blog.

While traveling I break out Sudoku (and other similar puzzle games) to help the time go by on the way to my destination. As a programmer I tend to take any problem I'm solving and analyze it as if I were developing a program to solve the problem for me. This isn't generally something I follow through on, but rather just a mental exercise where I wonder about the what-ifs.

While working on a Sudoku puzzle I thought: as fun as solving this puzzle is, writing an algorithm that solves it in the same manner that I do would be even more enjoyable. Better yet, there may be many ways to build a Sudoku-solving algorithm. I also thought that in order to make an enjoyable Sudoku-solving algorithm I would also need to create a Sudoku-generating algorithm. Again, I thought there was potential for many algorithms.

This time it got me thinking. Sudoku is fun; maybe this would be a good platform to introduce some topics. I have spent some time reading about Behavior-Driven Development (BDD), and would like to introduce an example of it on my blog. This seems like a good candidate, since most people are familiar with the puzzles. Secondly, I would also like to use this as an example of how to use a Dependency Injection framework. The fact that I hope to generate multiple algorithms would seem to encourage the use of such a tool here.

I plan to make this a multi-part series as there are too many topics to cover to justify a single blog post. I am still investigating the problem now, so don't blame me if it takes a while to write the entire series.

Initial Thoughts

When I was first thinking about these problems on the plane I thought a Sudoku generator would be relatively simple. Well, I can now say with confidence that it is not. The algorithm I formed in my head went something like the following:
  1. Divide the board into rows, columns and regions, matching the general rules of the game.
  2. For each row, column and region create a set of Sudoku values (1-9) which are valid to be used for that grouping.
  3. Loop over all pieces of the board.
  4. For each piece take the intersection of the valid values from this piece's row, column and region.
  5. Randomly select a value from the remaining values, assign it to the piece and then remove it from the sets of valid values for the piece's row, column and region.
  6. Once a valid board is created, randomly select a piece on the board to clear the value from.
  7. Continue randomly removing values until the puzzle is no longer uniquely solvable, then re-add the last removed value.
This seemed like a reasonable algorithm to me. It seemed that there would always be a choice of valid values, seeing as all previous pieces used valid values. I guess I just assumed that while placing pieces there would always be remaining valid values until you finished the puzzle. To my initial surprise, this algorithm failed on step 5 every time. The generator kept hitting situations where there were no valid values which could be placed on the current piece.
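For the curious, here is roughly what steps 1 through 5 look like in code. This is just a minimal sketch (the types and names are mine, not from any real implementation), but running it a few times shows the dead end almost immediately:

using System;
using System.Collections.Generic;
using System.Linq;

public static class NaiveGenerator
{
    public static int[,] TryGenerate()
    {
        Random random = new Random();
        int[,] board = new int[9, 9];

        // Steps 1 and 2: one set of remaining valid values (1-9)
        // per row, column and region.
        HashSet<int>[] rows = CreateSets();
        HashSet<int>[] cols = CreateSets();
        HashSet<int>[] regions = CreateSets();

        // Step 3: loop over all pieces of the board.
        for (int r = 0; r < 9; r++)
        {
            for (int c = 0; c < 9; c++)
            {
                int region = (r / 3) * 3 + (c / 3);

                // Step 4: intersect the valid values from this piece's
                // row, column and region.
                List<int> candidates = rows[r]
                    .Intersect(cols[c])
                    .Intersect(regions[region])
                    .ToList();

                // This is where step 5 dies: nothing guarantees the
                // intersection is non-empty just because every earlier
                // placement was valid.
                if (candidates.Count == 0)
                    throw new InvalidOperationException(
                        "Dead end at (" + r + "," + c + "): no valid value remains.");

                // Step 5: randomly select a value and retire it from
                // all three groupings.
                int value = candidates[random.Next(candidates.Count)];
                board[r, c] = value;
                rows[r].Remove(value);
                cols[c].Remove(value);
                regions[region].Remove(value);
            }
        }

        return board;
    }

    private static HashSet<int>[] CreateSets()
    {
        HashSet<int>[] sets = new HashSet<int>[9];
        for (int i = 0; i < 9; i++)
            sets[i] = new HashSet<int>(Enumerable.Range(1, 9));
        return sets;
    }
}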

See the below example of the result of my algorithm. Note that there is no valid value which can be placed in the next box, even though all previous entries were entirely valid. According to the row of the piece the value should be 4, yet 4 is invalid for that piece's column and region. So clearly 4 will not work there, and any value besides 4 will cause an issue for the row. This algorithm clearly doesn't work.


With that experiment I'm pretty much back to the drawing board. I do have a few ideas on algorithms that should work, but my goal with this is to have at least two working sudoku generation algorithms. So far I'm at least enjoying the hunt for a working algorithm. I would like to create at least one solution on my own, but at some point I may have to break down and look for some assistance.

--John Chapman

Sunday, December 16, 2007

UI: Good Use For Extension Methods

Ok, this is a little weird for me, but I think I just had my epiphany. I think I have finally converted into an Extension Methods believer! I know, first there was C# 3.0 Extension Methods? A Good Idea? and then there was Reserving Judgement, but now I have found where I really, really like them.

How many of you have written a class like this:


public class Person
{
    private string firstName;
    private string lastName;

    public string FirstName
    {
        get { return firstName; }
        set { firstName = value; }
    }

    public string LastName
    {
        get { return lastName; }
        set { lastName = value; }
    }

    public string GetFullName()
    {
        return LastName + ", " + FirstName;
    }
}


I don't know about you, but I feel dirty every time I write something like that. To me this is display logic, and now it's polluting my business objects! This really smells to me. This smells so bad that I prefer to put a method on my page which does the formatting for me. But the problem is that people are used to this sort of syntax. They expect this to be how they format that class.

How do we satisfy both sides? Keep the presentation logic out of the business layer, yet keep the interfaces clean? Enter Extension Methods!

In order to solve this problem I would now create a new static class in my web assembly (UIHelper possibly, or another name that fits). This new class would look like the following:


public static class UIHelper
{
    public static string GetFullName(this Person person)
    {
        return person.LastName + ", " + person.FirstName;
    }
}


Now you can take the GetFullName method out of the Person class and use the extension method in your UI layer instead. The class will still work exactly the same way from a UI perspective, except that when working within the Person class there will no longer be a GetFullName method.

The Person class now looks like this:

public class Person
{
    private string firstName;
    private string lastName;

    public string FirstName
    {
        get { return firstName; }
        set { firstName = value; }
    }

    public string LastName
    {
        get { return lastName; }
        set { lastName = value; }
    }
}


Just make sure you add the using directive to the namespace which contains the UIHelper class on your page (or add an Import directive to your aspx file). Your code on the page now looks like the following:
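Something along these lines; a minimal sketch, assuming a Web Forms code-behind where the person variable and the lblFullName Label are made-up names:

// With the using directive for the UIHelper namespace at the top of the code-behind:
protected void Page_Load(object sender, EventArgs e)
{
    Person person = LoadCurrentPerson(); // hypothetical data access call
    lblFullName.Text = person.GetFullName(); // compiles to UIHelper.GetFullName(person)
}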

This is actually the happiest I've been with any use of extension methods when you control the class which is being extended.

--John Chapman

Are Developers Lazy?

I have been watching the ASP.NET 3.5 Extensions with great interest. Of special note is the MVC framework that Microsoft is adding to ASP.NET. If you haven't heard anything about this framework I recommend you go take a look at the posts Scott Guthrie has made starting with this one. I'm really excited about the potential enhancements which are offered with this new framework. Yes, I know Monorail is nothing new, but something has always kept me from adopting it.

Of particular interest with the MVC framework is the ability to easily test the majority of your UI code. This is something which today is near impossible. Plus, who hasn't been frustrated with the quality of the HTML which is generated by ASP.NET WebForms? Finally a way to take real control of your web pages while still using the ASP.NET pipeline (or at least what remains of it).

What has really caught me by surprise though is the amount of complaining people are doing regarding the ControllerActionAttribute. Basically, with the MVC framework, in order to expose an action from a controller (an action basically equates to the way a given URL interacts with the controller) you need to mark it with the ControllerActionAttribute. This is the default policy which Microsoft provides, and you can change it if you want to go through the many steps involved in re-working some of their components.
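In code, the requirement looks something like this (a sketch based on my reading of the early CTP bits; the controller and method names are invented):

public class ProductsController : Controller
{
    // Explicitly exposed: reachable via a URL such as /Products/List
    [ControllerAction]
    public void List()
    {
        RenderView("List");
    }

    // Public, but never reachable from a URL, because it is not marked.
    public void RebuildCache()
    {
    }
}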

Developers have been seemingly outraged by this. Most developers (at least the ones who speak the loudest) seem to say that any public method on the controller should automatically be considered a controller action, meaning all public methods of a controller should be publicly accessible via a URL. Now, I may be in the minority here, but having public methods exposed via a URL is actually pretty scary, and I actually like explicitly specifying which methods are exposed to a URL and which are not. Not because I wouldn't be able to tell with just public methods, and not because I wouldn't be able to use it properly, but because many developers mistakenly write public methods when they should not be public. A developer may not realize what the "magic" is which is taking place. Making it explicit takes very little time, and then there is far less chance of mistakenly exposing parts of the application that have no business being exposed.

Is it really that much work to mark a method as [ControllerAction]? I don't get it. It takes half a millisecond, and then everything is clear. After all, we do this same sort of thing with all of our unit testing frameworks, don't we? Why not just automatically treat every public method of a test fixture as a test unless otherwise marked? The argument seems to hold there as well. I understand other MVC frameworks have worked in this way where an attribute wasn't needed, but so what? Is it that we're lazy, or is it that we are resistant to change? If we're resistant to change, then it seems like this whole argument is moot since this MVC framework is a big change already.

Maybe instead we're resistant to change so complaining ensues for a couple of months before we realize that there are actually benefits on the other side and then we quiet down for a while?

The really interesting part of all of this is that the people complaining about the attribute are the same people who complain about the designer and drag & drop "programming". Admittedly, I am also one of those people who hate the designer. This means that when it comes to writing their pages they aren't being lazy. They are taking great care over the final output and look of their application.

So if being lazy is not a problem while writing the HTML for your web pages (or views, as I should say with MVC), then why is assigning an attribute such a big deal?

--John Chapman

Wednesday, December 12, 2007

Visual Studio 2008 Test Result Cleanup!

I don't know how many other people used MSTest (the unit testing capabilities of Visual Studio 2005 Team Edition), but those test results really pile up after a while.

We have 525 unit tests on our current project. At one point I noticed that my disk space was running low on my development machine. I had no clue how so much space could evaporate so quickly. I started doing quick checks on different folders that I thought could be problematic. I eventually found my way to our solution folders. We typically have multiple branches loaded on our local machines, but I never thought we wrote enough code to justify a 20+ gig project folder. Further investigation led me to a folder known as TestResults. I found out that Visual Studio was keeping the results of every unit test run I ever did.

Don't get me wrong, I love the ability to review past test runs to see where one failed. I really enjoy the ability to create bugs directly from the results of the test where the associated result is attached to the bug for the assigned developer to review (assuming it wasn't the assigned developer that noticed the failing test). But keeping a history of these files was really adding up!

I found myself deleting the old test runs from this folder every so often. Then something really interesting happened today. I tried running my full suite of tests after loading the solution in Visual Studio 2008. A message box appeared stating that my old test runs would be deleted since I was over the maximum of 25 historical test runs. It made me agree to the deletion of these items and also told me that the number was configurable in the Tools > Options menu!

Now, maybe this feature existed in 2005, but I could never find it. This turns out to be one of those features that isn't really advertised and isn't major, but it just made me happy. For some reason it's the little things that seem to get me excited. It seems like this would have been so simple to leave out, yet they took care of it anyway.

Good job on that one Microsoft! You just made my life a little easier.

--John Chapman

Monday, December 10, 2007

Open Source Rocks!

I've recently begun seriously investigating mocking frameworks (see NMock and Rhino Mocks specifically). I've looked at mock frameworks before, but I was quick to dismiss them. I never really saw the benefit. I honestly don't think I really understood what they were used for.

At first I thought maybe they were used to automatically create my business objects so that I would have full object graphs in memory (something we use a lot to perform mapping file tests). It turned out the tools didn't really fit that purpose, and I pushed them aside.

Now, I spent some time again wondering if I missed the big picture when I looked before. After all, it was years ago that I last researched mock frameworks. While the NMock interface seemed pretty clean to me, I didn't like the use of strings. Rhino Mocks' ability to use strongly typed method calls sealed the deal. How could I use a weakly typed mock framework when a popular strongly typed variety exists?

I've decided that mock frameworks come in handy when performing unit tests which would otherwise use external resources. Previously I always skipped testing these methods and moved as much of the business logic as possible to methods which had no outside dependencies. I've finally seen the light and embraced mock frameworks for what they were truly intended.

If you haven't looked at Rhino Mocks, I strongly recommend you do! Let's take a very simple example to see the power of this tool. This is a totally contrived example, so don't blame me if it is unrealistic. Let's say you have a dependency on an external pricing service which finds the best available price for a given product for your company's purchasing department. Let's define the needed interface as IPricingService:


public interface IPricingService
{
    decimal GetPrice(Product p);
}


A very simple service for sure. Given any product it will return the current price which we can pay for that product. We'll use this service in the AddDetail method of our Order class. See the Order, OrderDetail and Product implementations below. Note I use public fields for brevity only.

public class Product
{
    public int Id;
    public string Name;
}

public class Order
{
    private IPricingService pricingService;

    public IList<OrderDetail> Details = new List<OrderDetail>();

    public Order(IPricingService pricingService)
    {
        this.pricingService = pricingService;
    }

    public OrderDetail AddDetail(Product product, int quantity)
    {
        OrderDetail detail = new OrderDetail(
            product, quantity, pricingService.GetPrice(product));

        Details.Add(detail);
        return detail;
    }
}

public class OrderDetail
{
    public Product Product;
    public int Quantity;
    public decimal Price;

    public OrderDetail(Product product, int quantity, decimal price)
    {
        this.Product = product;
        this.Quantity = quantity;
        this.Price = price;
    }
}


Now our job is to test the AddDetail method which we see above. We don't want to rely on the actual pricing service which we would use in production, so instead we will resort to our trusty Rhino Mocks framework to fill in a fake implementation. Now we can test 100% of the code in our Order class without worrying about external dependencies. This helps keep our unit tests focused as well as allows us to improve our code coverage.

[TestMethod]
public void AddDetail()
{
    MockRepository mocks = new MockRepository();
    Product p = new Product();

    IPricingService pricingService = mocks.CreateMock<IPricingService>();

    Expect.Call(pricingService.GetPrice(p)).Return(15.50M);

    mocks.ReplayAll();

    Order order = new Order(pricingService);
    OrderDetail newDetail = order.AddDetail(p, 20);
    Assert.AreEqual<decimal>(15.50M, newDetail.Price);
    Assert.AreEqual<int>(20, newDetail.Quantity);

    mocks.VerifyAll();
}


Take special note of the MockRepository, the CreateMock method and the Expect builder. Expect basically tells the framework that it should expect a call to the method provided in Call(), with the specified parameter. When the mock receives that call with the provided parameter, it should return the value 15.50 as the price.

ReplayAll is a bit confusing, but basically it tells Rhino Mocks to stop recording method calls, and instead to return the results we configured and track the calls which are made.

After we have called ReplayAll we make our method calls and assert that the new order detail was created correctly and added with the appropriate price which was returned by the pricing service. VerifyAll tells us that the expectations we created before replaying were met (such as expecting a call to the GetPrice method with a parameter of p).

If I didn't want to set expectations for method calls I could have opted for the SetupResult class instead of the Expect class.

This is just the tip of the iceberg regarding the framework. If this piqued your interest, you owe it to yourself to take a deeper look.

If you're like me and you saw Expect.Call() taking an execution of the very method you expect a call for, you asked yourself, "How the heck did he do that?" When I first tried it I was blown away that it worked. I tried to come up with ideas of how it worked.

This is one of the true beauties of open source software: I can satisfy my own curiosity! After playing with the framework I downloaded the source code to check it out. The final implementation wasn't actually that complex. Basically the proxy object which the mock framework creates tracks what the last method call was and what parameters were provided to it. So the Call() method doesn't really do anything except return the options for the last method call which was made. The argument provided to Call() is completely ignored! I view this as genius. It's really thinking outside the box in a way that I'm not so sure I would have come up with. Being exposed to code that works in different ways than I think allows me to expand my mind and come up with solutions to my own problems that I may not have otherwise found.
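To illustrate the idea, here is the shape of the trick (this is not Rhino Mocks' actual source, just a minimal sketch of the concept):

public class CallRecord
{
    public string MethodName;
    public object[] Arguments;
}

public static class Recorder
{
    // The dynamically generated proxy sets this on every intercepted call.
    public static CallRecord LastCall;
}

public static class Expect
{
    // By the time Call() runs, evaluating its argument has already invoked
    // the proxy, which stored the call in Recorder.LastCall. The argument's
    // value itself is completely ignored.
    public static CallRecord Call(object ignored)
    {
        return Recorder.LastCall;
    }
}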

I've also spent a considerable amount of time reviewing some of the NHibernate code. That, after all, is how I was able to write such a specific performance test regarding NHibernate accessors: I used just the specific NHibernate pieces I needed.

As software developers we are unbelievably lucky to be living in a day and age where such great code is available for free for us to learn from. I recommend that everyone interested in really improving their skills find an open source project which revolves around something they are interested in and take a look. There is tons of great stuff out there! We should all take advantage.

Lastly, I just want to thank Oren Eini (aka Ayende Rahien), this Rhino Mocks framework he created is truly awesome.

--John Chapman

Sunday, December 9, 2007

Linq to Sql: The Good, The Bad & The Bottom Line

I promised my take on Linq to Sql a few days ago. I have spent some time over the past couple of days playing with Linq to Sql connected to the AdventureWorks SQL Server sample database.

I have a lot of experience working with NHibernate so you may see some comparisons throughout the post.

Overview

Most everyone who is likely to read this post probably knows what Linq to Sql is. For those who don't: Linq to Sql (and really Linq in general) has been one of the most talked about and hyped features of Visual Studio 2008 and the .NET 3.5 framework, at least once we found out Linq to Entities wasn't going to ship with Visual Studio 2008.

Linq to Sql is actually a big shift for Microsoft: it is Microsoft's first production-quality Object Relational Mapper, or O/RM for short. They may have tried in the past with products such as ObjectSpaces, but this is the first tool to be released as a completed product. O/RM tools exist to try to solve the object-relational impedance mismatch: most applications are developed in object oriented programming languages these days, yet the data they operate on is typically stored in a relational database. The process of moving data between objects and relations (and vice versa) is the impedance mismatch. There are obviously many fundamental differences between data stored in a relation and data stored in our objects.

Traditionally Microsoft has endorsed using DataSets to solve this problem. DataSets are essentially a relation-based object in your object oriented programming language; they allow you to work with your data in your application as relational data. The problem with this? You fail to take advantage of object oriented application design and the advantages it brings. Typically these programs have little testability and a significant amount of duplication. As such, many O/RM tools became popular (although far less so than if Microsoft had endorsed them) such as NHibernate, LLBLGen Pro, Vanatec OpenAccess, Wilson ORMapper, EntitySpaces, eXpress Persistent Objects and many others (apologies to any I didn't list).

Note that Linq to Sql isn't necessarily a direct competitor to NHibernate or the other above listed O/RM tools for the .NET framework, that is Linq to Entities (AKA ADO.NET Entity Framework). Linq to Sql is more of an introduction to the O/RM world.

The Good

  • The Linq query language itself
The Linq query language is just awesome. It really is a joy when you start to work with it. It can quickly become a pain because it is complex, but then it makes you realize just how powerful it is. I have never seen a query language that is quite so rich. Basic queries are very simple to write and understand, yet it also provides functionality for very complex queries.

Plus, the queries are strongly typed, so there is much less to worry about when refactoring your business objects: compile-time checks are now available for your queries (see the sketch after this list). Note that even with stored procedures, if you change a column in a table referenced by a stored procedure, nothing informs you that you just broke the stored procedure. Likewise, queries stored as strings in your applications will not inform you if you change a property name or column either.

For fun, see the following blog post: Taking LINQ to Objects to Extremes: A fully LINQified RayTracer. This is not something you would actually do, but it does help reinforce just how powerful Linq really is.
  • Better Naming Conventions Than NHibernate
While working with Linq to Sql I felt that the methods on the context were easy to understand and more intuitive than the NHibernate equivalents. For example, when you want to save your changes to the database, NHibernate says Flush whereas Linq to Sql uses SubmitChanges. But the big advantages are Linq to Sql's InsertOnSubmit versus NHibernate's Save, as well as Attach versus NHibernate's Update or Lock methods.

I can't tell you how many times I've explained how the Save, Update and Lock functionality in NHibernate works. Most people seem to think that they need to call these methods to cause a database operation to take place. They assume Save means execute an insert NOW, and Update means execute an update NOW! Then they use Flush for good measure because someone told them to. The Linq to Sql naming convention makes it clearer that that is not quite what is going on.
  • Simple to Get Started
It didn't take me very long to get up and going with Linq to Sql. While I'm not the biggest fan of the Object Relational Designer, it sure is easy to use and fast to build basic object graphs. Someone who is not familiar with O/RM tools should be able to have objects mapped to database tables in a matter of minutes. This could work very well for simple RAD applications. This process really couldn't be much simpler.
  • Superior Optimistic Concurrency Support
My apologies to any O/RM tools out there that have concurrency support as good as Linq to Sql's; I just know I prefer the flexibility offered by Linq to Sql over NHibernate's. That being said, NHibernate's concurrency has always worked fine for me; it's just nice to have additional options.

First, when a ChangeConflictException is thrown it includes a ton of information, such as the entity and the columns involved, and allows your code to recover from it. Linq to Sql will also let you configure whether you want to catch all change conflicts or fail as soon as the first conflict is found. These are features which, to my knowledge, NHibernate does not support.

Plus, this is basic, but Linq to Sql has native support for SQL Server timestamp columns. This allows you to know of all updates even if they occur outside the scope of Linq to Sql. For some reason NHibernate still does not support this type of column; instead it rolls its own version column.

Resolving stale data with RefreshMode allows for many options when re-syncing your objects with the database. Again, I just like the options.
  • Superior Stored Procedure Support
If you have a wealth of stored procedures, rest assured they are easy to use from Linq to Sql. Just drag (I do feel dirty using that word) the stored procedure from the server explorer to the methods list in the object relational designer and you will see a new method on your associated context which directly calls that stored procedure. To your code it looks the same as any other method.

Note it is also possible to write your Linq to Sql CRUD through stored procedures. This is also a relatively simple process.
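As promised above, here is a quick taste of a strongly typed query. This is a sketch against the AdventureWorks-generated context from my experiments; rename a mapped property and this breaks at compile time rather than at runtime:

using (AdventureWorksDataContext context = new AdventureWorksDataContext())
{
    var recentOrders =
        from po in context.PurchaseOrderHeaders
        where po.OrderDate > DateTime.Now.AddDays(-30)
        orderby po.OrderDate descending
        select new { po.PurchaseOrderID, po.OrderDate, po.ShipDate };

    // The query only executes once we enumerate it.
    foreach (var order in recentOrders)
        Console.WriteLine("{0} ordered {1:d}, shipped {2:d}",
            order.PurchaseOrderID, order.OrderDate, order.ShipDate);
}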

The Bad
  • Very Basic Object Support
This is actually the killer here. Linq to Sql is a very basic O/RM and does not support many of the object oriented concepts sophisticated applications are likely to use. Just a few of the missing features are:
    • No Inheritance
    • No value-based objects (i.e. NHibernate Components)
    • No Timespan support (A huge problem for the Logistics field I work in)
    • Collections limited to EntitySet (which isn't even a real Set)
      • Where is the Dictionary support at least?
  • No SaveOrUpdate Equivalent
This forces more persistence knowledge to a lower level, requiring that all code which associates an object with a context must know whether it already exists in the database or not. This basically just adds extra checks in your code which should not be necessary. It can seem a bit dirty to check whether an object already has a primary key yourself; it seems like logic which doesn't belong within the application itself.
  • GUI based Drag & Drop
Yes, I know you can use a separate mapping file, much like you can with NHibernate, but this isn't realistic. If you don't use the designer, you don't get the code generation. If you don't get the code generation you are responsible for writing all of the many hooks in your objects that Linq to Sql needs. Folks, these objects are quite dirty. At least with NHibernate your objects are completely persistence ignorant (aka POCO, Plain Old CLR Objects), meaning they look clean and usable for more than just NHibernate. Therefore using anything besides the designer isn't very feasible.

The big problem here though is that your entire object graph needs to live in one diagram and the code behind these objects winds up in a single code file by default. This just isn't acceptable for applications of any size. Diagrams which contain 20-30 objects would be a major pain here, let alone applications that have hundreds. For large applications this just wouldn't fly.
  • Relationships Aren't Interface Based
All of the associations to related objects are handled with EntitySet and EntityRef, whereas with NHibernate you have ISet and just the object type you expect. This basically forces the Linq to Sql references onto your objects, decreasing the ability to unit test them in my opinion. I also don't like the persistence-based dependencies on my objects.
  • Transaction API is Goofy
For whatever reason you need to handle all explicit transactions outside of the Linq to Sql context. You have to create the transaction and then commit it outside the context, while supplying it to the context while it is in use (see the sketch below). Linq to Sql implicitly uses transactions for all calls to SubmitChanges, but you would think it would be possible to begin new transactions via the context, and then commit or roll them back through the context as well.
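To illustrate the dance (a sketch; DataContext.Transaction is the real hook, while the context name comes from my AdventureWorks sample and connectionString is assumed to be defined elsewhere):

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();

    using (SqlTransaction transaction = connection.BeginTransaction())
    {
        AdventureWorksDataContext context = new AdventureWorksDataContext(connection);
        context.Transaction = transaction; // the context borrows, but never owns, the transaction

        // ... make changes to entities ...
        context.SubmitChanges();

        transaction.Commit(); // committed outside the context
    }
}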

The Bottom Line

Really, I have only touched on a brief overview of Linq to Sql here. The important question I ask myself is, "Would I use this framework?" It's a bit of a difficult question. If I were writing a small application which I knew would not grow into a large one, and my object model were simple enough for the limited object support, yes, I would use it. I could get up and going very fast, and I enjoy working with the context interfaces.

However, if I were working on a larger application (it really doesn't take much to be too large for what I would do with Linq to Sql), or one which I thought had the potential to adjust and grow over time, I would skip Linq to Sql and reach for my trusty NHibernate.

So really, I would only use it for a very small subset of the problems out there that I try to solve.

All of that being said, I think Linq to Sql is very important to the .NET development community. Microsoft has historically pretended that O/RM tools didn't exist and that doing any development except with their DataSets or repetitive patterns was crazy. Now that Microsoft has a framework to endorse, it should greatly expand the exposure to such technologies in the .NET development community. I think overall this is a good thing, and it will result in overall superior developers.

My only concern with this introduction is that people may get the idea that O/RM tools are nice and get you up and going fast, but fall flat on their face once you try to do anything advanced, forcing you to resort to the same tools you used all along. This was actually a very common opinion among people I talked to about NHibernate a few years ago. They had heard of others using O/RM tools (not NHibernate specifically) and how they just don't handle advanced things; they are only good for simple things.

With Linq to Sql I hope developers become exposed to O/RM and become curious about other tools such as NHibernate when Linq to Sql is too simple for what they need, instead of grouping all O/RM tools together as being too simple and idealistic.

I'm actually excited about the potential of the .NET development community now that more people will be exposed to O/RM. Long live O/RM tools, you have been lifesavers for me!

--John Chapman

More on Extension Method Judgment

Just another tidbit following up on my last post, Reserving Judgment. I was reading Scott Guthrie's latest post on the ASP.NET MVC framework (ASP.NET MVC Framework (Part 4): Handling Form Edit and Post Scenarios), and noticed that he showed a tool which was using extension methods in much the same way Greg Young was describing. See Scott's use of the ASP.NET MVC HTML Helpers.

What can I say, I actually kind of like how they work in that scenario.

Reserving Judgment

I have been very quick to put down certain pieces of the .NET 3.5 framework (see my prior posts on extension methods and partial methods), most specifically extension methods and partial methods. I'm not quite ready to give in on partial methods yet; they still just seem like a tool used to make code generators slightly easier. But I think I'm ready to reserve judgment on extension methods now.

I still think extension methods are dangerous and are going to be subject to blatant misuse. That being said, I think I need to keep an open mind regarding the potential uses for extension methods. Note that I'm referring to the use of extension methods entirely within code that you control.

I recently read a blog post by Greg Young titled A Use For Extension Methods where he basically stated that there are a lot of people out there who share my kind of viewpoint. He also wanted to show one possible way extension methods could be used. He showed that extension methods were useful for controlling the context of IntelliSense when using a fluent interface for a pseudo-builder pattern. I call it pseudo since really there is only one type of available builder.

Basically his extension method looks like the following:


public class Builder
{
}

public class Create
{
    public static Builder New = new Builder();
}

// CostBuilder stubbed out, and the extension method wrapped in a static
// class (which extension methods require), so the sample compiles.
public class CostBuilder
{
}

public static class BuilderExtensions
{
    public static CostBuilder Cost(this Builder s)
    {
        return new CostBuilder();
    }
}


So now when you call Create.New you'll only see the builders which are in the namespaces you referenced in your using directives.

Fair enough. Upon seeing this I thought it seemed interesting, but then I went into my pessimistic anti-extension-method mode and came up with my usual answer: why not just do this?

public static class Create
{
    public static T New<T>() where T : new()
    {
        return new T();
    }
}



But upon reading this, it's pretty lame, isn't it? It's basically a generic factory that does nothing but call new; why not just call new CostBuilder().DoStuff().DoMoreStuff()?

But Greg's interface is a bit cleaner than what I am offering. It's not that extension methods are really needed here; it's just that they can actually make your code a little cleaner and slightly easier to read (when working on large projects), which is something I didn't expect to say regarding extension methods.

Maybe I need to wait this one out for a year to see what kinds of things developers come up with while using extension methods.

--John Chapman

Saturday, December 8, 2007

C# Type Inference But Still Strongly Typed

I have spent considerable time today reviewing Linq, and more specifically Linq to Sql. I'm currently working on a blog post where I'll go into the details of what I think the pros and cons of Linq to Sql are, as well as my overall opinion. In case you couldn't guess it, I'll be using NHibernate for my comparisons; after all, it is what I'm familiar with.

While reviewing some things I ran into the following compile-time check. It was very simple for me to resolve, but I wonder if it will cause developers to fall into traps, especially those developers who have some experience with weakly typed languages such as Javascript.

Take a look at the following code I wrote:


AdventureWorksDataContext context = new AdventureWorksDataContext();

var orders = from po in context.PurchaseOrderHeaders
             select po;

if (chkUseDate.Checked)
{
    orders = from po in orders
             where po.OrderDate > dtOrderFrom.Value
             select po;
}

orders = from po in orders
         orderby po.OrderDate ascending
         select new
         {
             po.PurchaseOrderID,
             po.RevisionNumber,
             po.OrderDate,
             po.ShipDate
         };


Does anyone see what is wrong with the code above and why it failed to compile?

The compile-time error was:
Cannot implicitly convert type 'System.Linq.IQueryable<AnonymousType#1>' to 'System.Linq.IQueryable<BLL.PurchaseOrderHeader>'. An explicit conversion exists (are you missing a cast?)

After seeing that I immediately realized what I had done: I took a variable whose type was inferred as a query returning PurchaseOrderHeader objects and tried to assign it a query returning anonymous type objects instead. You can't just change a reference to be of another type in C# 3.0; hence the strong typing. I should know better.

But honestly, with the whole var keyword, I wasn't really thinking about it. It was a minor slip up, but I wonder how many developers will fall into that trap. I think some developers may have seen the var keyword before in Javascript, and they may use it in the fashion I just did.

That being said, I have been enjoying my time with Linq today. I should have a post up within the next few days with more details.

P.S. If you're wondering what is going on with the three-step Linq queries above, that's how you write dynamic queries in Linq. Simply reference the previously defined query in your new Linq query in order to further restrict the query which you are building. Keep in mind that writing a Linq query doesn't perform any operations. You have to either enumerate over the values of the query or call a method on the query like ToArray(), ToDictionary() or Count().

If you're curious how to resolve the issue above, you just need to declare a new variable for the last query to store the new type. var results = <Linq expression> would work just fine.
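That is, the final query above becomes:

var results = from po in orders
              orderby po.OrderDate ascending
              select new
              {
                  po.PurchaseOrderID,
                  po.RevisionNumber,
                  po.OrderDate,
                  po.ShipDate
              };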

--John Chapman

Solution Folders To Group Projects

While reading Scott Guthrie's blog today I stumbled upon one of his links of the day for December 8th, entitled Big Solutions Can Be Organized Using Solution Folders, which is at the .NET Tip of the Day web site.

Actually, this is really cool. Am I the only schmuck who didn't realize this was possible? I have used Solution Folders in the past, but it was always to group files which were not part of the actual build process. For example, we always had a Libraries folder which contained the third party dlls which need to be referenced to build the project. This way all a developer needs to do to build the project is download the latest source code, and the solution will automatically reference the needed assemblies. Plus any upgrade we do to a third party dll automatically propagates when someone gets latest.

But the ability to place projects within the folders to group them together? This will actually be very handy for us. At work our current solution has 33 projects, and truthfully it is growing! Being able to group the projects which are similar or somehow related will really come in handy for us.

Why didn't I realize this sooner?

--John Chapman

Testing For Collections

Where I work we developed some foundation classes to assist with our unit testing. We make heavy use of NHibernate in our product (we work on one very large application) and wanted a way to ensure that we were not breaking our mapping files over time.

In order to accomplish this goal we developed a mechanism to manage multiple sessions and then compare object graphs between the two sessions. Basically we call a creation method for the specified object type we are testing, assign all properties with random values, and then assign many-to-one associations using the child objects' creation methods.

The creation method has the option of automatically saving the constructed objects to an NHibernate session or not (these creation methods are also used for non NHibernate based tests).

After the object graph has been constructed we flush the session, construct a new session and then load the same parent object via the new session. We then wrote a base method for our tests which compares the two objects to make sure they are identical.

Our comparison method uses reflection on the objects which are being compared. We reflect over all properties and ensure they are all equal. When we reach a property which is not a simple type, we recursively call our comparison method on that property to ensure all child properties of these objects are compared. To ensure there are no infinite loops we keep a collection of already verified objects to make sure we don't compare the same object twice.
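A minimal sketch of the idea (heavily simplified from our real code; the simple-type check here is far cruder than what we actually use):

using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;
using Microsoft.VisualStudio.TestTools.UnitTesting;

public static class GraphAssert
{
    public static void AreEqual(object expected, object actual,
                                ICollection<object> visited)
    {
        if (expected == null || visited.Contains(expected))
            return; // null, or already verified earlier in the walk
        visited.Add(expected);

        foreach (PropertyInfo property in expected.GetType().GetProperties())
        {
            object left = property.GetValue(expected, null);
            object right = property.GetValue(actual, null);

            if (IsSimpleType(property.PropertyType))
                Assert.AreEqual(left, right,
                    property.Name + " did not survive the round trip.");
            else if (!(left is ICollection)) // collections are skipped -- see below
                AreEqual(left, right, visited);
        }
    }

    private static bool IsSimpleType(Type type)
    {
        return type.IsValueType || type == typeof(string);
    }
}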

The one catch here is that currently we ignore collections. Collections are far more difficult to test, especially because the majority of our collections are Sets. We can't simply check the number of positions in the collection and each position of the collection, since every load of the object could result in a new order (Sets are not ordered, after all).

During refactoring we accidentally broke our comparison tests. While reading the below, note that all collection properties of our business objects are interfaces which are later replaced with the appropriate collection implementation. The original implementer who checked to see if the property was a collection used the following code:


if (type.ToString().ToUpper().StartsWith("SYSTEM.COLLECTIONS")
|| type.ToString().ToUpper().StartsWith("IESI.COLLECTIONS"))


There are several things I don't like about that code. To be fair, it actually worked for our implementations, but it is extremely dirty.

First, you shouldn't be checking for a type based on a name. Hard coding string based namespaces is unreliable and a bad idea. There is no way to get any sort of compile time checking from this.

Second, I'm typically opposed to ToUpper(); it's almost never what anyone wants, it's just been accepted from historical practice. People either do it because they don't want to think about what the comparison should really be, or they do not know about using a CaseInsensitiveComparer.

Third, what if we chose to use a custom collection for one of our implementations? In that case it wouldn't actually have a name which matched either of those conditions.

So obviously I thought this needed to be re-factored. My attempt is shown below:

if (obj is IEnumerable)

Seems really simple, right? My thinking behind this was that every collection interface we use or have ever known about has mandated that it also implement the IEnumerable interface. Take a look at all of your collection interfaces; no other interface is common between them. ICollection is not enforced by the generic interfaces. It may be enforced by the non-generic interfaces, but not the generic ones. As such I determined it wasn't safe to use ICollection.

Well, a co-worker of mine found out that I had actually broken our mapping tests. Worse than that, they were broken in a way where we would receive false positives. This simple change resulted in our mapping tests no longer testing strings. The System.String class is IEnumerable. This makes sense upon further inspection (a string is an enumerable list of characters). However, it was not something I first thought of.

So, IEnumerable was bad; we have learned that. We could have checked for IEnumerable but not a string, but that seems dirty.

The sad thing here is that checking for ICollection actually works. Every single collection in the .NET 2.0+ framework and the Iesi.Collections package implements the ICollection interface; our problem with that is that it is not enforced by the interfaces we are using. Does anyone know why IList<T> does not mandate ICollection? It mandates ICollection<T>, and every implementation implements ICollection, so why not? It seems like it would have been beneficial to everyone. I have yet to find the drawback. After all, it does enforce IEnumerable.

My next attempt turned out to not actually work either. I really was thinking this was the answer for us, but I didn't have my thinking cap on. My code can be found below. Note that obj in the example is the object which is being checked for a collection. Also note this is not a direct copy and paste; there are optimizations in our real code which are removed for simplicity here.

Type type = obj.GetType();
Type genericCollectionType = typeof(ICollection<>);

if (obj is ICollection
    || (type.IsGenericType
        && type.GetGenericArguments().Length
           == genericCollectionType.GetGenericArguments().Length
        && genericCollectionType
           .MakeGenericType(type.GetGenericArguments())
           .IsAssignableFrom(type)))


If you look here, I check for ICollection first, and then if the item being tested does not match that interface I fall back on a generic collection test. Note that I compare the length of the generic arguments, since attempting to MakeGenericType with the improper number of arguments would throw an exception.

This generic check works for collections like List<T> and Iesi.Collections.Generic.HashedSet<T>, but it doesn't work for Dictionary<K,V>. Basically ICollection<T> has one generic type parameter, but IDictionary<K,V> has two and therefore won't be tested. I would need to determine a way to see if it is assignable from ICollection<KeyValuePair<K,V>>, since that is how Dictionary<K,V> implements ICollection<>.

I'm not sure of the cleanest way to check for this. Plus, it won't actually be hit, since there doesn't exist a collection out there which doesn't implement ICollection.
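One cleaner option might be to walk the implemented interfaces and look for any closed ICollection<>. This is just a sketch, not what we currently run:

static bool IsCollectionType(object obj)
{
    if (obj == null || obj is string)
        return false; // strings are IEnumerable, but not collections for our purposes

    if (obj is ICollection)
        return true; // catches every non-generic collection

    // Any closed ICollection<T>, which covers IDictionary<K,V> as well,
    // since Dictionary<K,V> implements ICollection<KeyValuePair<K,V>>.
    foreach (Type itf in obj.GetType().GetInterfaces())
    {
        if (itf.IsGenericType
            && itf.GetGenericTypeDefinition() == typeof(ICollection<>))
            return true;
    }

    return false;
}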

So basically I'm jumping through a number of hoops which aren't totally needed, but I really wish IList<> and IDictionary<> enforced ICollection<>; then I would feel a whole lot more comfortable with this.

Does anyone know why they were left out? Is there a really good reason I am missing?

--John Chapman

Thursday, December 6, 2007

MS Unit Exception Testing

How do people handle exception expectations within the Microsoft unit testing framework built in to Visual Studio Team Edition? Of course there is the built-in ExpectedExceptionAttribute, where you mark the test itself with an exception which should be thrown while performing the unit test. If no exception is thrown it fails, if an exception of that type is thrown it passes, and if the exception is of the wrong type it fails.

I don't know about others, but this rarely sits well with me. Typically when I write a test method I want the ability to test other conditions after an exception is thrown, or I want to use a single test method to test all exceptional cases of a single business object method. Do most people just write one test per case of a method? It seems like this would result in a lot of duplicated unit tests.

To address this issue I have historically used the following code:


[TestMethod]
public void TestSomeMethod()
{
    try
    {
        SomeMethod();
        Assert.Fail("Expected ExpectedException to be thrown "
            + "while calling SomeMethod.");
    }
    catch (ExpectedException) { }
}


Recently it occurred to me: why do I have to go through this process every time I want to test for exceptions? At first I was hopeful that an additional mechanism would be made available with Visual Studio 2008 MSTest, but it was not. I then decided to investigate the ability to extend the framework.

Honestly, I feel a little dirty saying this, but this appears to be a good use case for extension methods. Yes, I know my previous statements regarding extension methods (C# 3.0 Extension Methods? A Good Idea?), but I honestly think this would be a good use of them.

Imagine writing the following code instead:

[TestMethod]
public void TestSomeMethod()
{
    Assert.ThrowsException<ExpectedException>
        (delegate { SomeMethod(); });
}


Wow, I really would enjoy that functionality if it were available to me. Unfortunately for me, I cannot make this an extension method. Extension methods are only allowed on instances of objects, and since Assert is a static class you cannot add extension methods to it. So instead of an extension method we are left with our own custom static class for testing.

See the implementation of this below:

public static class CustomAssert
{
    public delegate void MethodDelegate();

    public static void ThrowsException<T>(MethodDelegate method)
        where T : Exception
    {
        ThrowsException<T>(method,
            "Expected Exception of type "
            + typeof(T).ToString()
            + " was not thrown.");
    }

    public static void ThrowsException<T>(MethodDelegate method, string message)
        where T : Exception
    {
        try
        {
            // Invoke the delegate directly; DynamicInvoke would wrap any
            // exception in a TargetInvocationException and defeat catch (T).
            method();
            Assert.Fail(message);
        }
        catch (T) { }
    }
}


and now the final test looks like:

[TestMethod]
public void TestMethod1()
{
    CustomAssert.ThrowsException<ApplicationException>
        (delegate { SomeMethod(); });
    CustomAssert.ThrowsException<ApplicationException>
        (delegate { SecondMethod(5); });
}

In all honesty, when I first decided this was something I was interested in, I thought extension methods would work for this. While writing the code for this blog post I came to the realization that they wouldn't.

While researching how to make this approach work, I did stumble across the xUnit.Net framework. It looks like xUnit.Net already supports functionality very similar to what I am describing in this post. If you're interested in a unit testing framework with functionality like what I described, maybe that is where you should look. If you want to stick to the existing Microsoft MS Unit framework, this may be a good approach to better handle your exception testing needs.

--John Chapman

Tuesday, December 4, 2007

DateTime.Now Precision Issues... Enter StopWatch!

I'm a little bit embarrassed by this one. Previously I blogged about DateTime.Now Precision, or the lack thereof. I first discovered this phenomenon while performing the performance tests for my blog post NHibernate Access Performance, which showed the differences between property and field level access within NHibernate.

Basically I found that DateTime.Now was not very precise and worked horribly as a benchmarking tool to test relative performance. Well, now I've come to my senses and realized that the nail I was banging on was really a screw, and the hammer wasn't the best tool for the job. I've now come to learn of the screwdriver which answers my benchmarking needs. It turns out the System.Diagnostics.Stopwatch class exists for just the purpose I was trying to use DateTime.Now for. DateTime.Now is simply not accurate enough to be used for diagnostics.

The Stopwatch class is very simple. To use it you construct an instance of the object, call Start(), perform your operations and then call Stop(). The Stopwatch then exposes a property, ElapsedTicks, with the most accurate elapsed time available.
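A minimal sketch of the pattern (substitute whatever operation you are measuring inside the loop):

using System;
using System.Diagnostics;

Stopwatch watch = new Stopwatch();
watch.Start();

for (int i = 0; i < 100000; i++)
{
    // ... operation being measured ...
}

watch.Stop();
Console.WriteLine("Elapsed: {0} ms ({1} ticks)",
    watch.ElapsedMilliseconds, watch.ElapsedTicks);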

While I wish I had known about this when I originally recorded the benchmarks, I know about it now, and I always say I enjoy learning new things. With this newfound knowledge of the Stopwatch class I decided to re-run my NHibernate Access Performance tests to see how they looked. The results were dramatically different from what I found with the DateTime mechanism.

First, I re-ran the 100,000 accesses test with two threads running simultaneously. This is meant to replace the original test I ran. These results can be seen below. Note that all times are in milliseconds.


I then decided to run these same tests again, but this time in a synchronous manner, meaning that only one test would be running at any given time. These results can be seen below:


In truth, these last numbers are probably the most accurate representation of the true performance implications of using the various NHibernate access strategies.

Of interest, though, is the fact that my earlier tests for property level access were showing values of close to 600 ms for the basic property setter, whereas a simple switch to the Stopwatch showed just 62 ms. That is a difference of a full order of magnitude! I was actually surprised by just how much these numbers varied. It turns out that the accuracy of DateTime.Now is even worse than I thought.

--John Chapman

Blogger Syntax Highlighter