Sunday, December 30, 2007

Sudoku Part 0: Introduction


It's been a while since my last post here. Part of the reason is that I traveled for the holidays, which made it harder to find time to update this blog.

While traveling I break out sudoku (and other similar puzzle games) to help the time go by on the way to my destination. As a programmer I tend to take any problem I'm solving and analyze it as if I were developing a program to solve it for me. This isn't generally something I follow through on, but rather just a mental exercise where I wonder about the what-ifs.

While working on a Sudoku puzzle I thought, as fun as solving this puzzle is, writing an algorithm that solved it in the same manner that I do would be even more enjoyable. Better yet, there may be many ways to build a sudoku solving algorithm. I also thought that in order to make an enjoyable sudoku solving algorithm I would also need to create a sudoku generating algorithm. Again, I thought there may be potential for many algorithms.

This time it got me thinking: Sudoku is fun, and maybe it would be a good platform to introduce some topics. I have spent some time reading about Behavior-Driven Development (BDD) and would like to walk through an example of it on this blog; Sudoku seems like a good candidate since most people are familiar with the puzzles. Secondly, I would also like to use this as an example of how to use a Dependency Injection framework, and the fact that I hope to build multiple algorithms reinforces the case for such a tool here.

I plan to make this a multi-part series, as there are too many topics to cover in a single blog post. I am still investigating the program now, so don't blame me if it takes a while to write the entire series.

Initial Thoughts

When I was first thinking about these problems on the plane I thought a sudoku generator would be relatively simple. I can now say with confidence that it is not. The algorithm I formed in my head went something like the following:
  1. Divide the board into rows, columns and regions, matching the general rules for the game.
  2. For each row, column and region create a set of Sudoku values (1-9) which are still valid for that grouping.
  3. Loop over all pieces of the board.
  4. For each piece take the intersection of the valid values from that piece's row, column and region.
  5. Randomly select a value from the remaining valid values, assign it to the piece, and then remove it from the set of valid values for the piece's row, column and region.
  6. Once a valid board is created, randomly select a piece on the board to clear the value from.
  7. Continue randomly removing values until the puzzle is no longer uniquely solvable, and re-add the last removed value.
This seemed like a reasonable algorithm to me. It seemed that there would always be a choice of valid values, seeing as all previous pieces used valid values. I guess I just assumed that while placing pieces there would always be remaining valid values until the puzzle was finished. To my initial surprise, this algorithm failed on step 5 every time. The generator kept hitting situations where there were no valid values which could be placed on the current piece.
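
For the curious, the core of that naive fill (steps 2 through 5) looks roughly like the sketch below. This is only an illustration of the idea rather than the code I'll end up using, and the helper names are made up for this post.

// A sketch of the naive generator; it throws as soon as it hits the dead end described below.
static int[,] GenerateNaive()
{
    Random random = new Random();
    int[,] board = new int[9, 9];

    // One set of remaining valid values (1-9) per row, column and region.
    HashSet<int>[] rows = CreateValueSets();
    HashSet<int>[] columns = CreateValueSets();
    HashSet<int>[] regions = CreateValueSets();

    for (int row = 0; row < 9; row++)
    {
        for (int col = 0; col < 9; col++)
        {
            int region = (row / 3) * 3 + (col / 3);

            // Step 4: intersect the valid values for this piece's row, column and region.
            List<int> candidates = rows[row]
                .Intersect(columns[col])
                .Intersect(regions[region])
                .ToList();

            // Step 5 is where it falls apart: frequently nothing is left to place.
            if (candidates.Count == 0)
                throw new InvalidOperationException("No valid value remains for this piece.");

            int value = candidates[random.Next(candidates.Count)];
            board[row, col] = value;
            rows[row].Remove(value);
            columns[col].Remove(value);
            regions[region].Remove(value);
        }
    }

    return board;
}

static HashSet<int>[] CreateValueSets()
{
    HashSet<int>[] sets = new HashSet<int>[9];
    for (int i = 0; i < 9; i++)
        sets[i] = new HashSet<int>(Enumerable.Range(1, 9));
    return sets;
}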

See the below example of the result of my algorithm. Note that there is no valid value which can be placed in the next box, even though all previous entries were entirely valid. According to the piece's row the value should be 4, yet 4 is invalid for that piece's column and region. So clearly 4 will not work there, and any value besides 4 will cause an issue for the row. This algorithm clearly doesn't work.


With that experiment I'm pretty much back to the drawing board. I do have a few ideas on algorithms that should work, but my goal with this is to have at least two working sudoku generation algorithms. So far I'm at least enjoying the hunt for a working algorithm. I would like to create at least one solution on my own, but at some point I may have to break down and look for some assistance.

--John Chapman

Sunday, December 16, 2007

UI: Good Use For Extension Methods

Ok, this is a little weird for me, but I think I just had my epiphany. I think I have finally converted into an Extension Methods believer! I know, first there was C# 3.0 Extension Methods? A Good Idea? and then there was Reserving Judgement, but now I have found where I really, really like them.

How many of you have written a class like this:


public class Person
{
    private string firstName;
    private string lastName;

    public string FirstName
    {
        get { return firstName; }
        set { firstName = value; }
    }

    public string LastName
    {
        get { return lastName; }
        set { lastName = value; }
    }

    public string GetFullName()
    {
        return LastName + ", " + FirstName;
    }
}


I don't know about you, but I feel dirty every time I write something like that. To me this is display logic, and now it's polluting my business objects! This really smells to me. It smells so bad that I prefer to put a method on my page which does the formatting for me. But the problem is that people are used to this sort of syntax; they expect this to be how they format that class.

How do we satisfy both sides? Keep the presentation logic out of the business layer, yet keep the interfaces clean? Enter Extension Methods!

In order to solve this problem I would now create a new static class in my web assembly (UIHelper possibly, or another name that fits). This new class would look like the following:


public static class UIHelper
{
    public static string GetFullName(this Person person)
    {
        return person.LastName + ", " + person.FirstName;
    }
}


Now you can take the GetFullName method out of the Person class and use the extension method in your UI layer instead. The class will still work exactly the same way from a UI perspective, except that when working within the Person class itself there will no longer be a GetFullName method.

The Person class now looks like this:

public class Person
{
    private string firstName;
    private string lastName;

    public string FirstName
    {
        get { return firstName; }
        set { firstName = value; }
    }

    public string LastName
    {
        get { return lastName; }
        set { lastName = value; }
    }
}


Just make sure you add a using directive for the namespace which contains the UIHelper class on your page (or add an Import directive to your aspx file). Your code on the page now looks like the following:
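
(The names below are illustrative; assume the page has a Label named lblFullName and a Person instance already loaded from the business layer.)

// With the UIHelper namespace imported, the extension method shows up on
// Person in the UI layer just like the old instance method did.
lblFullName.Text = person.GetFullName();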

This is actually the happiest I've been with any use of extension methods when you control the class which is being extended.

--John Chapman

Are Developers Lazy?

I have been watching the ASP.NET 3.5 Extensions with great interest. Of special note is the MVC framework that Microsoft is adding to ASP.NET. If you haven't heard anything about this framework I recommend you go take a look at the posts Scott Guthrie has made starting with this one. I'm really excited about the potential enhancements which are offered with this new framework. Yes, I know Monorail is nothing new, but something has always kept me from adopting it.

Of particular interest with the MVC framework is the ability to easily test the majority of your UI code. This is something which today is nearly impossible. Plus, who hasn't been frustrated with the quality of the HTML generated by ASP.NET WebForms? Finally, a way to take real control of your web pages while still using the ASP.NET pipeline (or at least what remains of it).

What has really caught me by surprise though is the amount of complaining people are doing regarding the ControllerActionAttribute. Basically, with the MVC framework, in order to expose an action from a controller (an action basically equates to the way a given URL interacts with the controller) you need to mark it with the ControllerActionAttribute. This is the default policy which Microsoft provides, and you can change it if you want to go through the many steps involved in reworking some of their components.
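
To make the discussion concrete, here is roughly what a controller looks like based on my reading of Scott's posts (the exact API in the preview bits may differ slightly, and the controller and method names here are made up):

public class ProductsController : Controller
{
    // Reachable from a URL such as /Products/List because of the attribute.
    [ControllerAction]
    public void List()
    {
        RenderView("List");
    }

    // A plain public helper; without [ControllerAction] it is not exposed as an action.
    public void RecalculatePrices()
    {
        // ...
    }
}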

Developers have been seemingly outraged by this. Most of them (at least the ones who speak the loudest) seem to say that any public method on the controller should automatically be considered a controller action, meaning all public methods of a controller would be accessible via a URL. Now, I may be in the minority here, but having every public method exposed from a URL is actually pretty scary, and I like explicitly specifying which methods are exposed to a URL and which are not. Not because I wouldn't be able to tell with just public methods, and not because I wouldn't be able to use it properly, but because many developers mistakenly write public methods when they should not be public. A developer may not realize what "magic" is taking place. Making it explicit takes very little time, and then there is far less chance of mistakenly exposing parts of the application that have no business being exposed.

Is it really that much work to mark a method with [ControllerAction]? I don't get it. It takes half a millisecond, and then everything is clear. After all, we do this same sort of thing with all of our unit testing frameworks, don't we? Why not just automatically treat every public method of a test fixture as a test unless otherwise marked? The argument seems to hold there as well. I understand other MVC frameworks have worked this way, where an attribute wasn't needed, but so what? Is it that we're lazy, or is it that we are resistant to change? If we're resistant to change, then the whole argument is moot, since this MVC framework is a big change already.

Maybe instead we're resistant to change so complaining ensues for a couple of months before we realize that there are actually benefits on the other side and then we quiet down for a while?

The really interesting part of all of this is that the people complaining about the attribute are the same people who complain about the designer and drag-and-drop "programming". Admittedly, I am also one of those people who hates the designer. This means that when it comes to writing their pages they aren't being lazy; they take great care over the final output and look of their application.

So if being lazy is not a problem while writing the HTML for your web pages (or views as I should say with Mvc), then why is assigning an attribute such a big deal?

--John Chapman

Wednesday, December 12, 2007

Visual Studio 2008 Test Result Cleanup!

I don't know how many other people used MSTest (the unit testing capabilities of Visual Studio 2005 Team Edition), but those test results really pile up after a while.

We currently have 525 unit tests on our project. At one point I noticed that disk space was running low on my development machine, and I had no clue how so much space could evaporate so quickly. I started doing quick checks on different folders that I thought could be problematic and eventually found my way to our solution folders. We typically have multiple branches loaded on our local machines, but I never thought we wrote enough code to justify a 20+ gig project folder. Further investigation led me to a folder known as TestResults: Visual Studio was keeping the results of every unit test run I had ever done.

Don't get me wrong, I love the ability to review past test runs to see where one failed. I really enjoy the ability to create bugs directly from the results of the test where the associated result is attached to the bug for the assigned developer to review (assuming it wasn't the assigned developer that noticed the failing test). But keeping a history of these files was really adding up!

I found myself deleting the old test runs from this folder every so often. Then something really interesting happened today. I tried running my full suite of tests after loading the solution in Visual Studio 2008, and a message box appeared stating that my old test runs would be deleted since I was over the maximum of 25 historical test runs. It made me agree to the deletion of these items and also told me that the number was configurable in the Tools > Options menu!

Now, maybe this feature existed in 2005, but I could never find it. It turns out to be one of those features that isn't really advertised and isn't major, but it just made me happy. For some reason it's the little things that get me excited. It seems like this would have been so simple to leave out, yet they took care of it anyway.

Good job on that one Microsoft! You just made my life a little easier.

--John Chapman

Monday, December 10, 2007

Open Source Rocks!

I've recently begun seriously investigating mocking frameworks (see NMock and Rhino Mocks specifically). I've looked at mock frameworks before, but I was quick to dismiss them. I never really saw the benefit; I honestly don't think I really understood what they were for.

At first I thought maybe they were used to automatically create my business objects so that I would have full object graphs in memory. That is something we use a lot to perform mapping-file tests. It turned out the tools didn't really fit that purpose, and I pushed them aside.

Now I've spent some time wondering again whether I missed the big picture when I looked before. After all, it was years ago that I last researched mock frameworks. While the NMock interface seemed pretty clean to me, I didn't like the use of strings. Rhino Mocks' ability to use strongly typed method calls sealed the deal. How could I use a weakly typed mock framework when a popular strongly typed alternative exists?

I've decided that mock frameworks come in handy while performing unit tests which would otherwise use external resources. Previously I always skipped testing those methods and moved as much of the business logic as possible into methods which had no outside dependencies. I've finally seen the light and embraced mock frameworks for what they were truly intended.

If you haven't looked at Rhino Mocks, I strongly recommend you do! Let's take a very simple example to see the power of this tool. This is a totally contrived example, so don't blame me if it is unrealistic. Let's say you have a dependency on an external pricing service which finds the best available price for a given product in your company's purchasing department. Let's define the needed interface as IPricingService:


public interface IPricingService
{
    decimal GetPrice(Product p);
}


A very simple service for sure. Given any product it will return the current price which we can pay for that product. We'll use this service in the AddDetail method of our Order class. See the Order, OrderDetail and Product implementations below. Note I use public fields for brevity only.

public class Product
{
    public int Id;
    public string Name;
}

public class Order
{
    private IPricingService pricingService;

    public IList<OrderDetail> Details = new List<OrderDetail>();

    public Order(IPricingService pricingService)
    {
        this.pricingService = pricingService;
    }

    public OrderDetail AddDetail(Product product, int quantity)
    {
        OrderDetail detail = new OrderDetail(
            product,
            quantity,
            pricingService.GetPrice(product));

        Details.Add(detail);
        return detail;
    }
}

public class OrderDetail
{
    public Product Product;
    public int Quantity;
    public decimal Price;

    public OrderDetail(Product product, int quantity, decimal price)
    {
        this.Product = product;
        this.Quantity = quantity;
        this.Price = price;
    }
}


Now our job is to test the AddDetail method shown above. We don't want to rely on the actual pricing service which we would use in production, so instead we will resort to our trusty Rhino Mocks framework to fill in a fake implementation. Now we can test 100% of the code in our Order without worrying about external dependencies. This helps keep our unit tests focused as well as allows us to improve our code coverage.

[TestMethod]
public void AddDetail()
{
    MockRepository mocks = new MockRepository();
    Product p = new Product();

    IPricingService pricingService = mocks.CreateMock<IPricingService>();

    Expect.Call(pricingService.GetPrice(p)).Return(15.50M);

    mocks.ReplayAll();

    Order order = new Order(pricingService);
    OrderDetail newDetail = order.AddDetail(p, 20);
    Assert.AreEqual<decimal>(15.50M, newDetail.Price);
    Assert.AreEqual<int>(20, newDetail.Quantity);

    mocks.VerifyAll();
}


Take special note of the MockRepository, the CreateMock method and the Expect builder. Expect basically tells the framework that it should expect a call to the method provided in Call(), with the specified parameter. When the mock receives that call with the provided parameter it should return the value 15.50 as the price.

ReplayAll is a bit confusing, but basically it tells Rhino Mocks to stop treating method calls as recordings and to instead return the results we configured and track the calls which are made.

After we have called ReplayAll we make our method calls and assert that the new order detail was created correctly and added with the appropriate price returned by the pricing service. VerifyAll tells us that the expectations we created before replaying were met (such as expecting a call to the GetPrice method with a parameter of p).

If I didn't want to set expectations for method calls I could have opted for the SetupResult class instead of the Expect class.
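
If memory serves, that version of the record phase looks something like the line below (same canned price, but without verifying that GetPrice is actually called):

SetupResult.For(pricingService.GetPrice(p)).Return(15.50M);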

This is just the tip of the iceberg regarding the framework. If this piqued your interest, you owe it to yourself to take a deeper look.

If you're like me, when you saw Expect.Call() taking an execution of the very method you expect a call for, you asked yourself, "How the heck did he do that?" When I first tried it I was blown away that it worked, and I tried to come up with ideas of how it might.

This is one of the true beauties of open source software: I can satisfy my own curiosity. After playing with the framework I downloaded the source code to check it out. The final implementation wasn't actually that complex. Basically, the proxy object which the mock framework creates tracks what the last method call was and what parameters were provided to it. So the Call() method doesn't really do anything except return the options for the last method call which was made; the fact that the call is provided as an argument to Call() is completely ignored! I view this as genius. It's really thinking outside the box in a way that I'm not so sure I would have come up with. Being exposed to code that works differently than I think allows me to expand my mind and come up with solutions to my own problems I may not have found otherwise.

I've also spent a considerable amount of time reviewing some of the NHibernate code. That is, after all, how I was able to write such a specific performance test regarding NHibernate accessors; I used just the specific NHibernate pieces I needed.

As software developers we are unbelievably lucky to be living in a day and age where such great code is available for free for us to learn from. I recommend that everyone interested in really improving their skills find an open source project which revolves around something they are interested in and take a look. There is tons of great stuff out there, and we should all take advantage.

Lastly, I just want to thank Oren Eini (aka Ayende Rahien); the Rhino Mocks framework he created is truly awesome.

--John Chapman

Sunday, December 9, 2007

Linq to Sql: The Good, The Bad & The Bottom Line

I promised my take on Linq to Sql a few days ago. I have spent some time over the past couple days playing with Linq to Sql connected to the AdventureWorks SQL Server sample database.

I have a lot of experience working with NHibernate so you may see some comparisons throughout the post.

Overview

Most everyone who is likely to read this post probably knows what Linq to Sql is. For those that don't, Linq to Sql (and really Linq in general) has been one of the most talked about (Once we found out Linq to Entities wasn't going to ship with Visual Studio 2008) and hyped features of Visual Studio 2008 and the .NET 3.5 framework.

Linq to Sql is actually a big shift for Microsoft. It is Microsoft's first production-quality Object Relational Mapper, or O/RM for short. They may have tried in the past with products such as ObjectSpaces, but this is the first tool to be released as a finished product. O/RM tools exist to address the object-relational impedance mismatch: most applications are developed in object oriented languages these days, yet the data they operate on is typically stored in a relational database, and the process of moving data between objects and relations (and vice versa) is where the mismatch shows up. There are obviously many fundamental differences between data stored in a relation and data stored in our objects.

Traditionally Microsoft has endorsed using DataSets to solve this problem. DataSets are essentially relational objects in your object oriented programming language; they let you work with your data in your application as relational data. The problem with this? You fail to take advantage of object oriented application design and the advantages it brings. Typically these programs have little testability and a significant amount of duplication. As a result many O/RM tools became popular (although far less so than if Microsoft had endorsed them), such as NHibernate, LLBLGen Pro, Vanatec OpenAccess, Wilson ORMapper, EntitySpaces, eXpress Persistent Objects and many others (apologies to any I didn't list).

Note that Linq to Sql isn't necessarily a direct competitor to NHibernate or the other above listed O/RM tools for the .NET framework, that is Linq to Entities (AKA ADO.NET Entity Framework). Linq to Sql is more of an introduction to the O/RM world.

The Good

  • The Linq query language itself
The Linq query language is just awesome. It really is a joy once you start to work with it. It can quickly become a pain because it is complex, but then it makes you realize just how powerful it is. I have never seen a query language that is quite so rich. Basic queries are very simple to write and understand, yet it also provides functionality for very complex queries (see the short example after this list).

Plus, the queries are strongly typed, so there is much less to worry about when refactoring your business objects, as compile-time checks are now available for your queries. Note that even with stored procedures, if you change a column in a table referenced by a stored procedure, nothing informs you that you just broke the procedure. Likewise, queries stored as strings in your application will not inform you if you change a property name or column either.

For fun see the following blog post: Taking LINQ to Objects to Extremes: A fully LINQified RayTracer. This is not something you would actually do, but it does help reinforce just how powerful Linq really is.
  • Better Naming Conventions Than NHibernate
While working with Linq to Sql I felt that the methods on the context were easy to understand and more intuitive than the NHibernate equivalents. For example, when you want to save your changes to your database NHibernate says Flush whereas Linq to Sql uses SubmitChanges. But the big advantages are Linq to Sql's InsertOnSubmit vs NHibernate's Save as well as Attach versus NHibernate's Update or Lock methods.

I can't tell you how many times I've explained how the Save, Update and Lock functionality in NHibernate works. Most people seem to think that they need to call these methods to cause a database operation to take place. They assume Save means execute an insert NOW, and Update means execute an update NOW! Then they use Flush for good measure because someone told them to. The Linq to Sql naming convention makes it clearer that that is not quite what is going on.
  • Simple to Get Started
It didn't take me very long to get up and going with Linq to Sql. While I'm not the biggest fan of the Object Relational Designer, it sure is easy to use and fast to build basic object graphs. Someone who is not familiar with O/RM tools should be able to have objects mapped to database tables in a matter of minutes. This could work very well for simple RAD applications. This process really couldn't be much simpler.
  • Superior Optimistic Concurrency Support
My apologies to any O/RM tools out there that have concurrency support as good as Linq to Sql's; I just know I prefer the flexibility offered by Linq to Sql over NHibernate's. That being said, NHibernate's concurrency has always worked fine for me; it's just nice to have additional options.

First, when a ChangeConflictException is thrown it includes a ton of information such as the entity involved, the columns involved and allows your code to recover from it. Linq to Sql will also let you configure if you want to catch all change conflicts or fail as soon as the first conflict is found. These are features, which to my knowledge, NHibernate does not support.

Plus, this is basic, but Linq to Sql has native support for SQL Server timestamp columns. This allows you to know about every update, even ones that occur outside the scope of Linq to Sql. For some reason NHibernate still does not support this type of column; instead it rolls its own version column.

Resolving stale data with RefreshMode allows for many options when re-syncing your objects with the database. Again, I just like the options.
  • Superior Stored Procedure Support
If you have a wealth of stored procedures, rest assured they are easy to use from Linq to Sql. Just drag (I do feel dirty using that word) the stored procedure from the server explorer to the methods list in the object relational designer and you will see a new method on your associated context which directly calls that stored procedure. To your code it looks the same as any other method.

Note it is also possible to write your Linq to Sql CRUD through stored procedures. This is also a relatively simple process.
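
As a quick illustration of the query syntax praised in the first point above, here is the flavor of query I mean, written against the same AdventureWorks purchase order table I have been playing with (the date filter is arbitrary):

var recentOrders =
    from po in context.PurchaseOrderHeaders
    where po.OrderDate >= new DateTime(2004, 1, 1)
    orderby po.OrderDate descending
    select new { po.PurchaseOrderID, po.OrderDate, po.ShipDate };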

The Bad
  • Very Basic Object Support
This is actually the killer here. Linq to Sql is a very basic O/RM and does not support many of the object oriented concepts sophisticated applications are likely to use. Just a few of the missing features are:
    • No Inheritance
    • No Value based objects (IE NHibernate Components)
    • No Timespan support (A huge problem for the Logistics field I work in)
    • Collections limited to EntitySet (which isn't even a real Set)
      • Where is the Dictionary support at least?
  • No SaveOrUpdate Equivalent
This forces more persistence knowledge down to a lower level, requiring that all code which associates an object with a context know whether it already exists in the database or not. This basically just adds extra checks in your code which should not be necessary. It can seem a bit dirty to check whether an object already has a primary key yourself; it feels like logic which doesn't belong in the application itself.
  • GUI based Drag & Drop
Yes, I know you can use a separate mapping file, much like you can with NHibernate, but this isn't realistic. If you don't use the designer, you don't get the code generation. If you don't get the code generation you are responsible for writing all of the many hooks in your objects that Linq to Sql needs. Folks, these objects are quite dirty. At least with NHibernate your objects are completely persistence ignorant (aka POCO, Plain Old CLR Objects), meaning they look clean and are usable for more than just NHibernate. Therefore using anything besides the designer isn't very feasible.

The big problem here, though, is that your entire object graph needs to live in one diagram, and the code behind these objects winds up in a single code file by default. This just isn't acceptable for applications of any size. Diagrams which contain 20-30 objects would be a major pain, let alone applications that have hundreds. For large applications this just wouldn't fly.
  • Relationships Aren't Interface Based
All of the associations to related objects are handled with EntitySet and EntityRef, whereas with NHibernate you have ISet and just the object type you expect. This basically forces the Linq to Sql references onto your objects, decreasing their unit testability in my opinion. I also don't like the persistence-based dependencies on my objects.
  • Transaction API is Goofy
For whatever reason you need to handle all explicit transactions outside of the Linq to Sql context. You have to create the transaction and commit it outside the context, while supplying it to the context while it is in use. Linq to Sql implicitly uses transactions for all calls to SubmitChanges, but you would think it would be possible to begin new transactions via the context, and then commit or roll them back through the context as well.

The Bottom Line

Really, I have only touched on a brief overview of Linq to Sql here. The important question I ask myself is, "Would I use this framework?" It's a bit of a difficult question. If I were writing a small application which I knew would not grow into a large one, and my object model were simple enough for the limited object support, then yes, I would use it. I could get up and going very fast, and I enjoy working with the context interfaces.

However, if I were working on a larger application (it really doesn't take much to be too large for what I would do with Linq to Sql), or one which I thought had the potential to adjust and grow over time, I would skip Linq to Sql and reach for my trusty NHibernate.

So really, it would only be used for a very small subset of problems out there that I would try to solve.

All of that being said, I think Linq to Sql is very important to the .NET development community. Microsoft has historically pretended that O/RM tools didn't exist and that doing any development outside of DataSets and repetitive hand-rolled patterns was crazy. Now that Microsoft has a framework to endorse, exposure to these technologies in the .NET development community should greatly expand. I think overall this is a good thing, and it will result in better developers.

My only concern with this introduction is that people may get the idea that O/RM tools are nice and get you up and going fast, but fall flat on their faces once you try to do anything advanced, at which point you need to resort to the same tools you used all along. This was actually a very common opinion among people I talked to about NHibernate a few years ago. They had heard of others using O/RM tools (not NHibernate specifically) and how they just don't handle advanced scenarios; they are only good for simple things.

With Linq to Sql I hope developers become exposed to O/RM and grow curious about other tools such as NHibernate when Linq to Sql is too simple for what they need, instead of grouping all O/RM tools together as too simple and idealistic.

I'm actually excited about the potential of the .NET development community now that more people will be exposed to O/RM. Long live O/RM tools, you have been lifesavers for me!

--John Chapman

More on Extension Method Judgment

Just another tidbit following my last post, Reserving Judgment. I was reading Scott Guthrie's latest post on the ASP.NET MVC framework (ASP.NET MVC Framework (Part 4): Handling Form Edit and Post Scenarios) and noticed that he showed a tool which uses extension methods in much the same way Greg Young was describing. See Scott's use of the ASP.NET MVC HTML Helpers.

What can I say, I actually kind of like how they work in that scenario.

Reserving Judgment

I have been very quick to put down certain pieces of the .NET 3.5 framework (see my prior posts on extension methods and partial methods), most specifically extension methods and partial methods. I'm not quite ready to give in on partial methods yet; they still just seem like a tool to make code generators slightly easier. But I think I'm ready to reserve judgment on extension methods now.

I still think extension methods are dangerous and are going to be subject to blatant misuse. That being said, I think I need to keep an open mind regarding the potential uses for extension methods. Note that I'm referring to the use of extension methods entirely within code that you control.

I recently read a blog post by Greg Young titled A Use For Extension Methods where he stated that there are a lot of people out there who share my viewpoint. He also wanted to show one possible way extension methods could be used: controlling the scope of intellisense when using a fluent interface for a pseudo-builder pattern. I call it pseudo since really there is only one type of available builder.

Basically his extension method looks like the following:


public class Builder
{
}

public class Create
{
    public static Builder New = new Builder();
}

// Extension methods must live in a static class; the class name here is illustrative.
public static class BuilderExtensions
{
    public static CostBuilder Cost(this Builder s)
    {
        return new CostBuilder();
    }
}


So now when you call Create.New you'll only see the builders from the namespaces you referenced in your using directives.
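
In other words, the calling code reads something like this (CostBuilder and its fluent methods come from Greg's example; I'm only sketching the call):

CostBuilder cost = Create.New.Cost();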

Fair enough. Upon seeing this I thought it seemed interesting, but then went into my pessimistic anti-extension-method mode and came up with my usual answer: why not just do this?

public class Create
{
    public static T New<T>() where T : new()
    {
        return new T();
    }
}



But upon reading this, it's pretty lame, isn't it? It's basically a generic factory that doesn't do anything beyond calling new, so why not just call new CostBuilder().DoStuff().DoMoreStuff()?

But Greg's interface is a bit cleaner than what I am offering. It's not that extension methods are really needed here; it's just that they can actually make your code a little cleaner and slightly easier to read (when working on large projects), which is something I didn't expect to say about extension methods.

Maybe I need to wait this one out for a year to see what kinds of things developers come up with while using extension methods.

--John Chapman

Saturday, December 8, 2007

C# Type Inference But Still Strongly Typed

I have spent considerable time today reviewing Linq, and more specifically Linq to Sql. I'm currently working on a blog post where I'll go into the details of what I think the pros and cons of Linq to Sql are, as well as my overall opinion. In case you couldn't guess, I'll be using NHibernate for my comparisons; after all, it is what I'm familiar with.

While reviewing some things I ran into the following compile-time check. It was very simple for me to resolve, but I wonder if it will cause developers to fall into traps, especially those developers who have some experience with weakly typed languages such as JavaScript.

Take a look at the following code I wrote:


AdventureWorksDataContext context =
    new AdventureWorksDataContext();

var orders = from po in context.PurchaseOrderHeaders
             select po;

if (chkUseDate.Checked)
{
    orders = from po in orders
             where po.OrderDate > dtOrderFrom.Value
             select po;
}

orders = from po in orders
         orderby po.OrderDate ascending
         select new
         {
             po.PurchaseOrderID,
             po.RevisionNumber,
             po.OrderDate,
             po.ShipDate
         };


Does anyone see what is wrong with the code above and why it failed to compile?

The compile-time error was:
Cannot implicitly convert type 'System.Linq.IQueryable<AnonymousType#1>' to 'System.Linq.IQueryable<BLL.PurchaseOrderHeader>'. An explicit conversion exists (are you missing a cast?)

After seeing that I immediately realized what I had done: the variable's type had been inferred as a query of PurchaseOrderHeader objects, and I then tried to assign it a query of anonymous-type objects instead. You can't just change a reference to be of another type in C# 3.0; hence the strong typing. I should know better.

But honestly, with the whole var keyword, I wasn't really thinking about it. It was a minor slip-up, but I wonder how many developers will fall into that trap. I think some developers may have seen the var keyword before in JavaScript, and they may use it in the fashion I just did.

That being said, I have been enjoying my time with Linq today. I should have a post up within the next few days with more details.

P.S. If you're wondering what is going on with the three-step Linq queries above, that's how you write dynamic queries in Linq. Simply reference the previously defined query in your new Linq query in order to further restrict the query you are building. Keep in mind that writing a Linq query doesn't perform any operations; you have to either enumerate over the values of the query or call a method like ToArray(), ToDictionary() or ToList() to execute it.

If you're curious how to resolve the issue above, you just need to declare a new variable for the last query to store the new type; var results = <Linq expression> would work just fine.
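
Concretely, the last query from the snippet above just becomes:

var results = from po in orders
              orderby po.OrderDate ascending
              select new
              {
                  po.PurchaseOrderID,
                  po.RevisionNumber,
                  po.OrderDate,
                  po.ShipDate
              };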

--John Chapman

Solution Folders To Group Projects

While reading Scott Guthrie's blog today I stumbled upon one of his links of the day for December 8th, entitled Big Solutions Can Be Organized Using Solution Folders, over at the .NET Tip of the Day web site.

Actually, this is really cool. Am I the only schmuck who didn't realize this was possible? I have used Solution Folders in the past, but it was always to group files which were not part of the actual build process. For example we always had a Libraries folder which contained the third party dlls which need to be referenced to build the project. This way all a developer needs to do to build the project is to download the latest source code, and the solution will automatically download the needed referenced assemblies. Plus any upgrade we do to a third party dll automatically propagates when someone gets latest.

But the ability to place projects within the folders to group them together? This will actually be very handy for us. At work our current solution has 33 projects, and truthfully it is growing! Being able to group the projects which are similar or somehow related will really come in handy for us.

Why didn't I realize this sooner?

--John Chapman

Testing For Collections

Where I work we developed some foundation classes to assist with our unit testing. We make heavy use of NHibernate in our product (we work on one very large application) and wanted a way to ensure that we were not breaking our mapping files over time.

In order to accomplish this goal we developed a mechanism to manage multiple sessions and then compare object graphs between the two sessions. Basically, we call a creation method for the object type being tested, assign random values to all of its properties, and then assign many-to-one associations using the child objects' creation methods.

The creation method has the option of automatically saving the constructed objects to an NHibernate session or not (these creation methods are also used for non NHibernate based tests).

After the object graph has been constructed we flush the session, construct a new session and then load the same parent object via the new session. We then wrote a base method for our tests which compares the two objects to make sure they are identical.

Our comparison method uses reflection on objects which are being compared. We reflect over all properties and ensure they are all equal. When we reach a property which is not a simple type we recursively call our comparison method on that property to ensure all child properties of these objects are compared. To ensure there are no infinite loops we keep a collection of already verified objects to make sure we don't compare the same object twice.

The one catch here is that currently we ignore collections. Collections are far more difficult to test, especially because the majority of ours are Sets. We can't simply check the count and each position of the collection, since every load of the object could produce a new order (Sets are not ordered, after all).

During refactoring we accidentally broke our comparison tests. While reading the code below, note that all collection properties of our business objects are typed as interfaces which are later replaced with the appropriate collection implementation. The original implementer checked whether a property was a collection using the following code:


if (type.ToString().ToUpper().StartsWith("SYSTEM.COLLECTIONS")
    || type.ToString().ToUpper().StartsWith("IESI.COLLECTIONS"))


There are several things I don't like about that code. To be fair, it actually worked for our implementations, but it is extremely dirty.

First, you shouldn't be checking for a type based on a name. Hard coding string based namespaces is unreliable and a bad idea. There is no way to get any sort of compile time checking from this.

Second, I'm typically opposed to ToUpper(); it's almost never what anyone wants, it's just been accepted from historical practice. People either do it because they don't want to think about what it should really be, or because they don't know about using a CaseInsensitiveComparer.

Third, what if we chose to use a custom collection for one of our implementations? In that case it wouldn't have a name which matched either of those conditions.

So obviously I thought this needed to be re-factored. My attempt is shown below:

if (obj is IEnumerable)

Seems really simple, right? My thinking was that every collection interface we use, or have ever known about, mandates that its implementations also implement IEnumerable. Take a look at all of your collection interfaces; no other interface is common between them. ICollection is not enforced by the generic interfaces. It may be enforced by the non-generic interfaces, but not the generic ones, so I determined it wasn't safe to use ICollection.

Well, a co-worker of mine found out that I had actually broken our mapping tests. Worse than that, they were broken in a way where we would receive false positives. This simple change resulted in our mapping tests no longer testing strings: the System.String class is IEnumerable. This makes sense upon further inspection (a string is an enumerable list of characters), but it was not something I first thought of.

So, IEnumerable was bad; we have learned that. We could have checked for IEnumerable but not string, but that seems dirty.

The sad thing here is that checking for ICollection actually works. Every single collection in the .NET 2.0+ framework and the Iesi.Collections package implements the ICollection interface; our problem is that it is not enforced by the interfaces we are using. Does anyone know why IList<T> does not mandate ICollection? It mandates ICollection<T>, and every implementation implements ICollection, so why not? It seems like it would have been beneficial to everyone, and I have yet to find a drawback. After all, it does enforce IEnumerable.

My next attempt turned out not to actually work either. I really thought this was the answer for us, but I didn't have my thinking cap on. My code can be found below. Note that obj in the example is the object which is being checked, and that this is not a direct copy and paste; there are optimizations in our real code which are removed for simplicity here.

Type type = obj.GetType();
Type genericCollectionType = typeof(ICollection<>);

if (obj is ICollection
    || (type.IsGenericType
        && type.GetGenericArguments().Length
            == genericCollectionType.GetGenericArguments().Length
        && genericCollectionType
            .MakeGenericType(type.GetGenericArguments())
            .IsAssignableFrom(type)))


As you can see, I check for ICollection first, and then if the item being tested does not match that interface I fall back on a generic collection test. Note that I compare the number of generic arguments, since attempting to call MakeGenericType with the wrong number of arguments would throw an exception.

This generic check works for collections like List<T> and Iesi.Collections.Generic.HashedSet<T>, but it doesn't work for Dictionary<K,V>. Basically, ICollection<T> has one generic type parameter, but IDictionary<K,V> has two and therefore won't be tested. I would need to determine a way to see if the type is assignable from ICollection<KeyValuePair<K,V>>, since that is how Dictionary<K,V> implements ICollection<>.

I'm not sure of the cleanest way to check for this. Plus, that branch won't actually be hit, since there doesn't exist a collection out there which doesn't implement ICollection.
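
For what it's worth, one way to cover the dictionary case would be to walk the implemented interfaces and compare their generic type definitions against ICollection<>. A sketch of that idea (I have not run this against our test suite):

private static bool IsCollectionType(Type type)
{
    // Non-generic collections (and, in practice, everything in the framework today).
    if (typeof(ICollection).IsAssignableFrom(type))
        return true;

    // The type itself may be ICollection<T>.
    if (type.IsGenericType
        && type.GetGenericTypeDefinition() == typeof(ICollection<>))
        return true;

    // Generic-only collections, including IDictionary<K,V>, which
    // implements ICollection<KeyValuePair<K,V>>.
    foreach (Type itf in type.GetInterfaces())
    {
        if (itf.IsGenericType
            && itf.GetGenericTypeDefinition() == typeof(ICollection<>))
            return true;
    }

    return false;
}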

So basically I'm jumping through a number of hoops which aren't totally needed. I really wish IList<> and IDictionary<> enforced ICollection<>; then I would feel a whole lot more comfortable with this.

Does anyone know why they were left out? Is there a really good reason I am missing?

--John Chapman

Thursday, December 6, 2007

MS Unit Exception Testing

How do people handle exception expectations within the Microsoft unit testing framework built into Visual Studio Team Edition? Of course there is the built-in ExpectedExceptionAttribute, where you mark the test itself with an exception which should be thrown while performing the unit test. If no exception is thrown it fails, if an exception of that type is thrown it passes, and if the exception is of the wrong type it fails.

I don't know about others, but this rarely sits well with me. Typically when I write a test method I want the ability to test other conditions when an exception is thrown, or I want to use a single test method to test all exceptional cases within a single business object method. Do most people just write one test per case of a method? It seems like this would result in a lot of duplication of unit tests.

To address this issue I have historically used the following code:


[TestMethod]
public void TestSomeMethod()
{
    try
    {
        SomeMethod();
        Assert.Fail("Expected ExpectedException to be thrown "
            + "while calling SomeMethod.");
    }
    catch (ExpectedException) { }
}


Recently it occurred to me: why do I have to go through this process every time I want to test for exceptions? At first I was hopeful that an additional mechanism would be made available with the Visual Studio 2008 version of MSTest, but it was not. I then decided to investigate the ability to extend the framework.

Honestly, I feel a little dirty saying this, but this appears to be a good use case for Extension Methods. Yes, I know my previous statements regarding extension methods (C# 3.0 Extension Methods? A Good Idea?), but I honestly think this would be a good use of it.

Imagine writing the following code instead:

[TestMethod]
public void TestSomeMethod()
{
    Assert.ThrowsException<ExpectedException>
        (delegate { SomeMethod(); });
}


Wow, I really would enjoy that functionality if it were available to me. Unfortunately for me, I can not make this an extension method. Extension methods are only allowed on instances of objects, and since Assert is a static class you can not add extension methods to it. So instead of an extension method we are left with using our own custom static class for testing.

See the implementation of this below:

public static class CustomAssert
{
    public delegate void MethodDelegate();

    public static void ThrowsException<T>(MethodDelegate method)
        where T : Exception
    {
        ThrowsException<T>(method,
            "Expected Exception of type "
            + typeof(T).ToString()
            + " was not thrown.");
    }

    public static void ThrowsException<T>(MethodDelegate method, string message)
        where T : Exception
    {
        try
        {
            method.DynamicInvoke();
            Assert.Fail(message);
        }
        catch (T) { }
    }
}


and now the final tests look like:

[TestMethod]
public void TestMethod1()
{
    CustomAssert.ThrowsException<ApplicationException>
        (delegate { SomeMethod(); });
    CustomAssert.ThrowsException<ApplicationException>
        (delegate { SecondMethod(5); });
}

In all honesty, when I first decided this was something I was interested in, I thought extension methods would work for it. While writing the code for this blog post I came to the realization that they wouldn't.

While researching how to make this approach work I did stumble across the xUnit.net framework. It looks like xUnit.net already supports functionality very similar to what I am describing in this post. If you're interested in a unit testing framework with functionality like I described, maybe that is where you should look. If you want to stick with the existing Microsoft MSTest framework, this may be a good approach to better handle your exception testing needs.

--John Chapman

Tuesday, December 4, 2007

DateTime.Now Precision Issues... Enter StopWatch!

I'm a little bit embarrassed by this one. Previously I blogged about DateTime.Now Precision, or the lack thereof. I first discovered this phenomenon while performing the performance tests for my blog post NHibernate Access Performance, which showed the differences between property and field level access within NHibernate.

Basically I found that DateTime.Now was not very precise and worked horribly as a benchmarking tool for relative performance. Well, now I've come to my senses and realized that the nail I was banging on was really a screw, and the hammer wasn't the best tool for the job. I've now come to learn of the screwdriver which answers my benchmarking needs: it turns out the System.Diagnostics.Stopwatch class exists for just the purpose I was trying to use DateTime.Now for. DateTime.Now is simply not accurate enough to be used for diagnostics.

The Stopwatch class is very simple. To use it you construct an instance of the object, call Start(), perform your operations and then call Stop(). The Stopwatch exposes an ElapsedTicks property with the most accurate elapsed time available.
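
In code, the pattern is simply the following (the milliseconds line is just for convenience):

Stopwatch watch = new Stopwatch();
watch.Start();

// ... the operations being measured ...

watch.Stop();
Console.WriteLine(watch.ElapsedTicks);
Console.WriteLine(watch.Elapsed.TotalMilliseconds);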

While I wish I had known about this when I originally recorded the benchmarks, I know about it now, and I always say I enjoy learning new things. With this newfound knowledge of the Stopwatch class I decided to re-run my NHibernate Access Performance tests to see how they looked. The results were dramatically different from what I found with the DateTime mechanism.

First, I re-ran the 100,000 accesses test with two threads running simultaneously. This is meant to replace the original test I ran. The results can be seen below. Note that all times are in milliseconds.


I then decided to run these same tests again but this time in a synchronous manner, meaning that only one test would be running at any given time. These results can be seen below:


In truth, these last numbers are probably the most accurate representation of the true performance implications of using the various NHibernate access strategies.

Of interest, though, is the fact that my earlier tests for property level access were showing values close to 600 ms for the basic property setter, whereas a simple switch to the Stopwatch showed just 62 ms. That is a difference of a full order of magnitude. I was actually surprised by just how much these numbers varied; it turns out that the accuracy of DateTime.Now is even worse than I thought.

--John Chapman

Thursday, November 29, 2007

Where Is System.Collections.Generic.ISet?

.NET Framework 3.5 is released! Among the new enhancements? System.Collections.Generic.HashSet<T> (reference: BCL Team Blog). Finally, proper set semantics in the .NET Framework. Those of you who are familiar with NHibernate and the Iesi collections should be plenty familiar with Iesi.Collections.HashedSet and the other set based collections. Does this mean that NHibernate could finally be implemented with just the framework's collection structures?

It turns out no! Microsoft has chosen not to add an interface which matches this collection's set based operations. There is no matching ISet interface which the new HashSet<T> implements. While some people will say "yeah, but it still uses IEnumerable", I think this is a huge shortcoming of the .NET 3.5 framework. A set is semantically different enough from the other collections that it warranted its own interface to control how someone would interact with it.
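
Just to be clear about what I was hoping for, something along these lines would have been enough (purely hypothetical; nothing like it shipped in 3.5):

// Hypothetical - the rough shape of the interface I wish existed.
public interface ISet<T> : ICollection<T>
{
    new bool Add(T item);                   // returns false instead of adding a duplicate
    void UnionWith(IEnumerable<T> other);
    void IntersectWith(IEnumerable<T> other);
    void ExceptWith(IEnumerable<T> other);
}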

The only thought I could come up with for why this obvious interface was left out is that with just an interface you can not guarantee that the collection behaves like a set. For example, with a set you can not add the same item twice. If a custom collection were to implement said ISet and decided to let the Add method add the same item multiple times, what would stop it?

I suppose a set isn't something you can fully define by an interface. Maybe I should let my emotions cool down prior to jumping all over Microsoft for leaving out this interface. I guess I was just imagining the possibility of using NHibernate without having to add Iesi.Collections to my project. Then again, what would make me think that this would change, or change in a relevant time frame?

All that being said, I still wish they had included the interface! After all, there is an IDictionary which could theoretically be implemented in such a way that the custom collection could add the same key multiple times. If the dictionary semantics can't be guaranteed by the already existing interface, then I say it would have been okay for such an interface to exist for the set.

--John Chapman

Tuesday, November 27, 2007

VSTS 2008 Web Test Forms Authentication

With the release of Visual Studio 2008 I was eager to test the Web Testing functionality to see if my issues from 2005 had been resolved. If you remember, Visual Studio 2005 did not support forms authentication with its web test functionality (see original post: Visual Studio 2005 Team Edition Web Test and the follow up VSTS Web Test Error Found).

Seeing as there is a hotfix for Visual Studio 2005 to resolve this issue, I figured the issue would be resolved in 2008. I figured this was finally my chance to make heavy use of the Web Test functionality. I opened up Visual Studio 2008 Team Suite and gave a simple web test a shot. No go! I received the same exact errors I received with 2005.

How is this possible? How has such a blatant issue existed for over 2 years now? I know Microsoft is aware of it (based on the existence of the Knowledge Base article) so why hasn't someone resolved it for the release of 2008?

I'm very disappointed by this. So at this point I'm stuck looking for a work around. I'm still looking for suggestions if anyone has them.

Note that this is really a cookie problem and not a forms authentication issue. The cookie is sent to the client but never returned to the server. When I find a suitable workaround I'll post it here.

--John Chapman

Monday, November 19, 2007

VS 2008 Professional Download Horrors

This one is really strange to me. Today was the big release to manufacturing of Visual Studio 2008. I decided to try to download the VS 2008 Professional edition (I actually downloaded the Team Suite trial first), only to run into the fun of the Akamai Download Manager.

This thing caused me so many issues! I wonder if this is the first time Microsoft decided to use this product. I was trying to download it using IE 7 on Windows Vista, only to notice that the page just kept refreshing. OK, so eventually I realized that the pop-up blocker must be blocking the download but not telling me. I disabled the pop-up blocker, saw the Akamai pop-up, and then it requested that I install an ActiveX component. OK, not my favorite thing in the world, but for VS 2008, you bet!

The download manager launched and asked me where I would like to place the file. I said "Download" please; it said "Sorry, you don't have access to that folder." I don't? That is strange. I tried my E: drive instead; same story. I then tried my Documents folder, and it looked like it worked, except I got an error message stating that it could not access my Documents folder and would instead place the file in Temporary Internet Files.

3.3 gigs later the download was complete, only for me to discover that the file was invalid. I hunted it down in Temporary Internet Files and it couldn't be copied or opened; the OS said the file didn't exist!

Ok, this was frustrating. After a few more minutes I took the action I should have taken earlier: I opened up Firefox. I went to the same pages I went to before, and it just worked. The download manager opened (as a Java applet this time), I selected my download location, and it just worked. Why was IE such a fiasco?

You would have thought Microsoft would have tested this on Vista with IE, wouldn't you? Thank you Firefox, I couldn't have downloaded Microsoft's product without you! Who would have thought?

--John Chapman

NHibernate Access Performance Revisited

Two days ago I posted a blog titled NHibernate Access Performance. After making that post the results kept bothering me. First, the results did not seem accurate. I could not for the life of me figure out how the CodeDom optimizer performed so well when accessing private fields. I reviewed the NHibernate code over and over and could not determine how it possibly performed better than the basic getter/setter. Add to that my discovery from yesterday regarding the DateTime.Now Precision and I had to re-run these tests.

I learned that my instincts were dead on. The CodeDom optimizer for private field access isn't any faster than the basic field-level access. It turns out I had a bug in my code: it was actually using the CodeDom property-level access instead. I know, I know, I should be ashamed; I shouldn't write bugs! Truth be told, this was way too simple a mistake, and I should have caught it earlier.

Now, due to the DateTime.Now precision issue, I decided to run my tests for 10,000,000 accesses. I figured at that point any precision issues should be insignificant. See the updated chart:


What's really interesting about these results is that the private field access now takes twice as long as the public property access when using basic reflection. This is more along the lines of what I would have expected. I do not know why 10,000,000 loops were enough to show this difference when 100,000 loops weren't. I wonder if there is some hit taken on the first access of a property which is not present for a field. If I find out more about this I will write a new post.
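
If I do dig in, a sketch along these lines is how I would try to isolate that first-access cost. It is not the test harness from the original post (it just borrows the SampleObject class defined there), and it uses Stopwatch, which sidesteps the DateTime.Now precision problem entirely:

using System;
using System.Diagnostics;
using System.Reflection;

static class FirstAccessProbe
{
    static void Main()
    {
        SampleObject obj = new SampleObject();
        PropertyInfo prop = typeof(SampleObject).GetProperty("Val");

        // Time the very first reflective read separately from the steady-state reads.
        Stopwatch first = Stopwatch.StartNew();
        prop.GetValue(obj, null);
        first.Stop();

        Stopwatch rest = Stopwatch.StartNew();
        for (int i = 0; i < 10000000; i++)
        {
            prop.GetValue(obj, null);
        }
        rest.Stop();

        Console.WriteLine("First access: {0} ticks, next 10,000,000: {1} ms",
            first.ElapsedTicks, rest.ElapsedMilliseconds);
    }
}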

--John Chapman

Sunday, November 18, 2007

DateTime.Now Precision

While re-evaluating the numbers from yesterday's post NHibernate Access Performance, I thought the returned times in milliseconds seemed a bit strange, specifically the repeats of 13.67 and 15.62 milliseconds. What are the odds of seeing the exact same values twice while running the tests I was running? I started to wonder how precise the DateTime.Now (or DateTime.UtcNow) values really are. I had always assumed they would be updated every time the underlying tick count incremented. It doesn't look like that is the case.

For fun, try running a console application where you write the current time in ticks to the screen with two statements in a row. The values are exactly the same. At least they were for me.
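
The check is as trivial as it sounds; something like this is all it takes:

// Two back-to-back reads of the clock; both lines printed the same tick count for me.
Console.WriteLine(DateTime.UtcNow.Ticks);
Console.WriteLine(DateTime.UtcNow.Ticks);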

Now for more fun try the following code:


DateTime now = DateTime.UtcNow;
// Spin until the reported time actually changes...
while (now == DateTime.UtcNow) { }
// ...then print how large the jump was.
Console.WriteLine((DateTime.UtcNow - now).TotalMilliseconds);


When I first wrote this code and tried it, it returned 15.624 milliseconds every single time (a common value I saw while running yesterday's performance tests). However, now when I run it I see 0.9765 milliseconds every time. Something is controlling the precision of DateTime.Now and I have no idea what.

I would have expected something like this to have been publicized more; I've never seen an article explaining the precision of DateTime.Now. The lack of precision seems like it could be an issue for systems which perform many transactions per second, where it would be helpful to be able to guarantee ordering based on time. Unfortunately, that doesn't seem to be something you can count on.

As a result of these findings I think I'm going to re-run yesterday's tests comparing private field access and public property access in NHibernate. When run for longer periods of time the results are actually a little bit different than what we saw before. Not different enough to change my conclusions, but different enough to be interesting nonetheless.

I've also been trying to hunt down how the reflection optimizer actually helps with private field access. From the code I see in NHibernate it looks like the performance should be identical to my non-optimized getter/setter for fields. Look for more information on that topic if I find it.

Saturday, November 17, 2007

NHibernate Access Performance

*1/5/2008 Update - Source code is now available for download if you would like to test these findings yourself. Download Here*

*11/19/2007 Update - It turns out my instincts were correct. Upon further review of the code, the CodeDom field getter/setter was actually using the CodeDom property getter/setter. I had a very hard time understanding how the CodeDom reflection optimizer improved private field-level access; it turns out that it did not. An updated chart has been posted at the bottom of this article.*

Recently I was involved in a discussion on the NHibernate forums regarding how to implement the null object pattern, which later moved to a discussion of the performance impact of such a pattern. I have been told many things regarding the performance impact of reflection, and more specifically of reflection access to a property versus a field, but I had never actually researched it myself. I finally took the time to examine the impact of NHibernate's access mechanisms more closely, and the results really surprised me!

I've always been told that accessing private members via reflection is significantly slower than accessing public members (due to Code Access Security checks). Because of this I used to prefer accessing properties instead of private fields. However, after noticing that even in a very large application the reflection cost of field access versus property access wasn't noticeable, I shifted gears and now believe that any value which should not be settable from code should rely on NHibernate's nosetter access mechanism. Basically, why expose a setter for your object's Id property when it is an identity column that only NHibernate should ever populate? This makes the code safer and helps ensure that someone who is new to NHibernate does not try to set that id value due to a misunderstanding of how this O/RM paradigm works.
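
As a concrete illustration (a hypothetical entity, not code from my test project), this is the shape of class that the nosetter style encourages: NHibernate writes the private field directly, while application code only ever sees a read-only property.

public class Customer
{
    // NHibernate hydrates this field directly (e.g. access="nosetter.camelcase" in the mapping).
    private int id;

    // Application code can read the identifier but never assign it.
    public int Id
    {
        get { return id; }
    }
}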

Test Overview

I figured it was about time to do some actual tests showing the impact of public property reflection versus private field reflection. I decided to write a small .NET console application containing a simple type with a single private field and a single public property wrapping that field. See the class below:


public class SampleObject
{
    private int val;

    public int Val
    {
        get
        {
            return val;
        }
        set
        {
            val = value;
        }
    }
}


I then chose the mechanism I would use to measure relative performance. I decided to use threading so that the property access and field access tests could run at exactly the same time; I did not want my results skewed by unknown differences in the system environment while the tests were running. If both tests run simultaneously, both accessors are subject to exactly the same system constraints. I start two threads and then immediately call Join on both of them from the main application thread, which lets me evaluate the differences afterwards. Note that I calculate the processing time within each thread to keep the timing as accurate as possible.

Now that I knew how I would compare field access to property access, I had to determine the metrics I wanted to measure. NHibernate 1.2 provides a few options for how it accesses or sets the values of fields and properties. The tool always uses reflection, but there are actually multiple ways it can use reflection.

1) Basic Getter/Setter
NHibernate's first and most basic mechanism for accessing/setting property values is via the NHibernate.Property.BasicGetter and NHibernate.Property.BasicSetter classes. Basically these classes wrap up a System.Reflection.PropertyInfo instance and then use that PropertyInfo's GetValue and SetValue methods to get or set the value via reflection. This is probably the mechanism most developers have used to get/set values via reflection (if they have used reflection).
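
In plain terms that boils down to the familiar pattern below. This is just a sketch against the SampleObject class from this post (with a using System.Reflection; directive assumed), not NHibernate's internal code:

PropertyInfo property = typeof(SampleObject).GetProperty("Val");
SampleObject obj = new SampleObject();

property.SetValue(obj, 42, null);               // write via reflection
object current = property.GetValue(obj, null);  // read via reflection (boxes the int)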

2) Field Getter/Setter

NHibernate offers NHibernate.Property.FieldGetter and NHibernate.Property.FieldSetter to provide the basic getter/setter functionality to fields as well as properties. This works the same as above but uses the System.Reflection.FieldInfo class instead of the PropertyInfo class.
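
The field-based flavor looks nearly identical, except that a private field has to be requested with the right BindingFlags (again an illustrative sketch, not NHibernate internals):

FieldInfo field = typeof(SampleObject).GetField("val",
    BindingFlags.Instance | BindingFlags.NonPublic);
SampleObject obj = new SampleObject();

field.SetValue(obj, 42);               // write via reflection
object current = field.GetValue(obj);  // read via reflection (boxes the int)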

3) CodeDom Reflection Optimizer

Due to the performance impact of using reflection to get and set values, NHibernate introduced the concept of "reflection optimizers". Basically, NHibernate builds a custom class that accesses the field/property value of your object directly, and then invokes that newly created accessor through a delegate, allowing NHibernate to bypass reflection on each access.

The CodeDom mechanism of reflection optimization exists to support .NET 1.1. NHibernate writes its own C# source code, in code, that wraps up your property/field access, and then runs this dynamically generated source through the built-in System.CodeDom.Compiler.CodeDomProvider class (more specifically the Microsoft.CSharp.CSharpCodeProvider class), which provides a C# compiler and compiles the dynamically written code. After compiling, NHibernate uses reflection to create an instance of the new class and then uses that instance for its access.
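
Conceptually, the generated source looks something like the sketch below. This is a hypothetical illustration of the idea, not the code NHibernate actually emits:

// Once this class is compiled and instantiated, every access is a plain,
// strongly typed member access rather than a reflective call.
public class GeneratedSampleObjectAccessor
{
    public object GetVal(object target)
    {
        return ((SampleObject)target).Val;
    }

    public void SetVal(object target, object value)
    {
        ((SampleObject)target).Val = (int)value;
    }
}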

4) Lightweight Reflection Optimizer

This technique of reflection optimization is very similar to the CodeDom optimization above, except that it does not require a C# compiler. I believe it is called lightweight because it avoids the overhead of the compiler; instead it relies on the System.Reflection.Emit.ILGenerator class to dynamically build the accessor class, essentially skipping the compiler while producing the same kind of output.
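
To give a feel for what building an accessor with ILGenerator looks like, below is a minimal sketch that emits a getter for SampleObject.Val using a DynamicMethod. This is purely an illustration of the technique, not NHibernate's implementation, and the GetterDelegate type is something I am declaring just for the example (requires using System.Reflection; and using System.Reflection.Emit;):

// Assumed to be declared at class scope for this example:
// public delegate object GetterDelegate(object target);

// Emit: object GetVal(object target) { return ((SampleObject)target).Val; }
DynamicMethod method = new DynamicMethod("GetVal", typeof(object),
    new Type[] { typeof(object) }, typeof(SampleObject), true);
ILGenerator il = method.GetILGenerator();
il.Emit(OpCodes.Ldarg_0);                          // load the target argument
il.Emit(OpCodes.Castclass, typeof(SampleObject));  // cast it to SampleObject
il.Emit(OpCodes.Callvirt, typeof(SampleObject).GetProperty("Val").GetGetMethod());
il.Emit(OpCodes.Box, typeof(int));                 // box the int return value
il.Emit(OpCodes.Ret);

// Create the delegate once; every subsequent call is a direct invocation, no reflection.
GetterDelegate getter = (GetterDelegate)method.CreateDelegate(typeof(GetterDelegate));
object value = getter(new SampleObject());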

Testing Code

For all tests I used the NHibernate types IGetter, ISetter and IReflectionOptimizer to ensure that my code followed the exact same code path that users of NHibernate can expect. With the testing technique and the access strategies settled, all that remained was the test code itself.

For my testing loops I used the following code:

public void TestGet()
{
    object value;
    DateTime begin = DateTime.Now;
    for (int i = 0; i < NUM_LOOPS; i++)
    {
        value = getter.Get(obj);
    }
    Time = DateTime.Now - begin;
}

public void TestSet()
{
    DateTime begin = DateTime.Now;
    for (int i = 0; i < NUM_LOOPS; i++)
    {
        // ISetter.Set takes the target object and the value to assign.
        setter.Set(obj, i);
    }
    Time = DateTime.Now - begin;
}

And then to run the actual tests I needed the following code:

Thread propertyThread = new Thread(propertyContainer.TestGet);
Thread fieldThread = new Thread(fieldContainer.TestGet);

propertyThread.Start();
fieldThread.Start();

propertyThread.Join();
fieldThread.Join();

The code above is repeated for each case, passing in the appropriate getter or setter, and after each run the results are written to the console window for me to analyze.

Test Results

Now for the good stuff! How did the results turn out? For reference, I ran this test on a slightly outdated computer (Athlon XP 3700+, 2 GB RAM, Vista Ultimate) with a release build, and the results were not at all what I expected. For each scenario I ran the methods through 100,000 loops, so the times shown are the time it takes (in milliseconds) to get or set a field/property value 100,000 times; each property/field pair was run on two simultaneous threads. See the results below:

                        Property (msecs)    Field (msecs)
Basic Getter                  524.38            391.58
Basic Setter                  671.83            505.83
Lightweight Getter             13.67             27.34
Lightweight Setter             29.30             15.62
CodeDom Getter                 15.62             25.39
CodeDom Setter                 35.15             13.67

NHibernate Access Performance

Now, the first thing that jumps out at you is that the reflection optimizer strategies are significantly faster than the non-optimized techniques. Well, duh, it has "optimize" in the name, right? That was to be expected. I just didn't think it would make this much of an impact; I expected something more along the lines of 3 times faster, not 38 times faster! This part was really encouraging.

The part that threw me for a loop was that, when using basic reflection, the private field access was faster than the public property access. I never would have guessed that. Haven't we all been hearing for years about how slow private field access is?

Given this, I can't see why anyone would avoid field-level access. With this sort of performance it seems to provide the cleanest mechanism for hydrating your objects and then analyzing and saving the changes. From now on there is no way I'll put a setter on a property where it does not make sense from a public API perspective. I feel like the results of these tests lifted some chains off my shoulders. (OK, very light chains made of plastic, but chains nonetheless!)

Is anyone else surprised by these results? I find them encouraging, but if I was making predictions before running the tests, this is not what I would have expected.

If anyone would like the full source code (it's about 210 lines of code total) leave me a comment and I will e-mail it to you.

--John Chapman

*11/19/2007 Update - Below is an updated chart that includes the correct data. It turns out that a bug in the code caused the CodeDom field getter/setter to use the property getter/setter instead. The results now make a lot more sense. I'm sorry for any confusion.*

Note that the tests had to be re-run, so the numbers came out slightly different this time. Also note that the items had to be abbreviated (LW = Lightweight and CD = CodeDom). Take note that the CodeDom optimizer does nothing for field-level access, which is what I originally expected (after reviewing the code I saw that, for anything other than the basic getters/setters, it just calls the provided IGetter/ISetter).

With this run I also added a direct property access getter and setter as a baseline. This helps show the true cost of using reflection (or NHibernate's reflection optimizers). Note that no direct value is provided for the field, since direct access to a private field from outside the class is not possible.

*For updated results with 10,000,000 loops see the follow-up post: NHibernate Access Performance Revisited.*

--John Chapman

Saturday, November 10, 2007

ASP.NET Page.IsPostBack: Who Started It?

I've seen it in many applications. I've even been guilty of it in the past. But how did it get started? I'm talking about the applications that look like the following:


protected void Page_Load(object sender, EventArgs e)
{
    if (!Page.IsPostBack)
    {

    }
}

Why in the world do we do this? Developers that check if (Page.IsValid) are just as guilty. If you haven't figured it out I'm talking about using the Page property of the ASP.NET Page class.

If you're not familiar with the ASP.NET framework, System.Web.UI.Control is the cornerstone of every visual element in ASP.NET; all visual elements inherit from that base class, including the ASP.NET page itself. The Control class has a Page property which references the System.Web.UI.Page instance that contains the control. In the case of the Page object itself, the Page property simply returns the page.

Whenever we use Page.IsValid or Page.IsPostBack in our page we are effectively saying "give me the page of this page and then ask whether it is a postback," when we could just use IsValid or IsPostBack and ask the question of the page directly.
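
In other words, the snippet at the top of this post could just as easily be written like this:

protected void Page_Load(object sender, EventArgs e)
{
    // IsPostBack is a member of System.Web.UI.Page, which this code-behind inherits from,
    // so the Page prefix adds nothing.
    if (!IsPostBack)
    {

    }
}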

This is not something I've seen only once or twice; I've seen it on almost every ASP.NET project I've worked on, from every developer I have ever worked with. But why would so many people choose Page.IsPostBack over plain IsPostBack? I could understand one subset of developers doing this, but it seems almost universal.

Out of curiosity I broke out Professional ASP.NET, the oldest ASP.NET reference book I had available (published June 2001), and took a look at the IsPostBack section. Sure enough, everywhere they gave sample code it was Page.IsPostBack! Are these the guys to blame for this? Does anyone have a better reference that shows this behavior? I don't have access to the old MSDN documentation from the 1.0 days.

Next time we come across Page.IsPostBack, why not change it to Page.Page.IsPostBack, or even Page.Page.Page.IsPostBack? If that's too confusing maybe we could try this.Page.Page.IsPostBack. There are endless possibilities here.

How did this happen? Am I off base here? Is this not as widespread as I seem to think it is? Is it just bad luck in the projects I've worked on? Have people noticed this elsewhere as well? Please tell me it's not just me!

--John Chapman

Tuesday, November 6, 2007

VSTS Web Test Error Found

I previously mentioned that I was intrigued by the Visual Studio 2005 Web Test feature but was having issues running my tests against an application which uses forms authentication (see the prior post: Visual Studio 2005 Team Edition Web Test).

Well, it turns out I found my answer, and I don't like it. I found a Knowledge Base article on MSDN which describes my issue (http://support.microsoft.com/kb/936005). It turns out that forms authentication doesn't work at all! This is after seeing other places that seemed to indicate it would work flawlessly.

How in the world does this make it out the door? They say they have a hotfix for it (which you have to go through holy hell to actually get, by the way), but why is it a hotfix and not something that was fixed by SP1? People, Visual Studio 2005 has been out for over 2 years, and this basic piece of functionality still doesn't work? How is that acceptable?

The quote in the "Cause" section of the knowledge base is priceless.


This problem occurs because the cookies in Visual Studio 2005 Team System use the Request for Comments (RFC) specification. However, some Web browsers do not use the RFC specification. For example, Internet Explorer does not use the RFC specification.


I would have expected Internet Explorer to have been the one browser Microsoft actually tested with. How would Microsoft put out a tool that doesn't properly support its own browser?

I'm not sure if I'll just wait for Visual Studio 2008 to see if that resolves my issues (it is due for release this month after all) or if I should look for a way to run my tests without using Forms Authentication.

--John Chapman

Blogger Syntax Highlighter