Friday, January 30, 2009

Further Reading on IoC Containers

Eric Nelson was kind enough to invite me to contribute to the MSDN Flash newsletter. It should be published next Wednesday. The article was limited to 500 words, only enough for a very brief introduction to IoC containers, so for those who want to explore this exciting topic further, here are some links:

Oren Eini's (AKA Ayende) MSDN article: Inversion of Control and Dependency Injection: Working with Windsor Container:

http://msdn.microsoft.com/en-us/library/aa973811.aspx

Hamilton Verissimo's Code Project article: Introducing Castle - Part 1:

http://www.codeproject.com/KB/architecture/introducingcastle.aspx

The first chapter of Dhanji R. Prasanna's new book on Dependency Injection:

http://www.manning.com/prasanna/prasanna_meapch1.pdf

 

Some popular IoC containers for .NET:

Castle Windsor. This one is probably the most popular, though I'm biased since it's the one I use. It was originally written by Hamilton Verissimo (who now works on the MEF team) and is maintained by Ayende.

StructureMap. Also very highly regarded. Its author, Jeremy Miller, has an excellent blog.

Spring.NET. A port of the popular Java Spring container.

Ninject and Autofac are both interesting newcomers.

Unity is from the Microsoft Patterns and Practices group.

Glenn Block will kill me, but MEF is also an IoC container, although one that's specifically targeted at developing plug-in application architectures.

Thursday, January 22, 2009

Tonight’s talk: Implementing the Repository Pattern

I’m going to be talking at this evening’s Open Source .NET Exchange at Skills Matter. It’s just around the corner from Farringdon station and is ably organised by Gojko Adzic. I think there are still some places left. It’s free, but you need to register in advance. It should be a great evening with some excellent speakers. I’ll be there too :)

You can download the slides for my talk here:

http://static.mikehadlow.com/Implementing the Repository Pattern.pptx

See you there!

Tuesday, January 20, 2009

Capturing and playing back web service calls with Wireshark and SoapUI

I'm working on a web service that processes requests from an existing application. I don't have any access to the client except via its UI. Before the service is called you have to fill in several pages of forms; very tedious from a testing point of view. What I wanted to do was capture the request and be able to play it back to my web service.

To capture the request I use Wireshark. This is an excellent tool for capturing network traffic. The detail is awesome: you can see the entire protocol stack for each packet (Hardware, Ethernet, IP, TCP, HTTP) and drill down into the details of each. Not only that, but it knows how to reassemble multi-packet messages, so you can see not just each individual packet that makes up an HTTP request or response but also the complete message.

All I had to do was install Wireshark on the web service's server and set the filter to look for any request for my web service. The service asmx file is called MyService.asmx, so the filter looks like this:

http.request.uri contains "MyService.asmx"

Then I can run through the forms on the client application, hit submit and my message appears in the Wireshark UI.

[Screenshot: Wireshark]

When I click on the packet in the top frame (the blurred blue line, no you can't have my IP address), the entire protocol stack is displayed. You can see Ethernet II, Internet Protocol, Transmission Control Protocol, Hypertext Transfer Protocol and even the SOAP XML. I imagine this tool would be a fantastic way of teaching the fundamentals of networking. I can right click on the 'eXtensible Markup Language' line and copy the entire SOAP envelope to the clipboard.

OK, so I've captured the SOAP request, but how do I play it back? Enter SoapUI. I was doing a lot of web service work a few years back and made extensive use of this tool. I even wrote about it. In fact I was so impressed that I had a go at writing something similar as a custom Visual Studio project type, WsdlWorks. I just about got it working, but VS integration is a bitch and pretty much sapped all my enthusiasm for the project. Now it just languishes on CodePlex.

But that's enough about me. Back to playing back our captured request. We create a new SoapUI project, give it the WSDL URL of the web service and simply paste in the SOAP envelope that we captured with Wireshark:

[Screenshot: soapUI]

Then it's just a question of hitting the play button, firing the SOAP request at the web service and viewing the response. Easy.

Of course it's a simple matter of changing the end point that SoapUI fires the request to, which makes it easy to test the service running in debug mode on my local machine. We can run a series of regression tests on the client application, capture the requests and use SoapUI to play them all back at any time. Great for testing the service.

Wireshark can equally be used if you are developing a client and have no access to the remote service. You can filter on practically any value of the HTTP header to capture all the requests to a particular host for example.

Saturday, January 17, 2009

Eric Evans on Repositories

Domain Driven Design by Eric Evans is the bible of that school of software development. It’s one of the most influential books in the realm of enterprise application architecture. Evans brings a real clarity of purpose to both the analysis and implementation of business software. I read it back in 2004 when it was first published and, along with Martin Fowler’s ‘Patterns of Enterprise Application Architecture’, it has probably had the most influence on the way I think about building business systems. I’m not alone; you really have to read it if you want to be taken seriously as an application architect.

There’s been a bit of a backlash recently against a tendency in DDD circles to treat ‘the blue book’ almost too literally as a bible, but that shouldn’t detract from what is a fantastic piece of work.

The term ‘repository’ as a way of encapsulating object persistence is well defined in the book and Evans’s definition is often referred to when discussing the repository pattern. I thought it was worth re-reading the chapter on Repositories and summarising it here so that I have a baseline for any further discussions.

Part II of the book describes a way of modelling user domains using object oriented programming. The technique is to describe your model in terms of entities and value types grouped together in aggregates. In simple terms an aggregate is an object graph that has a lifecycle determined by the root entity. For example, an aggregate might have a root of customer with related orders, order-lines, address etc. An order does not have an existence separate from a customer. If the customer was deleted you would expect the rest of the graph (orders, order-lines etc.) to be deleted as well. A product on the other hand, while it has a relationship with an order-line, also has a life cycle independent of that order-line.

Repositories are responsible for persisting entities and value types. They are described in their own section in chapter 6 and are said to have the following advantages:

  • “They present clients with a simple model for obtaining persistent objects and managing their life cycle.”
  • “They decouple application and domain design from persistence technology, multiple database strategies, or even multiple data sources.”
  • “They communicate design decisions about object access.”
  • “They allow easy substitution of a dummy implementation, for use in testing (typically using an in-memory collection)”

Thus the core purpose of the repository is to encapsulate persistence. The client should appear to be simply using an entity collection and all the details of object relational mapping and specific data access APIs should be hidden behind that collection-like interface. Repositories should only be provided for aggregate roots:

“For each type of object that needs global access, create an object that can provide the illusion of an in-memory collection of all objects of that type. Set up access through a well-known global interface. Provide methods to add and remove objects, which will encapsulate the actual insertion or removal of data in the data store. Provide methods that select objects based on some criteria and return fully instantiated objects or collections of objects whose attribute values meet the criteria, thereby encapsulating the actual storage and query technology. Provide repositories only for aggregate roots that actually need direct access. Keep the client focused on the model, delegating all object storage and access to the Repositories.”

Transactions should not be a concern of the repository. He suggests that the client should handle them: “Leave transaction control to the client”. Interestingly, Evans does not mention the Unit of Work pattern in the repository discussion although it’s implied in the section on transactions.

Entity creation should not be the concern of the repository. Keep the concept of ‘factory’ and ‘repository’ distinct, although, in theory, a repository might use a factory internally.

In chapter 9, Evans describes ‘specifications’ as a way of encapsulating queries as part of the domain model. Much of the detail is concerned with providing a single interface for both in-memory (Java in his case) and repository level (SQL) querying. The core point is that specification definition is a domain concern and is best decoupled from the repository, although earlier he does say that in simple situations it might make sense to have methods on the repository to encapsulate queries.

There is also an excellent discussion of the compromises that may have to be made to co-ordinate the object and relational schemas.

So the core message is that repositories are a collection-like interface that encapsulates persistence, and that queries should be encapsulated in the domain as specifications.
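To make that summary concrete, here is a minimal C# sketch of how I read it. Nothing here comes from the book itself: Customer, ISpecification<T>, ICustomerRepository and InMemoryCustomerRepository are hypothetical names of my own, and the in-memory class is only the kind of dummy implementation the fourth bullet above suggests for testing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical aggregate root, invented for this sketch.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool IsActive { get; set; }
}

// A specification is a domain-level predicate, kept separate from the repository.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

public class ActiveCustomerSpecification : ISpecification<Customer>
{
    public bool IsSatisfiedBy(Customer candidate)
    {
        return candidate.IsActive;
    }
}

// Collection-like interface: add, remove and select by criteria.
// No transaction control and no factory methods; those stay with the client.
public interface ICustomerRepository
{
    void Add(Customer customer);
    void Remove(Customer customer);
    IList<Customer> Matching(ISpecification<Customer> specification);
}

// Dummy implementation backed by an in-memory list, for use in tests.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<Customer> customers = new List<Customer>();

    public void Add(Customer customer) { customers.Add(customer); }

    public void Remove(Customer customer) { customers.Remove(customer); }

    public IList<Customer> Matching(ISpecification<Customer> specification)
    {
        return customers.Where(specification.IsSatisfiedBy).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        ICustomerRepository repository = new InMemoryCustomerRepository();
        repository.Add(new Customer { Id = 1, Name = "Alice", IsActive = true });
        repository.Add(new Customer { Id = 2, Name = "Bob", IsActive = false });

        // Only Alice satisfies the specification.
        foreach (var customer in repository.Matching(new ActiveCustomerSpecification()))
        {
            Console.WriteLine(customer.Name);
        }
    }
}
```

A database-backed implementation would translate the specification into a query rather than filtering a list, which is where most of the detail in Evans's chapter 9 lies; this sketch makes no attempt at that.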

Re-reading the section on repositories and thinking about my own use of the term ‘repository’ in software I’ve been building recently tells me that I’m mostly in agreement with Evans. I think I’ve neglected to enforce the aggregate root rule; my repositories will persist any entity in my domain. In practice I don’t have well defined aggregates in my domain and that’s something I should improve. I’ve also allowed unit of work concerns to leak into my repositories. That’s something I’m keen to correct. As for the debate about exposing IQueryable<T>, Evans doesn’t have much to say. Obviously, Java doesn’t have anything similar to LINQ so it wouldn’t be an option in any case, but the emphasis on treating the repository as a domain collection does fit quite nicely with the pattern of having specifications implemented as extension methods of a repository that implements IQueryable<T>.

Wednesday, January 14, 2009

The best explanation of lazy loading ever!

I love this post by Dylan Beattie, The Story of the Lazy-Loading Lunchbox, and that's not just because he name checks me (honest :).

Tuesday, January 13, 2009

In search of Wild Repository

Today I'm going to hack deep into the open source jungle to search for examples of wild repository. We'll be able to see the way that this species mutated into many divergent forms, and maybe learn some lessons about growing our own domestic repository on the way.

There's a lot of discussion about what a repository should be. I'm just going to be looking at generic repositories in this post, but it's worth noting that many people have the opinion that such a thing should not be blessed with the name repository, saying that it is merely a generic DAO. I'll leave these semantic arguments for another day.

As I was finishing this post, I came upon DDD Repositories in the wild: Colin Jack by Tobin Harris. It just goes to show that I don't have a single original idea :P

Rhino Commons

I first heard tell of the generic repository in this excellent article by Ayende (AKA Oren Eini) on Inversion of Control containers. So it's only fair that I start with Ayende's own IRepository<T> from Rhino.Commons. It's worth noting that this particular example is now extinct; Ayende now believes that you should tailor a specific IRepository<T> per project.

[Image: Rhino.Commons repository]

Wow, it's huge! This kind of gigantism can occur in any class if left untended by the SRP. Ayende is heroically scathing of his own creation:

"To take the IRepository<T> example, it currently have over 50 methods. If that isn't a violation of SRP, I don't know what is. Hell, you can even execute a stored procedure using the IRepository<T> infrastructure. That is too much."

It's also worth noting that this repository exposes NHibernate types such as DetachedCriteria and ICriterion. You couldn't use it with any other ORM. I also dislike the paging and ordering concerns that have leaked into the FindAll methods.

A last point worth noting is that all the methods that return collections return an ICollection<T>.

Sharp Architecture

Next we encounter the Sharp Architecture repository:

[Image: Sharp Architecture repository]

A nice small repository with close to the minimum number of methods you could get away with. Billy McCafferty has had to work hard to keep it this way, coming under some pressure to let it bloat. It's somewhat limited in the kind of filtering you can do with the FindAll and FindOne methods as they are limited to filtering on property values. Sharp Architecture is also based around NHibernate, but no NHibernate types have been allowed to find their way into the repository.

Notice that IRepository is a specialisation of IRepositoryWithTypeId. This is a useful generalisation for situations where your primary keys are types other than int.

The collection type of choice here is List<T>.

Fluent NHibernate

Wandering deeper into the forest we run headfirst into Fluent NHibernate. They provide another pleasantly small repository implementation:

[Image: Fluent NHibernate repository]

This one is interesting because it's the first time we've seen any use of System.Linq in a repository. The Query method and both the FindBy overloads take a LINQ expression. Looking at their NHibernate implementation one can see that this is simply passed through to the NHibernate.Linq provider. The collection type used is a simple array, so although a LINQ provider is used to resolve the collection from the given expression, they evidently want to make sure that the expression has been executed and the final collection created before it leaves the repository.

Primary keys are expected to be long values, which is useful when you've got more than two billion records :)

Suteki Shop

Last, and most definitely least, I'd like to show you my own domesticated repository from Suteki Shop:

[Image: Suteki Shop repository]

The most controversial aspect is that I return IQueryable<T> from my GetAll method. You can read my attempt at justifying this here. Another point of difference is that I surface the underlying unit of work. Nothing is persisted to the database until the client calls SubmitChanges. Most other repositories hide this behind a simpler 'Save' or 'SaveOrUpdate' method. I don't really have a strong opinion about this, so I could probably be persuaded that the simpler approach is best.

So?

So, leaving the Rhino Commons monster aside, the main difference between the other three repositories is the way the find or query methods are structured. LINQ is the battleground: Do you keep well away, like Sharp Architecture? Do you leverage expressions, but make sure the collection is loaded by the time it leaves the repository? Or do you run with scissors and return IQueryable<T>?
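The three options can be put side by side in a small sketch. To be clear, these are not the real interfaces from Sharp Architecture, Fluent NHibernate or Suteki Shop: every type name below is hypothetical, and one in-memory class implements all three styles purely so that the example runs without a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Style 1: no LINQ in the interface; filtering on property values only.
public interface IPropertyFilterRepository<T>
{
    List<T> FindAll(IDictionary<string, object> propertyValues);
}

// Style 2: accept a LINQ expression, but return an already loaded collection.
public interface IExpressionRepository<T>
{
    T[] Query(Expression<Func<T, bool>> where);
}

// Style 3: return IQueryable<T> and let the caller decide when it executes.
public interface IQueryableRepository<T>
{
    IQueryable<T> GetAll();
}

// In-memory stand-in for an ORM-backed repository (an assumption of this sketch).
public class InMemoryRepository<T> :
    IPropertyFilterRepository<T>, IExpressionRepository<T>, IQueryableRepository<T>
{
    private readonly List<T> items;

    public InMemoryRepository(IEnumerable<T> items) { this.items = items.ToList(); }

    public List<T> FindAll(IDictionary<string, object> propertyValues)
    {
        // Reflection-based equality check on each named property.
        return items.Where(item => propertyValues.All(pair =>
            Equals(typeof(T).GetProperty(pair.Key).GetValue(item, null), pair.Value))).ToList();
    }

    public T[] Query(Expression<Func<T, bool>> where)
    {
        // ToArray forces execution before the result leaves the repository.
        return items.AsQueryable().Where(where).ToArray();
    }

    public IQueryable<T> GetAll() { return items.AsQueryable(); }
}

public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class Program
{
    public static InMemoryRepository<Product> CreateRepository()
    {
        return new InMemoryRepository<Product>(new[]
        {
            new Product { Name = "Tea", Price = 3m },
            new Product { Name = "Coffee", Price = 5m },
            new Product { Name = "Cake", Price = 12m }
        });
    }

    public static void Main()
    {
        var repository = CreateRepository();

        // Style 1: equality on property values is all the interface allows.
        var tea = repository.FindAll(new Dictionary<string, object> { { "Name", "Tea" } });

        // Style 2: any predicate, but the array is fully loaded on return.
        var cheap = repository.Query(p => p.Price < 10m);

        // Style 3: the caller keeps composing; nothing runs until ToList.
        var pricey = repository.GetAll().Where(p => p.Price > 4m).OrderBy(p => p.Name).ToList();

        Console.WriteLine(tea.Count + " " + cheap.Length + " " + pricey.Count);
    }
}
```

Against a real LINQ provider the difference between styles 2 and 3 is where the query gets translated and executed: inside the repository, or wherever the caller finally enumerates the result.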

Allowing types from System.Linq to be exposed from the repository is OK because it's a core part of the .NET framework, but what about NHibernate types like ICriteria? I think it's a poor choice to surface these. We should be attempting to insulate the application from the data access tools. In theory we should be able to swap in any reasonably well specified ORM. In practice I've found this to be problematic because of the different mapping and LINQ capabilities provided by different ORMs, but the intention should remain.

Just to wind up, I'd be very interested in hearing about other generic repository implementations out there. I'm going to be giving a talk next week about this pattern and need lots of help!

Monday, January 12, 2009

Should my repository expose IQueryable?

This post is an attempt to describe an interesting point of difference about the way a generic repository can be implemented. I'm writing it to lay out my argument that exposing IQueryable on a generic repository is a good thing. I know a lot of people disagree, so I'm hoping I can spark a debate and, if not win it, at least learn why I'm wrong.

I talked about the generic repository pattern a while back. My IRepository interface looks something like this:

public interface IRepository<T> where T : class
{
    T GetById(int id);
    IQueryable<T> GetAll();
    void InsertOnSubmit(T entity);
    void DeleteOnSubmit(T entity);
    void SubmitChanges();
}

It assumes that the underlying data access model is based on Unit of Work, but with that caveat, it will happily wrap any data access technology that supports a LINQ provider. I've used it successfully with both LINQ to SQL and NHibernate. I've found returning an IQueryable<T> very useful in my applications; there's a nice pattern where I can chain extension methods:

var articles = articleRepository
    .GetAll()
    .ThatMatch(criteria)
    .ToPagedList(pageNumber, pageSize);

However, returning IQueryable<T> is controversial. Quite a few people I respect think that it allows data access concerns to leak into the wider application. Since the way IQueryable<T> is resolved depends on the LINQ provider, some part of my application may fail at runtime. What may work with LINQ to Objects in my unit tests may fail with LINQ to SQL or NHibernate.Linq because not every expression can be parsed into SQL. Also, we are allowing what can be passed to the database to be specified at any point in our application rather than containing that specification within our repository. That makes it possible to write articleRepository.GetAll().AsEnumerable() when we might have millions of articles in the database. Not good.

But in our brave new ORM world, hasn't the horse already bolted?

We are happy to use NHibernate or LINQ to SQL to track changes to our domain entities, and we trust these tools to write the updates correctly back to the database. We also rely on them to lazy load our domain object graph as required. Sure it means we can have an inefficient conversation with the database if we're not careful (the N+1 problem for example), but those are minor concerns compared with the tremendous benefits of ORMs. Behind the scenes the ORM is fiddling with our POCO domain entities so they're not really POCO any more, but we accept that.

Allowing instances of IQueryable<T> to escape from the repository is a similar case. We're saying, "I don't care when the query execution takes place, or what the query looks like, just give me the correct collection of entities". We are giving up control of what queries can be sent to our database, but does this really matter? The benefits are huge if we can live with this loss of control. It's real persistence ignorance. Rather than being concerned about exactly what queries are being sent down the wire we are simply filtering collections in a natural style and leaving it to our infrastructure to work out how to give us what we want.

We can successively filter collections knowing that only the entities we need will be loaded. In the above example I can have query specifications that are specific to that entity (ThatMatch) and others that work on any collection (ToPagedList), mix and match as needed, and only load a single page of entities from the database or whatever backing store we've configured.
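For anyone wondering what such extension methods might look like, here is a rough sketch. It is not the Suteki Shop code: the Article class, the string criteria and the plain List<T> return type are assumptions made to keep the example small, and LINQ to Objects (via AsQueryable) stands in for a real provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity for the sketch.
public class Article
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public static class ArticleQueryExtensions
{
    // Entity-specific specification: here the criteria is just a title fragment.
    public static IQueryable<Article> ThatMatch(this IQueryable<Article> articles, string titleFragment)
    {
        return articles.Where(a => a.Title.Contains(titleFragment));
    }
}

public static class PagingExtensions
{
    // Works on any IQueryable<T>. pageNumber is 1-based.
    // ToList is the point at which the query is actually executed.
    public static List<T> ToPagedList<T>(this IQueryable<T> source, int pageNumber, int pageSize)
    {
        return source.Skip((pageNumber - 1) * pageSize).Take(pageSize).ToList();
    }
}

public static class Program
{
    public static IQueryable<Article> CreateArticles()
    {
        // 25 articles titled "Odd 1", "Even 2", "Odd 3" and so on.
        return Enumerable.Range(1, 25)
            .Select(i => new Article { Id = i, Title = (i % 2 == 0 ? "Even " : "Odd ") + i })
            .AsQueryable();
    }

    public static void Main()
    {
        var page = CreateArticles()
            .ThatMatch("Even")
            .OrderBy(a => a.Id)
            .ToPagedList(2, 5);

        // Second page of five even-numbered articles: ids 12, 14, 16, 18, 20.
        foreach (var article in page)
        {
            Console.WriteLine(article.Id + ": " + article.Title);
        }
    }
}
```

The OrderBy is there because paging only makes sense over a defined order. With a real provider, the Where, OrderBy, Skip and Take would all be composed into a single query before ToList runs it.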

As an aside, LINQ to SQL has a really nice (but nasty at the same time) feature that allows you to do this:

var orders = customer.Orders.Where(order => order.Price > 10.0);

It will only lazy load the orders where the price is greater than 10. You can't do this with NHibernate. The nasty part is that you have to implement your entity collection properties as EntitySet (which implements IQueryable) and that really is letting a data access concern leak into the application. But there's no reason why a proxy implementation couldn't do the same thing, so long as we define our collections as IQueryable<T>.

So let us embrace IQueryable<T>. Do not fear lazy evaluation.

Saturday, January 10, 2009

More Curry?

Sorry, I just can’t resist :P

I read most of F# for Scientists by Dr Jon Harrop over the Christmas holidays. Now, don’t be put off by the title, it’s a really wonderful little book. I’m no scientist, my undergraduate degree was a general social science pick and mix affair, but I found most of it straightforward. I had to skip some of the complex mathematics but that didn’t seem to hurt my appreciation of the programming principles being discussed.

It’s really nice to find a small programming book. Far too many assume too little intelligence from the reader and waffle at length on trivial subjects. It doesn’t help that the IT profession seems to value its books by the kilogram. Dr Harrop doesn’t suffer from either of these traits and is happy to introduce difficult subjects in a concise and direct style. Now that means that I sometimes had to spend a while on each page to make sure I understood it, but that’s far better than reading page after page of whatever for dummies.

If I’ve taken anything from this book, it’s a much better understanding of currying. I talked about it a while back when discussing an excellent presentation of functional C# by Oliver Sturm, but at that time I hadn’t understood how central it was to understanding F#.

I’m going to try to show how currying is built into F# as a core part of the language, and how what at first appears to be imperative style syntax is in fact very different.

The first thing to note about F# is that every function only has one argument and one return value. When you write a function that looks like it’s got several arguments, what you are actually creating is a curried set of functions. Take a simple add function:

let add a b = a + b

Now, when I first saw this syntax, I thought, OK add takes two arguments, a and b, and adds them with an implicit return value. But when we run this assignment in F# interactive we get this:

val add : int -> int -> int

This is telling us that we have a function that takes an int and returns a function. The function it returns takes an int and returns an int.

If we were to write this in C# it would look like this:

Func<int, Func<int, int>> add = x => y => x + y;

Now if we write:

let add5 = add 5

We get:

val add5 : (int -> int)

In C# we would write this:

var add5 = add(5);

It makes sense that we are using our function that takes an int and returns a function. The new function is called add5; it takes an int as its argument and returns an int. We can use this function like this:

add5 2

which gives:

val it : int = 7

If we write:

add 5 2

We’re actually writing the same as above but on one line. In C# it would look like this:

add(5)(2)

F# is built from the ground up to leverage currying. This is very powerful, but you have to get it, or it becomes very confusing.

But why is it useful? I spend a lot of time talking about Dependency Injection as a way of creating component oriented software. In our object oriented C# world, dependency injection means that we can write generic high level classes that encapsulate the orchestration of lower level ones and defer their concrete resolution until runtime. Currying is a way of doing this at a functional level. By factoring out higher level functions that take lower level functions as arguments we can reuse those patterns of higher level orchestration. Think of it as functional dependency injection.
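Here is a small C# sketch of what I mean by functional dependency injection. The names (DescribePrice, pounds) are made up for the illustration: the first argument of the curried function plays the part of the injected dependency, and the function you get back is the "configured component".

```csharp
using System;
using System.Globalization;

public static class Program
{
    // Curried function: first supply the formatter (the dependency),
    // then, later, the price to describe.
    public static readonly Func<Func<decimal, string>, Func<decimal, string>> DescribePrice =
        format => price => "Price: " + format(price);

    public static void Main()
    {
        // A low-level function we want to plug in.
        Func<decimal, string> pounds =
            amount => "GBP " + amount.ToString("0.00", CultureInfo.InvariantCulture);

        // "Inject" the dependency once...
        var describeInPounds = DescribePrice(pounds);

        // ...and use the resulting function wherever it's needed.
        Console.WriteLine(describeInPounds(9.5m)); // Price: GBP 9.50
    }
}
```

Swapping in a different formatter gives a differently configured function without touching DescribePrice, which is much the same trade we make when we register a different implementation with an IoC container.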

Friday, January 09, 2009

Javascript tools

In the last month or so, I've written more Javascript than I had in the last several years. It's been quite a steep learning curve for me, and I'm still very much a newbie, but here's a list of stuff that I've found useful:

  1. jQuery. There's no way I would attempt any Javascript work now without this fantastic library. It takes most of the browser pain away <cough>IE</cough> and makes working with the DOM a very pleasant experience.
  2. Firebug. The essential web development tool. You are in the dark without this fantastic Firefox plugin.
  3. YUI Test. I've been doing Test Driven Development in .NET for years. There's no way I want to write significant amounts of code without a unit testing framework. The experience is not as slick as NUnit + TestDriven.NET, but I guess it's early days for TDD and Javascript.
  4. Json Formatter. I'm passing some complex Json object graphs back and forth, this neat little utility just works.
  5. Functional Javascript. I'm in the process of having my simple mind expanded by F# and functional programming. Javascript is a version of Lisp, so get out your Y-combinators!
  6. Visual Studio. This is my Javascript editor, but it's not all it's trumped up to be. I haven't been able to get the intellisense to work, and it seems to take an age to update the syntax checking. Correct code will sit there for several seconds covered in squiggly red lines, not a great experience.

With this toolset, writing reasonably serious Javascript code is quite pleasant. I still spend most of my time looking stuff up, and a Javascript guru would probably feel nauseous looking at my code, but I've actually been having a lot of fun. Now there's something I'd never have thought I would say :)

Integrating Fluent NHibernate and the Windsor NHibernate Facility

Our current project uses NHibernate as its ORM. Until recently we've been using the default hbm mapping files, but we've been suffering for some time from Fluent NHibernate envy. So now we've decided to take the plunge and migrate our project to code based mapping. My esteemed colleague Keith Bloom has been doing most of the work for this, and this post is simply me taking a free ride on all his hard work. Thanks Keith :)

I'm not going to describe Fluent NHibernate here. You can check out the project page if you want to see what it's all about. Suffice to say that it's a really nice API to describe object-relational mapping. Once you've defined your mappings, it's a simple case of applying them to the NHibernate configuration with a handy extension method:

var cfg = new Configuration()
	.Configure()
	.LoadMappingAssembly(Assembly.LoadFrom("Name of assembly that contains your maps."));

However, we're using Windsor's very convenient NHibernate integration facility in our project. It does all the configuration and session management for us, so we don't have to worry about it. The problem is that, because it handles it for us, there's not an immediately obvious place to access the configuration to apply the Fluent NHibernate mappings.

It turns out that the simplest way of doing this is to write your own implementation of IConfigurationBuilder. This is the NHibernate facility class that actually creates the NHibernate configuration:

public class FluentNHibernateConfigurationBuilder : IConfigurationBuilder
{
    public Configuration GetConfiguration(IConfiguration facilityConfiguration)
    {
        // Let the facility's default builder create the NHibernate configuration as usual.
        var defaultConfigurationBuilder = new DefaultConfigurationBuilder();
        var configuration = defaultConfigurationBuilder.GetConfiguration(facilityConfiguration);

        // Then add the Fluent NHibernate mappings before handing the configuration back.
        configuration.AddMappingsFromAssembly(Assembly.LoadFrom("Name of assembly that contains your maps."));
        return configuration;
    }
}

Note that we're simply deferring the creation of the NHibernate configuration to the DefaultConfigurationBuilder and then adding the call to AddMappingsFromAssembly before passing it on. All we have to do now is configure the facility to use our new configuration builder:

<facility id="nhibernate"
          isWeb="false"
          type="Castle.Facilities.NHibernateIntegration.NHibernateFacility, Castle.Facilities.NHibernateIntegration"
          configurationBuilder="Keith.WindsorNHibernate.Services.FluentNHibernateConfigurationBuilder, Keith.WindsorNHibernate">
  <factory id="nhibernate.factory">
    <settings>
      <item key="show_sql">true</item>
      <item key="connection.provider">NHibernate.Connection.DriverConnectionProvider</item>
      <item key="connection.driver_class">NHibernate.Driver.SqlClientDriver</item>
      <item key="dialect">NHibernate.Dialect.MsSql2005Dialect</item>
      <item key="connection.connection_string">Data Source=.\SQLEXPRESS;Initial Catalog=Northwind;Integrated Security=True</item>
    </settings>
  </factory>
</facility>

Keith has kindly allowed me to post his test solution which you can download below. You'll need sqlexpress with a copy of Northwind to run it.

http://static.mikehadlow.com/Keith.WindsorNHibernate.zip