Crazy Talk: Reducing ORM Friction
Let’s get this out of the way: I know you’re going to think I’m nuts as you read this. You may “pfft” to what you’re about to read – know that I know you’re “pfft”-ing me. All I ask is that you consider what I’m about to suggest…
Development Friction
The term “friction” gets thrown around a lot in development. It’s what it sounds like: something you do, or a process you undertake, that slows you down as you crank out code. Think on this for a second: when you’re building an application, what’s the number one thing that slows you down (technically speaking – running out of Red Bull doesn’t count)?
For me it’s the database “stuff”. It’s why I made SubSonic – I was tired of thinking about it and I wanted something faster and easier.
Tangent
This whole post (and thing I’m about to dive into) was dreamt up during a bike ride to the store. I thought a fun post idea was to “send a post back in time” and entitle it “Greetings from the Year 2012”, and I would laugh at all the silly stuff I did in 2008. The top of that list, for me, is the continuing struggle we have with persisting data “properly”. After 10+ years, the marriage of web and database is still arguing over the same old stuff. Can’t we move on? Shouldn’t this be easier by now?
You might have other sources of friction, or you might be saying “dude [MY ORM] roolz! LOLZ at U”. I’ll bet it does – but you still need to work with it (and your database) as you build out your site. Even if it takes you 1 minute to “update and regen” – you still have to mess with the DB and mess with mapping (if you do that).
I’ve talked with a lot of people over the last few weeks about this and asked them the same question:
In 5 years what do you think will finally be changed?
The answer, every single time, is a variation on “ORMs will finally work properly”. What if I told you that you don’t need to wait for 5 years for this? What if I told you to ditch your ORM and your database and focus on what’s the most important thing: your application?
How Do You Do What You Do?
There are generally two camps of developers:
- The guys who create a database and then build the app, and then update the database, and then change the app, etc.
- The guys who write tests, create a model, update their database and remap their ORM, rinse and repeat.
If you’re a TDD fan (or want to be) you might be interested in Domain-Driven Design (DDD). Yes, it’s another one of those buzz-words but you might actually be a DDD person right now, and not even know it. Check it out:
“What it’s all about is creating as simple a model as possible, one that still captures what’s important for the domain of the application. During development the process could really be described as knowledge-crunching by the developers and domain experts together. The knowledge that is gained is put into the model.”
(Nilsson, Applying Domain-driven Design and Patterns)
Yes, I’m quoting a DDD book. I’ve been absorbed. The point here is that most of us have always worked this way – working closely with clients to understand their business, and making sure what we build is focused.
The split comes in with what system you work from: the database or the tests? Which is more appropriate in terms of building out an application domain? I’m hoping to convince you, today, to toss your database as you don’t need it. Not yet anyway. Focus on your tests and your domain and I think you’ll see that you can move a lot faster.
Database YAGNI (or the Database as a Feature)
YAGNI is the principle of “You Ain’t Gonna Need It”, which essentially means “don’t add it unless requirements/testing make you add it” and is one of the main drivers of TDD. It keeps code light and manageable, and keeps cruft out of your application.
I think this should apply to architectures as well – why implement, design, and build a database (with data access code) when the application doesn’t need it? And yes, this is where the Crazy Talk kicks in.
<CrazyTalk>
Databases are pretty heavy information organization and retrieval systems. Even the lighter-weight ones are capable of powering some pretty high-level querying at very rapid speeds.
Do you need this when you’re developing? Do you need to play “ORM-Catch up”? Probably not.
What if, in 10 years, the platform could just translate your model for you and save it properly without you thinking about it?
What if I told you that you can do that now? Well you can – and this is where you’re gonna say “PFFT” and tell me I’m crazy. But as I said – I know you’re going to say it… so just pretend I can hear you…
</CrazyTalk>
Off The Reservation with OODBs
Object Databases have been around for a long, long time. If you’ve never heard of these things, well they basically crumple your object up into binary (by serialization) and save it to disk for you to access later. If you’d like to read more, this is a great post on OODBs, what they are, and how they work. To summarize:
An OODBMS is the result of combining object oriented programming principles with database management principles. Object oriented programming concepts such as encapsulation, polymorphism and inheritance are enforced as well as database management concepts such as the ACID properties (Atomicity, Consistency, Isolation and Durability) which lead to system integrity, support for an ad hoc query language and secondary storage management systems which allow for managing very large amounts of data. The Object Oriented Database Manifesto [Atk 89] specifically lists the following features as mandatory for a system to support before it can be called an OODBMS: Complex objects, Object identity, Encapsulation, Types and Classes, Class or Type Hierarchies, Overriding, overloading and late binding, Computational completeness, Extensibility, Persistence, Secondary storage management, Concurrency, Recovery and an Ad Hoc Query Facility.
Here’s my thought for you: What if you used an OODB for development ONLY and implemented SQL Server later, when you know what you need to create?
You can’t ditch your RDBMS entirely – never. However I will suggest that working on two systems at once, when you don’t need to, is silly. You can do a lot more in terms of RAD development right now – and port to SQL later.
Doesn’t it make more sense to create the database when, and only when, you need it? Hold that thought – I’m coming back to it.
DB4O – A Free OSS Object Database
I’ve been using DB4O a lot over the last few months and I really like it. The tutorials are very easy and getting up to speed is no problem at all. First, however, I’d like to go over some things you might be wondering about.
Let’s say I have three objects in my model:
using System.Collections.Generic;

public class Product
{
    public string Name { get; set; }
    public Supplier Supplier { get; set; }
    public IList<Review> Reviews { get; set; }
}

public class Review
{
    public string AuthorEmail { get; set; }
    public string Body { get; set; }
}

public class Supplier
{
    public string Name { get; set; }
}
You might be wondering about how these objects are persisted, and how integrity might be enforced. It’s actually required that they enforce many of the same concepts as an RDBMS so that you don’t make a mess out of your model storage.
For instance, if I create a Supplier for a Product, it will be stored as an independent Supplier that I can then assign to another Product. If I change that Supplier’s name, it will get changed for both. It’s a single object, and the OODB works with the idea of “pointers” in the same way that a regular database does. The thing here, however, is that there is no joining – the relationships are implicit and understood. In this way, an OODB can (and often does) outperform its RDBMS counterparts.
But I’m not here to talk about perf and scaling – it doesn’t matter for what I’m suggesting.
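To make the shared-Supplier point concrete, here is a minimal sketch in plain C#. It does not touch DB4O at all, so it only illustrates the in-memory object identity that the object database is described as preserving when it stores and reloads the graph. The `SharedSupplierDemo` class is a hypothetical name used for illustration, and `Product` is trimmed to the two properties the example needs.

```csharp
using System;

public class Supplier
{
    public string Name { get; set; }
}

public class Product
{
    public string Name { get; set; }
    public Supplier Supplier { get; set; }
}

// Hypothetical demo class; not part of DB4O.
public static class SharedSupplierDemo
{
    // Builds two products that point at the same Supplier instance,
    // renames the supplier once, and returns both products.
    public static Product[] Run()
    {
        var supplier = new Supplier { Name = "Test supplier" };
        var first = new Product { Name = "first", Supplier = supplier };
        var second = new Product { Name = "second", Supplier = supplier };

        // One change, visible through both products, because there is
        // only one Supplier object and each product holds a reference to it.
        supplier.Name = "Renamed supplier";

        return new[] { first, second };
    }
}
```

There is no foreign key and no join here: each Product simply holds a reference to the Supplier, which is the relationship the post calls implicit.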
If you’re curious and want to play along at home, go and download DB4O from their website and install it. To use it you have to make a few references in your application (specifically Db4objects.Db4o.dll). You then need to write some wrapper code – but I’ve got that covered for you:
using System.IO;
using System.Web;
using Db4objects.Db4o;

public class DB4O
{
    // guards lazy creation of the shared container
    static readonly object padlock = new object();

    // static object container variable
    static IObjectContainer _db = null;
    static string _dbPath = System.Configuration.ConfigurationManager.ConnectionStrings["ObjectStore"].ConnectionString;

    public static string DBPath
    {
        get { return _dbPath; }
        set { _dbPath = value; }
    }

    public static IObjectContainer Container
    {
        get
        {
            lock (padlock)
            {
                if (_db == null)
                {
                    // check to see if this is pointing to data directory
                    // change as you need btw
                    if (_dbPath.Contains("|DataDirectory|"))
                    {
                        // we know, then, that this is a web project
                        // and HttpContext is hopefully not null...
                        _dbPath = _dbPath.Replace("|DataDirectory|", "");
                        string appDir = HttpContext.Current.Server.MapPath("~/App_Data/");
                        _dbPath = Path.Combine(appDir, _dbPath);
                    }
                    _db = Db4oFactory.OpenFile(_dbPath);
                }
                return _db;
            }
        }
    }

    public static void CloseContainer()
    {
        if (_db != null)
        {
            _db.Close();
        }
        _db = null;
    }
}
This code assumes that you have an App_Data directory and that you have a connection string in your Web.config like this one:
<add name="ObjectStore" connectionString="|DataDirectory|ObjectStore.yap"/>
Yes, that’s a singleton you see there, and yes I know it’s probably making you cringe. The reason we need to use a singleton is that the IObjectContainer locks the binary file where the data is kept. File locking and singletons might not sit well with you right now, but the way I have this set up here is for a single user – ME – because I’m developing against it. If this were a live app I would be able to set a bunch of settings to make this thread-safe etc.
But it’s not production – it’s development only (have I mentioned this yet?) so you don’t need to worry about singletons, perf, and scaling.
Great, so now that we have our container, let’s store an object. This isn’t really a test, but you’ve read a lot more crazy stuff so far, so this won’t surprise you much. Consider this a spike please – or maybe pretend I’m asserting something:
[TestMethod]
public void ObjectRepo_Should_Store_Product()
{
    Product p = new Product();
    p.Name = "test product";
    // Supplier has no constructor that takes a name, so use an object initializer
    p.Supplier = new Supplier { Name = "Test supplier" };
    DB4O.Container.Store(p);
}
Yes, it’s that easy. But that’s not the best part. I can add a DLL to the project that DB4O just released, called “Db4objects.Db4o.Linq” and it does what you might imagine, which is extremely cool:
[TestMethod]
public void ObjectRepo_Should_Return_Product()
{
    // needs: using System.Linq; using Db4objects.Db4o.Linq;
    var result = from Product p in DB4O.Container
                 where p.Name == "test product"
                 select p;
    Assert.AreEqual(1, result.Count());
}
Yes, that would be LINQ, working with an OODB container. We can also query a bit deeper, with no problems:
[TestMethod]
public void ObjectRepo_Should_Return_Product_By_Supplier()
{
    // the query walks from Product into its Supplier without any join
    var result = from Product p in DB4O.Container
                 where p.Supplier.Name == "Test supplier"
                 select p;
    Assert.AreEqual(1, result.Count());
}
And to illustrate my point above about object independence and integrity, I can also get the Supplier, independent of the Product:
[TestMethod]
public void ObjectRepo_Should_Return_Supplier()
{
    // the Supplier was stored along with its Product, but is queryable on its own
    var result = from Supplier s in DB4O.Container
                 select s;
    Assert.AreEqual(1, result.Count());
}
This, literally, is the tip of the iceberg. DB4O has support for transactions, indexing, and many different ways of querying to improve performance and usage. But I’m not talking about perf here – it doesn’t matter, this is only for development.
Also it’s worth noting that if I change the properties of Product around, it won’t break. The changed property just won’t get loaded – but everything else will. So you can change and alter as required and nothing breaks!
Real-World Application
What I’m suggesting is that you can create an IRepository<T> and then implement a nice ObjectRepository<T> to work with in your application. It’s very simple – and yes, here’s some more code for you:
using System;
using System.Collections;
using System.Linq;
using System.Linq.Expressions;

public interface IRepository<T>
{
    IQueryable<T> GetAll();
    PagedList<T> GetPaged(int pageIndex, int pageSize);
    IQueryable<T> Find(Expression<Func<T, bool>> expression);
    void Save(T item);
    void Delete(T item);
}
Now, you can implement this quite nicely with DB4O:
using System;
using System.Collections;
using System.Linq;
using Db4objects.Db4o.Linq;

public class ObjectRepository<T> : IRepository<T> where T : class
{
    /// <summary>
    /// Returns all T records in the repository
    /// </summary>
    public IQueryable<T> GetAll()
    {
        return (from T items in DB4O.Container
                select items).AsQueryable();
    }

    /// <summary>
    /// Returns a PagedList of items
    /// </summary>
    /// <param name="pageIndex">zero-based index to be used for lookup</param>
    /// <param name="pageSize">the size of the paged items</param>
    /// <returns></returns>
    public PagedList<T> GetPaged(int pageIndex, int pageSize)
    {
        var query = (from T items in DB4O.Container
                     select items).AsQueryable();
        return new PagedList<T>(query, pageIndex, pageSize);
    }

    /// <summary>
    /// Finds an item using a passed-in expression lambda
    /// </summary>
    public IQueryable<T> Find(System.Linq.Expressions.Expression<Func<T, bool>> expression)
    {
        return GetAll().Where(expression);
    }

    /// <summary>
    /// Saves an item to the database
    /// </summary>
    /// <param name="item"></param>
    public void Save(T item)
    {
        DB4O.Container.Store(item);
    }

    /// <summary>
    /// Deletes an item from the database
    /// </summary>
    /// <param name="item"></param>
    public void Delete(T item)
    {
        DB4O.Container.Delete(item);
    }
}
If you’re wondering what a “PagedList” is – you can find out more here.
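In case the linked write-up is not handy, here is a rough sketch of what a class with that constructor shape could look like. This is an assumption on my part, not the actual PagedList from the link: only the constructor signature `(IQueryable<T>, int pageIndex, int pageSize)` comes from the repository code above, and every property name below is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the PagedList<T> used by IRepository<T>;
// the real class linked from the post may differ.
public class PagedList<T> : List<T>
{
    public int PageIndex { get; private set; }
    public int PageSize { get; private set; }
    public int TotalCount { get; private set; }

    public PagedList(IQueryable<T> source, int pageIndex, int pageSize)
    {
        if (pageIndex < 0) throw new ArgumentOutOfRangeException("pageIndex");
        if (pageSize < 1) throw new ArgumentOutOfRangeException("pageSize");

        PageIndex = pageIndex;
        PageSize = pageSize;
        TotalCount = source.Count();

        // pageIndex is zero-based, matching the doc comments on GetPaged.
        AddRange(source.Skip(pageIndex * pageSize).Take(pageSize));
    }

    // Number of pages needed to hold TotalCount items (rounded up).
    public int TotalPages
    {
        get { return (TotalCount + PageSize - 1) / PageSize; }
    }

    public bool HasPreviousPage { get { return PageIndex > 0; } }
    public bool HasNextPage { get { return PageIndex + 1 < TotalPages; } }
}
```

Because the list only ever sees an IQueryable<T>, the same class should work whether the query comes from the object container or from a SQL-backed table, although a stable ordering on the source query is worth adding before relying on page boundaries.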
ORM Selection Made Easy
Suppose you didn’t need to worry about your database as you build your application. Better yet, suppose you didn’t need to worry about your ORM! This latter thought is actually a critical, critical item. When I suggested that you could wait until you’re about to launch to actually implement a database, a friend of mine asked me “well if you do that, you might back yourself into such a complex model that your ORM won’t handle it. What do you do then?”.
And I said “BINGO”.
Put another way, what you’re getting by not worrying about your ORM (until you need it) is the freedom to develop your app without influence from your database. It is true that you can make a model that’s too complex for your favorite ORM – but doesn’t that mean your favorite ORM probably wasn’t up to the task anyway? Isn’t it much nicer to find that out the easy way?
In most cases you can probably just jump over to SQL very simply, just by replacing the reference to ObjectRepository with something like SqlRepository (this code is using LINQ to SQL – but you can change this out with EF in the future – I’ll try to update this later):
using System;
using System.Collections;
using System.Data.Linq;
using System.Linq;

public class SqlRepository<T> : IRepository<T> where T : class
{
    NorthwindDB.DB _db = null;

    public SqlRepository()
    {
        _db = new NorthwindDB.DB();
    }

    /// <summary>
    /// Gets the table provided by the type T and returns for querying
    /// </summary>
    private Table<T> Table
    {
        get { return _db.GetTable<T>(); }
    }

    /// <summary>
    /// Returns all T records in the repository
    /// </summary>
    public IQueryable<T> GetAll()
    {
        return Table;
    }

    /// <summary>
    /// Returns a PagedList of items
    /// </summary>
    /// <param name="pageIndex">zero-based index to be used for lookup</param>
    /// <param name="pageSize">the size of the paged items</param>
    /// <returns></returns>
    public PagedList<T> GetPaged(int pageIndex, int pageSize)
    {
        return new PagedList<T>(Table, pageIndex, pageSize);
    }

    /// <summary>
    /// Finds an item using a passed-in expression lambda
    /// </summary>
    public IQueryable<T> Find(System.Linq.Expressions.Expression<Func<T, bool>> expression)
    {
        return Table.Where(expression);
    }

    /// <summary>
    /// Saves an item to the database
    /// </summary>
    /// <param name="item"></param>
    public void Save(T item)
    {
        if (!Table.Contains(item))
        {
            Table.InsertOnSubmit(item);
        }
        _db.SubmitChanges();
    }

    /// <summary>
    /// Deletes an item from the database
    /// </summary>
    /// <param name="item"></param>
    public void Delete(T item)
    {
        Table.DeleteOnSubmit(item);
        _db.SubmitChanges();
    }
}
Since we’re wrapping everything in IRepository<T>, you can swap parts as needed. I’ll bet you were wondering why you might want to develop this way (I know I was a long time ago – who really ever swaps components anyway?) – but this is a good example of why you might want to decouple your system as much as possible.
Swapping out data stores like this is equivalent to turning your minivan into a Ferrari if you want to drive the Autobahn, and back again when you need to get the kids from soccer.
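The swap only works if the rest of the application talks to the interface and never to a concrete repository. Here is a small sketch of that idea under a few loud assumptions: the interface is a trimmed copy of IRepository<T> (GetPaged is left out so the sample compiles without PagedList<T>), and `InMemoryRepository<T>` and `SupplierCatalog` are hypothetical names invented for the illustration, not anything from DB4O or LINQ to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Trimmed copy of the post's interface; GetPaged is omitted so the
// sketch compiles without PagedList<T>.
public interface IRepository<T>
{
    IQueryable<T> GetAll();
    IQueryable<T> Find(Expression<Func<T, bool>> expression);
    void Save(T item);
    void Delete(T item);
}

// Hypothetical third implementation backed by a List<T>.
public class InMemoryRepository<T> : IRepository<T> where T : class
{
    private readonly List<T> _items = new List<T>();

    public IQueryable<T> GetAll() { return _items.AsQueryable(); }

    public IQueryable<T> Find(Expression<Func<T, bool>> expression)
    {
        return GetAll().Where(expression);
    }

    public void Save(T item)
    {
        // Only add items that are not already tracked.
        if (!_items.Contains(item)) _items.Add(item);
    }

    public void Delete(T item) { _items.Remove(item); }
}

public class Supplier
{
    public string Name { get; set; }
}

// Hypothetical consumer: it only knows about IRepository<Supplier>.
public class SupplierCatalog
{
    private readonly IRepository<Supplier> _suppliers;

    public SupplierCatalog(IRepository<Supplier> suppliers)
    {
        _suppliers = suppliers;
    }

    public void Register(string name)
    {
        _suppliers.Save(new Supplier { Name = name });
    }

    public bool IsRegistered(string name)
    {
        return _suppliers.Find(s => s.Name == name).Any();
    }
}
```

A caller would write `new SupplierCatalog(new InMemoryRepository<Supplier>())` today and pass an ObjectRepository<Supplier> or SqlRepository<Supplier> later, with no change inside SupplierCatalog itself.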
Am I nuts? If you think I am – please give me some details as I think this would make for an interesting discussion. Just don’t tell the Alt.NET guys yet.