SubSonic Scaling
A question came up in the dashCommerce forums about SubSonic’s ability to scale. I’d like to address this a bit as the link has been deleted and I don’t appear to have the rights to comment on Chris Cyvas’s blog.
A Question of Scale
When you discuss scalability, you’re not talking about speed – you’re talking about availability and “the lack of curve” on the requests vs. response time chart. Ideally you want your response time to remain the same as your users go up – offering the golden “flat line” that means your site will scale. This flat line is achieved through various means and is in itself a hotly-debated topic.
There’s a lot to this – server speed, number of servers, caching, indexing of the DB, etc. As many of you know, this is a very large topic.
Rather than discuss how applications should scale, I’ll keep this focused on what you can expect from your data access. I’ll use something Chris Cyvas brought up in the forums (the thread, along with my comments on his blog, has been deleted – no links… sorry), which is the reason for this post:
“…Think 500,000 products and millions of Sku’s or something close to that. We’d probably have to scrap SubSonic, but the more I work with it the less I like it. It’s a good tool for prototyping and getting things done quickly, but when you look to scale like this my thinking is that it won’t hold up”
Right off the bat I might suggest that the size of your DB has nothing to do with scaling – but what I think Chris is after here is “operations per second” – or something to that effect – in other words a very high traffic site that punches a mess of data in and out of the DB. He could also be discussing Lazy/Eager loading – but that’s an issue we dealt with before 2.0 – so for now I’m going to assume we’re talking high DB traffic.
Test Number One: Inserting Lots Of Data
It’s not easy to test this kind of thing – but what I can do is share with you some tests I run here on my local machine. I’ve beefed them up a bit to address specifically what Chris is worried about (500,000 records). And then I doubled it. And then doubled it again. For this first test I’ll insert 1,000,000 orders into the orders table, and simultaneously insert 1,000,000 items into the Order Details table of Northwind.
I’ll be sure to “interleave” this as well (an Order is entered, then an Order Detail) – to make sure indexing is honored and that I close off any pending table operation that currently exists.
What I’m looking for with this test is a memory leak anywhere or if something “just isn’t right”. It’s sort of like over-clocking your CPU – I want to run this operation and make sure it’s nice and quick, and most importantly that my RAM doesn’t start elevating. Here’s the code for my test:
static void Main(string[] args)
{
    Console.WriteLine(DateTime.Now.ToString());
    DateTime dtStart = DateTime.Now;

    //let's save a meeeeellllionnn items shall we?
    //(i runs from 1 to 1,000,000 inclusive so we really do get a million orders)
    for (int i = 1; i <= 1000000; i++)
    {
        Console.WriteLine("Creating order " + i.ToString());

        //create an Order
        Order o = new Order();
        o.CustomerID = "ALFKI";
        o.EmployeeID = 5;
        o.Freight = 10;
        o.OrderDate = DateTime.Now;
        o.RequiredDate = DateTime.Now;
        o.ShipAddress = "Somwhere, Someday";
        o.ShipCity = "City";
        o.ShipCountry = "US";
        o.ShipName = "Shipper";
        o.ShippedDate = DateTime.Now;
        o.ShipPostalCode = "99999";
        o.ShipRegion = "KS";
        o.Save("me");

        //then the matching Order Detail, interleaved with the Order
        OrderDetail detail = new OrderDetail();
        detail.OrderID = o.OrderID;
        detail.ProductID = 13;
        detail.Quantity = 1;
        detail.UnitPrice = 100;
        detail.Save("me");
    }

    DateTime dtEnd = DateTime.Now;
    Console.WriteLine("Done!");
    Console.WriteLine("Started on " + dtStart.ToString());
    Console.WriteLine("Ended on " + dtEnd.ToString());
    Console.Read();
}
This code is happily running away on my machine right now as I write this. I understand that isn’t a perfect test scenario, but what I’m looking for, specifically, are memory leaks and system “spin outs” – something to indicate that SubSonic “changes” with the size of the DB or somehow flips out when continually processing data. Here’s where I’ve started with my TaskManager after running for about 30 seconds (the highlight is my VS process running the app):
It’s zipping along pretty well, and the RAM is not moving – this is good news.
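If you’d rather not eyeball TaskManager, here is a minimal sketch of logging the same numbers from inside the console app. It is not part of my test above: RunStats and its methods are names made up for this illustration (they are not SubSonic APIs), and the loop body is an empty stand-in for the Save calls.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration only; not part of SubSonic.
static class RunStats
{
    // Records per second for the work done so far; 0 if no time has elapsed.
    public static double PerSecond(long records, TimeSpan elapsed)
    {
        return elapsed.TotalSeconds > 0 ? records / elapsed.TotalSeconds : 0;
    }

    // Working set of this process in megabytes (close to what TaskManager shows).
    public static double WorkingSetMegabytes()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.WorkingSet64 / (1024.0 * 1024.0);
        }
    }

    // Size of the managed heap in megabytes, without forcing a collection.
    public static double ManagedHeapMegabytes()
    {
        return GC.GetTotalMemory(false) / (1024.0 * 1024.0);
    }
}

static class Program
{
    static void Main()
    {
        const int total = 1000000;
        const int reportEvery = 100000;
        Stopwatch timer = Stopwatch.StartNew();

        for (int i = 1; i <= total; i++)
        {
            // Stand-in for the real work, e.g. o.Save("me") and detail.Save("me").

            if (i % reportEvery == 0)
            {
                Console.WriteLine(
                    "{0:N0} done | working set {1:F1} MB | managed heap {2:F1} MB | {3:N0}/sec",
                    i,
                    RunStats.WorkingSetMegabytes(),
                    RunStats.ManagedHeapMegabytes(),
                    RunStats.PerSecond(i, timer.Elapsed));
            }
        }
    }
}
```

If the memory columns stay roughly level from the first report to the last, that is the same “RAM is not moving” signal, just written down where you can compare it later.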
The results.
I’ve just pumped a million orders (and items – 2 million records total) into my SQLExpress Northwind database and it took just about 22 minutes (which is about 1500 records a second):

The TaskManager didn’t change, VS didn’t lock – overall I’d say this was a pretty smooth process. Here’s the final TaskManager and you can see that memory is unchanged:
The question now is – what do you DO with these records? I would imagine if you asked for all the orders back – yah, you might crap your system, as ADO doesn’t like transferring that much data over the wire. Running SELECTs is a matter of asking for the right data back – the rest is just indexing. Anyone can shoot their application in the foot with too many connections or a poorly formed query.
But hey – let’s make sure of this.
Test Number Two: Pulling Records Back Out
The next test loops over the new records and pulls them all out, one by one, each into its own object:
//this is a record we just inserted above
DateTime dtStart = DateTime.Now;

for (int i = 10248; i < 1010248; i++)
{
    Order o = new Order(i);
    Console.WriteLine("Hello from Order " + i.ToString());
}

DateTime dtEnd = DateTime.Now;
Console.WriteLine("Done!");
Console.WriteLine("Started on " + dtStart.ToString());
Console.WriteLine("Ended on " + dtEnd.ToString());
Console.Read();
And I’m watching my memory carefully using TaskManager once again, which thankfully isn’t moving:
The really cool part here is that it finished this loading process in about four minutes. That’s loading up a million records, one by one, each with its own DB call, in precisely 247 seconds. That’s 4048 Order objects loaded per second (with no memory leaks), which I think is pretty neat.
Test Number Three: Loading Collections
Most people work with typed collections, and I wanted to see what would happen if I looped over every single new record (1,000,000 of em) and loaded n+10 into a collection (10248, in my Northwind DB, is the OrderID where the new data starts after the initial insert operations above):
//Collection Loading test with 10 records
DateTime dtStart = DateTime.Now;

for (int i = 10248; i < 1010248; i++)
{
    int nextTen = i + 10;
    OrderCollection coll = new Select().From<Order>().Where("orderid")
        .IsBetweenAnd(i, nextTen).ExecuteAsCollection<OrderCollection>();
    Console.WriteLine("Hello from Orders " + i.ToString() + " - " + nextTen);
}

DateTime dtEnd = DateTime.Now;
Console.WriteLine("Done!");
Console.WriteLine("Started on " + dtStart.ToString());
Console.WriteLine("Ended on " + dtEnd.ToString());
Console.Read();
This was a fun one, and same result – nice and fast and no memory leaks:
The collections loaded in 4 minutes and 22 seconds – or 262 seconds. That’s a total of 10,000,000 Order objects loaded over 262 seconds, which comes out to about 38,167 Order objects loaded a second. I think that’s fairly fast – don’t you? But let’s not stop there…
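To put the single-record and collection numbers side by side, here is the arithmetic worked out per DB call rather than per object. It assumes, as the totals above do, 1,000,000 calls in each test and ten orders per collection call; if the range check turns out to be inclusive on both ends a call could return eleven rows, which would only nudge the second figure up.

```latex
\begin{align*}
\text{One record per call:} \quad
  & \frac{1{,}000{,}000 \text{ calls}}{247 \text{ s}} \approx 4{,}049 \text{ calls/s}
    \;\Rightarrow\; \approx 4{,}049 \text{ objects/s} \\[4pt]
\text{Ten records per call:} \quad
  & \frac{1{,}000{,}000 \text{ calls}}{262 \text{ s}} \approx 3{,}817 \text{ calls/s}
    \;\Rightarrow\; 10 \times 3{,}817 \approx 38{,}170 \text{ objects/s}
\end{align*}
```

Read that way, asking for ten times as many objects per call cost only about 6% more wall-clock time (262 seconds against 247), so on this machine most of the time appears to go to the round trip rather than to building the objects.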
Test Number Four: Loading Bigger Collections – 100 at a Time
Same code as above, but this time I’m loading 100 items per call. Check it:
Memory’s in check and things are going nicely. The final load result is just about 5 minutes. That’s a whole lot of data per second:
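For a rough sense of scale, here is the same back-of-the-envelope arithmetic applied to this run. The inputs are assumptions rather than measurements: 1,000,000 calls (the same loop as before), about 100 orders per call, and “just about 5 minutes” taken as 300 seconds.

```latex
\begin{align*}
\text{Calls per second:} \quad
  & \frac{1{,}000{,}000 \text{ calls}}{300 \text{ s}} \approx 3{,}333 \text{ calls/s} \\[4pt]
\text{Objects per second:} \quad
  & 100 \times 3{,}333 \approx 333{,}000 \text{ objects/s}
\end{align*}
```

Treat that as an order-of-magnitude figure only, since the elapsed time here is rounded to the nearest minute.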
Summary and Obligatory Caveat
Scaling is a huge issue and all I can offer with respect to SubSonic is that we’ve really tried to make it fly and get out of your way – and hopefully I’ve shown that here. I know “Console Benchmarks” aren’t really a complete test – but hopefully what I’ve shown is that we’ve tried to keep a light footprint and that you can expect that SubSonic won’t break down on you – no matter how many records you have.
That said – scaling an application properly is an art form and extends from the Web Server (interceptors) to caching at the app layer, down to proper indexing in the DB. Many, many moving parts and SubSonic is only one of them. Scale at your own risk…







