SubSonic Scaling

A question came up in the dashCommerce forums about SubSonic's ability to scale. I'd like to address this a bit as the link has been deleted and I don't appear to have the rights to comment on Chris Cyvas's blog.

A Question of Scale
When you discuss scalability, you're not talking about speed - you're talking about availability and "the lack of curve" on the requests vs. response time chart. Ideally you want your response time to remain the same as your users go up - offering the golden "flat line" that means your site will scale. This flat line is achieved through various means and is in itself a hotly-debated topic.

There's a lot to this - server speed, number of servers, caching, indexing of the DB, etc. As many of you know, this is a very large topic.

Rather than discuss how applications should scale - I'll keep this focused on what you can expect from your data access. I'll use something Chris Cyvas brought up in the forums (the thread, along with my comments on his blog, has been deleted - no links... sorry) which is the reason for this post:

"...Think 500,000 products and millions of Sku's or something close to that. We'd probably have to scrap SubSonic, but the more I work with it the less I like it. It's a good tool for prototyping and getting things done quickly, but when you look to scale like this my thinking is that it won't hold up"

Right off the bat I might suggest that the size of your DB has nothing to do with scaling - but what I think Chris is after here is "operations per second" - or something to that effect - in other words a very high traffic site that punches a mess of data in and out of the DB. He could also be discussing Lazy/Eager loading - but that's an issue we dealt with before 2.0 - so for now I'm going to assume we're talking high DB traffic.

Test Number One: Inserting Lots Of Data
It's not easy to test this kind of thing - but what I can do is share with you some tests I run here on my local machine. I've beefed them up a bit to address specifically what Chris is worried about (500,000 records). And then I doubled it. And then doubled it again. For this first test I'll insert 1,000,000 orders into the orders table, and simultaneously insert 1,000,000 items into the Order Details table of Northwind.

I'll be sure to "interleave" this as well (an Order is entered, then an Order Detail) - to make sure indexing is honored and that I close off any pending table operation that currently exists.

What I'm looking for with this test is a memory leak anywhere or if something "just isn't right". It's sort of like over-clocking your CPU - I want to run this operation and make sure it's nice and quick, and most importantly that my RAM doesn't start elevating. Here's the code for my test:

        static void Main(string[] args) {

            Console.WriteLine(DateTime.Now.ToString());
            DateTime dtStart = DateTime.Now;

            //create an Order
            //let's save a meeeeellllionnn items shall we?
            for (int i = 1; i <= 1000000; i++) {
                Console.WriteLine("Creating order " + i.ToString());
                Order o = new Order();
                o.CustomerID = "ALFKI";
                o.EmployeeID = 5;
                o.Freight = 10;
                o.OrderDate = DateTime.Now;
                o.RequiredDate = DateTime.Now;
                o.ShipAddress = "Somwhere, Someday";
                o.ShipCity = "City";
                o.ShipCountry = "US";
                o.ShipName = "Shipper";
                o.ShippedDate = DateTime.Now;
                o.ShipPostalCode = "99999";
                o.ShipRegion = "KS";

                o.Save("me");

                OrderDetail detail = new OrderDetail();
                detail.OrderID = o.OrderID;
                detail.ProductID = 13;
                detail.Quantity = 1;
                detail.UnitPrice = 100;
                detail.Save("me");
            }
            DateTime dtEnd = DateTime.Now;
            Console.WriteLine("Done!" );
            Console.WriteLine("Started on " + dtStart.ToString());
            Console.WriteLine("Ended on "+dtEnd.ToString());
            Console.Read();
        }

This code is happily running away on my machine right now as I write this. I understand this isn't a perfect test scenario, but what I'm looking for, specifically, are memory leaks and system "spin outs" - something to indicate that SubSonic "changes" with the size of the DB or somehow flips out when continually processing data. Here's where I've started with my TaskManager after running for about 30 seconds (the highlight is my VS process running the app):

[Image: ss_inserts]

It's zipping along pretty well, and the RAM is not moving - this is good news.
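
If you'd rather not eyeball TaskManager, the same two things (elapsed time and memory growth) can be read from inside the test process itself. What follows is only a minimal sketch, not the code I ran: `LoadTestHarness`, `Run` and `RecordsPerSecond` are hypothetical names made up for illustration, and the workload in `Main` is a stand-in for the real `Save()` calls. It relies only on `Stopwatch` and `Process` from `System.Diagnostics`.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration only - not part of SubSonic.
public static class LoadTestHarness
{
    // Records processed per second of wall-clock time.
    public static double RecordsPerSecond(long records, TimeSpan elapsed)
    {
        if (elapsed <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException("elapsed", "Elapsed time must be positive.");
        return records / elapsed.TotalSeconds;
    }

    // Runs work(i) for i = 0..iterations-1 and reports the elapsed time and the
    // working set of THIS process (not the IDE that launched it).
    public static TimeSpan Run(string label, int iterations, Action<int> work)
    {
        Process self = Process.GetCurrentProcess();
        long before = self.WorkingSet64;
        Stopwatch watch = Stopwatch.StartNew();

        for (int i = 0; i < iterations; i++)
        {
            work(i);
        }

        watch.Stop();
        self.Refresh(); // re-read the process counters before sampling again
        long after = self.WorkingSet64;

        Console.WriteLine("{0}: {1:N0} iterations in {2:F1}s, working set {3:N0} -> {4:N0} bytes",
            label, iterations, watch.Elapsed.TotalSeconds, before, after);
        return watch.Elapsed;
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in workload; in the real test this would create and save an Order.
        long checksum = 0;
        LoadTestHarness.Run("demo", 100000, i => { checksum += i; });
        Console.WriteLine("checksum: " + checksum);

        // The same helper applied to a reported figure: a million loads in 247 seconds.
        Console.WriteLine("{0:F0} records/sec",
            LoadTestHarness.RecordsPerSecond(1000000, TimeSpan.FromSeconds(247)));
    }
}
```

The working set is only a rough signal (the garbage collector can make it wobble), so treat a steady number as "no obvious leak" rather than proof.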

The Results
I've just pumped a million orders (and items - 2 million records total) into my SQLExpress Northwind database and it took just about 22 minutes (which is about 1500 records a second):

[Image: ss_done]

[Image: orderscount]

The TaskManager didn't change, VS didn't lock - overall I'd say this was a pretty smooth process. Here's the final TaskManager and you can see that memory is unchanged:

[Image: tm_2]

The question now is - what do you DO with these records? I would imagine if you asked for all the orders back - yah you might crap your system :) as ADO doesn't like transferring that much data over the wire. Running SELECTs is a matter of asking for the right data back - the rest is just indexing. Anyone can shoot their application in the foot with too many connections or a poorly formed query.

But hey - let's make sure of this.

Test Number Two: Pulling Records Back Out

The next test loops the new records and pulls them all out, one by one into an object:

            //this is a record we just inserted above
            DateTime dtStart = DateTime.Now;
            for (int i = 10248; i < 1010248; i++) {
                Order o = new Order(i);
                Console.WriteLine("Hello from Order " + i.ToString());

            }
            DateTime dtEnd = DateTime.Now;
            Console.WriteLine("Done!");
            Console.WriteLine("Started on " + dtStart.ToString());
            Console.WriteLine("Ended on " + dtEnd.ToString());
            Console.Read();

And I'm watching my memory carefully using TaskManager once again, which thankfully isn't moving:

[Image: single_order]

The really cool part here is that it finished this loading process in about four minutes :). That's loading up a million records, one by one, each with its own DB call, in precisely 247 seconds. That's about 4048 Order objects loaded per second (with no memory leaks), which I think is pretty neat.

[Image: ss_load_done]

Test Number Three: Loading Collections
Most people work with typed collections, and I wanted to see what would happen if I looped over every single new record (1,000,000 of em) and loaded n+10 into a collection (10248, in my Northwind DB, is the OrderID where the new data starts after the initial insert operations above):

           //Collection Loading test with 10 records
           DateTime dtStart = DateTime.Now;
           for (int i = 10248; i < 1010248; i++) {
               int nextTen=i+10;
               OrderCollection coll = new Select().From<Order>().Where("orderid")
                   .IsBetweenAnd(i, nextTen).ExecuteAsCollection<OrderCollection>();
               Console.WriteLine("Hello from Orders " + i.ToString()+" - "+nextTen);


           }
           DateTime dtEnd = DateTime.Now;
           Console.WriteLine("Done!");
           Console.WriteLine("Started on " + dtStart.ToString());
           Console.WriteLine("Ended on " + dtEnd.ToString());
           Console.Read();

This was a fun one, and same result - nice and fast and no memory leaks:

[Image: ss_collections]

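The collections loaded in 4 minutes and 22 seconds - or 262 seconds. Counting 10 orders per call, that's a total of 10,000,000 Order objects loaded over 262 seconds, which comes out to about 38,167 Order objects loaded a second (assuming IsBetweenAnd maps to an inclusive SQL BETWEEN, each call actually returns up to 11 rows, so the real figure is a bit higher). I think that's fairly fast - don't you? But let's not stop there...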

Test Number Four: Loading Bigger Collections - 100 at a Time
Same code as above, but this time I'm loading 100 items per call. Check it:

[Image: ss_collections_100]

Memory's in check and things are going nicely. The final load result is just about 5 minutes. That's a whole lot of data per second:

[Image: ss_collections_100_done]

Summary and Obligatory Caveat
Scaling is a huge issue and all I can offer with respect to SubSonic is that we've really tried to make it fly and get out of your way - and hopefully I've shown that here. I know "Console Benchmarks" aren't really a complete test - but hopefully what I've shown is that we've tried to keep a light footprint and that you can expect that SubSonic won't break down on you - no matter how many records you have.

That said - scaling an application properly is an art form and extends from the Web Server (interceptors) to caching at the app layer, down to proper indexing in the DB. Many, many moving parts and SubSonic is only one of them. Scale at your own risk...
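
To make "caching at the app layer" a bit more concrete, here is a minimal sketch of a read-through cache that sits in front of whatever loads your data. Everything in it is an assumption for illustration: `ReadThroughCache`, `Get` and `LoaderCalls` are hypothetical names, nothing here is a SubSonic API, and the loader in `Main` is a stand-in for a real database call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical app-level cache for illustration only - not part of SubSonic.
public class ReadThroughCache<TKey, TValue>
{
    // Each entry stores its expiry time alongside the cached value.
    private readonly Dictionary<TKey, KeyValuePair<DateTime, TValue>> _entries =
        new Dictionary<TKey, KeyValuePair<DateTime, TValue>>();
    private readonly Func<TKey, TValue> _loader;
    private readonly TimeSpan _timeToLive;
    private readonly object _gate = new object();

    // Counts how often the underlying loader (the "database") was actually hit.
    public int LoaderCalls { get; private set; }

    public ReadThroughCache(Func<TKey, TValue> loader, TimeSpan timeToLive)
    {
        if (loader == null) throw new ArgumentNullException("loader");
        _loader = loader;
        _timeToLive = timeToLive;
    }

    public TValue Get(TKey key)
    {
        // One coarse lock keeps the sketch simple; it also serializes loads.
        lock (_gate)
        {
            KeyValuePair<DateTime, TValue> entry;
            if (_entries.TryGetValue(key, out entry) && DateTime.UtcNow < entry.Key)
            {
                return entry.Value; // still fresh: no loader call
            }

            TValue value = _loader(key); // the only place the data source is touched
            LoaderCalls++;
            _entries[key] = new KeyValuePair<DateTime, TValue>(DateTime.UtcNow + _timeToLive, value);
            return value;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in loader; a real one would fetch the record from the database.
        ReadThroughCache<int, string> cache = new ReadThroughCache<int, string>(
            id => "Order " + id, TimeSpan.FromMinutes(5));

        Console.WriteLine(cache.Get(10248));
        Console.WriteLine(cache.Get(10248)); // served from the cache
        Console.WriteLine("Loader calls: " + cache.LoaderCalls);
    }
}
```

Whether something like this helps depends entirely on how stale your data is allowed to be, which is why I think of it as an application decision rather than a data access one.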

Lars Mæhlum says:
Tuesday, August 19, 2008

As far as I can see, I have yet to see any scaling problems inherent in SubSonic.

I have used it for some pretty db-intensive apps, and it has always performed at least evenly with the stored procs and datatables method.


Andrew Rimmer says:
Tuesday, August 19, 2008

I have been using Subsonic on most of my projects in the last year or two and haven't had any performance issues (not since 2.x anyway).

My biggest project (an e-commerce store funnily enough) has over 50 tables, with the product table containing over 4 million records. It copes without a problem


Scott Watermasysk says:
Tuesday, August 19, 2008

The reality is all tools that do "general" things fall apart as you start to reach very high end numbers.

I don't think that is a problem since the problem is usually a good problem to have. As in the case of dashCommerce, if you have 500,000 products to sell, an off the shelf e-commerce tool is not going to fit the bill.


Duckie says:
Tuesday, August 19, 2008

Same experience here.

You should have some numbers to compare them to. How long does it take a simple datareader to output the same result?


Todd says:
Tuesday, August 19, 2008

I agree with Scott, anyone with 500,000 products isn't going to use dashCommerce or any other open source solution. And what's up with Chris? Subsonic wouldn't be the first place I would look if I had a scaling issue. I wonder what his experience is with scaling sites to hundreds of thousands of users.


Leo says:
Tuesday, August 19, 2008

"...the more I work with it the less I like it"

Doesn't look like Chris' only problem is the one of scale. That is a pretty bold statement, sure would be nice to know how Chris arrived at that conclusion.


Ryan Lanciaux says:
Tuesday, August 19, 2008

I've used SubSonic on very large catalog sites (not ecommerce). It hasn't skipped a beat and any performance issues I had run into when testing were not related to SubSonic.

Thank you for taking the time to generate the metrics on this great tool!


Craig says:
Tuesday, August 19, 2008

I think a product table with 500,000 records and millions of records is not all that large. If they are having performance problems I think they need to do some system re-organisation and possible re-architecture.


Dave Savage says:
Tuesday, August 19, 2008

Thankfully Rob and the team have done a great job on the quality checks for SubSonic. I used SubSonic very early in its existence and I did run into memory issues. I haven't had time to go back and really review what the issue was but after adding some better "cleanup" to my objects, I ceased to run into memory issues.

There is a good saying:

"It's not the size that matters, it's how you use it."

Cheers Rob. =)


Mike Brown says:
Tuesday, August 19, 2008

Interesting numbers. I don't think Subsonic would hold back an application from scaling to thousands of concurrent users. If you have a page that shows 50 items, a thousand users viewing that page isn't putting strain on your Data Access Layer, it's putting strain on the DB itself.

All other things being equal a block of code that returns in under a second with one user, is going to return in under a second with a thousand.

If you're having issues with the database not scaling because of how much data you have, check your indices. Open up SQL Performance Analyzer and let it find what's killing you. The last place you should look is your Data Access Layer because unless you're doing something foolish like querying on a text field with a like clause, it's not the culprit here.


Pete says:
Tuesday, August 19, 2008

To me it seems like that guy just is not a fan of subsonic. There is nothing wrong with that. However it is not very fair of him to just say hey subsonic can not perform well enough for me. If he is going to say something like that he should back it up with actual proof and reasons.


Aaron Fischer says:
Tuesday, August 19, 2008

@Leo: I get the impression Chris is looking for an excuse to drop subsonic. And scalability is the first item to come to mind.

Every layer of abstraction is going to add time and memory. The key to scalability is analyzing your system for bottle necks and taking the appropriate corrective actions.


Colby says:
Tuesday, August 19, 2008

We are using SubSonic to browse a healthcare audit database running on SQL Server 2005. Currently it has around 800 million records in the main table with most records referencing additional tables. As long as you don't get too ambitious with your queries SS handles it like a champ. Nearly all of the performance issues we have run in to are DB or network related.

P.S. - You should be looking at the memory of TestLoader.exe and not devenv.exe ;)


spootwo says:
Tuesday, August 19, 2008

I am glad you wrote this article. I tried to sell subsonic to my last workplace. I had the things running smoothly using subsonic, until one day I ran into similar clueless remarks that derailed my entire project:

1) Look at the size of that dll (the generated code from the db) it's over 1MB!!!!!! (So is System.Data, but they somehow couldn't hear me over their shouting)

2) Oh...open source...[Insert lame excuse]

3) We care about speed of our database and this might run slower (the old approach to loading a collection of data made more hits to the db than my subsonic approach). It's ironic that 3 places I have worked cared about speed and used Datasets over datareaders.

4) We only use stored procedures : I don't even understand why this should stop development since Subsonic can use stored procs, but this excuse just kept coming up again and again.

In the end the conclusion was something like :'somehow subsonic might make things worse.' They decided to completely remove Subsonic despite 3 months of work.

Using Subsonic is fun and rewarding. Coding datasets and adapters is painful and repetitive.

I would like a nice (glossy to impress management) printout, maybe with a Microsoft logo, showing how Subsonic can beat the crap out of the classic ADO.NET approach.

Despite my grumblings thank you for showing me the power of .NET. Reading the Subsonic source code brought me to whole new level. Thanks Rob.


Todd says:
Tuesday, August 19, 2008

I don't think Chris has a problem with Subsonic, I think he has a problem with Rob.

Chris mentioned donations he made to several OS components used in dashCommerce except for 2. One of them was excluded because, as he says, "the project leader is a bit of a dweeeeeeeb." I'm assuming he's talking about Rob, given the fact that he deletes Rob's comments on his blog and seems to dislike Subsonic for no good reason (he's never explained it thoroughly).

Grow up Chris...


Rob Conery says:
Tuesday, August 19, 2008

@Todd - yah the "dweeeb" comment was hysterical, I saw that last night and it gave me a good laugh :). Been a while since I was called a dweeb :).

@Colby - You are very correct, for some reason I was thinking it was running in the dev environment and I should really know better than that! For what it's worth LoadTester.exe didn't change either - I should really change those diagrams... I might just do this...


Ben says:
Tuesday, August 19, 2008

You should remove the repeated calls to Console.WriteLine (or at least batch them up) for improved benchmark times. You'll be surprised how much overhead it adds.


Just3Ws says:
Tuesday, August 19, 2008

I created an audit system for a previous employer and used SubSonic for the entire data-access layer. The only time I ran into problems was when I tried to get too greedy and was pulling over a few million records (don't recall the exact metric) per read. SubSonic would cause a heavy hit on the memory and cpu, but that probably had more to do with pulling a few million records, transforming them and writing them into an audit database than there being a problem with SubSonic. ;-) I am a very happy user. Thanks for the great tool!


Joe Brinkman says:
Tuesday, August 19, 2008

Subsonic - just more drag & drop sugary sweetness from Microsoft. Real coders do every thing by hand, that way they can refactor their crappy code later using ReSharper.


Rob Conery says:
Tuesday, August 19, 2008

ROFL! The daily dose of Joe B...


Doug says:
Tuesday, August 19, 2008

As Dare Obasanjo said, this is usually a case of "Scalability: I Don't Think That Word Means What You Think It Does"

www.25hoursaday.com/.../ScalabilityIDon

It might be interesting to read some figures, but ultimately, if it works for so many people, scalability probably isn't really an issue. It also wasn't the problem being solved by Subsonic, but it is interesting to see people aren't citing instances where Subsonic is the problem when scaling.


Kevin says:
Tuesday, August 19, 2008

I am sorry you had to dignify this with a response. It is obvious this guy just has a bruised ego and is pouting now because you called him out. He raised this whole issue from a single forum post by a single guy (who probably just crapped in his own pool). rubbish


Jesse Foster says:
Wednesday, August 20, 2008

Rob,

Have you run the same kind of tests against the repository pattern that you are building into the mvc storefront? I really like the pattern, but I can't get speeds anywhere near what you posted here with subsonic.

Jesse Foster | jf26028


Mansour says:
Wednesday, August 20, 2008

Rob,

What about caching? I have never used SubSonic in a project but I have applied data caching on web applications and the performance gains are significant. What happens when you run your select queries twice? Does SubSonic cache automatically or are there constructs that need to be built in the code?

Thanks!

Mansour


Rob Conery says:
Wednesday, August 20, 2008

Hi Mansour - we don't have caching built in as I've always felt it is an app concern. I know NHib and other ORMs do - but we don't.


Firefly says:
Wednesday, August 20, 2008

Hehe I actually feel bad for that guy now :) Imagine all the responses that Rob is gathering. I must admit this is very amusing, especially the "dweeeb" part... It feels like HS all over again. :)


GeoffAtDatagaard says:
Wednesday, August 20, 2008

@Rob, I did some benchmarking of my own using your code.

I was more interested in Repository vs Active Record than trying to prove to myself that SS was doing the job.

My rates are a long way short of yours, but the meagre rig I have at home probably accounts for this :-(

(P4 2.4 500MB XPP), and it's running SQL Server Developer.

Here's what I found, and I think its impressive too.

350.87 Active Record Inserts per second

347.82 Repository Inserts per second

645.16 Active Record loads per second

689.65 Repository Loads per second

These rates were without console.writeln's

Rates reduce by an average of 11% with console.writeln's


GeoffAtDatagaard says:
Wednesday, August 20, 2008

Damn, I checked it twice before posting but ....

350.87 Active Record Inserts per second

347.82 Repository Inserts per second

689.65 Active Record loads per second

645.16 Repository Loads per second


Jesse Foster says:
Wednesday, August 20, 2008

Are you guys using the repository templates in subsonic?

I have given linq to sql a try, and even with opening the connection at the beginning, performance is nothing compared to what you guys are seeing.

Jesse


Andrew says:
Wednesday, August 20, 2008

We have been using SubSonic for over a year and couldn't be happier. It is used in all of our web applications and we haven't seen any problems with performance... Enjoy reading your blogs dude...


Doug Wilson says:
Wednesday, August 20, 2008

I found SubSonic through the SubTEXT project. Since then I've used it in several projects with no problems. One of them has a number of tables well over a million records each and I'm happy to report it's not a problem. Overall SubSonic has dramatically improved my development times and simplified the paging and sorting code. I only wish I had found it sooner as I still have a couple of projects that use hand built sprocs/datareaders DAL which is a PITA to maintain.

Considering that Chris has built his business on code Rob created, you'd think he'd be more grateful.


Yitzchok says:
Wednesday, August 20, 2008

Hi All,

Just cool down a bit please.

1. "The performance has nothing to do with SubSonic."

2. There is Caching built into dC (but is not active by default)

3. Chris's problem with SubSonic (from what I understand) is that you have to derive from a SubSonic base class (that means you have to add a reference to SubSonic's DLL to use the generated objects). It's not that there is no way around it, but that is the normal way it gets generated (it looks like he is having a problem expressing it).

4. There are some problems with the architecture of dC that we are fixing: we are adding IoC and server-side paging, which is probably where the main performance hit is.

What do you say Rob maybe it's time to add "Persistence Ignorance" as an option to SubSonic?


Firefly says:
Wednesday, August 20, 2008

Yitzchok, I think the issue here is not whether Chris is having a problem with SubSonic. But there seems to be speculation that he went out of his way to delete Rob's post.


Rob Conery says:
Wednesday, August 20, 2008

Hi Yitzchok - In terms of cooling down, I've stayed fairly mum on this whole thing for months now. Ever since his post (which got kicked pretty high by the LightSpeed guys) of yesterday I've been receiving a ton of emails and, to be honest, I've probably stayed quiet for a bit too long.

I honestly don't mind if Chris uses another ORM tool - it's up to him and I wish him the best (as I said in the forum post). But he did not need to say that SubSonic didn't scale, and that the project was dying. I also didn't ask him to delete my posts and comments on his blog. I don't have much sympathy for him right now.

In terms of 3 - yes this is an issue that's come up with folks and it's why we moved to the Repository a bit more with 2.1. I'm still unclear on why adding a reference to SubSonic is bad (it gets put in the bin of your executable anyway) but nonetheless - I see the point. DI can help with this, as Chris noted. This isn't a fault in SubSonic.

RE Persistence Ignorance - I think what you mean is that you want "POCOs" - plain old objects to work with (SubSonic has PI using the Provider Model). I'm doing this with the Storefront (using L2S) and I'd like to make this easier with SubSonic - yes.


Yitzchok says:
Wednesday, August 20, 2008

@Firefly

I don't think he deleted it. He probably just didn't activate it ;)

@Rob

>>Just cool down a bit please.

I was not talking to you Rob (You in a way have the right) just to all the people that got so...

I think Chris should write a blog post to clarify things.


John Kirk says:
Wednesday, August 20, 2008

It seems to me that Chris should quietly hand back CSK to Rob instead of bad mouthing it and Rob. It's not like Chris paid Rob for this project, right? Very bad form on Chris' part.


inXperience says:
Sunday, August 24, 2008

Hi Rob, first of all I would like to thank you for your work for the community.

One minor thing - I wonder if in your Test Number Two: Pulling Records Back Out the line:

Console.WriteLine("Hello from Order " + i.ToString());

should be something like

Console.WriteLine("Hello from Order " + o.Id.ToString());

so that you access the value from the object instead of just writing the integer i?

my best wishes


Kalyan says:
Monday, August 25, 2008

Rob, just ignore such people's comments and keep up the good work. I love SubSonic and so do a lot of people.

"SUBSONIC ROCKS"

Thanks for this great tool.


Khurram says:
Wednesday, August 27, 2008

Subsonic is a cool project. I have some suggestions for features which might interest you.

If someone has already achieved this, can you please point me to it?

• Provide soft delete functionality (or record versioning) in SubStage (a checkbox that says never delete the record) and store that in an audit table.

  o Provide a select to pull all the versions of a particular record.

• Provide an audit of rows and fields.

  o Row-level auditing can be achieved by the soft delete functionality.

  o Field-level auditing can be achieved by inserting an extra row in a FieldLevelAditTable table, which takes the old value and new value during the update operation.

Thanks, Khurram


Ozmosis says:
Thursday, August 28, 2008

Rob rocks.



  About Me



Hi! My name is Rob Conery and I work at Microsoft. I am the Creator of SubSonic and was the Chief Architect of the Commerce Starter Kit (a free, Open Source eCommerce platform for .NET)

I live in Kauai, HI with my family, and when my clients aren't looking, I sometimes write things on my blog (giving away secrets of incalculable value).