As with all projects, you hit roadblocks. And I've hit one head-on!
Linq Happens
I got a great comment on my blog the other day from David Jade (no link...):
It seems to me (from watching SQL Profiler) that LazyList doesn't really work. There seems to be two issues here:
First, from looking at stack traces it seems that somewhere in the depths of Linq to SQL, the LazyList members are being touched which causes the inner IQueryable to be executed while constructing the model objects.
Second, it seems that sometimes a query in a repository that generates a model object that contains LazyList members seems to get wildly complex and pre-load a lot of data which should be lazily loaded. I say sometimes since it seems to happen for the first LazyList model member but not subsequent LazyList model members (in fact if you rearrange the model members, it will switch which one gets pre-loaded). This is not necessary related to using SingleOrDefault() either as it happen when returning IQueryable as well.
I had seen this happening already, but figured that I would tune this out with some perf optimizations in the coming weeks. But my spidey senses started tingling...
Is Perf A Feature?
A lot has been written about Premature Optimization being the Root of All Evil but I think the point that is missed is that if you have something (like I do here) that isn't working at all the way you want, you may want to take a look at it :) in case you start smelling smoke, as I am right now. This is echoed by Robert O'Callahan:
There's a folklore quote "premature optimization is the root of all evil", attributed to Tony Hoare and Donald Knuth...
"The First Rule of Program Optimization: Don't do it. The Second Rule of Program Optimization (for experts only!): Don't do it yet.". Unfortunately --- and I'm not the first to note this --- this advice, taken out of context and followed slavishly, often leads people into deep trouble. The problem is that it collides with another well-known feature of software development: it gets more expensive to make changes to the system later in its development...
If you leave optimization to late in development, then profile it and find that fixing the performance bottleneck requires a major change in your design, or forces major module interface changes, then you have a serious problem
Completely agree with this. I've been there :).
In my case, I'm using a pattern that's not been tested before, and kind of running with scissors. To some people, 100 queries in one call may not be a big deal. To me, well it's chocolate-coated negligence.
I've elected to stop what I'm doing and dive into this mess. And what a mess it is! This is why you haven't seen another screencast, I'm neck deep in code right now, trying to figure out why I can't get my LazyList to work like it looks like it should!
Please Read This Before Providing Your Comment
I probably didn't put this high enough in the post - I know many are on their way down the page here writing an "I told you SO!" and while you might have a point - I'm not giving up yet. There has to be a way - and I don't want to give up yet as I think there's a lot of value in making this work.
This post is my attempt at being perfectly transparent with trying to solve a very aggravating problem (as you'll see). I'd love to hear some thoughts other than "give up" :).
Is This List Lazy Or Not!
In the SQLCatalogRepository, here is the code for grabbing Category and Products:
public IQueryable<Category> GetCategories() { var culturedName = from ct in db.CategoryCultureDetails where ct.Culture.LanguageCode == System. Globalization. CultureInfo. CurrentUICulture.TwoLetterISOLanguageName select new { ct.CategoryName, ct.CategoryID }; return from c in db.Categories join cn in culturedName on c.CategoryID equals cn.CategoryID select new Category { ID = c.CategoryID, Name = cn.CategoryName, ParentID = c.ParentID ?? 0, Products = new LazyList<Product>( from p in GetProducts() join cp in db.Categories_Products on p.ID equals cp.ProductID where cp.CategoryID == c.CategoryID select p ) }; }
You can see the call to LazyList<Product> above, where the IQueryable definition for Product is passed in to the constructor. This is supposed to stay "as a definition" until the List is enumerated - calling the query then.
Here's the code for GetProducts():
public IQueryable<Product> GetProducts() { var cultureDetail = from cd in db.ProductCultureDetails where cd.Culture.LanguageCode == System. Globalization. CultureInfo. CurrentUICulture. TwoLetterISOLanguageName select cd; var result = from p in db.Products join detail in cultureDetail on p.ProductID equals detail.ProductID select new Product { ID = p.ProductID, Name = p.ProductName, Description = detail.Description, ShortDescription = detail.ShortDescription, Price = detail.UnitPrice ?? p.BaseUnitPrice, Manufacturer = p.Manufacturer, ProductCode=p.ProductCode, //reviews Reviews=new LazyList<ProductReview>(from r in GetReviews() where r.ProductID==p.ProductID select r), //images Images=new LazyList<ProductImage>(from i in GetProductImages() where i.ProductID == p.ProductID select i), }; return result; }
You can see I'm doing the same here for Images and Reviews.
What Works, What Doesn't.
If I comment out the call to Reviews and Images (which are just very basic IQeryable definitions), LazyList works perfectly and all is good - one call to the DB:
Uncomment the constructors in GetProducts() for Images and Reviews, and BOOM:
100+ queries. Now it's clear to me that something is tripping IQueryable - but what? I know it's something inside the Product definition. Time to Dive Deeper.
Starting At The Beginning
If LazyList() works, then I should be able to do this without hitting the DB:
ICatalogRepository rep = new SqlCatalogRepository(); LazyList<Product> prods = new LazyList<Product>(rep.GetProducts());
... and indeed that's the case. My Profiler confirms this with a nice blank trace. So now I know LazyList works! I'm just using it wrong somewhere.
If I ask for the count, it should hit the DB:
ICatalogRepository rep = new SqlCatalogRepository(); LazyList<Product> prods = new LazyList<Product>(rep.GetProducts()); int productCount = prods.Count(); Assert.AreEqual(productCount, 51);
... and the Profiler confirms this. Unfortunately it hits it 100 times :). It doesn't do this for Category when it gets enumerated (without Images/Reviews).
Calling Count()
I've traced the problem to Count. Something is calling Count on the property which is tripping the enumerator. I've changed my test to this, so I can be sure that I'm not inadvertantly calling every product:
ICatalogRepository rep = new SqlCatalogRepository(); LazyList<Product> prods = new LazyList<Product>(rep.GetProducts()); Product p = prods[0];
This kicks the enumerator off for my Product list, which in turn populates the list with Products, which then (evidently) need to load every Image/Review. This is wrong.
If I look at the Call Stack for LazyList when this happens - check this out. Count is being called for the ProductReview IList internally, which is firing it's enumerator:
After Count.get() is called, the next step is to the CopyTo() method. If I'm understanding things here, the list items of the Inner list are being copied to the outer (derived) LazyList, which is causing the iteration craziness.
Why Does Category Work? Or Does It...
I found the answer, and it's weird. Add this to your "101 Strange Things About LINQ To Sql" notebook :). Let's take a look at the IQueryable definition for Category again:
select new Category { ID = c.CategoryID, Name = cn.CategoryName, ParentID = c.ParentID ?? 0, Products = new LazyList<Product>( from p in GetProducts() join cp in db.Categories_Products on p.ID equals cp.ProductID where cp.CategoryID == c.CategoryID select p ) };
Notice here that for LazyList<Product>, I'm using a join (because I have to) on a Many to Many relationship. Linq To Sql translates this as a set of nested INNER and LEFT joins (I won't hurt your eyes with the SQL). Just know that this whole thing is filled in ONE SQL CALL. Which is awesome for Linq To Sql, and kudos to the team for that.
The upshot here is that it only APPEARED to work (the Product List not getting filled until I called it)- Linq To SQL just outsmarted me :) and indeed filled the Product list - from one call. If I change that IQueryable join pattern for Products, it wouldn't work at all no doubt.
Perhaps I can do the same for Images/Reviews. If I rework the definition for Product/Images to include a join (which I don't need):
Images=new LazyList<ProductImage> (from i in GetProductImages() join p1 in db.Products on i.ProductID equals p1.ProductID where p1.ProductID == p.ProductID select i),
It works the same way! Only ONE CALL to the DB to load up the Images along with the Product :). Now that's some progress!
BOOM! Not So Fast
If I reset both Images and Reviews to use joins, it stops working again - I guess the join structure was a tad too much for it :). Oh well - it was sort of haackish anyway :).
Summary
The problem appears to be the Count property. More specifically it's the CopyTo. I don't know if there's a way around this as the Count property of the IList property needs to have a number to it.
For the Linq To Sql guys out there - Linq To Sql objects don't have a way around this either. Even if you specify LoadOptions for the Context, they will still be called ONE AT A TIME.
I have ways out of this, to be sure. But I really want LazyList to work and I've asked Ayende to check out what I'm doing. Hopefully he will give me his usual "Oh this is really simple stuff, it surprises me that you don't know this...".
I would really love some ideas here; hit me!
Technorati Tags: aspnetmvc
