One thing I've learned is I can't stay quiet for too long on this stuff :), so here's an update on what I'm working on...
To Understand Recursion, You Must First Understand Recursion
I have to thank Eric for that one - I had heard it in school a few times and forgot it. Eric reminded me of it recently when I was trying to leverage SubSonic's new Query tool into a Visitor Pattern (and grumbling non-stop).
Rolling your own IQueryable isn't necessarily difficult in and of itself. The core mechanics of it are fairly straightforward, and understanding the Expression Tree that gets generated is also pretty straightforward.
The trouble comes in when trying to reuse code and a system you've already created. The particular culprits are the BooleanExpressions that get created when you use syntax like this:
where c.Country == "UK"
This creates a BooleanExpression (a BinaryExpression, in System.Linq.Expressions terms), consisting of exactly two sub-expressions (Left and Right) and an operator that compares them. In the above example, Left is "c.Country" (a MemberAccess expression) and Right is "UK" (a ConstantExpression). These expressions are compared with an operator, which in this case is "Equal".
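If you want to poke at those pieces yourself, here's a minimal sketch using only the standard System.Linq.Expressions types. The Customer class and the WhereInspector helper are hypothetical stand-ins I made up for illustration; they are not SubSonic types.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for a mapped table class; not part of SubSonic.
public class Customer
{
    public string Country { get; set; }
}

public static class WhereInspector
{
    // Describes the top-level comparison of a predicate as "member operator value".
    // Assumes the predicate has the shape: member == constant.
    public static string Describe(Expression<Func<Customer, bool>> predicate)
    {
        // The body of c => c.Country == "UK" is a BinaryExpression.
        var binary = (BinaryExpression)predicate.Body;

        // Left is the member access (c.Country), Right is the constant ("UK").
        var member = (MemberExpression)binary.Left;
        var constant = (ConstantExpression)binary.Right;

        // NodeType is an ExpressionType value, ExpressionType.Equal for "==".
        return member.Member.Name + " " + binary.NodeType + " " + constant.Value;
    }
}
```

Calling WhereInspector.Describe(c => c.Country == "UK") hands the lambda over as an expression tree rather than compiled code, which is what makes this kind of inspection possible.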
Well, that's not so bad really. What gets fun is if you have something like this (ignore for now that the query is meaningless):
var x = from p in query where (p.ProductID > 50 || p.ProductID < 10) || (p.CategoryID < 2 && p.CategoryID > 5) select p;
This evaluates, once again, to a single BooleanExpression. In this case, however, the Left and Right are themselves BooleanExpressions. And if I wanted to, I could nest the crap out of that query and have BooleanExpressions nested as deep as I like...
To effectively get at the bits we need to read, you have to run up a Visitor to scale the limbs of the Expression tree and build out the WHERE statement properly. The examples I've seen of this use a StringBuilder and append the bits on as the BooleanExpression is scaled - a nice translation pattern at work.
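Here's a rough sketch of that StringBuilder approach, so you can see where the recursion comes in. It is not Matt's code and not what SubSonic does; the Product class and WhereTranslator are hypothetical names, it assumes C# 7 or later for the pattern-matching switch, and it only understands literal constants and simple property access.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

// Hypothetical stand-in for a mapped table class; not part of SubSonic.
public class Product
{
    public int ProductID { get; set; }
    public int CategoryID { get; set; }
}

public static class WhereTranslator
{
    public static string Translate(Expression<Func<Product, bool>> predicate)
    {
        var sql = new StringBuilder("WHERE ");
        Visit(predicate.Body, sql);
        return sql.ToString();
    }

    // Walks the tree depth-first; each binary node visits its own Left and Right.
    private static void Visit(Expression node, StringBuilder sql)
    {
        switch (node)
        {
            case BinaryExpression binary:
                sql.Append("(");
                Visit(binary.Left, sql);   // recursion happens here...
                sql.Append(" ").Append(ToSqlOperator(binary.NodeType)).Append(" ");
                Visit(binary.Right, sql);  // ...and here
                sql.Append(")");
                break;
            case MemberExpression member:
                sql.Append(member.Member.Name);
                break;
            case ConstantExpression constant:
                // Inlined for illustration only; real code should emit a parameter.
                sql.Append(constant.Value);
                break;
            default:
                throw new NotSupportedException("Unhandled node: " + node.NodeType);
        }
    }

    private static string ToSqlOperator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            default:
                throw new NotSupportedException("Unhandled operator: " + type);
        }
    }
}
```

Every binary node gets its own pair of parentheses, so the nesting of the tree carries straight over into the nesting of the WHERE clause, however deep the query goes.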
This is a problem with SubSonic - our Constraints don't work this way.
Constraints are objects, added to a collection as you construct your query, which we reconstruct when you run your query. Currently we have absolutely no way to run the above query with SubSonic (until now - read on to know more).
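To make the mismatch concrete, here's a deliberately simplified model of "constraints in a collection". This is NOT SubSonic's actual Constraint class; SimpleConstraint and FlatQuery are names I invented for the sketch, and the real thing carries a lot more information.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical model of a flat constraint; NOT SubSonic's real Constraint type.
public class SimpleConstraint
{
    public string Connector;   // "WHERE", "AND" or "OR"
    public string Column;
    public string Comparison;  // e.g. ">", "<", "="
    public object Value;
}

public static class FlatQuery
{
    // Renders the constraints one after another, in the order they were added.
    public static string Render(IEnumerable<SimpleConstraint> constraints)
    {
        var sql = new StringBuilder();
        foreach (var c in constraints)
        {
            if (sql.Length > 0) sql.Append(" ");
            sql.Append(c.Connector).Append(" ")
               .Append(c.Column).Append(" ")
               .Append(c.Comparison).Append(" ")
               .Append(c.Value);
        }
        return sql.ToString();
    }
}
```

A flat list like this has nowhere to record which constraints belong together inside a set of parentheses, whereas a tree of nested expressions gets that grouping for free. That's the gap that has to be bridged.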
The good thing is I was able to bend the SQL parser we use into parsing BooleanExpressions, using Matt Warren's Visitor bits. If I didn't have Matt's blog, you wouldn't have Linq for SubSonic. I'm just not that smart!
In the end it wasn't so difficult and I'm pretty happy with how it turned out. The thing I'm a little worried about, however, is how this will translate to other providers. Most providers use ANSI standards for WHERE statements - but there's always an exception.
Implementing a BooleanExpression visitor for each provider could prove to be... well, challenging, I'm sure. We'll see.
Deciphering That Where Statement
Quick, tell me what this LINQ query should return:
var query =
    from c in db.Customers
    join o in db.Orders on c.CustomerID equals o.CustomerID
    let m = c.Phone
    orderby c.City
    where c.Country == "UK"
    where m != "555-5555"
    select new { c.City, c.ContactName } into x
    where x.City == "London"
    select x;
Matt Warren uses this query to test out his IQueryProvider bits, and it's a reasonably good benchmark to work from. The interesting challenge for me, building out the framework which will decipher this query, is that I need to somehow understand what it is you mean. And that's not exactly easy.
What I've been working with, traditionally, is the notion that if I approximate SQL with the SubSonic API, then you and I can easily "come to an understanding" on what SubSonic should return to you in the form of a result set.
It's a major challenge, and LINQ deviates a bit from standard SQL and sort of bleeds into C# (or VB) programmatic constructs that can cause confusion.
It may seem like I'm complaining - I'm not really :). What I'm looking at here is the support that's going to go into this effort, which (in short) will entail helping developers:
- Understand LINQ in general and
- Understand why the query they wrote is returning what it's returning.
I'll Probably Get It Wrong
Let's just get that out of the way :). If you search Google on Linq To Sql Weirdness (odd that one of my posts is in the top 5) you'll quickly realize that it's not exactly straightforward what you'll get back when you try to run a query that's not "SELECT * FROM...".
These are the queries that scare me.
Why I'm Doing It
Linq doesn't scare me. It's a great tool and I'm certain that I'll screw up a few things along the way. I've cribbed most of the code that Matt Warren popped on his blog (link above) and I feel really good about how I'm constructing the WHERE bits - but I know I'm going to get something wrong here.
All the same, Linq is worth it. It allows you to place your query logic in your application and removes DB dependencies nicely. We have Linq To Sql, Linq To Objects, Linq To NHibernate, and soon enough Linq To SubSonic.
If all of us ORM Provider guys can share a query language, well, I think that's a good thing, don't you? No more learning different syntax - it's all the same.
No, our "new" query tool isn't going anywhere. You'll have a choice (as always) to use what you like. If you find yourself hating LINQ, that's fine too. I happen to see a nice place in the universe for it - specifically for having a common query language across DBs, objects, XML, etc. that we can all understand someday.
Where I'm At
Right now I have lots of green lights, and I can handle some really, really complex WHERE statements. Nested ANDs/ORs and Expressions are going through just fine.
I need to wire up New(), which is the method that's called when you do this:
var x = from p in query where p.ProductID > 50 || p.ProductID < 10 select new { thinger=p.ProductID, thingerName=p.ProductName };
It's not very difficult to do this, but it involves (as you can imagine) some reflection. I don't mind using it - contrary to what most people think, reflection isn't resource-consuming as a rule. SOME of it is, but things like Activator.CreateInstance are actually quite snappy. So is traversing PropertyInfo[].
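For the curious, the kind of reflection I'm talking about looks roughly like the sketch below. Hydrator and Category are hypothetical names for illustration, and the dictionary simply stands in for a row coming back from a data reader. The only framework calls used are Activator.CreateInstance, Type.GetProperties and PropertyInfo.SetValue.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for a mapped table class; not part of SubSonic.
public class Category
{
    public int CategoryID { get; set; }
    public string CategoryName { get; set; }
}

public static class Hydrator
{
    // Creates an instance of 'type' and copies matching values onto its writable properties.
    public static object Build(Type type, IDictionary<string, object> values)
    {
        // Requires a public parameterless constructor on 'type'.
        object item = Activator.CreateInstance(type);

        foreach (PropertyInfo property in type.GetProperties())
        {
            object value;
            if (property.CanWrite && values.TryGetValue(property.Name, out value))
            {
                property.SetValue(item, value, null);
            }
        }
        return item;
    }
}
```

Note that this path only works when the type has a parameterless constructor to call.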
That said - I don't want to do it if I can get away from it. Normally you can lean on generics to help with this:
public T GetMyObject<T>() where T : new() { T item = new T(); ... }
The issue before me here is that IQueryable<T> won't let me add the "where T : new()" constraint (which makes sure that T has a parameterless constructor).
This limitation exists because of anonymous types (select new { ... }): they have no parameterless constructor, so T can't be required to have one - and I can't code my way around it.
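Here's a small sketch of what that means in practice. It's an illustration under my own assumptions, not the SubSonic implementation: Projector and ProductRow are hypothetical names, and the sketch assumes the select clause is a plain "new { ... }" whose arguments are simple property reads on the source row.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical stand-in for a mapped table class; not part of SubSonic.
public class ProductRow
{
    public int ProductID { get; set; }
    public string ProductName { get; set; }
}

public static class Projector
{
    // True when T has a public parameterless constructor.
    // The 'sample' argument exists only so the compiler can infer an anonymous T.
    public static bool HasDefaultConstructor<T>(T sample)
    {
        return typeof(T).GetConstructor(Type.EmptyTypes) != null;
    }

    // Builds the projected object by invoking the constructor captured in the
    // NewExpression, feeding it property values read from 'row' via reflection.
    public static TResult Project<TSource, TResult>(
        TSource row, Expression<Func<TSource, TResult>> selector)
    {
        // For "select new { ... }" the lambda body is a NewExpression.
        var newExpression = (NewExpression)selector.Body;

        // Assumes every constructor argument is a simple property access on the row.
        object[] arguments = newExpression.Arguments
            .Select(a => ((PropertyInfo)((MemberExpression)a).Member).GetValue(row, null))
            .ToArray();

        return (TResult)newExpression.Constructor.Invoke(arguments);
    }
}
```

With a ProductRow in hand, Projector.Project(row, p => new { thinger = p.ProductID, thingerName = p.ProductName }) builds the anonymous object through the constructor the compiler generated, which takes one argument per property. HasDefaultConstructor reports false for that result, which is why "new T()" is off the table.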
Aside from that, here's what I have left to do:
- Aggregates (COUNT, SUM, AVG, etc.). This should be pretty straightforward
- INs
- Nested Aggregates
- Nested WHEREs and SELECTS
- Edge cases
- Perf tests
I know that at some point I'll need to cut the support for crazy queries. Sanity has to take over at some point, and when I see stuff like this:
var customerOrderGroups =
    from c in customers
    select
        new {c.CompanyName,
             YearGroups =
                 from o in c.Orders
                 group o by o.OrderDate.Year into yg
                 select
                     new {Year = yg.Key,
                          MonthGroups =
                              from o in yg
                              group o by o.OrderDate.Month into mg
                              select new { Month = mg.Key, Orders = mg }
                         }
            };
I know that I'll see it in the forums and my immediate response will be "next time use Mandarin. It's more readable". I suppose I can't say that if I'm trying to be helpful - but you get the idea :).
Who knows, maybe I'll get lucky and what I'm building here will be able to fire this no problem. But it sort of goes back to my original point:
What's this query supposed to do? How do YOU know it will? How will I know? Who's right?
Schedule
I'm working on SubSonic feverishly right now, and I'm really hoping to get something out to people who want to help me tune it and build it. I honestly have no idea when it will be ready - new things step in my way every day with respect to implementing IQueryable.
I'm going on vacation for 5 days tomorrow; if I get lucky then maybe in 2-3 weeks I'll have something ready for people to play with.
Source
I haven't branched 3.0 yet - there's a lot of stuff to do with it still and it's on a private source server at the moment. When I get close I'll share the source out - I promise.
