Home SubSonic MVC-Storefront

LINQ: Understanding LINQ, Vars, and Lambdas

I’ve been working lately with LINQ and LinqToSql in an effort to see how SubSonic can “play nice” with the emerging framework. In my previous posts I discussed the DataContext in LinqToSql and what it means from an architectural point of view. I also discussed ideas for abstracting the framework a bit.

In this post I want to go a little deeper with LINQ as a language, and see what’s under the hood of the technology, and get to know the players a bit.

Dramatis Personae
There are 3 major elements to understanding a LINQ query, and it pays to take a second to get to the know them VERY well, since understanding them will allow you to work with the language much easier (because LINQ is very much a language in and of itself). These guys are:

  1. The new var keyword
  2. System.Data.Linq.Table<> and
  3. The Lambda Expression

Consider this LINQ query as an example:

            Northwind.NorthwindContext db = new NorthwindContext(myConnectionString);
            var result =
                  from products in db.Products
                  select products;

I already dived into the DataContext – this is the thing that works up a connection to your DB, and then keeps track of all the changes you make during your context “session”. Let’s take a look at the other stuff though…

Fashionably Late Typing With VAR
This is a new construct for C# and you can think of this a “Late-typed”. A lot of people (self included) thought of this as “variant” – sort of a “bucket type” – but that’s not accurate. All that var does is let you defer typing until you know what it is you’re working with. It’s not the same as the Object – it’s meant to be worked with as a typed value. Consider this:

    var myDateVar=DateTime.Now;

    object myDateOb=DateTime.Now;

These are both valid assignments, but if you try to work with myDateOb as an object…

    myDate=myDateOb.AddDays(4);

… you’ll get a compiler error because it’s an object, not a DateTime object.

If you do the same with myDateVar:

    myDate=myDateVar.AddDays(4);

Everything works out fine.

Why do this? Because of LINQ! There’s a gap when working with a query language like LINQ – what do I do with my results? Using something like a Reader is nice – but that doesn’t allow you to use LINQ on non-DB queries (like querying over an array of strings let’s say). DataTables are nice too but they don’t have any kind of type-referencing either.

By using a var, you can return a result set and work directly with it as a newly-typed object. So if we rewrite the query above, we can iterate over the result set using the same var keyword:

var_query

There are all kinds of great applications of late-typing that allow you to work in a “duck-typing” way with a degree of freedom from compiler pain. Duck-typing (working with compiler-inferred types) is more a style of programming than it is a technology. I won’t go into that here, but there should be some recognition that introducing this “loosey goosey” way of programming into a pretty strict language set (C#) can cause some headaches. This, more than anything, is a good reason to step out and learn a new language on the weekend (like Ruby!) that does this kind of thing so you can see for yourself the goods and bads from a learned perspective (which is always important).

System.Data.Linq.Table<>
When working with a LINQ statement, you’ll be running queries against a representative “table” that derives from the type System.Data.Linq.Table<>. The type that fits itself in that Generic Bikini <> is defined by your from clause (and you can have more than one). So in our example query, this would be DataContext.Products. Since we used LinqToSql to query the database, LINQ knows that when we say “from db.Products” we’re talking about a System.Data.Linq.Table<Product>. In fact

the var result set of a LINQ query is a System.Data.Linq.Table<whatever your select statement is>. This is KEY to understanding the way LINQ works.

There are some key methods in this class, and they are:

  • Single() – returns a single record according to the passed-in Lambda expression (more below)
  • Where() – returns a subset of records according to a Where()
  • Count() – returns the count of the rows in the System.Data.Linq.Table<>
  • ToList<>() – sends the results to a typed list
  • ToArray<>() – sends the results to a typed array

What’s even more important with this class is that it’s full of query methods that allow you to do things like

  • Aggregate()
  • Sum()
  • GroupBy()
  • Any()
  • Average()
  • Contains()

and so on. This might seem confusing – why in the world would you want to run a query then Aggregate (or GroupBy etc) on the result set? The answer is that System.Data.Linq.Table<> not only holds the result, it also constructs a LINQ query! So you can use these methods easily with LINQ:

        Northwind.NorthwindContext db = new NorthwindContext();
        decimal? result =db.Products.Sum(p => p.UnitPrice);

Notice here that I didn’t need to use a var (though I could have if I wanted). Since db.Products is a System.Data.Linq.Table<Product> – I can just query it and it goes and gets the data. Many of these methods return IEnumerable or IQueryable objects (which you can think of as System.Data.Linq.Table<> constructs), which you can append onto your query as needed. So if you wanted to use Where() instead of the build in where clause, you could:

        Northwind.NorthwindContext db = new NorthwindContext();
        var result =
              from products in db.Products.Where(p => p.CategoryID==5)
              select products;

I’ll go more into these aggregate and advanced queries down below. First we need to figure out just what this Lambda thing is we’re working with.

Method Shorthand With Lambda
Let’s jump right to it – Lambda expressions are methods in and of themselves, and the whole idea to them is that they don’t need to be typed – they infer their types. A lot of people call them Anonymous Methods which is partially true – they’re actually anonymous method degate shorthand (huh?).

The syntax for a Lambda expression is: parameter(s) => function.

No types are needed – they are inferred from the parameters (Left Side) of the Lambda by their use in the Right Side

To fully get this, let’s write a simple method to return a square of a number, and then translate that to a Lambda:

        decimal GetSquare(decimal number){
            decimal result=number*number;
            return result;
        }

That’s not that much code is it? It could be written nicely in Lambda form however like this:

x => x * x;

In this Lambda we’re saying “take this parameter, x, and shove it through this function x * x”. This statement is key to understanding why Lambdas are a big deal to LINQ:

Lambdas can be parsed into a Left/Right expression (Binary) for use in LINQ Expression Trees

Understanding this concept, you can see just how powerful the LINQ query language can be by embedding methods and expressions withing methods and expressions.

Another note on Lambdas – you might see the term “Func” all over the intellisence prompter when running up queries – this is an “Anonymous Delegate” declaration – basically a definition holder for a Lambda expression. So to fully write out the Lambda expression above in 3.5 code, we’d use:

    Func<decimal, decimal> = x => x*x;

Then to use it, you’d write:

    decimal root=f1(4);

Now we’re getting to the meat of it all. When you’re working with a LINQ query and you specify a “Where” (for example), you’re asked to pass in a “Func<Type,bool>” – this is an Anonymous Delegate that gets defined at runtime by you. In other words, you get to specify to a pretty detailed level what you want to see. So consider this:

    Northwind.NorthwindContext db = new NorthwindContext();
    List<Northwind.Data.Product> result=db.Products    .Where(p => p.ProductID >20 && p.CategoryID ==5 && p.Deleted ==false).ToList<Product>();

In this expression we’re passing in the parameter Product, and then a boolean expression – which is precisely the “Func” signature you see in intellisense: “Func<Product, boolean>”. You can see how this type of “anonymous expression generation” can come in very handy!

This expression can be shorthanded by using the where clause:

    var result =
    from products in db.Products
    where products.ProductID > 20 && products.CategoryID == 5 && products.Deleted == false
    select products;

In fact you can now see that LINQ is essentially two things: a massive expression parser and a set of Extension Methods. The where clause above is simply an extension method for creating the same Lambda statement above it.

Bugs In The Butter
In working up all the examples for this post I decided I was going to do some cool rollups and joins. Turns out that it *seems* to be broken. I’ve communicated what I found to the MS team and they’re looking into it.

To illustrate, let’s take a look at a simple join statement on a rollup query:

    var result =
          from products in db.Products
          join orderItems in db.Order_Details on products.ProductID equals orderItems.ProductID into joined
          select new {products.ProductName, OrderTotal=joined.Sum(t => t.UnitPrice*t.Quantity)};

In this query, I’m joining the Order Details table (shows renamed properly with an underscore :p ) into a simple table called “joined”. Simple enough so far. Next I’m returning a result set with the name of the product and a total for all orders on it. Notice how Sum() takes a Lambda? Remembering from discussion above you can dissect this Lambda by now (I hope) and see that it can be transalated to “Func(Type, decimal)”.

Our left side (the parameter) is the type Product, and the right side is our decimal: UnitPrice * Quantity. The important thing here is the right side: it’s type “decimal”, not nullable type “decimal?”. That should be OK right?

Well it’s not unfortunately. Running up the debugger you can see the SQL that’s created for this query:

SELECT [t0].[ProductName], (
    SELECT SUM([t1].[UnitPrice] * (CONVERT(Decimal(29,4),[t1].[Quantity])))
    FROM [dbo].[Order Details] AS [t1]
    WHERE [t0].[ProductID] = [t1].[ProductID]
    ) AS [value]
FROM [dbo].[Products] AS [t0]

A nested SELECT? Dude where’s my JOIN? Before I get my crank on – note the intelligent CONVERT. Since Quantity is an int, the multiplication with UnitPrice in SQL Server would have resulted in an int result – not what we want at all. LINQ is smart enough to convert this for us so we don’t need to stress on it :) .

But it’s this conversion that’s also causing us a problem – but in the code this time. Since LINQ is using a nested SELECT here without a JOIN, it’s effectively doing a LEFT OUTER – returning ALL products and then rolling up stats. Shouldn’t “join” be an INNER by default since that’s what it is in SQL Server? What if I have some products that don’t have orders?

The answer: Runtime Crash.

To see this more clearly, add a record to your Products table and then run the above LINQ query – the error you get is

“The null value cannot be assigned to a member with type System.Decimal which is a non-nullable value type.”

Hmmm – two things are wrong with this. Why isn’t it a nullable type? There could be situations where I’m Summing and I have null data (like right now)!

Also – why is LINQ doing a nested SELECT? It might not be a bug – but it’s something to be aware of.

Sum()ming It All Up
Hopefully understanding the major players with LINQ will help you to see how you can use LINQ more effectively, and also when NOT to use LINQ. In the tests I’ve done,  I’ve found that writing up more advanced summing/statistical queries is much easier in SQL though I’m sure that there might be some exceptions.

Also, the error I received reminds me that LINQ, above all, is a translation tool and at some point my mind isn’t going to work like those in Redmond (for better or for worse, til death do us part). Sometimes it’s better to just use SQL (view or SP), which is something I say a lot with SubSonic :) .

Would love to hear your thoughts, and if you’re an MS/LINQ person, I’d love to hear your take on the exception above.

Share and Enjoy: These icons link to social bookmarking sites where readers can share and discover new web pages.
  • DotNetKicks
  • del.icio.us
  • Technorati
  • TwitThis
  • Reddit
  • Slashdot

Tags: