This is the fourth entry in a series showing how C# 3.0 query syntax is translated into the underlying method calls for Linq-to-Objects, and how you can write your own versions of the Linq extension methods.
I know I said last time that this post would be about ordering, but now seemed an opportune point to look at the let contextual keyword, which allows you to create range variables within the scope of a Linq query. This is particularly useful if you want to convert a range variable to another form and then operate on it, or to simplify Boolean predicates for a where clause by extracting them. For example, if we wanted to find all customers whose name begins with one of the letters A-M, we could write:
    var items =
        from customer in customers
        let firstLetter = char.ToUpper(customer.Name[0])
        where firstLetter >= 'A' && firstLetter <= 'M'
        select customer;
Because the casing of the letter has to be taken into account, it must be transformed before it is compared. Doing this in the where clause would mean performing the transformation twice, so using let produces more concise code that should also be more efficient. Unlike the other clauses, let has no extension method of its own; instead it reuses the Select extension method to construct an anonymous tuple (an instance of an anonymous type) containing the two range variables:
    var items = customers
        .Select(
            customer => new { customer = customer, firstLetter = char.ToUpper(customer.Name[0]) })
        .Where(
            tuple => tuple.firstLetter >= 'A' && tuple.firstLetter <= 'M')
        .Select(
            tuple => tuple.customer);
The following illustration shows the construction of the anonymous tuple in purple, and then how it flows through and has the values extracted in green.
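To see the "transformation twice" point in practice, here is a minimal sketch that counts how often the transformation runs with and without let. The Customer class, the sample names, and the counting Upper helper are all assumptions made for illustration; none of them come from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal Customer type; the post does not show its definition.
public class Customer
{
    public string Name { get; set; }
}

public static class LetDemo
{
    // Counts how many times the transformation runs (illustration only).
    public static int UpperCalls;

    // Hypothetical wrapper around char.ToUpper that records each call.
    public static char Upper(char c)
    {
        UpperCalls++;
        return char.ToUpper(c);
    }

    public static List<Customer> SampleCustomers()
    {
        return new List<Customer>
        {
            new Customer { Name = "alice" },
            new Customer { Name = "Bob" },
            new Customer { Name = "zoe" }
        };
    }

    // With let: the transformation runs once per customer.
    public static List<string> WithLet(IEnumerable<Customer> customers)
    {
        var items =
            from customer in customers
            let firstLetter = Upper(customer.Name[0])
            where firstLetter >= 'A' && firstLetter <= 'M'
            select customer;
        return items.Select(c => c.Name).ToList();
    }

    // Without let: the transformation is written (and may run) twice.
    public static List<string> WithoutLet(IEnumerable<Customer> customers)
    {
        var items =
            from customer in customers
            where Upper(customer.Name[0]) >= 'A' && Upper(customer.Name[0]) <= 'M'
            select customer;
        return items.Select(c => c.Name).ToList();
    }

    public static void Main()
    {
        UpperCalls = 0;
        var withLet = WithLet(SampleCustomers());
        Console.WriteLine(string.Join(", ", withLet.ToArray()) + " / calls: " + UpperCalls);

        UpperCalls = 0;
        var withoutLet = WithoutLet(SampleCustomers());
        Console.WriteLine(string.Join(", ", withoutLet.ToArray()) + " / calls: " + UpperCalls);
    }
}
```

With these three sample names both queries select alice and Bob, but the version with let calls the helper three times and the version without it calls the helper six times. The second call is skipped only when the first comparison fails and && short-circuits, which never happens for names that start with a letter.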
Now if we have multiple let clauses in a row, you might imagine that the anonymous tuple would simply contain all of the range variables directly. In reality it doesn't work like this: each let clause wraps the previous tuple, together with the new variable, in a new tuple pair. For example, if we wanted to limit the results to customers who have placed orders and wrote it as follows:
    var items =
        from customer in customers
        let firstLetter = char.ToUpper(customer.Name[0])
        let hasOrders = customer.Orders.Count > 0
        where firstLetter >= 'A' && firstLetter <= 'M' && hasOrders
        select customer;
The translation by the compiler is:
    var items = customers
        .Select(
            customer => new { customer = customer, firstLetter = char.ToUpper(customer.Name[0]) })
        .Select(
            tuple1 => new { tuple1, hasOrders = tuple1.customer.Orders.Count > 0 })
        .Where(
            tuple2 => tuple2.tuple1.firstLetter >= 'A' &&
                      tuple2.tuple1.firstLetter <= 'M' &&
                      tuple2.hasOrders)
        .Select(
            tuple2 => tuple2.tuple1.customer);
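One way to observe this chain of calls, in the spirit of writing your own versions of the extension methods, is to supply Select and Where methods that record each invocation. The following is a rough sketch under assumptions: the TracingExtensions class, the Calls list, and the Customer type with a List<string> of orders are hypothetical names invented here. System.Linq is deliberately not imported, so the query syntax binds to these methods instead.

```csharp
using System;
using System.Collections.Generic;
// System.Linq is intentionally NOT imported: the query below binds to the
// hypothetical Select/Where methods defined in TracingExtensions.

// Hypothetical Customer type; the post does not show its definition.
public class Customer
{
    public string Name { get; set; }
    public List<string> Orders { get; set; }
}

public static class TracingExtensions
{
    // Records the name of each query method as the call chain is built.
    public static readonly List<string> Calls = new List<string>();

    public static IEnumerable<TResult> Select<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        Calls.Add("Select"); // logged eagerly, before any iteration happens
        return SelectIterator(source, selector);
    }

    private static IEnumerable<TResult> SelectIterator<TSource, TResult>(
        IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        foreach (var item in source)
            yield return selector(item);
    }

    public static IEnumerable<T> Where<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        Calls.Add("Where");
        return WhereIterator(source, predicate);
    }

    private static IEnumerable<T> WhereIterator<T>(
        IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (var item in source)
            if (predicate(item))
                yield return item;
    }
}

public static class TraceDemo
{
    public static List<Customer> Run()
    {
        TracingExtensions.Calls.Clear();
        var customers = new List<Customer>
        {
            new Customer { Name = "alice", Orders = new List<string> { "order-1" } },
            new Customer { Name = "bob", Orders = new List<string>() },
            new Customer { Name = "zoe", Orders = new List<string> { "order-2" } }
        };

        var items =
            from customer in customers
            let firstLetter = char.ToUpper(customer.Name[0])
            let hasOrders = customer.Orders.Count > 0
            where firstLetter >= 'A' && firstLetter <= 'M' && hasOrders
            select customer;

        return new List<Customer>(items);
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(string.Join(" -> ", TracingExtensions.Calls.ToArray()));
        foreach (var customer in result)
            Console.WriteLine(customer.Name);
    }
}
```

The recorded sequence should be Select, Select, Where, Select: one Select per let clause, then the where clause, then the final projection back to the customer. With this sample data only alice passes both conditions.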
As with some of the other statements we've seen, this translation is sub-optimal with regard to efficiency. It would be possible to use a single Select method to construct a single tuple, which would mean fewer method invocations, object allocations, and property accesses, for example:
    var items = customers
        .Select(
            customer => new {
                customer = customer,
                firstLetter = char.ToUpper(customer.Name[0]),
                hasOrders = customer.Orders.Count > 0 })
        .Where(
            tuple => tuple.firstLetter >= 'A' &&
                     tuple.firstLetter <= 'M' &&
                     tuple.hasOrders)
        .Select(
            tuple => tuple.customer);
Because of this inefficiency, it seems prudent to recommend against using the let keyword in cases where you do not need to transform a variable before operating on it. In the example above I'd still use it to convert the first character to upper-case, but would probably write the orders condition inline. This could be considered a micro-optimisation, but here I don't feel the let clause really adds clarity to the orders check, and if there are two equally clear options then you may as well default to the more efficient one.
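As a sketch of that recommendation, the query below keeps let for the upper-casing and moves the orders check into the where clause, then compares its results with the two-let version. The Customer type, the sample data, and the method names are assumptions for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Customer type; the post does not show its definition.
public class Customer
{
    public string Name { get; set; }
    public List<string> Orders { get; set; }
}

public static class RecommendedFormDemo
{
    public static List<Customer> SampleCustomers()
    {
        return new List<Customer>
        {
            new Customer { Name = "alice", Orders = new List<string> { "order-1" } },
            new Customer { Name = "bob", Orders = new List<string>() },
            new Customer { Name = "zoe", Orders = new List<string> { "order-2" } }
        };
    }

    // Two let clauses: compiles to two Select calls building nested tuples.
    public static IEnumerable<Customer> TwoLets(IEnumerable<Customer> customers)
    {
        return
            from customer in customers
            let firstLetter = char.ToUpper(customer.Name[0])
            let hasOrders = customer.Orders.Count > 0
            where firstLetter >= 'A' && firstLetter <= 'M' && hasOrders
            select customer;
    }

    // One let clause: the orders condition is written inline in the where clause.
    public static IEnumerable<Customer> OneLet(IEnumerable<Customer> customers)
    {
        return
            from customer in customers
            let firstLetter = char.ToUpper(customer.Name[0])
            where firstLetter >= 'A' && firstLetter <= 'M' && customer.Orders.Count > 0
            select customer;
    }

    public static void Main()
    {
        var customers = SampleCustomers();
        // Both forms should select the same customers in the same order.
        Console.WriteLine(TwoLets(customers).SequenceEqual(OneLet(customers)));
    }
}
```

Both queries return the same customers; the second simply avoids one intermediate Select and one anonymous object per element.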
Next time we really will be looking at ordering results.
Posted Apr 21 2008, 08:20 PM by Greg Beech