PLINQ

Sun, January 25, 2009, 06:59 PM under ParallelComputing | LINQ
With VS2008 (more specifically .NET Framework 3.5) a wonderful thing was introduced: LINQ. Given the declarative nature of Language Integrated Query it was a prime candidate for trying to inject automatic parallelization in it (i.e. to run faster by seamlessly taking advantage of multiple cores). The result of those efforts is what I mentioned 18 months ago (Parallel LINQ) and followed with a screencast 4 months later: see the 2nd link in the list here. In this post I'll do a written overview based on the latest bits. Before we continue, you should understand that PLINQ applies only to LINQ to Objects (i.e. IEnumerable-based sources where lambdas are bound to delegates, not IQueryable-based sources where the lambdas are bound to expressions). It also does not interfere with the deferred execution principles of LINQ, of course.

PLINQ as a black box
PLINQ is really simple if you want to treat it as a black box; all you do as a user is add the .AsParallel extension method to the source of you LINQ query and you are done! The following query
var result =
from x in source
where [some condition]
select [something]
...can be parallelized as follows:
var result =
from x in source.AsParallel()
where [some condition]
select [something]
Notice that the only difference is the AsParallel method call appended to the source and we can of course use this pattern with more complex queries.

Why Does It Work
To understand why the above compiles we have to remind ourselves of how LINQ works and that the first version of the code above is really equivalent to:
var result =  source.Where(x => [some condition]).Select(x => [something]);
...so when we parallelize it we are simply changing it to be the following:
var result =  source.AsParallel().Where(x => [some condition]).Select(x => [something]);
In other words the call to AsParallel returns something that also has the typical extension methods of LINQ (e.g. Where, Select and the other 100+ methods). However, with LINQ these methods live in the static System.Linq.Enumerable class whereas with PLINQ they live in the System.Linq.ParallelEnumerable class. How did we transition from one to the other? Well, AsParallel is itself an extension method on IEnumerable types and all it does is a "smart" cast of the source (the IEnumerable) to a new type which means the extension methods of this new type are picked up (instead of the ones directly on IEnumerable). In other words, by inserting the AsParallel method call, we are swapping out one implementation (Enumerable) for another (ParallelEnumerable). And that is why the code compiles fine when we insert the AsParallel method. For a more precise understanding, in the VS editor simply right click on AsParallel, choose Go To Definition and follow your nose from there…

How Does It Work
OK, so we can see why the above compiles when we change the original sequential query with our parallelised query, which we now understand is based on the introduction of new .NET 4 types such as ParallelQuery and ParallelEnumerable – all in System.Core.dll in the System.Linq namespace. But how does the new implementation take advantage (by default when it is worth it) of all the cores on your machine? Remember our friendly task-based programming model? The implementation of the methods of the static ParallelEnumerable class uses Tasks ;-). Given that the implementation is subject to change and more importantly given that we have not shipped .NET 4 yet, I will not go into exactly how it uses the Tasks, but I leave that to your imagination (or to your decompiler-assisted exploration ;)).

Simple Demo Example
Imagine a .NET 4 Console project with a single file and 3 methods, 2 of which are:
  static void Main()
{
Stopwatch sw = Stopwatch.StartNew();
DoIt();
Console.WriteLine("Elapsed = " + sw.ElapsedMilliseconds.ToString());
Console.ReadLine();
}
static bool IsPrime(int p)
{
int upperBound = (int)Math.Sqrt(p);
for (int i = 2; i <= upperBound; i++)
{
if (p % i == 0) return false;
}
return true;
}
…without worrying too much about the implementation details of IsPrime (I stole this method from the walkthrough you get in the VS2010 CTP). So the only question is where is the 3rd method, which clearly must be named Doit. Here you go:
  static void DoIt()
{
IEnumerable arr = Enumerable.Range(2, 4000000);
var q =
from n in arr
where IsPrime(n)
select n.ToString();
List list = q.ToList();
Console.WriteLine(list.Count.ToString());
}
Now if you run this you will notice that on your multi-core machine only 1 core gets used (e.g. 25% CPU utilization on my quad core). You'll also notice in the console the number of milliseconds it took to execute. How can you make this execute much faster (~4 times faster on my machine) by utilizing 100% of your total CPU power? Simply change one line of code in the Doit method:
from n in arr.AsParallel()
How cool is that?

Can It Do More
What the PLINQ implementation does is it partitions your source container into multiple chunks in order to operate on them in parallel. You can configure things such as the degree of parallelism, control ordering, specify buffering options, whether to run parts of the query sequentially etc. To experiment with all that, just explore the other new extension methods (e.g. AsOrdered, AsUnordered) and, finally, the new enumerations (e.g. ParallelQueryMergeOptions). I leave that experimentation to you dear reader ;)

LINQ to XML, namespaces and VB

Tue, May 6, 2008, 06:30 AM under dotNET | LINQ
Recently I was playing with some LINQ to XML for a demo I was preparing and was having trouble retrieving the expected values from what was a very straightforward query.

Have a look at the XML file that looks like this (the results of programmatically calling this Amazon service).

Each Item element represents a book and I wanted to retrieve the Title. How would you form that query with LINQ to XML?

I went for the obvious:
  var res =
from ia in XElement.Parse(e.Result).DescendantsAndSelf("ItemAttributes")
select ia.Element("Title").Value;
When that did not produce the expected results I scratched my "tired" head at the time and pinged MikeT who came up with the correct way of doing this (you still have time to work it out on your own).

The clue (and at the same time further "excuse") is that all my previous experiments with LINQ to XML involved using my own demo XML files that never had namespaces inside so I forgot all about them (haven't paid the tax in a while). I find quite ugly what you have to do to incorporate namespaces in a LINQ to XML query, but there seems to be no nicer alternative to the following (thanks Mike):
void SomeMethod()
{
var res =
from ia in XElement.Parse(e.Result).DescendantsAndSelf(n("ItemAttributes"))
select ia.Element(n("Title")).Value;

// TODO use res
}

static XName n(string name)
{
return XNamespace.Get("http://webservices.amazon.com/AWSECommerceService/2005-10-05") + name;
}
I still didn't like this solution for the simple demo I wanted to use it for. So, I recalled VB's superior support for XML and I converted the project to VB and used the following instead which needs no extra method and is all round more elegant:
  Dim res = _
From ia In XElement.Parse(e.Result)...<n:ItemAttributes> _
Select ia...<n:Title>.Value()
...and if you are wondering where the n comes from, that is at the top of the VB file:
Imports <xmlns:n="http://webservices.amazon.com/AWSECommerceService/2005-10-05">
...another aspect of VB's beauty and another thing I had the opportunity to mention in my demo ;-)

LINQ to DataSet video

Mon, December 10, 2007, 09:07 AM under dotNET | Orcas | LINQ
I have already written about LINQ to DataSet, and for more details than what I provided you should follow the links I suggested on that post. In addition, I just published an 18' video on LINQ to DataSet which I hope you enjoy...

LINQ to SQL in Beta 2

Wed, August 15, 2007, 10:56 AM under Orcas | LINQ
LINQ to SQL has had some positive changes in VS 2008 Beta 2.

The most significant is that it is now faster (for background to this, see Rico's post) and has numerous bug fixes. Another very visible change is that for customisation, partial methods are utilised everywhere (LINQ to SQL was one of the drivers behind that feature). An aesthetic tool change is that when you map your database to the auto generated classes, you do not add a "LINQ to SQL File" item, instead you add a "LINQ to SQL Classes". It also has a much more professional icon rather than the old one that looked like it was drawn by... me! Here is the new one:


Opening System.Data.Linq.dll in your favourite dissasembler will reveal quite a few refactorings, but instead of looking inside out, I'll hand it over to Dinesh that has a fuller change list.

LINQ samples

Thu, August 9, 2007, 12:57 PM under Orcas | LINQ
I opened the Help menu of Visual Studio 2008 Beta 2 and observed three additions compared to Beta 1. Can you spot them without reading beyond the screenshot:


That's right:
1. MSDN Forums – browses to the MSDN forums.
2. Report a Bug – on my machine takes me here.
3. Samples – opens an htm file from your machine.

That last one is the most interesting to me. It includes hyperlinks to many sample projects on your machine. Clicking on "Visual C# Samples" will open a ZIP file that has two folders in it, one of them called "LinqSamples" that contains the BuildSamples.sln. Open that and you'll have 15 very interesting projects.

Included in that solution (no need to download anything from the web, contrary to following links) is the Paste XML As XElement add in project, the LinqToSqlQueryVisualizer debugger visualizer project and the ExpressionTreeVisualizer debugger visualizer project. Also, worth the admission fee alone is the SampleQueries project with literally hundreds of sample LINQ to everything queries in a format that you can learn from:

Go explore now!

LINQ to SQL pipeline

Mon, August 6, 2007, 03:23 PM under Orcas | LINQ
Stumbled upon a great video of Matt Warren (questioned by Luca Bolognese and shot by Charlie Calvert) explaining the pipeline of LINQ to SQL i.e. it assumes you know how to use LINQ to SQL already and builds on that to explain the internals. Watch it here.

LINQ's relationship to the new language features

Tue, July 31, 2007, 03:43 AM under Orcas | VisualStudio | LINQ
If you are interested in understanding the relationship of the new C#3 and VB9 features to LINQ, then watch my 20' screencast and let me know there if you have any questions (unlike other videos, I advise you not to speed it up, because I speak very fast as it is ;)).

Parallel LINQ

Mon, July 9, 2007, 09:43 AM under ParallelComputing | LINQ | Links
Having done a lot of real world work with threading in the .NET Framework including working with the numerous limitations of NETCF v1.0 (inc. implementing the BackgroundWorker for it), I read with interest Sam's parallel computing blog post to see what links there would be in there and I wasn't disappointed. Check it out if you are new to threading in the managed world. That reminded me about a related topic that I keep mentioning at the end of my LINQ presentations and that I've been meaning to blog about in response to Tim's question: Parallel LINQ (PLINQ).

One of the numerous benefits of LINQ is the move to declarative programming, which has side benefits over and beyond the obvious ones. By writing code that tells the engine what it is that we want it to do rather than how to go about it, we open new possibilities where the engine can take our intent and split/execute it on multiple threads/cores (since ultimately it is responsible on the how). While most devs "get" that, it may sound a bit woolly to others, so here are some links/info about PLINQ.

I believe the first mention of PLINQ is in this eWeek article (August 2006). Joe Duffy announced his involvement (September 2006) and then followed it up with more info and a slide deck (January 2007). Bart watched a presentation on the topic and spills the beans on the AsParallel extension method which he follows with a NON-Plinq code example (April 2007). Finally, watch a real 5' demo by Luca from Tech Ed [between 51:26-56:40] (May 2007).

Of course, it is early days and PLINQ will not ship with VS2008 (or even as part of the wider Orcas-wave) but you get the idea... The earlier you start taking advantage of LINQ, the earlier you'll be able to take advantage of PLINQ when it eventually ships ;)

System.Data.Linq.dll

Sat, May 19, 2007, 06:19 AM under dotNET | Orcas | LINQ
One of the green bit assemblies in .NET Framework 3.5 is System.Data.Linq.dll. As you may have guessed, this assembly contains the framework implementation for LINQ to SQL. Regular readers will know about my other multiple LINQ posts, but I have never posted about LINQ to SQL (formerly known as DLinq).

So, if you are interested in DLINQ, stay tuned on a series of posts that ScottGu has started here. For a detailed LINQ to SQL tutorial series of videos, check out the output from our local nugget machine (aka MikeT). I have captured here the direct links to the WMVs for the 18 screencasts: Intro, DataContext, Mapping, Tools, Inserting Data, Deleting Data, Updates, Concurrency, Joins, NULLs, Deferred Query Execution, Deferred Loading, Stored Procedures (querying), Stored Procedures (updates), Standard Functions, Custom Functions, Inheritance, Transactions.

Using Extension methods in Fx 2.0 projects

Sun, May 13, 2007, 03:30 PM under dotNET | Orcas | LINQ
For the importance and context of this blog entry, please DO read my previous blog post including the disclaimer at the top.

The feature I have not explicitly addressed yet is extension methods. If you are a C# developer that has been using the "magic" this keyword read my explanation to understand what we are talking about here. You see, even though extension methods are a compiler trick, they do rely on the new System.Runtime.CompilerServices.ExtensionAttribute introduced in System.Core.dll. Since we've already established that System.Core is a Fx v3.5 assembly and hence is not usable from 2.0 projects, you might think this is the end of the road for extension method usage in 2.0 projects. Let's explore this further.

In a 2.0 Orcas project try and compile the following code:
static class Program
{
//[ExtensionAttribute()]
static void DisplayInterval(this Timer t)
{
Console.WriteLine(t.Interval.ToString());
}

static void Main(string[] args)
{
System.Timers.Timer t = new System.Timers.Timer(2000);

Program.DisplayInterval(t); // long hand usage of method
t.DisplayInterval(); // short hand usage of method (extension)

Console.ReadLine();
}
}
You will see a compiler error as follows:
"Cannot use 'this' modifier on first parameter of method declaration without a reference to System.Core.dll. Add a reference to System.Core.dll or remove 'this' modifier from the method declaration"
What the compiler *really* wants to say is:
"You are using the 'this' modifier and it depends on the ExtensionAttribute that itself resides in the System.Core.dll and you do not have a reference to it"
Once you understand what it is really saying, and once you realise that it only needs this attribute at compile time (since at runtime the IL to be interpreted will still include the call to the utility method in long hand), then it all becomes clear what you need to do next. Declare the attribute yourself! It isn't such a big logical jump (or else I wouldn't have made it :-))

So add to your 2.0 project a new empty code file and paste in it the following code:
namespace System.Runtime.CompilerServices
{
public class ExtensionAttribute : Attribute { }
}
Then, wherever you declare the extension methods, optionally add to the top:
using System.Runtime.CompilerServices;
Now your projects will compile and you can use extension methods!

So summing up the previous post and this one, all 6 language features behind LINQ are usable from Fx v2.0 projects. But the LINQ to objects/xml/data implementations clearly are not usable because they live in Fx 3.5 assemblies. You can of course offer your own LINQ implementations for your own data sources. For example, the following code will compile in a Fx 2.0 project...
using Moth.Linq;
static class Program
{
static void Main(string[] args)
{
var ints = new []{1,2,3,4};
var res2 = from p in ints
where p > 2
select p;

System.Diagnostics.Debugger.Break();
// examine res2
Console.ReadLine();
}
}
...if you also add to the project this (demo) code file here.


BTW, how soon are you guys out there planning to move your existing 2.0 projects to Visual Studio "Orcas"? ;-)

Using C# 3.0 from .NET 2.0

Sun, May 13, 2007, 03:15 PM under Orcas | VisualStudio | LINQ
Please note that this and my next post are based on Beta 1 Orcas bits. Later builds might change the facts so always try it yourselves.

There still appears to be some confusion with regards to the new language features that make LINQ to objects work. In particular, there are claims that it all works with .NET 2.0 and counterclaims that it doesn't. IMO the confusion arises from people not being precise about what they are talking about. I'll try and give you here my view (which of course I think is as clear as can get :-))

To start with, everything here is in the context of Visual Studio "Orcas" – forget earlier versions. Orcas supports multitargeting so you can build projects that target .NET Framework v2.0. So the real question we are addressing here is whether you can choose to build a NetFx 2.0 project that also uses some of the new C#3 language features.

If you are not familiar with LINQ, please go read my detailed LINQ blog posts and then come back. So after reading about LINQ I hope you realise that:
LINQ =
1. local variable inference +
2. object intiliazers +
3. anonymous types +
4. lambda expressions +
5. extension methods +
6. query expressions +
7. framework support.

Let's look at some of the language features. The first 4 do work with 2.0 projects i.e. the following compiles and runs fine in a 2.0 Orcas project:
class Program
{
delegate T Func<T>(T t);
static void Main(string[] args)
{
// object init -
// makes no sense in this example but hey...
Timer t = new Timer(2000)
{
AutoReset = false,
Enabled = true
};

// local var inf, anonymous types
var i = new { Name = "Fred", Age = 6 };
Console.WriteLine(i.ToString());

// lambdas
Func<int> f = j => j + 1;
Console.WriteLine(f(5));

Console.ReadLine();
}
}
Can you spot them? Local variable type inference, object initiliazers, lambda expressions and anonymous types. This shouldn't surprise you if you read my previous blog posts since I have stated clearly many times that these language features are simply compiler tricks. The IL that gets generated is the same old IL – no dependency on the framework and no dependency on the runtime. So 2.0 projects in Orcas can use 4 of the smart compiler tricks, this is good! As an aside, other 3.0 compiler features irrelevant to LINQ such as automatic properties, also work in 2.0 projects.

On the gloomy side, it goes without saying that number 7 above, the "framework support", resides in System.Core.dll and you cannot reference that assembly from 2.0 projects, so no extension method implementations for you in 2.0 projects. If at this point we think of number 6 above, "query expressions" (i.e. from...where...select syntax), we realise that their usefulness is in friendly mapping to the extension method implementations in System.Core.dll (or System.Xml.Linq or Systam.Data.Linq or your own extension method implementations). So even though query expressions are supported when targeting v2.0 projects from VS "Orcas" (the syntax highlighting when you try it is a clue) there is no implementation to map them onto - other than your own. Well, even your "own implementation" implies that you can use Extension methods for v2.0 projects and I haven't addressed that yet (number 5 above). See my next post now.

System.Xml.Linq

Fri, May 11, 2007, 09:36 AM under dotNET | Orcas | LINQ
Part of .NET Framework 3.5 is a new framework assembly: System.Xml.Linq dll. Now, I know most people describe this in the context of LINQ (it used to be called XLinq) and indeed it works very well with LINQ but I wanted to stress here that effectively this is a standalone new XML API. You can use it without ever going anywhere near the LINQ syntax.

There are 3 namespaces inside the assembly the main one being System.Xml.Linq with 23 types, the most important shown below:

The ones you are most likely to use all the time are XElement and XAttribute (I bet you thought I'd say XDocument but actually you don't really have to use that one!). So why yet another new API for XML? Because this one is really cool! One aspect of the coolness is the way you create XML documents/fragments: You can do that in a single statement where the creation of XML items and their association takes place in the same step! Also while writing that single step/statement, you get to do it in the natural way that you think of the layout of the XML document. For example, in the following screenshot I have captured the code and the document it creates side by side. Can you see how intuitive it is?

Another aspect of the coolness of the API is that you can indeed use it with the LINQ syntax both for creation and for querying. To assist with that there are a bunch of extension methods in 3 static classes: System.Xml.Linq.Extensions, System.Xml.Schema.Extensions and System.Xml.XPath.Extensions. FYI, there are no other types in the Schema and XPath namespaces. I'll let you explore that yourself in the Beta 1 of Orcas.

The coolest thing about this API however is only available to Visual Basic developers. The VB compiler is smart enough to let you type in XML literals while it produces the necessary calls to the new XML classes. The code I showed above in C# 3.0 can be written in VB9 as per the following:

To see more about creating the documents, how LINQ syntax can be thrown in and how VB syntax with XML truly rocks you need some nuggets of course. Well, my colleague beat me to it so I encourage you to view his videos here (he is a bit behind the times using the March CTP so note that adding a reference to System.Core and System.Xml.Linq is done for you in Beta 1 for 3.5 projects).

equals versus ==

Thu, March 8, 2007, 03:09 AM under Orcas | LINQ
At a recent event where I was presenting on LINQ, I showed a query with a join, similar to the following:
      var results =
from p in Process.GetProcesses()
join p2 in MyProccess.GetMyProcList()
on p.ProcessName equals p2.MyProcName
where p.Threads.Count > 14
orderby p.ProcessName descending
select new { p2.MyProcDescription, ThreadCount = p.Threads.Count, p.Id };
After the session one of the delegates asked me: "Why do we have to use the equals keyword and not just ==". In other words he would have preferred to type:
on p.ProcessName == p2.MyProcName
I didn't have a good answer but promised to look into it.

Looking into it involved pinging the product team and Matt Warren (C# software architect) came up with the reply, which I include unedited, warts and all :-)
"The reason C# has the word ‘equals’ instead of the ‘==’ operator was to make it clear that the ‘on’ clause needs you to supply two separate expressions that are compared for equality not a single predicate expression. The from-join pattern maps to the Enumerable.Join() standard query operator that specifies two separate delegates that are used to compute values that can then be compared. It needs them as separate delegates in order to build a lookup table with one and probe into the lookup table with the other. A full query processor like SQL is free to examine a single predicate expression and choose how it is going to process it. Yet, to make LINQ operate similar to SQL would require that the join condition be always specified as an expression tree, a significant overhead for the simple in-memory object case."
Makes sense!

UPDATE: Vladimir Sadov from the Visual Basic team told me that VB also uses Equals for pretty much the same reasons.

LINQ Resources

Tue, February 20, 2007, 04:37 AM under dotNET | Orcas | LINQ
In order to really understand LINQ and before moving to LINQ to data/xml/etc, one has to appreciate the features that LINQ builds on and how LINQ to objects works (unless you like to think of it as magic :)).

To that end I suggest you read my blog entries in order:
1. LINQ
2. Local Variable Type Inference (and the VB version)
3. Object Initiliasers
4. Anonymous Types
5. Extension Methods
6. Lambda Expressions (and the VB version)
7. Decomposing LINQ (includes Query Expressions)

My 1 hour talk on LINQ is coming to a city near you (if you live in the UK) and my code-littered slides are available to download here (pptx format).

If you want to play with LINQ you need Orcas. You can get the May CTP of last year or the January CTP from this year but to be brutally honest, namespaces/syntax/features have changed so the one you really really want is the Feb/March CTP so I would wait for that one which is just round the corner.

For even more info, see the following links:
- Official LINQ page
- We have some screencasts here and channel9 has some media here.
- Plenty of blogs with LINQ or C#3 or VB9 categories and some of them are here, here, here, here, here and here.
- For more, search.

For all your questions, as usual the msdn forums offer free support so please go there.

Decomposing LINQ

Sun, February 18, 2007, 07:04 PM under dotNET | Orcas | LINQ
Please revisit the example from my previous post on LINQ. Here I only include the query snippet of code:
       var results =
from p in Process.GetProcesses()
where p.Threads.Count > 6
select new {p.ProcessName, ThreadCount = p.Threads.Count, p.Id };
...or in VB if you prefer:
    Dim results = _
From p In Process.GetProcesses() _
Where p.Threads.Count > 6 _
Select New With {p.ProcessName, .ThreadCount = p.Threads.Count, p.Id}
The above is based on a number of new language features including one that I have not mentioned until now: query expressions. If we were to take away query expressions, the code above would look like this:
   var results =
Process.GetProcesses()
.Where(p => p.Threads.Count > 6)
.Select(p=>new {p.ProcessName, ThreadCount = p.Threads.Count, p.Id });
In VB:
    Dim results = _
Process.GetProcesses() _
.Where(Function(p) p.Threads.Count > 6) _
.Select(Function(p) New With {p.ProcessName, .ThreadCount = p.Threads.Count, p.Id})
The two snippets above are identical. There isn't much more to explain about Query Expressions other than... that is the way it is! The two code snippets above are identical, the first one using query expression syntax to beautify the real code shown in the 2nd one.

So let's look at the 2nd snippet more closely. The first line (Process.GetProcesses()) returns an array and we know that arrays implement IEnumerable. Well it turns out that in Framework 3.5 there are some extension methods for IEnumerable including a method called Where that takes as an argument a delegate (named Func) and returns an IEnumerable. This extension method (along with other extension methods) and the delegate it accepts (along with other delegates) are in the System.Linq.Enumerable class that is in Core dll.

Given what we just said, the second line above (.Where(p => p.Threads.Count > 6)) should make a lot more sense now: the way we pass in a delegate to the Where extension method is by using a lambda expression (VB lambda link). The 3rd line of code is also straightforward once you know that another extension method on IEnumerable is Select. It also takes a delegate and we used a lambda expression for that as well. Note how in the body of the lambda we use anonymous types. What Select returns is a generic IEnumerable of the anonymous type we create on the fly. Since we cannot write compilable code of an IEnumerable of an anonymous type, we use local variable type inference (VB inference link) for the results variable and again take advantage of inference when we want to access each element:
      foreach (var o in results)
{
Console.WriteLine(o.ToString());
}
Note that if you look closely at the extension methods in Enumerable and the delegates in the System.Linq namespace, you will find extreme usage of generics and that is how that code can deal with arbitrary conditions, anonymous types etc that are used in LINQ queries. Don’t forget the compiler magic of course that does generate quite a bit of goo in addition to the code you write. For that last point, compile the code above and open the assembly with reflector and you'll see what I mean :-)

To finish off, take a glance at the original query without the query expression syntax; what would the code look like if we also took away all the new language features (but still using the new Enumerable class of the Framework 3.5)? First we have to write two extra methods and declare a helper class):
    private class MothAnonymousType
{
public string ProcessName;
public int ThreadCount;
public int Id;
//
public override string ToString()
{
return "{ ProcessName = " + ProcessName + ", ThreadCount = " + ThreadCount + ", Id = " + Id + " }";
}
}
private static bool MothWhere(Process p)
{
return p.Threads.Count > 6;
}
private static MothAnonymousType MothSelect(Process p)
{
MothAnonymousType type1 = new MothAnonymousType();
type1.ProcessName = p.ProcessName;
type1.ThreadCount = p.Threads.Count;
type1.Id = p.Id;
return type1;
}
and then our calling code changes to this:
    static void Main(string[] args)
{
IEnumerable<MothAnonymousType> results =
Enumerable.Select(
Enumerable.Where(Process.GetProcesses(), new Func<Process, bool>(MothWhere)),
new Func<Process, MothAnonymousType>(MothSelect));
//
foreach (MothAnonymousType o in results)
{
Console.WriteLine(o.ToString());
}

Console.ReadLine();
}


I think I prefer the original query syntax don’t you ;-)

Lambda Expressions in VB9

Sun, February 18, 2007, 06:54 PM under dotNET | Orcas | LINQ
This short blog entry assumes that you have read my description of C# lambda expressions and here I will literally just show you the proposed VB syntax that corresponds to the lambda we ended up with in that blog post: i => i > 2. The corresponding syntax is: Function(x) i > 2

So, the full example, given the following code:
  Delegate Function SomeDelegate(ByVal i As Integer) As Boolean
Public Sub SomeMethod()
'
'REMEMBER FROM LAST TIME, THE KEY LINE IS THE FOLLOWING
Dim sd As SomeDelegate = New SomeDelegate(AddressOf OtherMethod)
'
' other code here
YetOneMore(sd)
End Sub
Private Function OtherMethod(ByVal i As Integer) As Boolean
Return i > 2
End Function
Private Sub YetOneMore(ByVal f As SomeDelegate)
Dim res As Boolean = f(5)
Console.WriteLine(res.ToString())
End Sub
We can get rid of the OtherMethod method completely and inline on the delegate creation line using the VB9 lambda expression sysntax, like this:
    Dim sd As SomeDelegate = Function(i) i > 2

WARNING: for this feature only and for the VB case only, I do not have a compiler that supports it yet. Unlike all my other blog posts, I am basing the above on a spec rather than hands on experience. If I find that it changes when I get new bits, I will come back here and update this.

Lambda Expressions C# 3.0

Sun, February 18, 2007, 06:48 PM under dotNET | Orcas | LINQ
There are two aspects to lambda expressions and I will only discuss one of them in this post. The aspect not discussed is the one that is most useful, but to get there we must first understand the syntax, which follows.

Lambdas are simply shorthand to creating a delegate and pointing it to a method plus they offer type inference. Consider the following C# example:
delegate bool SomeDelegate(int i);
public void SomeMethod()
{
SomeDelegate sd = new SomeDelegate(OtherMethod);
//
// other code here
YetOneMore(sd);
}
private bool OtherMethod(int i)
{
return i > 2;
}
private void YetOneMore(SomeDelegate f)
{
bool res = f(5);
Console.WriteLine(res.ToString());
}
Nothing complicated (or useful) takes place. Before I rewrite it using a lambda expression, let's re-write it a bit using C# 2.0 anonymous methods:
delegate bool SomeDelegate(int i);
private void SomeMethod()
{
SomeDelegate sd =
delegate(int i){return i > 2;}
;
//
// other code here
YetOneMore(sd);
}
private void YetOneMore(SomeDelegate f)
{
bool res = f(5);
Console.WriteLine(b.ToString());
}
If you are not familiar with anonymous methods, basically we have inlined OtherMethod by using the delegate keyword.

Lambda expressions take this to the next level of conciseness.
This line:
    delegate(int i){return i > 2;}
can be written like this:
    (int i) => { return i > 2;}
so all we've done there is replace the delegate keyword with the funny syntax =>

However the beauty is that given the body only has a single return statement, we can make it even more concise *and* get the compiler to infer the parameter type:
 SomeDelegate sd = i => i > 2;
And that is the basics of lambdas: it looks weird, it is concise and it does some inference for us. In the future I will post about the other aspect of lambdas which is actually what most people are excited about: lambdas bound to parameter expressions.

Extension methods C# 3.0 and VB9

Sun, February 18, 2007, 06:43 PM under dotNET | Orcas | LINQ
How many times have you written wrapper methods for objects that in reality you wished were part of the object itself? For example, if you find yourself checking in multiple places if a string is all uppercase, you may write a wrapper method for it, e.g.:
namespace Helper
{
public static class StringHelper
{
public static bool IsAllUpper(string s)
{
//implementation left as an exercise to the reader
}
}
}
...or in VB if you prefer:
Namespace Helper
Public Module StringHelper
Public Function IsAllUpper(ByVal s As String) As Boolean
'implementation left as an exercise to the reader
End Function
End Module
End Namespace
...which you then call like this (assuming the Helper namespace is in scope):
string s = ...
bool itIs = StringHelper.IsAllUpper(s);
In VB:
Dim s As String = ...
Dim itIs As Boolean = StringHelper.IsAllUpper(s)
...when really, what you want is something more readable like this:
bool itIs = s.IsAllUpper();
In VB:
Dim itIs As Boolean = s.IsAllUpper()
That is exactly what extension methods let you do. To achieve that, a new attribute has been introduced: System.Runtime.CompilerServices.ExtensionAttribute (in System.Core.dll).

In the VB example above, simply add the Extension attribute to the method in the module:
    <Extension()> _
Public Function IsAllUpper(ByVal s As String) As Boolean
...
In c#, there is a shortcut to using the attribute which makes the following two represent the same thing:
[Extension()]
public static bool IsAllUppercase(string s) {...}

public static bool IsAllUppercase(this string s) {...}
...except the C# compiler will not let you write the first version and will instruct you to use the "this" keyword instead (a bit like it doesn't let you write an explicit finalizer method and instead instructs you to use destructor syntax).

So, back to our example, if in the c# case we simply insert the this keyword before the string argument in the IsAllUpper helper method (and in the VB case insert the Extension attribute before the method) then we can indeed type s.IsAllUpper() and the compiler is happy. In fact, the advantage of doing this is that intellisense will help you by showing the method when you type a dot after the variable thus making the helper method much more discoverable than what it would be today - see the following screenshot for how it distinguishes extension methods from normal methods by sticking an arrow in front:


Please note the following points which hopefully help answer any questions you might have:
0. It is not by accident that I chose a static class in C# and a module in VB. Those are the only places where you can define extension methods.
1. The compiler generates the longhand version: no magic is taking place here at runtime. Since you are not *really* adding a method to the class, you can only access public members of the object (e.g. only public members of the string in our example) from your extension method. It also follows that extensions are available to subclasses of the class you are extending.
2. For an extension method to be visible/applicable/in scope, you must import the namespace it resides in. So in our example, if you do not stick a using Helper; (or in VB Imports Helper) at the top of the file where you make the call, the extension method would not show up in intellisense and would not compile.
3. If two namespaces with extensions methods that have the same name are brought into scope, a compile error occurs; hence the usefulness of the previous point.
4. If you add to the object a real method that has the same name as the extension method, the real method takes precedence and your extension method is silently ignored.
5. Extension methods for properties are not possible.

Overall, I like this feature *a lot* (will post some concrete examples of why in the future). My only worry is that some devs will get lazy and use extension methods rather than inheritance when the latter is more applicable... time will tell.

Anonymous Types in C# 3.0 and VB9

Sun, February 18, 2007, 06:38 PM under dotNET | Orcas | LINQ
Recall local variable type inference (and the VB9 story) and object initialisers? Here is a reminder:
var o = new SomeType(); //inference
SomeType o = new SomeType {SomeField=DateTime.Now, AnotherField=5.6}; //initialisers
We can of course combine them:
var o = new SomeType {SomeField=DateTime.Now, AnotherField=5.6};
...or in VB if you prefer:
Dim o = New SomeType With {.SomeField = DateTime.Now, .AnotherField = 5.6}
Now, imagine that you were only using variable o in a single method. Also imagine that the type of o (SomeType) only has public fields/properties and no other methods/functions. Also imagine that SomeType was not used anywhere outside that single method. I know all this requires vivid imagination but humour me and picture that scenario.

Well, in the imaginary scenario, you really don't need to declare/define the type! Think about it, why would you need to know what type o is? All you need to be able to do is create something that looks like it and access the public fields/properties. That is exactly what the “anonymous types” feature offers:
var o = new {SomeField=DateTime.Now, AnotherField=5.6};
In VB:
Dim o = New With {.SomeField = DateTime.Now, .AnotherField = 5.6}
In the code above, the compiler generates a class for us, which is visible in IL. The name of the type is not visible to our code and the name is not otherwise usable. Hence we call the feature: anonymous types. If you use your favourite disassembler you can see what the name of the type is but that information will be of academic value.

Another featurette of anonymous types is that the compiler can infer the field names so if you amend the code above like this:
var o2 = new {DateTime.Now, AnotherField=5.6};
In VB:
Dim o2 = New With {DateTime.Now, .AnotherField = 5.6}
...then the anonymous type will have a property called Now that has the value of DateTime.Now and this will of course show up in intellisense e.g. Console.WriteLine(o.Now);

Also note that anonymous types override the ToString method and return something sensible in the format “{Field1 = value1, Field2 = value2}”

Of the three new language features that I've described so far, anonymous types looks like the most useless. Stick with this one for a while. When I bring it all together for LINQ, you'll see the usefulness of the feature.

Object Initializers in C# 3.0 and VB9

Sun, February 18, 2007, 06:36 PM under dotNET | Orcas | LINQ
How many times have you created an object and immediately started setting properties on it? For example, wherever I create a Thread object I use the following almost boilerplate snippet:
Thread myThread = new Thread(MyThreadMethod);
myThread.Name = "my thread";
myThread.IsBackground = true;
..or in VB if you prefer
    Dim myThread As New Thread(AddressOf MyThreadMethod)
myThread.Name = "my thread"
myThread.IsBackground = True
With object initialisers, you can combine the first 3 statements into one like so:
Thread myThread = new Thread(MyThreadMethod) {Name="my thread", IsBackground=true};
In VB:
Dim myThread As New Thread(AddressOf MyThreadMethod) With {.Name = "my thread", .IsBackground = True}
So, object initialisers is a feature that lets you assign public properties (and public fields) straight after the constructor in braces, without having to repeat the object variable name and type separate statements.

Note that the compiler generates the long hand code. For example, when you type the following statement:
TextBox t = new TextBox {Text="Hi", Multiline=true, Location = new Point(5,5), Size=new Size(50,100)};
In VB:
Dim t As New TextBox With {.Text = "Hi", .Multiline = True, .Location = New Point(5, 5), .Size = New Size(50, 100)}
...the compiler generates IL similar to if you typed:
      TextBox t = new TextBox(); 
t.Text = "Hi";
t.Multiline = true;
t.Location = new Point(5, 5);
t.Size = new Size(50, 100);
This feature saves you some typing and results in more concise code. While I like object initialisers, their full usefulness will become apparent when combined with the language enhancement that we look at next.

Option Infer in VB9

Sun, February 18, 2007, 06:30 PM under dotNET | Orcas | LINQ
I will assume here that you have read the narrative to my C# post about local variable type inference and build on it to discuss the VB syntax.

The VB syntax for local variable type inference is:
Dim i = 3
Dim s = "hi"
Dim o = New SomeType()
...which the compiler generates as:
Dim i As Integer = 3
Dim s As String = "hi"
Dim o As New SomeType()
Note that for reference types, you are not really saving a lot of typing compared to C#. The example I used in that post was:
    Dim myCol = New Dictionary(Of Integer, SomeType)()
, which represents a saving of only one character in VB:
    Dim myCol As New Dictionary(Of Integer, SomeType)()
What is more important is that some VB developers will be thinking right now “So what? I could always type Dim o = whatever”. My answer is “Not if you turned Option Strict On”. Don’t forget what I wrote in the c# version: this is early binding. Like shown above, the compiler infers the type and generates it for you. At this point, the astute reader will have a question: “So if I turn off Option Strict, then what happens? The old behaviour or the new?”. To which my answer is “Excellent point, let’s talk about that :-)”.

There is a new option in VB9: Option Infer. Option Infer, like its cousins, can be turned on/off at the project level or at the code file level. If you have Option Strict Off, then whether a variable has the old behaviour or new is determined by the setting of Option Infer. This also means that, unlike c#, you can turn off local variable type inference even in the strongly typed world (by setting Option Strict to On and then Option Infer to off). I believe the default for new VB projects will be that Option Infer is On.

There is a good opportunity here for all that VB6 code you ported over to .NET land and did not want to revisit with Option Strict On. If you turn Option Infer On, the code that did work, will now work faster. The code that had issues, will still have issues (for that you must still turn option Strict On).

Anyway, back to our syntax, the following example shows how to get variable inference in the For Each case like I showed before in C#:
    Dim col As IEnumerable(Of SomeType) = New SomeCollectionType()
For Each a In col 'a is inferred by the compiler to be SomeType
a.MethodOnSomeType() 'intellisense here pops-up as expected
Next
In the code above, the variable a is inferred by the compiler to be SomeType. Of course, if you turn Option Infer Off, then you will get a compiler error even with Option Strict Off. The only way not to get a compiler error for variable a is to turn Option Explicit Off too but we know that is just mad.

More language features follow!

Local variable type inference in C# 3.0

Sun, February 18, 2007, 06:22 PM under dotNET | Orcas | LINQ
A new feature in C# version 3 that comes with Orcas allows you to declare variables within the body of a method like this:
var i = 3;
var s = "hi";
var o = new SomeType();
Note that the compiler generates IL that is identical to what it would generate if you typed:
int i = 3;
string s = "hi";
SomeType o = new SomeType();
In other words the 2nd line below compiles fine but the 3rd does not compile:
var b = true;
b = false; //fine
b = "pull the other one"; //Cannot implicitly convert type 'string' to 'bool'
So, the important point is that this local variable type inference, results in early-bound strongly-typed code. There is no late-binding taking place, besides the similarity with the var keyword in script languages where everything is an object/variant - that is not the case here.

Another point to stress is that variable inference does not work for class level fields or method arguments or anywhere else other than for local variables in a method.

Another example is the following:
IEnumerable<SomeType> col = new SomeCollectionType();
foreach (var a in col) //a is inferred by the compiler to be SomeType
{
a.MethodOnSomeType(); //intellisense here pops-up as expected
}
It is neat that the compiler can infer what type you expect a variable to be by looking at the right hand-side of the assignment, but I personally will not be using this often as I prefer explicitness. I guess it is a time saver for very long types, for example:
// I'd rather type this
var myCol = new Dictionary<int, SomeType>();
//...than type this
Dictionary<int, SomeType> myCol = new Dictionary<int, SomeType>();
Local variable type inference is a feature that seems not very useful in its own right, but becomes important when used in conjunction with other language enhancements.

Language INtegrated Query

Sun, February 18, 2007, 06:16 PM under dotNET | Orcas | LINQ
The biggest announcement at the last PDC was the LINQ project. It is now the headline feature of Orcas offered to all developers from web to client to device. Given that there are no changes to the runtime engine in Orcas, you will not be surprised to learn that LINQ is based entirely on compiler magic, the introduction of some fancy new syntax and some help from Core.dll.

LINQ simplifies querying objects, data and XML by integrating query and transform operations into the programming language (currently VB9 and C#3). It introduces a concise declarative syntax that is consistent irrespective of what your underlying data source is: in memory custom objects, datasets, XML, Entities etc. LINQ is also highly extensible which means that we are already seeing LINQ to other sources (e.g. search for blinq, plinq, linq to amazon, linq to WMI and so on).

I think that most devs will make most use of the LINQ-enabled ADO.NET (LINQ to SQL, LINQ to DataSet and LINQ to Entities) and of LINQ over XML. Before you start learning about those, it is my opinion you should learn about LINQ to objects first and then build on that.

Here is an example of some LINQ syntax:
    static void Main(string[] args)
{
var results =
from p in Process.GetProcesses()
select p;
//
foreach (var o in results)
{
Console.WriteLine(o.ToString());
}
//
Console.ReadLine();
}
This will output in your console a list of all the running processes on your machine in the following format "System.Diagnostics.Process (devenv)".

We can modify the query as follows:
        from p in Process.GetProcesses()
where p.Threads.Count > 6
select p;
...which unsurprisingly will trim the list to only show processes that have more than 6 threads.

We can modify the query further like so:
        from p in Process.GetProcesses()
where p.Threads.Count > 6
orderby p.ProcessName descending
select p;
...which due to the easily readable nature of linq, you can easily tell that it orders the resulting list of processes in descending order by the ProcessName (on my machine that puts the winlogon process at the top).

How about if I just want to see a few specific properties of each process (say Name, thread count and process id) ? Easily done by changing the query as follows:
        from p in Process.GetProcesses()
where p.Threads.Count > 6
orderby p.ProcessName descending
select new {p.ProcessName, ThreadCount = p.Threads.Count, p.Id };
...which on my machine produces this screenshot.

Finally, the Visual Basic syntax for the example above follows:
  Sub Main()
Dim results = _
From p In Process.GetProcesses() _
Where p.Threads.Count > 6 _
Order By p.ProcessName Descending _
Select New With {p.ProcessName, .ThreadCount = p.Threads.Count, p.Id}
'
For Each o In results
Console.WriteLine(o.ToString())
Next
'
Console.ReadLine()
End Sub
I think you get the idea: Picture the code you would have to write today to achieve the same results! The code above is easier to write, easier to read and maintain, and much more concise. Furthermore, because of its declarative nature, the compiler can perform any optimisations that it sees fit and is not restricted to your imperative solution (this is a promising aspect and one I'll revisit in the future). The other benefit is that the syntax you see above can be used to query XML and data as hinted at the beginning of this post. Again, you’ll hear more on that in a future blog entry.

The next step is to understand how the above funny syntax really works and that is the subject of my following blog entries available right now :)