Parallelizing a loop with tasks, avoiding the pitfall

Tue, October 13, 2009, 02:39 PM under ParallelComputing
Most readers of this blog should know that it is extremely easy to parallelize loops with .NET 4. So serial code performing matrix multiplication like this:
    for (int i = 0; i < size; i++)
{
for (int j = 0; j < size; j++)
{
int tmp = 0;
for (int k = 0; k < size; k++)
{
tmp += m1[i, k] * m2[k, j];
}
result[i, j] = tmp;
}
}
…can be parallelized like this:
      Parallel.For(0, size, (int i) =>
{
for (int j = 0; j < size; j++)
{
int tmp = 0;
for (int k = 0; k < size; k++)
{
tmp += m1[i, k] * m2[k, j];
}
result[i, j] = tmp;
}
});
How about parallelizing the code using Tasks directly (instead of indirectly)? The clever thing to do is somehow partition the data (statically or dynamically) and assign chunks to tasks, but even the naïve approach of one task per outer iteration is better than nothing:
    Task[] tasks = new Task[size];
for (int i = 0; i < size; i++)
{
tasks[i] = Task.Factory.StartNew(() =>
{
for (int j = 0; j < size; j++)
{
int tmp = 0;
for (int k = 0; k < size; k++)
{
tmp += m1[i, k] * m2[k, j];
}
result[i, j] = tmp;
}
});
}
Task.WaitAll(tasks);
Can you see the issue with the code above?

TIP: the issue has nothing to do with tasks, threads or even .NET 4. It has to do with anonymous methods since the day they were introduced. The clue is that the following snippet fixes the issue:
    Task[] tasks = new Task[size];
for (int n = 0; n < size; n++)
{
int i = n;
tasks[n] = Task.Factory.StartNew(() =>
{
for (int j = 0; j < size; j++)
{
int tmp = 0;
for (int k = 0; k < size; k++)
{
tmp += m1[i, k] * m2[k, j];
}
result[i, j] = tmp;
}
});
}
Task.WaitAll(tasks);
If you know the reason please move along. If however this was news to you, go read the entire page (and the outbound links) on stackoverflow.

By the way, we could have fixed the snippet with this approach too, since tasks accept a state parameter:
    Task[] tasks = new Task[size];
for (int i = 0; i < size; i++)
{
tasks[i] = Task.Factory.StartNew(ii =>
{
int iii = (int)ii;
for (int j = 0; j < size; j++)
{
int tmp = 0;
for (int k = 0; k < size; k++)
{
tmp += m1[iii, k] * m2[k, j];
}
result[iii, j] = tmp;
}
}, i);
}
Task.WaitAll(tasks);
More on why you may choose to pass the variable as state instead of capturing it, in a future post.
Thursday, October 22, 2009 4:56:14 AM (Pacific Daylight Time, UTC-07:00)
Hi Daniel,

thanks for this nice blog post... I wrote a small demo code using "Task"..it is something like this
Task[] tasks = new Task[10];

for (int i = 0; i < 10; i++)
{
tasks[i] = Task.Factory.StartNew(() =>
{
Console.WriteLine(i);
});
}
Task.WaitAll(tasks);

it is a console application. When I hit ctrl+f5 I get 10 times 10 in different lines but if I add a break point at console.writeline and debug it, I get 0,1,2...9 as an output.. whats the reason behind that???

thanks,

Maulik
Maulik
Sunday, November 01, 2009 6:59:44 PM (Pacific Standard Time, UTC-08:00)
Maulik, with or without breakpoints, the output of your snippet is not guaranteed and it depends on timing. Run it a few times (and maybe even on different machine configurations) and you’ll establish yourself the deterministic behavior. Change your snippet to use a local variable (or pass it as state) and then the results become deterministic (you’ll get 0 to 9, but of course the order will not be guaranteed). For why this happens read the links I provided.
Monday, November 23, 2009 1:45:04 AM (Pacific Standard Time, UTC-08:00)
Hi Daniel,

I don't follow the last example passing state. Surely you need to copy the loop variable before the call to StartNew(), not after it in the body of the statement lambda, thus:

Task[] tasks = new Task[size];
for (int i = 0; i < size; i++)
{
int n = i;
tasks[i] = Task.Factory.StartNew(ii =>
{
for (int j = 0; j < size; j++)
{
int tmp = 0;
for (int k = 0; k < size; k++)
{
tmp += m1[ii, k] * m2[k, j];
}
result[ii, j] = tmp;
}
}, n);
}
Task.WaitAll(tasks);
RandomNoise
Thursday, December 03, 2009 7:36:35 PM (Pacific Standard Time, UTC-08:00)
RandomNoise, sorry I don't follow the thing you don't follow :-|

Maybe if you try to put the two pieces of code in Visual Studio you'll be able to follow my example or better describe what you think is wrong.
Monday, December 07, 2009 2:31:48 AM (Pacific Standard Time, UTC-08:00)
Hi Daniel,

I was querying the purpose of the statement 'int iii = (int)ii;' in your last example, since it looks a lot like the statement 'int i = n;' in the previous example and which was the main point of your post, but happens after the call to StartNew(), so can't be achieving the same purpose.

What I hadn't picked up on was that the Task.Factory.StartNew method only accepts Action<object> delegates, therefore to pass an int as a parameter to the closure one has to pass it as an object and cast it in the body of the statement lambda back to an int, hence the statement 'int iii = (int)ii;'.
RandomNoise
Comments are closed.