Give a session on C++ AMP – here is how

Wed, September 21, 2011, 06:53 PM under GPGPU | ParallelComputing

2/29/2012 added some Beta notes inline

Ever since presenting on C++ AMP at the AMD Fusion conference in June, then the Gamefest conference in August, and the BUILD conference in September, I've had numerous requests about my material from folks that want to re-deliver the same session. The C++ AMP session I put together has evolved over the 3 presentations to its final form that I used at BUILD, so that is the one I recommend you base yours on.

BUILD session

Please get the slides and the recording from Channel 9 (I'll refer to slide numbers below).

This is how I've been presenting the C++ AMP session:

Context

  1. (slide 3, 04:18-08:18) Start with a demo, on my dual-GPU machine. I've been using the N-Body sample.
  2. (slide 4) Use an NVIDIA slide that has additional examples of performance improvements that customers enjoy with heterogeneous computing.
  3. (slide 5) Talk a bit about the differences today between CPU and GPU hardware, leading to the fact that these will continue to co-exist and that GPUs are great for data parallel algorithms, but not much else today. One is a jack of all trades and the other is a number cruncher.
  4. (slide 6) Use the APU example from AMD, as one indication that the hardware space is still in motion, emphasizing that the C++ AMP solution is a data parallel API, not a GPU API. It has a future-proof design for hardware we have yet to see.
  5. (slide 7) Provide more meta-data, as blogged about when I first introduced C++ AMP.

Code

  1. (slide 9-11) Introduce C++ AMP coding with a simplistic array-addition algorithm – the slides speak for themselves.
  2. (slide 12-13) index, and extent (Beta note: the old slide also refers to a grid class, which we removed in favor of just extent)
  3. (slide 14-16) array, array_view and comparison between them.
  4. (slide 17) parallel_for_each.
  5. (slide 18, 21) restrict.
  6. (slide 19-20) actual restrictions of restrict(amp) – the slides speak for themselves. (Beta note: the slide refers to restrict(direct3d), which is now restrict(amp))
  7. (slide 22) bring it all together with a matrix multiplication example.
  8. (slide 23-24) accelerator, and accelerator_view.
  9. (slide 26-29) Introduce tiling incl. tiled matrix multiplication [tiling probably deserves a whole session instead of 6 minutes!].
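
If you want something to type live that runs on any machine, here is a minimal sketch of the programming model behind steps 2 to 7: an extent describes the compute domain, and a lambda runs once per index in it. It is deliberately NOT C++ AMP code. Index2, Extent2 and for_each_index are hypothetical stand-ins that I made up for concurrency::index<2>, concurrency::extent<2> and concurrency::parallel_for_each, and the loop is serial, so it compiles without <amp.h>. In real C++ AMP the lambda would be marked restrict(amp) and would read and write through array_view objects rather than raw pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for concurrency::index<2> and concurrency::extent<2>.
struct Index2  { int row; int col; };
struct Extent2 { int rows; int cols; };

// Hypothetical serial stand-in for concurrency::parallel_for_each:
// calls the kernel once for every index in the extent. On an accelerator
// these calls would run in parallel and in no guaranteed order.
template <typename Kernel>
void for_each_index(Extent2 e, Kernel kernel) {
    for (int r = 0; r < e.rows; ++r)
        for (int c = 0; c < e.cols; ++c)
            kernel(Index2{r, c});
}

// C (M x N) = A (M x W) * B (W x N), all stored row-major.
std::vector<float> multiply(const std::vector<float>& a,
                            const std::vector<float>& b,
                            int M, int W, int N) {
    assert(a.size() == static_cast<std::size_t>(M) * W);
    assert(b.size() == static_cast<std::size_t>(W) * N);

    std::vector<float> c(static_cast<std::size_t>(M) * N, 0.0f);

    // Raw pointers play the role that array_view plays in C++ AMP:
    // cheap handles that the lambda can capture by value.
    const float* pa = a.data();
    const float* pb = b.data();
    float* pc = c.data();

    // One kernel invocation per output element, like the slide 22 example.
    for_each_index(Extent2{M, N}, [=](Index2 idx) {
        float sum = 0.0f;
        for (int k = 0; k < W; ++k)
            sum += pa[idx.row * W + k] * pb[k * N + idx.col];
        pc[idx.row * N + idx.col] = sum;
    });
    return c;
}
```

The point to make with the audience is that each invocation computes exactly one output element from its own index and never depends on another invocation, which is what lets the real parallel_for_each spread the work across an accelerator.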

IDE

  1. (slide 34, 37) Briefly touch on the concurrency visualizer. It supports GPU profiling, but enhancements specific to C++ AMP arrive in the Beta timeframe.
  2. (slide 35-36, 51:54-59:16) Demonstrate the GPU debugging experience in VS 11.

Summary

  1. (slide 39) Reiterate some of the points of slide 7, and add the point that C++ AMP is an open specification.
  2. (slide 40) Links to content – see slide – including where all your questions should go: http://social.msdn.microsoft.com/Forums/en/parallelcppnative/threads.

Slides for similar presentation updated for Beta

The BUILD recording and slides are valid for the VS 11 Beta and beyond with regard to C++ AMP, so watch the session and download those slides. Additionally, if you are going to repeat the session, I have updated the slides with some tweaks, and you can download the updated deck here (note that the slide numbers above do not map exactly to the new deck).

"But I don't have time for a full blown session, I only need 2 (or just 1, or 3) C++ AMP slides to use in my session on related topic X"

If all you want is a small number of slides, you can take some from the session above and customize them. But because I am so nice, I have created some slides for you, including talking points in the notes section. Download them here.