The way I think about Diagnostic tools

Tue, March 13, 2012, 06:47 PM under Random | SoftwareProcess

Every software has issues, or as we like to call them "bugs". That is not a discussion point, just a mere fact. It follows that an important skill for developers is to be able to diagnose issues in their code. Of course we need to advance our tools and techniques so we can prevent bugs getting into the code (e.g. unit testing), but beyond designing great software, diagnosing bugs is an equally important skill.

To diagnose issues, the most important assets are good techniques, skill, experience, and maybe talent. What also helps is having good diagnostic tools and what helps further is knowing all the features that they offer and how to use them.

The following classification is how I like to think of diagnostics. Note that like with any attempt to bucketize anything, you run into overlapping areas and blurry lines. Nevertheless, I will continue sharing my generalizations ;-)

It is important to identify at the outset if you are dealing with a performance or a correctness issue.

  1. If you have a performance issue, use a profiler. I hear people saying "I am using the debugger to debug a performance issue", and that is fine, but do know that a dedicated profiler is the tool for that job. Just because you don't need them all the time and typically they cost more plus you are not as familiar with them as you are with the debugger, doesn't mean you shouldn't invest in one and instead try to exclusively use the wrong tool for the job. Visual Studio has a profiler and a concurrency visualizer (for profiling multi-threaded apps).
  2. If you have a correctness issue, then you have several options - that's next :-)

This is how I think of identifying a correctness issue

  1. Do you want a tool to find the issue for you at design time? The compiler is such a tool - it gives you an exact list of errors. Compilers now also offer warnings, which is their way of saying "this may be an error, but I am not smart enough to know for sure". There are also static analysis tools, which go a step further than the compiler in identifying issues in your code, sometimes with the aid of code annotations and other times just by pointing them at your raw source. An example is FxCop and much more in Visual Studio 11 Code Analysis.
  2. Do you want a tool to find the issue for you with code execution? Just like static tools, there are also dynamic analysis tools that instead of statically analyzing your code, they analyze what your code does dynamically at runtime. Whether you have to setup some unit tests to invoke your code at runtime, or have to manually run your app (and interact with it) under the tool, or have to use a script to execute your binary under the tool… that varies. The result is still a list of issues for you to address after the analysis is complete or a pause of the execution when the first issue is encountered. If a code path was not taken, no analysis for it will exist, obviously. An example is the GPU Race detection tool that I'll be talking about on the C++ AMP team blog. Another example is the MSR concurrency CHESS tool.
  3. Do you want you to find the issue at design time using a tool? Perform a code walkthrough on your own or with colleagues. There are code review tools that go beyond just diffing sources, and they help you with that aspect too. For example, there is a new one in Visual Studio 11 and searching with my favorite search engine yielded this article based on the Developer Preview.
  4. Do you want you to find the issue with code execution? Use a debugger - let’s break this down further next.

This is how I think of debugging:

  1. There is post mortem debugging. That means your code has executed and you did something in order to examine what happened during its execution. This can vary from manual printf and other tracing statements to trace events (e.g. ETW) to taking dumps. In all cases, you are left with some artifact that you examine after the fact (after code execution) to discern what took place hoping it will help you find the bug. Learn how to debug dump files in Visual Studio.
  2. There is live debugging. I will elaborate on this in a separate post, but this is where you inspect the state of your program during its execution, and try to find what the problem is. More from me in a separate post on live debugging.
  3. There is a hybrid of live plus post-mortem debugging. This is for example what tools like IntelliTrace offer.

If you are a tools vendor interested in the diagnostics space, it helps to understand where in the above classification your tool excels, where its primary strength is, so you can market it as such. Then it helps to see which of the other areas above your tool touches on, and how you can make it even better there. Finally, see what areas your tool doesn't help at all with, and evaluate whether it should or continue to stay clear. Even though the classification helps us think about this space, the reality is that the best tools are either extremely excellent in only one of this areas, or more often very good across a number of them. Another approach is to offer a toolset covering all areas, with appropriate integration and hand off points from one to the other.

Anyway, with that brain dump out of the way, in follow-up posts I will dive into live debugging, and specifically live debugging in Visual Studio - stay tuned if that interests you.