About Watchdog

Tue, December 21, 2004, 03:47 PM under MobileAndEmbedded
My experience in this area is with .NET Compact Framework (CF) apps on Windows CE, but the principles apply to other platforms/technologies.

So the requirement is that your application never hangs (e.g. as it would if you touched a UI control from a worker thread) *and* never sits there showing a crash dialog for the user to stare at (as it would if an unhandled exception occurred).

The way we achieve this is by implementing a watchdog. In a sentence, a watchdog monitors your application and, when the application becomes unresponsive, takes a certain action. How does the dog know your application is unresponsive? Your application tells it every X units of time (e.g. every 5 milliseconds or every 30 seconds) that it is OK. How does it tell it? By any mechanism you design, e.g. signaling a named event or calling a library/platform method. What does the watchdog do when it has not been told for some time? It restarts your app, resets the unit or notifies some other resource.

For embedded devices you can get dedicated hardware watchdogs, and some chips (e.g. the XScale) even include watchdog features. WinCE 5.0 offers watchdog APIs, so you could use those rather than roll your own.
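To make the kick mechanism concrete, here is a minimal in-process sketch in Python (not CF code, and not how any particular platform implements it). The `Watchdog` class, its `timeout_s` parameter and the `on_timeout` callback are names I made up for illustration; a `threading.Event` stands in for the named event mentioned above. A real watchdog would normally live outside the process it monitors.

```python
import threading


class Watchdog:
    """Illustrative watchdog: calls on_timeout if not kicked within timeout_s."""

    def __init__(self, timeout_s, on_timeout):
        self._timeout_s = timeout_s
        self._on_timeout = on_timeout      # e.g. restart the app or reset the unit
        self._kicked = threading.Event()   # stands in for a named event
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def kick(self):
        """Called by the application to say "I am OK"."""
        self._kicked.set()

    def stop(self):
        """Shut the watchdog down without triggering its action."""
        self._stopped.set()
        self._kicked.set()                 # wake the watchdog thread so it can exit
        self._thread.join()

    def _run(self):
        while True:
            # Block until a kick arrives or the timeout elapses.
            kicked = self._kicked.wait(self._timeout_s)
            if self._stopped.is_set():
                return
            if not kicked:
                self._on_timeout()         # no kick in time: take the action
                return
            self._kicked.clear()           # kick consumed, wait for the next one
```

The application would create the dog once, call `start()`, and then call `kick()` from its main loop more often than `timeout_s`. If the loop stalls, the callback runs.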

No matter how you implement it, the watchdog itself is usually kept very simple, so that it is not susceptible to locks/crashes of its own!

To recap:
1. You need a watchdog (be it another process, a physical part or whatever), a.k.a. the dog
2. Your watchdogged app tells the dog it is OK at a predefined interval (a.k.a. a kick, pet, tickle, stroke etc)
3. When the dog has not received a kick (or a number of kicks) within the predefined interval(s), it takes some action (usually restarting the app/unit)
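Step 3 mentions tolerating "a number of" missed kicks before acting. A rough sketch of that counting logic follows, assuming the dog wakes up once per interval and calls `check()`. `KickCounter` and `max_missed` are hypothetical names, not part of any platform API.

```python
class KickCounter:
    """Counts consecutive check intervals that passed without a kick."""

    def __init__(self, max_missed):
        self.max_missed = max_missed   # hypothetical tolerance, in intervals
        self._missed = 0
        self._kicked = False

    def kick(self):
        """Called by the watchdogged app to say it is OK."""
        self._kicked = True

    def check(self):
        """Called by the dog once per interval; True means "take action"."""
        if self._kicked:
            self._missed = 0           # a kick arrived, start counting afresh
        else:
            self._missed += 1          # another interval went by in silence
        self._kicked = False
        return self._missed >= self.max_missed
```

With `max_missed=1` this degenerates to the strict case where a single missed interval triggers the action.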

You can make the situation more complex: you can adjust the interval at runtime (depending on what your app is doing), you can have different threads in your app kicking at different intervals, and so on.
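One possible way to model those variations is a small registry in which each thread kicks under its own name and can change its interval as it goes. This is only a sketch under my own assumptions: `KickRegistry`, `kick()` and `overdue()` are invented names, and the injectable `clock` exists purely so the logic can be exercised without waiting.

```python
import threading
import time


class KickRegistry:
    """Tracks several kickers, each with its own adjustable interval."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}   # name -> [interval_s, last_kick_time]

    def kick(self, name, interval_s=None):
        """Record a kick; passing interval_s sets or changes the interval."""
        now = self._clock()
        with self._lock:
            entry = self._entries.get(name)
            if entry is None:
                if interval_s is None:
                    raise ValueError("first kick must supply an interval")
                self._entries[name] = [interval_s, now]
            else:
                if interval_s is not None:
                    entry[0] = interval_s   # adjust the interval at runtime
                entry[1] = now

    def overdue(self):
        """Return the sorted names of kickers that missed their interval."""
        now = self._clock()
        with self._lock:
            return sorted(
                name
                for name, (interval_s, last) in self._entries.items()
                if now - last > interval_s
            )
```

The dog would poll `overdue()` periodically and act if the list is non-empty. A thread about to start a long operation could kick with a larger `interval_s` first and shrink it again afterwards.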

Next time I'll discuss an application of the above principles for a CF app on a CE device.