
libsm : Assert and Abort


$Id: assert.html,v 1.6 2001-08-27 21:47:03 ca Exp $

Introduction

This package contains abstractions for assertion checking and abnormal program termination.

Synopsis

#include <sm/assert.h>

/*
**  abnormal program termination
*/

void sm_abort_at(char *filename, int lineno, char *msg);
typedef void (*SM_ABORT_HANDLER)(char *filename, int lineno, char *msg);
void sm_abort_sethandler(SM_ABORT_HANDLER);
void sm_abort(char *fmt, ...);

/*
**  assertion checking
*/

SM_REQUIRE(expression)
SM_ASSERT(expression)
SM_ENSURE(expression)

extern SM_DEBUG_T SmExpensiveRequire;
extern SM_DEBUG_T SmExpensiveAssert;
extern SM_DEBUG_T SmExpensiveEnsure;

#if SM_CHECK_REQUIRE
#if SM_CHECK_ASSERT
#if SM_CHECK_ENSURE

cc -DSM_CHECK_ALL=0 -DSM_CHECK_REQUIRE=1 ...

Abnormal Program Termination

The functions sm_abort and sm_abort_at are used to report a logic bug and terminate the program. They can be invoked directly, and they are also used by the assertion checking macros.

void sm_abort_at(char *filename, int lineno, char *msg)
This is the low level interface for causing abnormal program termination. It is intended to be invoked from a macro, such as the assertion checking macros. If filename != NULL then filename and lineno specify the line of source code on which the logic bug is detected. These arguments are normally either set to __FILE__ and __LINE__ from an assertion checking macro, or they are set to NULL and 0. The default action is to print an error message to smioerr using the arguments, and then call abort(). This default behaviour can be changed by calling sm_abort_sethandler.

void sm_abort_sethandler(SM_ABORT_HANDLER handler)
Install 'handler' as the callback function that is invoked by sm_abort_at. This callback function is passed the same arguments as sm_abort_at, and is expected to log an error message and terminate the program. The callback function should not raise an exception or perform cleanup: see Rationale. sm_abort_sethandler is intended to be called once, from main(), before any additional threads are created: see Rationale. You should not use sm_abort_sethandler to switch back and forth between several handlers; this is particularly dangerous when there are multiple threads, or when you are in a library routine.
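
As a rough sketch of the intended usage, the following shows an application-specific handler that logs where the bug was detected and then terminates without cleanup. So that it compiles without libsm, demo_abort_at and demo_abort_sethandler are hypothetical stand-ins that only mimic the interface described above, and format_abort_message and log_and_abort are illustrative names that are not part of libsm.

```c
#include <stdio.h>
#include <stdlib.h>

/*
**  Hypothetical stand-ins that mimic the documented interface.
**  (The libsm prototypes take char *; const is used here only so the
**  sketch accepts string literals cleanly.)
*/

typedef void (*DEMO_ABORT_HANDLER)(const char *filename, int lineno,
                                   const char *msg);

static void demo_default_handler(const char *filename, int lineno,
                                 const char *msg);
static DEMO_ABORT_HANDLER demo_handler = demo_default_handler;

/* Build "file:line: msg", or just "msg" when no location is known. */
int
format_abort_message(char *buf, size_t size, const char *filename,
                     int lineno, const char *msg)
{
    if (filename != NULL)
        return snprintf(buf, size, "%s:%d: %s", filename, lineno, msg);
    return snprintf(buf, size, "%s", msg);
}

/* Default action: print the message, then abort. */
static void
demo_default_handler(const char *filename, int lineno, const char *msg)
{
    char buf[512];

    format_abort_message(buf, sizeof buf, filename, lineno, msg);
    fprintf(stderr, "%s\n", buf);
    abort();
}

/* Intended to be called once, early in main(), before threads exist. */
void
demo_abort_sethandler(DEMO_ABORT_HANDLER handler)
{
    demo_handler = handler;
}

void
demo_abort_at(const char *filename, int lineno, const char *msg)
{
    (*demo_handler)(filename, lineno, msg);
    abort();    /* backstop in case the handler returns */
}

/* Application handler: tag the message and terminate; no cleanup. */
void
log_and_abort(const char *filename, int lineno, const char *msg)
{
    char buf[512];

    format_abort_message(buf, sizeof buf, filename, lineno, msg);
    fprintf(stderr, "myapp: logic bug: %s\n", buf);
    abort();
}
```

Under these assumptions, a program would call demo_abort_sethandler(log_and_abort) once near the top of main() and never switch handlers afterwards.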

void sm_abort(char *fmt, ...)
This is the high level interface for causing abnormal program termination. It takes printf arguments. There is no need to include a trailing newline in the format string; a trailing newline will be printed if appropriate by the handler function.
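
To illustrate how a printf-style interface can be layered on the low level one, here is a minimal sketch. It is not the libsm implementation: demo_abort, demo_abort_at and demo_format are hypothetical names, and the fixed 1024 byte buffer is an assumption made to keep the example short.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical low level reporter standing in for sm_abort_at. */
void
demo_abort_at(const char *filename, int lineno, const char *msg)
{
    if (filename != NULL)
        fprintf(stderr, "%s:%d: ", filename, lineno);
    fprintf(stderr, "%s\n", msg);    /* the reporter supplies the newline */
    abort();
}

/* Expand a printf-style format into buf without terminating. */
int
demo_format(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}

/* High level entry point: format the message, then report and abort. */
void
demo_abort(const char *fmt, ...)
{
    char msg[1024];
    va_list ap;

    va_start(ap, fmt);
    (void) vsnprintf(msg, sizeof msg, fmt, ap);
    va_end(ap);

    /* No source location is known here, so pass NULL and 0. */
    demo_abort_at(NULL, 0, msg);
}
```

Because the reporter adds the newline itself, a call such as demo_abort("impossible value %d for foo", foo) needs no trailing "\n" in the format string.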

Assertions

The assertion handling package supports a style of programming in which assertions are used liberally throughout the code, both as a form of documentation, and as a way of detecting bugs in the code by performing runtime checks.

There are three kinds of assertion:

SM_REQUIRE(expr)
This is an assertion used at the beginning of a function to check that the preconditions for calling the function have been satisfied by the caller.

SM_ENSURE(expr)
This is an assertion used just before returning from a function to check that the function has satisfied all of the postconditions that it is required to satisfy by its contract with the caller.

SM_ASSERT(expr)
This is an assertion that is used in the middle of a function, to check loop invariants, and for any other kind of check that is not a "require" or "ensure" check.

If any of the above assertion macros fails, then sm_abort_at is called. By default, a message is printed to stderr and the program is aborted. For example, if SM_REQUIRE(arg > 0) fails because arg <= 0, then the message

    foo.c:47: SM_REQUIRE(arg > 0) failed

is printed to stderr, and abort() is called. You can change this default behaviour using sm_abort_sethandler.
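
The three kinds of assertion can be seen side by side in a small function. The sketch below does not use the real macros: DEMO_REQUIRE, DEMO_ASSERT and DEMO_ENSURE are hypothetical stand-ins, defined here so the example compiles without libsm, and demo_abort_at and isqrt are illustrative names. The stand-ins show one plausible way a macro can pass __FILE__, __LINE__ and the text of the expression to the abort routine.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical failure hook standing in for sm_abort_at. */
void
demo_abort_at(const char *filename, int lineno, const char *msg)
{
    fprintf(stderr, "%s:%d: %s\n", filename, lineno, msg);
    abort();
}

/* Stand-in macros: stringify the expression and record the call site. */
#define DEMO_CHECK(kind, expr) \
    ((expr) ? (void) 0 \
            : demo_abort_at(__FILE__, __LINE__, kind "(" #expr ") failed"))
#define DEMO_REQUIRE(expr)  DEMO_CHECK("DEMO_REQUIRE", expr)
#define DEMO_ASSERT(expr)   DEMO_CHECK("DEMO_ASSERT", expr)
#define DEMO_ENSURE(expr)   DEMO_CHECK("DEMO_ENSURE", expr)

/* Integer square root: the largest r such that r * r <= n. */
int
isqrt(int n)
{
    int r = 0;

    /* Precondition: the caller must pass a non-negative value. */
    DEMO_REQUIRE(n >= 0);

    while ((long long) (r + 1) * (r + 1) <= n)
    {
        ++r;

        /* Loop invariant: r never overshoots the root. */
        DEMO_ASSERT((long long) r * r <= n);
    }

    /* Postcondition: r is the floor of the square root of n. */
    DEMO_ENSURE((long long) r * r <= n &&
                (long long) (r + 1) * (r + 1) > n);
    return r;
}
```

With these stand-ins, calling isqrt(-1) would report the failed DEMO_REQUIRE together with its file and line, and then abort.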

How To Disable Assertion Checking At Compile Time

You can use compile time macros to selectively enable or disable each of the three kinds of assertions, for performance reasons. For example, you might want to enable SM_REQUIRE checking (because it finds the most bugs), but disable the other two types.

By default, all three types of assertion are enabled. You can selectively disable individual assertion types by setting one or more of the following cpp macros to 0 before <sm/assert.h> is included for the first time:

SM_CHECK_REQUIRE
SM_CHECK_ENSURE
SM_CHECK_ASSERT

Or, you can define SM_CHECK_ALL as 0 to disable all assertion types, then selectively define one or more of SM_CHECK_REQUIRE, SM_CHECK_ENSURE or SM_CHECK_ASSERT as 1. For example, to disable all assertions except for SM_REQUIRE, you can use these C compiler flags:

    -DSM_CHECK_ALL=0 -DSM_CHECK_REQUIRE=1

After <sm/assert.h> is included, the macros SM_CHECK_REQUIRE, SM_CHECK_ENSURE and SM_CHECK_ASSERT are each set to either 0 or 1.
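
The defaulting rules above can be pictured with a few lines of preprocessor logic. This is only a sketch of how a header might resolve the settings, not a copy of <sm/assert.h>: every DEMO_ name, along with demo_fail and half, is hypothetical, and the two #define lines at the top stand in for compiler flags.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for "-DDEMO_CHECK_ALL=0 -DDEMO_CHECK_REQUIRE=1". */
#define DEMO_CHECK_ALL      0
#define DEMO_CHECK_REQUIRE  1

/* Anything not set explicitly inherits the value of DEMO_CHECK_ALL. */
#ifndef DEMO_CHECK_ALL
# define DEMO_CHECK_ALL     1
#endif
#ifndef DEMO_CHECK_REQUIRE
# define DEMO_CHECK_REQUIRE DEMO_CHECK_ALL
#endif
#ifndef DEMO_CHECK_ENSURE
# define DEMO_CHECK_ENSURE  DEMO_CHECK_ALL
#endif
#ifndef DEMO_CHECK_ASSERT
# define DEMO_CHECK_ASSERT  DEMO_CHECK_ALL
#endif

/* Hypothetical failure routine. */
void
demo_fail(const char *text)
{
    fprintf(stderr, "%s failed\n", text);
    abort();
}

/* A disabled check expands to an expression that does nothing. */
#if DEMO_CHECK_REQUIRE
# define DEMO_REQUIRE(expr) \
    ((expr) ? (void) 0 : demo_fail("DEMO_REQUIRE(" #expr ")"))
#else
# define DEMO_REQUIRE(expr) ((void) 0)
#endif

#if DEMO_CHECK_ENSURE
# define DEMO_ENSURE(expr) \
    ((expr) ? (void) 0 : demo_fail("DEMO_ENSURE(" #expr ")"))
#else
# define DEMO_ENSURE(expr)  ((void) 0)
#endif

/* Halve an even number. */
int
half(int n)
{
    int r;

    DEMO_REQUIRE(n % 2 == 0);   /* still checked in this configuration */
    r = n / 2;
    DEMO_ENSURE(r * 2 == n);    /* compiled out in this configuration */
    return r;
}
```

In this configuration only the precondition check generates code; the postcondition check costs nothing at run time.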

How To Write Complex or Expensive Assertions

Sometimes an assertion check requires more code than a simple boolean expression. For example, it might require an entire statement block with its own local variables. You can code such assertion checks by making them conditional on SM_CHECK_REQUIRE, SM_CHECK_ENSURE or SM_CHECK_ASSERT, and using sm_abort to signal failure.

Sometimes an assertion check is significantly more expensive than one or two comparisons. In such cases, it is not uncommon for developers to comment out the assertion once the code is unit tested. Please don't do this: it makes it hard to turn the assertion check back on for the purposes of regression testing. What you should do instead is make the assertion check conditional on one of these predefined debug objects:

SmExpensiveRequire
SmExpensiveAssert
SmExpensiveEnsure

By doing this, you bring the cost of the assertion checking code back down to a single comparison, unless expensive assertion checking has been explicitly enabled. By the way, the corresponding debug category names are

sm_check_require
sm_check_assert
sm_check_ensure

What activation level should you check for? Higher levels correspond to more expensive assertion checks. Here are some basic guidelines:

level 1: < 10 basic C operations
level 2: < 100 basic C operations
level 3: < 1000 basic C operations
...

Here's a contrived example of both techniques:

void
w_munge(WIDGET *w)
{
    SM_REQUIRE(w != NULL);
#if SM_CHECK_REQUIRE
    /*
    **  We run this check at level 3 because we expect to check a few hundred
    **  table entries.
    */

    if (sm_debug_active(&SmExpensiveRequire, 3))
    {
        int i;

        for (i = 0; i < WIDGET_MAX; ++i)
        {
            if (w[i] == NULL)
                sm_abort("w_munge: NULL entry %d in widget table", i);
        }
    }
#endif /* SM_CHECK_REQUIRE */

    /* ... the body of w_munge follows ... */
}

Other Guidelines

You should resist the urge to write SM_ASSERT(0) when the code has reached an impossible place. It's better to call sm_abort, because then you can generate a better error message. For example,
switch (foo)
{
    ...
  default:
    sm_abort("impossible value %d for foo", foo);
}
Note that I did not bother to guard the default clause of the switch statement with #if SM_CHECK_ASSERT ... #endif, because there is probably no performance gain to be had by disabling this particular check.

Avoid including code that has side effects inside of assert macros, or inside of SM_CHECK_* guards. You don't want the program to stop working if assertion checking is disabled.
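
The following sketch shows why side effects inside an assertion are risky. It simulates a build with checking disabled by defining a hypothetical DEMO_CHECK_ASSERT as 0; DEMO_ASSERT, struct stack, stack_pop, drop_top_risky and drop_top_safe are all illustrative names, not part of libsm.

```c
#include <stdlib.h>

/* Simulate a build in which this kind of check is disabled. */
#define DEMO_CHECK_ASSERT 0

#if DEMO_CHECK_ASSERT
# define DEMO_ASSERT(expr)  ((expr) ? (void) 0 : abort())
#else
# define DEMO_ASSERT(expr)  ((void) 0)
#endif

struct stack
{
    int items[8];
    int count;
};

/* Remove and return the top item, or -1 if the stack is empty. */
int
stack_pop(struct stack *s)
{
    if (s->count == 0)
        return -1;
    return s->items[--s->count];
}

/*
**  Risky: the pop is part of the asserted expression, so it vanishes
**  when the check is compiled out.  Returns the remaining item count.
*/

int
drop_top_risky(struct stack *s)
{
    DEMO_ASSERT(stack_pop(s) >= 0);
    return s->count;
}

/*
**  Safer: do the work unconditionally, then assert on the result.
**  Returns the remaining item count.
*/

int
drop_top_safe(struct stack *s)
{
    int v = stack_pop(s);

    DEMO_ASSERT(v >= 0);
    (void) v;   /* avoid an unused-variable warning when checks are off */
    return s->count;
}
```

With checking disabled, drop_top_risky leaves the stack untouched while drop_top_safe still removes the top item, so only the second version behaves the same in both builds.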

Rationale for Logic Bug Handling

When a logic bug is detected, our philosophy is to log an error message and terminate the program, dumping core if possible. It is not a good idea to raise an exception, attempt cleanup, or continue program execution. Here's why.

First of all, to facilitate post-mortem analysis, we want to dump core on detecting a logic bug, disturbing the process image as little as possible before dumping core. We don't want to raise an exception and unwind the stack, executing cleanup code, before dumping core, because that would obliterate information we need to analyze the cause of the abort.

Second, it is a bad idea to raise an exception on an assertion failure because this places unacceptable restrictions on code that uses the assertion macros. The reason is this: the sendmail code must be written so that anywhere it is possible for an exception to be raised, the code will catch the exception and clean up if necessary, restoring data structure invariants and freeing resources as required. If an assertion failure were signalled by raising an exception, then every time you added an assertion, you would need to check both the function containing the assertion and its callers to see if any exception handling code needed to be added to clean up properly on assertion failure. That is far too great a burden.

It is a bad idea to attempt cleanup upon detecting a logic bug for several reasons:

Here is a strategy for making sendmail fault tolerant. Sendmail is structured as a collection of processes. The "root" process does as little as possible, except spawn children to do all of the real work, monitor the children, and act as traffic cop. We use exceptions to signal expected but infrequent error conditions, so that the process encountering the exceptional condition can clean up and keep going. (Worker processes are intended to be long lived, in order to minimize forking and increase performance.) But when a bug is detected in a sendmail worker process, the worker process does minimal or no cleanup and then dies. A bug might be detected in several ways: the process might dereference a NULL pointer, receive a signal 11, core dump and die, or an assertion might fail, in which case the process commits suicide. Either way, the root process detects the death of the worker, logs the event, and spawns another worker.
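
The supervision idea can be sketched with fork and waitpid. This is a simplified illustration under POSIX assumptions, not sendmail's actual process structure: supervise_once, supervise, worker_ok and worker_buggy are hypothetical names, and each worker here runs a single task rather than being long lived.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum worker_fate { WORKER_EXITED, WORKER_CRASHED, WORKER_UNKNOWN };

/* Run one worker in a child process and report how it ended. */
enum worker_fate
supervise_once(void (*work)(void))
{
    pid_t pid;
    int status;

    pid = fork();
    if (pid < 0)
        return WORKER_UNKNOWN;
    if (pid == 0)
    {
        (*work)();
        _exit(0);   /* normal completion of the worker */
    }
    if (waitpid(pid, &status, 0) != pid)
        return WORKER_UNKNOWN;
    if (WIFSIGNALED(status))
        return WORKER_CRASHED;  /* e.g. SIGABRT from abort(), or SIGSEGV */
    return WIFEXITED(status) ? WORKER_EXITED : WORKER_UNKNOWN;
}

/* A worker that finishes normally. */
void
worker_ok(void)
{
}

/* A worker that detects a logic bug and dies without cleanup. */
void
worker_buggy(void)
{
    abort();
}

/*
**  Root loop: respawn the worker after each crash, giving up once
**  max_restarts restarts have also crashed.  Returns the number of
**  crashes observed.
*/

int
supervise(void (*work)(void), int max_restarts)
{
    int crashes = 0;

    while (supervise_once(work) == WORKER_CRASHED)
    {
        /* A real root process would log the event here. */
        if (++crashes > max_restarts)
            break;
    }
    return crashes;
}
```

The root process does no work of its own, so a crash in a worker cannot corrupt the state the root uses to decide whether to spawn a replacement.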

Rationale for Naming Conventions

The names "require" and "ensure" come from the writings of Bertrand Meyer, a prominent evangelist for assertion checking who has written a number of papers about the "Design By Contract" programming methodology, and who created the Eiffel programming language. Many other assertion checking packages for C also have "require" and "ensure" assertion types. In short, we are conforming to a de facto standard.

We use the names SM_REQUIRE, SM_ASSERT and SM_ENSURE in preference to REQUIRE, ASSERT and ENSURE because at least two other open source libraries (libisc and libnana) define REQUIRE and ENSURE macros, and many libraries define ASSERT. We want to avoid name conflicts with other libraries.