[CPROVER Manual TOC](../../)

## Example: Buffer Overflows

To give a brief overview of the capabilities of CBMC
we start with a simple example. The issue of *[buffer
overflows](http://en.wikipedia.org/wiki/Buffer_overflow)* has obtained
wide public attention. A buffer is a contiguously allocated chunk of
memory, represented by an array or a pointer in C. Programs written in C
do not provide automatic bounds checking on the buffer, which means a
program can – accidentally or deliberately – write beyond a buffer. The
following example is a perfectly valid C program (in the sense that a
compiler compiles it without any errors):

```C
int main()
{
  int buffer[10];
  buffer[20] = 10;
}
```

However, the write access to an address outside the allocated memory
region can lead to unexpected behavior. In particular, such bugs can be
exploited to overwrite the return address of a function, thus enabling
the execution of arbitrary user-induced code. CBMC is able
to detect this problem and reports that the "upper bound property" of
the buffer has been violated. CBMC is capable of checking these
lower and upper bounds, even for arrays with dynamic size. A detailed
discussion of the properties that CBMC can check
automatically is [here](../../properties/).

## First Steps

We assume you have already installed CBMC and the necessary support
files on your system. If not so, please follow the instructions
[here](../../installation/).

Like a compiler, CBMC takes the names of .c files as command line
arguments. CBMC then translates the program and merges the function
definitions from the various .c files, just like a linker. But instead
of producing a binary for execution, CBMC performs symbolic simulation
on the program.

As an example, consider the following simple program, named
[file1.c](https://raw.githubusercontent.com/diffblue/cbmc/develop/doc/cprover-manual/file1.c):

```C
int puts(const char *s)
{
}

int main(int argc, char **argv)
{
  puts(argv[2]);
  return 0;
}
```

Of course, this program is faulty, as the `argv` array might have fewer
than three elements, and then the array access `argv[2]` is out of
bounds. Now, run CBMC:

```
cbmc file1.c --show-properties --bounds-check --pointer-check
```

The two options `--bounds-check` and `--pointer-check` instruct CBMC to
look for errors related to pointers and array bounds. CBMC will print
the list of properties it checks. Note that it lists, among others, a
property labelled with "pointer outside object bounds in argv" together with the location
of the faulty array access. As you can see, CBMC largely determines the
property it needs to check itself. This is realized by means of a
preliminary static analysis, which relies on computing a fixed point on
various [abstract
domains](http://en.wikipedia.org/wiki/Abstract_interpretation). More
detail on automatically generated properties is provided
[here](../../properties/).

Note that these automatically generated properties need not necessarily
correspond to bugs – these are just *potential* flaws, as abstract
interpretation might be imprecise. Whether these properties hold or
correspond to actual bugs needs to be determined by further analysis.

CBMC performs this analysis using *symbolic simulation*, which
corresponds to a translation of the program into a formula. The formula
is then combined with the property. Let's look at the formula that is
generated by CBMC's symbolic simulation:

```
cbmc file1.c --show-vcc --bounds-check --pointer-check
```

With this option, CBMC performs the symbolic simulation and prints the
verification conditions on the screen. A verification condition needs to
be proven to be valid by a [decision
procedure](http://en.wikipedia.org/wiki/Decision_problem) in order to
assert that the corresponding property holds. Let's run the decision
procedure:

```
cbmc file1.c --bounds-check --pointer-check
```

CBMC transforms the equation you have seen before into CNF and passes it
to a SAT solver (more background on this step is in the book on
[Decision Procedures](http://www.decision-procedures.org/)). It then
determines which of the properties that it has generated for the program
hold and which do not. Using the SAT solver, CBMC detects that the
property for the object bounds of `argv` does not hold, and will display:

```plaintext
[main.pointer_dereference.6] line 7 dereference failure: pointer outside object bounds in argv[(signed long int)2]: FAILURE
```

## Counterexample Traces

Let us have a closer look at this property and why it fails. To aid the
understanding of the problem, CBMC can generate a *counterexample trace*
for failed properties. To obtain this trace, run:

    cbmc file1.c --bounds-check --pointer-check --trace

CBMC then prints a counterexample trace, that is, a program trace that
begins with `main` and ends in a state which violates the property. In
our example, the program trace ends in the faulty array access. It also
gives the values the input variables must have for the bug to occur. In
this example, `argc` must be one to trigger the out-of-bounds array
access. If you add a branch to the example that requires that `argc>=3`,
the bug is fixed and CBMC will report that the proofs of all properties
have been successful.

To simplify further processing of counterexample traces, CBMC supports XML as a
possible output format.

```
    cbmc file1.c --bounds-check --pointer-check --trace --xml-ui
```

## Verifying Modules

In the example above, we used a program that starts with a `main`
function. However, CBMC is aimed at embedded software, and these kinds
of programs usually have different entry points. Furthermore, CBMC is
also useful for verifying program modules. Consider the following
example, called [file2.c](https://raw.githubusercontent.com/diffblue/cbmc/develop/doc/cprover-manual/file2.c):

```C
int array[10];

int sum()
{
  unsigned i, sum;

  sum = 0;
  for(i = 0; i < 10; i++)
    sum += array[i];

  return sum;
}
```

To set the entry point to the `sum` function, run:

```plaintext
cbmc file2.c --function sum --bounds-check
```

It is often necessary to build a suitable *harness* for the function in
order to set up the environment appropriately.

## Loop Unwinding

When running the previous example, you will have noted that CBMC unwinds
the `for` loop in the program. As CBMC performs Bounded Model Checking,
all loops have to have a finite upper run-time bound in order to
guarantee that all bugs are found. CBMC can optionally check that enough
unwinding is performed. As an example, consider the program
[binsearch.c](https://raw.githubusercontent.com/diffblue/cbmc/develop/doc/cprover-manual/binsearch.c):

```C
int binsearch(int x)
{
  int a[16];
  signed low = 0, high = 16;

  while(low < high)
  {
    signed middle = low + ((high - low) >> 1);

    if(a[middle] < x)
      high = middle;
    else if(a[middle] > x)
      low = middle + 1;
    else // a[middle]==x
      return middle;
  }

  return -1;
}
```

If you run CBMC on this function, you will notice that the unwinding
does not stop on its own. The built-in simplifier is not able to
determine a run time bound for this loop. The unwinding bound has to be
given as a command line argument:

```plaintext
cbmc binsearch.c --function binsearch --unwind 6 --bounds-check --unwinding-assertions
```

CBMC verifies that the array accesses are within the bounds;
note that this actually depends on the result of the right shift. In
addition, as CBMC is given the option `--unwinding-assertions`, it also
checks that enough unwinding is done, i.e., it proves a run-time bound.
For any lower unwinding bound, there are traces that require more loop
iterations. Thus, CBMC will report that the unwinding assertion has
failed. As usual, a counterexample trace that documents this can be
obtained with the option `--property`.

## Unbounded Loops

CBMC can also be used for programs with unbounded loops.  In this case, CBMC
is used for bug hunting only; CBMC does not attempt to find all bugs.  The
following program
([lock-example.c](https://raw.githubusercontent.com/diffblue/cbmc/develop/doc/cprover-manual/lock-example.c))
is an example of a program with a user-specified property:

```C
_Bool nondet_bool();
unsigned int nondet_unsigned_int();
_Bool LOCK = 0;

_Bool lock()
{
  if(nondet_bool())
  {
    assert(!LOCK);
    LOCK = 1;
    return 1;
  }

  return 0;
}

void unlock()
{
  assert(LOCK);
  LOCK = 0;
}

int main()
{
  unsigned got_lock = 0;
  unsigned times = nondet_unsigned_int();

  while(times > 0)
  {
    if(lock())
    {
      got_lock++;
      /* critical section */
    }

    if(got_lock != 0)
      unlock();

    got_lock--;
    times--;
  }
}
```

The `while` loop in the `main` function has no (useful) run-time bound.
Thus, a bound has to be set on the amount of unwinding that CBMC
performs. There are two ways to do so:

1.  The `--unwind` command-line parameter can to be used to limit the
    number of times loops are unwound.
2.  The `--depth` command-line parameter can be used to limit the number
    of program steps to be processed.

Given the option `--unwinding-assertions`, CBMC checks whether the
argument to `--unwind` is large enough to cover all program paths. If
the argument is too small, CBMC will detect that not enough unwinding is
done reports that an unwinding assertion has failed.

Reconsider the example. For a loop unwinding bound of one, no bug is
found. But for a bound of two, CBMC detects a trace that
violates an assertion. Without unwinding assertions, or when using the
`--depth` command line switch, CBMC does not prove the program correct,
but it can be helpful to find program bugs. The various command line
options that CBMC offers for loop unwinding are described in the section
on [understanding loop unwinding](../../cbmc/unwinding/).

## A Note About Compilers and the ANSI-C Library

Most C programs make use of functions provided by a library; instances
are functions from the standard ANSI-C library such as `malloc` or
`printf`. The verification of programs that use such functions has two
requirements:

1.  Appropriate header files have to be provided. These header files
    contain *declarations* of the functions that are to be used.
2.  Appropriate *definitions* have to be provided.

Most C compilers come with header files for the ANSI C library
functions. We briefly discuss how to obtain/install these library files.

### Linux

Linux systems that are able to compile software are usually equipped
with the appropriate header files. Consult the documentation of your
distribution on how to install the compiler and the header files. First
try to compile some significant program before attempting to verify it.

### Windows

On Microsoft Windows, CBMC is pre-configured to use the compiler that is
part of Microsoft's Visual Studio. Microsoft's [Visual Studio
Community](http://www.visualstudio.com/en-us/products/visual-studio-community-vs)
is fully featured and available for download for free from the Microsoft
webpage. Visual Studio installs the usual set of header files together
with the compiler. However, the Visual Studio compiler requires a large
set of environment variables to function correctly. It is therefore
required to run CBMC from the *Visual Studio Command Prompt*, which can
be found in the menu *Visual Studio Tools*.

Note that in both cases, only header files are available. CBMC only
comes with a small set of definitions, which includes functions such as
`malloc`. Detailed information about the built-in definitions is
[here](../../goto-cc/).

## Further Reading

-   [Understanding Loop Unwinding](../../cbmc/unwinding/)
-   [Hardware Verification using ANSI-C Programs as a
    Reference](http://www-2.cs.cmu.edu/~svc/papers/view-publications-ck03.html)
-   [Behavioral Consistency of C and Verilog Programs Using Bounded
    Model Checking](http://www-2.cs.cmu.edu/~svc/papers/view-publications-cky03.html)
-   [A Tool for Checking ANSI-C
    Programs](http://www-2.cs.cmu.edu/~svc/papers/view-publications-ckl2004.html)

We also have a [list of interesting applications of
CBMC](http://www.cprover.org/cbmc/applications/).
