Thursday, July 7, 2011

Segmentation Fault

A segmentation fault or bus error occurs when the hardware notifies the operating system about a memory access violation. On receiving the notification, the OS sends a signal (SIGSEGV or SIGBUS) to the process that caused the exception. It is then up to the process receiving the signal to decide how to handle it. By default the process dumps its state into a core file and terminates, but the default action can be overridden by installing a custom signal handler.
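To illustrate overriding the default action, here is a minimal sketch that assumes a POSIX system such as Linux. The names on_segv, run_faulting_child and HANDLED_EXIT_CODE are made up for this example. A child process installs a SIGSEGV handler with sigaction and then writes through a null pointer, and the parent reports how the child ended.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical exit code used to show that our handler ran. */
#define HANDLED_EXIT_CODE 42

/* Custom SIGSEGV handler: only async-signal-safe calls are used here. */
static void on_segv(int signo)
{
    static const char msg[] = "caught SIGSEGV, exiting\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    (void)signo;
    _exit(HANDLED_EXIT_CODE);
}

/*
 * Fork a child that installs the handler and then writes through a null
 * pointer. Returns the child's exit code, or -1 on error or if the child
 * was killed by a signal instead of exiting through the handler.
 */
int run_faulting_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct sigaction sa = {0};
        sa.sa_handler = on_segv;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGSEGV, &sa, NULL) != 0)
            _exit(1);

        volatile int *p = NULL;
        *p = 1;     /* faults: the kernel delivers SIGSEGV to this child */
        _exit(0);   /* not reached when the fault is delivered */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_faulting_child() from main should return HANDLED_EXIT_CODE, because the handler ends the child with _exit before the default action (core dump and termination) can take place. Running the faulting code in a child keeps the parent alive to observe the result.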

There are many ways to get a segmentation fault, at least in lower-level languages such as C and C++. A common way is to dereference a null pointer:
int *p = NULL;
*p = 1;
A segmentation fault also happens when you try to write to a portion of memory that was marked as read-only:
char *str = "Foo"; // The compiler places the string literal in read-only memory
*str = 'b';        // so this write is illegal and typically results in a segfault
Accessing a dangling pointer can also cause a segmentation fault, as in this example:
char *p = NULL;
{
    char c;
    p = &c;
}
// Now p is dangling
The pointer p dangles because it points to the character variable c, which ceased to exist when the block ended. Dereferencing a dangling pointer (for example *p = 'A') is undefined behavior, and a segmentation fault is one possible outcome.


Another cause of a segmentation fault is recursing without a base case, which leads to a stack overflow:
int main() { main(); }
A segmentation fault can also happen when accessing a region of memory that has already been freed.
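One common defensive habit against this last case (not something the post itself prescribes) is to reset a pointer to NULL as soon as its memory is freed, so that a later NULL check can catch accidental reuse. A minimal sketch, where release is a hypothetical helper name:

```c
#include <stdlib.h>

/*
 * Hypothetical helper: free the buffer that *pp points to and reset the
 * caller's pointer to NULL, so a later "if (p != NULL)" check can catch
 * accidental reuse of the freed memory.
 */
void release(int **pp)
{
    if (pp == NULL)
        return;
    free(*pp);      /* free(NULL) is a no-op, so a second call is harmless */
    *pp = NULL;
}
```

This only protects the one pointer variable passed in. Any other copies of the same address still dangle after the call.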

Whenever a segmentation fault happens, a core file is normally written to the current directory of the process. By default, most Linux distros disable core file creation, so if you don't find the core file, run the following command to enable it:
$ ulimit -c unlimited
Core files are generally named core.<pid>, where <pid> is the process ID of the process that dumped core.

The ulimit command above only affects the shell session in which it is run, so the limit will not be set when you start a new terminal. To make it the default for all new terminals, add the command to ~/.bashrc.
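A program can also raise its own core file limit at startup instead of relying on the shell. The sketch below assumes a POSIX system. The function name enable_core_dumps is made up for this example. It raises the soft RLIMIT_CORE limit to the hard limit, which matches "ulimit -c unlimited" only when the hard limit is itself unlimited.

```c
#include <sys/resource.h>

/*
 * Raise the soft core-file size limit to the hard limit for the calling
 * process (child processes started afterwards inherit it).
 * Returns 0 on success, -1 on failure.
 */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;
    lim.rlim_cur = lim.rlim_max;   /* the soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling enable_core_dumps() early in main would be the natural place for it, since the limit has to be in effect before the crash occurs.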

To make sense of the core dump, however, you must compile the program with debugging symbols. Let's take an example program to see how to get the most out of the core dump.
$ cat hello.c
#include <stdio.h>
int
main ()
{
        int *p = NULL;
        *p = 2;
        return 0;
}
Now compile it with debugging symbols enabled:
$ gcc -g -O0 hello.c -o hello
When you run this program, you will get the following error:
$ ./hello
Segmentation fault (core dumped)
You will find that a core file is created in the current directory.

Now load the core file into the debugger (gdb is provided on most distros; install it if it is missing):
$ gdb hello core.5123
It will display the line in the program that caused the error. There are many other gdb options that you can use to debug your issue; we will visit them in future posts.

Some best practices to follow while programming in C/C++ are:
  • Always check for NULL before dereferencing a pointer. Example:

             if (ptr == NULL)
                       //return with error
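  • Also check for NULL after calling dynamic memory allocation functions like malloc, calloc and realloc. (In C++, plain new throws std::bad_alloc on failure instead of returning NULL; only new (std::nothrow) returns a null pointer.) Example: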

            ptr = (int *) malloc (10 * sizeof (int));
            if (ptr == NULL)
                       //return with error
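The two checks above can be put into complete functions. This is a minimal sketch. The names store_checked and make_filled_array are hypothetical, and returning -1 or NULL is only one possible way to "return with error".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Store value through p; returns 0 on success, -1 if p is NULL. */
int store_checked(int *p, int value)
{
    if (p == NULL)
        return -1;          /* return with error instead of crashing */
    *p = value;
    return 0;
}

/*
 * Allocate n ints, each set to fill. Returns NULL if n is 0, if the size
 * computation would overflow, or if malloc fails. The caller must free
 * the returned array.
 */
int *make_filled_array(size_t n, int fill)
{
    if (n == 0 || n > SIZE_MAX / sizeof(int))
        return NULL;

    int *ptr = malloc(n * sizeof *ptr);
    if (ptr == NULL)
        return NULL;        /* allocation failed: report it to the caller */

    for (size_t i = 0; i < n; i++)
        ptr[i] = fill;
    return ptr;
}
```

A caller would then test the result before using it, for example by checking that make_filled_array(10, 0) did not return NULL before indexing into the array.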
References
* http://collectd.org/wiki/index.php/Core_file
* http://stackoverflow.com/questions/2346806/what-is-segmentation-fault
* http://en.wikipedia.org/wiki/Segmentation_fault
