
In computing, a null pointer or null reference is a value reserved for indicating that the pointer or reference does not refer to a valid object. Programs routinely use null pointers to represent conditions such as the end of a list of unknown length or the failure to perform some action; this use of null pointers can be compared to nullable types and to the Nothing value in an option type.
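Both uses mentioned above, marking the end of a list and signalling failure, can be illustrated with a minimal C sketch. The node type and the function names are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical singly linked list node; the names are illustrative. */
struct node {
    int value;
    struct node *next;   /* NULL marks the end of the list */
};

/* Counts the nodes by walking until the null terminator is reached. */
size_t list_length(const struct node *head)
{
    size_t n = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        n++;
    return n;
}

/* Returns the first node holding `value`, or NULL to signal failure. */
const struct node *list_find(const struct node *head, int value)
{
    for (const struct node *p = head; p != NULL; p = p->next)
        if (p->value == value)
            return p;
    return NULL;
}
```

A caller is expected to test the result of list_find against NULL before using it, since a null result carries no node to read from.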
A null pointer should not be confused with an uninitialized pointer: a null pointer is guaranteed to compare unequal to any pointer that points to a valid object, whereas most languages offer no such guarantee for uninitialized pointers. An uninitialized pointer might compare equal to other, valid pointers, or it might compare equal to null pointers. It might do both at different times, or the comparison might be undefined behaviour. In languages that have null pointers, using them correctly depends on the experience of individual developers and on linting tools. Even when used properly, null pointers are semantically incomplete, since they cannot express the difference between "not applicable", "not known", and "future" values.[citation needed]
Because a null pointer does not point to a meaningful object, an attempt to access the data stored at that (invalid) memory location may cause a run-time error or immediate program crash. This is the null pointer error. It is one of the most common types of software weaknesses, and Tony Hoare, who introduced the concept, has referred to it as a "billion dollar mistake".
C
In C, two null pointers of any type are guaranteed to compare equal. The preprocessor macro NULL is defined as an implementation-defined null pointer constant in <stdlib.h> and several other standard headers; in C99 it can be portably expressed as ((void *)0), the integer value 0 converted to the type void * (see pointer to void type). Since C23, a null pointer can also be written as nullptr, which is of type nullptr_t (both originally introduced in C++11), providing a type-safe null pointer.
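The comparison guarantees can be checked with a short sketch. It uses only the pre-C23 spellings (NULL, 0 and (void *)0) so that it compiles with older compilers; the function name is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the comparisons guaranteed by the C standard all hold. */
int null_pointers_compare_equal(void)
{
    int     object = 42;
    int    *op = &object;       /* pointer to a valid object */
    int    *ip = NULL;          /* null pointer to int */
    double *dp = 0;             /* the constant 0 converts to a null pointer */
    void   *vp = (void *)0;     /* portable spelling of a null pointer constant */

    return (void *)ip == (void *)dp   /* null pointers of different types compare equal */
        && vp == ip                   /* void * may be compared with any object pointer */
        && ip == NULL
        && !ip                        /* a null pointer tests false */
        && op != ip;                  /* a valid object never compares equal to null */
}
```

The casts to void * in the first comparison are needed only because int * and double * cannot be compared directly; they do not change whether the pointers are null.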
The C standard does not say that the null pointer is the same as the pointer to memory address 0, though that may be the case in practice. Dereferencing a null pointer is undefined behavior in C, and a conforming implementation is allowed to assume that any pointer that is dereferenced is not null.
In practice, dereferencing a null pointer may result in an attempted read or write from memory that is not mapped, triggering a segmentation fault or memory access violation. This may manifest itself as a program crash, or be transformed into a software exception that can be caught by program code. There are, however, certain circumstances where this is not the case. For example, in x86 real mode, the address 0000:0000 is readable and also usually writable, and dereferencing a pointer to that address is a perfectly valid but typically unwanted action that may lead to undefined but non-crashing behavior in the application; if a null pointer is represented as a pointer to that address, dereferencing it will lead to that behavior. There are occasions when dereferencing a pointer to address zero is intentional and well-defined; for example, BIOS code written in C for 16-bit real-mode x86 devices may write the interrupt vector table (IVT), the real-mode counterpart of the interrupt descriptor table, at physical address 0 of the machine by dereferencing a pointer with the same value as a null pointer for writing. It is also possible for the compiler to optimize away the null pointer dereference, avoiding a segmentation fault but causing other undesired behavior.
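Because a compiler may assume that a dereferenced pointer is not null, the order of the null test and the dereference matters. The following sketch, with a hypothetical helper name, tests first; the risky ordering is shown only inside a comment so that the example itself stays well-defined.

```c
#include <stddef.h>

/* Hypothetical helper: reads *p, or returns `fallback` when p is null. */
int read_or_default(const int *p, int fallback)
{
    if (p == NULL)      /* test first: nothing has been dereferenced yet */
        return fallback;
    return *p;          /* p is known to be non-null here */
}

/*
 * Risky ordering, kept as a comment on purpose:
 *
 *     int v = *p;            // undefined behavior if p is null
 *     if (p == NULL)         // a compiler may assume p != NULL here
 *         return fallback;   // and remove this branch entirely
 *     return v;
 */
```

Whether a particular compiler actually removes the late check depends on the implementation and its optimization settings; the sketch only shows the ordering that avoids relying on it.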
C++
In C++, while the NULL macro was inherited from C, the integer literal for zero has traditionally been preferred to represent a null pointer constant. However, C++11 introduced the explicit null pointer constant nullptr and the type std::nullptr_t to be used instead, providing a type-safe null pointer. nullptr and the type nullptr_t were later introduced to C in C23.
Other languages
In some programming language environments (at least one proprietary Lisp implementation, for example),[citation needed] the value used as the null pointer (called nil in Lisp) may actually be a pointer to a block of internal data useful to the implementation (but not explicitly reachable from user programs), thus allowing the same register to be used as a useful constant and a quick way of accessing implementation internals. This is known as the nil vector.
In languages with a tagged architecture, a possibly null pointer can be replaced with a tagged union which enforces explicit handling of the exceptional case; in fact, a possibly null pointer can be seen as a tagged pointer with a computed tag.
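A tagged union can be approximated in C with an enumeration tag stored next to the payload. This is a minimal sketch with hypothetical names; unlike languages with built-in sum types, C does not force callers to inspect the tag, so the explicit handling is a convention rather than something the compiler enforces.

```c
/* Hypothetical option type for an int; all names here are illustrative. */
enum opt_tag { OPT_NONE, OPT_SOME };

struct opt_int {
    enum opt_tag tag;   /* says which case is active */
    int value;          /* meaningful only when tag == OPT_SOME */
};

struct opt_int opt_none(void)  { return (struct opt_int){ OPT_NONE, 0 }; }
struct opt_int opt_some(int v) { return (struct opt_int){ OPT_SOME, v }; }

/* Converts a decimal digit character, reporting "no result" through the
   tag instead of through a null pointer. */
struct opt_int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return opt_some(c - '0');
    return opt_none();
}

/* Callers inspect the tag before touching the payload. */
int opt_get_or(struct opt_int o, int fallback)
{
    switch (o.tag) {
    case OPT_SOME: return o.value;
    case OPT_NONE: return fallback;
    }
    return fallback;    /* reached only if the tag holds an invalid value */
}
```

For example, opt_get_or(digit_value('7'), -1) yields 7, while opt_get_or(digit_value('x'), -1) yields the fallback -1.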
Programming languages use different literals for the null pointer. In Python, for example, a null value is called None. In Java and C#, the literal null denotes a null value of any reference type. In Pascal and Swift, a null pointer is called nil. In Eiffel, it is called a void reference.
Null dereferencing
Because a null pointer does not point to a meaningful object, an attempt to dereference it (that is, to access the data stored at that memory location) usually, but not always, causes a run-time error or immediate program crash. MITRE lists the null pointer error as one of the most commonly exploited software weaknesses.
- In C, dereferencing a null pointer is undefined behavior. Many implementations cause such code to result in the program being halted with an access violation, because the null pointer representation is chosen to be an address that is never allocated by the system for storing objects. However, this behavior is not universal. It is also not guaranteed, since compilers are permitted to optimize programs under the assumption that they are free of undefined behavior.
- In Delphi and many other Pascal implementations, the constant nil represents a null pointer to the first address in memory, which is also used to initialize managed variables. Dereferencing it raises an external OS exception, which is mapped onto a Pascal EAccessViolation exception instance if the System.SysUtils unit is linked in the uses clause.
- In Java, access to a null reference (null) causes a NullPointerException (NPE), which can be caught by error handling code, but the preferred practice is to ensure that such exceptions never occur.
- In Lisp, nil is a first class object. By convention, (first nil) is nil, as is (rest nil). So dereferencing nil in these contexts will not cause an error, but poorly written code can get into an infinite loop.
- In .NET and C#, access to a null reference (null) causes a NullReferenceException to be thrown. Although catching these is generally considered bad practice, this exception type can be caught and handled by the program.
- In Objective-C, messages may be sent to a nil object (which is a null pointer) without causing the program to be interrupted; the message is simply ignored, and the return value (if any) is nil or 0, depending on the type.
- Before the introduction of Supervisor Mode Access Prevention (SMAP), a null pointer dereference bug could be exploited by mapping page zero into the attacker's address space and hence causing the null pointer to point to that region. This could lead to code execution in some cases.
Mitigation
Although a language can be designed without nulls, most languages allow them, so techniques exist to avoid null pointer dereferences or to aid in debugging them. Bond et al. suggest modifying the Java Virtual Machine (JVM) to keep track of null propagation.
There are three levels of handling null references, in decreasing order of effectiveness:
- languages with no null;
- languages that can statically analyse code to avoid the possibility of null dereference at run time;
- if null dereference can occur at runtime, tools that aid debugging.
Pure functional languages are an example of level 1 since no direct access is provided to pointers and all code and data is immutable. User code running in interpreted or virtual-machine languages generally does not suffer the problem of null pointer dereferencing.[dubious – discuss]
Where a language does provide or use pointers that could be null, run-time null dereferences can be avoided through compile-time checking, via static analysis or other techniques, with syntactic assistance from language features such as void safety in the Eiffel programming language and comparable features in D and Rust.
In some languages analysis can be performed using external tools, but these are weak compared to direct language support with compiler checks since they are limited by the language definition itself.
Level 3 is the last resort: when a null dereference does occur at run time, debugging aids can help locate its cause.
Alternatives to null pointers
As a rule of thumb, for each struct or class type, define objects that represent specific states of the business logic, so that these states replace the undefined behaviour on null. For example, a "future" object can mark a field that is not available yet but is known in advance to become defined later; a "not applicable" object can mark a field in a non-normalized structure; and "error" or "timeout" objects can mark a field that could not be initialized (probably stopping normal execution of the whole program, thread, request or command).
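One way to read this rule of thumb in C is to point at distinct, statically allocated sentinel objects instead of storing a null pointer. The sketch below is an illustration under that assumption; the types, the sentinel names and the status strings are all hypothetical.

```c
/* Hypothetical field type. */
struct address {
    const char *text;
};

/* Hypothetical sentinel objects: each one stands for a distinct state,
   so the pointer in `struct order` never has to be null. */
static const struct address ADDRESS_FUTURE         = { "(to be provided)" };
static const struct address ADDRESS_NOT_APPLICABLE = { "(not applicable)" };

struct order {
    const struct address *ship_to;   /* by convention never NULL */
};

/* Distinguishes the states by comparing against the sentinels' addresses. */
const char *ship_to_status(const struct order *o)
{
    if (o->ship_to == &ADDRESS_FUTURE)
        return "pending";
    if (o->ship_to == &ADDRESS_NOT_APPLICABLE)
        return "not applicable";
    return "known";
}

/* Always safe to call as long as the never-NULL convention is kept,
   because the sentinels are real objects. */
const char *ship_to_text(const struct order *o)
{
    return o->ship_to->text;
}
```

The convention is not checked by the compiler, so code that constructs an order still has to initialize ship_to with either a real address or one of the sentinels.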
History
In 2009, Tony Hoare stated that he invented the null reference in 1965 as part of the ALGOL W language. In that 2009 reference Hoare describes his invention as a "billion-dollar mistake":
I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.
See also
- Memory debugger
- Zero page
Notes
- "CWE-476: NULL Pointer Dereference". MITRE.
- "Null References: The Billion Dollar Mistake". InfoQ. Retrieved 5 September 2024.
- ISO/IEC 9899, clause 6.3.2.3, paragraph 4.
- ISO/IEC 9899, clause 7.17, paragraph 3: NULL... which expands to an implementation-defined null pointer constant...
- ISO/IEC 9899, clause 6.3.2.3, paragraph 3.
- "WG14-N3042: Introduce the nullptr constant". open-std.org. 2022-07-22. Archived from the original on December 24, 2022.
- ISO/IEC 9899, clause 6.5.3.2, paragraph 4, esp. footnote 87.
- Lattner, Chris (2011-05-13). "What Every C Programmer Should Know About Undefined Behavior #1/3". blog.llvm.org. Archived from the original on 2023-06-14. Retrieved 2023-06-14.
- Stroustrup, Bjarne (March 2001). "Chapter 5: The const qualifier (§5.4) prevents accidental redefinition of NULL and ensures that NULL can be used where a constant is required.". The C++ Programming Language (14th printing of 3rd ed.). United States and Canada: Addison–Wesley. p. 88. ISBN 0-201-88954-4.
- "CWE-476: NULL Pointer Dereference". MITRE.
- The Objective-C 2.0 Programming Language, section "Sending Messages to nil".
- "OS X exploitable kernel NULL pointer dereference in AppleGraphicsDeviceControl"
- Bond, Michael D.; Nethercote, Nicholas; Kent, Stephen W.; Guyer, Samuel Z.; McKinley, Kathryn S. (2007). "Tracking bad apples". Proceedings of the 22nd annual ACM SIGPLAN conference on Object oriented programming systems and applications - OOPSLA '07. p. 405. doi:10.1145/1297027.1297057. ISBN 9781595937865. S2CID 2832749.
- "Void-safety: Background, definition, and tools". Retrieved 2021-11-24.
- Bartosz Milewski. "SafeD – D Programming Language". Retrieved 17 July 2014.
- "Fearless Security: Memory Safety". Archived from the original on 8 November 2020. Retrieved 4 November 2020.
- Tony Hoare (2009-08-25). "Null References: The Billion Dollar Mistake". InfoQ.com.
- Tony Hoare (2009-08-25). "Presentation: "Null References: The Billion Dollar Mistake"". InfoQ.com.
References
- Joint Technical Committee ISO/IEC JTC 1, Subcommittee SC 22, Working Group WG 14 (2007-09-08). International Standard ISO/IEC 9899 (PDF) (Committee Draft).