In formal language theory, the empty string, or empty word, is the unique string of length zero.
Formal theory
Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special case where the sequence has length zero, so there are no symbols in the string. There is only one empty string, because two strings are only different if they have different lengths or a different sequence of symbols. In formal treatments, the empty string is denoted with ε or sometimes Λ or λ.
The empty string should not be confused with the empty language ∅, which is a formal language (i.e. a set of strings) that contains no strings, not even the empty string.
The empty string has several properties:
- |ε| = 0. Its string length is zero.
- ε ⋅ s = s ⋅ ε = s. The empty string is the identity element of the concatenation operation. The set of all strings forms a free monoid with respect to ⋅ and ε.
- ε^R = ε. Reversal of the empty string produces the empty string, so the empty string is a palindrome.
- ∀c ∈ s: P(c) holds when s = ε. Any statement about all characters of the empty string is vacuously true, because there are no characters to check.
- The empty string precedes any other string under lexicographical order, because it is the shortest of all strings.
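The properties above can be checked informally in a programming language. The following is a minimal sketch that uses Python's `str` as a stand-in for formal strings; the names `EPSILON` and `s`, and the choice of `isdigit` as the predicate P, are illustrative assumptions rather than part of the formal theory.

```python
# Illustrative checks of the listed properties, using Python's str as a
# stand-in for formal strings (an assumption of this sketch).
EPSILON = ""   # hypothetical name for the empty string
s = "abc"      # arbitrary sample string

length_is_zero = len(EPSILON) == 0
is_identity = (EPSILON + s == s) and (s + EPSILON == s)
is_palindrome = EPSILON[::-1] == EPSILON
# all() over an empty iterable is True: the vacuous-truth case.
vacuous = all(c.isdigit() for c in EPSILON)
# Python compares str values lexicographically by code point.
precedes = all(EPSILON < t for t in ["a", " ", "0", "abc"])

print(length_is_zero, is_identity, is_palindrome, vacuous, precedes)
# prints: True True True True True
```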
In context-free grammars, a production rule that allows a symbol to produce the empty string is known as an ε-production, and the symbol is said to be "nullable".
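As a rough illustration, the nullable symbols of a grammar can be found by a fixed-point iteration. The sketch below assumes the common extended definition, under which a symbol is nullable if it can derive ε in one or more steps (not only through a direct ε-production). The function name `nullable_symbols`, the dictionary encoding of productions, and the sample grammar are all hypothetical.

```python
def nullable_symbols(productions):
    """Return the set of nonterminals that can derive the empty string.

    `productions` maps each nonterminal to a list of right-hand sides,
    each a tuple of symbols; the empty tuple () is an ε-production.
    """
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # all() is True for (), so ε-productions seed the set.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical grammar: S -> A B | a,  A -> ε,  B -> b | A
grammar = {
    "S": [("A", "B"), ("a",)],
    "A": [()],
    "B": [("b",), ("A",)],
}
print(sorted(nullable_symbols(grammar)))  # prints: ['A', 'B', 'S']
```

Here A is nullable directly, B becomes nullable through A, and S through the body A B.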
Use in programming languages
In most programming languages, strings are a data type. Strings are typically stored at distinct memory addresses (locations). Thus, the same string (e.g., the empty string) may be stored in two or more places in memory.
In this way, there could be multiple empty strings in memory, in contrast with the formal theory definition, for which there is only one possible empty string. However, a string comparison function would indicate that all of these empty strings are equal to each other.
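The distinction between identity (location) and equality (content) can be sketched in Python. This example uses `bytearray`, a mutable byte string, because an implementation is free to reuse a single shared object for the immutable empty `str`; that substitution is an assumption made only to guarantee two separate allocations.

```python
# Two separately allocated empty byte strings.
first = bytearray()
second = bytearray()

same_value = first == second    # compared by content
same_object = first is second   # compared by identity (location)
print(same_value, same_object)  # prints: True False

# Changing one does not affect the other, confirming separate storage.
first.extend(b"x")
print(len(first), len(second))  # prints: 1 0
```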
Even a string of length zero can require memory to store it, depending on the format being used. In most programming languages, the empty string is distinct from a null reference (or null pointer) because a null reference points to no string at all, not even the empty string. The empty string is a legitimate string, upon which most string operations should work. Some languages treat some or all of the following in similar ways: empty strings, null references, the integer 0, the floating point number 0, the Boolean value false, the ASCII character NUL, or other such values.
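A short sketch of these points in Python, where the null reference is `None`. The variable names are illustrative, and the behaviour shown is specific to Python; other languages draw these lines differently.

```python
import sys

empty = ""
missing = None

# The empty string is a real object that occupies memory. The exact
# size is implementation-specific, so only its sign is checked here.
occupies_memory = sys.getsizeof(empty) > 0

# String operations work on the empty string...
operations_work = empty.upper() == "" and empty.split(",") == [""]

# ...but not on None, which is no string at all.
try:
    missing.upper()
    none_supports_upper = True
except AttributeError:
    none_supports_upper = False

# Python treats these values alike in a Boolean context (all falsy),
# yet "" and None remain different values.
all_falsy = not any([empty, missing, 0, 0.0, False])
distinct = empty is not None and empty != missing

# A string holding the single character NUL is not empty in Python.
nul_is_truthy = bool("\0")

print(occupies_memory, operations_work, none_supports_upper)
print(all_falsy, distinct, nul_is_truthy)
```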
The empty string is usually represented in the same way as other strings. In implementations that use a string-terminating character (null-terminated strings or plain text lines), the empty string is indicated by the terminating character appearing immediately, with no characters before it.
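To make the terminator convention concrete, the following sketch encodes text the way a null-terminated representation would lay it out in memory. The helper `encode_nul_terminated` is hypothetical and restricted to ASCII for simplicity.

```python
def encode_nul_terminated(text: str) -> bytes:
    """Encode text as a C-style string: its bytes followed by one NUL."""
    data = text.encode("ascii")
    if b"\x00" in data:
        raise ValueError("embedded NUL cannot be represented")
    return data + b"\x00"


print(encode_nul_terminated("hi"))  # prints: b'hi\x00'
print(encode_nul_terminated(""))    # prints: b'\x00'
```

The empty string still occupies one byte in this format: the terminator itself.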
Languages differ in the functions, methods, macros, or idioms they provide for checking whether a string is empty.
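As one example, Python offers several equivalent idioms for strings. The sketch below compares three of them; the helper `is_empty` is a hypothetical name, and rejecting `None` explicitly is a design choice of this example rather than a language requirement.

```python
def is_empty(s: str) -> bool:
    """Return True only for the empty string; None is rejected."""
    if s is None:
        raise TypeError("expected a string, got None")
    return len(s) == 0


for s in ["", " ", "a"]:
    by_truthiness = not s
    by_length = len(s) == 0
    by_equality = s == ""
    # For actual strings, the three idioms agree.
    assert by_truthiness == by_length == by_equality
    print(repr(s), is_empty(s))
# prints:
# '' True
# ' ' False
# 'a' False
```

Note that a string consisting of a single space is not empty, and that the truthiness idiom `not s` would also accept `None`, which is why the helper checks for it separately.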
λ representation | Programming languages |
---|---|
"" | C, C#, C++, Go, Haskell, Java, JavaScript, Julia, Lua, M, Objective-C (as a C string), OCaml, Perl, PHP, Python, Ruby, Scala, Standard ML, Swift, Tcl, Visual Basic .NET |
'' | APL, Delphi, JavaScript, Lua, MATLAB, Pascal, Perl, PHP, Python, R, Ruby, Smalltalk, SQL |
character(0) | R |
{'\0'} | C, C++, Objective-C (as a C string) |
std::string() | C++ |
""s | C++ (since the 2014 standard) |
@"" | Objective-C (as a constant NSString object) |
[NSString string] | Objective-C (as a new NSString object) |
q(), qq() | Perl |
str() | Python |
%{}, %() | Ruby |
String::new() | Rust |
string.Empty | C#, Visual Basic .NET |
String.make 0 '-' | OCaml |
{} | Tcl |
[[]] | Lua |
Representations of the empty string
The empty string is a syntactically valid representation of zero in positional notation (in any base), which does not contain leading zeros. Since the empty string does not have a standard visual representation outside of formal language theory, the number zero is traditionally represented by a single decimal digit 0 instead.
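The claim follows from how positional notation is evaluated: the value is a sum over the digits, and an empty sum is zero. A minimal sketch using Horner's rule is shown below; the function name `positional_value` is hypothetical.

```python
def positional_value(digits: str, base: int = 10) -> int:
    """Evaluate a digit string by Horner's rule; no digits means no terms."""
    value = 0
    for digit in digits:
        value = value * base + int(digit, base)
    return value


print(positional_value("101", 2))  # prints: 5
print(positional_value("ff", 16))  # prints: 255
print(positional_value("", 10))    # prints: 0
print(positional_value("", 2))     # prints: 0
```

In practice, parsers tend to reject this representation; Python's built-in `int("")`, for instance, raises `ValueError`.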
A zero-filled memory area, interpreted as a null-terminated string, is an empty string, because its first byte is already the terminator.
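This can be observed from Python with the standard `ctypes` module, whose `create_string_buffer` allocates a zero-initialised character buffer. The buffer size of 8 is an arbitrary choice for this sketch.

```python
import ctypes

# create_string_buffer(n) allocates n bytes initialised to zero.
buf = ctypes.create_string_buffer(8)

print(buf.raw)    # prints: b'\x00\x00\x00\x00\x00\x00\x00\x00'
print(buf.value)  # prints: b''
```

The `raw` attribute exposes the whole memory area, while `value` reads it as a null-terminated string and therefore stops at the first byte.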
An empty line of text contains the empty string. Empty lines arise from two consecutive end-of-line (EOL) markers, which are common in text files. Text-processing systems such as MediaWiki sometimes use them to separate paragraphs.
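A brief sketch of both observations, assuming `"\n"` as the EOL marker and a made-up sample text:

```python
text = "first paragraph\n\nsecond paragraph\n"

# Splitting on the EOL marker exposes the empty strings between
# consecutive EOLs (and after the final one).
lines = text.split("\n")
print(lines)
# prints: ['first paragraph', '', 'second paragraph', '']

# Two consecutive EOLs act as a paragraph separator.
paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
print(paragraphs)
# prints: ['first paragraph', 'second paragraph']
```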
See also
- Empty set
- Null-terminated string
- Concatenation theory
- String literal