![Units of information](https://www.english.nina.az/wikipedia/image/aHR0cHM6Ly91cGxvYWQud2lraW1lZGlhLm9yZy93aWtpcGVkaWEvY29tbW9ucy90aHVtYi81LzVlL1VuaXRzX29mX2luZm9ybWF0aW9uLnN2Zy8xNjAwcHgtVW5pdHNfb2ZfaW5mb3JtYXRpb24uc3ZnLnBuZw==.png )
A unit of information is any unit of measure of digital data size. In digital computing, a unit of information is used to describe the capacity of a digital data storage device. In telecommunications, a unit of information is used to describe the throughput of a communication channel. In information theory, a unit of information is used to measure information contained in messages and the entropy of random variables.
Because data sizes range from very small to very large, units of information span many orders of magnitude. Each unit is defined as a multiple of a smaller unit, except for the smallest unit, which is based on convention and hardware design. Multiplier prefixes are used to describe relatively large sizes.
For binary hardware, by far the most common hardware today, the smallest unit is the bit, a portmanteau of binary digit, which represents one of two possible values, typically shown as 0 and 1. The nibble, 4 bits, represents the value of a single hexadecimal digit. The byte, 8 bits or 2 nibbles, is possibly the most commonly known and used base unit to describe data size. The word is a unit whose size depends on, and has special importance for, a particular hardware context. On modern hardware, a word is typically 2, 4 or 8 bytes, but the size varies dramatically on older hardware. Larger sizes can be expressed as multiples of a base unit via SI metric prefixes (powers of ten) or the newer and generally more accurate IEC binary prefixes (powers of two).
Information theory
![Comparison of units of information: bit, trit, nat, ban. Quantity of information is the height of bars; the dark green level is the nat unit.](https://www.english.nina.az/wikipedia/image/aHR0cHM6Ly93d3cuZW5nbGlzaC5uaW5hLmF6L3dpa2lwZWRpYS9pbWFnZS9hSFIwY0hNNkx5OTFjR3h2WVdRdWQybHJhVzFsWkdsaExtOXlaeTkzYVd0cGNHVmthV0V2WTI5dGJXOXVjeTkwYUhWdFlpODFMelZsTDFWdWFYUnpYMjltWDJsdVptOXliV0YwYVc5dUxuTjJaeTh6T0RSd2VDMVZibWwwYzE5dlpsOXBibVp2Y20xaGRHbHZiaTV6ZG1jdWNHNW4ucG5n.png)
In 1928, Ralph Hartley observed a fundamental storage principle, which was further formalized by Claude Shannon in 1945: the information that can be stored in a system is proportional to the logarithm of the number $N$ of possible states of that system, denoted $\log_b N$. Changing the base of the logarithm from $b$ to a different number $c$ has the effect of multiplying the value of the logarithm by a fixed constant, namely $\log_c N = (\log_c b) \log_b N$. Therefore, the choice of the base $b$ determines the unit used to measure information. In particular, if $b$ is an integer greater than 1, then the unit is the amount of information that can be stored in a system with $b$ possible states.
When $b$ is 2, the unit is the shannon, equal to the information content of one "bit". A system with 8 possible states, for example, can store up to $\log_2 8 = 3$ bits of information. Other units that have been named include:
- Base $b = 3$: the unit is called the "trit", and is equal to $\log_2 3$ (≈ 1.585) bits.
- Base $b = 10$: the unit is called the decimal digit, hartley, ban, decit, or dit, and is equal to $\log_2 10$ (≈ 3.322) bits.
- Base $b = e$, the base of natural logarithms: the unit is called the nat, nit, or nepit (from Neperian), and is equal to $\log_2 e$ (≈ 1.443) bits.
The trit, ban, and nat are rarely used to measure storage capacity; but the nat, in particular, is often used in information theory, because natural logarithms are mathematically more convenient than logarithms in other bases.
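The relationship between the logarithm base and the unit can be illustrated with a short sketch. The function names `capacity` and `convert` and the table `UNIT_IN_BITS` are hypothetical, chosen only for this example; the code simply applies the change-of-base rule described above.

```python
import math

def capacity(n_states, base=2):
    """Information capacity of a system with n_states states,
    measured in the unit that corresponds to logarithm base `base`."""
    # Change of base: log_base(N) = log2(N) / log2(base)
    return math.log2(n_states) / math.log2(base)

# Size of each named unit expressed in bits: log2(b) for base b.
UNIT_IN_BITS = {
    "bit": math.log2(2),
    "trit": math.log2(3),
    "ban": math.log2(10),
    "nat": math.log2(math.e),
}

def convert(value, from_unit, to_unit):
    """Convert an amount of information between two named units."""
    return value * UNIT_IN_BITS[from_unit] / UNIT_IN_BITS[to_unit]

print(capacity(8))                 # capacity of 8 states in bits
print(capacity(8, base=math.e))    # the same capacity in nats
for name, bits in UNIT_IN_BITS.items():
    print(f"1 {name} = {bits:.3f} bits")
```

Expressing every unit in bits first keeps the conversion table small: any pair of units can be converted by one multiplication and one division.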
Units derived from bit
Several conventional names are used for collections or groups of bits.
Byte
Historically, a byte was the number of bits used to encode a character of text in the computer, which depended on computer hardware architecture, but today it almost always means eight bits – that is, an octet. An 8-bit byte can represent 256 ($2^8$) distinct values, such as non-negative integers from 0 to 255, or signed integers from −128 to 127. The IEEE 1541-2002 standard specifies "B" (upper case) as the symbol for byte (IEC 80000-13 uses "o" for octet in French, but also allows "B" in English). Bytes, or multiples thereof, are almost always used to specify the sizes of computer files and the capacity of storage units. Most modern computers and peripheral devices are designed to manipulate data in whole bytes or groups of bytes, rather than individual bits.
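As a minimal sketch of the value ranges mentioned above, the following computes the unsigned and signed ranges of an 8-bit byte. The signed range of −128 to 127 assumes two's complement representation, which the text does not state explicitly; the helper names are hypothetical.

```python
BITS_PER_BYTE = 8  # the modern 8-bit byte (octet)

def unsigned_range(bits):
    """Smallest and largest non-negative integer representable in `bits` bits."""
    return 0, 2**bits - 1

def signed_range(bits):
    """Smallest and largest integer in `bits` bits, assuming two's complement."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

def reinterpret_as_signed(value):
    """Read the bit pattern of an unsigned byte value as a two's complement integer."""
    raw = value.to_bytes(1, "big")
    return int.from_bytes(raw, "big", signed=True)

print(2**BITS_PER_BYTE)               # number of distinct values
print(unsigned_range(BITS_PER_BYTE))
print(signed_range(BITS_PER_BYTE))
print(reinterpret_as_signed(0xFF))    # same bit pattern as unsigned 255
```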
Nibble
A group of four bits, or half a byte, is sometimes called a nibble, nybble or nyble. This unit is most often used in the context of hexadecimal number representations, since a nibble has the same number of possible values as one hexadecimal digit has.
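The correspondence between nibbles and hexadecimal digits can be sketched as follows. The function names are hypothetical; the code only relies on the fact that a byte consists of a high and a low group of four bits.

```python
def split_nibbles(byte):
    """Split an 8-bit value into its (high, low) 4-bit nibbles."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in one 8-bit byte")
    return byte >> 4, byte & 0x0F

def join_nibbles(high, low):
    """Combine two 4-bit nibbles back into one 8-bit value."""
    return (high << 4) | low

high, low = split_nibbles(0xA7)
# Each nibble prints as exactly one hexadecimal digit.
print(f"{high:X} {low:X}")
print(hex(join_nibbles(high, low)))
```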
Word, block, and page
Computers usually manipulate bits in groups of a fixed size, conventionally called words. The number of bits in a word is usually defined by the size of the registers in the computer's CPU, or by the number of data bits that are fetched from its main memory in a single operation. In the IA-32 architecture, more commonly known as x86-32, a word is 32 bits, but other past and current architectures use words with 4, 8, 9, 12, 13, 16, 18, 20, 21, 22, 24, 25, 29, 30, 31, 32, 33, 35, 36, 38, 39, 40, 42, 44, 48, 50, 52, 54, 56, 60, 64, 72 bits or others.
Some machine instructions and computer number formats use two words (a "double word" or "dword"), or four words (a "quad word" or "quad").
Computer memory caches usually operate on blocks of memory that consist of several consecutive words. These units are customarily called cache blocks, or, in CPU caches, cache lines.
Virtual memory systems partition the computer's main storage into even larger units, traditionally called pages.
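To make the idea of blocks and pages concrete, the sketch below maps byte addresses to cache lines and pages. The sizes are assumptions for illustration only: a 4096-byte page (a common size on x86 and many other architectures) and a 64-byte cache line (typical of many current CPUs, but not stated in this article). The function names are hypothetical.

```python
CACHE_LINE_BYTES = 64   # assumption: a typical cache line size
PAGE_BYTES = 4096       # assumption: a common page size

def locate(address, unit_bytes):
    """Return (unit index, offset inside the unit) for a byte address."""
    return divmod(address, unit_bytes)

def units_spanned(start, length, unit_bytes):
    """Number of units touched by the byte range [start, start + length)."""
    if length <= 0:
        return 0
    first = start // unit_bytes
    last = (start + length - 1) // unit_bytes
    return last - first + 1

print(locate(4100, PAGE_BYTES))                 # page index and offset
print(units_spanned(60, 8, CACHE_LINE_BYTES))   # an 8-byte value crossing a line boundary
print(PAGE_BYTES // CACHE_LINE_BYTES)           # cache lines per page under these assumptions
```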
Multiplicative prefixes
A unit for a large amount of data can be formed using either a metric or binary prefix with a base unit. For storage, the base unit is typically byte. For communication throughput, a base unit of bit is common. For example, using the metric kilo prefix, a kilobyte is 1000 bytes and a kilobit is 1000 bits.
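Because storage sizes are usually given in bytes and throughput in bits, a factor of 8 appears whenever the two are combined. The following sketch estimates an idealized transfer time; it assumes 8-bit bytes and metric prefixes, and ignores protocol overhead. The function name is hypothetical.

```python
KILO = 1000
MEGA = 1000**2

BITS_PER_BYTE = 8  # assumes the modern 8-bit byte

def transfer_seconds(size_bytes, rate_bits_per_second):
    """Idealized time to move size_bytes over a channel of the given bit rate."""
    return size_bytes * BITS_PER_BYTE / rate_bits_per_second

print(1 * KILO, "bytes in a kilobyte;", 1 * KILO, "bits in a kilobit")
# A 100 MB file over a 100 Mbit/s link takes 8 seconds, not 1.
print(transfer_seconds(100 * MEGA, 100 * MEGA))
```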
Metric prefixes are commonly used, but often inaccurately, since binary storage hardware is organized with capacities that are powers of 2, not powers of 10. In computing, the metric prefixes are therefore often intended to mean something other than their standard values. For example, a kilobyte is often taken to mean 1024 bytes even though the standard meaning of kilo is 1000. Likewise, mega normally means one million, but in computing it is often used to mean $2^{20}$ = 1,048,576. The table below compares the standard metric sizes with the implied binary sizes.
Symbol | Prefix | Metric size | Binary size | Size difference |
---|---|---|---|---|
k | kilo | 1000 | 1024 | 2.40% |
M | mega | $1000^{2}$ | $1024^{2}$ | 4.86% |
G | giga | $1000^{3}$ | $1024^{3}$ | 7.37% |
T | tera | $1000^{4}$ | $1024^{4}$ | 9.95% |
P | peta | $1000^{5}$ | $1024^{5}$ | 12.59% |
E | exa | $1000^{6}$ | $1024^{6}$ | 15.29% |
Z | zetta | $1000^{7}$ | $1024^{7}$ | 18.06% |
Y | yotta | $1000^{8}$ | $1024^{8}$ | 20.89% |
R | ronna | $1000^{9}$ | $1024^{9}$ | 23.79% |
Q | quetta | $1000^{10}$ | $1024^{10}$ | 26.77% |
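The size differences in the table follow directly from the ratio of the binary size to the metric size. As a small sketch (the function name is hypothetical), they can be recomputed like this:

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta",
            "exa", "zetta", "yotta", "ronna", "quetta"]

def size_difference_percent(power):
    """Percentage by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100

for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:7} {size_difference_percent(power):6.2f}%")
```

The gap grows with each prefix because the 2.4% difference compounds: each step multiplies the ratio by another factor of 1.024.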
The International Electrotechnical Commission (IEC) issued a standard that introduces binary prefixes that accurately represent binary sizes without changing the meaning of the standard metric terms. Rather than being based on powers of 1000, these are based on powers of 1024, which is itself a power of 2 ($2^{10}$).
Symbol | Prefix | Example | Size |
---|---|---|---|
Ki | kibi | kibibyte (KiB) | $2^{10} = 1024$ |
Mi | mebi | mebibyte (MiB) | $2^{20} = 1024^{2}$ |
Gi | gibi | gibibyte (GiB) | $2^{30} = 1024^{3}$ |
Ti | tebi | tebibyte (TiB) | $2^{40} = 1024^{4}$ |
Pi | pebi | pebibyte (PiB) | $2^{50} = 1024^{5}$ |
Ei | exbi | exbibyte (EiB) | $2^{60} = 1024^{6}$ |
Zi | zebi | zebibyte (ZiB) | $2^{70} = 1024^{7}$ |
Yi | yobi | yobibyte (YiB) | $2^{80} = 1024^{8}$ |
The JEDEC memory standard JESD88F notes that the definitions of kilo (K), giga (G), and mega (M) based on powers of two are included only to reflect common usage, but are otherwise deprecated.
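The practical effect of the two prefix families can be seen by formatting the same byte count both ways. This is a minimal sketch with a hypothetical `format_size` helper, not a reference implementation of either standard.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]

def format_size(n_bytes, binary=False):
    """Format a byte count with metric (powers of 1000) or
    IEC binary (powers of 1024) prefixes."""
    step = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(n_bytes)
    index = 0
    # Divide until the value fits below one step or units run out.
    while value >= step and index < len(units) - 1:
        value /= step
        index += 1
    return f"{value:.2f} {units[index]}"

one_terabyte = 10**12
print(format_size(one_terabyte))               # metric reading
print(format_size(one_terabyte, binary=True))  # binary reading of the same count
```

A drive sold as 1 TB therefore shows up as roughly 931 GiB when software reports its capacity in binary units.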
Size examples
- 1 bit: Answer to a yes/no question
- 1 byte: A number from 0 to 255
- 90 bytes: Enough to store a typical line of text from a book
- 512 bytes = 0.5 KiB: The typical sector size of an old style hard disk drive (modern Advanced Format sectors are 4096 bytes).
- 1024 bytes = 1 KiB: A block size in some older UNIX filesystems
- 2048 bytes = 2 KiB: A CD-ROM sector
- 4096 bytes = 4 KiB: A memory page in x86 (since Intel 80386) and many other architectures, also the modern Advanced Format hard disk drive sector size.
- 4 kB: About one page of text from a novel
- 120 kB: The text of a typical pocket book
- 1 MiB: A 1024×1024 pixel bitmap image with 256 colors (8 bpp color depth)
- 3 MB: A three-minute song (133 kbit/s)
- 650–900 MB: A CD-ROM
- 1 GB: About 95 minutes of uncompressed CD-quality audio at 1.4 Mbit/s
- 16 GB: DDR5 DRAM laptop memory under $40 (as of early 2024)
- 32/64/128 GB: Three common sizes of USB flash drives
- 1 TB: The size of a $30 hard disk (as of early 2024)
- 6 TB: The size of a $100 hard disk (as of early 2022)
- 16 TB: The size of a small/cheap $130 (as of early 2024) enterprise SAS hard disk drive
- 24 TB: The size of a $440 (as of early 2024) "video" hard disk drive
- 32 TB: Largest hard disk drive (as of mid-2024)
- 100 TB: Largest commercially available solid-state drive (as of mid-2024)
- 200 TB: Largest solid-state drive constructed (prediction for mid-2022)
- 1.6 PB (1600 TB): Amount of possible storage in one 2U server (world record as of 2021, using 100 TB solid-state drives).
- 1.3 ZB: Prediction of the volume of the whole internet in 2016
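Several of the figures in the list above can be checked with simple arithmetic. The sketch below reproduces three of them; the helper names are hypothetical, and the bitmap calculation assumes uncompressed pixel data with no file header.

```python
def stream_bytes(bit_rate_bps, seconds):
    """Bytes produced by a constant-bit-rate stream over a duration."""
    return bit_rate_bps * seconds / 8

def bitmap_bytes(width, height, bits_per_pixel):
    """Bytes of raw pixel data in an uncompressed bitmap (no header)."""
    return width * height * bits_per_pixel // 8

TB = 10**12
PB = 10**15

song = stream_bytes(133_000, 3 * 60)       # three-minute song at 133 kbit/s
bitmap = bitmap_bytes(1024, 1024, 8)       # 1024x1024 pixels at 8 bpp
drives = (1_600 * TB) // (100 * TB)        # 100 TB drives needed for 1.6 PB

print(f"song:   {song / 10**6:.2f} MB")
print(f"bitmap: {bitmap / 2**20:.0f} MiB")
print(f"drives: {drives}")
```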
Obsolete and unusual units
Some notable unit names are today obsolete or used only in limited contexts:
- 1 bit: unibit
- 2 bits: dibit, crumb, quartic digit, quad, semi-nibble, nyp
- 3 bits: tribit, triad, triade
- 4 bits: see nibble
- 5 bits: pentad, pentade
- 6 bits: byte (in early IBM machines using BCD alphamerics), hexad, hexade, sextet
- 7 bits: heptad, heptade
- 8 bits: octad, octade
- 9 bits: nonet, rarely used
- 10 bits: declet, decle
- 12 bits: slab
- 15 bits: parcel (on CDC 6600 and CDC 7600)
- 16 bits: doublet, wyde, parcel (on Cray-1), chawmp (on a 32-bit machine)
- 18 bits: chomp, chawmp (on a 36-bit machine)
- 32 bits: quadlet, tetra
- 64 bits: octlet, octa
- 96 bits: bentobox (in ITRON OS)
- 128 bits: hexlet, paragraph (on Intel x86 processors)
- 256 bytes: page (on Intel 4004, 8080 and 8086 processors, also many other 8-bit processors – typically much larger on many 16-bit/32-bit processors)
- 6 trits: tryte
- combit, comword
See also
- File size
- ISO 80000-13 (Quantities and units – Part 13: Information science and technology)
ed Birkhauser doi 10 1007 978 0 8176 4705 6 ISBN 978 0 8176 4704 9 LCCN 2009939668 Erle Mark A 2008 11 21 Algorithms and Hardware Designs for Decimal Multiplication Thesis Lehigh University published 2009 ISBN 978 1 10904228 3 1109042280 Retrieved 2016 02 10 Kneusel Ronald T 2015 Numbers and Computers Springer Verlag ISBN 9783319172606 3319172603 Retrieved 2016 02 10 Zbiciak Joe AS1600 Quick and Dirty Documentation Retrieved 2013 04 28 315 Electronic Data Processing System PDF NCR November 1965 NCR MPN ST 5008 15 Archived PDF from the original on 2016 05 24 Retrieved 2015 01 28 Bardin Hillel 1963 NCR 315 Seminar PDF Computer Usage Communique 2 3 Archived PDF from the original on 2016 05 24 Schneider Carl 2013 1970 Datenverarbeitungs Lexikon Lexicon of information technology in German softcover reprint of hardcover 1st ed Wiesbaden Germany Springer Fachmedien Wiesbaden GmbH Betriebswirtschaftlicher Verlag Dr Th Gabler GmbH pp 201 308 doi 10 1007 978 3 663 13618 7 ISBN 978 3 409 31831 0 Retrieved 2016 05 24 slab Abk aus syllable Silbe die kleinste adressierbare Informationseinheit fur 12 bit zur Ubertragung von zwei Alphazeichen oder drei numerischen Zeichen NCR Hardware Datenstruktur NCR 315 100 NCR 315 RMC Wortlange Silbe Bits 12 Bytes Dezimalziffern 3 Zeichen 2 Gleitkommadarstellung fest verdrahtet Mantisse 4 Silben Exponent 1 Silbe 11 Stellen 1 Vorzeichen slab abbr for syllable syllable smallest addressable information unit for 12 bits for the transfer of two alphabetical characters or three numerical characters NCR Hardware Data structure NCR 315 100 NCR 315 RMC Word length Syllable Bits 12 Bytes Decimal digits 3 Characters 2 Floating point format hard wired Significand 4 syllables Exponent 1 syllable 11 digits 1 prefix IEEE Standard for a 32 bit Microprocessor Architecture The Institute of Electrical and Electronics Engineers Inc 1995 pp 5 7 doi 10 1109 IEEESTD 1995 79519 ISBN 1 55937 428 4 Retrieved 2016 02 10 NB The standard defines doublets quadlets octlets 
and hexlets as 2 4 8 and 16 bytes giving the numbers of bits 16 32 64 and 128 only as a secondary meaning This might be important given that bytes were not always understood to mean 8 bits octets historically Knuth Donald Ervin 2004 02 15 1999 Fascicle 1 MMIX PDF 0th printing 15th ed Stanford University Addison Wesley Archived PDF from the original on 2017 03 30 Retrieved 2017 03 30 Raymond Eric S 1996 The New Hacker s Dictionary 3 ed MIT Press p 333 ISBN 0262680920 Boszormenyi Laszlo Holzl Gunther Pirker Emaneul February 1999 Written at Salzburg Austria Zinterhof Peter Vajtersic Marian Uhl Andreas eds Parallel Cluster Computing with IEEE1394 1995 Parallel Computation 4th International ACPC Conference including Special Tracks on Parallel Numerics ParNum 99 and Parallel Computing in Image Processing Video Processing and Multimedia Proceedings Lecture Notes in Computer Science 1557 Berlin Germany Springer Verlag Nicoud Jean Daniel 1986 Calculatrices in French Vol 14 2 ed Lausanne Presses polytechniques romandes ISBN 2 88074054 1 Proceedings Symposium on Experiences with Distributed and Multiprocessor Systems SEDMS Vol 4 USENIX Association 1993 1 Introduction Segment Alignment 8086 Family Utilities User s Guide for 8080 8085 Based Development Systems PDF Revision E A620 5821 6K DD ed Santa Clara California US Intel Corporation May 1982 1980 1978 p 1 6 Order Number 9800639 04 Archived PDF from the original on 2020 02 29 Retrieved 2020 02 29 Dewar Robert Berriedale Keith Smosna Matthew 1990 Microprocessors A Programmer s View 1 ed Courant Institute New York University New York US McGraw Hill Publishing Company p 85 ISBN 0 07 016638 2 LCCN 89 77320 xviii 462 pages Terms And Abbreviations 4 1 Crossing Page Boundaries MCS 4 Assembly Language Programming Manual The INTELLEC 4 Microcomputer System Programming Manual PDF Preliminary ed Santa Clara California US Intel Corporation December 1973 pp v 2 6 4 1 MCS 030 1273 1 Archived PDF from the original on 2020 03 01 Retrieved 
2020 03 02 Bit The smallest unit of information which can be represented A bit may be in one of two states I 0 or 1 Byte A group of 8 contiguous bits occupying a single memory location Character A group of 4 contiguous bits of data programs are held in either ROM or program RAM both of which are divided into pages Each page consists of 256 8 bit locations Addresses 0 through 255 comprise the first page 256 511 comprise the second page and so on NB This Intel 4004 manual uses the term character referring to 4 bit rather than 8 bit data entities Intel switched to use the more common term nibble for 4 bit entities in their documentation for the succeeding processor 4040 in 1974 already Brousentsov N P Maslov S P Ramil Alvarez J Zhogolev E A Development of ternary computers at Moscow State University Retrieved 2010 01 20 US 4319227 Malinowski Christopher W Rinderle Heinz amp Siegle Martin Three state signaling system issued 1982 03 09 assigned to AEG Telefunken US4319227 Google US4319227 PDF Patentimages External linksRepresentation of numerical values and SI units in character strings for information interchanges Bit Calculator Make conversions between bits bytes kilobits kilobytes megabits megabytes gigabits gigabytes terabits terabytes petabits petabytes exabits exabytes zettabits zettabytes yottabits yottabytes Paper on standardized units for use in information technology Data Byte Converter High Precision Data Unit Converters