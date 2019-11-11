Everything in a PC, laptop, or server is represented as binary digits (a.k.a. bits, where each bit can only be 1 or 0). There are no characters like we use for writing or numbers as we write them anywhere in a computer’s memory or secondary storage such as disk drives. For general purposes, the unit of measure for groups of binary bits is the byte — eight bits. Bytes are an agreed-upon measure that helped standardize computer memory, storage, and how computers handled data.
There are various terms in use to specify the capacity of a disk drive (either magnetic or electronic). The same measures are applied to a computer's random access memory (RAM) and other memory devices in your computer. So now let's see how the numbers are made up.
Suffixes are used with the number that specifies the capacity of the device. Each suffix designates a multiplier that is applied to the number that precedes it. Commonly used suffixes are:
- Kilo = 10^3 = 1,000 (one thousand)
- Mega = 10^6 = 1,000,000 (one million)
- Giga = 10^9 = 1,000,000,000 (one billion)
- Tera = 10^12 = 1,000,000,000,000 (one trillion)
As an example, 500 GB (gigabytes) is 500,000,000,000 bytes.
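The multiplier idea can be sketched in a few lines of Python. This is only an illustration: the names `SI_MULTIPLIERS` and `decimal_capacity_to_bytes` are made up for this example and do not come from any tool mentioned in the article.

```python
# Decimal (SI) multipliers from the list above.
SI_MULTIPLIERS = {
    "K": 10**3,   # kilo
    "M": 10**6,   # mega
    "G": 10**9,   # giga
    "T": 10**12,  # tera
}


def decimal_capacity_to_bytes(number, suffix):
    """Return the byte count for a capacity such as (500, "G")."""
    return number * SI_MULTIPLIERS[suffix]


print(f"{decimal_capacity_to_bytes(500, 'G'):,}")  # 500,000,000,000
```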
Capacities in advertisements, on boxes in the store, and so on are given in the decimal system, as shown above. However, since computers work in binary, software often reports capacity in binary units, so the number you see on screen can differ from the advertised capacity.
You saw that the decimal numbers above were shown with their equivalent powers of ten. In the binary system, numbers can be represented as powers of two. The table below shows how bits represent powers of two in an 8-bit byte. At the bottom of the table is an example of how the decimal number 109 can be represented as a binary number held in a single 8-bit byte (01101101).
| Eight bit binary number | Bit 7 | Bit 6 | Bit 5 | Bit 4 | Bit 3 | Bit 2 | Bit 1 | Bit 0 |
|---|---|---|---|---|---|---|---|---|
| Power of 2 | 2^7 | 2^6 | 2^5 | 2^4 | 2^3 | 2^2 | 2^1 | 2^0 |
| Decimal Value | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
| Example Number | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
The example bit values comprise the binary number 01101101. To get the equivalent decimal value just add the decimal values from the table where the bit is set to 1. That is 64 + 32 + 8 + 4 + 1 = 109.
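If you want to try the same addition yourself, here is a minimal Python sketch. The function name `bits_to_decimal` is invented for this illustration; Python's built-in `int(text, 2)` does the same job and is shown only as a cross-check.

```python
def bits_to_decimal(bits):
    """Sum the place values (powers of two) of every bit set to 1."""
    total = 0
    # Walk from the rightmost character (bit 0) to the leftmost (bit 7).
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


example = "01101101"
print(bits_to_decimal(example))  # 109
print(int(example, 2))           # 109, the built-in conversion agrees
```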
By the time you get out to 2^30, you have decimal 1,073,741,824 with just 31 bits (don't forget 2^0). That is a large enough number to start specifying memory and storage sizes.
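A quick way to check the bit count is to ask Python directly. This short sketch uses only built-in integer features:

```python
value = 2 ** 30

print(f"{value:,}")        # 1,073,741,824
print(value.bit_length())  # 31, that is bit 30 down to bit 0
print(bin(value))          # 0b1 followed by 30 zeros
```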
Now comes what you have been waiting for. The table below lists common designations as they are used for labeling decimal and binary values.
| Decimal | Binary |
|---|---|
| KB (Kilobyte): 1 KB = 1,000 bytes | KiB (Kibibyte): 1 KiB = 1,024 bytes |
| MB (Megabyte): 1 MB = 1,000,000 bytes | MiB (Mebibyte): 1 MiB = 1,048,576 bytes |
| GB (Gigabyte): 1 GB = 1,000,000,000 bytes | GiB (Gibibyte): 1 GiB = 1,073,741,824 bytes |
| TB (Terabyte): 1 TB = 1,000,000,000,000 bytes | TiB (Tebibyte): 1 TiB = 1,099,511,627,776 bytes |
Note that all of the quantities of bytes in the table above are expressed as decimal numbers. They are not shown as binary numbers because the larger ones would be more than 30 digits long.
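To get a feel for how far apart the two sets of units are, the following Python sketch compares each pair and converts an advertised 500 GB into GiB. The names `UNITS` and `percent_larger` are made up for this illustration.

```python
# Each entry: decimal name, decimal bytes, binary name, binary bytes.
UNITS = [
    ("KB", 10**3, "KiB", 2**10),
    ("MB", 10**6, "MiB", 2**20),
    ("GB", 10**9, "GiB", 2**30),
    ("TB", 10**12, "TiB", 2**40),
]


def percent_larger(decimal_bytes, binary_bytes):
    """How much larger the binary unit is, as a percentage of the decimal unit."""
    return (binary_bytes - decimal_bytes) / decimal_bytes * 100


for dec_name, dec_bytes, bin_name, bin_bytes in UNITS:
    gap = percent_larger(dec_bytes, bin_bytes)
    print(f"1 {bin_name} = {bin_bytes:,} bytes, {gap:.1f}% larger than 1 {dec_name}")

# An advertised 500 GB drive expressed in binary units.
advertised = 500 * 10**9
print(f"500 GB = {advertised / 2**30:.2f} GiB")  # 500 GB = 465.66 GiB
```

The gap grows with each step: about 2.4% for kilo, 4.9% for mega, 7.4% for giga, and 10.0% for tera.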
Most users and programmers need not be concerned with the small differences between the binary and decimal storage size numbers. If you're developing software or hardware that deals with data at the binary level, you may need the binary numbers.
As for what this means for your PC: your PC will make use of the full capacity of your storage and memory devices. If you want to see the capacity of your disk drives, thumb drives, etc., the Disks utility in Fedora will show you the actual capacity of the storage device in number of bytes as a decimal number.
There are also command line tools that can provide you with more flexibility in seeing how your storage bytes are being used. Two such command line tools are du (for files and directories) and df (for file systems). You can read about these by typing man du or man df at the command line in a terminal window.
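With GNU df, the -h option prints sizes in powers of 1024 and -H prints them in powers of 1000. If you would like to see the same idea from a script, here is a rough Python sketch that reads the size of one file system and prints it both ways. It assumes the root file system at "/" is the one you care about; `format_size`, `DECIMAL_UNITS`, and `BINARY_UNITS` are names invented for this example, while `shutil.disk_usage` is part of the Python standard library.

```python
import shutil

DECIMAL_UNITS = ["B", "KB", "MB", "GB", "TB"]
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB"]


def format_size(num_bytes, base, units):
    """Scale num_bytes by repeated division by base (1000 or 1024)."""
    size = float(num_bytes)
    for unit in units[:-1]:
        if size < base:
            return f"{size:.1f} {unit}"
        size /= base
    # Anything still too large is reported in the biggest unit we know.
    return f"{size:.1f} {units[-1]}"


# Assumption: "/" is the file system of interest; change the path as needed.
usage = shutil.disk_usage("/")  # named tuple with total, used, free in bytes
print(f"total:   {usage.total:,} bytes")
print("decimal:", format_size(usage.total, 1000, DECIMAL_UNITS))
print("binary: ", format_size(usage.total, 1024, BINARY_UNITS))
```

The same byte count produces a smaller number in binary units, which is why a drive can appear to have "lost" space when nothing is actually missing.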
Kees de Jong
Thanks for this article. A lot of people are confused about this basic computer science understanding. For example Redis, their notation is not the standard: https://github.com/antirez/redis/blob/unstable/redis.conf
And many howto’s, manual pages and whatnot. I guess the problem stems from the fact that Americans don’t use the metric system. Anything with kilo, mega, etc. are powers of 10, not of 2. Weirdly I also see people that should know the metric system write a KB as 1024… 🙂
atolstoy
Thanks for the article. But, I’m afraid the material is much too basic for the Fedora Magazine. We’re already tech-savvy!
Paul W. Frields
@atolstoy: According to feedback from other readers and forums, not everyone is. Glad you didn’t need the article, though.
Peter
Sorry, but it is more like:
bit7 bit6 bit5 … bit0
Your explanation is not wrong but very misleading. Could you please correct this?
Paul W. Frields
@Peter: The table has been updated to be more correct.
Pat Kelly
I really hadn’t planned to change the table. Correct has to be qualified by use and architecture design choice.
Pat Kelly
I considered covering Big-Endian and Little-Endian in this article, but since this is a magazine article I didn't want it to get too long. I used Little-Endian to see if I would get feedback on this point. I've been thinking about proposing another article to cover this, the other larger data structures that bytes are used to form, and some of the uses. Thanks for your comment; you have encouraged me to go ahead with that proposal.
Joao Rodrigues
Even though the disk manufacturer may say that the disk has a capacity of 1,000,000,000,000 bytes (or 1 terabyte), it’s the raw capability of storage. It doesn’t mean you can store 1 terabyte of data on it.
Some of that space will be lost in partitioning, partition alignment and filesystem structure.
A very cool tool to analyze disk usage in gnome is baobab
https://wiki.gnome.org/action/show/Apps/DiskUsageAnalyzer
Also, in the short scale vs. long scale war:
Long scale users (mostly European) use the following nomenclature:
10^9 is a thousand million or a milliard
10^12 is a billion
10^15 is a thousand billion or a billiard
10^18 is a trillion
10^21 is a thousand trillion or a trilliard
Jakfrost
Not that I want to pick nits, but there are 4 Bits in a Byte, two Bits in a nibble, two Bytes in a Word (8 Bit Word). A 16 Bit Word was originally termed a Double Word I think, and a 32 Bit word is a Long Word. You need at least 32 Bits in order to express a single precision floating point value in a PC. Binary is merely a Base 2 Number system (0..1 range), just as Decimal is a Base 10 number system (0…9 range).
Paul W. Frields
There are 8 bits in a byte, and 4 bits in a nibble — although the byte was never strictly defined, that’s been the length as long as I can remember. Words are ambiguous due to processor architecture differences, though many of the platforms have maintained a word at 16 bits, and a double word at 32 bits.[1]
Stuart Gathman
You left out the most important part. A KiB is 1024 bytes. 1024 is 2^10, which is close to 1000. So all the binary multipliers are multiples of 2^10. This approximate equivalence is handy for all sorts of estimations. 3 decimal digits ~= 10 binary bits. How many bits needed to count to 10 billion? 10*10^9 is approx 10 * 2^30, 4 bits are needed to count to 10, so 34 bits are needed. A MiB is 2^10 * 2^10.
Historically, 1024 bytes was called a KiloByte in the context of binary computers until a few decades ago, and 2^20 was called a MegaByte. Eventually, enough lay people were confused, exacerbated by deceptive marketing that used decimal in a binary computer context, that a standards committee was formed to come up with new terms for the binary prefixes.
As usual, the committee solution was hated by all. 2^10 bytes was now to be called a “KibiByte”, 2^20 bytes is a “MebiByte”, and worst of all, 2^30 bytes is a “GiBiByte”. Hence the new nomenclature was pronounced “GiBiRish”. Fortunately, the abbreviations were more acceptable.
2^10 bytes KiBiByte KiB
2^20 bytes MeBiByte MiB
2^30 bytes GiBiByte GiB
2^40 bytes TeBiByte TiB
2^50 bytes PeBiByte PiB
Stuart Gathman
Confusingly, many unix utilities in Fedora still use the old convention of Kilo = 1024 in a binary computer context. For instance, df uses K,M,G,T,P,E,Z,Y to mean powers of 1024. It then added KB,MB,… for powers of 1000 and KiB,MiB,… for powers of 1024.
The binary prefixes get more unpronounceable past PeBiByte.
2^60 bytes ExBiByte EiB ~ Exabyte
2^70 bytes ZeBiByte ZiB ~ Zettabyte
2^80 bytes YoBiByte YiB ~ Yottabyte