
The great capacity non-conspiracy (TiB vs TB)

Status
Not open for further replies.

rs225

Neophyte Sage
Joined
Jun 28, 2014
Messages
878
The main purpose here is to show that hard drive makers are not trying to scam anybody with the way they measure sizes. They have done it the same way since at least 1980. What happened is that programmers foolishly applied a technical limitation of RAM sizing to the measurement of all capacity values, possibly starting in 1984.


You get 12 "TB", as defined by drive manufacturers (12 * 10^12). Every OS on the planet reports x * 2^y. Guess which number is larger (hint: someone has an incentive to fudge with how much storage you think you get).
Pointing out: OSX and Ubuntu both display base-10 units in their default GUI, and this is correct.

My question: Can anyone demonstrate a time when magnetic storage media was ever sold with the measurements used by RAM (base-2)?

According to the Wikipedia page linked below, the first operating system to screw up and use base-2 on hard drive storage was the Macintosh OS, in 1984. They were 'improving' the experience by using abbreviations, rather than the typical listing of exact amounts down to the byte, like MS-DOS.

This is the divergence between base-2 and base-10 as capacities grow:
This is from: http://en.wikipedia.org/wiki/Binary_prefix#Inconsistent_use_of_units
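
For anyone who can't see the chart, the gap can be recomputed directly. This is a minimal sketch, assuming the usual pairing of each decimal prefix (10^(3n)) with its binary counterpart (2^(10n)); the function name `divergence_percent` is only for illustration.

```python
# Relative gap between the binary and decimal meaning of each prefix.
PREFIXES = ["kilo/kibi", "mega/mebi", "giga/gibi", "tera/tebi", "peta/pebi"]

def divergence_percent(n: int) -> float:
    """How much larger 2**(10*n) is than 10**(3*n), in percent."""
    return (2 ** (10 * n) / 10 ** (3 * n) - 1) * 100

for n, name in enumerate(PREFIXES, start=1):
    print(f"{name:10s} {divergence_percent(n):5.1f}%")
```

The gap starts at about 2.4% for kilo and reaches roughly 10% at tera, which is why the mismatch was easy to ignore on floppies and is hard to ignore on multi-terabyte drives.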

A bunch of background rambling:

Kilobyte(KB) = 1000 bytes. Kibibyte(KiB) = 1024.
Yes, this is what the ISO/IEC 80000 standard has done to try to permanently fix this confusion.

For any readers who don't know, RAM is sold in base-2 units because RAM, at least originally, had to connect to an address bus that directly activates the memory cells that are going to be read. This required some kind of segmentation by address-lines, hence a requirement that any unit of RAM be a power of 2 size.

In the 1960s, computer designers knew using K/k was a bad idea since it already meant 1000(or Kelvin), but every other symbol idea seemed worse, and so K came into use for 1024 bytes of RAM, as it was convenient and the extra 24 bytes were no big deal, particularly when some computers were lucky to have 4K. This corruption of meaning then fanned out, growing larger as the power of 2 diverges from base-10.

After everybody was 'educated' about what a kilobyte of RAM was (without explaining why), certain people, particularly OS and app programmers, proceeded to use the RAM definition everywhere else.

But there is no evidence it was being used by storage vendors, and especially not in network capacities. (Kilobit, megabit, and gigabit per second are all base-10, as are frequencies in MHz/GHz.)

Storage has typically been addressed through LBA (block numbers) or a physical address (cylinder/head/sector) method, which does not imply a base-2 limitation. The only concession was to eventually use a 512-byte sector as the standard. Before that, sector sizes could vary greatly without a base-2 connection.

The Apple Lisa/Mac floppy was called 400K or 800K. The PC 1.44MB floppy was 1,440K, a strange mix of base-2 and base-10, and of course unformatted media was sometimes advertised as 2.0MB, meaning who knows what. Probably 2,000,000 bytes, since the '2M' floppy format was able to get 2019328 stuffed onto one.

The Apple Lisa ProFile Hard Drive (only $3499 for 5MB in Sep 1981, with 10MB available!) used a Seagate ST-506 internally(Seagate first shipped in 1980). This drive has 256-byte sectors, and lists an unformatted capacity of 6.38/12.76MB, or 5.0/10.0MB formatted. Formatted in this case likely means low-level formatted. Using the spec sheet, this means they presented the following bytes of storage to the file system: 5,013,504, or 10,027,008 bytes. Nowhere do you see any relation to base-2, except in the sector-size. The spec for it is here: http://bitsavers.informatik.uni-stuttgart.de/pdf/seagate/ST412_OEMmanual_Apr82.pdf

If you read that spec and think anybody was playing a marketing game with numbers, you're nuts. But if you do think it is a game, then Apple and Seagate were already ripping people off to the tune of 229,000 bytes on a 5MB drive, or 4.5%, or $157.45.
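
The formatted byte counts above can be rebuilt from the drive geometry. This is a rough sketch: the cylinder, head, and sectors-per-track values below are assumptions recalled from the ST-506/ST-412 spec sheet rather than figures quoted in this post, so check them against the linked manual. Only the 256-byte sector size comes from the text.

```python
# Assumed geometry (not quoted in the post; verify against the OEM manual).
BYTES_PER_SECTOR = 256      # stated in the post
SECTORS_PER_TRACK = 32      # assumption
HEADS = 4                   # assumption
CYLINDERS = {"ST-506": 153, "ST-412": 306}  # assumption

def formatted_bytes(cylinders: int) -> int:
    """Formatted capacity in bytes for a drive with the given cylinder count."""
    return cylinders * HEADS * SECTORS_PER_TRACK * BYTES_PER_SECTOR

for model, cylinders in CYLINDERS.items():
    print(f"{model}: {formatted_bytes(cylinders):,} bytes")

# Gap between a base-2 reading of "5MB" (5 * 2**20) and the formatted capacity.
shortfall = 5 * 2**20 - formatted_bytes(CYLINDERS["ST-506"])
print(f"Shortfall versus 5 MiB: {shortfall:,} bytes")
```

With those assumed values the results match the 5,013,504 and 10,027,008 byte figures, and the shortfall against a base-2 reading of 5MB comes out at 229,376 bytes.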

On the bright side, if RAM ever starts being measured in base-10, your 64GiB of RAM will now be 68.7GB. But your 4TB hard drive will still be 4TB.
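
A quick check of that conversion, as a small sketch. The helper names `gib_to_gb` and `tb_to_tib` are made up for this example.

```python
GIB = 2**30   # gibibyte, base-2
GB = 10**9    # gigabyte, base-10
TIB = 2**40   # tebibyte, base-2
TB = 10**12   # terabyte, base-10

def gib_to_gb(gib: float) -> float:
    """Express a base-2 GiB amount in base-10 GB."""
    return gib * GIB / GB

def tb_to_tib(tb: float) -> float:
    """Express a base-10 TB amount in base-2 TiB."""
    return tb * TB / TIB

print(f"64 GiB of RAM = {gib_to_gb(64):.1f} GB")
print(f"4 TB drive    = {tb_to_tib(4):.3f} TiB")
```

The second line shows the same drive from the other direction: it is still 4TB in base-10, and only looks smaller when an OS reports it in base-2 units.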
 

jkh

Guest
First, thank you for the walk down memory lane - that was fun!

Second, I think it's fairly obvious what needs to be done - we need to alleviate this confusion once and for all by simply measuring everything in hex!

As per this article, the actual capacity (and I think we can settle on "formatted capacity" since any other number would be gratuitously unusable) of a 4TB drive is 3.726Tb which becomes a very nice E8ETb in hex. A kilobyte of memory is 400H. Everything just becomes much nicer and shorter in hex!

I'm happy to accept my ACM Turing award for this suggestion at any time, of course. Thanks!
 

rs225

Neophyte Sage
Joined
Jun 28, 2014
Messages
878
jkh said:
Second, I think it's fairly obvious what needs to be done - we need to alleviate this confusion once and for all by simply measuring everything in hex!

I'm happy to accept my ACM Turing award for this suggestion at any time, of course. Thanks!
Off to trademark CAFEFEED0BA for my upcoming line of ~14TB Italian storage drives...

anodos said:
I think we would also benefit from a vim vs emacs discussion.
Whichever implements a ZFS editor mode first, wins.
 

solarisguy

Neophyte Sage
Joined
Apr 4, 2014
Messages
1,125
If you dig deep enough, the byte was not necessarily equal to eight (8) bits...

Wanna correct the great injustice and declare the byte to be ten (10) bits? :)
 

rs225

Neophyte Sage
Joined
Jun 28, 2014
Messages
878
solarisguy said:
Wanna correct the great injustice and declare the byte to be ten (10) bits?
Just store your data in 80-bit floating point registers, and you can convert using MMX. ;)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
Definitely not going to start a volatile debate with a thread topic like that...

So why did you leave out...

- The fact that the ISO spec that defined 1 Kilobyte as base-10 and not base-2 dates only from 1999. You aren't going to argue that there was no computing industry before 1999, right?
- The fact that the ISO proposal defining what a KB, MB, etc. is was submitted entirely by hard drive manufacturers and *only* hard drive manufacturers. Clearly they had nothing to lose and everything to gain by moving the standard to what they want it to be.
- The fact that the base-10 unit hadn't been used anywhere outside the hard drive manufacturers for decades beforehand. Even back to the 1960s it was often (but not always) base-2. Often the exceptions were because the base-2 units were not exactly applicable because of very unique hardware designs.
- The fact that, in essence, the hard drive manufacturers tried to sell a hard drive with a given unit of measurement, and when they started getting sued, tried to get the unit of measurement changed.

Would you appreciate it if I gave you a car with a rating of 50 miles per gallon, and when you figure out it doesn't get that and come to me and complain I tell you that my definition of a mile is actually about 1/2 of what you call a "mile"? Then, to make matters worse I'd try to get the entire planet to adopt a new standard of mile to be *my* standard just so I'm not inaccurate. This is literally what took place, although on a multi-decade time-scale.

I don't think there was any malice at first with what happened, but when the hard drive manufacturers were called out for what was going on, instead of trying to meet what everyone else deemed a MB, GB, TB, etc., they went and moved the standard so they were correct.

I'm not saying your post is wrong or inaccurate, but you definitely left out very important tidbits of information that would give someone the ability to decide for themselves. Instead you cited precise examples and no actual story.

Well, guess what. F*ck them. They've fractured the market with this MB versus MiB debate and I say let them rot in fscking hell forever for what they did. I'll still say MB and refer to it in base-2 until the day I die. If they don't like that, don't try to change the spec when it's not in your favor.
 

rs225

Neophyte Sage
Joined
Jun 28, 2014
Messages
878
Cyberjock, I linked a manual/spec for a circa 1980 product that measured in base-10. It is for the 'legendary' Seagate ST-506 drive, used by Apple in 1981.

If you know of some base-2 usages for the same period or earlier on megabytes, link them. Even kilobytes when used for non-RAM references would be interesting.

The wikipedia page could definitely use the info if it turns out it did appear earlier in actual commercial usage. Currently, it pegs Mac OS in 1984 as the first known example of base-2 usage.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
I don't need to link them, and it's not worth my time to link them.

What's more entertaining is the fact that "bytes" weren't always 8 bits either. There's so much to this story that unless you are looking for this info for a legendary book on the history of hard drive capacities then you'll probably never know it all (or you were there for it). I don't even pretend to know it all. I know that I've done some research on it as I was very curious, but I'm sure there's tons more I don't know.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
7,258
cyberjock said:
I don't need to link them, and it's not worth my time to link them.

What's more entertaining is the fact that "bytes" weren't always 8 bits either. There's so much to this story that unless you are looking for this info for a legendary book on the history of hard drive capacities then you'll probably never know it all (or you were there for it). I don't even pretend to know it all. I know that I've done some research on it as I was very curious, but I'm sure there's tons more I don't know.
Mostly what I've gotten out of this is that there are some interesting old Unix materials on archive.org (including hand-written notes). I really enjoyed reading the old manpages. Wow, things have come a long way.

I'm still waiting for a vim vs. emacs discussion. :)
 

Ericloewe

Not-very-passive-but-aggressive
Moderator
Joined
Feb 15, 2014
Messages
17,013
"The number of UNIX installations has grown to 10 (...)"

Priceless.
 

solarisguy

Neophyte Sage
Joined
Apr 4, 2014
Messages
1,125
@rs225, long, long time ago, people were interested in the capacity measured in words, not bytes. (As I have pointed out earlier, bytes were not even 8-bit only. And words could be for example 18-bit, 24-bit, 36-bit, 39-bit.) So the stated disk capacity was converted to how many words their OS/hardware could store. And everybody was accustomed to that, especially since the disk manufacturers were advertising unformatted capacities, something like 1.44MB floppy was hyped as 2.0 MB unformatted.

There were and there are two different worlds. The sales, advertising and media people are operating in the realm of mine is newer and bigger than yours. Actual users only count how much they can really store. Unformatted and formatted days are gone, now TBs versus TiBs, who cares...
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
See, I remember this... http://usatoday30.usatoday.com/tech/products/2007-11-03-2622074867_x.htm

Note that Seagate wasn't the only one sued. If I remember correctly a long dead hard drive manufacturing company was the first to be sued in the early 1990s for manipulating true hard drive sizes. I want to say it was Conner, but I'm not 100% sure.

I got a total of 27 rebates from Seagate totaling more than $800 because of the fact that they lost the lawsuit over the TB and TiB. :D

So let's recap..

- some magic **** with base-2 and base-10 took place looooong ago
- some magic **** with bits and bytes took place looooong ago
- lawsuits began on the size discrepancy in the 1990s
- Changes to the definition of a "MB, GB, TB, etc." were put up for vote in ISO certifications by the hard drive manufacturers.
- We are now where we are, with MB, GB, TB and MiB, GiB, and TiB.

So in the big picture who looks like they are the ones that manipulated the industry and some people (like the OP) bought their propaganda lock, stock, and barrel?

Hint: They lost money in a class-action lawsuit as a result

It's funny because if you ask IT people about the GiB vs GB debate you'll find a pretty clear line on who believes in what. The newer generation is quick to adopt the argument that the hard drive companies are being totally honest with their sizes while the older generation (that was around before the term Gibibyte existed) don't accept the new term.

I, being from the "older generation", feel that if hard drive manufacturers wanted to go with base-10 units for GB, MB, and TB, they should have made their own freakin' units. GB, MB, and TB were already taken (for base-2). In fact, this was part of the primary argument against the ISO ratification process, but it was rejected because the arguments didn't discuss the merits of what a GB, MB, and TB actually were; they argued instead that the standard had been established by the industry and taken for granted for decades. ISO doesn't like hearing about what is effectively "industry trade knowledge" that isn't documented.

I lost all respect for ISOs on that day too.
 

anodos

Sambassador
iXsystems
Joined
Mar 6, 2014
Messages
7,258
solarisguy said:
@rs225, long, long time ago, people were interested in the capacity measured in words, not bytes. (As I have pointed out earlier, bytes were not even 8-bit only. And words could be for example 18-bit, 24-bit, 36-bit, 39-bit.) So the stated disk capacity was converted to how many words their OS/hardware could store. And everybody was accustomed to that, especially since the disk manufacturers were advertising unformatted capacities, something like 1.44MB floppy was hyped as 2.0 MB unformatted.

There were and there are two different worlds. The sales, advertising and media people are operating in the realm of mine is newer and bigger than yours. Actual users only count how much they can really store. Unformatted and formatted days are gone, now TBs versus TiBs, who cares...
For more information about ancient forms of data storage, see the PDP11 peripherals handbook: https://ia601609.us.archive.org/2/i...bk1976_34818234/PDP11_PeripheralsHbk_1976.pdf

All I want for christmas is a PDP11. :)
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,526
LOL. PDP11. Those were in the movie D.A.R.Y.L. from 1985 I believe.
 

rs225

Neophyte Sage
Joined
Jun 28, 2014
Messages
878
I took a look at the PDP PDF, and it is using base-10 when referring to storage technologies (page 3-4: "20,000,000" characters per tape reel or per disk pack, possibly in words). One thing of interest is that they specifically note that K=1024 when referring to the memory architecture of the PDP. This implies you can assume K=1000 everywhere else. So, that is in 1976.

I took a look at the UNIX from 1982, anodos. Assuming you are referring to the first page, the 16Kbytes is referring to RAM that the first 64 sectors are going to take when loaded into your system memory, not storage. The text is describing a boot-loader process, as in how you load the UNIX kernel into your system memory to begin executing it. Even if I take your view, it is just another example of a 'programmer' conflating the RAM base-2 limitation to storage in 1982, rather than 1984 (Mac OS).

I am aware that there have been different size bytes, etc., over the years. My only concern is this one: Is there any proof the hard drive makers changed from base-2 to base-10? So far, the evidence says no. The only evidence says software makers accidentally applied base-2 to storage measurements.

cyberjock, if you are from the 'older generation,' and if those units were already 'taken,' then it shouldn't be that hard to find evidence of that. I haven't seen it anywhere. Wikipedia seems not to have it. I think the problem is, it doesn't exist. Why would anybody be talking about Gigabytes of RAM pre-1980? Probably the most prolific use of Kilo, Mega, and Giga in those days were in cold war fear-mongering, or for Kilowatts (1000), Megawatts (1,000,000) and Gigawatts (1,000,000,000). I'm not sure exactly where Jigawatts falls, but that was later than 1980 anyway. ;)

anodos said:
As far as names go, I'll use the word "tebibyte" ...
I know what you mean... But, you can keep using Terabyte, because it is unlikely you are talking about a Tebibyte of RAM when you say it.
 