[ale] musings about the insides of an ssd

Ron Frazier (ALE) atllinuxenthinfo at techstarship.com
Fri Jul 19 17:07:00 EDT 2013


Hi all,

Today I'm musing on the insides of an ssd.  Why am I doing this?  There are three reasons.  1) Maximum PC magazine's August issue has a nice overview of all the major components of a pc, including ssd's.  2) I have a couple of hybrid drives, with a small amount of ssd cache memory, and I wonder about their longevity.  3) I want an ssd for my main desktop drive, and don't have one.  That's not likely to change soon, though, since I just upgraded that machine to a 1 TB spinning drive.

In any case, I feel that too little attention is paid to what's inside the ssd drive, and too much to the benchmarks.  I want to share what I'm learning.

Here are some random observations based on my reading.  Some of this has been addressed here before, but it's good to update things on occasion.

What the heck stores the data?  In essence, your data is stored in a memory cell, which is insanely small, maybe 20 nm or less.  Each cell is essentially a tiny capacitor that stores an electrical charge.  Put a charge in the cell and that's a binary 1 (or possibly the other way around); take the charge away and the bit changes.  To write the charge to the cell, you apply voltage across an insulator whose job it is to keep the charge in the cell, so your data won't vanish.  In so doing, you partially degrade the insulator.  Hence, the cells have a finite number of times they can be written to reliably.  This is not a significant concern when writing magnetic domains on a hdd.  Sometimes the magnetic coatings on hdd's fail, but many times, mechanical problems kill hdd's.  Just using an ssd normally will use it up over time.  In a sense, you do the same to a hdd: you use up the mechanism, primarily.  Hdd failures are probably less predictable than ssd failures, but both will fail.

Flash memory cells are rated according to the number of program / erase cycles they can endure.  The type of cell I described, with a full state and an empty state, is called an SLC, or single level cell.  It stores 1 binary bit.  It is the most reliable and has the most endurance.  It is also the most expensive technology, and is usually only used for enterprise drives.

See these websites for some very good information:

https://en.wikipedia.org/wiki/Single-level_cell
http://www.centon.com/flash-products/chiptype
https://en.wikipedia.org/wiki/Solid-state_drive
https://en.wikipedia.org/wiki/Flash_memory

The first two, in particular, explain cell types.

What bugs me about many articles and ads is that they don't tell you what's inside the drive.  They just look at benchmarks.  But, in my opinion, what's inside is important.  According to the last article, SLC NAND flash cells are rated at about 100K cycles.  If you can afford that technology, it's the best thing to have.  Depending on your usage, it may not be the most cost effective for you.

The flash memory article mentions Samsung OneNAND KFW4G16Q2M as an example of SLC.  I don't know if that's a flash memory part number or an SSD part number which uses this flash.

The next type of memory cell you'll hear bandied about is MLC, or multi level cell.  Most of these hold 4 voltage states and store 2 binary bits.  So, it takes roughly 1/2 as many cells for a given capacity of drive.  However, they're more prone to errors, so more error correction logic and spare cells must be provided.  An MLC drive will not necessarily cost 1/2 of what an SLC drive does, but it will be much cheaper.  More of a concern, to me, is the fact that, according to the last article, these cells only have 1/10 - 1/20 of the endurance of SLC, or 10K - 5K cycles.  So, if an SLC drive was able to last 5 years, or 1825 days, AT A CERTAIN DATA RATE of programming and erasing, then an MLC drive used in the same conditions might only last 1825 / 20 = 91 days until the cells are no longer guaranteed to perform properly.  Things like over provisioning, which I'll mention below, affect this.

The flash memory article mentions this as an example of MLC: Samsung K9G8G08U0M.

There is a new player in this game, which I personally recommend avoiding like the plague.  It's called TLC, or triple level cell.  It stores 8 voltage levels, or 3 binary bits.  This is the most error prone, the quickest to wear out, and the least expensive.  The endurance rating of TLC, according to the flash memory article, is 1/100 of that of SLC, or 1K cycles.

So, the same SSD that I said would last 5 years or 1825 days at a certain data rate if it had SLC cells, might only last 18 days with TLC memory.

The article gives Samsung 840 as an example of a TLC device.  Samsung generally has a very good reputation in this industry.  However, I would NOT buy this device if it has TLC in it.

Note that, after the endurance rating is exceeded, the drive doesn't just die; rather, the data stored in the cells becomes less and less reliable.  The cells' ability to hold data becomes less and less certain.  Freshly written data might be readable now but not months later.  More and more error correction kicks in.  More and more over provisioning space is used up, if any is available.  Eventually, you get unrecoverable read and write errors, which grow in number.

So, whether an ssd can work for you depends on your data usage and pattern.  SLC is the best option, but may be overly expensive.  MLC may be a good option, if you don't overtax it.  I personally would say avoid TLC.  The general rule I try to go by is not to buy the cheapest of anything I depend on or which is under stress.

Here's the proper way to review and rate and spec an SSD.  This is from Computer Power User magazine, April 2013, an article about the Intel SSD 520 180 GB.

<quote>

... that memory is 25 nm MLC NAND manufactured by Intel. ... The drive has a 5 year warranty and a 5 year endurance rating (at 20 GB of writes each day), not that we expect to have to worry about either.

</quote>

You've got the key bits of data right there.  In this context, a warranty without an endurance rating means nothing.  An endurance rating without a data rate means nothing.  Unfortunately, many articles and ads don't give you this data.

Doing a bit of quick math.  5 yr = 1825 days.  1825 days * 20 GB / day = 36,500 GB, or 36.5 TB, of total data that you can write to the drive before it is no longer guaranteed to be fully reliable.  36.5 TB / 180 GB of capacity = about 202.  Thus, you can completely fill, or refill, the drive with data about 202 times.
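If you want to redo that arithmetic for your own drive, it's a couple of one liners at the shell.  The numbers below are just the ones from this example; plug in your own warranty period, rated GB per day, and capacity.

    # 5 years * 365 days * 20 GB/day of rated writes
    echo $(( 5 * 365 * 20 ))          # 36500 GB, about 36.5 TB
    # how many times you could fill a 180 GB drive within that rating
    echo $(( 5 * 365 * 20 / 180 ))    # about 202 (integer math drops the fraction)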

You may say to yourself, I'll never write 36 TB of data to this drive.  Well, you may not, but your pc might.  There are long checklists of things you can and should look at for any given type of pc to optimize it for an ssd.  It's different for linux, windows, and mac.  I don't have them memorized, but I'll just say here that your pc can foil your best laid plans behind your back.

Here are just five examples from the linux world.

Note: reading from the memory cells does not hurt them; only writing and erasing does.

In many cases, linux will alter the metadata of a file even if you just read the file.  It stores the "date accessed" (atime).  This causes some data to be written to the disk every time you touch a file, whether or not you alter it.  When you consider years of usage, and tens of thousands of files, this adds up, and uses up the write endurance of your ssd.  Most of the ssd optimization checklists recommend turning this off.
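For example, on a typical linux system you can turn the access time updates off with the noatime mount option (relatime is a milder compromise and is the default on many modern distros).  The UUID and mount point below are only placeholders; your fstab entry will look different.

    # /etc/fstab -- example entry only; your UUID / device and filesystem will differ
    UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  ext4  defaults,noatime  0  1

    # or try it without rebooting:
    sudo mount -o remount,noatime /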

Another place the os can sabotage your ssd is your swap file or partition.  By definition, the swap file or partition starts being exercised when the pc has less free ram than it would like for whatever it's doing.  From experience, I can tell you that opening 70 or so tabs in Firefox, email, LibreOffice, and a few other random things will start hitting the swap file unless you have lots of ram.  If you've ever watched the drive light of a pc while swap is being used, you may have seen it flickering continuously.  Hitting swap can add many GB of data writes to the drive in a very short time.  In the case above, you've only got 36,500 GB of writes to play with for the entire life of the drive.  Most checklists recommend putting your swap on a spinning drive.
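A couple of commands worth knowing here.  This is just a sketch; what swappiness value makes sense depends on how much ram you have and what you run, and the device name is a placeholder.

    cat /proc/swaps                   # where is swap living right now?
    sudo sysctl vm.swappiness=10      # make the kernel less eager to swap (60 is a common default)
    # in /etc/fstab, point swap at a partition on the spinning drive instead of the ssd, e.g.:
    # /dev/sdb2  none  swap  sw  0  0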

You get email, right?  You must, because you're reading this message.  I'll bet you have dozens or hundreds of emails coming in all the time.  In my case, I have thousands, from mailing lists and such.  If you store mail locally, every email adds to whatever database or mail store your client uses.  Every email uses up some of the drive's write life.  How much depends on your usage.  I have email archives going back a decade, which encompass 10 GB or so.  The average tech user, maybe 1 GB.  The average grandma, maybe 50 MB.  It depends.  Here's another thing.  As you delete messages, usually they're only flagged as deleted but actually still exist.  Later, the database may be compacted, which involves essentially rewriting the remaining messages back to the drive and deleting the old file.  So, if you have an 8 GB file with 2 GB of deleted messages, the system could decide to rewrite the remaining 6 GB.  That comes off your usage tally of the drive's life.

What about podcasts, vidcasts, and media?  My tablet downloads 1 GB of podcasts per week, about 20 shows of around 1 hour of audio each.  I listen to them, and most eventually get erased.  Not that I plan to change, but that does contribute to using up my flash memory.  If that were video on a desktop pc, it could occupy 20 times the space or more.

What if you edit certain types of documents or media files?  What if your system always makes a backup copy of every file, or different versions?  What if it edits a temp file and then saves the file?  Etc., etc.

All these scenarios can balloon the amount of data being written to the drive to much more than what you'd expect.  On a hdd, you don't worry.  The drive is either working or it's not, and when it starts to throw errors, you deal with it.  But, you might want to think about what you're doing to your SSD.  Everything you write to it pushes it, in a definable and predictable way, toward its deathbed.

We've discussed the concept of data scrubbing here before.  This is a good practice with either a hdd or an ssd.  It is a good idea to use a diagnostic utility to periodically read every sector on the drive.  This lets the controller know if any sectors are weak and need error correction, or whether to activate over provisioning.  If you have evidence that the drive is throwing errors, and a read scan doesn't fix them, it can be a good idea to do a full drive WRITE / READ scan.  Yes, depending on the diagnostic, you might use up 1 or 2 of the 202 full writes this example drive has in its life.  However, some errors can only be detected by the controller by trying to write to the cells and then read them back.  If necessary, bad cells are mapped out and spares brought online.  Thus, you could extend the life of the drive past the point where it might otherwise have to be replaced, by letting the controller know, in detail, which cells it can depend on.
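On linux, a read only surface scan can be done with badblocks or even plain dd.  The device name below is a placeholder; triple check it before running anything, and note that the -w mode of badblocks (not shown) WILL destroy your data.

    # read-only scan: touches every sector but writes nothing
    sudo badblocks -sv /dev/sdX
    # or simply read the whole device
    sudo dd if=/dev/sdX of=/dev/null bs=1M status=progress
    # non-destructive read/write test: reads, writes a pattern, then restores the original data
    sudo badblocks -nsv /dev/sdX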

I've mentioned over provisioning a couple of times.  This is simply providing more memory cells than what are needed to provide the rated capacity of the drive.  In the hdd world, we'd think of them as spare sectors.  Over provisioning allows the controller to deallocate cells that are getting flaky and use the spares instead.  This extra space is also used for drive optimization and management.  This extends the useful life of the drive.  In general, I think it's a good idea to pay for some over provisioning.

The Maximum PC article made a very interesting statement.  It said that if you see a drive rated at 256 GB, there is no over provisioning.  But, if it says 240 GB, there is 16 GB of over provisioning.

Wow, is it really that simple?  I never thought about that.  Can you really just subtract the drive capacity from the nearest binary multiple to find the over provision amount?  If so, it sounds like that Intel 180 GB drive I mentioned has 76 GB of spare space.  I don't know for sure if that's true, but it sounds good.

What concerns me is that, 5 years down the road (or sooner for heavy users, later for casual users), a bunch of people who didn't properly consider drive endurance, or who bought cheaper products with MLC or TLC flash, may find that their data is starting to mysteriously vanish.

I wonder if there are any utilities that can monitor the total data written to your existing hdd over time, so you could know, for example, if the Intel drive's 20 GB / day limit is a problem for you.
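There are a couple of ways to get at this on linux.  The kernel keeps a running total of sectors written since boot in /proc/diskstats, and many drives (ssd's especially) expose a lifetime total through SMART.  "sda" below is a placeholder, and the SMART attribute names and units vary from vendor to vendor, so check your drive's documentation.

    # sectors written to sda since boot; multiply by 512 to get bytes
    awk '$3 == "sda" { print $10 * 512 / 1e9, "GB written since boot" }' /proc/diskstats

    # lifetime totals, if the drive reports them (attribute names differ by vendor)
    sudo smartctl -A /dev/sda | grep -i -E 'total.*written|host.*writes'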

The controller is a critical part of the SSD.  However, looking for a specific name may not be as important as before.  There are sometimes performance differences between compressible data (like text) and non compressible data (like iso's) that you may wish to consider.

The drive should support TRIM, and so should your OS.  TRIM lets the os tell the drive, in the background, which blocks of data are no longer needed, so the cells can be erased and ready for writing when needed.  Some drives have automatic garbage collection, which does some of this work without requiring the os to trigger it.
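On linux, you can check whether the drive advertises TRIM and run it by hand or on a schedule.  This sketch assumes a reasonably recent util-linux; on systemd based distros there is usually a weekly fstrim.timer you can enable instead of mounting with the discard option.

    lsblk --discard                            # non-zero DISC-GRAN / DISC-MAX means TRIM is supported
    sudo fstrim -v /                           # trim unused blocks on the filesystem mounted at /
    sudo systemctl enable --now fstrim.timer   # or let a weekly timer handle it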

Finally, if you're buying a motherboard, you might want to consider whether it supports ssd caching.  This allows you to cache your most frequently used programs and data automatically to a small ssd, but have most of your big data and programs on a spinning hdd where space is cheaper.  This is what a hybrid hdd does, but ssd caching technology allows you to create the same effect by bringing your own hdd and ssd and combining them.

I mentioned a while back in another thread that some drives support a data field in the smart data that shows the ssd drive's estimated remaining life based on the usage to the current time.  The controller tallies up the total data written and makes that data available.  You should look for a drive with this feature and monitor its progress toward its eventual doom.
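Again with the caveat that the attribute names are vendor specific (Media_Wearout_Indicator on some Intel drives, Wear_Leveling_Count on some Samsungs, "Percentage Used" on NVMe drives), something like this will usually dig that field out:

    sudo smartctl -A /dev/sdX | grep -i -E 'wear|remain|percent'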

Also, I remember mentioning previously, but it bears repeating, that some ssd warranties are based on the endurance rating and not the clock.  So, if you write 60 GB / day to the drive I mentioned above, which is rated for 20 GB / day, your warranty may expire in 1.67 years and not 5 years.

Finally, here's a piece of anecdotal advice, which I cannot prove the worth of.  But, run the drive.  Yes, just run the drive.  I don't trust an ssd lying on the shelf for a year to hold my data nearly as much as I would an hdd.  Could be wrong.  But, by running the drive, you allow the controller to do its maintenance, analysis, garbage collection, whatever it does.  It cannot do that when it's powered off.

I've had the memory in my GPS go wonky a couple of times.  I've had to wipe and restore the unit's software and reset the settings.  I'm pretty sure those failures occurred after prolonged times of non use.  Since I've been running the unit every day, I believe I've had fewer problems.

Hope this is helpful.

Sincerely,

Ron



--

Sent from my Android Acer A500 tablet with bluetooth keyboard and K-9 Mail.
Please excuse my potential brevity if I'm typing on the touch screen.

(PS - If you email me and don't get a quick response, you might want to
call on the phone.  I get about 300 emails per day from alternate energy
mailing lists and such.  I don't always see new email messages very quickly.)

Ron Frazier
770-205-9422 (O)   Leave a message.
linuxdude AT techstarship.com
Litecoin: LZzAJu9rZEWzALxDhAHnWLRvybVAVgwTh3
Bitcoin: 15s3aLVsxm8EuQvT8gUDw3RWqvuY9hPGUU



