WD Red SMR Drive Compatibility with ZFS

JoshDW19

Community Hall of Fame
Joined
May 16, 2016
Messages
1,077
Thanks to the FreeNAS community, we have uncovered a potential ZFS compatibility issue with some capacities of newer WD Red drives that use SMR (Shingled Magnetic Recording) technology.

Update to this post: https://www.truenas.com/community/threads/update-wd-red-smr-drive-compatibility-with-zfs.88413/

(Note: this post was accurate when written in April 2020. In June 2020, Western Digital announced a rebranding, introducing the WD Red Plus name to identify WD Red CMR drives. The renamed drives are expected to be on shelves by around August 2020. Please read this blog to understand the details.)

Western Digital’s WD Red hard drives are commonly used in FreeNAS Minis due to their affordability and low power/acoustic footprint. They are also a popular choice among FreeNAS community members building systems of up to 8 drives. In general, the WD Reds have a normal workload rating of 180TB/year and are specified to work with NAS and RAID systems of up to 8 drives. Most NAS systems will write their drive capacity less than once per month, and most applications where WD Reds are used will typically write far less than that, which is why they were selected as the companion drive for the FreeNAS Mini.

Previous generation WD Red drives and higher capacity (8TB or greater) models in the current generation drives use CMR (Conventional Magnetic Recording), as do WD Red Pro and all the Enterprise-class drives such as WD Gold and Ultrastar. These CMR drives all perform well and are very reliable under ZFS.

Smaller WD Red drives (2 to 6TB) from their newest generation, released in late 2018, use a different recording technology known as DM-SMR (Device-Managed Shingled Magnetic Recording). The 2TB, 3TB, 4TB, and 6TB WD Red DM-SMR drives can be identified by the letters “EFAX” in their product code. There are also previous generation WD Red drives in the same capacities that are CMR drives (identified by “EFRX” in their product codes). The 8TB and larger drives with the “EFAX” code use CMR technology. Realizing this is a little difficult to track, Western Digital has provided a chart to identify all of their drives with DM-SMR.
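
For anyone unsure which models they have installed, the model and firmware strings can be read with smartctl from the FreeNAS shell. A minimal sketch, assuming FreeBSD-style device names like /dev/ada0 or /dev/da0 (adjust the globs to whatever your controller actually exposes):

```sh
# Print model and firmware for each disk so EFAX/EFRX codes can be spotted.
# The device-name globs are examples; use the names your system actually has.
for d in /dev/ada? /dev/da?; do
    [ -e "$d" ] || continue
    echo "== $d =="
    smartctl -i "$d" | grep -E 'Device Model|Firmware Version'
done
```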

iXsystems and Western Digital have been working to identify and resolve the ZFS compatibility issue with the WD Red DM-SMR drives. The testing is not yet complete, but at this stage we can confirm:

  1. Some SMR drives are indeed compatible with ZFS, though, due to the lack of a ratified industry-standard implementation of the technology, they don’t all necessarily behave the same way. Further study is needed to determine each implementation’s suitability with ZFS. At present, we are focused on testing the WD Red DM-SMR implementation.
  2. In general, SMR drives are used for their power efficiency and affordability. With regard to performance, sequential write speeds can be faster, but random write speeds are lower and slow down operations such as resilvering. We recommend CMR drives for more uniform performance. For a list of SMR drives, see the community post from Yorick.
  3. The WD Red DM-SMR drives use an indirect Logical Block Addressing model, similar to SSDs. After random writes, the drives need to perform background garbage collection, which reduces their sustained random write performance. When adding a DM-SMR drive to a ZFS pool, it is best to erase the drive beforehand (see the sketch after this list).
  4. At least one of the WD Red DM-SMR models (the 4TB WD40EFAX with firmware rev 82.00A82) does have a ZFS compatibility issue which can cause it to enter a faulty state under heavy write loads, including resilvering. This was confirmed in our labs this week during testing, causing this drive model to be disqualified from our products. We expect that the other WD Red DM-SMR drives with the same firmware will have the same issue, but testing is still ongoing to validate that assumption.
  5. In the faulty state, the WD Red DM-SMR drive returns IDNF errors, becomes unusable, and is treated as a drive failure by ZFS. In this state, data on that drive can be lost. Data within a vdev or pool can be lost if multiple drives fail.
  6. ZFS incompatibility causing drive failure is a rare event. While we have shipped approximately one hundred FreeNAS Mini systems with the DM-SMR 2TB and 6TB drives, we have had only one reported issue. More testing will be done to understand the ZFS compatibility issues and how to avoid them.
  7. Of iXsystems products, the FreeNAS Mini Series is the only product line that uses WD Red drives. Most of the FreeNAS Mini systems shipped have not used DM-SMR drives. Only systems shipped with 2TB or 6TB drives since September 2019 may have the DM-SMR drives.
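
For point 3 above, here is a minimal sketch of erasing a drive before it joins a pool, assuming a FreeBSD/FreeNAS shell and that /dev/da1 is the new drive. This is destructive, so triple-check the device name first (for example with camcontrol devlist or the FreeNAS UI):

```sh
# DESTRUCTIVE: zero-fills the target disk. Confirm the device name first!
disk=/dev/da1   # example device; replace with the drive being added

# A sequential overwrite lets the DM-SMR firmware start from clean shingle
# zones instead of doing read-modify-write against stale data. Expect this
# to take many hours on a 6TB disk.
dd if=/dev/zero of="$disk" bs=1M
```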

Both iXsystems and Western Digital treat data loss as a serious event. We will be doing more work with Western Digital to identify and resolve the ZFS compatibility issue. Any solution will then be validated under heavy ZFS workloads with FreeNAS 11.3 and TrueNAS CORE 12.0 before we report back to the community. Follow Western Digital’s blog for more information.

Any existing systems with these drives should have a backup strategy, as should all systems with valuable data. If anyone experiences the IDNF errors with FreeNAS, please contact us and we will advise on how best to handle the situation. Any FreeNAS Mini users will be covered under the standard warranty. There will be more communication when we understand how best to mitigate or resolve the ZFS compatibility issues.
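
For anyone who wants to check whether a pool has already hit this, here are two quick, non-destructive checks from the shell (the log path is the FreeBSD default; output will vary by system):

```sh
# Ask ZFS whether any pool is degraded or has logged errors
zpool status -x

# Search the system log for IDNF errors reported by the drives
grep -i idnf /var/log/messages
```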

In the meantime, iXsystems will not ship Minis with the impacted WD Red DM-SMR drives. We will instead be using other WD Red drives or WD Red Pro drives that use CMR technology. None of the Mini systems available on Amazon use these DM-SMR drives. It is also recommended that the FreeNAS community avoid using these WD Red DM-SMR drives in their own FreeNAS builds until further information is available.
To simplify the drive selection process, below is a summary of the WD Red CMR drives that are recommended and the DM-SMR drives to avoid until a resolution is available.

[Image: summary table of WD Red drives]
Recommended (CMR): 2–6TB “EFRX” models, 8TB and larger “EFAX” models, and all WD Red Pro models
Avoid for now (DM-SMR): 2TB, 3TB, 4TB, and 6TB “EFAX” models


If you have any questions about what to use for your next FreeNAS system, please use the community forum or contact iXsystems.
 
winnielinnie

Joined
Oct 22, 2019
Messages
3,641
Not sure when the change was made on Western Digital's online store, but it appears they now clearly list which drives use CMR vs SMR. Enough pressure from the iXsystems community? :cool:


* Scroll down to the section titled Recording Technology.
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
Part of me wants you to bash WD for all of it, and maybe rightfully so. But this calm and collected approach to finding a solution is far more productive and helpful to the industry as a whole. We need a good way to handle SMR drives reliably and safely for any use case suited to "archive" use.
 

Arwen

MVP
Joined
May 17, 2014
Messages
3,611
@winnielinnie, Thank you for the link.

Interestingly enough, WD has 8, 10, 12 & 14TB drives with EFAX model numbers that are supposed to be CMR. I really hope this is correct, and not another hidden SMR drive.

Note: I do like my Seagate 8TB Archive / SMR drive. It was the only cheap 8TB drive at the time I bought it. I'd buy another from Seagate for my backups (which, oddly enough, are write-mostly, the slow case for SMR drives), but they don't make them larger than 8TB, as far as I know.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
@winnielinnie, Thank you for the link.

Interestingly enough, WD has 8, 10, 12 & 14TB drives with EFAX model numbers that are supposed to be CMR. I really hope this is correct, and not another hidden SMR drive.

Note: I do like my Seagate 8TB Archive / SMR drive. It was the only cheap 8TB drive at the time I bought it. I'd buy another from Seagate for my backups (which, oddly enough, are write-mostly, the slow case for SMR drives), but they don't make them larger than 8TB, as far as I know.


Yes, the 8–14TB drives with EFAX in the name are CMR. EFAX refers to the later generation, not the recording technology.
 

danpoleary

Dabbler
Joined
Nov 29, 2013
Messages
42
Well, I got hit by this. I have a Z2 array made up of the 6TB EFAX drives (48% utilized), and a cable came loose. I replaced the cable and that drive tried to resilver. It never completed: it would try for hours, fail, try again, and then was forced out. I tried 2 spare EFAX drives as well, with the same issue. I then tried a Toshiba N300, and it resilvered fine. To test further, I replaced the Toshiba with a WD 10TB EFAX drive. It resilvered fine as well.

I opened a ticket with WD, and they want me to ship each EFAX drive individually back to them at my expense to replace them with EFRX models. I said they should ship me the new drives first so I can rebuild without losing data, and that they should send a pre-paid shipping label for the returns. I have not received a reply from them yet.

This is a nightmare waiting for anyone attempting a resilver with these crappy SMR drives.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
This is a nightmare waiting for anyone attempting a resilver with these crappy SMR drives.
Just to clarify, it shouldn't take longer than "normal" to resilver an EFRX drive into an EFAX vdev, since the only writes going on should be to the EFRX (which is not impacted by the SMR mass-write performance penalty).

If you go down that painful road, be sure to share how it goes and confirm how long it takes.
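
For reference, a sketch of the replacement itself at the command line (the FreeNAS UI is the usual route; "tank", da2, and da3 are placeholder pool and device names):

```sh
# Replace the ejected disk with the new EFRX drive, then check progress.
zpool replace tank da2 da3
zpool status tank
```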
 

HoneyBadger

actually does care
Administrator
Moderator
iXsystems
Joined
Feb 6, 2014
Messages
5,112
Resilvering to a "new" or "blank" disk shouldn't take any longer, as all writes can be dumped into blank, sequential sections. A disk that is kicked out and rejoins may be subject to more read-modify-write reshingling, though. More data is needed on this front.
 

danpoleary

Dabbler
Joined
Nov 29, 2013
Messages
42
Just to clarify, it shouldn't take longer than "normal" to resilver an EFRX drive into an EFAX vdev, since the only writes going on should be to the EFRX (which is not impacted by the SMR mass-write performance penalty).

Not sure what you mean, but none of the three 6TB EFAX drives would complete the resilver. The 10TB EFAX (non-SMR) completed in normal time, as did the Toshiba N300 6TB drive.

The pattern on the 6TB EFAX was that it started out fast, then after about 70% it would get slower and slower, adding to the completion estimate (it started at 3+ hours and died when it said 3+ days). It dropped to less than 1MB/s close to when it failed and retried.

I will have to keep better track of everything the next time one of the 6TB EFAX drives needs to resilver.
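
A crude way to capture that history next time (hypothetical pool name "tank"; samples progress every 10 minutes until interrupted):

```sh
# Append a timestamped snapshot of resilver progress to a log file.
while true; do
    date >> /root/resilver.log
    zpool status tank | grep -E 'scan:|resilver' >> /root/resilver.log
    sleep 600
done
```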
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Not sure what you mean, but none of the three 6TB EFAX drives would complete the resilver. The 10TB EFAX (non-SMR) completed in normal time, as did the Toshiba N300 6TB drive.

The pattern on the 6TB EFAX was that it started out fast, then after about 70% it would get slower and slower, adding to the completion estimate (it started at 3+ hours and died when it said 3+ days). It dropped to less than 1MB/s close to when it failed and retried.

I will have to keep better track of everything the next time one of the 6TB EFAX drives needs to resilver.

Was the EFAX drive new with zero data on it? It would be useful to know.
 

danpoleary

Dabbler
Joined
Nov 29, 2013
Messages
42
Was the EFAX drive new with zero data on it? It would be useful to know.
The first drive attempted was the original one with the cable issue; when I replaced the cable, it tried to resilver. So no, it was not zeroed.

The 2nd and 3rd drives I attempted were straight out of the packaging. They should be zeroed at the factory, I hope. Neither of these two completed the resilver.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,995
This was a great read, and thanks to iXsystems for testing and posting this data. Hopefully SMR drives will become more reliable, so that when someone buys a bargain-priced drive they don't have problems like this.
 

no_connection

Patron
Joined
Dec 15, 2013
Messages
480
I don't see some of the problems going away. What I would like to see, at the least, is a warning whenever a new drive is put in, pointing the user to information about SMR and what can happen if it is resilvered into a pool for a week; at least until future SMR is well established.
If there is any way to detect this from SMART data or the serial number and warn the user, that would be even better.

We do absolutely need ZFS to be 100% stable when used with SMR drives, no doubt about it. Heck, even a pool made out of SD cards should be stable IMO, even though that's probably never going to be a good idea.
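
On the detection idea: until there is a reliable SMART attribute, matching the model number is about the best heuristic available. A rough sketch that flags only the WD Red 2-6TB EFAX models discussed in this thread (a naming heuristic, not a true SMR test):

```sh
# Flag disks whose model matches the WD Red 2-6TB EFAX (DM-SMR) pattern.
for d in /dev/ada? /dev/da?; do
    [ -e "$d" ] || continue
    model=$(smartctl -i "$d" | awk -F': *' '/Device Model/ {print $2}')
    case "$model" in
        *WD[2-6]0EFAX*) echo "$d: $model  <-- likely DM-SMR" ;;
        *)              echo "$d: $model" ;;
    esac
done
```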
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,995
Heck, even a pool made out of SD cards should be stable IMO, even though that's probably never going to be a good idea.
I fully agree and understand.
 

joeschmuck

Old Man
Moderator
Joined
May 28, 2011
Messages
10,995
Well, I guess that is a good thing. It's crazy that they have a new name, but in the long run it should work out well. Everyone should be buying the new WD "Red Plus" drives (for those folks who buy WD Reds). Thanks for posting that link.
 

JoshDW19

Community Hall of Fame
Joined
May 16, 2016
Messages
1,077
Yeah. I have to admit the approach struck me a little funny at first too, but I do appreciate that they're trying to straighten out their product line so it's easier to know the difference between DM-SMR and CMR. Hopefully, the lesson here is becoming more apparent to drive vendors as time goes on, and we'll start to see better transparency in the industry.
 

zedfive

Cadet
Joined
Aug 12, 2020
Messages
8
I'm in the unfortunate situation that I read about the SMR problem while upgrading the NAS capacity from 4x4TB, configured as two striped mirrors with WD Red EFAX drives, to 4x6TB (WD Red EFAX). I replaced one disk after the other, about every 3 months (starting in spring 2019). When I read about the issue, I had already replaced 3 of the disks, so currently the NAS has 1x4TB WD Red EFAX and 3x6TB WD Red EFAX. It may be worth mentioning that, since the NAS doesn't have a spare slot, I replaced the 6TB drives using a USB 2.0 docking cradle and then moved them to the freed slot. Maybe, due to the slow transmission speed, this prevented the drive cache from filling up. So far I haven't had any issues, either during replacement or in normal operation. As of today, Aug-12-2020, I'm replacing the first 6TB EFAX with a 6TB EFRX, which I first bought to replace the last 4TB drive. I also opened a replacement case with WD support in Germany. The hotline seems to be competent. We agreed to replace one disk after the other, since they want me to send the old one to them first. I'll see how it goes. If there is interest, I could give an update on how the story went.
 

zedfive

Cadet
Joined
Aug 12, 2020
Messages
8
Correction: of course, the 4TB drives are WD40EFRX. Please disregard the first message.

I'm in the unfortunate situation that I read about the SMR problem while upgrading the NAS capacity from 4x4TB, configured as two striped mirrors with WD Red EFRX drives, to 4x6TB (WD Red EFAX). I replaced one disk after the other, about every 3 months (starting in spring 2019). When I read about the issue, I had already replaced 3 of the disks, so currently the NAS has 1x4TB WD Red EFRX and 3x6TB WD Red EFAX. It may be worth mentioning that, since the NAS doesn't have a spare slot, I replaced the 6TB drives using a USB 2.0 docking cradle and then moved them to the freed slot. Maybe, due to the slow transmission speed, this prevented the drive cache from filling up. So far I haven't had any issues, either during replacement or in normal operation. As of today, Aug-12-2020, I'm replacing the first 6TB EFAX with a 6TB EFRX, which I first bought to replace the last 4TB drive. I also opened a replacement case with WD support in Germany. The hotline seems to be competent. We agreed to replace one disk after the other, since they want me to send the old one to them first. I'll see how it goes.
If there is interest, I could give an update on how the story went.
 