Windows Event ID 140 (Delayed Write Failed) with FreeNAS over iSCSI

Joined
Feb 1, 2019
Messages
3
We seem to have a performance problem with FreeNAS, to the point where it is causing data corruption.

Has anyone already encountered this kind of problem?

iSCSI has its own dedicated 10Gb network, and only 2 backup servers use it. The backup servers generate a fair amount of random IOPS because of deduplication in the backup software, but the workload is mostly sequential writes. Both backup servers run Windows Server 2016. Jumbo frames are already configured, and we can reach nearly 900MB/s on reads, but writes are really slow.

FreeNAS server config:
  • 64GB RAM
  • Intel(R) Xeon(R) Silver 4108 CPU @ 1.80GHz
  • 13x + 1x hot spare 10TB NL-SAS in RAIDZ2
The complete Windows event message:

{Delayed Write Failed} Windows was unable to save all the data for the file <path>; the data has been lost. This error was returned by the server on which the file exists. Please try to save this file elsewhere.
 

jro

iXsystems
Joined
Jul 16, 2018
Messages
80
Are you getting any errors on the FreeNAS side about the iSCSI initiator timing out or anything like that? Poking around on Google seems to point to a possible network issue.
 
Joined
Feb 1, 2019
Messages
3
Are you getting any errors on the FreeNAS side about the iSCSI initiator timing out or anything like that? Poking around on Google seems to point to a possible network issue.

No iSCSI initiator timeouts, and no iSCSI-specific errors in Event Viewer.
 

jro

iXsystems
Joined
Jul 16, 2018
Messages
80
Interesting. I did a bit more digging. I'd suggest adjusting the Windows I/O timeout value in the registry (HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\), but I doubt increasing that will improve your performance.
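
For anyone who wants to try the timeout change, here is a minimal sketch that only builds the `reg add` command, without running it. The post names just the registry key. The value name `TimeOutValue` and the 60-second figure below are assumptions for illustration, so check them against Microsoft's documentation before applying anything.

```python
# Sketch only: builds (does not run) a `reg add` command for the Disk key
# named in the post. The value name TimeOutValue and the 60-second figure
# are assumptions, not something stated in the thread.
DISK_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Disk"


def build_timeout_command(seconds, value_name="TimeOutValue"):
    """Return the argument list for `reg add` that sets the disk I/O timeout."""
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("timeout must be a positive whole number of seconds")
    return [
        "reg", "add", DISK_KEY,
        "/v", value_name,      # value to create or overwrite
        "/t", "REG_DWORD",     # the timeout is stored as a 32-bit integer
        "/d", str(seconds),    # timeout in seconds
        "/f",                  # overwrite without prompting
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand
    # from an elevated prompt on the Windows backup server.
    print(" ".join(build_timeout_command(60)))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere. A registry change like this generally needs a reboot to take effect.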

Do you have a SLOG device configured in your FreeNAS? What logical block size are you using in the iSCSI extent? And what are you using for your zvol block size?
 
Joined
Feb 1, 2019
Messages
3
We do not have a dedicated SLOG device atm, but we are in the process of adding a PCIe SSD for that purpose (as all the drive slots are already used).

I did not change the default iSCSI extent block size, so 512.

The volblocksize is 64K, as we saw huge space loss (about 50%) with smaller volblocksize values.
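
The space loss with small volblocksize values on RAIDZ can be estimated with a little arithmetic. This is a rough sketch that assumes a single 13-wide RAIDZ2 vdev, 4K sectors (ashift=12), and no compression. None of those are confirmed in this thread, and the function names are made up for illustration. It follows the usual RAIDZ allocation rule: parity sectors are added per stripe row, then the total is padded up to a multiple of (parity + 1).

```python
def raidz_allocated_sectors(block_bytes, width, nparity, ashift=12):
    """Sectors RAIDZ allocates for one block (data + parity + padding)."""
    sector = 1 << ashift
    data = -(-block_bytes // sector)                 # ceil: data sectors
    rows = -(-data // (width - nparity))             # ceil: stripe rows used
    total = data + nparity * rows                    # parity added per row
    pad_unit = nparity + 1
    return -(-total // pad_unit) * pad_unit          # pad to multiple of p+1


def raidz_efficiency(block_bytes, width, nparity, ashift=12):
    """Fraction of the allocated raw space that holds actual data."""
    data = -(-block_bytes // (1 << ashift))
    return data / raidz_allocated_sectors(block_bytes, width, nparity, ashift)


if __name__ == "__main__":
    # Assumed layout: 13 disks, RAIDZ2, 4K sectors.
    for kib in (8, 16, 64, 128):
        eff = raidz_efficiency(kib * 1024, width=13, nparity=2)
        print(f"volblocksize {kib}K: {eff:.1%} of raw space is data")
```

Under these assumptions a 64K block takes 21 sectors for 16 sectors of data (about 76%), a 16K block takes 6 for 4 (about 67%), and an 8K block takes 6 for 2 (about 33%). The ideal figure for 11 data disks out of 13 would be about 85%. The real numbers depend on the actual ashift, vdev layout, and compression.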
 

jro

iXsystems
Joined
Jul 16, 2018
Messages
80
All of that seems fine and I'm glad to hear you're getting a PCIe SSD (Optane?) for your SLOG. That should improve write latency significantly and it might solve this issue. The Windows iSCSI initiator writes all metadata synchronously, so without a SLOG, latency on those writes can really suck.
 