Recurring lu_disk_write failed & lu_disk_lbwrite failed

Status
Not open for further replies.

benfrank3

Cadet
Joined
Oct 18, 2012
Messages
9
Hello,
I have been using FreeNAS 8 for about 6 months with no problems until recently.
I am connecting to the unit via iSCSI from a Windows 2003 server. There are no other connections to the FreeNAS server.
It appears the existing files on the unit are sound and not corrupted in any way. I can pull off files and the entire directory structure and file list seems okay for reading.
But now anytime I try to copy files to the device (write via iSCSI) I receive repeating errors in Windows that say "Delayed write failed the data may have been lost". In the FreeNAS server I receive repeating errors that say lu_disk_write failed & lu_disk_lbwrite failed.
I have attached a screenshot from the FreeNAS server.
The ZFS volume seems OK, with no drive failures. SMART status seems okay as well.
What gives? Any help would be greatly appreciated.
Thanks,
Frank
Freenas.jpg
 

DMooring

Dabbler
Joined
Sep 1, 2011
Messages
20
Sorry you are getting the errors but I am glad it is not only me.
I keep having the same issue. The only way to clear it is to reformat the target.
If I run chkdsk on the drive I get unrecoverable errors.
The last time I had it, I reformatted as exFAT instead of NTFS to see if that would help.
 

benfrank3

Cadet
Joined
Oct 18, 2012
Messages
9
I've been troubleshooting this in detail and I noticed 2 things.
1. The target becomes unwritable after a certain number of files and folders are written. Delete some files, and you can write some files again. But it's not a hard number of files; it's due to something else that I haven't been able to nail down.
2. I tried connecting to the iSCSI target from a Win 7 computer and don't have the subject problem (lu_disk_write failed). Strangely, it seems to be generated only when using the Win 2003 server. But I do have the first problem, and it is the same on both computers.

???
 

DMooring

Dabbler
Joined
Sep 1, 2011
Messages
20
My server is Windows 2008 R2. The FreeNAS target has 24 TB of space. I only seem to have the problem when the target gets around 70 to 80% full. After that, deleting files does not seem to help.
I need massive storage that is expandable, which is why I went with FreeNAS and ZFS, but I may have to look into something else if I cannot solve this issue.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Check to see how busy the system is when this is happening. iSCSI is sensitive to performance issues, see bug #1531.
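One rough way to put a number on "how busy" is to sample the load average while the writes are failing. The sketch below assumes a Python 3 interpreter is available on the box, which may not be true of every FreeNAS install; `uptime` and `top` from a shell show the same load figures. The 0.7 and 1.0 cut-offs and the function names are illustrative assumptions, not FreeNAS guidance.

```python
import os
import time


def classify_load(load_1min, cpu_count):
    """Return a rough label for a 1-minute load average relative to CPU count."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    ratio = load_1min / cpu_count
    if ratio < 0.7:  # assumed threshold, purely illustrative
        return "ok"
    if ratio < 1.0:  # assumed threshold, purely illustrative
        return "busy"
    return "overloaded"


def sample_load(samples=5, interval_s=2.0):
    """Print the 1-minute load average a few times (Unix-like systems only)."""
    cpus = os.cpu_count() or 1
    for _sample in range(samples):
        load_1min, _load_5min, _load_15min = os.getloadavg()
        label = classify_load(load_1min, cpus)
        print(f"load={load_1min:.2f} cpus={cpus} -> {label}")
        time.sleep(interval_s)
```

Calling `sample_load()` during a failing copy, and again when the system is idle, gives two sets of numbers to compare.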
 

benfrank3

Cadet
Joined
Oct 18, 2012
Messages
9
OK, thanks, can you tell me how to do that?
Do I just look at Display System Processes?
 

DMooring

Dabbler
Joined
Sep 1, 2011
Messages
20
I have had no problems with the exFAT format yet, but I have not been running it long, only about 3 weeks.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I've also been experiencing these problems. I'm in the process of standing up an iSCSI based ESXi host. I've done several tests, and they seemed to work well. I reconfigured it a couple days ago and now I'm getting these same errors. Nothing has changed. I'm using two 1TB disks in a striped configuration (and will mirror two more stripes to the pool shortly for RAID-10). I'm sharing zvols via iSCSI device extents.
 

benfrank3

Cadet
Joined
Oct 18, 2012
Messages
9
I re-did the zvol, made the disk dynamic in Windows, and did a full long format.
So far no errors, but it has only been 1 day... I also posted a support ticket with FreeNAS but got NO response, and it has been a few days... :(
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
exFAT is what I consider a "dangerous" file system. Not only does it not have journaling, but it's not well supported except under Vista and 7. If you have XP you can install a patch that allows XP to use exFAT. exFAT was really meant only for USB sticks and is not bootable. exFAT also has licensing issues and isn't supported under Linux.

Things may have changed, but as of mid-2011 exFAT was very much a "redheaded stepchild" and isn't likely to ever change as new file systems are being developed regularly and there isn't much of a reason to use exFAT over other file systems, especially ones that use a journal.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I tried destroying my zvols and recreating them. I got the errors again. I then upgraded to the 8.3.0 release and deleted the volume again, recreating them as v28 pools to see if that would help, and it doesn't seem to have made a difference. I've done a full zero wipe of the two disks I'm currently using. Otherwise, nothing has changed since I had this working a week or two ago with 5 VMs running across iSCSI.

I also added more RAM to my VMware ESXi box. Could this be caused by that box by any chance, or is this more than likely a FreeNAS issue?
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
As I already said, performance problems with iSCSI, especially when the pool is ~80% full, are more or less a ZFS tuning issue. Quantifying how busy the system is helps a lot in understanding the nature of the problem. Basically, iSCSI can start falling apart with fairly short bursts of ZFS going catatonic doing writes, so you have three options. You can aggressively manage the txg parameters in order to retain responsiveness. You can use SSDs, which don't exhibit the problem due to their speed. Or you can use UFS, which exhibits the problem in a much less abstract-and-frustrating manner; that is the direction we've slowly been leaning, because the benefits of ZFS to iSCSI are less significant than to general NAS service.
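To see how close a pool is to the ~80% mark, one option is to parse the output of `zpool list -H -o name,capacity`. This is a minimal sketch: the helper names are hypothetical, the sample text is made up, and the default threshold of 80 only mirrors the rough figure mentioned in this thread.

```python
def parse_zpool_capacity(text):
    """Parse `zpool list -H -o name,capacity` output into {pool: percent}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        name, cap = fields
        if not cap.endswith("%"):
            continue  # capacity is expected to look like "78%"
        try:
            result[name] = int(cap.rstrip("%"))
        except ValueError:
            continue  # ignore values that are not whole percentages
    return result


def pools_over(text, threshold_pct=80):
    """Return sorted pool names whose used capacity is at or above threshold_pct."""
    capacities = parse_zpool_capacity(text)
    return sorted(name for name, pct in capacities.items() if pct >= threshold_pct)


# Hypothetical sample output; the pool names and numbers are made up.
SAMPLE = "tank\t78%\nbackup\t83%\n"
```

With the sample text, `pools_over(SAMPLE)` flags only the pool at or above the threshold, so the same check can be run on real output before and after deleting files.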
 

DMooring

Dabbler
Joined
Sep 1, 2011
Messages
20
exFAT failed. Reformatted again, but this time just set up a CIFS share. Load average is much lower and file transfer seems to be faster. I will just stay with this until the iSCSI problems are fixed.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
My problems seem to be caused by the Intel PRO/1000 GT adapters I'm using on both the FreeNAS box and my ESXi box. The ESXi box is getting constant watchdog timeouts and interface resets with any significant traffic. I'm also getting watchdog timeouts on the FreeNAS e1000 NIC, but it doesn't seem to bring the whole interface down.

Not sure where to go from here, as my problem doesn't seem to even be iSCSI related.
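To compare how often each interface hits a watchdog timeout, a small script can count the messages per NIC from the system log (on FreeBSD-based systems that is typically /var/log/messages). This is only a sketch: the message shape "<ifname>: watchdog timeout" is an assumption about the driver's wording, and the sample lines, hostname, and timestamps are invented for illustration.

```python
import re
from collections import Counter

# Assumed message shape: "<ifname>: watchdog timeout ..." somewhere in the line.
WATCHDOG_RE = re.compile(r"\b([a-z]+\d+): watchdog timeout", re.IGNORECASE)


def count_watchdog_timeouts(lines):
    """Count watchdog timeout messages per interface name."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical log lines; hostname, timestamps, and wording are made up.
SAMPLE_LOG = [
    "Nov  2 10:14:01 freenas kernel: em0: Watchdog timeout -- resetting",
    "Nov  2 10:14:03 freenas kernel: em0: link state changed to UP",
    "Nov  2 10:20:45 freenas kernel: em0: watchdog timeout -- resetting",
    "Nov  2 10:21:10 freenas kernel: em1: watchdog timeout -- resetting",
]
```

Running the same count before and after swapping a NIC, cable, or switch port gives a simple before-and-after comparison.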
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
Oh, well, yes, network problems can also lead to iSCSI failures. If both the ESXi box and the FreeNAS box are seeing problems, have you considered checking for issues with your switch? Like maybe overheating or imminent failure? iSCSI is dependent on a very large number of things all working fairly well. Problems at any layer can be catastrophic or at least very annoying.
 

TravisT

Patron
Joined
May 29, 2011
Messages
297
I won't rule that out, but I'm using a Cisco 3750G switch, so it's more than capable of handling that traffic. Also, since the NIC itself is dropping, as in the protocol drops, I'm thinking this is a hardware/driver related problem. I've checked the switch for any indications of problems already, but I'll look it over again for good measure.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
TravisT said:
"I won't rule that out, but I'm using a Cisco 3750G switch, so it's more than capable of handling that traffic. Also, since the NIC itself is dropping, as in the protocol drops, I'm thinking this is a hardware/driver related problem. I've checked the switch for any indications of problems already, but I'll look it over again for good measure."

1. The switch could be failing. I'd try a different switch, even if only temporarily. Most switches I've seen that failed don't give you any "hey.. I am broke" lights. They just don't work right, assuming the switch will even complete bootup testing. Very unfortunate, but it's true. :(
2. I'd try a different NIC on each end too. If you have some spare cables, I'd swap out the network cables too. I'd be willing to bet this is more of a hardware issue than driver. If it were a driver there would be tons of people with the problem. ;)
 