iSCSI volume disconnects from VMware 6.0 during vMotion/Clone

Status
Not open for further replies.

Chris M

Cadet
Joined
Feb 3, 2017
Messages
1
Hello everyone.

I have a bare metal FreeNAS 9.10.2 U6 presenting 2 zvol volumes to 2 ESXi 6.0 5050593 hosts via iSCSI. The VM's power up fine but when I attempt to vMotion from one of the datastores to another, the destination datastore disconnects from the ESXi host within about 1 minute. It appears that it might be bogging down the FreeNAS box due to the wired memory dropping like a rock before everything times out, then going back up to the expected value. I have tried adding ZIL SSD's to both datastores with the same result.
upload_2017-12-5_11-35-53.png


The FreeNAS stats are below
Code:
SuperMicro 2028U-TRTP+
2x E5-2620v3
32GB DDR4 ECC RAM
Volume 1: RAIDZ2 6x Seagate SATA 7.2k 2.5"
Volume 2: RAIDZ2 6x Seagate SAS 7.2k 2.5"
Intel 82599ES 10-Gigabit SFP+ NIC
LSI SAS 9305-24I


VMware vobd.log when issue occurs
Code:
2017-12-04T22:21:35.895Z: [vmfsCorrelator] 16119849733us: [esx.problem.vmfs.heartbeat.timedout] 5a20c4bd-fce101d7-5e83-0cc47ab58e90 SharedStorage_SAS


Example of dmesg outputs when issue occurs
Code:
WARNING: 192.1.11.222 (iqn.1998-01.com.vmware:vmhost2-24375a89): no ping reply (NOP-Out) after 5 seconds; dropping connection
ctl_datamove: tag 0xac618 on (0:3:1) aborted
ctl_datamove: tag 0xac61c on (0:3:1) aborted
ctl_datamove: tag 0xac61d on (0:3:1) aborted
(0:1:0/0): WRITE(10). CDB: 2a 00 73 ee 60 00 00 08 00 00
(0:1:0/0): WRITE(16). CDB: 8a 00 00 00 00 02 b5 4f 70 00 00 00 08 00 00 00
(0:1:0/0): WRITE(16). CDB: 8a 00 00 00 00 02 b5 4f a8 00 00 00 08 00 00 00
(0:1:0/0): WRITE(16). CDB: 8a 00 00 00 00 02 c1 f3 d0 00 00 00 08 00 00 00
(0:1:0/0): Tag: 0x11f3ac, type 1
(0:1:0/0): Tag: 0x11f3ad, type 1
(0:1:0/0): ctl_datamove: 349 seconds
(0:1:0/0): ctl_datamove: 349 seconds
(0:1:0/0): Tag: 0x11f389, type 1
(0:1:0/0): Tag: 0x11f3ae, type 1
(0:1:0/0): ctl_process_done: 350 seconds
(0:1:0/0): ctl_datamove: 349 seconds
(0:1:0/0): WRITE(10). CDB: 2a 00 72 dc 18 00 00 08 00 00
(0:1:0/0): Tag: 0x11f374, type 1
(0:1:0/0): WRITE(10). CDB: 2a 00 72 dc 80 00 00 08 00 00
(0:1:0/0): ctl_process_done: 350 seconds
(0:1:0/0): Tag: 0x11f3af, type 1
(0:1:0/0): WRITE(16). CDB: 8a 00 00 00 00 02 b5 4f b0 00 00 00 08 00 00 00
(0:1:0/0): Tag: 0x11f3b0, type 1
(0:1:0/0): ctl_datamove: 349 seconds
(0:1:0/0): ctl_datamove: 349 seconds


I'm not entirely sure where to go from here. Thank you for any and all help.

UPDATE: I've identified some issues noted below. The same target name was being used for both extents so I will be creating a second target and attempting this again. Worst case scenario, I'll also try upgrading to 11U4. Speaking of that, is 11 officially released now?
 
Last edited:
Status
Not open for further replies.
Top