SOLVED: XenServer iSCSI multipath won't connect after 9.2.1.8 to 9.3 upgrade


Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
Hi there, I have been running FreeNAS 9.2 series with my XenServer Pool (3 servers) for quite some time without incident.

Performance has been acceptable, and stability flawless.

I have recently upgraded to the 9.3 Beta, and applied the latest patches.

Unfortunately, since this upgrade the XenServers will not reconnect anymore.

On the FreeNAS box I am seeing the following while I am trying to repair the SR:

Nov 14 11:40:49 SAN01 ctld[3240]: child process 6020 terminated with exit status 1
Nov 14 11:42:28 SAN01 WARNING: xx.xxx.4.41 (iqn.2010-12.net.ltd.xen01:af419103): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:42:28 SAN01 WARNING: xx.xxx.2.41 (iqn.2010-12.net.ltd.xen01:af419103): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:42:28 SAN01 WARNING: xx.xxx.3.41 (iqn.2010-12.net.ltd.xen01:af419103): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:42:28 SAN01 WARNING: xx.xxx.1.41 (iqn.2010-12.net.ltd.xen01:af419103): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:42:48 SAN01 ctld[6133]: xx.xxx.1.43 (iqn.2013-07.net.ltd.xen03:e864697c): read: Connection reset by peer
Nov 14 11:42:48 SAN01 ctld[3240]: child process 6133 terminated with exit status 1
Nov 14 11:43:06 SAN01 WARNING: xx.xxx.1.43 (iqn.2013-07.net.ltd.xen03:e864697c): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:43:06 SAN01 ctld[6185]: xx.xxx.3.43 (iqn.2013-07.net.ltd.xen03:e864697c): read: Connection reset by peer
Nov 14 11:43:06 SAN01 ctld[3240]: child process 6185 terminated with exit status 1
Nov 14 11:43:24 SAN01 WARNING: xx.xxx.3.43 (iqn.2013-07.net.ltd.xen03:e864697c): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:43:24 SAN01 ctld[6191]: xx.xxx.2.43 (iqn.2013-07.net.ltd.xen03:e864697c): read: Connection reset by peer
Nov 14 11:43:24 SAN01 ctld[3240]: child process 6191 terminated with exit status 1
Nov 14 11:43:43 SAN01 WARNING: xx.xxx.2.43 (iqn.2013-07.net.ltd.xen03:e864697c): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:43:47 SAN01 ctld[6196]: xx.xxx.1.43: read: connection lost
Nov 14 11:43:47 SAN01 ctld[3240]: child process 6196 terminated with exit status 1
Nov 14 11:43:47 SAN01 ctld[6197]: xx.xxx.1.43 (iqn.2013-07.net.ltd.xen03:e864697c): read: Connection reset by peer
Nov 14 11:43:47 SAN01 ctld[3240]: child process 6197 terminated with exit status 1
Nov 14 11:43:47 SAN01 ctld[6198]: xx.xxx.2.43: read: connection lost
Nov 14 11:43:47 SAN01 ctld[3240]: child process 6198 terminated with exit status 1
Nov 14 11:43:49 SAN01 ctld[6201]: xx.xxx.3.43: read: connection lost
Nov 14 11:43:49 SAN01 ctld[3240]: child process 6201 terminated with exit status 1
Nov 14 11:43:50 SAN01 ctld[6203]: xx.xxx.1.43: read: connection lost
Nov 14 11:43:50 SAN01 ctld[3240]: child process 6203 terminated with exit status 1
Nov 14 11:45:24 SAN01 WARNING: xx.xxx.2.43 (iqn.2013-07.net.ltd.xen03:e864697c): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:45:24 SAN01 WARNING: xx.xxx.1.43 (iqn.2013-07.net.ltd.xen03:e864697c): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:45:24 SAN01 WARNING: xx.xxx.3.43 (iqn.2013-07.net.ltd.xen03:e864697c): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:51:48 SAN01 ctld[6634]: xx.xxx.4.42 (iqn.2013-07.net.ltd.xen02:d4c4d3d5): read: Connection reset by peer
Nov 14 11:51:48 SAN01 ctld[3240]: child process 6634 terminated with exit status 1
Nov 14 11:52:06 SAN01 WARNING: xx.xxx.4.42 (iqn.2013-07.net.ltd.xen02:d4c4d3d5): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:52:06 SAN01 ctld[6679]: xx.xxx.1.42 (iqn.2013-07.net.ltd.xen02:d4c4d3d5): read: Connection reset by peer
Nov 14 11:52:06 SAN01 ctld[3240]: child process 6679 terminated with exit status 1
Nov 14 11:52:24 SAN01 WARNING: xx.xxx.1.42 (iqn.2013-07.net.ltd.xen02:d4c4d3d5): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:52:25 SAN01 ctld[6684]: xx.xxx.3.42 (iqn.2013-07.net.ltd.xen02:d4c4d3d5): read: Connection reset by peer
Nov 14 11:52:25 SAN01 ctld[3240]: child process 6684 terminated with exit status 1
Nov 14 11:52:43 SAN01 WARNING: xx.xxx.3.42 (iqn.2013-07.net.ltd.xen02:d4c4d3d5): waiting for CTL to terminate tasks, 1 remaining
Nov 14 11:52:44 SAN01 ctld[6690]: xx.xxx.4.42: read: connection lost
Nov 14 11:52:44 SAN01 ctld[3240]: child process 6690 terminated with exit status 1
Nov 14 11:52:44 SAN01 ctld[6691]: xx.xxx.4.42 (iqn.2013-07.net.ltd.xen02:d4c4d3d5): read: Connection reset by peer
Nov 14 11:52:44 SAN01 ctld[3240]: child process 6691 terminated with exit status 1
Nov 14 11:52:46 SAN01 ctld[6692]: xx.xxx.4.42: read: connection lost
Nov 14 11:52:46 SAN01 ctld[3240]: child process 6692 terminated with exit status 1
Nov 14 11:52:47 SAN01 ctld[6694]: xx.xxx.3.42: read: connection lost
Nov 14 11:52:47 SAN01 ctld[3240]: child process 6694 terminated with exit status 1
Nov 14 11:52:48 SAN01 ctld[6696]: xx.xxx.1.42: read: connection lost
Nov 14 11:52:48 SAN01 ctld[3240]: child process 6696 terminated with exit status 1

On the XenServers I can see from the logs that the connection appears to be made and then drops:

Nov 14 11:54:11 XEN01 xapi: [ info|XEN01|60168 INET 0.0.0.0:80|session.slave_login D:b2770fdc3d2c|xapi] Session.create trackid=25eb942e72f17ff4085dbc647e0c2832 pool=true uname= is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Nov 14 11:54:11 XEN01 xapi: [ info|XEN01|60507 INET 0.0.0.0:80|session.slave_login D:7dd104ea821a|xapi] Session.create trackid=c42534bb15f52fca2d3c78eb24c53b6f pool=true uname= is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Nov 14 11:54:11 XEN01 xapi: [ info|XEN01|60168 INET 0.0.0.0:80|dispatch:pool.audit_log_append D:889e73abbbea|taskhelper] task pool.audit_log_append R:cab79e921db0 (uuid:195de8fa-bb36-da7a-b0fb-e47a94aaa6bb) created (trackid=25eb942e72f17ff4085dbc647e0c2832) by task D:5c0143bda5f6
Nov 14 11:54:11 XEN01 xapi: [ info|XEN01|60507 INET 0.0.0.0:80|dispatch:pool.audit_log_append D:f85093519631|taskhelper] task pool.audit_log_append R:4be9ac327f18 (uuid:a5a3799f-65bc-97e6-a701-3fcd2578e9b7) created (trackid=c42534bb15f52fca2d3c78eb24c53b6f) by task D:0a2c0f72c283
Nov 14 11:54:11 XEN01 xapi: [ info|XEN01|60168 INET 0.0.0.0:80|session.logout D:4c97a3e3f0ff|xapi] Session.destroy trackid=25eb942e72f17ff4085dbc647e0c2832
Nov 14 11:54:11 XEN01 xapi: [ info|XEN01|60507 INET 0.0.0.0:80|session.logout D:daf44a37fb75|xapi] Session.destroy trackid=c42534bb15f52fca2d3c78eb24c53b6f
Nov 14 11:54:15 XEN01 xapi: [ info|XEN01|60908 UNIX /var/xapi/xapi|session.slave_login D:ce159e13b770|xapi] Session.create trackid=9f2be8244b37a7ce2eb8544d1c331261 pool=true uname= is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Nov 14 11:54:15 XEN01 xapi: [ info|XEN01|60910 UNIX /var/xapi/xapi|dispatch:SR.scan D:6251ae0b5f15|taskhelper] task SR.scan R:be3e07d70569 (uuid:6e9031b9-720c-41bc-78dc-56a5712b13bb) created (trackid=9f2be8244b37a7ce2eb8544d1c331261) by task D:9889aa2c1309
Nov 14 11:54:15 XEN01 xapi: [ info|XEN01|60910 UNIX /var/xapi/xapi|SR.scan R:be3e07d70569|storage_impl] SR.scan dbg:OpaqueRef:be3e07d7-0569-bc37-16dc-20d5e70077f6 sr:6b3e6149-527d-58f2-c368-9b10a3b956cc
Nov 14 11:54:15 XEN01 xapi: [ info|XEN01|60910 UNIX /var/xapi/xapi|sm_exec D:c6f50c525e77|xapi] Session.create trackid=078ad521383282dc46eabd05a00dbf32 pool=false uname= is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Nov 14 11:54:15 XEN01 xapi: [ info|XEN01|60910 UNIX /var/xapi/xapi|sm_exec D:c6f50c525e77|xapi] Session.destroy trackid=078ad521383282dc46eabd05a00dbf32
Nov 14 11:54:15 XEN01 xapi: [ info|XEN01|60921 UNIX /var/xapi/xapi|session.logout D:00ac0588055a|xapi] Session.destroy trackid=9f2be8244b37a7ce2eb8544d1c331261
Nov 14 11:54:21 XEN01 xenstored: D7 write data/meminfo_free 45312
Nov 14 11:54:21 XEN01 xenstored: A5234 w event /local/domain/7/data/meminfo_free /local/domain/7/data/meminfo_free
Nov 14 11:54:21 XEN01 xenstored: D7 write data/updated Fri Nov 14 11:54:19 CET 2014
Nov 14 11:54:21 XEN01 xenstored: A10 w event /local/domain/7/data/updated /local/domain/7/data/updated
Nov 14 11:54:21 XEN01 xenstored: A6 w event /local/domain/7/data/updated /local/domain/7/data/updated
Nov 14 11:54:21 XEN01 xenstored: A10955 getdomain 7
Nov 14 11:54:22 XEN01 xapi: [ info|XEN01|60790|local logout in message forwarder D:bbd7945601b2|xapi] Session.destroy trackid=055d7e84306be7ada67bd9aea239c7cf
Nov 14 11:54:22 XEN01 xenstored: D8 write data/meminfo_free 548568
Nov 14 11:54:22 XEN01 xenstored: A5234 w event /local/domain/8/data/meminfo_free /local/domain/8/data/meminfo_free
Nov 14 11:54:22 XEN01 xenstored: D8 write data/updated 1
Nov 14 11:54:22 XEN01 xenstored: A10 w event /local/domain/8/data/updated /local/domain/8/data/updated
Nov 14 11:54:22 XEN01 xenstored: A6 w event /local/domain/8/data/updated /local/domain/8/data/updated
Nov 14 11:54:22 XEN01 xenstored: D8 write data/update_cnt 7163
Nov 14 11:54:22 XEN01 xenstored: A10958 getdomain 8
Nov 14 11:54:22 XEN01 xapi: [ info|XEN01|60933|Async.PBD.plug R:dbe068fc641b|dispatcher] spawning a new thread to handle the current task (trackid=9f6a041410012e20513539a7b4ae4758)
Nov 14 11:54:22 XEN01 xapi: [ info|XEN01|60933|Async.PBD.plug R:dbe068fc641b|storage_access] SR 8320fbf5-e147-7811-a857-47e95462d9f5 will be implemented by /services/SM/lvmoiscsi in VM OpaqueRef:96a9fe16-ec92-767e-b2ce-f2c1e58f421a
Nov 14 11:54:22 XEN01 xapi: [ info|XEN01|60933|Async.PBD.plug R:dbe068fc641b|storage_impl] SR.attach dbg:OpaqueRef:dbe068fc-641b-6cb7-1564-0e12c092c2c5 sr:8320fbf5-e147-7811-a857-47e95462d9f5 device_config:[multiSession:10.199.4.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|10.199.1.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|10.199.3.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|10.199.2.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|; target:10.199.1.1; multihomelist:10.199.4.1:3260,10.199.1.1:3260,10.199.2.1:3260,10.199.3.1:3260; targetIQN:*; SCSIid:3300000008b693c12; port:3260]
Nov 14 11:54:22 XEN01 xapi: [ info|XEN01|60933|sm_exec D:0ff2f8eaf8d9|xapi] Session.create trackid=a295a8917dd25092615b3afbd9a9efef pool=false uname= is_local_superuser=true auth_user_sid= parent=trackid=9834f5af41c964e225f24279aefe4e49
Nov 14 11:54:23 XEN01 kernel: [66770.393133] scsi209 : iSCSI Initiator over TCP/IP
Nov 14 11:54:24 XEN01 kernel: [66770.654198] scsi 209:0:0:0: Direct-Access FreeBSD iSCSI Disk 0123 PQ: 0 ANSI: 6
Nov 14 11:54:24 XEN01 kernel: [66770.654453] sd 209:0:0:0: Attached scsi generic sg4 type 0
Nov 14 11:54:24 XEN01 kernel: [66770.654996] sd 209:0:0:0: [sdb] 1048576000 512-byte logical blocks: (536 GB/500 GiB)
Nov 14 11:54:24 XEN01 kernel: [66770.655000] sd 209:0:0:0: [sdb] 8192-byte physical blocks
Nov 14 11:54:24 XEN01 kernel: [66770.655298] scsi 209:0:0:1: Direct-Access FreeBSD iSCSI Disk 0123 PQ: 0 ANSI: 6
Nov 14 11:54:24 XEN01 kernel: [66770.655505] sd 209:0:0:1: Attached scsi generic sg5 type 0
Nov 14 11:54:24 XEN01 kernel: [66770.656119] sd 209:0:0:1: [sdc] 1048576000 512-byte logical blocks: (536 GB/500 GiB)
Nov 14 11:54:24 XEN01 kernel: [66770.656125] sd 209:0:0:1: [sdc] 0-byte physical blocks
Nov 14 11:54:24 XEN01 kernel: [66770.656360] sd 209:0:0:0: [sdb] Write Protect is off
Nov 14 11:54:24 XEN01 kernel: [66770.656868] sd 209:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
Nov 14 11:54:24 XEN01 kernel: [66770.657225] sd 209:0:0:1: [sdc] Write Protect is off
Nov 14 11:54:24 XEN01 kernel: [66770.657630] sd 209:0:0:1: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
Nov 14 11:54:24 XEN01 kernel: [66770.659026] sdb: detected capacity change from 0 to 536870912000
Nov 14 11:54:24 XEN01 kernel: [66770.659031] sdb:
Nov 14 11:54:24 XEN01 kernel: [66770.659794] sdc: detected capacity change from 0 to 536870912000
Nov 14 11:54:24 XEN01 kernel: [66770.659798] sdc: unknown partition table
Nov 14 11:54:24 XEN01 kernel: [66770.663435] unknown partition table
Nov 14 11:54:24 XEN01 kernel: [66770.665927] sd 209:0:0:0: [sdb] Attached SCSI disk
Nov 14 11:54:24 XEN01 kernel: [66770.665938] sd 209:0:0:1: [sdc] Attached SCSI disk
Nov 14 11:54:24 XEN01 iscsid: connection5:0 is operational now
Nov 14 11:54:25 XEN01 multipathd: sdc: add path (uevent)
Nov 14 11:54:25 XEN01 multipathd: sdb: add path (uevent)
Nov 14 11:54:33 XEN01 xenstored: D5 write data/meminfo_free 1285600
Nov 14 11:54:33 XEN01 xenstored: A5234 w event /local/domain/5/data/meminfo_free /local/domain/5/data/meminfo_free
Nov 14 11:54:33 XEN01 xenstored: D5.931 rm attr/eth0
Nov 14 11:54:33 XEN01 xenstored: D5 write attr/eth0/ip x.x.x.x
Nov 14 11:54:33 XEN01 xenstored: D5 write attr/eth0/ipv6/0/addr fe80::5017:97ff:fe51:49b2
Nov 14 11:54:33 XEN01 xenstored: D5 write data/updated Fri Nov 14 11:53:54 CET 2014
Nov 14 11:54:33 XEN01 xenstored: A10 w event /local/domain/5/data/updated /local/domain/5/data/updated
Nov 14 11:54:33 XEN01 xenstored: A6 w event /local/domain/5/data/updated /local/domain/5/data/updated
Nov 14 11:54:33 XEN01 xenstored: A10961 getdomain 5
Nov 14 11:54:41 XEN01 kernel: [66787.903063] sd 209:0:0:0: [sdb] Synchronizing SCSI cache
Nov 14 11:54:41 XEN01 multipathd: sdc: remove path (uevent)
Nov 14 11:54:41 XEN01 kernel: [66787.953008] sd 209:0:0:1: [sdc] Synchronizing SCSI cache
Nov 14 11:54:41 XEN01 multipathd: sdb: remove path (uevent)
Nov 14 11:54:41 XEN01 kernel: [66788.213398] connection5:0: detected conn error (1020)

Any ideas please?

I have tried rebooting the whole infrastructure without any progress.

Thanks

Neil
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
Sorry, forgot to mention: XenServer 6.2, updated to XS62ESP1008.

Jumbo frames are enabled, and have been tested since the update. No packet loss.

Thanks
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
Unfortunately, I cannot tell what is going on just from the logs above. But I can say that I fixed one bug breaking Xen compatibility a couple of months ago, and after that it worked fine in my tests. For this case, I would recommend grabbing a full packet dump of one problematic iSCSI connection with `tcpdump -s 0 -i <interface> -w dump.out port 3260` and then getting it to me in some way (a link?) at mav@FreeBSD.org.
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
OK, I have tried a few things this morning, which have made no difference:

Tried dropping MTU to 1500
Tried reducing number of connections to storage
Tried applying latest updates

I am sending a link to the dump now.

Thanks

Neil
 

mav@

iXsystems
Joined
Sep 29, 2011
Messages
1,428
I've looked through the dump and found no errors there. What I see is that the initiator logs in, probes the disks in all the usual ways, and cleanly logs out. So I suppose the answer should be sought on the Xen side; I think that for some reason Xen does not like what it sees.

Between FreeNAS 9.2 and 9.3 the iSCSI target software was completely changed. While the new one is faster and in most cases has more features, it is simply different; in particular, the reported LUN identifiers have changed. For VMware it is sometimes necessary to perform a magic sequence during this transition: disable the iSCSI target on FreeNAS, do a device rescan in VMware, re-enable the target, and rescan again so VMware discovers the "new" LUNs. I know very little about Xen, but what I would try is deleting the iSCSI disks from the Xen configuration and rediscovering them.
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
OK, I think I've got to the bottom of the cause, just no fix at the moment.

The issue seems to be related, as you say, to the new LUN identifier format; in particular, the SCSIid seems to have changed.

XenServer (or perhaps open-iscsi, which it uses) does not seem to like the trailing _ in the SCSIid returned by FreeNAS.

When you look at the device files created by the iSCSI scan, you see the following:

[root@XEN01 ~]# ls -l /dev/disk/by-id
total 0
lrwxrwxrwx 1 root root 9 Nov 13 17:23 edd-int13_dev80 -> ../../sda
lrwxrwxrwx 1 root root 10 Nov 13 17:23 edd-int13_dev80-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Nov 13 17:23 edd-int13_dev80-part2 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 13 17:23 edd-int13_dev80-part3 -> ../../sda3
lrwxrwxrwx 1 root root 10 Nov 13 17:23 edd-int13_dev80-part4 -> ../../sda4
lrwxrwxrwx 1 root root 9 Nov 16 14:29 scsi-1FreeBSD_iSCSI_Disk_10feed0259f0000 -> ../../sdh
lrwxrwxrwx 1 root root 9 Nov 16 14:29 scsi-1FreeBSD_iSCSI_Disk_10feed0259f0001 -> ../../sdi

However, when you do an sr-probe in XenServer, you see the following returned from FreeNAS:

[root@XEN01 ~]# xe sr-probe type=lvmoiscsi device-config:target=10.199.1.1 device-config:targetIQN=iqn.san.local.ltd.san01istgt:xeniscsi01
Error code: SR_BACKEND_FAILURE_107
Error parameters: , The SCSIid parameter is missing or incorrect, <?xml version="1.0" ?>
<iscsi-target>
<LUN>
<vendor>
FreeBSD
</vendor>
<serial>
10feed0259f001
</serial>
<LUNid>
0
</LUNid>
<size>
107374182400
</size>
<SCSIid>
1FreeBSD_iSCSI_Disk_10feed0259f0010_
</SCSIid>
</LUN>
</iscsi-target>

Notice that the device file name does not have the trailing _, but the SCSIid does.
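
To make the mismatch concrete, here is a tiny sketch of my own (the second value is hypothetical for a matching LUN, since the by-id listing above happens to show different LUNs than this probe):

# Illustration only: a single trailing pad character is enough to make a
# naive string comparison of the two identifiers fail.
probe_scsiid = "1FreeBSD_iSCSI_Disk_10feed0259f0010_"   # as returned by sr-probe
udev_scsiid  = "1FreeBSD_iSCSI_Disk_10feed0259f0010"    # as named under /dev/disk/by-id
print(probe_scsiid == udev_scsiid)               # False
print(probe_scsiid.rstrip("_ ") == udev_scsiid)  # True once the padding is stripped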

When I look at the SCSIid from before the upgrade, the difference is very apparent:

[root@XEN01 ~]# xe pbd-list sr-uuid=8320fbf5-e147-7811-a857-47e95462d9f5
uuid ( RO) : cba720a6-0854-b672-0fcf-2172ba7a5956
host-uuid ( RO): 23c844e5-ec92-4d8b-b168-c2518aff1839
sr-uuid ( RO): 8320fbf5-e147-7811-a857-47e95462d9f5
device-config (MRO): multiSession: 10.199.4.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|10.199.1.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|10.199.3.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|10.199.2.1,3260,iqn.san.local.ltd.san01istgt:xeniscsi00|; target: 10.199.1.1; multihomelist: 10.199.2.1:3260,10.199.3.1:3260,10.199.1.1:3260,10.199.4.1:3260; targetIQN: *; SCSIid: 3300000008b693c12; port: 3260
currently-attached ( RO): false

Does anyone have any ideas on how I can move this forward?

Thanks

Neil
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
Right, I took the trailing spaces out of the device-id in ctl.conf, restarted ctld, and now look at the output from the probe:

[root@XEN01 ~]# xe sr-probe type=lvmoiscsi device-config:target=10.199.1.1 device-config:targetIQN=iqn.san.local.ltd.san01istgt:xeniscsi01
Error code: SR_BACKEND_FAILURE_107
Error parameters: , The SCSIid parameter is missing or incorrect, <?xml version="1.0" ?>
<iscsi-target>
<LUN>
<vendor>
FreeBSD
</vendor>
<serial>
10feed0259f001
</serial>
<LUNid>
0
</LUNid>
<size>
107374182400
</size>
<SCSIid>
1FreeBSD_iSCSI_Disk_10feed0259f0010
</SCSIid>
</LUN>
</iscsi-target>

It appears that one of two things is going on here (or both):

1. The XenServer xe sr-probe is incorrectly formatting the response it gets back from FreeNAS.
2. FreeNAS should not be space-padding the parameter.

I have just re-attached to one of my iSCSI LUNs successfully since making this change.
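
For reference, this is roughly what the edited lun entry in ctl.conf looks like with the padding removed; the pool path and portal-group name here are illustrative, and as I understand it FreeNAS regenerates this file, so a hand edit like this is only a stop-gap:

target iqn.san.local.ltd.san01istgt:xeniscsi01 {
        portal-group pg0
        lun 0 {
                path /dev/zvol/tank/xeniscsi01
                blocksize 512
                serial 10feed0259f001
                # device-id no longer space-padded out to 31 characters
                device-id "iSCSI Disk 10feed0259f001"
        }
}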

Should I raise some kind of enhancement request?

Thanks

Neil
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
I have just had a quick look at the code:

/usr/local/libexec/nas/generate_ctl_conf.py

for i in xrange(31-len(target.iscsi_target_serial)):
    padded_serial += " "
cf_contents.append('\t\t\tdevice-id "iSCSI Disk %s"\n' % padded_serial)
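
For comparison, a minimal sketch of the same append without the padding (my illustration, assuming the surrounding variables are unchanged; not necessarily how the developers would fix it):

cf_contents.append('\t\t\tdevice-id "iSCSI Disk %s"\n' % target.iscsi_target_serial)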

Does the device-id have to be padded?

Thanks

Neil
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
I was thinking about this on the way to work this morning.

If it is because XenServer and VMware have different requirements for the SCSIid, then perhaps the target definition could have a tickbox for "VMware Target". If that is ticked, the associated targets are padded when they are put into the ctl.conf file; if not, they are left alone.
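
Something along these lines in generate_ctl_conf.py, for example; the field name below is invented purely to illustrate the idea:

# Hypothetical per-target flag; "iscsi_target_pad_serial" does not exist in FreeNAS today.
if target.iscsi_target_pad_serial:
    # pad the serial out to 31 characters, as the current code does
    device_id_serial = target.iscsi_target_serial.ljust(31)
else:
    # leave the serial unpadded for initiators that dislike the trailing padding
    device_id_serial = target.iscsi_target_serial
cf_contents.append('\t\t\tdevice-id "iSCSI Disk %s"\n' % device_id_serial)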

Just a thought.

Thanks
 

Mlovelace

Guru
Joined
Aug 19, 2014
Messages
1,111
Neil_McCarthy2 said: "If it is because XenServer and VMware have different requirements for the SCSIid, then perhaps the definition for the target could have a tickbox for 'VMware Target'... Just a thought."

You should submit a bug report (https://bugs.freenas.org/projects/freenas) and link this thread. The FreeNAS developers don't read the forums often.
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
The bug is showing as fixed.

I have switched to the nightly update train, so as soon as this filters through to my box, I'll test and update this thread.
 

jkh

Guest
The bug fix is also in the 9.3-BETA train; you don't need to switch to the nightlies in order to fix this one. Thanks.
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
Sorry for the delay in getting back to everyone.
The change has worked perfectly, and I now have access to the iSCSI devices.
However, please be aware that this requires you to forget your storage repositories, reconnect them, and then reconnect the VMs to their virtual disks. This is further complicated when snapshots are in the mix.
I need to do some more work on this to verify the process. Once I have, I will write it up properly.
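
Until I have verified it properly, the rough shape of what I did was something like the following (from memory and hedged; UUIDs and the SCSIid are placeholders, so please check each step against your own setup):

# Detach and forget the old SR record (repeat pbd-unplug for each host's PBD)
xe pbd-unplug uuid=<pbd-uuid>
xe sr-forget uuid=<sr-uuid>

# Probe again to get the new SCSIid reported by the 9.3 target
xe sr-probe type=lvmoiscsi device-config:target=10.199.1.1 device-config:targetIQN=iqn.san.local.ltd.san01istgt:xeniscsi01

# Re-introduce the existing SR (sr-introduce registers it without touching the LVM contents)
xe sr-introduce uuid=<sr-uuid> type=lvmoiscsi name-label="iSCSI SR" content-type=user shared=true
xe pbd-create sr-uuid=<sr-uuid> host-uuid=<host-uuid> device-config:target=10.199.1.1 device-config:targetIQN=iqn.san.local.ltd.san01istgt:xeniscsi01 device-config:SCSIid=<new-scsiid>
xe pbd-plug uuid=<pbd-uuid>

# After plugging, an sr-scan picks the VDIs back up; the VM-to-VDI links (VBDs)
# still have to be recreated by hand, which is the painful part.
xe sr-scan uuid=<sr-uuid>
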
Thanks
 

Ste

Dabbler
Joined
Sep 12, 2014
Messages
45
Forgetting the repositories is a major pain. I've done it and been left with a bunch of VDIs that I know nothing about except their size. And since they were all the same size, there was no way to tell which VDI went with what VM. I'd have to pick a VM, attach a random VDI to it, see what hostname it comes up with, label the disk, bring down the VM, attach the disk to its proper VM, rinse and repeat, until I had them all relabelled and attached to the proper VMs.
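
Roughly, in xe terms, that identification loop is something like this (a sketch, with placeholder UUIDs):

# List the orphaned VDIs on the re-attached SR
xe vdi-list sr-uuid=<sr-uuid> params=uuid,name-label,virtual-size

# Attach a candidate VDI to a VM, boot it, and see which host it turns out to be
xe vbd-create vm-uuid=<vm-uuid> vdi-uuid=<vdi-uuid> device=1 mode=RW type=Disk
xe vbd-plug uuid=<vbd-uuid>

# Once identified, label the VDI so it is recognisable, then move it to the right VM
xe vdi-param-set uuid=<vdi-uuid> name-label="web01-root"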

I really don't want to have to go through all that again.

I know I can back up all the metadata, but then, to get things back in place, I think I have to delete all the VMs and let them be recreated from the metadata, and I've just been a bit too paranoid about losing everything to try that.
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
When I did it, the VDI names were preserved correctly (assuming the name-label was set); however, it did not distinguish between the snapshots. Fortunately I only had one snapshot on the iSCSI SR.
Like I say, I was trying a lot of things at the time, so it is not all clear in my mind. My intention is to test this more thoroughly and document the process, so that we all have an idea of what we are likely to walk into when upgrading to 9.3.
I suspect that the safest route will be to do a data migration to new SRs on different storage, but some will not have that luxury...
Thanks
 

Ste

Dabbler
Joined
Sep 12, 2014
Messages
45
Given that the release is in two days, I was wondering if you've had a chance to write that documentation you mentioned?
 

Neil_McCarthy2

Dabbler
Joined
Nov 13, 2014
Messages
13
I'm sorry, I haven't; I've been a bit busy. I'll try to get this done and tested overnight tonight.
Thanks
 

Ste

Dabbler
Joined
Sep 12, 2014
Messages
45
No need to apologize - we're all busy. It was just a friendly nudge. :) When you do post it, you might want to do so outside of this particular forum section, as it says it will go away when 9.3 is officially released. Perhaps in the How-To section?
 