SOLVED Bluefin iSCSI crashes QEMU hosts

kiler129

Dabbler
Joined
Apr 16, 2016
Messages
22
Ok, this one is bizarre and I really don't know where to start debugging it or how to provide meaningful logs. All my VMs with iSCSI storage using a direct libiscsi connection from TNS started crashing after the TNS upgrade to Bluefin.

There's nothing in the TNS logs besides a message stating that the iSCSI session was unexpectedly closed. The hypervisor (KVM, Proxmox) reports only cryptic messages like:
Code:
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 18 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 7 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 3 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 34 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 7 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 18 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 1 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 5 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 5 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: malloc(): invalid size (unsorted)


What's important is that the last line varies. While the memory management issue is certainly somewhere in the QEMU code, there are two important factors here:

  1. Something has changed in Bluefin causing TASK_SET_FULL (which leads to the crash) to occur in the first place
  2. iSCSI was perfectly stable on Angelfish
I wonder how I can even debug this and make a meaningful bug report.
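One idea for getting something actionable into a ticket, sketched below under the assumption that systemd-coredump and gdb can be installed on the Proxmox host (they are not part of a stock install): let systemd catch the core dump of the aborting QEMU process and pull a full backtrace out of it.
Code:
# on the Proxmox host (assumption: neither package is installed by default)
apt install systemd-coredump gdb

# after the next crash, the dump of the aborted kvm/QEMU process should be listed here
coredumpctl list

# open the most recent dump in gdb and collect a backtrace of every thread for the report
coredumpctl gdb
# (gdb) thread apply all bt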
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
iXsystems has previously indicated that they are not currently focused on iSCSI on SCALE and recommend that you instead use CORE if you need iSCSI.
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Ok, this one is bizarre and I really don't know where to start debugging it or how to provide meaningful logs. All my VMs with iSCSI storage using a direct libiscsi connection from TNS started crashing after the TNS upgrade to Bluefin.

There's nothing in the TNS logs besides a message stating that the iSCSI session was unexpectedly closed. The hypervisor (KVM, Proxmox) reports only cryptic messages like:
Code:
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 18 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 7 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 3 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 34 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 7 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 18 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 1 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 5 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 5 ms): TASK_SET_FULL
Dec 16 03:49:49 hv-chi QEMU[7313]: malloc(): invalid size (unsorted)


What's important is that the last line varies. While the memory management issue is certainly somewhere in the QEMU code, there are two important factors here:

  1. Something has changed in Bluefin causing TASK_SET_FULL (which leads to the crash) to occur in the first place
  2. iSCSI was perfectly stable on Angelfish
I wonder how I can even debug this and make a meaningful bug report.

If you've tested with Angelfish, please report the bug with the data you have. If we have more questions we'll ask in the ticket.

If anyone else sees (or doesn't see) the issue when using iSCSI, please comment.
 

igsbox

Cadet
Joined
Feb 11, 2023
Messages
3
I can confirm this issue.

Config:
TrueNAS-SCALE-22.12.0
Proxmox VE 7.3-4
Volumes for Proxmox are shared via iSCSI, using the zfs-over-iscsi backend for Proxmox: https://github.com/TheGrandWazoo/freenas-proxmox (storage config roughly as in the sketch below)
6 VMs, 2 with medium disk I/O load
Network: dedicated VLAN for iSCSI traffic, 1 Gbit
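For reference, the relevant /etc/pve/storage.cfg entry for this kind of setup looks roughly like the sketch below. All values are illustrative, and the freenas_* keys are from memory of the plugin's README, so treat them as assumptions and verify against the repo linked above (the leading comment line is an annotation for this post, not part of the real file).
Code:
# sketch only, not a verbatim paste of a working config
zfs: truenas-iscsi
        blocksize 8k
        iscsiprovider freenas
        pool nas2-ssd/home
        portal 172.30.1.23
        target iqn.2023-01.igshal.nas2:home
        content images
        sparse 1
        freenas_user root
        freenas_password <api-password>
        freenas_use_ssl 1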

After migrating from Illumos to TrueNAS, 1 VM with medium disk I/O began to stop unexpectedly, as described in https://ixsystems.atlassian.net/browse/NAS-119466 (now closed)

Workaround:
I changed the LUN for the problem VM and its uptime is now ~70 hours
 

igsbox

Cadet
Joined
Feb 11, 2023
Messages
3
Unfortunately, I celebrated too early. The VM crashed again.
Log from TrueNAS:
Code:
Feb 11 18:56:09 nas2 kernel: [2358826]: scst: TM fn NEXUS_LOSS_SESS/6 (mcmd 00000000866d7f97, initiator iqn.1993-08.org.debian:01:604325509a3e#172.30.1.23, target iqn.2023-01.igshal.nas2:home)
Feb 11 18:56:09 nas2 kernel: [10388]: scst: TM fn 6 (mcmd 00000000866d7f97) finished, status 0
Feb 11 18:56:09 nas2 kernel: [2358826]: iscsi-scst: Freeing conn 00000000e022a1a1 (sess=00000000a41e2688, 0x4000014bf7b80 0, initiator iqn.1993-08.org.debian:01:604325509a3e#172.30.1.23)
Feb 11 18:56:09 nas2 kernel: [2358826]: iscsi-scst: Freeing session 00000000a41e2688 (SID 4000014bf7b80)
And then the VM is restarted by an HA rule from Proxmox:
Code:
Feb 11 18:56:14 nas2 kernel: [10490]: scst: Using security group "security_group" for initiator "iqn.1993-08.org.debian:01:604325509a3e#172.30.1.23" (target iqn.2023-01.igshal.nas2:home)
Feb 11 18:56:14 nas2 kernel: [10490]: iscsi-scst: Session 0000000028939c04 created: target 000000005fc6d95b, tid 3, sid 0x70000cbe59980, initiator iqn.1993-08.org.debian:01:604325509a3e#172.30.1.23
Feb 11 18:56:14 nas2 kernel: [10490]: iscsi-scst: Negotiated parameters: InitialR2T No, ImmediateData Yes, MaxConnections 1, MaxRecvDataSegmentLength 1048576, MaxXmitDataSegmentLength 262144,
Feb 11 18:56:14 nas2 kernel: [10490]: iscsi-scst: MaxBurstLength 262144, FirstBurstLength 65536, DefaultTime2Wait 0, DefaultTime2Retain 0,
Feb 11 18:56:14 nas2 kernel: [10490]: iscsi-scst: MaxOutstandingR2T 1, DataPDUInOrder Yes, DataSequenceInOrder Yes, ErrorRecoveryLevel 0,
Feb 11 18:56:14 nas2 kernel: [10490]: iscsi-scst: HeaderDigest None, DataDigest None, OFMarker No, IFMarker No, OFMarkInt 2048, IFMarkInt 2048, RDMAExtensions No
Feb 11 18:56:14 nas2 kernel: [10490]: iscsi-scst: Target parameters set for session 70000cbe59980: QueuedCommands 32, Response timeout 90, Nop-In interval 30, Nop-In timeout 30
Feb 11 18:56:14 nas2 kernel: [10490]: iscsi-scst: Creating connection 0000000056d5f744 for sid 0x70000cbe59980, cid 0 (initiator iqn.1993-08.org.debian:01:604325509a3e#172.30.1.23
Crash log on Proxmox node:
Code:
Feb 11 18:56:09 media-xen QEMU[2919333]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 14 ms): TASK_SET_FULL
Feb 11 18:56:09 media-xen QEMU[2919333]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 4 ms): TASK_SET_FULL
Feb 11 18:56:09 media-xen QEMU[2919333]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 5 ms): TASK_SET_FULL
Feb 11 18:56:09 media-xen QEMU[2919333]: kvm: ../block/block-backend.c:1273: blk_wait_while_drained: Assertion `blk->in_flight > 0' failed.
Feb 11 18:56:09 media-xen kernel: [536153.797479] fwbr201i0: port 2(tap201i0) entered disabled state
Feb 11 18:56:09 media-xen kernel: [536153.797750] fwbr201i0: port 2(tap201i0) entered disabled state
Feb 11 18:56:09 media-xen systemd[1]: 201.scope: Succeeded.
Feb 11 18:56:09 media-xen systemd[1]: 201.scope: Consumed 11h 11min 9.707s CPU time.
And the VM restart on Proxmox:
Code:
Feb 11 18:56:10 media-xen pve-ha-lrm[4036693]: starting service vm:201
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: start VM 201: UPID:media-xen:003D9861:0332176C:63E7BA9A:qmstart:201:root@pam:
Feb 11 18:56:10 media-xen pve-ha-lrm[4036693]: <root@pam> starting task UPID:media-xen:003D9861:0332176C:63E7BA9A:qmstart:201:root@pam:
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_lun_command : list_lu(/dev/zvol/nas2-ssd/home/vm-201-disk-0)
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_check : called
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_connect : called
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_connect : REST connection header Content-Type:'text/html'
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_connect : Changing to v2.0 API's
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_connect : called
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_connect : REST connection header Content-Type:'application/json; charset=utf-8'
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_connect : REST connection successful to '172.30.1.23' using the 'https' protocol
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_check : successful : Server version: TrueNAS-SCALE-22.12.0
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_check : TrueNAS-SCALE Unformatted Version: 22120000
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_check : Using TrueNAS-SCALE API version v2.0
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_globalconfiguration : called
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_globalconfiguration : target_basename=iqn.2023-01.igshal.nas2
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_list_lu : called
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : called
Feb 11 18:56:10 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_get_targetid : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_get_targetid : successful : 3
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target_to_extent : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target_to_extent : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_extent : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_extent : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_list_lu : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_lu : called with (method: 'list_lu'; result_value_type: 'name'; param[0]: '/dev/zvol/nas2-ssd/home/vm-201-disk-0')
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_lu : TrueNAS object to find: 'zvol/nas2-ssd/home/vm-201-disk-0'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_lu 'zvol/nas2-ssd/home/vm-201-disk-0' with key 'name' found with value: '/dev/zvol/nas2-ssd/home/vm-201-disk-0'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_lun_command : list_view(/dev/zvol/nas2-ssd/home/vm-201-disk-0)
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_view : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_list_lu : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_get_targetid : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_get_targetid : successful : 3
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target_to_extent : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target_to_extent : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_extent : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_extent : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_list_lu : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_lu : called with (method: 'list_view'; result_value_type: 'lun-id'; param[0]: '/dev/zvol/nas2-ssd/home/vm-201-disk-0')
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_lu : TrueNAS object to find: 'zvol/nas2-ssd/home/vm-201-disk-0'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_lu 'zvol/nas2-ssd/home/vm-201-disk-0' with key 'lun-id' found with value: '2'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_lun_command : list_extent(/dev/zvol/nas2-ssd/home/vm-201-disk-0)
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_extent : called with (method: 'list_extent'; params[0]: '/dev/zvol/nas2-ssd/home/vm-201-disk-0')
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_list_lu : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_get_targetid : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_get_targetid : successful : 3
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target_to_extent : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_target_to_extent : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_extent : called
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : called for host '172.30.1.23'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_api_call : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_iscsi_get_extent : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::freenas_list_lu : successful
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_extent TrueNAS object to find: 'zvol/nas2-ssd/home/vm-201-disk-0'
Feb 11 18:56:11 media-xen pve-ha-lrm[4036705]: PVE::Storage::LunCmd::FreeNAS::run_list_extent 'zvol/nas2-ssd/home/vm-201-disk-0' wtih key 'naa' found with value: '0x6589cfc000000a475c44b5b92e5f9f7d'
 

morganL

Captain Morgan
Administrator
Moderator
iXsystems
Joined
Mar 10, 2018
Messages
2,694
Have you tested with TrueNAS 13.0 or SCALE Angelfish?
That would help confirm and locate the problem.

Bluefin will get its first update this month... if the problem persists, then we should create a bug report.
 

igsbox

Cadet
Joined
Feb 11, 2023
Messages
3
Have you tested with TrueNAS 13.0 or SCALE Angelfish?
No, sorry :) I just recently started an experiment migrating from Solaris to TrueNAS, and I immediately chose TrueNAS SCALE for its better support of modern hardware.
Bluefin will get its 1st update this month...if the problem persists then we should create a bug report.
Good news :) Of course.
 

LucaE

Cadet
Joined
Mar 21, 2023
Messages
5
If anyone else sees (or doesn't see) the issue when using iSCSI, please comment.
Hello everyone,

I wanted to contribute here by reporting that I'm running into the same behavior/error.
My setup is TrueNAS-SCALE-22.12.0 + iSCSI + Proxmox (PVE) 7.3.3

What I've done
- Created VMs using qm create on the CLI
- Imported 7 existing VMs (.qcow2 and .img) from an older "bare" QEMU setup into Proxmox using qm importdisk onto an iSCSI device (roughly as in the sketch below)
- All machines are now stored on the same iSCSI device
- Started the VMs using the Proxmox GUI
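For context, the import for each machine went roughly like the sketch below; VM ID, image path and storage name are placeholders rather than my exact values.
Code:
# create an empty VM, import the old image onto the iSCSI-backed storage, then attach and boot from it
qm create 101 --name legacy-vm --memory 2048 --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci
qm importdisk 101 /root/images/legacy-vm.qcow2 truenas-iscsi
qm set 101 --scsi0 truenas-iscsi:vm-101-disk-0 --boot order=scsi0
qm start 101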

Result
- 5 of 7 VMs run as expected
- 2 VMs show the following errors
- Both failing VMs could be started exactly once (one VM on the first try and the other on roughly the 5th try)

Hint?
- Both failing VMs are very old (Ubuntu 12.04 and 14.04)
- BUT one of the working VMs also runs Ubuntu 14.04 ...

Behaviour
After I start the specific VM in Proxmox, I get the following lines in the log files:

Proxmox log
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 9 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 4 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 2 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 6 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 7 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 4 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 11 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 2 ms): TASK_SET_FULL
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: iSCSI Busy/TaskSetFull/TimeOut (retry #1 in 2 ms): TASK_SET_FULL
+
Mar 17 13:54:53 BMproxmox QEMU[225022]: kvm: ../block/block-backend.c:1555: blk_aio_read_entry: Assertion `qiov->size == acb->bytes' failed.
or +
Mar 17 13:53:20 BMproxmox QEMU[224744]: malloc(): invalid next size (unsorted)

Truenas log
Mar 17 05:54:17 truenas iscsi-scstd[387468]: Connect from 192.168.10.152:58624 to 192.168.10.151:3260
Mar 17 05:54:17 truenas kernel: [387468]: scst: Using security group "security_group" for initiator "iqn.1993-08.org.debian:01:1242f03a2ae#192.168.10.151" (target iqn.2021-12.xyz.abc.truenas:proxmox)
Mar 17 05:54:17 truenas kernel: [387468]: iscsi-scst: Session 0000000061afb657 created: target 000000001239a84c, tid 2, sid 0x340000b7016980, initiator iqn.1993-08.org.debian:01:1242f03a2ae#192.168.10.151
Mar 17 05:54:17 truenas kernel: [387468]: iscsi-scst: Negotiated parameters: InitialR2T No, ImmediateData Yes, MaxConnections 1, MaxRecvDataSegmentLength 1048576, MaxXmitDataSegmentLength 262144,
Mar 17 05:54:17 truenas kernel: [387468]: iscsi-scst: MaxBurstLength 262144, FirstBurstLength 65536, DefaultTime2Wait 0, DefaultTime2Retain 0,
Mar 17 05:54:17 truenas kernel: [387468]: iscsi-scst: MaxOutstandingR2T 1, DataPDUInOrder Yes, DataSequenceInOrder Yes, ErrorRecoveryLevel 0,
Mar 17 05:54:17 truenas kernel: [387468]: iscsi-scst: HeaderDigest None, DataDigest None, OFMarker No, IFMarker No, OFMarkInt 2048, IFMarkInt 2048, RDMAExtensions No
Mar 17 05:54:17 truenas kernel: [387468]: iscsi-scst: Target parameters set for session 340000b7016980: QueuedCommands 32, Response timeout 90, Nop-In interval 30, Nop-In timeout 30
Mar 17 05:54:17 truenas kernel: [387468]: iscsi-scst: Creating connection 000000005ef342b5 for sid 0x340000b7016980, cid 0 (initiator iqn.1993-08.org.debian:01:1242f03a2ae#192.168.10.151)

// Waiting while console of booting vm outputs "Waiting for boot device"

Mar 17 05:54:26 truenas kernel: [4107811]: iscsi-scst: ***ERROR***: Connection 000000005ef342b5 with initiator iqn.1993-08.org.debian:01:1242f03a2ae unexpectedly closed!
Mar 17 05:54:26 truenas kernel: [4103083]: scst: TM fn NEXUS_LOSS_SESS/6 (mcmd 00000000264077e1, initiator iqn.1993-08.org.debian:01:1242f03a2ae#192.168.10.151, target iqn.2021-12.xyz.abc.truenas:proxmox)
Mar 17 05:54:26 truenas kernel: [387397]: scst: TM fn 6 (mcmd 00000000264077e1) finished, status 0
Mar 17 05:54:26 truenas kernel: [4103083]: iscsi-scst: Freeing conn 000000005ef342b5 (sess=0000000061afb657, 0x340000b7016980 0, initiator iqn.1993-08.org.debian:01:1242f03a2ae#192.168.10.151)
Mar 17 05:54:26 truenas kernel: [4103083]: iscsi-scst: Freeing session 0000000061afb657 (SID 340000b7016980)
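For anyone trying to line up the two sides while reproducing, a minimal way to watch both logs live; plain journalctl/dmesg, nothing TrueNAS-specific, with the "QEMU" identifier and scst filter taken from the messages above.
Code:
# on the Proxmox node: follow the QEMU messages (syslog identifier "QEMU", as seen in the log above)
journalctl -f -t QEMU

# on the TrueNAS SCALE box (shell): follow the scst / iscsi-scst kernel messages
dmesg --follow | grep -i scst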
 

LucaE

Cadet
Joined
Mar 21, 2023
Messages
5
Behaviour
After I started the specific vm in proxmox I get the following lines in the log-files:

Getting into detail, I completely forgot to describe what happens: the booting VM stops/crashes immediately at the point marked "// Waiting while console of booting vm outputs 'Waiting for boot device'", before Proxmox gives the mentioned log output. TNS then produces the output I quoted right after that marker.
 

jgreco

Resident Grinch
Joined
May 29, 2011
Messages
18,680
No, sorry :) I just recently started an experiment migrating from Solaris to TrueNAS, and I immediately chose TrueNAS SCALE for its better support of modern hardware.

For the hardware that you should be using for a NAS, SCALE offers no better support than CORE.
 

LucaE

Cadet
Joined
Mar 21, 2023
Messages
5
Getting into detail, I completely forgot to describe what happens: the booting VM stops/crashes immediately at the point marked "// Waiting while console of booting vm outputs 'Waiting for boot device'", before Proxmox gives the mentioned log output. TNS then produces the output I quoted right after that marker.
Because ticket https://ixsystems.atlassian.net/browse/NAS-119466 was closed without changes ("Feedback Timeout"), I created a new one. We'll see if this helps: https://ixsystems.atlassian.net/browse/NAS-121134
 