iSCSI volume status is Degraded; some Windows VMs error during Storage vMotion

Status
Not open for further replies.

SubX

Explorer
Joined
Sep 15, 2017
Messages
56
Background
  • FreeNAS 9.10 with 3 iSCSI volumes (Vol 1 - 2 x 4 TB as a RAID 1 mirror, Vol 2 - 1 x 1.5 TB, Vol 3 - 1 x 1 TB); VMFS 6 datastores created from those volumes
  • 2 x vSphere 6.5 hosts with vCenter Server 6.5
Issues
  • FreeNAS shows a red alert for the volumes above (Vol 1 - volume status is Degraded, "One or more devices has experienced an error resulting in data corruption. Applications may be affected." / Vol 2 - volume status is Online, with the same data-corruption message).
  • Some Windows VMs fail when migrated from Vol 1 to Vol 2, with an error caused by the file vm_name.vmdk; other Windows and Linux VMs can be migrated from Vol 1 to Vol 3 without any problem. Backing up the affected VMs via Veeam 9.5 fails with CRC and read errors; the unaffected VMs back up without problems.
Questions
  • How do I troubleshoot the red alert message from FreeNAS? I suspect it is related to the Storage vMotion errors.
  • I ran smartctl -a on the FreeNAS disks and didn't see any error messages (what I have been running is sketched below).
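A minimal sketch of the commands, assuming the pool names listed above (device names such as /dev/ada0 will vary per system):
Code:
zpool status -v iSCSI4TR1          # pool state plus any files with permanent errors
zpool status -v iSCSI-110-1500G
smartctl -a /dev/ada0              # repeat per member disk; watch attributes 5, 197, 198
smartctl -t long /dev/ada0         # start a long SMART self-test, then re-check smartctl -a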
Thanks, S.
 

SubX

Explorer
Joined
Sep 15, 2017
Messages
56
Thanks, DL.
Attached please find the zpool status output.
  • Vol 1 = iSCSI4TR1
  • Vol 2 = iSCSI-110-1500G
  • Vol 3 is on another FreeNAS box, which is in good condition
 

Attachments

  • putty_FN110_vol_error.txt
    4.5 KB · Views: 364

SubX

Explorer
Joined
Sep 15, 2017
Messages
56
Also, for Vol 2: how should I wipe or format the drive and recreate the iSCSI volume once I move the data off it? Are there any special checks (smartctl -a, scrub) to perform? I just want to make sure that once I recreate the volume, the potential issue is eliminated. For now, when I run smartctl, there appear to be no errors on attributes 5, 197, or 198. A rough check sequence I am considering is sketched below.
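A possible check sequence before recreating the volume, assuming the Vol 2 disk appears as /dev/ada1 (adjust to the actual device):
Code:
smartctl -t long /dev/ada1         # long SMART self-test on the Vol 2 disk
smartctl -a /dev/ada1              # afterwards: check the self-test log and attributes 5, 197, 198
zpool scrub iSCSI-110-1500G        # scrub before migrating data off, to surface latent errors
zpool status -v iSCSI-110-1500G    # list any files with permanent errors
If both come back clean, recreating the pool and its iSCSI zvol/extent from the GUI should start it with a clean slate.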
Thanks a ton! S.
 

SubX

Explorer
Joined
Sep 15, 2017
Messages
56
Zpool Status Output
Code:
=~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2018.03.20 09:15:52 =~=~=~=~=~=~=~=~=~=~=~=

FreeBSD 10.3-STABLE (FreeNAS.amd64) #0 r295946+21897e6695f(HEAD): Tue Jul 25 00:03:12 UTC 2017

FreeNAS (c) 2009-2016, The FreeNAS Development Team
All rights reserved.
FreeNAS is released under the modified BSD license.

For more information, documentation, help or support, go here:
 http://freenas.org
Welcome to FreeNAS
[root@freenas] ~# zpool status

  pool: SMB-110-4TS
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: none requested
config:

NAME                                          STATE   READ WRITE CKSUM
SMB-110-4TS                                   ONLINE     0     0     0
  gptid/5e329e07-225a-11e8-a6d6-001517bb234a  ONLINE     0     0     0

errors: 9 data errors, use '-v' for a list

  pool: freenas-boot
 state: ONLINE
  scan: none requested
config:

NAME          STATE   READ WRITE CKSUM
freenas-boot  ONLINE     0     0     0
  da0p2       ONLINE     0     0     0

errors: No known data errors

  pool: iSCSI-110-1500G
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 3h12m with 15 errors on Wed Mar  7 00:21:16 2018
config:

NAME                                          STATE   READ WRITE CKSUM
iSCSI-110-1500G                               ONLINE     0     0    19
  gptid/4a6a4d9a-c3da-11e7-9a43-6c626d9838e5  ONLINE     0     0    38

errors: 1 data errors, use '-v' for a list

  pool: iSCSI4TR1
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 48K in 3h40m with 60 errors on Sun Mar 18 23:16:08 2018
config:

NAME                                            STATE     READ WRITE CKSUM
iSCSI4TR1                                       DEGRADED     0     0 28.9K
  mirror-0                                      DEGRADED     0     0 57.9K
    gptid/9a7aff67-1712-11e8-ad21-001517bb234a  DEGRADED     0     0 57.9K  too many errors
    gptid/9c8e023e-1712-11e8-ad21-001517bb234a  DEGRADED     0     0 57.9K  too many errors

errors: 1 data errors, use '-v' for a list
[root@freenas] ~# zpool status -vx pool: iSCSI4TR1

cannot open 'pool:': no such pool
  pool: iSCSI4TR1
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 48K in 3h40m with 60 errors on Sun Mar 18 23:16:08 2018
config:

NAME                                            STATE     READ WRITE CKSUM
iSCSI4TR1                                       DEGRADED     0     0 28.9K
  mirror-0                                      DEGRADED     0     0 57.9K
    gptid/9a7aff67-1712-11e8-ad21-001517bb234a  DEGRADED     0     0 57.9K  too many errors
    gptid/9c8e023e-1712-11e8-ad21-001517bb234a  DEGRADED     0     0 57.9K  too many errors

errors: Permanent errors have been detected in the following files:

		iSCSI4TR1/zv-110-4TR1:<0x1>
[root@freenas] ~# zpool status -vx pool: iSCSI4TR1iSCSI-110-1500G

cannot open 'pool:': no such pool
  pool: iSCSI-110-1500G
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 3h12m with 15 errors on Wed Mar  7 00:21:16 2018
config:

NAME                                          STATE   READ WRITE CKSUM
iSCSI-110-1500G                               ONLINE     0     0    19
  gptid/4a6a4d9a-c3da-11e7-9a43-6c626d9838e5  ONLINE     0     0    38

errors: Permanent errors have been detected in the following files:

		iSCSI-110-1500G/zv-110-1500G:<0x1>
[root@freenas] ~# 
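(Side note: in 'zpool status -vx pool: iSCSI4TR1' above, the literal text 'pool:' was pasted into the command, which is why zpool complained about no such pool; the intended command is simply the following.)
Code:
zpool status -v iSCSI4TR1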
 

SubX

Explorer
Joined
Sep 15, 2017
Messages
56
The situation is getting worse; the alert now appears almost every day. Could it be a memory issue?

This thread describes a similar issue that turned out to be memory related: https://forums.freenas.org/index.ph...orruption-applications-may-be-affected.57337/

Could someone give me a hand here?

Many Thanks,
S.

(screenshot attached: upload_2018-3-21_15-47-2.png)
 
D

dlavigne

Guest
The disks in those 2 pools have died and need to be replaced, the pools recreated, and the data restored from backup. Since a one-disk pool has no redundancy, consider using multiple disks when you recreate your pools.
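As a minimal sketch of what a redundant replacement could look like from the CLI (hypothetical pool and device names; on FreeNAS the equivalent is normally done through the GUI Volume Manager):
Code:
zpool create tank-new mirror /dev/da1 /dev/da2   # two-disk mirror: survives one disk failure
zpool status tank-new                            # verify both members are ONLINE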
 

SubX

Explorer
Joined
Sep 15, 2017
Messages
56
Thanks.

I wiped Vol 2 and created a new volume with a new vSphere 6.5 datastore. I can create a new VM in the new datastore; however, backing it up with Veeam throws an error, and trying to Storage vMotion it to another datastore fails with an error accessing the vmdk file, even though the VM itself runs fine. I suspect a memory error is causing data written to Vol 2 to be lost or corrupted.

Following the memory-test advice in the FreeNAS thread linked above, I have had memtest86 running for 27 hours (still going), and so far it has reported 682 memory errors. Being new to memtest86, I will try to work out which module is faulty and check whether removing it resolves the issue.

Thanks,
S
 

SubX

Explorer
Joined
Sep 15, 2017
Messages
56
The system has 18 GB of memory in total (3 x 4 GB Kingston, 3 x 2 GB Corsair). I reran memtest86 on the Kingston modules only: 7 passes without error. The Corsair modules have been removed from the system.

FreeNAS is now running fine, and after 4-5 different tests the Storage vMotion and Veeam backup issues are all gone.

Trying to get a better understanding of the FreeNAS logs: when I reboot the NAS, the system still shows the same alert as before.

(screenshot attached: upload_2018-3-25_12-18-36.png)
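Presumably the pools still remember the old errors, which is why the alert survives a reboot. A sketch of what I plan to run now that the bad memory is out (same pool names as above):
Code:
zpool scrub iSCSI4TR1        # re-scrub with the bad RAM removed
zpool status -v iSCSI4TR1    # confirm the scrub completes without new errors
zpool clear iSCSI4TR1        # reset the error counters so the alert can clear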


I logged into the system and found the following in /var/log/messages. Can someone help me with these questions?
  • Why is there no log entry for the 3 critical alerts above?
  • Are the alerts captured in another log under /var/log?
  • I noticed online that Splunk can be downloaded for free for up to 500 MB of log data per day. Can anyone share their experience setting up Splunk with FreeNAS? I can't find much help online; I might open another thread on this topic. (A rough sketch of what I am trying is below the log output.)
Code:
[root@freenas] /var/log# cat messages
Mar 25 00:00:00 freenas newsyslog[17366]: logfile turned over due to size>100K
Mar 25 00:00:00 freenas syslog-ng[1615]: Configuration reload request received, reloading configuration;
Mar 25 00:44:38 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 00:44:38 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Mar 25 01:44:51 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 01:44:51 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Mar 25 01:58:01 freenas update_check.py: [freenasOS.Configuration:713] Unable to load http://update.ixsystems.com/FreeNAS/trains.txt: <urlopen error [Errno 8] hostname nor servname provided, or not known>
Mar 25 01:58:01 freenas update_check.py: [freenasOS.Configuration:713] Unable to load http://update-master.ixsystems.com/FreeNAS/trains.txt: <urlopen error [Errno 8] hostname nor servname provided, or not known>
Mar 25 01:58:01 freenas update_check.py: [freenasOS.Configuration:727] Unable to load ['http://update.ixsystems.com/FreeNAS/trains.txt', 'http://update-master.ixsystems.com/FreeNAS/trains.txt']: <urlopen error [Errno 8] hostname nor servname provided, or not known>
Mar 25 02:45:03 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 02:45:03 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Mar 25 03:45:17 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 03:45:17 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Mar 25 04:45:31 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 04:45:31 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Mar 25 05:45:45 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 05:45:45 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Mar 25 06:45:59 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 06:45:59 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
Mar 25 07:46:12 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s system-product-name
Mar 25 07:46:12 freenas alert.py: [common.pipesubr:66] Popen()ing: /usr/local/sbin/dmidecode -s baseboard-product-name
[root@freenas] /var/log# ls
./                           dmesg.today        nginx/            sssd/
../                          lpd-errs           nginx-access.log  telemetry.json.bz2
3ware_raid_alarms.today      maillog            nginx-error.log   ups.log
3ware_raid_alarms.yesterday  maillog.0.bz2      pbid.log          userlog
auth.log                     mdnsresponder.log  pf.today          utx.lastlogin
cron                         messages           ppp.log           utx.log
daemon.log                   messages.0.bz2     proftpd/          wtmp
debug.log                    middlewared.log    samba4/           xferlog
debug.log.0.bz2              mount.today        security
[root@freenas] /var/log# messages.0.bz2
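For the rotated log and the Splunk idea, a rough sketch of what I am trying (the Splunk path and command assume a default Linux install of Splunk on a separate box):
Code:
bzcat /var/log/messages.0.bz2 | less    # read the rotated log (typing the filename by itself, as above, does nothing useful)
grep -ri degraded /var/log/             # search the plain-text logs for pool-related messages
# on the Splunk box: accept syslog on UDP port 514
/opt/splunk/bin/splunk add udp 514 -sourcetype syslog
On the FreeNAS side, System -> Advanced has a "Syslog server" field for forwarding syslog to a remote host, which should stay well under the 500 MB/day free tier.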

 