vdev I/O failure?

Status
Not open for further replies.

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
Hi Guys,

I've started getting the error messages below. I'm currently running 8.2.0-BETA3; I tried BETA4, but the error messages became more frequent and it threw more problems at me.

I'm not getting any errors from SMART in either FreeNAS or from a WD disk check tool so I'm not totally sure it really is a HDD issue...

Below mentions ada0p2 multiple times, which disk does this refer to? I'll run an extended test on that disk just to be sure.

Code:
Jun 19 21:32:34 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=12189696 size=131072 error=5
Jun 19 21:32:34 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=12058624 size=131072 error=5
Jun 19 21:32:47 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=40501248 size=131072 error=5
Jun 19 21:32:47 nas root: ZFS: vdev I/O failure, zpool=data path= offset=36306944 size=131072 error=5
Jun 19 21:32:47 nas root: ZFS: zpool I/O failure, zpool=data error=5
Jun 19 21:32:47 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=40370176 size=131072 error=5
Jun 19 21:32:47 nas root: ZFS: vdev I/O failure, zpool=data path= offset=36175872 size=131072 error=5


Cheers
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
Oh, I'm now also getting the error message below, with a yellow flashing alert:

Code:
WARNING: The volume data (ZFS) status is UNKNOWN: One
or more devices has experienced an error resulting in data 
corruption. Applications may be affected.Restore the file in 
question if possible. Otherwise restore the entire pool from 
backup.


Note there's no data on these disks yet.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
anth said: "I'm not getting any errors from SMART in either FreeNAS or from a WD disk check tool so I'm not totally sure it really is a HDD issue...

Below mentions ada0p2 multiple times, which disk does this refer to? I'll run an extended test on that disk just to be sure."
You failed to mention your zpool setup. I happened to see your post in the performance sticky, so I know you are running striped mirrors. I would suspect a bad HDD myself, or perhaps bad RAM. Does it only mention ada0p2? Check the serials on the disks to find which one it is.

Scrub your pool and then run:
Code:
zpool status -v
That will show you which mirrors are having errors. Oh, how are the drives connected?
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
Correct, 4 sets of striped mirrors (8x2TB drives total), forgot to mention that sorry.

Yeah, I only seem to get a mention of ada0p2. I ran a scrub, then status, and got the below:

Code:
[root@nas /root]# zpool scrub data                                              
[root@nas /root]# zpool status -v                                               
  pool: data                                                                    
 state: ONLINE                                                                  
 scrub: resilver completed after 0h0m with 0 errors on Wed Jun 20 12:12:11 2012 
config:                                                                         
                                                                                
        NAME        STATE     READ WRITE CKSUM                                  
        data        ONLINE       0     0     0                                  
          mirror    ONLINE       0     0     0                                  
            ada0p2  ONLINE       0     0     0  16K resilvered                  
            ada1p2  ONLINE       0     0     0  20.7M resilvered                
          mirror    ONLINE       0     0     0                                  
            ada2p2  ONLINE       0     0     0                                  
            ada3p2  ONLINE       0     0     0                                  
          mirror    ONLINE       0     0     0                                  
            ada4p2  ONLINE       0     0     0                                  
            ada5p2  ONLINE       0     0     0                                  
          mirror    ONLINE       0     0     0                                  
            ada6p2  ONLINE       0     0     0                                  
            ada7p2  ONLINE       0     0     0                                  
                                                                                
errors: No known data errors


I guess that means the errors are with ada0 and ada1? I'll run extended tests on these drives and report back. Weird it says no errors though :\
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
I ran extended tests on my WD drives ada0 and ada1 using WD lifeguard (WD's official HDD testing tool) and they both passed without errors.

I did zpool clear data and the flashing yellow error message disappeared.

Decided to upgrade to BETA4 and see if it now works, but the yellow flashing error has reappeared:

Code:
WARNING: The volume data (ZFS) status is UNKNOWN: One 
or more devices has experienced an unrecoverable error. An 
attempt was made to correct the error. Applications are 
unaffected.Determine if the device needs to be replaced, and 
clear the errors using 'zpool clear' or replace the device with 
'zpool replace'.


The drives are connected directly to the motherboard SATA ports, no RAID card is in use.

Side note, the error message only seemed to appear when I tried to install and enable the jail. It just continuously shows the loading icon next to plugins and never loads, this is in the console:

Code:
Jun 20 21:00:16 nas notifier: /etc/rc.conf: jail_jail-b4_rootdir=/mnt/data/jail/jail-b4: not found
Jun 20 21:00:16 nas notifier: jail_jail-b4_hostname=jail-b4: not found
Jun 20 21:00:16 nas notifier: jail_jail-b4_devfs_enable=YES: not found
Jun 20 21:00:16 nas notifier: jail_jail-b4_devfs_ruleset=devfsrules_jail: not found
Jun 20 21:00:16 nas notifier: jail_jail-b4_procfs_enable=YES: not found
Jun 20 21:00:16 nas notifier: jail_jail-b4_mount_enable=YES: not found
Jun 20 21:00:16 nas notifier: jail_jail-b4_vnet_enable=YES: not found
Jun 20 21:00:16 nas notifier: Configuring jails:.
Jun 20 21:06:12 nas notifier: /etc/rc.conf: jail_jail-b4_rootdir=/mnt/data/jail/jail-b4: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_hostname=jail-b4: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_devfs_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_devfs_ruleset=devfsrules_jail: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_procfs_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_mount_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_vnet_enable=YES: not found
Jun 20 21:06:12 nas notifier: /etc/rc.conf: jail_jail-b4_rootdir=/mnt/data/jail/jail-b4: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_hostname=jail-b4: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_devfs_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_devfs_ruleset=devfsrules_jail: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_procfs_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_mount_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_vnet_enable=YES: not found
Jun 20 21:06:12 nas notifier: /etc/rc.conf: jail_jail-b4_rootdir=/mnt/data/jail/jail-b4: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_hostname=jail-b4: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_devfs_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_devfs_ruleset=devfsrules_jail: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_procfs_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_mount_enable=YES: not found
Jun 20 21:06:12 nas notifier: jail_jail-b4_vnet_enable=YES: not found
Jun 20 21:06:12 nas notifier: Stopping jails: cannot stop jail jail-b4. No jail id in /var/run
Jun 20 21:06:12 nas notifier: .
Jun 20 21:06:17 nas notifier: security.jail.allow_raw_sockets: 1 -> 1
Jun 20 21:06:18 nas notifier: /etc/rc.conf: jail_jail-b4_rootdir=/mnt/data/jail/jail-b4: not found
Jun 20 21:06:18 nas notifier: jail_jail-b4_hostname=jail-b4: not found
Jun 20 21:06:18 nas notifier: jail_jail-b4_devfs_enable=YES: not found
Jun 20 21:06:18 nas notifier: jail_jail-b4_devfs_ruleset=devfsrules_jail: not found
Jun 20 21:06:18 nas notifier: jail_jail-b4_procfs_enable=YES: not found
Jun 20 21:06:18 nas notifier: jail_jail-b4_mount_enable=YES: not found
Jun 20 21:06:18 nas notifier: jail_jail-b4_vnet_enable=YES: not found
Jun 20 21:06:18 nas notifier: Configuring jails:.


EDIT: I deleted the jail and added it back in and it's now working. Still have the yellow flashing alert though...

EDIT2: The I/O failure has returned:

Code:
Jun 20 21:46:48 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=17227972608 size=131072 error=5
 

William Grzybowski

Wizard
iXsystems
Joined
May 27, 2011
Messages
1,754
A vdev failure is usually either a bad disk or a bad controller... I wouldn't rule out the cable either... Are ada0 and ada1 the only ones failing?
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
Well the controller is just the onboard SATA2 and SATA3 ports.

I've only noticed errors for ada0 in the console, ada1 is in the same mirror, so it's possible it's just that data isn't being copied correctly from ada0?

I did a full extended test with WD's own diagnostics tool and it came up fine? :\ should I try a 3rd party disk checker for bad sectors? Would you recommend any?

I can swap the SATA cables, but it would be such a pain in the ass, everything is so tight in there I'd literally have to strip half the box for 1 cable so I won't do that just yet.

I got more errors this morning when powering the box on:

Code:
Jun 22 16:48:40 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17207001088 size=131072
Jun 22 16:48:40 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17206870016 size=131072
Jun 22 16:48:40 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17206214656 size=131072
Jun 22 16:48:40 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17206607872 size=131072
Jun 22 16:48:40 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17206345728 size=131072
Jun 22 16:48:41 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17209278464 size=131072
Jun 22 16:48:41 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17209540608 size=131072
Jun 22 16:48:41 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=17209409536 size=131072


Again, if you know of any good DOS/Unix live disks that have a good HDD checking tool on them let me know and I'll try that first.

Cheers

EDIT: And some more errors:

Code:
Jun 22 16:55:51 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71610368 size=4096
Jun 22 16:55:51 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71614464 size=4096
Jun 22 16:56:17 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71655424 size=4096
Jun 22 16:56:17 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71634944 size=4096
Jun 22 16:56:17 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71639040 size=4096
Jun 22 16:56:17 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71626752 size=4096
Jun 22 16:56:17 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71659520 size=4096
Jun 22 16:56:17 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71663616 size=4096
Jun 22 16:56:21 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71622656 size=4096
Jun 22 16:56:27 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=88125440 size=131072
Jun 22 16:56:30 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71630848 size=4096
Jun 22 16:56:39 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=71647232 size=4096
Jun 22 16:56:39 nas root: ZFS: checksum mismatch, zpool=data path=/dev/ada0p2 offset=88256512 size=131072


EDIT2: Just some ada1 errors :\ although note some of the error offsets match ada0?

Code:
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada1p2 offset=17250258944 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada1p2 offset=17247768576 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada1p2 offset=17238331392 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada1p2 offset=17250521088 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=17237938176 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=17235841024 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=17250521088 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path= offset=17246326784 size=131072 error=5
Jun 22 17:07:31 nas root: ZFS: zpool I/O failure, zpool=data error=5
Jun 22 17:07:31 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=17252487168 size=131072 error=5
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
anth said: "Well the controller is just the onboard SATA2 and SATA3 ports."
How about a list of all your hardware at this point? Most motherboards have at least 2 different SATA controllers on them.

anth said: "I can swap the SATA cables, but it would be such a pain in the ass, everything is so tight in there I'd literally have to strip half the box for 1 cable so I won't do that just yet."
Right, that would be dumb. Do you have 2 additional cables? If so, unplug the existing cables from the motherboard & hard drives, hook up the new ones and keep them loose in the case.

anth said: "EDIT2: Just some ada1 errors :\ although note some of the error offsets match ada0?"
Which is why you are getting zpool I/O failure. Essentially both disks are "bad" in the same place. Otherwise ZFS could get the info from the other disk in the mirror.
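
To check whether the two disks really do fail in the same place, the offsets in the console log can be compared per device. A minimal sketch, assuming the log lines keep the "ZFS: vdev I/O failure, zpool=... path=... offset=..." format shown in this thread; the function name shared_offsets is made up for illustration:

```sh
# Print every offset that was logged as failing on more than one device.
# Reads syslog lines on stdin; lines with an empty path= are skipped
# because they do not name a leaf device.
shared_offsets() {
    awk '
        /vdev I\/O failure/ {
            path = ""; off = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^path=/)   path = substr($i, 6)
                if ($i ~ /^offset=/) off  = substr($i, 8)
            }
            if (path == "" || off == "") next
            # Count each (offset, device) pair only once.
            if (!((off, path) in seen)) {
                seen[off, path] = 1
                count[off]++
                devs[off] = devs[off] " " path
            }
        }
        END {
            for (off in count)
                if (count[off] > 1) print off devs[off]
        }
    '
}
```

Usage would be something like shared_offsets < /var/log/messages. Fed the Jun 22 17:07:31 lines quoted above, it reports offset 17250521088 for both /dev/ada1p2 and /dev/ada0p2, which is the one offset logged against both halves of the mirror.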

Go rerun the performance test, the write part anyway, and scrub your pool afterward. It should help show if any other disks/controller are having issues.
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
Ok full setup is:

- ASRock Z77 PRO4-M
--- 2 x SATA3 6.0 Gb/s connectors by Intel Z77
--- 2 x SATA3 6.0 Gb/s connectors by ASMedia ASM1061
--- 4 x SATA2 3.0 Gb/s connectors by ???
- 8x 2TB WD Greens in RAID10
- 8GB RAM Kingston HyperX (2x4GB modules)
- i3-2120T 2.6GHz
- 380W PSU
- 3x Zalman ZM-HDR1 hot swappable bays

It's possible that the hot swappable bays are causing an error (possibly a faulty board or something), but because of the hot swappable bays, it's really tight in the case, I would have to take the bays out and the back plate to find which SATA cable is which and replace.

Since the bays are hot swappable, I could just swap the hard drives around; effectively that would rule out the SATA cables, hot swappable board and the SATA controllers? Not sure if FreeNAS will detect that the HDDs have been moved and update the pool dynamically, but I guess I'll find out. There's no data on there at the moment so not much of a worry, I can always destroy the zpool and re-create it.

Below is the current boot up sequence, so it looks like ada0 and ada1 are connected to SATA3 ports:

Code:
Jun 23 20:55:12 nas kernel: ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada0:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada0: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)

Jun 23 20:55:12 nas kernel: ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada1:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada1: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada1: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)

Jun 23 20:55:12 nas kernel: ada2 at ahcich2 bus 0 scbus2 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada2:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada2: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada2: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)

Jun 23 20:55:12 nas kernel: ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada3:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada3: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada3: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)

Jun 23 20:55:12 nas kernel: ada4 at ahcich4 bus 0 scbus4 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada4:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada4: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada4: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)

Jun 23 20:55:12 nas kernel: ada5 at ahcich5 bus 0 scbus5 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada5:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada5: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada5: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada5: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)

Jun 23 20:55:12 nas kernel: ada6 at ahcich6 bus 0 scbus6 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada6:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada6: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada6: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada6: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)

Jun 23 20:55:12 nas kernel: ada7 at ahcich7 bus 0 scbus7 target 0 lun 0
Jun 23 20:55:12 nas kernel: ada7:  ATA-8 SATA 3.x device
Jun 23 20:55:12 nas kernel: ada7: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
Jun 23 20:55:12 nas kernel: ada7: Command Queueing enabled
Jun 23 20:55:12 nas kernel: ada7: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C)
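
The boot messages above already tie each disk to an AHCI channel, which helps when working out which controller a failing disk hangs off. A small sketch that pulls the disk/channel pairs out of lines like these; disk_channels is a hypothetical helper name, and it assumes the "adaN at ahcichN" wording shown above:

```sh
# Print "disk channel" pairs from boot-log lines on stdin.
disk_channels() {
    awk '/ at ahcich[0-9]+ / {
        # The disk name is the word before "at", the channel the word after.
        for (i = 2; i < NF; i++)
            if ($i == "at") { print $(i - 1), $(i + 1); break }
    }'
}
```

For example, dmesg | disk_channels. Which physical controller sits behind each channel is not visible in the excerpt above; the ahcich attach lines in dmesg and the output of pciconf -lv should show that, though that is an assumption about this particular system.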


Did another performance test (seems degraded now) and scrub:

Code:
[root@nas /mnt/data]# dd if=/dev/zero of=tmp.dat bs=2048k count=50k             
51200+0 records in                                                              
51200+0 records out                                                             
107374182400 bytes transferred in 639.409828 secs (167927013 bytes/sec)
[root@nas /mnt/data]# zpool scrub data 


Then a scrub and status:

Code:
action: Determine if the device needs to be replaced, and clear the errors      
        using 'zpool clear' or replace the device with 'zpool replace'.         
   see: http://www.sun.com/msg/ZFS-8000-9P                                      
 scrub: resilver completed after 0h0m with 0 errors on Sat Jun 23 21:16:22 2012 
config:                                                                         
                                                                                
        NAME        STATE     READ WRITE CKSUM                                  
        data        ONLINE       0     7     0                                  
          mirror    ONLINE       0     7     0                                  
            ada0p2  ONLINE       0 24.6K     1  4.11M resilvered                
            ada1p2  ONLINE       0     7     0  24K resilvered                  
          mirror    ONLINE       0     0     0                                  
            ada2p2  ONLINE       0     0     0                                  
            ada3p2  ONLINE       0     0     0                                  
          mirror    ONLINE       0     0     0                                  
            ada4p2  ONLINE       0     0     0                                  
            ada5p2  ONLINE       0     0     0                                  
          mirror    ONLINE       0     0     0                                  
            ada6p2  ONLINE       0     0     0                                  
            ada7p2  ONLINE       0     0     0                                  
                                                                                
errors: No known data errors 


Still only seems like ada0 and ada1, will switch ada0 and ada1 with ada6 and ada7 and report back.
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
EDIT: Apparently my previous post with system specs and additional info is awaiting moderator approval? :S

Ok, I swapped ada0 and ada1 with ada6 and ada7 (thereby using different SATA cables, different controller and different hot swappable boards).

I did another write test then scrub and status and got the below:

Code:
[root@nas] /mnt/data# dd if=/dev/zero of=tmp.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 472.388308 secs (227300677 bytes/sec)

[root@nas] /mnt/data# zpool scrub data

[root@nas] /mnt/data# zpool status -v data
  pool: data
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Sat Jun 23 22:24:22 2012
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     5     0
          mirror    ONLINE       0     0     0
            ada6p2  ONLINE       0     0     1
            ada7p2  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ada2p2  ONLINE       0     0     0
            ada3p2  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ada4p2  ONLINE       0     0     0
            ada5p2  ONLINE       0     0     0
          mirror    ONLINE       0     5     0
            ada0p2  ONLINE       0     5     0  260K resilvered
            ada1p2  ONLINE       0 12.3K     0  12.1M resilvered

errors: No known data errors
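
For comparing write tests between runs, the dd summary line can be turned into MB/s. A rough sketch; dd_rate_mb is an illustrative name, and it assumes the FreeBSD dd summary format shown in this thread ("N bytes transferred in S secs"):

```sh
# Convert dd summary lines on stdin into a MB/s figure (1 MB = 1000000 bytes).
dd_rate_mb() {
    awk '/bytes transferred in/ {
        # $1 = bytes written, $5 = elapsed seconds
        printf "%.1f MB/s\n", $1 / $5 / 1000000
    }'
}
```

dd writes its summary to stderr, so usage would look like dd if=/dev/zero of=tmp.dat bs=2048k count=50k 2>&1 | dd_rate_mb. The 472-second run in this post works out to 227.3 MB/s, against 167.9 MB/s for the 639-second run before the swap.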


The console footer also showed the below errors during the write test:

Code:
Jun 23 22:13:14 nas ntpd[1559]: kernel time sync status change 2001
Jun 23 22:17:40 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=26181632 size=131072 error=5
Jun 23 22:17:40 nas root: ZFS: vdev I/O failure, zpool=data path= offset=21987328 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: zpool I/O failure, zpool=data error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=18972672 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path= offset=14778368 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: zpool I/O failure, zpool=data error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=18841600 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path= offset=14647296 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: zpool I/O failure, zpool=data error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=18710528 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path= offset=14516224 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: zpool I/O failure, zpool=data error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path=/dev/ada0p2 offset=394297344 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: vdev I/O failure, zpool=data path= offset=390103040 size=131072 error=5
Jun 23 22:17:41 nas root: ZFS: zpool I/O failure, zpool=data error=5


The confusing part is that ada0, ada1, ada6 and ada7 are now showing different serial numbers under "view disks" (which is correct, because they've been swapped with each other), but zpool status is now showing ada0 and ada1 at the bottom instead of the top. So I'm unsure whether FreeNAS is still attaching ada0 and ada1 to the original disks or whether the names really have moved to the new disks. If they have moved to the new disks, that would suggest a dodgy cable, controller or hot swap bay. Can someone clear up this confusing part first? :\

Cheers
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
anth said: "The confusing part is that ada0, ada1, ada6 and ada7 are now showing different serial numbers under "view disks" (which is correct, because they've been swapped with each other)"
Any progress on this? What are/were the serials? Do they match what smartctl reports?
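
One way to answer this is to list the serial number behind each adaN name as smartctl sees it and compare it with the sticker on each drive. A sketch only: serial_of and list_serials are made-up helper names, and it assumes smartctl -i prints a "Serial Number:" line for these drives:

```sh
# Extract the serial number from `smartctl -i` output on stdin.
serial_of() {
    awk -F': *' '/^Serial Number:/ { print $2 }'
}

# Print "disk serial" for every disk name given as an argument.
list_serials() {
    for disk in "$@"; do
        printf '%s %s\n' "$disk" "$(smartctl -i "/dev/$disk" | serial_of)"
    done
}
```

For example, list_serials ada0 ada1 ada6 ada7. The adaN names follow the port a drive is plugged into, while ZFS finds pool members by the labels written on the disks themselves, so after a swap the same physical drive can turn up under a different adaN name and in a different row of zpool status.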
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
Apologies for the delay, I just had my tonsils removed so I haven't felt up to fiddling with my NAS...

I checked what's connected to what and it looks like ada0 and ada1 are connected to the same SATA3 chipset, which looks like the ASMedia ASM1061 SATA3 chipset. Are there any known issues with FreeNAS and this SATA chipset?

The board is "ASRock Z77 Pro4-M". Did post this before but the forum put my post in for moderation and no one bothered to approve it :\

I have just swapped the hot swappable bays to make sure they're not the issue, will also need to swap SATA cables to ensure they're not the problem either, will report back when that's done.
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
Alright, quick update: I swapped the SATA cables and that didn't make any difference, so I deleted my zpool and re-created it leaving out ada0 and ada1. These were my results:

Code:
[root@freenas] ~# dd if=/dev/zero of=/mnt/raid/tmp.dat bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 364.313138 secs (294730470 bytes/sec)

[root@freenas] ~# dd if=/mnt/raid/tmp.dat of=/dev/null bs=2048k count=50k
51200+0 records in
51200+0 records out
107374182400 bytes transferred in 296.219817 secs (362481429 bytes/sec)

[root@freenas] ~# zpool scrub raid

[root@freenas] ~# zpool status -v raid
  pool: raid
 state: ONLINE
 scrub: scrub completed after 0h11m with 0 errors on Mon Jul  9 21:11:23 2012
config:

        NAME                                            STATE     READ WRITE CKSUM
        raid                                            ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            gptid/5ccfeb0d-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
            gptid/5d7a5bbc-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada4p2.nop                                  ONLINE       0     0     0
            ada5p2.nop                                  ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            ada6p2.nop                                  ONLINE       0     0     0
            ada7p2.nop                                  ONLINE       0     0     0

errors: No known data errors


I added ada0 and ada1 back in as another mirror just to confirm, then re-ran the tests and instantly got the errors below, followed by FreeNAS locking up and requiring a reboot.

Code:
Jul  9 21:35:42 freenas root: ZFS: vdev I/O failure, zpool=raid path=/dev/ada0p2.nop offset=409731072 size=131072 error=5
Jul  9 21:35:42 freenas root: ZFS: vdev I/O failure, zpool=raid path=/dev/ada0p2.nop offset=409600000 size=131072 error=5
Jul  9 21:35:42 freenas root: ZFS: vdev I/O failure, zpool=raid path=/dev/ada0p2.nop offset=333578240 size=131072 error=5
Jul  9 21:35:42 freenas root: ZFS: vdev I/O failure, zpool=raid path=/dev/ada0p2.nop offset=310771712 size=131072 error=5
Jul  9 21:35:42 freenas root: ZFS: vdev I/O failure, zpool=raid path=/dev/ada0p2.nop offset=310640640 size=131072 error=5
Jul  9 21:40:30 freenas kernel: ahcich0: Timeout on slot 3 port 0
Jul  9 21:40:30 freenas kernel: ahcich0: is 00000000 cs 00000000 ss 00001ff8 rs 00001ff8 tfd 40 serr 00000000


After reboot, did another scrub and status and got the below:

Code:
[root@freenas] ~# zpool scrub raid
[root@freenas] ~# zpool status -v raid
  pool: raid
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Mon Jul  9 21:45:34 2012
config:

        NAME                                            STATE     READ WRITE CKSUM
        raid                                            ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            gptid/5ccfeb0d-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
            gptid/5d7a5bbc-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            gptid/6e848e98-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
            gptid/6f1ce598-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
          mirror                                        ONLINE       0     0     0
            gptid/8e810d98-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
            gptid/8f2229e2-c9c4-11e1-abed-bc5ff4440cea  ONLINE       0     0     0
          mirror                                        ONLINE       0     0     1
            gptid/0d28fa62-c9ca-11e1-abed-bc5ff4440cea  ONLINE       0 2.71K     0  4.88M resilvered
            gptid/0f837d37-c9ca-11e1-abed-bc5ff4440cea  ONLINE       0     0     1

errors: No known data errors


I assume this confirms that the issue is with the two SATA3 ports on the ASM1061 chipset? Though I did google this chipset and other FreeNAS users report no issues...

Side note I did try pairing up ada0 with ada2 and ada1 with ada3, but it made no difference.
 

paleoN

Wizard
Joined
Apr 22, 2012
Messages
1,402
anth said: "I assume this confirms that the issue is with the two SATA3 ports on the ASM1061 chipset? Though I did google this chipset and other FreeNAS users report no issues..."
Have you tried ada0 & ada1 on some of the SATA2 ports? If so and you still get the errors, that would point to the cables or the drives themselves. The issue could be a hardware problem with the motherboard as well.

anth said: "Side note I did try pairing up ada0 with ada2 and ada1 with ada3, but it made no difference."
In that you still got errors on ada0 & ada1?
 

anth

Dabbler
Joined
Mar 27, 2012
Messages
16
paleoN said: "Have you tried ada0 & ada1 on some of the SATA2 ports? If so and you still get the errors, that would point to the cables or the drives themselves. The issue could be a hardware problem with the motherboard as well."

I tried physically swapping the drives connected to the ASM1061 (ada0 and ada1) and the problem remained on ada0 and ada1, indicating it's not a problem with the drives themselves. I did shuffle the cables around to make sure I was using two of the same brand SATA3 cables too, but I might buy some SATA2 cables to see if the problem is resolved by forcing the chipset to run at SATA2...

paleoN said: "In that you still got errors on ada0 & ada1?"

Yep, in that I still got errors with ada0 and ada1.
 