AHCI timeouts


ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
But is there any way FreeNAS can continue to work with the damaged drive (I think I'll have to wait a few weeks before getting the new drive)? Can't it "mark" the damaged sectors so it doesn't use them anymore?
Do I have to run another scrub?

The drive is supposed to mark the sectors bad automatically; I think those are the 'pending' ones that SMART reported. Scrubbing will not help if the disk is failing; it could just make the disk fail faster. Since you're only using raidz1, I would not risk doing anything until you have a new disk. I have a few of the Samsung drives and the first thing I did before using them was upgrade the firmware and run HD Tune Pro diagnostics on them.

The AHCI timeouts are only happening with the one disk? If it were FreeBSD-related, you would expect to see similar errors on the other disks, but Durkatlon also has a valid explanation. Hopefully a new disk will solve the problem.
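If you want to keep an eye on those counters from the FreeNAS shell, smartctl can do it. A rough sketch, assuming the suspect disk shows up as /dev/ada2 (hypothetical device name):

    # print the raw SMART attribute values we care about
    smartctl -A /dev/ada2 | egrep 'Reallocated_Sector|Current_Pending_Sector|Offline_Uncorrectable'
    # start an extended (long) offline self-test, then read the log once it finishes
    smartctl -t long /dev/ada2
    smartctl -l selftest /dev/ada2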
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
Ah yes, maybe.
So I should run an extended SMART test from another PC, in IDE mode.
If the Currently unreadable (pending) sectors and Offline uncorrectable sectors counts clear to zero, it means it was the FreeNAS environment that was messing up the SMART results...

Do you guys have any SMART software to recommend? (preferably DOS-based, so I can put it on my bootable USB DOS key)
 

Durkatlon

Patron
Joined
Aug 19, 2011
Messages
414
OK, I hadn't noticed that the time-outs were only on one disk. When I had them, I got them on all disks except the USB "disk". So this may be a different problem and indeed related to a bad drive. That's easy to verify I suppose by plopping a different drive in that slot.
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
I'm going to use Ultimate Boot CD; GSmartControl is included. Thanks for the advice.

While waiting for the drive to be replaced, should I do something with my ZFS pool? (a scrub, something else?)
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
While waiting for the drive to be replaced, should I do something with my ZFS pool? (a scrub, something else?)

No. If your pool is online and you can still access your data, I would leave it alone: don't copy new files or do anything except read/copy your data to someplace else. You only have raidz1, so don't risk another disk failing while your pool is degraded.
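If you want a quick way to keep an eye on the pool state while you wait, something like this works (a sketch, assuming a hypothetical pool name 'tank'):

    # per-disk state plus any read/write/checksum errors ZFS has seen
    zpool status -v tank
    # overall capacity and health at a glance
    zpool list tank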
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
Well actually, the pool is not "degraded". The drive with 315 bad sectors is still in it. Can't it just work like this for a few weeks? I have a lot of writes to do.

Another question: my pool contains 2 raidz1 vdevs. If one fails, is the other affected?

EDIT: yes, actually, when I send the drive in for RMA, the pool will be degraded... I might buy a new one right away and keep the replacement as a spare...
 

ProtoSD

MVP
Joined
Jul 1, 2011
Messages
3,348
EDIT: yes, actually, when I send the drive in for RMA, the pool will be degraded... I might buy a new one right away and keep the replacement as a spare...

I almost suggested that, but I didn't know if money was a concern. When I sent my disk in for RMA, I bought a new disk instead of waiting. When the replacement arrived I kept it installed as a 'cold' spare. I usually don't trust replacements because most of the time they are returned/refurbished disks that someone else already had a problem with. In my situation I got lucky and my disk was new.

Another question: my pool contains 2 raidz1 vdevs. If one fails, is the other affected?

I'm not completely clear on how this works, but I think if you ran the first vdev degraded and lost another disk, the rest of the pool would be affected.
 

survive

Behold the Wumpus
Moderator
Joined
May 28, 2011
Messages
875
Hi guys,

As I understand it, if you have a zpool made up of 2 (or more) virtual devices (individual raidz's), you lose the whole pool if you lose one of the vdevs. You *might* be able to salvage the data on the remaining vdev, but if you can, it's simply because you were lucky!

If you have an existing zpool made up of 1 vdev and you add a second vdev when the first is half full, the existing data does not get restriped across both vdevs; the system will start writing new data to both until the original vdev is full. Data added after the second vdev will live on both until the first one fills up, and then the system will finish by filling the second vdev.

With 8 drives you lose the same number of drives to parity with 2 raidz vdevs as you would with a single 8-drive raidz2. Your system won't have to work as hard to generate the parity information, but you do have a higher risk of data loss if the wrong 2 disks fail.
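For illustration, the two layouts would be created roughly like this (just a sketch, with hypothetical device names da0-da7 and pool name 'tank'):

    # one pool striped across two raidz1 vdevs (what djoole has now)
    zpool create tank raidz da0 da1 da2 da3 raidz da4 da5 da6 da7
    # versus one pool with a single 8-drive raidz2 vdev
    zpool create tank raidz2 da0 da1 da2 da3 da4 da5 da6 da7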

-Will
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
A raidz2 with 8 drives wasn't an option; I don't have enough backup space to put my data somewhere else during the creation of the pool.

Now I have the first vdev nearly full, and nothing on the second one (the one with the weird disk).

Given what you say, survive, I would like to remove the second vdev from the pool and put it in a different pool, so that I don't lose everything if a vdev fails.

Is it possible to remove a vdev from a pool without breaking the pool? If you tell me no, I guess I'll have to find 6TB of backup space... :(
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
For now, the long SMART test is still running, and no timeouts; this is the first time I've gotten the test this far (60% remaining).
I'm completely lost now. The SMART extended offline test is over, and it passed. No errors found.
The Currently unreadable (pending) sectors and Offline uncorrectable sectors counts are still at 315.
Is this disk okay or not? I can't figure it out.
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
Is it possible to remove a vdev from a pool without breaking the pool?

No, it's not possible - doing so would destroy the zpool.

A zpool with two vdevs will stripe data across both vdevs - you'll have some data in vdev#1, and the rest of your data in vdev#2. Lose either of the vdevs and you lose the entire zpool.

I'm not sure why you claim the first vdev is nearly full and the second vdev completely empty - that's very unlikely unless you added the second vdev after loading all your data into the zpool when it consisted of only the first vdev, and have not written any new data since. Even so, removing the second vdev will almost certainly result in the zpool becoming unavailable.
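If you want to check how the data is actually spread across the two vdevs, the per-vdev allocation should show up with something like this (sketch, hypothetical pool name 'tank'):

    # shows alloc/free broken down per vdev and per disk
    zpool iostat -v tank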
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
Ok thanks for the info Milhouse.

I'm not sure why you claim the first vdev is nearly full and the second vdev completely empty - that's very unlikely unless you added the second vdev after loading all your data into the zpool when it consisted of only the first vdev, and have not written any new data since.
That's exactly what I did :)

Even so, removing the second vdev will almost certainly result in the zpool becoming available.
I assume you meant unavailable ;)


Anyway:
- I've bought a new drive to replace the "315 bad sectors" drive (although SeaTools and smartmontools say the disk is perfectly fine)
- I'm going to back up all my data and make a fresh new zpool with only one RAIDZ2, as advised here (the downside of this solution is that I won't be able to grow my volume without replacing ALL 8 drives with 3 TB ones)

I have another question (maybe I should post it in a fresh topic..): when you create a zpool, you can force the use of 4K sectors.
What if the pool I'm creating has mixed drives (512-byte, 4K, SATA2, SATA3...)? Is it better to force it anyway?
 

Milhouse

Guru
Joined
Jun 1, 2011
Messages
564
I assume you meant unavailable ;)

Ha! Yes indeed - corrected post. Thanks.

I have another question (maybe I should post it in a fresh topic..): when you create a zpool, you can force the use of 4K sectors.
What if the pool I'm creating has mixed drives (512-byte, 4K, SATA2, SATA3...)? Is it better to force it anyway?

Force 4K; that's what I would do.

Forcing 4K on non-4K (non-Advanced Format) drives doesn't seem to hurt performance in any significant way, and it has the advantage that if you later replace a non-4K drive with a 4K drive (e.g. when upgrading a 2TB drive to 3TB for more storage, or if non-4K drives become hard to find and you need a replacement) you won't have to worry about sector alignment.
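For what it's worth, as far as I know the 'force 4K' option amounts to creating the pool with ashift=12; from the command line the usual trick at the time is gnop. A rough sketch (hypothetical device and pool names, double-check before running anything like this):

    # expose one disk as a 4K-sector device, then build the pool on top of it
    gnop create -S 4096 ada0
    zpool create tank raidz2 ada0.nop ada1 ada2 ada3 ada4 ada5 ada6 ada7
    # ashift is fixed at creation time, so the gnop device can be removed afterwards
    zpool export tank
    gnop destroy ada0.nop
    zpool import tank
    # verify: should report ashift: 12
    zdb | grep ashift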
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
Ok, thanks for the tip.

Well, it seems you just can't do it perfectly the first time with FreeNAS :D (talking about choosing the right zpool structure).

So now I'm ready to do it properly:
- back up my data
- create a fresh zpool with a single raidz2 of 8 drives, forced to 4K (the new one I just bought today is 4K ^^ )
- transfer the data
- enjoy it (hoping that the AHCI timeouts I got will be history, and that they were down to the bad Samsung firmware and a Seagate with bad sectors).
Anyway, if I get AHCI timeouts with the new pool, I'll try switching to IDE mode in the BIOS, as advised by Durkatlon.

Thanks to all of you.
I'll keep in touch :)
 

holzmann

Dabbler
Joined
Sep 14, 2011
Messages
27
Hi everyone.

I am encountering the exact same error.

FreeNAS 8.0.2
RAIDZ2 + 4K
Mobo: ASUS M4A88T-M LE (NB: AMD 880G, SB: AMD SB710)
RAM: Kingston 8GB (2 x 4GB) 240-Pin DDR3 SDRAM DDR3 1333 (PC3 10600) ECC Unbuffered Server Memory
HDD: 4x SAMSUNG EcoGreen 1TB

The error appears on all of my drives???
 

holzmann

Dabbler
Joined
Sep 14, 2011
Messages
27
I can duplicate the errors every time I try to copy a 16GB directory over from the old server to the new server.

About 10 minutes into it, I start getting "ahcichX: Timeout on slot NN port 0" where X ranges from 0-3 for all four disks. They go on and on.

WTF!
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
Oops, I forgot to keep you in the loop.
So: I never got any AHCI timeouts after replacing the Seagate with bad sectors.

Life is good, with a raidz2 and a weekly scrub (thanks to Milhouse's script).
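In case anyone wants the same setup without the full script, a weekly scrub boils down to a cron entry along these lines (this is not Milhouse's actual script, just a minimal sketch with a hypothetical pool name):

    # /etc/crontab: scrub the pool every Sunday at 03:00
    0  3  *  *  0  root  /sbin/zpool scrub tank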
 

holzmann

Dabbler
Joined
Sep 14, 2011
Messages
27
How did you determine it was the Seagate with bad sectors?

From what I can determine, all four of my brand-new Samsung drives are running fine.
 

djoole

Contributor
Joined
Oct 3, 2011
Messages
158
I didn't; I just assumed it.
There was no other reason. And anyway that disk had to be replaced; it's not good to leave a disk with bad sectors in a RAID.
And after I changed it (and built another zpool from scratch), everything was okay.

But for your problem I think it's something else.
Are your Samsung EG drives old? There is a firmware issue with them. Just to be sure, you should upgrade the firmware.
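You can check which firmware revision each drive currently reports straight from the FreeNAS shell, for example (hypothetical device name):

    # the 'Firmware Version' line tells you whether the update is needed
    smartctl -i /dev/ada0 | grep -i firmware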
 