Monitoring for a failed drive

Status
Not open for further replies.

jetter555

Cadet
Joined
Feb 6, 2018
Messages
5
Hello-
I currently have my FreeNAS box uploading syslogs to one of my servers. I'd like to set up
a macro on that syslog server that sends a text message to my phone if one of the drives fails.
The problem is I'm not sure what the log will say when that happens. I need an exact text
string to trigger the text message rule. Does anyone know what a good string to use would be?

I currently have it set to detect "vdev state changed" because that is what it says if I pull a drive.
It works great, except that when it does a scrub once a week I get a text in the middle of the night lol

Thanks!
 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
FreeNAS will email you. You don't want that?

 

Chris Moore

Hall of Famer
Joined
May 2, 2015
Messages
10,080
jetter555 said:
I do currently use the email, but for a failed drive I'd like to know ASAP by text.
I get email notifications on my phone, usually just as quickly as a text message. I don't really see the value of this; however, I have something that might help you. The following excerpts were emailed to me after a drive failure:
Code:
> (da8:mps1:0:7:0): WRITE(10). CDB: 2a 00 67 41 a1 80 00 00 08 00 length 4096 SMID 930 terminated ioc 804b scsi 0 state c xfer 0
> (da8:mps1:0:7:0): WRITE(10). CDB: 2a 00 67 41 a1 80 00 00 08 00
> (da8:mps1:0:7:0): CAM status: CCB request completed with an error
> (da8:mps1:0:7:0): Retrying command
> mps1: mpssas_prepare_remove: Sending reset for target ID 7
> da8 at mps1 bus 0 scbus1 target 7 lun 0
> da8: mps1: <ATA ST2000DM001-1ER1 CC25>Unfreezing devq for target ID 7
>  s/n Z4Z1WQ9Z detached
> (da8:mps1:0:7:0): WRITE(10). CDB: 2a 00 67 41 a1 80 00 00 08 00
> (da8:mps1:0:7:0): CAM status: CCB request aborted by the host
> (da8:mps1:0:7:0): Error 5, Periph was invalidated

Code:
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Error 5, Retries exhausted
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Retrying command
> (ada2:ahcich4:0:0:0): READ_DMA. ACB: c8 00 50 17 15 4c 00 00 00 00 08 00
> (ada2:ahcich4:0:0:0): CAM status: ATA Status Error
> (ada2:ahcich4:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 40 (UNC )
> (ada2:ahcich4:0:0:0): RES: 51 40 53 17 15 4c 00 00 00 08 00
> (ada2:ahcich4:0:0:0): Error 5, Retries exhausted

Code:
> (ada0:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 b0 60 69 5e 40 18 00 00 00 00 00
> (ada0:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
> (ada0:ahcich1:0:0:0): Retrying command
> (ada0:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 b0 10 6a 5e 40 18 00 00 00 00 00
> (ada0:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
> (ada0:ahcich1:0:0:0): Retrying command
> (ada0:ahcich1:0:0:0): READ_FPDMA_QUEUED. ACB: 60 b0 c0 6a 5e 40 18 00 00 00 00 00
> (ada0:ahcich1:0:0:0): CAM status: Uncorrectable parity/CRC error
> (ada0:ahcich1:0:0:0): Retrying command
> (ada1:ahcich1:0:1:0): READ_FPDMA_QUEUED. ACB: 60 00 68 6a 5e 40 18 00 00 01 00 00
> (ada1:ahcich1:0:1:0): CAM status: Uncorrectable parity/CRC error
> (ada1:ahcich1:0:1:0): Retrying command
> (ada2:ahcich1:0:2:0): READ_FPDMA_QUEUED. ACB: 60 b0 48 67 5e 40 18 00 00 00 00 00
> (ada2:ahcich1:0:2:0): CAM status: Uncorrectable parity/CRC error
> (ada2:ahcich1:0:2:0): Retrying command
> (ada1:ahcich1:0:1:0): READ_FPDMA_QUEUED. ACB: 60 50 70 6b 5e 40 18 00 00 00 00 00
> (ada1:ahcich1:0:1:0): CAM status: Uncorrectable parity/CRC error
> (ada1:ahcich1:0:1:0): Retrying command
> (ada2:ahcich1:0:2:0): READ_FPDMA_QUEUED. ACB: 60 b0 f8 67 5e 40 18 00 00 00 00 00
> (ada2:ahcich1:0:2:0): CAM status: Uncorrectable parity/CRC error
> (ada2:ahcich1:0:2:0): Retrying command
> (ada2:ahcich1:0:2:0): READ_FPDMA_QUEUED. ACB: 60 b0 98 66 5e 40 18 00 00 00 00 00
> (ada2:ahcich1:0:2:0): CAM status: Uncorrectable parity/CRC error
> (ada2:ahcich1:0:2:0): Retrying command

These errors were from different systems at different times and caused by different types of faults.
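
If you want to turn the excerpts above into a trigger, here is a rough Python sketch of one way to do it. The pattern list is only my reading of the log lines quoted above, and the function name find_failures is made up for illustration; it is not something FreeNAS or any syslog server provides. Which strings mean a real failure on your hardware is an assumption you should check against your own logs (CRC errors, for example, can also come from a bad cable). It reports each device only once, so a long run of retries does not produce a pile of texts.

```python
import re

# Strings copied from the excerpts above. Treating them as "drive failed"
# markers is an assumption; verify against your own system's logs.
FAILURE_PATTERNS = [
    re.compile(r"Error 5, Retries exhausted"),
    re.compile(r"Error 5, Periph was invalidated"),
    re.compile(r"s/n \S+ detached"),
    re.compile(r"CAM status: Uncorrectable parity/CRC error"),
]

# Device name at the start of a CAM message, e.g. "(ada2:ahcich4:0:0:0):"
DEVICE_RE = re.compile(r"\((\w+):\w+:\d+:\d+:\d+\)")


def find_failures(lines):
    """Return {device: first matching line} for lines that look like failures.

    Lines that match a pattern but carry no "(dev:bus:...)" prefix are
    grouped under the key "unknown".
    """
    hits = {}
    for line in lines:
        if not any(p.search(line) for p in FAILURE_PATTERNS):
            continue
        match = DEVICE_RE.search(line)
        device = match.group(1) if match else "unknown"
        # Keep only the first hit per device so repeats do not re-alert.
        hits.setdefault(device, line.strip())
    return hits


if __name__ == "__main__":
    sample = [
        "> (ada2:ahcich4:0:0:0): Retrying command",
        "> (ada2:ahcich4:0:0:0): Error 5, Retries exhausted",
        "> (da8:mps1:0:7:0): Error 5, Periph was invalidated",
    ]
    for device, line in find_failures(sample).items():
        print(device, "->", line)
```

Note that "Retrying command" on its own is deliberately not in the list, since it shows up many times before the drive actually gives up.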
 

jetter555

Cadet
Joined
Feb 6, 2018
Messages
5
Thanks, that's what I was looking for!
 