I started configuring the multi_report.sh script and, to my surprise, was greeted with a critical error message on the first run.
My boot pool is apparently degraded.
Hardware:
Code:
TrueNAS-SCALE-23.10.1
Supermicro X10SLL-F, i3 4130, 16 Gb ECC RAM, Seasonic Prime PX-750
Data pool: 2*8Tb mirror
Data pool 2: 1*8Tb
boot pool: 1*128 Gb SSD
UPS: Eaton Eco 650
Code:
  pool: boot-pool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:00:31 with 0 errors on Sat Feb 17 03:45:32 2024
config:

        NAME         STATE     READ WRITE CKSUM
        boot-pool    DEGRADED     0     0     0
          sdc3       DEGRADED     0     0     0  too many errors

errors: No known data errors

sdc3 -> sdc3
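To pull just the unhealthy rows out of that status output, a small filter can help. This is only a minimal sketch: `non_online_vdevs` is a hypothetical helper name (not part of multi_report.sh or ZFS), and it assumes the usual `zpool status` layout with a `NAME STATE ...` header row followed by one row per pool or device.

```shell
# Hypothetical helper: read `zpool status` output on stdin and print
# "NAME STATE" for every row of the config table whose state is not ONLINE.
non_online_vdevs() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The "errors:" line marks the end of the table for this pool.
        /^errors:/ { in_config = 0 }
        # Inside the table, report any row whose STATE column is not ONLINE.
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}
```

It would be used as `zpool status boot-pool | non_online_vdevs`, which for the output above should list both boot-pool and sdc3 as DEGRADED.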
Code:
########## SMART status report for sdc drive (INTENSO SSD : AA000000000000001840) ##########

SMART overall-health self-assessment test result: PASSED

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       1632
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       14
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       0
161 Unknown_Attribute       0x0033   100   100   050    Pre-fail  Always       -       100
163 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       8
164 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       10351
165 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       86
166 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       1
167 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       42
168 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       5050
169 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       100
175 Program_Fail_Count_Chip 0x0032   100   100   050    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   050    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0032   100   100   050    Old_age   Always       -       0
178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   050    Old_age   Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   050    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   050    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       3
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       40
195 Hardware_ECC_Recovered  0x0032   100   100   050    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   050    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   050    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0032   100   100   050    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       0
232 Available_Reservd_Space 0x0032   100   100   050    Old_age   Always       -       100
241 Total_LBAs_Written      0x0030   100   100   050    Old_age   Offline      -       27485
242 Total_LBAs_Read         0x0030   100   100   050    Old_age   Offline      -       2999
245 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       116886

No Errors Logged

Most recent Short & Extended Tests - Listed by test number
# 1  Short offline       Completed without error       00%      1487         -
# 2  Extended offline    Completed without error       00%      1381         -

SCT Error Recovery Control:
SCT Error Recovery Control command not supported
While running the script I also received these messages:
Code:
Collecting data, Please wait...
Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
fdisk -l shows that these messages relate to zd0, which would be the zvol for my pfSense VM.
Code:
Disk /dev/zd0: 12 GiB, 12884918272 bytes, 25165856 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 16384 bytes
I/O size (minimum/optimal): 16384 bytes / 16384 bytes
Disklabel type: gpt
Disk identifier: 9D15D700-C5E8-11EE-BAA7-00A0982879CB

Device        Start      End  Sectors  Size Type
/dev/zd0p1       40   532519   532480  260M EFI System
/dev/zd0p2   532520   533543     1024  512K FreeBSD boot
/dev/zd0p3   534528  2631679  2097152    1G FreeBSD swap
/dev/zd0p4  2631680 25163775 22532096 10.7G FreeBSD ZFS

Partition 1 does not start on physical sector boundary.
Partition 2 does not start on physical sector boundary.
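The alignment messages can be checked by hand from the numbers fdisk prints: a partition starts on a physical sector boundary when its start sector times the logical sector size (512 bytes here) is a multiple of the physical sector size (16384 bytes as reported for zd0). The following is a rough sketch of that arithmetic; `is_aligned` is a hypothetical helper name, and the start sectors are copied from the partition table above.

```shell
# Hypothetical helper: succeed if a partition start lies on a physical
# sector boundary.
# Arguments: start sector, logical sector size (bytes), physical sector size (bytes).
is_aligned() {
    byte_offset=$(( $1 * $2 ))
    [ $(( byte_offset % $3 )) -eq 0 ]
}

# Start sectors of zd0p1..zd0p4 as listed by fdisk.
for first_sector in 40 532520 534528 2631680; do
    if is_aligned "$first_sector" 512 16384; then
        echo "start $first_sector: aligned"
    else
        echo "start $first_sector: not aligned"
    fi
done
```

Partitions 1 and 2 start at byte offsets that leave a 4096-byte remainder when divided by 16384, while partitions 3 and 4 divide evenly, which matches the two partitions fdisk complains about.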
This leads me to think I can ignore these warnings and assume they are not related to the degraded boot pool.
My course of action would be to zpool clear the boot pool and wait to see what happens. The main reason for my question here, though, is: why is the status degraded if no error is logged? Where can I check?
edit: I almost forgot another confusing thing: I did not receive an email alert or an alert in the GUI.
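A rough sketch of that plan, under the assumption that the ZFS event log still holds the underlying error reports: `clear_and_recheck` is a hypothetical wrapper name, and the commands inside it are the standard `zpool events`, `zpool clear`, `zpool scrub` and `zpool status` subcommands. It needs to be run as root, and it only shows the order of steps rather than a recommended fix.

```shell
# Hypothetical wrapper: look at the logged events, clear the error state,
# start a scrub and show the pool status again.
# Argument: pool name (for example boot-pool).
clear_and_recheck() {
    pool=$1
    # Show the detailed event log first; it may explain the "too many errors" state.
    zpool events -v "$pool"
    # Reset the error counters and the DEGRADED state of the pool's devices.
    zpool clear "$pool"
    # Start a scrub so every block is read and verified again.
    zpool scrub "$pool"
    # Show the current state; run this again once the scrub has finished.
    zpool status -v "$pool"
}
```

Calling `clear_and_recheck boot-pool` and then checking `zpool status -v boot-pool` again after the scrub completes would show whether the device goes back to DEGRADED.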