Chrisputer
Cadet
- Joined: Aug 16, 2022
- Messages: 5
Hey all,
I've got a bit of a head-scratcher here and I could use some help from my fellow tech wizards. I've been using my four Samsung 845DCs as my go-to scratch space for a while now, but recently things took a strange turn. It all started with some funky alerts popping up in my server's syslog, reporting "write error...corrected" messages on my NVMe drive. Naturally, I went into detective mode and started troubleshooting. After some fiddling around, I moved the drive to a different M.2 slot and everything seemed fine.
But, as is often the case, just when I thought I'd cracked the code, another one of my Samsung SSDs started acting up. At this point, I was feeling a little flabbergasted. How could I be losing my SSDs all of a sudden? I decided to do some benchmarking on my Windows workstation, and sure enough, the drive was working perfectly fine. But as soon as I put it back into my server, another Samsung SSD started showing some concerning errors in the GUI. I checked the SMART data and found some corrected layer 1 issues and CRC errors. After trying to reinsert the drive, TrueNAS gave me a big ol' "faulty" message. What the heck was going on?!
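For anyone who wants to track the CRC count the same way, here is a rough sketch of how it could be pulled out of `smartctl -A` output. It assumes a SATA drive that reports the interface CRC counter as attribute 199 (often named `CRC_Error_Count` or `UDMA_CRC_Error_Count`) and the usual ten-column attribute table; the function names are made up for illustration, and other drives may lay things out differently.

```python
import subprocess

# Assumption: attribute 199 is the SATA interface CRC error counter.
CRC_ATTRIBUTE_ID = "199"


def crc_error_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if not found."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is column 10.
        if len(fields) >= 10 and fields[0] == CRC_ATTRIBUTE_ID:
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_attributes(device):
    """Run `smartctl -A` on a device (needs smartmontools and root)."""
    result = subprocess.run(
        ["smartctl", "-A", device],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Hypothetical usage: print(crc_error_count(read_attributes("/dev/sda")))
```

If the count only climbs while the drive sits in one machine and stays flat in another, that usually points at the cabling, backplane, or controller rather than the drive itself, though that is a rule of thumb and not a guarantee.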
I tried creating a new pool with the problematic drive, but no luck. An error message popped up saying the drive couldn't be used because it wasn't starting at 0. I've been in the IT game for 15 years and I've never seen anything like this before. Just when I thought things couldn't get any weirder, all of my HDDs started showing the same issue in the server, even though they checked out every time I tested them on my Windows desktop.
After hours of banging my head against the wall, I finally realized that anything going in and out of the PCH (the chipset) was getting corrupted in some way. Once I moved the NVMe drive I mentioned earlier, it was fine because its new slot talked directly to the CPU, but the other M.2 slot went through the PCH, which seemed to be the source of the problem.
I ended up getting a new CPU and motherboard, and everything seems to be running smoothly except for those darn Samsung SSDs. I've tried everything I can think of, but one of them just won't show up properly, and it's causing my system to hang at boot. I'm hoping to get the drive fixed, but I'm running low on cash after buying all this new hardware and a bunch of new HDDs within the same week.
Any ideas? Thanks in advance for any help or suggestions you can offer!