Copying files in the same pool? Need help

NathanYung

Cadet
Joined
Jan 10, 2023
Messages
4
[Attached image: CBD4AF20-5480-4F95-BBC5-FB0D411AF48B.jpeg]
Please forgive me if I make any errors. My English is pretty basic.

I have used Solaris ZFS pools for more than 10 years.

Last month, I started trying TrueNAS Core.
After a scrub on the Solaris pool showed no errors, I copied the data from the "Solaris zpool" to the "TrueNAS zpool".
I did it twice, into different directories.

The hard disks are all new: 6 x 18TB in RAIDZ2.

After copying with rsync, I ran a scrub again on TrueNAS Core.
Everything was fine.

>>> Then I copied directory A over directory B (overwriting B), using mc over ssh.
No errors were reported.

But when I played my camera video files in directory B, the files were corrupted.

I checked all my raw photo files and video files as soon as possible.
>>> Small files are all good, but video files bigger than about 500MB are all damaged (the threshold is a guess, because my camera raw files are smaller than 100MB).

>>> I ran a scrub immediately: all video files in B are broken.

Because I had taken a snapshot before the copy, I rolled back. After the rollback, the video files in both A and B are good, with no damage.

I tried the copy again, from A over B, and it happened again: all the big files were corrupted.

>>> A and B are in the same pool.

I ran memtest86 after this on the 128GB of memory, and it found no errors.

Machine: HP DL380P Gen8
No such errors happened on Solaris 11.4 in the past 10 years.

I really need some help.
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
You're showing 141K checksum errors on all of your drives in the HomeNAS pool.

That says either a bad cable, backplane or SATA/SAS controller to me.

If you'd care to share more (in line with the forum rules) about your hardware, maybe there's something about that which may show itself to be the cause.
 

NathanYung

Cadet
Joined
Jan 10, 2023
Messages
4
Thank you for the reply.
The 141K errors appeared after copying directory A over directory B inside the same pool.
Copying files with rsync from another machine running Solaris is OK, with no errors.
I have copied over 40TB of data from the Solaris ZFS zpool without any errors (copy, then scrub).

>>> Errors only happen when copying inside the same pool, and only with larger files (maybe larger than 500MB).
Files over 1GB all go wrong.

I hope someone can help confirm this issue.

Machine: HP DL380P Gen8
128GB registered DIMMs
E5-2697 V2 x2

I've tested my backplanes, cables, HBA cards, and memory, and I don't think any of them have problems.
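
One way to pin down which files are affected, and whether the roughly 500MB threshold is real, is to hash every file in A and compare it with its copy in B. The following is only a minimal sketch, not something taken from this thread: the function names `sha256_of` and `compare_trees` are made up for illustration. It also assumes that a file ZFS cannot repair will fail with a read error rather than return bad data, so it records those files as "unreadable".

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in 1 MiB chunks so multi-GB videos need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_dir, copy_dir):
    """List files under source_dir whose copy under copy_dir is bad.

    Returns (relative_path, size_in_bytes, status) tuples, where status is
    "missing", "unreadable", or "mismatch". Files that match are not listed.
    """
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = copy_dir / rel
        size = src.stat().st_size
        if not dst.is_file():
            problems.append((str(rel), size, "missing"))
            continue
        try:
            same = sha256_of(src) == sha256_of(dst)
        except OSError:
            # Assumption: an unrepairable block surfaces as an I/O error.
            problems.append((str(rel), size, "unreadable"))
            continue
        if not same:
            problems.append((str(rel), size, "mismatch"))
    return problems
```

Calling `compare_trees("/mnt/HomeNAS/A", "/mnt/HomeNAS/B")` (hypothetical paths) and sorting the result by size would show whether every damaged file really sits above a certain size.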
 

sretalla

Powered by Neutrality
Moderator
Joined
Jan 1, 2016
Messages
9,703
With 6 disks showing 141K Checksum errors, I would suggest looking at dmesg for CAM STATUS or other items logged by CAM.

Have you run short or long SMART tests on the disks?

How are your disks connected, and to what controller?
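
To make the dmesg check easier to read, the CAM messages can be counted per disk. This is a rough sketch only: the function name `cam_status_by_device` is made up, and it assumes the usual FreeBSD layout where a CAM line starts with a prefix such as `(da0:mps0:0:8:0):` followed by `CAM status: ...`. If the log lines on your system look different, the pattern will need adjusting.

```python
import re
from collections import Counter

# Assumed prefix of a FreeBSD CAM message, e.g. "(da0:mps0:0:8:0): ...".
CAM_LINE = re.compile(r"^\((?P<device>[a-z]+\d+):[^)]*\):\s*(?P<message>.*)$")


def cam_status_by_device(dmesg_text):
    """Count lines containing "CAM status" for each device name (da0, da1, ...)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = CAM_LINE.match(line.strip())
        if match and "CAM status" in match.group("message"):
            counts[match.group("device")] += 1
    return dict(counts)
```

The text to feed in could come from `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`. Errors spread evenly over all six disks would point more toward the controller, backplane, or cabling than toward a single drive.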
 