Single memory location corruption with rsync

Status
Not open for further replies.

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
I only have one desktop with a few hard drives (signature) and I've been looking for the proper way to send the files. It seems to really only be from NTFS if coming from another OS as far as I can gather, as all the others seem to have a shortcoming here or there.
All over the forums people are saying that the proper procedure is to use rsync (from within the FreeNAS GUI) to move the files. And I just did that with over 3TB of my own data. I figured there would be a few mismatched ones and zeros but I really didn't care if I lost ten, twenty, or so files. But then once I was done with my files I was going to rsync others' files. And with those I figured I should make sure they were sent correctly. First was a ~40GB tar file. I sent it, checked it, and the md5 was wrong. Renamed it and sent it a second time, whilst first was still there, and when I tried md5, I got an input/output error. Deleted them both, tried again and again and again and again and yes again a few more times. Wrong wrong wrong wrong!
Seeing as those files weren't mine I had them backed up and sent it via CIFS on a Windows machine. Correct!
Deleted it again and sent it via cp command (FreeNAS GUI). Correct again! I tried via each again a few more times and they were all correct.
Deleted it and sent it again via rsync. Wrong!

So, should we be concerned about using rsync or what?
Oh right, and I did check files to see if they were correct at the very beginning, but I may have "accidentally" used cp for those, before I read that the people's choice was rsync.
 

fracai

Guru
Joined
Aug 22, 2012
Messages
1,212
I've never seen this sort of behavior. I'd suspect failing hardware, but your other transfers were fine. Can you post the exact commands that you were using?
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I'd be very concerned. I'd start looking at what is going on. Maybe a RAM test on both the source and destination machine. I know one person had pictures that were often corrupt when copying them over CIFS shares and was getting upset. Turned out his desktop's RAM was bad and it was corrupting files as they were loaded into RAM to be sent over the network.

You are doing the right thing by checking via MD5. Don't stop doing that, because it's the right thing to do, especially since you are trying to validate that rsync isn't trashing your data. Rsync should be doing its own CRC checks and validating that the destination file matches the source. There are lots of options for rsync, though, and you can disable the CRC checks and whatnot. If your RAM is good on both ends, my guess is that rsync isn't doing its own CRC checks (is it disabled or not supported?).
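For what it's worth, the MD5 comparison can be scripted so it runs the same way after every copy. This is only a minimal sketch in Python, assuming both files are readable from the machine running it; `file_md5` and `copies_match` are names made up for this example and are not part of rsync or FreeNAS.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 of a file, read in chunks so a large tar
    does not have to fit in memory all at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path, destination_path):
    """True when source and destination hash to the same MD5."""
    return file_md5(source_path) == file_md5(destination_path)
```

Keep in mind that on a machine with bad RAM the check itself runs through that same RAM, so a mismatch tells you something is wrong but not where.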

I know using protocols like FTP can be ugly, even on local LANs, because there is no packet CRC. I corrupted a bunch of data in 2008 when I tried to FTP 2TB over the LAN. Learned my lesson on that one. CIFS is a fairly safe bet with its own checksumming.

What program are you using on the Windows side to do the Rsyncing?
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
fracai: The one I've seen on here is
rsync -av source destination

cyberjock:
(Might want to skip to the edit - this is just explaining everything I explained terribly before, probably useless information. I'm just too tired right now, I'm about to head off to bed, just leaving it here in case it is important)
I'm guessing having run the test months ago a few times and it having been fine means nothing and I should do so again and leave it running overnight.
Destination machine? Both are one and the same. Or does it mean the computer that is running the commands in the FreeNAS GUI via terminal? When I did copy it from another machine, it worked fine.
I would think corrupted RAM would result in errors regardless of which way I tried? CIFS/cp/rsync

Well, I showed the options I use at the front, but maybe a or v disables the check? The rsync I'm talking about is the FreeNAS rsync from dataset1 to dataset3 on drive1 (currently in the machine: 2x3TB (media), 2TB (other data, drive1), and 16GB of RAM).

Ouch! 2TB!

I'm not sure I conveyed what's happening well enough. I'll try again if I didn't already clarify.

I boot up my FreeNAS system(Desktop). Then from Laptop1 I log into FreeNAS GUI and from within the GUI I hit "Shell" open up the terminal and enter in:
rsync -av /source/file /destination/location
Once that is copied, I md5 the destination file. And then a few times like I described above.
Then I boot up Laptop2 (the Windows machine) and open up explorer and go to my share (Z: drive named somethingshare) that I set up. I plug in my external backup and drag and drop the ~40GB file into the share. Once it is completed I go back to Laptop1 and md5 returns the correct value
Then still within "Shell" (on Laptop1 via FreeNAS GUI) I try cp and md5 returns the correct value. Each and every time. With cp or drag and drop in Windows.
I then try rsync -av in "Shell", exactly the same way, and md5 returns the wrong value.

I seriously hope that you comprehend me now. I really have difficulties expressing what I try to say. Even who I ask ends up getting confused and my points get jumbled up.

EDIT:And an edit before I hit enter that can't be good...
I just started the test and I get this:
[Attached screenshot: 2aea0b60-4091-4795-92fe-24da3fac7909_zps940fa926.jpg]


Where do I go from here ? Throw out my RAM ? Or probably just better off buying a new rig ? Although I did hear a friend say that before running memtest86+ his computer was acting funny and that afterwards it was running completely fine. So, I guess now I'm asking does the output of this get saved anywhere and this address gets flagged like a bad sector or completely hopeless ?
Also, still think it's weird that with rsync it always failed and with everything else it didn't. Just a coincidence I suppose. Damn coincidences.
And in the time it took to upload that picture, I got 2 more red lines in what seems to be the same address and "Errors" has gone to 100+.

Need some rest now, thanks a lot for all the help.
 

cyberjock

Inactive Account
Joined
Mar 25, 2012
Messages
19,525
I'm guessing having run the test months ago a few times and it having been fine means nothing and I should do so again and leave it running overnight.

Absolutely. I've seen RAM go from just fine one day to BSODing a system like crazy the next.
Destination machine? Both are one and the same. Or does it mean the computer that is running the commands in the FreeNAS GUI via terminal? When I did copy it from another machine, it worked fine.
I mean both the FreeNAS server and the system you are copying data from.
I would think corrupted RAM would result in errors regardless of which way I tried? CIFS/cp/rsync

Nope. You're assuming that the commands would happen to be allocated the same locations and they'd happen to be bad. They definitely don't use the same amount of RAM, and are assigned memory locations based on what's free at that moment in time.
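To put some numbers on that, here is a toy model, not how the kernel actually hands out memory. It assumes a single faulty address and buffers placed uniformly at random, which real allocators do not do; `BAD_ADDRESS`, `MEMORY_SIZE`, and both function names are invented for the illustration.

```python
import random

BAD_ADDRESS = 700_000     # hypothetical single faulty location
MEMORY_SIZE = 1_000_000   # toy "RAM" size in bytes


def buffer_hits_bad_address(start, length, bad_address=BAD_ADDRESS):
    """True when the buffer [start, start + length) covers the faulty address."""
    return start <= bad_address < start + length


def corruption_rate(buffer_length, trials=10_000, seed=0):
    """Fraction of randomly placed buffers of this size that cover the bad address."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Pick a start so the whole buffer fits inside the toy memory.
        start = rng.randrange(0, MEMORY_SIZE - buffer_length + 1)
        if buffer_hits_bad_address(start, buffer_length):
            hits += 1
    return hits / trials
```

In this model a program holding a large buffer lands on the bad spot far more often than one holding a small buffer, so two tools copying the same file can see very different failure rates.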


EDIT: And an edit before I hit enter that can't be good...
I just started the test and I get this:
[Attached screenshot: 2aea0b60-4091-4795-92fe-24da3fac7909_zps940fa926.jpg]


Where do I go from here ? Throw out my RAM ? Or probably just better off buying a new rig ? Although I did hear a friend say that before running memtest86+ his computer was acting funny and that afterwards it was running completely fine. So, I guess now I'm asking does the output of this get saved anywhere and this address gets flagged like a bad sector or completely hopeless ?
Also, still think it's weird that with rsync it always failed and with everything else it didn't. Just a coincidence I suppose. Damn coincidences.
And in the time it took to upload that picture, I got 2 more red lines in what seems to be the same address and "Errors" has gone to 100+.

Need some rest now, thanks a lot for all the help.

If you are overclocking, spank yourself and get rid of it. If you aren't overclocking then you are now in "identify the problem" mode. It could be a stick of RAM or a slot. It could even be something like your power supply voltage fluctuating and causing data to be lost in RAM. Now you start testing sticks one at a time and testing different combinations of slots to find the actual fault. If it's RAM you can probably RMA it if you have a receipt. If it's a RAM slot and the motherboard isn't in warranty, time to buy a new motherboard (and probably processor). It's important to keep in mind that you are trying to narrow down the issue to 1 particular thing (RAM slot or RAM stick). So if 1 RAM stick fails the test you still need to test the other 3. If those give errors too then you have to figure out if you really have 2+ sticks of bad RAM (unlikely but not necessarily 0% chance) or what is actually wrong. It also helps if you have a spare machine that you can run memory testing in that you KNOW passes its RAM tests.
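If it helps to keep the bookkeeping straight while swapping sticks around, the narrowing-down logic can be sketched like this. It is a rough sketch only, and it assumes every memtest run is done with exactly one stick installed and that you note down (stick, slot, passed) by hand; the function names and labels are made up for the example.

```python
from collections import defaultdict


def summarize_memtest_runs(runs):
    """Group single-stick test results by stick and by slot.

    runs is a list of (stick, slot, passed) tuples, one per memtest run.
    """
    by_stick = defaultdict(list)
    by_slot = defaultdict(list)
    for stick, slot, passed in runs:
        by_stick[stick].append(passed)
        by_slot[slot].append(passed)
    return by_stick, by_slot


def suspects(runs):
    """Return (sticks, slots) that failed every run they took part in."""
    by_stick, by_slot = summarize_memtest_runs(runs)
    bad_sticks = sorted(s for s, results in by_stick.items() if not any(results))
    bad_slots = sorted(s for s, results in by_slot.items() if not any(results))
    return bad_sticks, bad_slots
```

For example, `suspects([("A", 1, False), ("A", 2, False), ("B", 1, True), ("B", 2, True)])` points at stick A and no slot. If a stick has only ever been tested in one slot, a failure will flag both, which is the cue to test it somewhere else before blaming either.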

There is no log unless you save the log. I believe there's an option to save it to a location that's available from DOS. Sometimes nothing is available though. I just take a picture of the screen like you appear to have done.

...and now you see why some of us senior guys push for ECC RAM. How much data would you have ultimately lost before you figured out that you were losing data? One person didn't know anything was wrong until he corrupted his zpool beyond the ability to mount it. He lost everything and had no backups. :(

I think jgreco put it best when he said ECC RAM is like seatbelts for a car. You might not use it for years, but the one time you do use it you'll be so glad you did.
 

mael

Dabbler
Joined
Jun 27, 2013
Messages
20
Edit: Is there any way to put the stick of RAM in the last slot and then have FreeNAS ignore it? Apparently my motherboard (P7H55; my biggest mistake in assembling this computer was picking the cheapest there was...) refuses to recognize RAM if it's not in single channel when two are in dual channel, but I can't seem to disable dual channel, so at best I can only see 8GB. I need at least the 12GB to have something functional whilst I figure out what I'll need for a new system.

Nope. You're assuming that the commands would happen to be allocated the same locations and they'd happen to be bad. They definitely don't use the same amount of RAM, and are assigned memory locations based on what's free at that moment in time.
Yup, that hit me right before I got in bed and passed out.

If you are overclocking, spank yourself and get rid of it. If you aren't overclocking then you are now in "identify the problem" mode. It could be a stick of RAM or a slot. It could even be something like your power supply voltage fluctuating and causing data to be lost in RAM. Now you start testing sticks one at a time and testing different combinations of slots to find the actual fault. If it's RAM you can probably RMA it if you have a receipt. If it's a RAM slot and the motherboard isn't in warranty, time to buy a new motherboard (and probably processor). It's important to keep in mind that you are trying to narrow down the issue to 1 particular thing (RAM slot or RAM stick). So if 1 RAM stick fails the test you still need to test the other 3. If those give errors too then you have to figure out if you really have 2+ sticks of bad RAM (unlikely but not necessarily 0% chance) or what is actually wrong. It also helps if you have a spare machine that you can run memory testing in that you KNOW passes its RAM tests.
Oh geez, I haven't overclocked in what, almost a decade? Wow, time sure does fly. I lean more towards underclocking nowadays, but still, nothing has been changed on that computer.
Yeah, I will be sure to do all that, but I'm going to guess it's not the power supply if it's failed at the same spot all night long. But I've learned now to not rule out anything lol. Sadly, I don't have anything to test it on. Most people I know are still running Windows XP on a 10+ year old computer with less RAM than my phone... and obviously not DDR3 lol.

Although from all this one thing I don't get is why didn't rsync flag the resulting file as dodgy when the md5 is obviously wrong.
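One thing that might help while narrowing this down is looking at where a bad copy differs from a known-good one, e.g. whether it is the odd flipped bit or whole regions of garbage. A minimal sketch in Python, assuming you still have both the good and the bad file around; `differing_offsets` and `flipped_bits` are hypothetical helper names, and length differences are not reported (compare sizes separately).

```python
def differing_offsets(path_a, path_b, chunk_size=1024 * 1024, limit=20):
    """Return up to `limit` (offset, byte_a, byte_b) tuples where two files differ."""
    differences = []
    offset = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while len(differences) < limit:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            if chunk_a != chunk_b:
                # Only walk byte by byte inside chunks that actually differ.
                for index, (byte_a, byte_b) in enumerate(zip(chunk_a, chunk_b)):
                    if byte_a != byte_b:
                        differences.append((offset + index, byte_a, byte_b))
                        if len(differences) >= limit:
                            break
            offset += max(len(chunk_a), len(chunk_b))
    return differences


def flipped_bits(byte_a, byte_b):
    """Return the bit positions (0 = least significant) that differ between two bytes."""
    xor = byte_a ^ byte_b
    return [bit for bit in range(8) if xor & (1 << bit)]
```

If every difference turns out to be the same single bit, that would at least be consistent with one bad memory location rather than a disk or network problem, though it does not prove it.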

...and now you see why some of us senior guys push for ECC RAM. How much data would you have ultimately lost before you figured out that you were losing data? One person didn't know anything was wrong until he corrupted his zpool beyond the ability to mount it. He lost everything and had no backups. :(

I think jgreco put it best when he said ECC RAM is like seatbelts for a car. You might not use it for years, but the one time you do use it you'll be so glad you did.
Oh believe me, I'd love ECC RAM, but then I'd have to get ECC everything else, and this was just a test to see if I could actually get the NAS up and running and if it would do what I wanted it to do. Of course, if it had passed I'd've probably stayed with this setup for years. I am guilty of that lol.
Well, I still may have loads of corrupt files. I didn't check all 3TB, although I've gone through a few of my films and they all work. Of course, those are the least of my worries. Not like I can't re-rip lol. Just that those are the most time consuming.
Ouch, that must be terrible!

Lol, I was wondering if the extra 100 on the price tag was necessary; now I see it is. My new build shall have that. And here I was trying to save up some money. x'D

Hmm... if you can, maybe you should change the title to better reflect what has happened lol. I just freaked out, and if there was something wrong with rsync I didn't want others to suffer. Now I don't want others to panic if they only read the first post. Hope you know my heart was in the right place and my mind in the wrong one.

Thank you so much for all this. It never would've hit me in a million years. You don't know how much I appreciate this. Keep up the good work.
 