Rsync/Cron errors


TravisT

Today, I received an email from my FreeNAS box with the following text:

Code:
tar: collectd/rrd/localhost/df/df-mnt-cargo-globemaster.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/df-mnt-cargo-hercules.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/df-mnt-cargo-starlifter.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/.df-mnt-cargo.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/.df-mnt-cargo-Test.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/.df-mnt-cargo-depot.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/.df-mnt-cargo-galaxy.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/.df-mnt-cargo-globemaster.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/.df-mnt-cargo-hercules.rrd: Cannot stat: No such file or directory
tar: collectd/rrd/localhost/df/.df-mnt-cargo-starlifter.rrd: Cannot stat: No such file or directory
tar: Error exit delayed from previous errors.


Cargo is my ZFS volume and my datasets are depot, galaxy, globemaster, hercules and starlifter.

I have 5 cron jobs configured to back up my FreeNAS box to a USB drive, but they aren't scheduled to start until tomorrow, so I'm not sure whether these errors are related to that. My daily email from FreeNAS shows that the last scrub ran 29 days ago and that the threshold is 30 days.

The other strange thing I noticed is that when I checked the zpool status from the command line, the scrub line showed "none requested" - I thought it normally indicated the last time a scrub was run. Is that normal?
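
For reference, this is roughly how I checked (paraphrasing from memory, so the exact wording may differ; "cargo" is my pool name):

Code:
# check pool health; the scrub line normally shows the last completed scrub
zpool status cargo
# in my case that line read something like:
#   scrub: none requested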
 

Durkatlon

I think the last scrub shown only reflects scrubs that were run since you last rebooted.

What makes you think these errors have anything to do with rsync or cron? They appear to me to be related to the "Reporting" tab that FreeNAS 8 has, where you can see disk usage, network traffic, etc. I'm not sure whether these errors are normal when you first create the volume and datasets. I know it takes a while for the Reporting page to fully populate after a new install (which I had to do just yesterday because of a dead USB stick).
 

TravisT

The subject line of the email is:

Code:
Cron <root@FreeNAS> /bin/sh /root/save_rrds.sh


Looking at the error, I agree it could be a reporting problem. The install has been running for over 3 months now and has not been rebooted recently (to my knowledge, at least). No new volumes or datasets.

Everything appears to be working; I'm just curious as to why this email would be generated.
 

TravisT

I take that back. My uptime shows 20 days, so it must have been rebooted at some point.

BTW, I'm running FreeNAS-8.0.1-BETA4-amd64. I need to upgrade to the newest release in the next couple of days.
 

Durkatlon

The save_rrds script is responsible for persisting the reporting data and, I'm guessing, for pruning it for the longer-term graphs (i.e. once the reporting data is a week old, there's no point in keeping it for the daily graph, since that data is no longer reachable at that resolution).
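
Judging from the tar errors in your email, the script is probably doing something along these lines - this is just a guess at its shape, not the actual FreeNAS script, and the paths and archive name are assumptions:

Code:
#!/bin/sh
# Guessed sketch of a save_rrds.sh-style script: archive the collectd RRD
# files so the Reporting graphs survive a reboot. Paths are assumptions.
cd /var/db || exit 1
# tar prints "Cannot stat: No such file or directory" lines like the ones in
# the email if any of the .rrd files vanish before it gets to read them
tar -cjf /data/rrd_dir.tar.bz2 collectd/rrd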

It seems like the input files that save_rrds is expecting have gone missing for some reason. Have you been installing extra stuff on top of the base installation at all?
 

TravisT

No, I haven't installed any other packages or anything. It's weird because it's been running fine for the last few months, and then all of a sudden I got this email.
 

Durkatlon

Yeah, odd - somehow those files disappeared. Are any of the partitions on the USB stick showing full when you do a "df -ah"?
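
Something quick like this will list everything and flag anything that's completely full (just a rough one-liner; adjust to taste):

Code:
# show all filesystems, then print only the ones at 100% capacity
df -ah
df -h | awk '$5 == "100%"'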
 

TravisT

Sorry I didn't get back to you sooner. I'm running this server as a virtual machine on an ESXi host. I checked df -ah, and the only partition at 100% is /dev, which, if I remember correctly, is normal. I did have issues in the past where I would get emails from cron every 5 minutes when a partition filled up; that happened when I enabled snapshots on my volumes. I've heard that has been fixed, but I have not re-enabled them since I had the problems.

I also noticed on my monthly scrub report that errors were found and repaired. I haven't seen any more emails since the second one, so maybe ZFS caught and repaired the errors. Gotta love it for that (if that's the case).
 

Durkatlon

Perhaps the errors that the scrub fixed affected the directory containing the RRD data. It's a weird problem that I haven't seen reported before. Are you continuing to receive those emails, or was it a one-time occurrence?
 

TravisT

Well, I received one email (the one I posted), and I think the day after I received one more (slightly different, but still RRD-related). I have not received any more emails since those two. It could be that the scrub fixed it.
 

TravisT

So I received another email today, with only one of the above lines listed. Apparently the scrub didn't fix it. I'm going to start another scrub to see whether something is going on and data is being corrupted. If it runs clean, I have no idea what could be causing this...
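
For the record, I'm kicking it off from the shell roughly like this ("cargo" is my pool name):

Code:
# start a scrub on the pool, then check on its progress
zpool scrub cargo
zpool status -v cargo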
 

Milhouse

If you're getting errors in ZFS and corruption on non-ZFS partitions (i.e. /data), then I'd suspect the entire VM has a problem of some kind. Maybe it's related to FreeNAS/FreeBSD, but maybe not.
 

TravisT

You bring up a good point. ZFS found another checksum (CKSUM) error that was corrected when I re-ran the scrub. Since the OS is not installed on a ZFS partition, that would have nothing to do with any corruption within the UFS partitions. I'm not sure what could be causing this. Hopefully my dual backups, along with ZFS's ability to find and correct errors, will allow me to limp along until I can get physical access to the server and investigate further. I'll also keep an eye on the ESXi hypervisor's logs to see if anything turns up there.
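
For now I'm planning to watch the per-device error counters roughly like this ("cargo" is my pool; clearing the counters is just so any new errors stand out):

Code:
# show the READ/WRITE/CKSUM error counters for each device in the pool
zpool status -v cargo
# after noting the current counts, reset them so new errors are obvious
zpool clear cargo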

Any other recommendations are welcome.
 