If a source is not connectable, ccollect.sh issues a series of error
messages such as:
$ ccollect.sh "int 1" dummy
2009-06-25-21:04:14: ccollect 0.7.1: Beginning backup using interval int 1
[dummy] 2009-06-25-21:04:14: Beginning to backup
[dummy] ssh: connect to host Ha port 20: No route to host
[dummy] rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
[dummy] rsync error: unexplained error (code 255) at io.c(600) [receiver=3.0.5]
[dummy] 2009-06-25-21:04:17: Error: source Ha:/tmp is not readable. Skipping.
2009-06-25-21:04:17: Finished
If you expect the source to be up, you want to see these messages.
However, for a notebook computer or other portable machine, it may be
normal for it to be disconnected. If quiet_if_down is specified for
that source, then the ssh and rsync errors are suppressed and the
"Error:" prefix is removed from the "skipping" message:
$ ccollect.sh "int 1" dummy
2009-06-25-21:03:33: ccollect 0.7.1: Beginning backup using interval int 1
[dummy] 2009-06-25-21:03:34: Beginning to backup
[dummy] 2009-06-25-21:03:37: Source Ha:/tmp is not readable. Skipping.
2009-06-25-21:03:37: Finished
I considered the alternative implementation of adding the logic to
ccollect_analyse_logs.sh to enable it to separate rsync messages
generated the initial connection test from messages generated by
rsync used for an actual backup data transfer. Adding this approach
to ccollect.sh appeared much simpler.
By default, ccollect.sh sorts backup directories based on last change
time (ctime). This adds the option to sort based on modification
time (mtime).
I have updated doc/ccollect.text but it needs some work to simplify
and explain the issue.
User may optionally create a file rsync_failure_codes with a
newline-separated list of rsync return codes that are to be
regarded as complete failure.
If rsync exits with such a code, then the backup will be marked
for deletion during the next ccollect run (if delete_incomplete).
I added documentation for this feature in doc/ccollect.text
In my experience (yours may differ), two rsync exit codes that
belong in this file are 12 and 255. I have seen 12 result from
ssh errors and 255 result from a kernel module conflict. In both
cases, the resulting backups were empty. Without the
rsync_failure_codes option, such errors cause good backups to be
deleted, as per c_interval, leaving behind the new empty backups.
In the long run, we may want a different and more comprehensive
method for analyzing rsync errors. In the short run, I find this
option necessary.
I run ccollect nightly on a Thecus N2100 running Debian-ARM.
The Linksys NSLU2 also runs an ARM processor and I have lightly tested
ccollect on that under SlugOS.