The interval definition section was moved down to just before the
maximum backup check. This makes the code more friendly to
automatic interval selection. Auto interval selection needs to
have ddir defined first and it is best if it is done after
delete_incomplete. This change accomplishes that while still
placing it before the maximum backup check which needs to know
the interval.
With this change, all incomplete backups are deleted, not just the
ones with the particular interval that the user specified.
The advantage of this is that those to-be-deleted incomplete
backups will not interfere with calculations required for
automatic interval selection.
Eight lines and two variables are removed which makes the code,
I think, easier to read.
The main motivation for this change, however, is that it makes
ccollect.sh more friendly to (future) auto interval selection.
The removed lines and variables assumed that the interval was
known prior to the start of the source loop. With auto interval
selection, the selected interval can be different for each
source.
If a source is not connectable, ccollect.sh issues a series of error
messages such as:
$ ccollect.sh "int 1" dummy
2009-06-25-21:04:14: ccollect 0.7.1: Beginning backup using interval int 1
[dummy] 2009-06-25-21:04:14: Beginning to backup
[dummy] ssh: connect to host Ha port 20: No route to host
[dummy] rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
[dummy] rsync error: unexplained error (code 255) at io.c(600) [receiver=3.0.5]
[dummy] 2009-06-25-21:04:17: Error: source Ha:/tmp is not readable. Skipping.
2009-06-25-21:04:17: Finished
If you expect the source to be up, you want to see these messages.
However, for a notebook computer or other portable machine, it may be
normal for it to be disconnected. If quiet_if_down is specified for
that source, then the ssh and rsync errors are suppressed and the
"Error:" prefix is removed from the "skipping" message:
$ ccollect.sh "int 1" dummy
2009-06-25-21:03:33: ccollect 0.7.1: Beginning backup using interval int 1
[dummy] 2009-06-25-21:03:34: Beginning to backup
[dummy] 2009-06-25-21:03:37: Source Ha:/tmp is not readable. Skipping.
2009-06-25-21:03:37: Finished
I considered the alternative implementation of adding the logic to
ccollect_analyse_logs.sh to enable it to separate rsync messages
generated by the initial connection test from messages generated by
rsync used for an actual backup data transfer. Implementing this in
ccollect.sh appeared much simpler.
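
The behavior described above could be sketched roughly as follows. This
is an illustration only, not the code from ccollect.sh: the function name
check_source, its "yes"/"no" argument, and the PROBE variable (which
stands in for the real rsync call so the logic can be tried without a
network) are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch. PROBE stands in for the real rsync call.
PROBE="${PROBE:-rsync}"

# $1: source to probe
# $2: "yes" if quiet_if_down is set for this source, anything else otherwise
check_source() {
    source="$1"
    quiet="$2"
    if [ "${quiet}" = "yes" ]; then
        # Suppress the probe's own error output and drop the "Error:" prefix
        if ! "${PROBE}" "${source}" >/dev/null 2>&1; then
            echo "Source ${source} is not readable. Skipping."
            return 1
        fi
    else
        # Let ssh/rsync errors through on stderr and report an error
        if ! "${PROBE}" "${source}" >/dev/null; then
            echo "Error: source ${source} is not readable. Skipping."
            return 1
        fi
    fi
    return 0
}
```

Calling check_source Ha:/tmp yes for an unreachable source would print
only the "Skipping" line, which matches the second transcript above.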
By default, ccollect.sh sorts backup directories based on last change
time (ctime). This adds the option to sort based on modification
time (mtime).
I have updated doc/ccollect.text but it needs some work to simplify
and explain the issue.
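
To illustrate the difference, here is a small sketch of picking the
newest backup directory with a configurable sort key. The helper name
newest_dir and the demo directory names are made up for this example;
only the ls flags (-t for mtime, -tc for ctime) are standard.

```shell
#!/bin/sh
# Hypothetical helper: print the newest subdirectory of $1.
# $2 holds the ls sort flags: "t" sorts on mtime, "tc" on ctime.
newest_dir() {
    ls -"${2}"p1 "$1" | grep '/$' | head -n 1
}

# Demo: two backup directories with explicitly set mtimes.
demo="$(mktemp -d)"
mkdir "${demo}/daily.old" "${demo}/daily.new"
touch -t 200801010000 "${demo}/daily.old"
touch -t 200901010000 "${demo}/daily.new"
newest_dir "${demo}" t
rm -rf "${demo}"
```

Note that ctime cannot be set with touch, so only the mtime ordering is
under the user's control. That is part of the reason to offer it.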
After rsync, the destination directory's mtime reflects the
modification time of its immediate contents. This patch overrides
that and sets the mtime to the time that the backup finished.
With this patch, the age of a backup can be assessed by looking at
its mtime. The advantages of this are (1) that mtime can be
preserved, via cp -a or rsync -a, when copying a backup repository
to a new hard disk or a new machine and (2) that incorrect mtimes,
such as might happen after a user meddles with his backup
repository, can be corrected via touch. The disadvantage is that
mtime for the immediate contents of the destination directory is
lost.
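
As a rough sketch of the idea (not the patch itself), resetting a
directory's mtime is a single touch call. The helper names stamp_backup
and is_newer_than are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper: set a backup directory's mtime to the current time.
stamp_backup() {
    touch "$1"
}

# Succeed if directory $1 has a newer mtime than reference file $2.
# -prune keeps find from descending into the directory.
is_newer_than() {
    [ -n "$(find "$1" -prune -newer "$2")" ]
}
```

The same touch call, with -t and an explicit timestamp, is what a user
could run by hand to repair an incorrect mtime.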
The user may optionally create a file rsync_failure_codes with a
newline-separated list of rsync return codes that are to be
regarded as complete failure.
If rsync exits with such a code, then the backup will be marked
for deletion during the next ccollect run (if delete_incomplete).
I added documentation for this feature in doc/ccollect.text.
In my experience (yours may differ), two rsync exit codes that
belong in this file are 12 and 255. I have seen 12 result from
ssh errors and 255 result from a kernel module conflict. In both
cases, the resulting backups were empty. Without the
rsync_failure_codes option, such errors cause good backups to be
deleted, as per c_interval, leaving behind the new empty backups.
In the long run, we may want a different and more comprehensive
method for analyzing rsync errors. In the short run, I find this
option necessary.
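
A minimal sketch of how such a check could work, assuming the file holds
one code per line. The function name is_failure_code and the demo file
location are hypothetical and not taken from ccollect.sh.

```shell
#!/bin/sh
# Hypothetical helper: succeed if exit code $1 is listed in file $2
# (one rsync exit code per line). A missing file means "no failure codes".
is_failure_code() {
    code="$1"
    codes_file="$2"
    [ -f "${codes_file}" ] || return 1
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF "${code}" "${codes_file}"
}

# Demo with a temporary file listing the two codes mentioned above.
demo_dir="$(mktemp -d)"
printf '12\n255\n' > "${demo_dir}/rsync_failure_codes"
for ret in 0 12 23 255; do
    if is_failure_code "${ret}" "${demo_dir}/rsync_failure_codes"; then
        echo "rsync exit ${ret}: complete failure, mark backup for deletion"
    else
        echo "rsync exit ${ret}: keep backup"
    fi
done
rm -rf "${demo_dir}"
```

Matching whole lines matters here: without -x, a listed code of 255
would also match an exit code of 25.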
I run ccollect nightly on a Thecus N2100 running Debian-ARM.
The Linksys NSLU2 also runs an ARM processor and I have lightly tested
ccollect on that under SlugOS.
1). On systems I tried, delete_incomplete failed because the line:
< pcmd rm $VVERBOSE -rf "${ddir}/${realincomplete}" || \
should read:
> pcmd rm $VVERBOSE -rf "${realincomplete}" || \
(Is this true of all systems?)
2). The marker file was not deleted. Code was added to delete it.
3). The delete_incomplete code was simplified.
Previously, there was a $VERBOSE script variable that was set according to
the "-v" command line option and then reset to null within each source
sub-shell. With no loss of functionality, this patch removes all references
to that variable.
This makes the script 13 lines shorter.
According to the documentation, "if [the very_verbose] file exists in the
source specification -v will be passed to rsync, rm and mkdir." Previously,
the -v option was passed only to rsync. This patch passes it to rm and mkdir
as well.
Actually, as per the behavior of the previous version, it is verbose that
sends the -v option to rsync while very_verbose sends it the -vv option. I
left it this way because this behavior seems reasonable. Maybe the
documentation should be corrected on this point.
1). If an option doesn't exist in a source directory, check the defaults directory.
2). For every option, create a corresponding "no_" option so that a source directory
can override an option set in defaults.
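
The lookup order in the two points above could be sketched like this.
The function name option_enabled and its argument order are assumptions
for illustration. The choice that a "no_" file wins even when the source
also has the option itself is likewise an assumption.

```shell
#!/bin/sh
# Hypothetical helper: succeed if option $1 is enabled for source
# directory $2, falling back to defaults directory $3.
option_enabled() {
    opt="$1"
    src_dir="$2"
    defaults_dir="$3"
    # An explicit "no_<option>" in the source overrides everything.
    if [ -e "${src_dir}/no_${opt}" ]; then
        return 1
    fi
    # The source's own setting comes next.
    if [ -e "${src_dir}/${opt}" ]; then
        return 0
    fi
    # Otherwise fall back to the defaults directory.
    [ -e "${defaults_dir}/${opt}" ]
}
```

For example, with a verbose file in the defaults directory, a source
stays quiet only if it contains a no_verbose file.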
(i.patch)
I added some patches from John with an E-Mail address he
does not want to be public on the internet.
I did not ask him before whether this is fine, so I screwed
up (similar to the description in git-tag(1)).
Thus I replaced his e-mail and would like you to accept
this forced push and remove old trees on the net.
Thanks,
Nico
Signed-off-by: Nico Schottelius <nico@ikn.schottelius.org>
Based on patches by John Lawless <jll2_8854b@redwoodscientific.com>.
Skipped the sort changing part (from -tc to -t)
c.patch:
--- ccollect-0.7.1-b.sh 2009-05-24 21:32:00.000000000 -0700
+++ ccollect-0.7.1-c.sh 2009-05-24 21:39:43.000000000 -0700
@@ -40,10 +40,13 @@
VERSION=0.7.1
RELEASE="2009-02-02"
HALF_VERSION="ccollect ${VERSION}"
FULL_VERSION="ccollect ${VERSION} (${RELEASE})"
+#TSORT="tc" ; NEWER="cnewer"
+TSORT="t" ; NEWER="newer"
+
#
# CDATE: how we use it for naming of the archives
# DDATE: how the user should see it in our output (DISPLAY)
#
CDATE="date +%Y%m%d-%H%M"
@@ -513,14 +516,14 @@
#
# Check for backup directory to clone from: Always clone from the latest one!
#
- # Use ls -1c instead of -1t, because last modification maybe the same on all
- # and metadate update (-c) is updated by rsync locally.
- #
- last_dir="$(pcmd ls -tcp1 "${ddir}" | grep '/$' | head -n 1)" || \
+ # Depending on your file system, you may want to sort on:
+ # 1. mtime (modification time) with TSORT=t, or
+ # 2. ctime (last change time, usually) with TSORT=tc
+ last_dir="$(pcmd ls -${TSORT}p1 "${ddir}" | grep '/$' | head -n 1)" || \
_exit_err "Failed to list contents of ${ddir}."
#
# clone from old backup, if existing
#
d.patch:
--- ccollect-0.7.1-c.sh 2009-05-24 21:39:43.000000000 -0700
+++ ccollect-0.7.1-d.sh 2009-05-24 21:47:09.000000000 -0700
@@ -492,12 +492,12 @@
if [ "${count}" -ge "${c_interval}" ]; then
substract=$((${c_interval} - 1))
remove=$((${count} - ${substract}))
_techo "Removing ${remove} backup(s)..."
- pcmd ls -p1 "$ddir" | grep "^${INTERVAL}\..*/\$" | \
- sort -n | head -n "${remove}" > "${TMP}" || \
+ pcmd ls -${TSORT}p1r "$ddir" | grep "^${INTERVAL}\..*/\$" | \
+ head -n "${remove}" > "${TMP}" || \
_exit_err "Listing old backups failed"
i=0
while read to_remove; do
eval remove_$i=\"${to_remove}\"
Signed-off-by: Nico Schottelius <nico@ikn.schottelius.org>
> # Verify source is up and accepting connections before deleting any old backups
> rsync "$source" >/dev/null || _exit_err "Source ${source} is not readable. Skipping."
I think that this quick test is much better than, say, pinging
the source in a pre-exec script: this tests not only that the
source is up and connected to the net, it also verifies (1) that
ssh is up and accepting our key (if we are using ssh), and (2) that
the source directory is mounted (if it needs to be mounted) and
readable.