diff --git a/software/ccollect/documentation/ccollect.htm b/software/ccollect/documentation/ccollect.htm index 342288e9..647d51e9 100644 --- a/software/ccollect/documentation/ccollect.htm +++ b/software/ccollect/documentation/ccollect.htm @@ -1,8 +1,8 @@ -ccollect - Installing, Configuring and Using

ccollect - Installing, Configuring and Using

Revision History
Revision 2.3, for ccollect 2.3, Initial Version from 2006-01-13, NS

Table of Contents

Introduction
Supported and tested operating systems and architectures
Why you COULD only backup from remote hosts, not to them
Incompatibilities and changes
Quick start
Requirements
Installing ccollect
Using ccollect
Installing
Configuring
Runtime options
General configuration
Source configuration
Hints
Smart logging
Using a different ssh port
Using source names or interval in pre_/post_exec scripts
Using rsync protocol without ssh
Not excluding top-level directories
Re-using already created rsync-backups
Using pre_/post_exec
Using source specific interval definitions
Comparing backups
Testing for host reachability
Easy check for errors
F.A.Q.
What happens if one backup is broken or empty?
When backing up from localhost the destination is also included. Is this a bug?
Why does ccollect say "Permission denied" with my pre-/postexec script?
Why does the backup job fail when part of the source is a link?
How can I prevent missing the right time to enter my password?
Backup fails, if autofs is running, but sources not reachable
Examples
A backup host configuration from scratch
Using hard-links requires less disk space
A collection of backups on the backup server
Processes running when doing ccollect -j

(pseudo) incremental backup +ccollect - Installing, Configuring and Using

ccollect - Installing, Configuring and Using

Revision History
Revision 2.5, for ccollect 2.5, Initial Version from 2006-01-13, NS

(pseudo) incremental backup with different exclude lists -using hardlinks and rsync

Introduction

ccollect is a backup utility written in the sh-scripting language. +using hardlinks and rsync

Introduction

ccollect is a backup utility written in the sh-scripting language. It does not depend on a specific shell, only /bin/sh needs to be -bourne shell compatible (like dash, ksh, zsh, bash, …).

Supported and tested operating systems and architectures

ccollect was successfully tested on the following platforms:

  • +bourne shell compatible (like dash, ksh, zsh, bash, …).

    Supported and tested operating systems and architectures

    ccollect was successfully tested on the following platforms:

    • FreeBSD on amd64/i386
    • GNU/Linux on amd64/arm/hppa/i386/ppc @@ -12,64 +12,66 @@ Mac OS X 10.5 NetBSD on alpha/amd64/i386/sparc/sparc64
    • OpenBSD on amd64 +
    • +Windows by installing Cygwin, OpenSSH and rsync

    It should run on any Unix that supports rsync and has a POSIX-compatible bourne shell. If your platform is not listed above and you have it successfully -running, please drop me a mail.

    Why you COULD only backup from remote hosts, not to them

    While considering the design of ccollect, I thought about enabling +running, please drop me a mail.

    Why you COULD only backup from remote hosts, not to them

    While considering the design of ccollect, I thought about enabling backup to remote hosts. Though this sounds like a nice feature ("Backup my notebook to the server now."), in my opinion it is a bad idea to backup to a remote host.

    But as more and more people requested this feature, it was implemented, -so you have the choice whether you want to use it or not.

    Reason

    If you want to backup TO a remote host, you have to loosen security on it.

    Imagine the following situation: You backup your farm of webservers TO +so you have the choice whether you want to use it or not.

    Reason

    If you want to backup TO a remote host, you have to loosen security on it.

    Imagine the following situation: You backup your farm of webservers TO a backup host somewhere else. Now one of your webservers which has access to your backup host gets -compromised.

    Your backup server will be compromised, too.

    And the attacker will have access to all data on the other webservers.

    Doing it securely

    Think of it the other way round: The backup server (now behind a +compromised.

    Your backup server will be compromised, too.

    And the attacker will have access to all data on the other webservers.

    Doing it securely

    Think of it the other way round: The backup server (now behind a firewall, not accessible from outside) connects to the webservers and pulls the data from them. If someone gets access to one of the webservers, this person will perhaps not even see your machine. If the attacker sees connections from a host to the compromised machine, she will not be able to log in on the backup machine. -All other backups are still secure.

    Incompatibilities and changes

    Versions 0.9 and 1.0

    • +All other backups are still secure.

    Incompatibilities and changes

    Versions 0.9 and 1.0

    • Added "Error: " prefix in _exit_err() -

    Versions 0.8 and 0.9

    • +

    Versions 0.8 and 0.9

    • Renamed script to ccollect (.sh is not needed)
    • Removed feature to backup to a host via ccollect, added new tool (FIXME: insert name here) that takes care of this via tunnel
    • Perhaps creating subdirectory of source name (idea from Stefan Schlörholz) -

    Versions 0.7 and 0.8

    The argument order changed:

    • +

    Versions 0.7 and 0.8

    The argument order changed:

    • Old: "<interval name> [args] <sources to backup>"
    • New: "[args] <interval name> <sources to backup>"

    If you did not use arguments (most people do not), nothing will -change for you.

    Deletion of incomplete backups using the delete_incomplete option

    • +change for you.

      Deletion of incomplete backups using the delete_incomplete option

      • Old: Only incomplete backups from the current interval have been removed
      • New: All incomplete backups are deleted -

      Support for standard values

      • +

      Support for standard values

      • Old: no support
      • New: Options in $CCOLLECT_CONF/defaults are used as defaults (see below) -

    Versions 0.6 and 0.7

    The format of destination changed:

    • +

    Versions 0.6 and 0.7

    The format of destination changed:

    • Before 0.7 it was a (link to a) directory
    • As of 0.7 it is a textfile containing the destination -

    You can update your configuration using tools/config-pre-0.7-to-0.7.sh.

    Added remote_host

    • +

    You can update your configuration using tools/config-pre-0.7-to-0.7.sh.

    Added remote_host

    • As of 0.7 it is possible to backup to hosts (see section remote_host below). -

    Versions 0.5 and 0.6

    The format of rsync_options changed:

    • +

    Versions 0.5 and 0.6

    The format of rsync_options changed:

    • Before 0.6 it was whitespace delimited
    • As of 0.6 it is newline separated (so you can pass whitespaces to rsync) -

    You can update your configuration using tools/config-pre-0.6-to-0.6.sh.

    The name of the backup directories changed:

    • +

    You can update your configuration using tools/config-pre-0.6-to-0.6.sh.

    The name of the backup directories changed:

    • Before 0.6: "date +%Y-%m-%d-%H%M"
    • As of 0.6: "date +%Y%m%d-%H%M" (better readable, date is closer together)

    For the second change there is no update needed, as XXXX- is always before -XXXXX (- comes before digit).

    Versions 0.4 and 0.5

    Not a real incompatibility, but seems to fit in this section:

    0.5 does NOT require

    • +XXXXX (- comes before digit).

    Versions 0.4 and 0.5

    Not a real incompatibility, but seems to fit in this section:

    0.5 does NOT require

    • PaX
    • bc -

    anymore!

    Versions < 0.4 and 0.4

    Since ccollect 0.4 there are several incompatibilities with earlier -versions:

    List of incompatibilities

    • +

    anymore!

    Versions < 0.4 and 0.4

    Since ccollect 0.4 there are several incompatibilities with earlier +versions:

    List of incompatibilities

    • pax (Posix) is now required, cp -al (GNU specific) is removed
    • "interval" was written with two l (ell), which is wrong in English @@ -81,7 +83,7 @@ ccollect will now exit when preexec returns non-zero ccollect now reports when postexec returns non-zero

    You can convert your old configuration directory using config-pre-0.4-to-0.4.sh, which can be found in the tools/ -subdirectory:

    [10:05] hydrogenium:ccollect-0.4# ./tools/config-pre-0.4-to-0.4.sh /etc/ccollect

    Quick start

    For those who do not want to read the whole long document:

    # get latest ccollect tarball from http://www.nico.schottelius.org/software/ccollect/
    +subdirectory:

    [10:05] hydrogenium:ccollect-0.4# ./tools/config-pre-0.4-to-0.4.sh /etc/ccollect

    Quick start

    For those who do not want to read the whole long document:

    # get latest ccollect tarball from http://www.nico.schottelius.org/software/ccollect/
     # replace value for CCV with the current version
     export CCV=0.8.1
     
    @@ -140,7 +142,7 @@ du -s ~/DASI /bin
     # report success
     echo "Please report success using ./tools/report_success.sh"

    Cutting and pasting the complete section above to your shell will result in the download of ccollect, the creation of a sample configuration and the -execution of some backups.

    Requirements

    Installing ccollect

    For the installation you need at least

    • +execution of some backups.

    Requirements

    Installing ccollect

    For the installation you need at least

    Using ccollect

    Running ccollect requires the following tools to be installed:

    • +

    Using ccollect

    Running ccollect requires the following tools to be installed:

    • date
    • rsync
    • ssh (if you want to use rsync over ssh, which is recommended for security) -

    Installing

    Either type make install or simply copy it to a directory in your +

Installing

Either type make install or simply copy it to a directory in your $PATH and execute chmod 0755 /path/to/ccollect.sh. If you like to use the new management scripts (available since 0.6), copy the -following scripts to a directory in $PATH:

  • +following scripts to a directory in $PATH:

    • tools/ccollect_add_source.sh
    • tools/ccollect_analyse_logs.sh @@ -168,11 +170,11 @@ following scripts to a directory in $PATH:

      • tools/ccollect_logwrapper.sh

      After having installed and used ccollect, report success using -./tools/report_success.sh.

    Configuring

    For configuration aid have a look at the above mentioned tools, which can assist +./tools/report_success.sh.

    Configuring

    For configuration aid have a look at the above mentioned tools, which can assist you quite well. When you are successfully using ccollect, I would be happy if you add a link to your website, stating "I backup with ccollect", which points to the ccollect homepage. So more people know about ccollect, use it and -improve it. You can also report success using tools/report_success.sh.

    Runtime options

    ccollect looks for its configuration in /etc/ccollect or, if set, in +improve it. You can also report success using tools/report_success.sh.

    Runtime options

    ccollect looks for its configuration in /etc/ccollect or, if set, in the directory specified by the variable $CCOLLECT_CONF:

    # sh-compatible (dash, zsh, mksh, ksh, bash, ...)
     $ CCOLLECT_CONF=/your/config/dir ccollect.sh ...
     
    @@ -180,11 +182,11 @@ $ CCOLLECT_CONF=/your/config/dir ccollect.sh ...
     $ ( setenv CCOLLECT_CONF /your/config/dir ; ccollect.sh ... )
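The lookup rule described above (use $CCOLLECT_CONF if set, otherwise /etc/ccollect) can be expressed with a standard shell parameter expansion. This is only a sketch of the rule, not ccollect's actual code; the function name resolve_conf_dir is made up for illustration.

```shell
#!/bin/sh
# Sketch: resolve the configuration directory the way the text describes.
# resolve_conf_dir is a hypothetical helper, not part of ccollect.
resolve_conf_dir() {
    # ${VAR:-default} yields the default when VAR is unset or empty.
    printf '%s\n' "${CCOLLECT_CONF:-/etc/ccollect}"
}

unset CCOLLECT_CONF
echo "default:  $(resolve_conf_dir)"

CCOLLECT_CONF=/your/config/dir
echo "override: $(resolve_conf_dir)"
```

Note that the `:-` form also falls back to the default when the variable is set but empty.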

    When you start ccollect, you have to specify in which interval to backup (daily, weekly, yearly; you can specify the names yourself, see below) and which sources to backup (or -a to backup all sources).

    The interval specifies how many backups are kept.

    There are also some self-explanatory parameters you can pass to ccollect, -simply use ccollect.sh --help for info.

    General configuration

    The general configuration can be found in $CCOLLECT_CONF/defaults or +simply use ccollect.sh --help for info.

    General configuration

    The general configuration can be found in $CCOLLECT_CONF/defaults or /etc/ccollect/defaults. All options specified there are generally valid for all source definitions, although the values can be overwritten in the source configuration.

    All configuration entries are plain-text files -(use UTF-8 for non-ascii characters).

    Interval definition

    The interval definition can be found in +(use UTF-8 for non-ascii characters).

    Interval definition

    The interval definition can be found in $CCOLLECT_CONF/defaults/intervals/ or /etc/ccollect/defaults/intervals. Each file in this directory specifies an interval. The name of the file is the same as the name of the interval: intervals/'<interval name>'.

    The content of this file should be a single line containing a number. @@ -196,11 +198,11 @@ This number defines how many versions of this interval are kept.

    Example:< [10:23] zaphodbeeblebrox:ccollect-0.2% cat conf/defaults/intervals/* 28 12 - 4

    This means to keep 28 daily backups, 12 monthly backups and 4 weekly.

    General pre- and post-execution

    If you add $CCOLLECT_CONF/defaults/pre_exec or + 4

    This means to keep 28 daily backups, 12 monthly backups and 4 weekly.
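Interval files like the ones shown above can be created with a few commands. The following is a minimal sketch that works in a scratch directory created with mktemp (an assumption made for the example; a real setup would typically use /etc/ccollect), using the interval names daily, monthly and weekly.

```shell
#!/bin/sh
# Sketch: create three interval definitions in a scratch config directory.
# CCOLLECT_CONF points at a temporary directory here (assumption for the
# example only).
CCOLLECT_CONF=$(mktemp -d)
mkdir -p "$CCOLLECT_CONF/defaults/intervals"

# Each file name is the interval name; its single line is the number of
# backups to keep for that interval.
echo 28 > "$CCOLLECT_CONF/defaults/intervals/daily"
echo 12 > "$CCOLLECT_CONF/defaults/intervals/monthly"
echo 4  > "$CCOLLECT_CONF/defaults/intervals/weekly"

# Show name and value side by side.
for f in "$CCOLLECT_CONF"/defaults/intervals/*; do
    printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
done
```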

    General pre- and post-execution

    If you add $CCOLLECT_CONF/defaults/pre_exec or /etc/ccollect/defaults/pre_exec (same with post_exec), ccollect will start pre_exec before the whole backup process and post_exec after backup of all sources is done.

    If pre_exec exits with a non-zero return code, the whole backup -process will be aborted.

    The pre_exec and post_exec script can access the following exported variables:

    • +process will be aborted.

      The pre_exec and post_exec script can access the following exported variables:

      • INTERVAL: the interval selected (daily)
      • no_sources: number of sources to backup (2) @@ -212,12 +214,12 @@ human readable format before and after the whole backup process:

        Source configuration

        Each source configuration exists in $CCOLLECT_CONF/sources/$name or -/etc/ccollect/sources/$name.

        The name you choose for the subdirectory describes the source.

        Each source contains at least the following files:

        • +[13:01] hydrogenium:~# ln -s /etc/ccollect/defaults/pre_exec /etc/ccollect/defaults/post_exec

    Source configuration

    Each source configuration exists in $CCOLLECT_CONF/sources/$name or +/etc/ccollect/sources/$name.

    The name you choose for the subdirectory describes the source.

    Each source contains at least the following files:

    • source (a text file containing the rsync compatible path to backup)
    • destination (a text file containing the directory we should backup to) -

    Additionally a source may have the following files:

    • +

    Additionally a source may have the following files:

    • pre_exec program to execute before backing up this source
    • post_exec program to execute after backing up this source @@ -257,7 +259,7 @@ human readable format before and after the whole backup process:

      Default options

      If you add $CCOLLECT_CONF/defaults/option_name, the value will + /home/nico/vpn

      Default options

      If you add $CCOLLECT_CONF/defaults/option_name, the value will be used in the absence of the option in a source. If you want to prevent the default value to be used in a source, you can create the file $CCOLLECT_CONF/sources/$name/no_option_name (i.e. prefix it with @@ -265,7 +267,7 @@ the default value to be used in a source, you can create the file [9:04] ikn2:ccollect% touch conf/sources/local/no_verbose

      This enables the verbose option for all sources, but disables it for the source local.

      If an option is specified in the defaults folder and in the source, the source specific version overrides the default one:

      Example:

         [9:05] ikn2:ccollect% echo "backup-host" > conf/defaults/remote_host
      -   [9:05] ikn2:ccollect% echo "different-host" > conf/sources/local/remote_host

      You can use all source options as defaults, with the exception of

      • + [9:05] ikn2:ccollect% echo "different-host" > conf/sources/local/remote_host
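The precedence described in this section (source value first, then a no_ marker that disables the default, then the default) can be sketched as a small lookup function. This is a hypothetical helper written for illustration, not ccollect's implementation; the source names web1 and quiet are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of ccollect) mirroring the documented
# precedence: source value, then no_<option> marker, then default.
lookup_option() {
    lo_conf=$1; lo_src=$2; lo_opt=$3
    if [ -e "$lo_conf/sources/$lo_src/$lo_opt" ]; then
        cat "$lo_conf/sources/$lo_src/$lo_opt"
    elif [ -e "$lo_conf/sources/$lo_src/no_$lo_opt" ]; then
        return 1    # default explicitly disabled for this source
    elif [ -e "$lo_conf/defaults/$lo_opt" ]; then
        cat "$lo_conf/defaults/$lo_opt"
    else
        return 1    # option not configured anywhere
    fi
}

# Scratch configuration, for the example only.
conf=$(mktemp -d)
mkdir -p "$conf/defaults" "$conf/sources/local" \
         "$conf/sources/web1" "$conf/sources/quiet"
echo "backup-host"    > "$conf/defaults/remote_host"
echo "different-host" > "$conf/sources/local/remote_host"
touch "$conf/sources/quiet/no_remote_host"

lookup_option "$conf" local remote_host   # source-specific value
lookup_option "$conf" web1  remote_host   # falls back to the default
lookup_option "$conf" quiet remote_host || echo "(disabled for quiet)"
```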

        You can use all source options as defaults, with the exception of

        • source
        • destination @@ -273,20 +275,20 @@ the source specific version overrides the default one:

          Example:

          pre_exec
           
        • post_exec -

      Detailed description of "source"

      source describes a rsync compatible source (one line only).

      For instance backup_user@foreign_host:/home/server/video. +

    Detailed description of "source"

    source describes a rsync compatible source (one line only).

    For instance backup_user@foreign_host:/home/server/video. To use the rsync protocol without the ssh-tunnel, use rsync://USER@HOST/SRC. For more information have a look at the manpage -of rsync(1).

    Detailed description of "destination"

    destination must be a text file containing the destination directory. +of rsync(1).

    Detailed description of "destination"

    destination must be a text file containing the destination directory. destination USED to be a link to the destination directory in earlier versions, so do not be confused if you see such examples.

    Example:

       [11:36] zaphodbeeblebrox:ccollect-0.2% cat conf/sources/testsource2/destination
    -   /home/nico/backupdir

    Detailed description of "remote_host"

    remote_host must be a text file containing the destination host. + /home/nico/backupdir

    Detailed description of "remote_host"

    remote_host must be a text file containing the destination host. If this file exists, you are backing up your data TO this host and not to your local host.

    Warning: You need to have ssh access to the remote host. rsync and ccollect will connect to that host via ssh. ccollect needs the shell access, because it needs to find out how many backups exist on the remote host and to be able to delete them.

    Example:

       [10:17] denkbrett:ccollect-0.7.0% cat conf/sources/remote1/remote_host
    -   home.schottelius.org

    It may contain all the ssh-specific values like myuser@yourhost.ch.

    Detailed description of "verbose"

    verbose tells ccollect that the log should contain verbose messages.

    If this file exists in the source specification -v will be passed to rsync.

    Example:

       [11:35] zaphodbeeblebrox:ccollect-0.2% touch conf/sources/testsource1/verbose

    Detailed description of "very_verbose"

    very_verbose tells ccollect that it should log very verbosely.

    If this file exists in the source specification -v will be passed to -rsync, rm and mkdir.

    Example:

       [23:67] nohost:~% touch conf/sources/testsource1/very_verbose

    Detailed description of "summary"

    If you create the file summary in the source definition, + home.schottelius.org

    It may contain all the ssh-specific values like myuser@yourhost.ch.

    Detailed description of "verbose"

    verbose tells ccollect that the log should contain verbose messages.

    If this file exists in the source specification -v will be passed to rsync.

    Example:

       [11:35] zaphodbeeblebrox:ccollect-0.2% touch conf/sources/testsource1/verbose

    Detailed description of "very_verbose"

    very_verbose tells ccollect that it should log very verbosely.

    If this file exists in the source specification -v will be passed to +rsync, rm and mkdir.

    Example:

       [23:67] nohost:~% touch conf/sources/testsource1/very_verbose

    Detailed description of "summary"

    If you create the file summary in the source definition, ccollect will present you a nice summary at the end.

    backup:~# touch /etc/ccollect/sources/root/summary
     backup:~# ccollect.sh werktags root
     ==> ccollect.sh: Beginning backup using interval werktags <==
    @@ -313,11 +315,11 @@ backup:~# ccollect.sh werktags root
     [root] Successfully finished backup.
     ==> Finished ccollect.sh <==

    You could also combine it with verbose or very_verbose, but these already print some statistics (though not all / the same as presented by -summary).

    Detailed description of "exclude"

    exclude specifies a list of paths to exclude. The entries are separated by a newline (\n).

    Example:

       [11:35] zaphodbeeblebrox:ccollect-0.2% cat conf/sources/testsource2/exclude
    +summary).

    Detailed description of "exclude"

    exclude specifies a list of paths to exclude. The entries are separated by a newline (\n).

    Example:

       [11:35] zaphodbeeblebrox:ccollect-0.2% cat conf/sources/testsource2/exclude
        openvpn-2.0.1.tar.gz
        nicht_reinnehmen
        etwas mit leerzeichenli
    -   something with spaces is not a problem

    Detailed description of "intervals/"

    When you create the subdirectory intervals/ in your source configuration + something with spaces is not a problem

    Detailed description of "intervals/"

    When you create the subdirectory intervals/ in your source configuration directory, you can specify individual intervals for this specific source. Each file in this directory describes an interval.

    Example:

       [11:37] zaphodbeeblebrox:ccollect-0.2% ls -l conf/sources/testsource2/intervals/
        insgesamt 8
    @@ -325,18 +327,18 @@ Each file in this directory describes an interval.

    Example:

    Detailed description of "rsync_options"

    When you create the file rsync_options in your source configuration, + 20

    Detailed description of "rsync_options"

    When you create the file rsync_options in your source configuration, all the parameters in this file will be passed to rsync. This way you can pass additional options to rsync. For instance you can tell rsync to show progress ("--progress"), or which -password-file ("--password-file") to use for automatic backup over the rsync-protocol.

    Example:

       [23:42] hydrogenium:ccollect-0.2% cat conf/sources/test_rsync/rsync_options
    -   --password-file=/home/user/backup/protected_password_file

    Detailed description of "pre_exec" and "post_exec"

    When you create pre_exec and / or post_exec in your source + --password-file=/home/user/backup/protected_password_file

    Detailed description of "pre_exec" and "post_exec"

    When you create pre_exec and / or post_exec in your source configuration, ccollect will execute this command before and respectively after doing the backup for this specific source. If you want to have pre-/post-exec before and after all backups, see above for general configuration.

    If pre_exec exits with a non-zero return code, the backup process of this source will be aborted (i.e. backup skipped).

    The post_exec script can access the following exported variables from -ccollect:

    • +ccollect:

      • name: name of the source that is being backed up
      • destination_name: contains the base directory name (daily.20091031-1013.24496) @@ -353,32 +355,32 @@ df -h #!/bin/sh # Show whats free after -df -h

      Detailed description of "delete_incomplete"

      If you create the file delete_incomplete in a source specification directory, +df -h

      Detailed description of "delete_incomplete"

      If you create the file delete_incomplete in a source specification directory, ccollect will look for incomplete backups (when the whole ccollect process was interrupted) and remove them. Without this file ccollect will only warn -the user.

      Detailed description of "rsync_failure_codes"

      If you have the file rsync_failure_codes in your source configuration +the user.

      Detailed description of "rsync_failure_codes"

      If you have the file rsync_failure_codes in your source configuration directory, it should contain a newline-separated list of numbers representing rsync exit codes. If rsync exits with any code in this list, a marker will be left in the destination directory indicating failure of this backup. If you have enabled delete_incomplete, then this backup will be deleted during -the next ccollect run on the same interval.

      Detailed description of "mtime"

      By default, ccollect.sh chooses the most recent backup directory for cloning or +the next ccollect run on the same interval.

      Detailed description of "mtime"

      By default, ccollect.sh chooses the most recent backup directory for cloning or the oldest for deletion based on the directory’s last change time (ctime). With this option, the sorting is done based on modification time (mtime). With this version of ccollect, the ctime and mtime of your backups will normally be the same and this option has no effect. However, if you, for example, move your backups to another hard disk using cp -a or rsync -a, you should use this option because the ctimes are not preserved during such operations.

      If you have any backups in your repository made with ccollect version 0.7.1 or -earlier, do not use this option.

      Detailed description of "quiet_if_down"

      By default, ccollect.sh emits a series of error messages if a source is not +earlier, do not use this option.

      Detailed description of "quiet_if_down"

      By default, ccollect.sh emits a series of error messages if a source is not connectable. With this option enabled, ccollect still reports that the source is not connectable but the associated error messages generated by rsync or ssh are suppressed. You may want to use this option for sources, -like notebook PCs, that are often disconnected.

    Hints

    Smart logging

    Since ccollect-0.6.1 you can use the ccollect-logwrapper.sh(1) for logging. +like notebook PCs, that are often disconnected.

    Hints

    Smart logging

    Since ccollect-0.6.1 you can use the ccollect-logwrapper.sh(1) for logging. You call it the same way you call ccollect.sh and it will create a logfile containing the output of ccollect.sh. For more information look at the manpage ccollect-logwrapper. The following is an example running ccollect-logwrapper.sh:

    u0219 ~ # ~chdscni9/ccollect-logwrapper.sh daily u0160.nshq.ch.netstream.com
     ccollect-logwrapper.sh (11722): Starting with arguments: daily u0160.nshq.ch.netstream.com
    -ccollect-logwrapper.sh (11722): Finished.

    Using a different ssh port

    The easiest way is to use your ~/.ssh/config file:

    host mx2.schottelius.org
    +ccollect-logwrapper.sh (11722): Finished.

    Using a different ssh port

    The easiest way is to use your ~/.ssh/config file:

    host mx2.schottelius.org
        Port 2342

    If you use that port for backup only and normally want to use another port, you can add HostName and "HostKeyAlias" (if you also have different keys on the different ports):

    Host hhydrogenium
    @@ -389,7 +391,7 @@ keys on the different ports):

    Host hhydrogenium
     Host bruehe
        Hostname bruehe.schottelius.org
        Port 22
    -   HostKeyAlias bruehe.schottelius.org

    Using source names or interval in pre_/post_exec scripts

    The pre-/post_exec scripts can access some internal variables from ccollect:

    • + HostKeyAlias bruehe.schottelius.org

    Using source names or interval in pre_/post_exec scripts

    The pre-/post_exec scripts can access some internal variables from ccollect:

    • INTERVAL: The interval specified on the command line
    • no_sources: number of sources @@ -398,9 +400,9 @@ Host bruehe
    • name: the name of the source currently being backed up (not available for generic pre_exec script) -

    Only available for post_exec:

    • +

    Only available for post_exec:

    • remote_host: name of host we backup to (empty if unused) -
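As an illustration of the variables listed above, here is a minimal sketch of a source-specific post_exec script that prints them. The fallback values exist only so the sketch can also be run outside of ccollect; they are assumptions for the example, not values ccollect sets.

```shell
#!/bin/sh
# Sketch of a source-specific post_exec script. ccollect is documented to
# export INTERVAL, no_sources, name and (for post_exec) remote_host; the
# defaults below are only there so the sketch runs standalone.
: "${INTERVAL:=daily}" "${no_sources:=1}" "${name:=example}" "${remote_host:=}"

# An empty remote_host means the backup went to the local host.
summary="[$name] interval=$INTERVAL sources=$no_sources target=${remote_host:-localhost}"
echo "$summary"
```

Remember that the script must be executable, otherwise ccollect cannot run it.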

    Using rsync protocol without ssh

    When you have a computer with little computing power, it may be useful to use +

Using rsync protocol without ssh

When you have a computer with little computing power, it may be useful to use rsync without ssh, directly using the rsync protocol (specify user@host::share in source). You may wish to use rsync_options to specify a password file to use for automatic backup.

Example:

backup:~# cat /etc/ccollect/sources/sample.backup.host.org/source
@@ -410,11 +412,11 @@ backup:~# cat /etc/ccollect/sources/sample.backup.host.org/rsync_options
 --password-file=/etc/ccollect/sources/sample.backup.host.org/rsync_password
 
 backup:~# cat /etc/ccollect/sources/sample.backup.host.org/rsync_password
-this_is_the_rsync_password

This hint was reported by Daniel Aubry.

Not excluding top-level directories

When you exclude "/proc" or "/mnt" from your backup, you may run into +this_is_the_rsync_password

This hint was reported by Daniel Aubry.

Not excluding top-level directories

When you exclude "/proc" or "/mnt" from your backup, you may run into trouble when you restore your backup. When you use "/proc/*" or "/mnt/\*" -instead, ccollect will backup empty directories.

Note

When those directories contain hidden files +instead, ccollect will backup empty directories.

Note

When those directories contain hidden files (those beginning with a dot (.)), -they will still be transferred!

This hint was reported by Marcus Wagner.

Re-using already created rsync-backups

If you used rsync directly before you use ccollect, you can +they will still be transferred!

This hint was reported by Marcus Wagner.

Re-using already created rsync-backups

If you used rsync directly before you use ccollect, you can use this old backup as initial backup for ccollect: You simply move it into a directory below the destination directory and name it "interval.0".

Example:

backup:/home/backup/web1# ls
@@ -427,8 +429,8 @@ backup:/home/backup/web1# mkdir daily.0
 backup:/home/backup/web1# mv * daily.0 2>/dev/null
 
 backup:/home/backup/web1# ls
-daily.0

Now you can use /home/backup/web1 as the destination for the backup.

Note

It does not matter anymore how you name your directory, as ccollect uses -the -c option from ls to find out which directory to clone from.

Note

Older versions (pre 0.6, iirc) had a problem, if you named the first backup +daily.0

Now you can use /home/backup/web1 as the destination for the backup.

Note

It does not matter anymore how you name your directory, as ccollect uses +the -c option from ls to find out which directory to clone from.

Note

Older versions (pre 0.6, iirc) had a problem, if you named the first backup something like "daily.initial". It was needed to use the "0" (or some number that is lower than the current year) as extension. ccollect used sort to find the latest backup. ccollect itself uses @@ -437,12 +439,12 @@ used sort to find the latest backup. sort. So, if you had a directory named "daily.initial", ccollect always diffed against this backup and transferred and deleted files which were deleted in previous backups. This means you simply -wasted resources, but your backup had been complete anyway.

Using pre_/post_exec

Your pre_/post_exec script does not need to be a script, you can also -use a link to

+wasted resources, but your backup had been complete anyway.

Using pre_/post_exec

Your pre_/post_exec script does not need to be a script, you can also +use a link to

  • an existing program
  • an already written script -

The only requirement is that it is executable.

Using source specific interval definitions

When you are backing up multiple hosts via cron each night, it may be +

The only requirement is that it is executable.

Using source specific interval definitions

When you are backing up multiple hosts via cron each night, it may be a problem that host "big_server" may only have 4 daily backups, because otherwise its backup device will be full. But for all other hosts you want to keep 20 daily backups. In this case you would create @@ -450,26 +452,26 @@ you want to keep 20 daily backups. In this case you would create /etc/ccollect/sources/big_server/intervals/daily containing "4".

Source specific intervals always override the default values. If you have to specify it individually for every host, because of different requirements, you can even omit creating -/etc/ccollect/defaults/intervals/daily.

Comparing backups

If you want to see what changed between two backups, you can use +/etc/ccollect/default/intervals/daily.

Comparing backups

If you want to see what changed between two backups, you can use rsync directly:

[12:00] u0255:ddba034.netstream.ch#  rsync -n -a --delete --stats --progress daily.20080324-0313.17841/ daily.20080325-0313.31148/

This results in a listing of changes. Because we pass -n to rsync no transfer -is made (i.e. report only mode).

This hint was reported by Daniel Aubry.

Testing for host reachability

If you want to test whether the host you try to backup is reachable, you can use +is made (i.e. report only mode).

This hint was reported by Daniel Aubry.
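The same comparison can be wrapped in a small function. This is a sketch under assumptions: `compare_backups` is a hypothetical name, and `--itemize-changes` is swapped in for `--stats --progress` to get one line per differing item.

```shell
#!/bin/sh
# Hypothetical helper: list what differs between two backup directories
# without transferring anything (-n). --itemize-changes prints one line
# per differing item; items that exist only in the second directory are
# reported as "*deleting" because of --delete.
compare_backups() {
    rsync -n -a --delete --itemize-changes "$1/" "$2/"
}

# Example call with the directory names from the listing above:
# compare_backups daily.20080324-0313.17841 daily.20080325-0313.31148
```

The first argument is the older backup and the second the newer one, so a file added in the newer backup shows up as a deletion candidate.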

Testing for host reachability

If you want to test whether the host you try to backup is reachable, you can use the following script as source specific pre-exec:

#!/bin/sh
-# ping -c1 -q `cat "/etc/ccollect/sources/$name/source" | cut -d"@" -f2 | cut -d":" -f1`

This prevents the deletion of old backups, if the host is not reachable.

This hint was reported by Daniel Aubry.

Easy check for errors

If you want to see whether there have been any errors while doing the backup, -you can run ccollect together with ccollect_analyse_logs.sh:

$ ccollect | ccollect_analyse_logs.sh e

F.A.Q.

What happens if one backup is broken or empty?

Let us assume that one backup failed (connection broke or the source +# ping -c1 -q `cat "/etc/ccollect/sources/$name/source" | cut -d"@" -f2 | cut -d":" -f1`

This prevents the deletion of old backups, if the host is not reachable.

This hint was reported by Daniel Aubry.
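A slightly more defensive variant of this pre-exec is sketched below. In the listing above the ping line begins with `#`, which a shell treats as a comment; the sketch assumes the command is meant to run. The helper names `source_host` and `host_reachable` are hypothetical. It also assumes, as the text implies, that `$name` is provided to the script and that a failing pre-exec stops the backup of that source.

```shell
#!/bin/sh
# Hypothetical helper: extract the host part from an rsync source
# definition such as user@host:/path or host:/path.
source_host() {
    printf '%s\n' "$1" | cut -d"@" -f2 | cut -d":" -f1
}

# Hypothetical helper: succeed only if the host answers a single ping.
host_reachable() {
    ping -c1 -q "$1" >/dev/null 2>&1
}

# $name is assumed to hold the source name, as in the listing above.
source_file="/etc/ccollect/sources/${name:-}/source"
if [ -r "$source_file" ]; then
    host_reachable "$(source_host "$(cat "$source_file")")" || exit 1
fi
```

Splitting the host extraction into its own function makes it easy to check against a source definition by hand before relying on the script in a nightly run.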

Easy check for errors

If you want to see whether there have been any errors while doing the backup, +you can run ccollect together with ccollect_analyse_logs.sh:

$ ccollect | ccollect_analyse_logs.sh e

F.A.Q.

What happens if one backup is broken or empty?

Let us assume that one backup failed (connection broke or the source hard disk had some failures). Therefore we’ve got one incomplete backup in our history.

ccollect will transfer the missing files the next time you use it. -This leads to

  • +This leads to

    • more transferred files
    • much greater disk space usage, as no hardlinks can be used

    If the whole ccollect process was interrupted, ccollect (since 0.6) can detect that and remove the incomplete backups, so you can clone from a complete -backup instead

When backing up from localhost the destination is also included. Is this a bug?

No. ccollect passes your source definition directly to rsync. It +backup instead

When backing up from localhost the destination is also included. Is this a bug?

No. ccollect passes your source definition directly to rsync. It does not try to analyze it. So it actually does not know if a source comes from local harddisk or from a remote server. And it does not want to. When you backup from the local harddisk (which is perhaps not even a good idea when thinking of security), add the destination -to source/exclude. (Daniel Aubry reported this problem)

Why does ccollect say "Permission denied" with my pre-/postexec script?

The most common error is that you have not given your script the correct -permissions. Try chmod 0755 /etc/ccollect/sources/'yoursource'/*_exec.

Why does the backup job fail when part of the source is a link?

When a part of your path you specified in the source is a +to source/exclude. (Daniel Aubry reported this problem)

Why does ccollect say "Permission denied" with my pre-/postexec script?

The most common error is that you have not given your script the correct +permissions. Try chmod 0755 /etc/ccollect/sources/'yoursource'/*_exec.
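To find such files before a backup run, a check along the following lines may help. It is a sketch only: `check_exec_hooks` is a hypothetical helper, not part of ccollect, and it assumes the hooks are named pre_exec and post_exec inside the source directory.

```shell
#!/bin/sh
# Hypothetical helper: report pre_exec/post_exec files in a source
# directory that exist but are not executable.
check_exec_hooks() {
    rc=0
    for hook in "$1/pre_exec" "$1/post_exec"; do
        [ -e "$hook" ] || continue      # hooks are optional
        if [ ! -x "$hook" ]; then
            echo "not executable: $hook"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (the source name is a placeholder):
# check_exec_hooks /etc/ccollect/sources/yoursource
```

The function returns a non-zero status if any existing hook lacks the executable bit, so it can be used in a cron wrapper ahead of the backup itself.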

Why does the backup job fail when part of the source is a link?

When a part of your path you specified in the source is a (symbolic, hard links are not possible for directories) link, the backup must fail.

First of all, let us have a look at how it looks like:

==> ccollect 0.4: Beginning backup using interval taeglich <==
 [testsource] Sa Apr 29 00:01:55 CEST 2006 Beginning to backup
@@ -488,12 +490,12 @@ lrwxrwxrwx 1 nico nico 29 2006-04-29 00:01 projekte -> oeffentlich/computer/p
 This link now links to something not reachable (dead link). It is
 impossible to create subdirectories under the broken link.

In conclusion you cannot use paths with a linked part.

However, you can backup directories containing symbolic links (in this case you could backup /home/user/nico, which contains -/home/user/nico/projekte and oeffentlich/computer/projekte).

How can I prevent missing the right time to enter my password?

As ccollect first deletes the old backups, it may take some time +/home/user/nico/projekte and oeffentlich/computer/projekte).

How can I prevent missing the right time to enter my password?

As ccollect first deletes the old backups, it may take some time until rsync requests the password for the ssh session from you.

The easiest way not to miss that point is running ccollect in screen, which has the ability to monitor the output for activity. So as soon as your screen beeps, after ccollect began to remove the last directory, you can enter your password (have a look at screen(1), especially "C-a M" -and "C-a _", for more information).

Backup fails, if autofs is running, but sources not reachable

If you are trying to backup a system containing paths that are managed +and "C-a _", for more information).

Backup fails, if autofs is running, but sources not reachable

If you are trying to backup a system containing paths that are managed by autofs, you may run into this error:

2009-12-01-23:14:15: ccollect 0.8.1: Beginning backup using interval monatlich
 [ikn] 2009-12-01-23:14:15: Beginning to backup
 [ikn] 2009-12-01-23:14:15: Executing /home/users/nico/ethz/ccollect/sources/ikn/pre_exec ...
@@ -507,7 +509,7 @@ Enter LUKS passphrase:
 [ikn] directory has vanished: "/home/users/nico/privat/firmen/ethz/autofs/sysadmin"
 [ikn] rsync warning: some files vanished before they could be transferred (code 24) at main.c(1057) [sender=3.0.6]
 [ikn] 2009-12-01-23:44:23: Source / is not readable. Skipping.

Thus, if you are unsure whether autofs paths can be mounted during backup, -stop autofs in pre_exec and reenable it in post_exec.

Examples

A backup host configuration from scratch

srwali01:~# mkdir /etc/ccollect
+stop autofs in pre_exec and reenable it in post_exec.

Examples

A backup host configuration from scratch

srwali01:~# mkdir /etc/ccollect
 srwali01:~# mkdir -p /etc/ccollect/defaults/intervals/
 srwali01:~# echo 28 > /etc/ccollect/defaults/intervals/taeglich
 srwali01:~# echo 52 > /etc/ccollect/defaults/intervals/woechentlich
@@ -552,7 +554,7 @@ srwali01:/etc/ccollect/sources/srwali03# cat > exclude << EOF
 > EOF
 srwali01:/etc/ccollect/sources/srwali03# echo 'root@10.103.2.3:/' > source
 srwali01:/etc/ccollect/sources/srwali03# echo /mnt/hdbackup/srwali03 > destination
-srwali01:/etc/ccollect/sources/srwali03# mkdir /mnt/hdbackup/srwali03

Using hard-links requires less disk space

# du (coreutils) 5.2.1
+srwali01:/etc/ccollect/sources/srwali03# mkdir /mnt/hdbackup/srwali03

Using hard-links requires less disk space

# du (coreutils) 5.2.1
 [10:53] srsyg01:sources% du -sh ~/backupdir
 4.6M    /home/nico/backupdir
 [10:53] srsyg01:sources% du -sh ~/backupdir/*
@@ -597,7 +599,7 @@ du (GNU coreutils) 5.93
 12G     hydrogenium/durcheinander.2006-01-17-00:27.13820
 1.5G    hydrogenium/durcheinander.2006-01-25-23:18.31328
 200M    hydrogenium/durcheinander.2006-01-26-00:11.3332

In the second report (without -l) the sizes include the space the inodes of -the hardlinks allocate.

A collection of backups on the backup server

All the data of my important hosts is backed up to eiche into +the hardlinks allocate.

A collection of backups on the backup server

All the data of my important hosts is backed up to eiche into /mnt/schwarzesloch/backup:

[9:24] eiche:backup# ls *
 creme:
 woechentlich.2006-01-26-22:22.4153   woechentlich.2006-02-12-11:48.2461
@@ -637,7 +639,7 @@ DDIR=/mnt/usb/backup
 
 rsync -av -H --delete /mnt/schwarzesloch/ "$DDIR/schwarzesloch/"
 
-rsync -av -H --delete /mnt/archiv/ "$DDIR/archiv/"

Processes running when doing ccollect -j

Truncated output from ps axuwwwf:

   S+   11:40   0:00  |   |   |   \_ /bin/sh /usr/local/bin/ccollect.sh daily -j ddba034 ddba045 ddba046 ddba047 ddba049 ddna010 ddna011
+rsync -av -H --delete /mnt/archiv/ "$DDIR/archiv/"

Processes running when doing ccollect -j

Truncated output from ps axuwwwf:

   S+   11:40   0:00  |   |   |   \_ /bin/sh /usr/local/bin/ccollect.sh daily -j ddba034 ddba045 ddba046 ddba047 ddba049 ddna010 ddna011
    S+   11:40   0:00  |   |   |       \_ /bin/sh /usr/local/bin/ccollect.sh daily ddba034
    S+   11:40   0:00  |   |   |       |   \_ /bin/sh /usr/local/bin/ccollect.sh daily ddba034
    R+   11:40  23:40  |   |   |       |   |   \_ rsync -a --delete --numeric-ids --relative --delete-excluded --link-dest=/home/server/backup/ddba034
diff --git a/software/ccollect/documentation/ccollect.html b/software/ccollect/documentation/ccollect.html
index 8ad905fc..f8653e9c 100644
--- a/software/ccollect/documentation/ccollect.html
+++ b/software/ccollect/documentation/ccollect.html
@@ -737,8 +737,8 @@ asciidoc.install();
 

ccollect - Installing, Configuring and Using

Nico Schottelius
<nico-ccollect__@__schottelius.org>
-version 2.3, -for ccollect 2.3, Initial Version from 2006-01-13 +version 2.5, +for ccollect 2.5, Initial Version from 2006-01-13
@@ -783,6 +783,11 @@ NetBSD on alpha/amd64/i386/sparc/sparc64 OpenBSD on amd64

+
  • +

    +Windows by installing Cygwin, OpenSSH and rsync +

    +
  • It should run on any Unix that supports rsync and has a POSIX-compatible bourne shell. If your platform is not listed above and you have it successfully @@ -2206,9 +2211,9 @@ rsync -av -H --delete /mnt/archiv/ "$DDIR/archiv/"


    diff --git a/software/ccollect/documentation/ccollect.text b/software/ccollect/documentation/ccollect.text index b267cdb6..f252b25d 100644 --- a/software/ccollect/documentation/ccollect.text +++ b/software/ccollect/documentation/ccollect.text @@ -1,7 +1,7 @@ ccollect - Installing, Configuring and Using ============================================ Nico Schottelius -2.3, for ccollect 2.3, Initial Version from 2006-01-13 +2.5, for ccollect 2.5, Initial Version from 2006-01-13 :Author Initials: NS @@ -26,6 +26,7 @@ Supported and tested operating systems and architectures - Mac OS X 10.5 - NetBSD on alpha/amd64/i386/sparc/sparc64 - OpenBSD on amd64 +- Windows by installing Cygwin, OpenSSH and rsync It *should* run on any Unix that supports `rsync` and has a POSIX-compatible bourne shell. If your platform is not listed above and you have it successfully diff --git a/software/ccollect/documentation/man/ccollect.1 b/software/ccollect/documentation/man/ccollect.1 index 1aae7651..eced1a49 100644 --- a/software/ccollect/documentation/man/ccollect.1 +++ b/software/ccollect/documentation/man/ccollect.1 @@ -1,13 +1,13 @@ '\" t .\" Title: ccollect .\" Author: Nico Schottelius -.\" Generator: DocBook XSL Stylesheets v1.76.1 -.\" Date: 03/22/2017 +.\" Generator: DocBook XSL Stylesheets v1.79.1 +.\" Date: 05/01/2019 .\" Manual: \ \& .\" Source: \ \& .\" Language: English .\" -.TH "CCOLLECT" "1" "03/22/2017" "\ \&" "\ \&" +.TH "CCOLLECT" "1" "05/01/2019" "\ \&" "\ \&" .\" ----------------------------------------------------------------- .\" * Define some portability stuff .\" ----------------------------------------------------------------- diff --git a/software/ccollect/documentation/man/ccollect.htm b/software/ccollect/documentation/man/ccollect.htm index f460e445..27ecc41b 100644 --- a/software/ccollect/documentation/man/ccollect.htm +++ b/software/ccollect/documentation/man/ccollect.htm @@ -1,8 +1,8 @@ -ccollect(1)

    ccollect(1)


    NAME

    ccollect - (pseudo) incremental backup with different exclude lists using hardlinks and rsync

    SYNOPSIS

    ccollect.sh [args] <interval name> <sources to backup>

    DESCRIPTION

    ccollect is a backup utility written in the sh-scripting language. +ccollect(1)

    ccollect(1)


    NAME

    ccollect - (pseudo) incremental backup with different exclude lists using hardlinks and rsync

    SYNOPSIS

    ccollect.sh [args] <interval name> <sources to backup>

    DESCRIPTION

    ccollect is a backup utility written in the sh-scripting language. It does not depend on a specific shell, only /bin/sh needs to be bourne shell compatible (like dash, ksh, zsh, bash, …).

    For more information refer to the manual titled "ccollect - Installing, Configuring and Using" (available as text (asciidoc), -texinfo or html).

    OPTIONS

    +texinfo or html).

    OPTIONS

    -a, --all
    Backup all sources specified in /etc/ccollect/sources @@ -39,10 +39,10 @@ texinfo or html).

    LOGGING MECHANISM

    ccollect logging depends on running in non-interactive/interactive mode -and on specified options. The mechanism behaves as follows:

    +

    LOGGING MECHANISM

    ccollect logging depends on running in non-interactive/interactive mode +and on specified options. The mechanism behaves as follows:

    non-interactive mode -
    • +
      • standard output goes to syslog
      • optional: specify logging into file @@ -52,16 +52,16 @@ log all output by default optional: log only errors
      interactive mode -
      • +
        • standard output goes to stdout
        • log only errors
        • optional: log into syslog or file -

          • +

            • log all output by default
            • optional: log only errors -

    SEE ALSO

    ccollect_add_source(1), ccollect_analyse_logs(1), ccollect_logwrapper(1) -ccollect_delete_source(1), ccollect_list_intervals(1)

    AUTHOR

    Nico Schottelius <nico-ccollect--@--schottelius.org>

    COPYING

    Copyright (C) 2006-2008 Nico Schottelius. Free use of this software is +

    SEE ALSO

    ccollect_add_source(1), ccollect_analyse_logs(1), ccollect_logwrapper(1) +ccollect_delete_source(1), ccollect_list_intervals(1)

    AUTHOR

    Nico Schottelius <nico-ccollect--@--schottelius.org>

    COPYING

    Copyright (C) 2006-2008 Nico Schottelius. Free use of this software is granted under the terms of the GNU General Public License Version 3 (GPLv3).

    diff --git a/software/ccollect/documentation/man/ccollect.html b/software/ccollect/documentation/man/ccollect.html index 8287d170..450fe050 100644 --- a/software/ccollect/documentation/man/ccollect.html +++ b/software/ccollect/documentation/man/ccollect.html @@ -1,9 +1,10 @@ + - + ccollect(1)