
When I run the following command, I get errors about files or directories (such as /root) that duplicity does not have permission to access:

duplicity --dry-run --include="/home/user" --include="/other/dir" \
    --exclude='/' / file:///tmp/backup-test/

I wondered why this happens and ran strace to see which files are accessed:

strace -o /tmp/duplicity-traced -e trace=file !!

To my surprise, every file in / and /home is accessed through stat(). This does not look good performance-wise. Is duplicity simply not optimized for this kind of selection, or am I missing something?

Lekensteyn
  • Probably better just to synchronize one directory at a time. – poolie May 07 '11 at 12:40
  • @poolie: I want to sync my home dir and /etc, do you think it's better to keep these separate? Please clarify, thanks. – Lekensteyn May 07 '11 at 12:42
  • Yes, I think it's much better to do them separately. One other reason is that you can safely sync your home directory as yourself, whereas to copy /etc you will need to run as root. – poolie May 08 '11 at 05:38

1 Answer


After digging through the source code as retrieved by apt-get source duplicity, I conclude that avoiding these stat calls is not possible without modifying the source code and its algorithm.

The way duplicity works:

  1. Scan the *source_directory* for entries
  2. For each entry found, make an lstat() call. This is used to determine the inode type (e.g. regular file or directory). The result is cached for this entry
  3. If the entry exists...

    1. and is a symlink, the destination will be read using readlink
    2. and is a regular file or directory, access() will be called to detect whether the entry is readable or not
    3. if the entry is readable, it will be tested by each selection function (set by options like --include and --exclude)

This looks very reasonable, and unless disk access is very slow (NFS?), there is no need to change the algorithm to do the path check before the stat calls.
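
The order of operations described above can be illustrated with a minimal Python sketch. This is not duplicity's actual code: `make_selector` and `scan` are hypothetical names, the prefix-matching rule is a simplified stand-in for --include and --exclude, the "first rule with an opinion wins" behaviour is an assumption, and recursion into subdirectories is left out. The point is only that lstat() and access() run on every entry before any selection rule gets consulted.

```python
import os
import stat


def make_selector(prefix, include):
    """Hypothetical stand-in for one --include/--exclude rule."""
    prefix = os.path.normpath(prefix)

    def selector(path):
        path = os.path.normpath(path)
        # Match the prefix itself or anything below it.
        if path == prefix or path.startswith(prefix.rstrip(os.sep) + os.sep):
            return include
        return None  # this rule has no opinion about the path

    return selector


def scan(source_directory, selectors):
    """Scan one directory level; return (statted, selected) path lists."""
    statted, selected = [], []
    for name in sorted(os.listdir(source_directory)):  # step 1
        path = os.path.join(source_directory, name)
        try:
            info = os.lstat(path)  # step 2: find out the inode type
        except OSError:
            continue  # entry vanished or cannot be examined
        statted.append(path)
        if stat.S_ISLNK(info.st_mode):
            os.readlink(path)  # step 3.1: read the symlink destination
        elif stat.S_ISREG(info.st_mode) or stat.S_ISDIR(info.st_mode):
            if not os.access(path, os.R_OK):  # step 3.2: readable?
                continue
        # Step 3.3: only now are the selection rules applied.
        # Assumption: the first rule with an opinion decides.
        verdict = False
        for selector in selectors:
            result = selector(path)
            if result is not None:
                verdict = result
                break
        if verdict:
            selected.append(path)
    return statted, selected
```

Calling `scan("/", [make_selector("/home/user", True), make_selector("/", False)])` would put every top-level entry of / into the `statted` list, even though the rules exclude all of them, which matches what the strace output in the question shows.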

Lekensteyn