best practices for handling hard links
Marc Evans <marc <at> softwarehackery.com>
2014-11-17 12:11:00 GMT
I am trying to determine the current state of hard link handling in
duplicity. Assuming my belief is correct that hard links are not
preserved by duplicity, I would like to understand what the current
best practices are.
Background: I have about 26TB of raw data that I am backing up to the
cloud via duplicity. Once encrypted, etc, it consumes about 48TB in the
cloud, which includes 1 full backup plus daily incrementals spanning a
1 month period. The data includes many files that are already highly
compressed, as well as thousands of hard-linked files.
Experimentation shows that the hard-linked files are being stored
multiple times, and further that the hard links are not preserved on
restore.
Based on my reading of the mailing list archives, my observations seem
to be confirmed, though those threads are years old. I do see various
pieces in the code that are hard link oriented, however. I also see
discussion of special-casing hard link handling at duplicity
invocation: excludes are used to ensure that only one copy of each
group is actually backed up, and a hard link manifest is generated that
can be used by scripts to recreate the links after a restore.
Given the above, what is the state of hard link handling and what are
current best practices for dealing with them?
Thanks in advance - Marc