Mojca Miklavec | 1 Feb 2008 02:06

Re: rsync-ing from two locations with same filenames (at different versions)

On Jan 30, 2008 2:38 PM, Matt McCutchen wrote:
> On Wed, 2008-01-30 at 09:48 +0100, Mojca Miklavec wrote:
> > Neither helps. Even if I have a file of a different size and with a
> > different timestamp, and even if I add --checksum or --ignore-times,
> > the old file in dest won't be modified (overwritten by a newer file).
>
> I can't reproduce the problem.  I ran the script in your original
> message, except I changed "b2" to "b22" and waited a few seconds before
> running that line so the file would get a later mtime.  Both rsync 2.6.9
> and the latest development rsync correctly replaced b.txt with the
> version from new/ on the second run.  Am I missing something?

I don't know. Sometimes it works and sometimes not (as a rule of thumb,
mostly not). Even if I wait for a few minutes in between, the new
file won't be chosen.

> rsync --version
rsync  version 2.6.3  protocol version 28

That might be old, but that was the default that came with fink on Mac
OS X (if the error has been fixed in the meantime, I will upgrade).

> ll new/dir1/
skupno 4,0K
-rw-r--r-- 1 mojca wheel 6 feb  1 02:00 b.txt

> ll full/dir1/
skupno 8,0K
-rw-r--r-- 1 mojca wheel 2 jan 30 09:36 a.txt
-rw-r--r-- 1 mojca wheel 4 feb  1 01:52 b.txt

Matt McCutchen | 1 Feb 2008 02:36

Re: rsync-ing from two locations with same filenames (at different versions)

On Fri, 2008-02-01 at 02:06 +0100, Mojca Miklavec wrote:
> I don't know. Sometimes it works and sometimes not (as a rule of thumb,
> mostly not). Even if I wait for a few minutes in between, the new
> file won't be chosen.
> 
> > rsync --version
> rsync  version 2.6.3  protocol version 28
> 
> That might be old, but that was the default that came with fink on Mac
> OS X (if the error has been fixed in the meantime, I will upgrade).

Duh.  I realize now that it's perfectly reasonable for you to be able to
reproduce the problem while I can't.  Versions 2.6.9 and earlier of
rsync sort the file-list using the C library's quicksort, an unstable
sort, so the results in case of duplicate files are highly sensitive to
both the C library implementation and the order of directory entries in
the source (which in turn is sensitive to the filesystem
implementation).  You probably have both a different C library and a
different filesystem than I do.

In any case, since rsync 3.0.0pre1, the default file-list sorting
algorithm is a mergesort, which is stable, so files from earlier source
arguments take priority.  If you upgrade to an rsync 3.0.0pre* version,
your scenario should work consistently.  If it doesn't, that's a bug we
should try to fix.
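
To illustrate why stability matters here, the following is a minimal sketch and not rsync's actual code. The struct `entry`, its `src_index` field, and the functions `stable_sort_by_name` and `winning_source` are hypothetical names made up for this example. Entries are merge-sorted by name only; because ties are always resolved in favour of the left half, entries with the same name keep their input order, so the one that came from the earlier source argument ends up first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical file-list entry: a name plus the index of the source
 * argument it came from (0 = first source on the command line). */
struct entry {
    const char *name;
    int src_index;
};

/* Merge the sorted runs a[lo..mid) and a[mid..hi) through tmp.
 * Taking from the left run on ties (<=) is what makes the sort stable. */
static void merge(struct entry *a, struct entry *tmp,
                  size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        if (strcmp(a[i].name, a[j].name) <= 0)
            tmp[k++] = a[i++];
        else
            tmp[k++] = a[j++];
    }
    while (i < mid)
        tmp[k++] = a[i++];
    while (j < hi)
        tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof *a);
}

static void msort(struct entry *a, struct entry *tmp, size_t lo, size_t hi)
{
    size_t mid;
    if (hi - lo < 2)
        return;
    mid = lo + (hi - lo) / 2;
    msort(a, tmp, lo, mid);
    msort(a, tmp, mid, hi);
    merge(a, tmp, lo, mid, hi);
}

/* Sort n entries by name only. Returns 0 on success, -1 if out of memory. */
int stable_sort_by_name(struct entry *a, size_t n)
{
    struct entry *tmp;
    assert(a != NULL || n == 0);
    if (n < 2)
        return 0;
    tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    msort(a, tmp, 0, n);
    free(tmp);
    return 0;
}

/* In a sorted list, the first entry with a given name is the one that
 * would take priority. Returns its src_index, or -1 if the name is absent. */
int winning_source(const struct entry *a, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(a[i].name, name) == 0)
            return a[i].src_index;
    return -1;
}
```

With the C library's qsort there is no such guarantee for equal names, which is consistent with the behaviour differing between systems.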

Matt

--

-- 
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync

Mojca Miklavec | 1 Feb 2008 09:46

Re: rsync-ing from two locations with same filenames (at different versions)

Oh, thanks a lot for the explanation :)
I will try to figure out how to build it and then test.


Zane Brady | 1 Feb 2008 13:20

RE: rsync Digest, Vol 62, Issue 1

Yep

Zane 

-----Original Message-----
From: rsync-bounces+zane_brady=trimble.com <at> lists.samba.org
[mailto:rsync-bounces+zane_brady=trimble.com <at> lists.samba.org] On Behalf Of rsync-request <at> lists.samba.org
Sent: Friday, February 01, 2008 7:01 AM
To: rsync <at> lists.samba.org
Subject: rsync Digest, Vol 62, Issue 1

Send rsync mailing list submissions to
	rsync <at> lists.samba.org

To subscribe or unsubscribe via the World Wide Web, visit
	https://lists.samba.org/mailman/listinfo/rsync
or, via email, send a message with subject or body 'help' to
	rsync-request <at> lists.samba.org

You can reach the person managing the list at
	rsync-owner <at> lists.samba.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of rsync digest..."

To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

---------------------------------------

Today's Topics:

Mike Bombich | 2 Feb 2008 07:18

Re: creation date and OSX [performance]

Looking at this patch from a performance perspective, it appears that getattrlist is called twice for every file:

23:57:24.341  lstat        00-basic-permissions/owned-by-root  0.000011  rsync
23:57:24.341  listxattr    00-basic-permissions/owned-by-root  0.000006  rsync
23:57:24.341  getattrlist  00-basic-permissions/owned-by-root  0.000006  rsync
23:57:24.341  getattrlist  00-basic-permissions/owned-by-root  0.000005  rsync
23:57:24.341  lstat        00-basic-permissions/owned-by-www   0.000008  rsync
23:57:24.341  listxattr    00-basic-permissions/owned-by-www   0.000005  rsync
23:57:24.341  getattrlist  00-basic-permissions/owned-by-www   0.000006  rsync
23:57:24.341  getattrlist  00-basic-permissions/owned-by-www   0.000005  rsync


The first time, it is called via

    sys_llistxattr()
    getCreationTime()

Basically, we determine whether the file has a creation date.  If it does, we add the CRTIME_XATTR string to the xattr list.  The creation date isn't actually cached here, though.  To get the actual creation date, getattrlist is called again via xattrs.c:

    get_xattr_data()
    sys_lgetxattr()
    get_crtime_xattr()
    getCreationTime()


The performance hit is significant, and I'm wondering how safe it is to simply assume that every file has a creation date (given that this section is wrapped in #if HAVE_OSX_XATTRS) and therefore drop the first getCreationTime call and add the CRTIME_XATTR string to the xattr list by default.  For example:

// sysxattrs.c:150
ssize_t sys_llistxattr(const char *path, char *list, size_t size)
{
    ssize_t ret = listxattr(path, list, size, XATTR_NOFOLLOW);
    if (ret < 0)
        return ret;
    // if (getCreationTime(path) != NULL) {
        ret += sizeof CRTIME_XATTR;
        if (list) {
            if ((size_t)ret > size) {
                errno = ERANGE;
                return -1;
            }
            memcpy(list + ret - sizeof CRTIME_XATTR,
                   CRTIME_XATTR, sizeof CRTIME_XATTR);
        }
    // }
    return ret;
}


Or would this bomb out when running on Mac OS X with a non-HFS volume as the source?  Or is there a better way to avoid this call (e.g. by determining the underlying filesystem)?
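
Another way to avoid the second lookup, sketched here only as an illustration and not taken from the patch, would be to remember the most recent result. Everything in this sketch is hypothetical: the `crtime_lookup_fn` type stands in for the expensive getCreationTime()/getattrlist path, `struct crtime_cache` and `cached_crtime` are invented names, and `demo_lookup` is a fake lookup that only counts how often it is called.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for the expensive per-file lookup. Returns 0 and
 * fills *out on success, -1 if the file has no creation time. */
typedef int (*crtime_lookup_fn)(const char *path, time_t *out);

/* One-entry cache keyed by path. Zero-initialise before first use. */
struct crtime_cache {
    char path[1024];
    int valid;      /* nonzero once the cache holds a result for path */
    int status;     /* cached return value of the lookup */
    time_t crtime;  /* cached creation time when status == 0 */
};

/* Return the creation time for path, calling lookup only when the path
 * differs from the one remembered from the previous call. */
int cached_crtime(struct crtime_cache *c, crtime_lookup_fn lookup,
                  const char *path, time_t *out)
{
    assert(c != NULL && lookup != NULL && path != NULL && out != NULL);
    if (!c->valid || strcmp(c->path, path) != 0) {
        if (strlen(path) >= sizeof c->path) {
            /* Path too long to remember: do an uncached lookup. */
            c->valid = 0;
            return lookup(path, out);
        }
        c->status = lookup(path, &c->crtime);
        strcpy(c->path, path);
        c->valid = 1;
    }
    if (c->status == 0)
        *out = c->crtime;
    return c->status;
}

/* Fake lookup for demonstration: counts calls and returns an arbitrary,
 * fixed timestamp. */
int demo_lookup_calls = 0;

int demo_lookup(const char *path, time_t *out)
{
    (void)path;
    demo_lookup_calls++;
    *out = (time_t)1201907844;
    return 0;
}
```

A cache like this trades one system call for a string comparison, but it can return a stale value if the file changes between the two calls, so it only makes sense when the list and get steps for a file run back to back.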

Mike

On Dec 1, 2007, at 10:45 PM, Robert DuToit wrote:

Hi,
I've been using rsync (on OS X Tiger, now Leopard) to back up my home folder daily, using -a -H -A -X --link-dest=dir to make incremental backups. There was a problem, though: many files, especially images, movies, etc., would be recopied each time instead of being hard-linked. I have been testing the pre5 release and found that it seems to make hard links correctly for all files. I am hoping rsync 3.0 can replace the Apple version, which has been so flawed.

I tried the osx-create-time.diff patch too, and it works, but it took twice as long to copy my home folder as without it and ground to a halt the last time. I know the creation date issue is somewhat "fringe" for rsync, but it does matter to a lot of OS X folk, so I don't know if there is any way to speed it up. I've made some small backup wrapper applications for people, and they always want the creation date.

I noticed that the rsync version now used in Carbon Copy Cloner is pretty "clean" with metadata, saves the creation date, and is very fast. Just some thoughts; I don't have any experience with this code, so I can't help in that way.  Thanks, Rob


Vitorio Machado | 2 Feb 2008 09:24

Re: creation date and OSX [performance]

Hi,

I think it's OK to run getattrlist only once, assuming that there is a creation date. My arguments:

1) First of all, most Macs run only HFS+. The exceptions are those running UFS (I have only seen one person mention this, for a server, so I think it's very rare) and those with FAT/NTFS volumes for compatibility with Windows.

2) From the getattrlist manpage:
     The getattrlist() function is only supported by certain volume format
     implementations.  For maximum compatibility, client programs should use
     high-level APIs (such as the Carbon File Manager) to access file system
     attributes.  These high-level APIs include logic to emulate file system
     attributes on volumes that don't support getattrlist().

In other words, if we really care about compatibility, we should use the high-level Carbon APIs, which test this for us and do the dirty work. That may be a good idea.

3) Also from the getattrlist manpage:
     Not all volumes support all attributes.  See the discussion of
     ATTR_VOL_ATTRIBUTES for a discussion of how to determine whether a
     particular volume supports a particular attribute.

I don't really know which discussion it refers to, but I suppose it is on the Apple developer site http://developer.apple.com . I haven't had time to look yet.

4) Again from the getattrlist manpage:

COMPATIBILITY
     Not all volumes support getattrlist().  The best way to test whether a
     volume supports this function is to simply call it and check the error
     result.  getattrlist() will return ENOTSUP if it is not supported on a
     particular volume.

I suppose that getattrlist on an unsupported volume will return this error. If it works as I expect, we only need to catch it, and that's it.
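
The "call it and check the error" approach from the manpage excerpt could look roughly like the sketch below. This is an untested illustration under stated assumptions, not code from the patch: `probe_crtime` and `enum crtime_result` are hypothetical names, the Mac OS X branch assumes the request/reply layout documented for getattrlist() with ATTR_CMN_CRTIME, and on any other platform the function simply reports "unsupported" so that the file still compiles.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <time.h>

#ifdef __APPLE__
#include <sys/types.h>
#include <sys/attr.h>
#include <unistd.h>
#endif

enum crtime_result {
    CRTIME_OK,           /* creation time was stored in *out */
    CRTIME_UNSUPPORTED,  /* volume (or platform) lacks getattrlist support */
    CRTIME_ERROR         /* some other failure; errno is set */
};

/* Try to read the creation time of path (seconds only). The caller can
 * skip the creation-time xattr entirely on CRTIME_UNSUPPORTED. */
enum crtime_result probe_crtime(const char *path, time_t *out)
{
#ifdef __APPLE__
    struct attrlist request;
    struct {
        u_int32_t length;        /* size of the returned data */
        struct timespec crtime;  /* ATTR_CMN_CRTIME payload */
    } __attribute__((aligned(4), packed)) reply;

    assert(path != NULL && out != NULL);
    memset(&request, 0, sizeof request);
    request.bitmapcount = ATTR_BIT_MAP_COUNT;
    request.commonattr = ATTR_CMN_CRTIME;

    if (getattrlist(path, &request, &reply, sizeof reply,
                    FSOPT_NOFOLLOW) == 0) {
        *out = reply.crtime.tv_sec;
        return CRTIME_OK;
    }
    /* ENOTSUP is the documented "volume does not support this" error. */
    return errno == ENOTSUP ? CRTIME_UNSUPPORTED : CRTIME_ERROR;
#else
    /* Assumption for this sketch: no getattrlist outside Mac OS X. */
    assert(path != NULL && out != NULL);
    memset(out, 0, sizeof *out);
    errno = ENOTSUP;
    return CRTIME_UNSUPPORTED;
#endif
}
```

Given the manpage's warning about variant behaviour, treating only ENOTSUP as "unsupported" may be too strict in practice; some volume formats might report a different error.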

Also note (again from the getattrlist manpage):

     The getattrlist() function has been undocumented for more than two years.
     In that time a number of volume format implementations have been created
     without a proper specification for the behaviour of this routine.  You
     may encounter volume format implementations with slightly different
     behaviour than what is described here.  Your program is expected to be
     tolerant of this variant behaviour.

So there are some clues to check. I will probably look into it when I have some time, but I have already committed myself to 10.3 compatibility, and I have had unexpected personal problems, so I can't say when I will be able to give time to these projects.

Best regards,

Vitorio

Giuliano Gavazzi | 2 Feb 2008 12:08

Re: creation date and OSX [performance]


On 2 Feb 2008, at 09:24, Vitorio Machado wrote:

> 3) Also from getattrlist manpage:
>      Not all volumes support all attributes.  See the discussion of
>      ATTR_VOL_ATTRIBUTES for a discussion of how to determine whether a
>      particular volume supports a particular attribute.
>
> I don't really know what discussion it refers, but I suppose it
> should be on the Apple developer site http://developer.apple.com . I
> didn't have the time to look, yet.
>

The discussion is in that same manpage.

Giuliano

Paul Slootman | 2 Feb 2008 13:34

hardlinks not working with inode number > 2^31

I've been using 3.0.0pre8 to move a Debian archive from one filesystem
to another. This archive contains a daily snapshot of the Debian ftp
site, with common files hardlinked to save space.

I noticed that it was using far more space than necessary. Upon
investigation it seems that all the source files that have inode numbers
greater than 2^31 aren't being hardlinked together at the destination.

XFS uses a sparse inode number distribution. Here is an ls -i of a
random directory:

3239109826 libtomcat4-java_4.0.3-3woody3_all.deb
  18105717 libtomcat4-java_4.1.31-3_all.deb
1092085178 libtomcat4-java_4.1.31-4_all.deb
  18105718 tomcat4-admin_4.1.31-3_all.deb
1092085181 tomcat4-admin_4.1.31-4_all.deb
3239109829 tomcat4-webapps_4.0.3-3woody3_all.deb
  18105719 tomcat4-webapps_4.1.31-3_all.deb
1092085183 tomcat4-webapps_4.1.31-4_all.deb
3239109831 tomcat4_4.0.3-3woody3.diff.gz
3239109832 tomcat4_4.0.3-3woody3.dsc
3239109833 tomcat4_4.0.3-3woody3_all.deb
3239109834 tomcat4_4.0.3.orig.tar.gz
  18105720 tomcat4_4.1.31-3.diff.gz
  18105721 tomcat4_4.1.31-3.dsc
  18105722 tomcat4_4.1.31-3_all.deb
1092085188 tomcat4_4.1.31-4.diff.gz
1092085190 tomcat4_4.1.31-4.dsc
1092085191 tomcat4_4.1.31-4_all.deb
1091849984 tomcat4_4.1.31.orig.tar.gz

The files with 3239109xxx inodes aren't getting hardlinked.
Note that this is still less than 2^32, so it should fit in an unsigned 32-bit int.
Note also that on large (>1TB) XFS filesystems on 64-bit systems, the mount
option "inode64" is recommended, as otherwise inodes are only allocated
from the first 1TB of space. That restriction reduces performance (the inodes
are then not always located near the data), and can also make it impossible
to create new files if the first 1TB is full.

If rsync cannot currently cope with inodes larger than 31 bits, then this
needs to be dealt with IMHO. It's not as if filesystems larger than 1TB
are a rare thing nowadays, with single 3.5-inch SATA disks
already available at 1TB sizes for very reasonable prices.
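
The arithmetic behind the 2^31 observation can be checked with a small sketch. It does not reproduce rsync's code, and whether rsync really keeps inode numbers in a signed 32-bit field is only a hypothesis at this point in the thread. The function names `inode_bits`, `fits_signed32`, `fits_unsigned32`, and `to_u32_checked` are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent an inode number (0 needs 0 bits). */
unsigned inode_bits(uint64_t ino)
{
    unsigned bits = 0;
    while (ino != 0) {
        bits++;
        ino >>= 1;
    }
    return bits;
}

/* Nonzero if ino can be stored in a signed 32-bit field without loss,
 * i.e. it is at most INT32_MAX (2^31 - 1). */
int fits_signed32(uint64_t ino)
{
    return ino <= (uint64_t)INT32_MAX;
}

/* Nonzero if ino can be stored in an unsigned 32-bit field (below 2^32). */
int fits_unsigned32(uint64_t ino)
{
    return ino <= (uint64_t)UINT32_MAX;
}

/* Narrow to 32 bits, asserting that no information is lost. */
uint32_t to_u32_checked(uint64_t ino)
{
    assert(fits_unsigned32(ino));
    return (uint32_t)ino;
}
```

For the listing above, 3239109826 needs 32 bits and so does not fit a signed 32-bit field, while 1092085178 needs 31 bits and 18105717 needs 25, which matches the pattern of which files are affected. With inode64, inode numbers can exceed 32 bits altogether, so a 64-bit type would be the safer choice.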

Paul Slootman

Jochen Reinwand | 2 Feb 2008 14:00

Re: [PATCH] Add forgotten setup_iconv() call so that daemon-side iconv works.

Hi Matt,

thanks for the patch!

I applied the patch to rsync-3.0.0pre8 and rsync-HEAD-20080127-2251GMT, but
the new setup_iconv call doesn't seem to work. When I try to connect with
the --iconv option set, the daemon writes the following to the syslog and
closes the connection:

iconv_open("UTF-8", "iso8859-15") failed
rsync error: requested action not supported (code 4) at rsync.c(120) [receiver=3.0.0pre8]

Without --iconv set everything is working fine!

Any idea what could be wrong? As the client I used an openSUSE 10.2 system,
and as the server I tried both an openSUSE 10.2 system and a Buffalo
LinkStation running openlink. The result is the same for both.

Thanks for your help!

Jochen

Simo Sorce | 2 Feb 2008 18:19

patches/ dir missing in pre8 ?

What happened to the patches dir?
It is missing in the pre8 tarball.

Simo.

-- 
Simo Sorce * Red Hat, Inc * New York
