Nikita Danilov | 1 Jan 03:30 2007

Re: Finding hardlinks

Mikulas Patocka writes:
 > 
 > 
 > On Fri, 29 Dec 2006, Trond Myklebust wrote:
 > 
 > > On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote:
 > >> Why don't you rip off the support for colliding inode number from the
 > >> kernel at all (i.e. remove iget5_locked)?
 > >>
 > >> It's reasonable to have either no support for colliding ino_t or full
 > >> support for that (including syscalls that userspace can use to work with
 > >> such filesystem) --- but I don't see any point in having half-way support
 > >> in kernel as is right now.
 > >
 > > What would ino_t have to do with inode numbers? It is only used as a
 > > hash table lookup. The inode number is set in the ->getattr() callback.
 > 
 > The question is: why does the kernel contain iget5 function that looks up 
 > according to callback, if the filesystem cannot have more than 64-bit 
 > inode identifier?

Generally speaking, file system might have two different identifiers for
files:

 - one that makes it easy to tell whether two files are the same one;

 - one that makes it easy to locate file on the storage.

According to POSIX, inode number should always work as identifier of the
first class, but not necessarily as one of the second. For example, in

Josh Boyer | 1 Jan 05:32 2007

Re: [PATCH] Make JFFS depend on CONFIG_BROKEN

On 12/30/06, Adrian Bunk <bunk <at> stusta.de> wrote:
> On Mon, Dec 18, 2006 at 07:25:56AM -0600, Josh Boyer wrote:
> > +       NOTE: This filesystem is deprecated and is scheduled for removal in
> > +       2.6.21.  See Documentation/feature-removal-schedule.txt
> >...
>
> $ grep -i jffs Documentation/feature-removal-schedule.txt
> $

This was a follow-on patch to Jeff's 'kill-jffs' branch.  He asked me
to resend it as a separate patch without all the quoted context.

Jeff?

josh

Mikulas Patocka | 1 Jan 23:47 2007

Re: Finding hardlinks

Hi!

>>>> If user (or script) doesn't specify that flag, it
>>>> doesn't help. I think
>>>> the best solution for these filesystems would be
>>>> either to add new syscall
>>>> 	int is_hardlink(char *filename1, char *filename2)
>>>> (but I know adding syscall bloat may be objectionable)
>>>
>>> it's also the wrong api; the filenames may have been
>>> changed under you
>>> just as you return from this call, so it really is a
>>> "was_hardlink_at_some_point()" as you specify it.
>>> If you make it work on fd's.. it has a chance at least.
>>
>> Yes, but it doesn't matter --- if the tree changes under
>> "cp -a" command, no one guarantees you what you get.
>> 	int fis_hardlink(int handle1, int handle2);
>> is another possibility but it can't detect hardlinked
>> symlinks.
>
> Ugh. Is it even legal to hardlink symlinks?

Why shouldn't it be? It seems to work quite fine in Linux.

> Anyway, cp -a is not the only application that wants to do hardlink
> detection.

I tested programs for ino_t collision (I intentionally injected it) and 
found that CP from coreutils 6.7 fails to copy directories but displays 

Mikulas Patocka | 1 Jan 23:58 2007

Re: Finding hardlinks

> > The question is: why does the kernel contain iget5 function that looks up
> > according to callback, if the filesystem cannot have more than 64-bit
> > inode identifier?
>
> Generally speaking, file system might have two different identifiers for
> files:
>
> - one that makes it easy to tell whether two files are the same one;
>
> - one that makes it easy to locate file on the storage.
>
> According to POSIX, inode number should always work as identifier of the
> first class, but not necessarily as one of the second. For example, in
> reiserfs something called "a key" is used to locate on-disk inode, which
> in turn, contains inode number. Identifiers of the second class tend to

BTW. How does ReiserFS find that a given inode number (or object ID in 
ReiserFS terminology) is free before assigning it to new file/directory?

Mikulas

> live in directory entries, and during lookup we want to consult inode
> cache _before_ reading inode from the disk (otherwise cache is mostly
> useless), right? This means that some file systems want to index inodes
> in a cache by something different than inode number.

Nikita Danilov | 2 Jan 00:05 2007

Re: Finding hardlinks

Mikulas Patocka writes:

[...]

 > 
 > BTW. How does ReiserFS find that a given inode number (or object ID in 
 > ReiserFS terminology) is free before assigning it to new file/directory?

reiserfs v3 has an extent map of free object identifiers in
super-block. reiser4 used 64 bit object identifiers without reuse.

 > 
 > Mikulas

Nikita.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mikulas Patocka | 2 Jan 00:22 2007

Re: Finding hardlinks

> > BTW. How does ReiserFS find that a given inode number (or object ID in
> > ReiserFS terminology) is free before assigning it to new file/directory?
>
> reiserfs v3 has an extent map of free object identifiers in
> super-block.

Inode free space can have at most 2^31 extents --- if inode numbers
alternate between "allocated" and "free". How do you pack that into the
superblock?
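
The worst case above can be checked on a small scale with the following C sketch. It is only an illustration, not ReiserFS code: count_free_extents() is a hypothetical helper that counts maximal runs of free entries in an in-memory allocation map. With a strictly alternating pattern of n entries it finds n/2 free extents, which for a 32-bit identifier space gives 2^32 / 2 = 2^31.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: count maximal runs ("extents") of free entries.
 * allocated[i] is true when identifier i is in use.
 */
size_t count_free_extents(const bool *allocated, size_t n)
{
    size_t extents = 0;

    for (size_t i = 0; i < n; i++) {
        /* A free entry starts a new extent if it is first or follows a used one. */
        bool starts_run = !allocated[i] && (i == 0 || allocated[i - 1]);

        if (starts_run)
            extents++;
    }
    return extents;
}
```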

> reiser4 used 64 bit object identifiers without reuse.

So you are going to hit the same problem as I did with SpadFS --- you
can't export 64-bit inode number to userspace (programs without
-D_FILE_OFFSET_BITS=64 will have stat() randomly failing with EOVERFLOW
then) and if you export only 32-bit number, it will eventually wrap around
and colliding st_ino will cause data corruption with many userspace
programs.
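
The two failure modes can be illustrated with a short C sketch. The helper names ino_fits_32() and ino_truncate() are hypothetical and only model the arithmetic; they are not kernel or libc interfaces. The first shows when a 64-bit identifier no longer fits a 32-bit st_ino (the EOVERFLOW case), the second what happens if the high bits are simply dropped (the collision case).

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a 64-bit inode number can be reported through a 32-bit st_ino. */
bool ino_fits_32(uint64_t ino)
{
    return ino <= UINT32_MAX;
}

/* What would be exported if the high 32 bits were simply discarded. */
uint32_t ino_truncate(uint64_t ino)
{
    return (uint32_t)ino;
}
```

Under this model, identifiers 5 and 2^32 + 5 truncate to the same 32-bit value, so two distinct files would report the same st_ino.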

Mikulas

Jan Harkes | 2 Jan 00:53 2007

Re: Finding hardlinks

On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote:
> >Anyway, cp -a is not the only application that wants to do hardlink
> >detection.
> 
> I tested programs for ino_t collision (I intentionally injected it) and 
> found that CP from coreutils 6.7 fails to copy directories but displays 
> error messages (coreutils 5 work fine). MC and ARJ skip directories with
> colliding ino_t and pretend that operation completed successfully. FTS
> library fails to walk directories returning FTS_DC error. Diffutils, find,
> grep fail to search directories with colliding inode numbers. Tar seems
> tolerant except incremental backup (which I didn't try). All programs
> except diff were tolerant to colliding ino_t on files.

Thanks for testing so many programs, but... did the files/symlinks with
colliding inode number have i_nlink > 1? Or did you also have directories
with colliding inode numbers? It looks like you've introduced hardlinked
directories in your test, which are definitely not supported. In fact, this
will probably cause not only issues for userspace programs, but also
locking and garbage collection issues in the kernel's dcache.

I'm surprised you're seeing so many problems. The only find problem that
I am aware of is the one where it assumes that there will be only
i_nlink-2 subdirectories in a given directory; this optimization can be
disabled with -noleaf. The only problems I've encountered with ino_t
collisions are archivers and other programs that recursively try to copy
a tree while preserving hardlinks. And in all cases these seem to have
no problem with such collisions as long as i_nlink == 1.

Jan

Mikulas Patocka | 2 Jan 01:04 2007

Re: Finding hardlinks

On Mon, 1 Jan 2007, Jan Harkes wrote:

> On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote:
>>> Anyway, cp -a is not the only application that wants to do hardlink
>>> detection.
>>
>> I tested programs for ino_t collision (I intentionally injected it) and
>> found that CP from coreutils 6.7 fails to copy directories but displays
>> error messages (coreutils 5 work fine). MC and ARJ skip directories with
>> colliding ino_t and pretend that operation completed successfully. FTS
>> library fails to walk directories returning FTS_DC error. Diffutils, find,
>> grep fail to search directories with colliding inode numbers. Tar seems
>> tolerant except incremental backup (which I didn't try). All programs
>> except diff were tolerant to colliding ino_t on files.
>
> Thanks for testing so many programs, but... did the files/symlinks with
> colliding inode number have i_nlink > 1? Or did you also have directories
> with colliding inode numbers? It looks like you've introduced hardlinked
> directories in your test, which are definitely not supported. In fact, this
> will probably cause not only issues for userspace programs, but also
> locking and garbage collection issues in the kernel's dcache.

I tested it only on files without hardlinks (with i_nlink == 1) --- most
programs (except diff) are tolerant of collisions; they won't store st_ino
in memory unless i_nlink > 1.

I didn't hardlink directories, I just patched stat, lstat and fstat to 
always return st_ino == 0 --- and I've seen those failures. These failures 
are going to happen on non-POSIX filesystems in real world too, very 
rarely.

Christoph Hellwig | 2 Jan 15:26 2007

Re: [FSAIO][PATCH 6/8] Enable asynchronous wait page and lock page

On Thu, Dec 28, 2006 at 08:17:17PM +0530, Suparna Bhattacharya wrote:
> I am really bad with names :(  I tried using the _wq suffixes earlier and
> that seemed confusing to some, but if no one else objects I'm happy to use
> that. I thought aio_lock_page() might be misleading because it is
> synchronous if a regular wait queue entry is passed in, but again it may not
> be too bad.
> 
> What's your preference ? Does anything more intuitive come to mind ?

Being bad at naming seems to be a disease; at least I suffer from it
as well.  I wouldn't mind either the _wq or aio_ naming - _wq describes
the way it's called and aio_ describes it's a special case for aio.
Similar to how ->aio_read/->aio_write can be used for synchronous I/O
as well.


Christoph Hellwig | 2 Jan 15:29 2007

Re: [FSAIO][PATCH 7/8] Filesystem AIO read

On Thu, Dec 28, 2006 at 08:48:30PM +0530, Suparna Bhattacharya wrote:
> Yes, we can do that -- how about aio_restarted() as an alternate name ?

Sounds fine to me.

> > Plus possible naming updates discussed in the last mail.  Also do we
> > really need to pass current->io_wait here?  Isn't the waitqueue in
> > the kiocb always guaranteed to be the same?  Now that all pagecache
> 
> We don't have the kiocb available to this routine. Using current->io_wait
> avoids the need to pass the iocb down to deeper levels just for the sync vs
> async checks, also allowing such routines to be shared by other code which
> does not use iocbs (e.g. generic_file_sendfile->do_generic_file_read
> ->do_generic_mapping_read) without having to set up dummy iocbs.

We really want to switch sendfile to kiocbs btw - for one thing to
allow an aio_sendfile implementation and second to make it more common
to all the other I/O path code so we can avoid special cases in the
fs code.  So I'm not convinced by that argument.  But again we don't
need to put the io_wait removal into your patchkit.  I'll try to
hack on it once I get a little spare time.
