Paul E. McKenney | 1 Apr 2010 03:29
Picon

Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]

On Wed, Mar 31, 2010 at 11:53:28PM +0100, David Howells wrote:
> Eric Dumazet <eric.dumazet@...> wrote:
> 
> > If you dont own a lock, and test a pointer, what guarantee do you have
> > this pointer doesnt change right after you tested it ?
> 
> There are five possibilities:
> 
>  (1) A pointer points to something when you check, and still points to the
>      same thing after you've gained the lock.
> 
>  (2) A pointer points to something when you check, and points to something
>      else after you've gained the lock.
> 
>  (3) A pointer points to something when you check, and is NULL after you've
>      gained the lock.
> 
>  (4) A pointer points to NULL when you check, and points to something after
>      you've gained the lock.
> 
>  (5) A pointer points to NULL when you check, and points to NULL after you've
>      gained the lock.
> 
> However, what if you _know_ that the pointer can only ever be made non-NULL
> during initialisation, and may even be left unset?  That means possibility (4)
> can never happen, and that possibility (5) can be detected by testing before
> taking the lock.  Now, what if (5) is a common occurrence?  It might make
> sense to make the test.
> 
> And what matter if the pointer _does_ change after you test it.  If it was
(Continue reading)

Steven Whitehouse | 1 Apr 2010 14:16
Picon
Favicon

lockd and lock cancellation

Hi,

I'm trying to find my way around the lockd code and I'm currently a bit
stumped by the code relating to lock cancellation. There is only one
call to vfs_cancel_lock() in lockd/svclock.c and its return value isn't
checked.

It is used in combination with nlmsvc_unlink_block() which
unconditionally calls posix_unblock_lock(). There are also other places
where the code also calls nlmsvc_unlink_block() without first canceling
the lock. The way in which vfs_cancel_lock() is used suggests that
canceling a lock is a synchronous operation, and that it must succeed
before returning.

I'd have expected to see (pseudo code) something more like the
following:

ret = vfs_cancel_lock();
if (ret == -ENOENT) /* never had the lock in the first place */
    do_something_appropriate();
else if (ret == -EINVAL) /* we raced with a grant */
    unlock_lock();
else /* lock successfully canceled */
    nlmsvc_unlink_block();

Is there a reason why that is not required? and indeed, is there a
reason why its safe to call nlmsvc_unlink_block() in the cases where the
lock isn't canceled first? I'm trying to work out how the underlying fs
can tell that a lock has gone away in those particular cases,

(Continue reading)

Rob Gardner | 1 Apr 2010 14:40
Picon
Favicon

Re: lockd and lock cancellation

Steven Whitehouse wrote:
> Hi,
>
> I'm trying to find my way around the lockd code and I'm currently a bit
> stumped by the code relating to lock cancellation. There is only one
> call to vfs_cancel_lock() in lockd/svclock.c and its return value isn't
> checked.
>
> It is used in combination with nlmsvc_unlink_block() which
> unconditionally calls posix_unblock_lock(). There are also other places
> where the code also calls nlmsvc_unlink_block() without first canceling
> the lock. The way in which vfs_cancel_lock() is used suggests that
> canceling a lock is a synchronous operation, and that it must succeed
> before returning.
>
> I'd have expected to see (pseudo code) something more like the
> following:
>
> ret = vfs_cancel_lock();
> if (ret == -ENOENT) /* never had the lock in the first place */
>     do_something_appropriate();
> else if (ret == -EINVAL) /* we raced with a grant */
>     unlock_lock();
> else /* lock successfully canceled */
>     nlmsvc_unlink_block();
>
> Is there a reason why that is not required? and indeed, is there a
> reason why its safe to call nlmsvc_unlink_block() in the cases where the
> lock isn't canceled first? I'm trying to work out how the underlying fs
> can tell that a lock has gone away in those particular cases,
(Continue reading)

Trond Myklebust | 1 Apr 2010 15:17
Picon
Picon

Re: why does each NFS mount have (at least) 2 rpc_clnt's ?

On Thu, 2010-04-01 at 08:46 -0400, Jeff Layton wrote: 
> I'm working on an issue in an older kernel where we see occasional
> panics when trying to refresh credentials. Here's the bug in case
> anyone is interested:
> 
>     https://bugzilla.redhat.com/show_bug.cgi?id=572870
> 
> ...I think I understand the problem well enough now. The problem is
> pretty complex, but the issue is that some operations are done using
> credentials from a stateowner associated with a nfs_client, but using
> the rpc_clnt in nfs_server->client. The two can have different
> authtypes if there are a mix of mounts with different authtypes to the
> same server. This problem seems to have been fixed in mainline with the
> introduction of the auth_generic code.
> 
> It leaves me wondering though...what exactly is the reason for having
> two rpc_clients per NFS mount? To clarify, I'm talking about these two,
> which seem to be somewhat redundant:
> 
> nfs_server->client
> nfs_server->nfs_client->cl_rpcclient
> 
> On mount, the nfs4_set_client calls nfs_get_client to search the list
> of nfs_client structs until it finds one that matches the address, port,
> etc of the NFS server. If one isn't found, the kernel creates one using
> whatever authtype was requested for the mount.
> 
> Later, nfs_init_server_rpcclient looks at the rpc_clnt in the
> nfs_client and copies it. If the auth pseudoflavor doesn't match
> however, it creates a new rpc_auth for it.
(Continue reading)

Rob Gardner | 1 Apr 2010 16:07
Picon
Favicon

Re: lockd and lock cancellation

Steven Whitehouse wrote:
> Hi,
>
> Thanks for the fast response...
>
> On Thu, 2010-04-01 at 13:40 +0100, Rob Gardner wrote:
>   
>> Steven Whitehouse wrote:
>>     
>>> Hi,
>>>
>>> I'm trying to find my way around the lockd code and I'm currently a bit
>>> stumped by the code relating to lock cancellation. There is only one
>>> call to vfs_cancel_lock() in lockd/svclock.c and its return value isn't
>>> checked.
>>>
>>> It is used in combination with nlmsvc_unlink_block() which
>>> unconditionally calls posix_unblock_lock(). There are also other places
>>> where the code also calls nlmsvc_unlink_block() without first canceling
>>> the lock. The way in which vfs_cancel_lock() is used suggests that
>>> canceling a lock is a synchronous operation, and that it must succeed
>>> before returning.
>>>
>>> I'd have expected to see (pseudo code) something more like the
>>> following:
>>>
>>> ret = vfs_cancel_lock();
>>> if (ret == -ENOENT) /* never had the lock in the first place */
>>>     do_something_appropriate();
>>> else if (ret == -EINVAL) /* we raced with a grant */
(Continue reading)

Trond Myklebust | 1 Apr 2010 16:07
Picon
Picon

Re: why does each NFS mount have (at least) 2 rpc_clnt's ?

On Thu, 2010-04-01 at 15:26 +0200, Ondrej Valousek wrote: 
> Hi Trond,
> 
> Does it have anything to do with the situation when we want an the same 
> filesystem to be mounted twice somewhere else with a different flags? 
> (i.e. RO/RW for example)
> If yes, I remember some discussion about it in the past (the above was 
> not possible).
> Thanks,

Yes. That's exactly what it is supposed to allow (and btw, mounting
filesystems both ro and rw should be possible now).

Cheers
  Trond

> Ondrej
> > Look again at nfs_init_server_rpcclient(). The pseudoflavour is not the
> > only thing that is changed. We also change the soft flag and the timeout
> > properties of the server->client.
> >
> > The point is that users sometimes want to specify per-mountpoint
> > transport properties, and so we try to give them that possibility, while
> > at the same time sharing sockets/rdma connections.
> >
> > Cheers
> >   Trond
> >
> > _______________________________________________
> > NFSv4 mailing list
(Continue reading)

Paul E. McKenney | 1 Apr 2010 16:39
Picon

Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]

On Thu, Apr 01, 2010 at 12:45:14PM +0100, David Howells wrote:
> Paul E. McKenney <paulmck@...> wrote:
> 
> > > I think it is incorrectly used.  Given that the rcu_dereference() in:
> > > 
> > > 	if (rcu_dereference(nfsi->delegation) != NULL) {
> > > 		spin_lock(&clp->cl_lock);
> > > 		delegation = nfs_detach_delegation_locked(nfsi, NULL);
> > > 		spin_unlock(&clp->cl_lock);
> > > 		if (delegation != NULL)
> > > 			nfs_do_return_delegation(inode, delegation, 0);
> > > 	}
> > 
> > And nfs_detach_delegation_locked() rechecks nfsi->delegation() under
> > the lock, so this is a legitimate use.
> > 
> > The pointer is not held constant, but any changes will be accounted
> > for and handled correctly.  So I would argue that the pointer value is
> > in fact protected by the recheck-under-lock algorithm used here.
> 
> A legitimate use of what?

A legitimate use of loading an RCU-protected pointer without
smp_read_barrier_depends().  However, I could imagine some situations
where the ACCESS_ONCE() semantics were required -- though in this
particular situation, I am having a hard time seeing how the compiler
could mess us up.  That said, my time on the C++ standards committee
has given me new respect for the perversity of compiler writers.

So you have objected to needless memory barriers.  How do you feel
(Continue reading)

J. Bruce Fields | 1 Apr 2010 17:56

Re: lockd and lock cancellation

On Thu, Apr 01, 2010 at 03:07:00PM +0100, Rob Gardner wrote:
> If the lock were actually granted, then unlocking in lieu of a cancel  
> can potentially leave a range unlocked that should be left locked. This  
> can happen in the case of a lock upgrade or a coalesce operation; For  
> instance, suppose the client holds a lock on bytes 0-100, then issues  
> another lock request for bytes 50-150, but sends a cancel just after the  
> lock is actually granted. If you now simply unlock 50-150, then the  
> client is left holding only 0-50, and has "lost" the lock on bytes  
> 51-100. In other words, the client will *believe* that he has 0-100  
> locked, but in reality, only 0-50 are locked.
>
> As for what to do in this situation... well it would be nice if the  
> filesystem treated the cancel request as an "undo" if the grant won the  
> race. But seriously, I think this is just one of the (many) flaws in the  
> protocol and we probably have to live with it. My personal feeling is  
> that it's safer to leave bytes locked rather than have a client believe  
> it holds a lock when it doesn't.

I wrote some code to address this sort of problem a long time ago, and
didn't submit it:

	http://git.linux-nfs.org/?p=bfields/linux-topics.git;a=shortlog;h=refs/heads/fair-queueing

The idea was to support correct cancelling by introducing a new type of
lock.  Let's call it a "provisional lock".  It behaves in every way like
a posix lock, *except* that it doesn't combine with (or downgrade)
existing locks.  This allows a provisional lock to be cancelled--by just
removing it from the lock list--or to be "upgraded" to a normal posix
lock.  (Note if you're looking at the code above, "provisional locks"
are identified with the FL_BLOCK flag.)
(Continue reading)

David Howells | 1 Apr 2010 16:46
Picon
Favicon

Re: [PATCH] NFS: Fix RCU warnings in nfs_inode_return_delegation_noreclaim() [ver #2]

Paul E. McKenney <paulmck@...> wrote:

> So you have objected to needless memory barriers.  How do you feel
> about possibly needless ACCESS_ONCE() calls?

That would work here since it shouldn't emit any excess instructions.

David
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Steven Whitehouse | 1 Apr 2010 15:13
Picon
Favicon

Re: lockd and lock cancellation

Hi,

Thanks for the fast response...

On Thu, 2010-04-01 at 13:40 +0100, Rob Gardner wrote:
> Steven Whitehouse wrote:
> > Hi,
> >
> > I'm trying to find my way around the lockd code and I'm currently a bit
> > stumped by the code relating to lock cancellation. There is only one
> > call to vfs_cancel_lock() in lockd/svclock.c and its return value isn't
> > checked.
> >
> > It is used in combination with nlmsvc_unlink_block() which
> > unconditionally calls posix_unblock_lock(). There are also other places
> > where the code also calls nlmsvc_unlink_block() without first canceling
> > the lock. The way in which vfs_cancel_lock() is used suggests that
> > canceling a lock is a synchronous operation, and that it must succeed
> > before returning.
> >
> > I'd have expected to see (pseudo code) something more like the
> > following:
> >
> > ret = vfs_cancel_lock();
> > if (ret == -ENOENT) /* never had the lock in the first place */
> >     do_something_appropriate();
> > else if (ret == -EINVAL) /* we raced with a grant */
> >     unlock_lock();
> > else /* lock successfully canceled */
> >     nlmsvc_unlink_block();
(Continue reading)


Gmane