Frank Steiner | 2 May 08:24 2005

Re: Stale File handles keep coming back

Trond Myklebust wrote

> On Sat, 2005-04-30 at 15:15 +0200, Frank Steiner wrote:
> 
>>reiserfs v3 on both servers (PC and pSeries).
> 
> Is the superblock in reiserfs 3.6 format? What
does /proc/fs/reiserfs/<partition>/version say?

Hmm, there is no /proc/fs/reiserfs on my system (?), but

debugreiserfs  /dev/mapper/exportraid-home  (yes, LVM)

tells me:

Reiserfs super block in block 16 on 0xfc00 of format 3.6 with standard journal

So I guess that's ok. Also note that the problems were introduced only
after the latest kernel update, while the filesystem was created a year
ago and had been running stably since that time!

I'm currently testing the patch against the access/getattr flooding that
you sent here, because a colleague who is running a few computer pools
observes a correlation between high NFS load and the stale NFS handles.
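
To pin that correlation down, one crude approach is to log the server-side
op counts over time and line the spikes up with the moments the stale
handles appear. A minimal sketch (log path and interval are arbitrary):

  # log NFS server op counts once a minute alongside a timestamp
  while true; do
      date >> /var/log/nfsload.log
      nfsstat -s >> /var/log/nfsload.log
      sleep 60
  done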

cu,
Frank

-- 
Dipl.-Inform. Frank Steiner   Web:  http://www.bio.ifi.lmu.de/~steiner/

David Cureton | 2 May 09:49 2005

The missing link -

Hi,
    I am using a SuSE 9.2 system as an NFS server and am seeing some
strange results. The server holds our home directories and exports a
number of disks to all the other workstations in the group.  The setup
is somewhat unusual in that the first NFS mount, a directory called
"server", contains all the mount points for the subsequent mounting of
the other exported partitions.

e.g.  /server, exported from the file server, contains

drwxr-xr-x   4 root root 288 2005-04-01 15:32 .
lrwxrwxrwx   1 root root  27 2005-03-24 16:25 ./Shared ->
/server/volumes/vol0/Shared
lrwxrwxrwx   1 root root  26 2005-03-24 10:55 ./Staff ->
/server/volumes/vol0/Staff
lrwxrwxrwx   1 root root  27 2005-03-24 10:52 ./users ->
/server/volumes/vol0/users/
lrwxrwxrwx   1 root root  26 2005-03-24 10:51 ./usr1 ->
/server/volumes/vol0/usr1/
lrwxrwxrwx   1 root root  26 2005-03-24 10:51 ./usr2 ->
/server/volumes/vol0/usr2/
lrwxrwxrwx   1 root root  25 2005-03-24 10:53 ./usr3 ->
/server/volumes/vol0/usr3
lrwxrwxrwx   1 root root  25 2005-03-24 10:54 ./usr4 ->
/server/volumes/vol0/usr4
lrwxrwxrwx   1 root root  25 2005-03-24 10:54 ./usr5 ->
/server/volumes/vol0/usr5
drwxr-xr-x   5 root root 120 2005-03-29 13:52 ./volumes
drwxr-xr-x  12 root root 272 2005-04-01 16:00 ./volumes/vol0
drwxr-xr-x   4 root root  80 2005-03-23 17:53 ./volumes/vol1
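
A layout like this is typically driven by server exports plus client-side
mounts roughly along the following lines (the hostname "fileserver" and the
export options are illustrative here, not taken from the actual config):

  # server-side /etc/exports (sketch)
  /server                *(rw,sync)
  /server/volumes/vol0   *(rw,sync)
  /server/volumes/vol1   *(rw,sync)

  # client-side: mount /server first, then the volumes its symlinks point into
  mount -t nfs fileserver:/server               /server
  mount -t nfs fileserver:/server/volumes/vol0  /server/volumes/vol0
  mount -t nfs fileserver:/server/volumes/vol1  /server/volumes/vol1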

김경표 | 2 May 12:49 2005

The actimeo option seems not to work

Hello list,
I have some questions about how the Linux NFS client cache works.

I ran two tests.
One is a Solaris 9 client with a Linux server.
The other is a Linux client with a Linux server.
 
The Linux server and client kernel version is 2.4.27.
The nfs-utils version is 1.0.4.
 
[ mount option list ]
The Linux server export options are rw,async.
One Linux client mount option is just actimeo=0.
The other Linux client mount option is actimeo=100.
The Solaris 9 mount options are soft,xattr,rw.
 
[ linux client setting ]
192.168.1.212:/vol1 /mnt1 nfs rw,v3,rsize=8192,wsize=8192,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,udp,lock,addr=192.168.1.212 0 0
192.168.1.212:/vol1 /mnt nfs rw,v3,rsize=8192,wsize=8192,acregmin=100,acregmax=100,acdirmin=100,acdirmax=100,hard,udp,lock,addr=192.168.1.212 0 0
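
For reference, those two entries correspond roughly to mount commands like
the ones below; actimeo=N is simply shorthand for setting acregmin, acregmax,
acdirmin and acdirmax all to N, which is why the individual ac* values show
up above (the exact commands are a sketch, not copied from the client):

  mount -t nfs -o rw,nfsvers=3,rsize=8192,wsize=8192,udp,hard,actimeo=0   192.168.1.212:/vol1 /mnt1
  mount -t nfs -o rw,nfsvers=3,rsize=8192,wsize=8192,udp,hard,actimeo=100 192.168.1.212:/vol1 /mnt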
 
[ Solaris 9 test ]
2 solaris 9 clients ----> linux server. 
 
After I touch a file on one Solaris client, I run ls -al on that file on the other Solaris client.
I found that the file attribute (mtime) was updated after a seemingly random delay (0 sec ~ 30 sec).
The Solaris 9 client is my customer's machine.
My customer wants the file attribute update interval to be 0 sec.
So I recommended checking the NFS mount options on Solaris 9 (actimeo, acregXXX, ...).
 
[ linux client test ]
 
The test procedure is the same as for Solaris 9,
but the result is not.
The Linux client shows up-to-date file attributes.
As far as I know, the Linux NFS client's default acregmin is 3 seconds.
So I changed the mount option to actimeo=100,
but the result is no different.
So I wonder whether the actimeo option has any effect at all.
I tried tcpdump.
 
when mount option is actimeo=0 ( in linux client)
linux-client> ls -al /mnt/test.test
linux-client>-rwxrwxrwx    1 root     root           12 Apr 29 16:59 /mnt/test.test
 
linux-server tcpdump output
-----------------------------------------------------------
18:51:52.561558 GRVC-C.4105878944 > 192.168.1.212.nfs: 120 access [|nfs] (DF)
18:51:52.561588 192.168.1.212.nfs > GRVC-C.4105878944: reply ok 120 access c 0000 (DF)
18:51:52.561683 GRVC-C.4122656160 > 192.168.1.212.nfs: 132 getattr [|nfs] (DF)
18:51:52.561759 192.168.1.212.nfs > GRVC-C.4122656160: reply ok 112 getattr REG 100777 ids 0/0 [|nfs] (DF)
-------------------------------------------------------------
 
when mount option is actimeo=100 ( in linux client)
linux-client> ls -al /mnt1/test.test
linux-client>-rwxrwxrwx    1 root     root           12 Apr 29 16:59 /mnt/test.test
 
linux-server tcpdump output
-------------------------------------------------------------
18:55:37.860627 GRVC-C.4189765024 > 192.168.1.212.nfs: 120 access [|nfs] (DF)
18:55:37.860694 192.168.1.212.nfs > GRVC-C.4189765024: reply ok 120 access c 0000 (DF)
18:55:37.860874 GRVC-C.4206542240 > 192.168.1.212.nfs: 116 getattr [|nfs] (DF)
18:55:37.860932 192.168.1.212.nfs > GRVC-C.4206542240: reply ok 112 getattr DIR 40777 ids 99/99 [|nfs] (DF)
18:55:37.861123 GRVC-C.4223319456 > 192.168.1.212.nfs: 132 getattr [|nfs] (DF)
18:55:37.861147 192.168.1.212.nfs > GRVC-C.4223319456: reply ok 112 getattr REG 100777 ids 0/0 [|nfs] (DF)
18:55:37.861373 GRVC-C.4240096672 > 192.168.1.212.nfs: 132 getattr [|nfs] (DF)
18:55:37.861387 192.168.1.212.nfs > GRVC-C.4240096672: reply ok 112 getattr REG 100777 ids 0/0 [|nfs] (DF)
-------------------------------------------------------------
 
As you can see, two more getattr calls appear with actimeo=100.

The Linux NFS client code does seem to behave differently depending on the actimeo option,
but I can't verify the effect of the NFS client cache (in terms of file attribute caching).

Can anybody help me?
How can I verify the effect of the NFS client cache?
Sorry for my poor English.

Neil Horman | 2 May 14:37 2005

Re: The actimeo option seems not to work

On Mon, May 02, 2005 at 07:49:43PM +0900, 김경표 wrote:
> Hello list,
> I have some questions about how the Linux NFS client cache works.
> 
> I ran two tests.
> One is a Solaris 9 client with a Linux server.
> The other is a Linux client with a Linux server.
> 
> The Linux server and client kernel version is 2.4.27.
> The nfs-utils version is 1.0.4.
> 
> [ mount option list ]
> The Linux server export options are rw,async.
> One Linux client mount option is just actimeo=0.
> The other Linux client mount option is actimeo=100.
> The Solaris 9 mount options are soft,xattr,rw.
> 
> [ linux client setting ]
> 192.168.1.212:/vol1 /mnt1 nfs rw,v3,rsize=8192,wsize=8192,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,udp,lock,addr=192.168.1.212 0 0
> 192.168.1.212:/vol1 /mnt nfs rw,v3,rsize=8192,wsize=8192,acregmin=100,acregmax=100,acdirmin=100,acdirmax=100,hard,udp,lock,addr=192.168.1.212 0 0
> 
> [ Solaris 9 test ]
> 2 solaris 9 clients ----> linux server. 
> 
> After I touch a file on one Solaris client, I run ls -al on that file on the other Solaris client.
> I found that the file attribute (mtime) was updated after a seemingly random delay (0 sec ~ 30 sec).
> The Solaris 9 client is my customer's machine.
> My customer wants the file attribute update interval to be 0 sec.
> So I recommended checking the NFS mount options on Solaris 9 (actimeo, acregXXX, ...).
> 
> [ linux client test ]
> 
> The test procedure is the same as for Solaris 9,
> but the result is not.
> The Linux client shows up-to-date file attributes.
> As far as I know, the Linux NFS client's default acregmin is 3 seconds.
> So I changed the mount option to actimeo=100,
> but the result is no different.
> So I wonder whether the actimeo option has any effect at all.
> I tried tcpdump.
> 
You're probably seeing the close-to-open cache consistency mechanism at work.
Every time a file is closed on a Linux system (with no outstanding open fds,
IIRC), a setattr transaction is sent to the server to update the file metadata.  On
every file open (if there are no already-open fds, IIRC), a getattr transaction
is sent to the server to get the latest file metadata from the server.  This is
probably why it seems that the attribute cache timeouts aren't working as you
would expect.  To get a clearer view of how they operate, mount the Linux clients
using the nocto NFS mount option.  Of course, I wouldn't recommend doing this,
since close-to-open provides better consistency between NFS nodes, but it will
make the ac* options work a little more like you might expect.
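
A rough way to see the difference for yourself (the server, path and all
options except nocto below are just taken from your example):

  umount /mnt
  mount -t nfs -o rw,nfsvers=3,udp,hard,nocto,actimeo=100 192.168.1.212:/vol1 /mnt
  nfsstat -c            # note the current getattr count
  ls -al /mnt/test.test
  ls -al /mnt/test.test
  nfsstat -c            # with nocto, repeated lookups inside the actimeo
                        # window should barely move the getattr counter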

Regards
Neil

> when mount option is actimeo=0 ( in linux client)
> linux-client> ls -al /mnt/test.test 
> linux-client>-rwxrwxrwx    1 root     root           12 Apr 29 16:59 /mnt/test.test
> 
> linux-server tcpdump output
> -----------------------------------------------------------
> 18:51:52.561558 GRVC-C.4105878944 > 192.168.1.212.nfs: 120 access [|nfs] (DF)
> 18:51:52.561588 192.168.1.212.nfs > GRVC-C.4105878944: reply ok 120 access c 0000 (DF)
> 18:51:52.561683 GRVC-C.4122656160 > 192.168.1.212.nfs: 132 getattr [|nfs] (DF)
> 18:51:52.561759 192.168.1.212.nfs > GRVC-C.4122656160: reply ok 112 getattr REG 100777 ids 0/0 [|nfs] (DF)
> -------------------------------------------------------------
> 
> when mount option is actimeo=100 ( in linux client)
> linux-client> ls -al /mnt1/test.test 
> linux-client>-rwxrwxrwx    1 root     root           12 Apr 29 16:59 /mnt/test.test
> 
> linux-server tcpdump output
> -------------------------------------------------------------
> 18:55:37.860627 GRVC-C.4189765024 > 192.168.1.212.nfs: 120 access [|nfs] (DF)
> 18:55:37.860694 192.168.1.212.nfs > GRVC-C.4189765024: reply ok 120 access c 0000 (DF)
> 18:55:37.860874 GRVC-C.4206542240 > 192.168.1.212.nfs: 116 getattr [|nfs] (DF)
> 18:55:37.860932 192.168.1.212.nfs > GRVC-C.4206542240: reply ok 112 getattr DIR 40777 ids 99/99 [|nfs] (DF)
> 18:55:37.861123 GRVC-C.4223319456 > 192.168.1.212.nfs: 132 getattr [|nfs] (DF)
> 18:55:37.861147 192.168.1.212.nfs > GRVC-C.4223319456: reply ok 112 getattr REG 100777 ids 0/0 [|nfs] (DF)
> 18:55:37.861373 GRVC-C.4240096672 > 192.168.1.212.nfs: 132 getattr [|nfs] (DF)
> 18:55:37.861387 192.168.1.212.nfs > GRVC-C.4240096672: reply ok 112 getattr REG 100777 ids 0/0 [|nfs] (DF)
> -------------------------------------------------------------
> 
> As you can see, two more getattr calls appear with actimeo=100.
> 
> The Linux NFS client code does seem to behave differently depending on the actimeo option,
> but I can't verify the effect of the NFS client cache (in terms of file attribute caching).
> 
> Can anybody help me?
> How can I verify the effect of the NFS client cache?
> Sorry for my poor English.
-- 
/***************************************************
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 *nhorman <at> redhat.com
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***************************************************/

Trond Myklebust | 2 May 14:50 2005

Re: The missing link -

On Mon 02.05.2005 at 17:49 (+1000), David Cureton wrote:

> However on occasion we have noticed that the /server/Staff link just
> disappears for some SuSE9.2 clients and the user no longer has access to
> there home directory. All other clients still have the link available
> and they are fine.

What kind of filesystem are you exporting?

When this behaviour happens, what does the client report when you do an
'ls -l' on /server/Staff?

Cheers,
  Trond
-- 
Trond Myklebust <trond.myklebust <at> fys.uio.no>

Trond Myklebust | 2 May 15:02 2005

Re: The actimeo option seems not to work

On Mon 02.05.2005 at 19:49 (+0900), 김경표 wrote:
>  
> when mount option is actimeo=0 ( in linux client)
> linux-client> ls -al /mnt/test.test 
> linux-client>-rwxrwxrwx    1 root     root           12 Apr 29
> 16:59 /mnt/test.test
>  
> linux-server tcpdump output
> -----------------------------------------------------------
> 18:51:52.561558 GRVC-C.4105878944 > 192.168.1.212.nfs: 120 access [|
> nfs] (DF)
> 18:51:52.561588 192.168.1.212.nfs > GRVC-C.4105878944: reply ok 120
> access c 0000 (DF)
> 18:51:52.561683 GRVC-C.4122656160 > 192.168.1.212.nfs: 132 getattr [|
> nfs] (DF)
> 18:51:52.561759 192.168.1.212.nfs > GRVC-C.4122656160: reply ok 112
> getattr REG 100777 ids 0/0 [|nfs] (DF)
> -------------------------------------------------------------

ACCESS caching in the 2.4 kernels was very experimental (in fact the
ACCESS patch was never applied to the mainline kernel), which could explain
why this is happening.

Cheers,
  Trond

Steven A. Falco | 2 May 19:40 2005

FC3 server -> Solaris client, directories come and go

I'm having a problem with directories "disappearing" using a Fedora 3 NFS server and a Solaris 5.10 client (NFS v3).  When I first cd into the directory, an ls shows what I expect.  If I then wait an hour and redo the ls, I get:

    .: No such file or directory

If I then do a pwd, followed by another ls, the directory contents reappear.  For example:
solara$ ls
.: No such file or directory
solara$ pwd
/proj/videorunner/hw/vr_enc/cae/PLD/lbenc/src
solara$ ls
AUDX_NEW/                  mt48lc8m16a2_.vhd
FLYWHEEL_DEC/              outdata0.txt

I captured the NFS traffic with ethereal.  When I do the initial ls, the sequence is a series of LOOKUP calls (walking the pathname) followed by a READDIRPLUS call to obtain the file names.

When I later do the ls, an ACCESS call is issued, and that call returns ERR_NOENT.  This is the first ACCESS call that I see in the trace.  After the pwd, when the ls works, I instead see a series of LOOKUPs that again traverses the pathname, followed by LOOKUPs for all the filenames in the directory (so obviously the directory contents were cached on the Solaris machine).  So apparently, the ACCESS calls fail, but they are only issued by Solaris in certain circumstances, in this case when doing an ls in a window that has been idle for an hour.

Curiously, all the permissions in the directories of the pathname include world read/execute, so I don't understand why the ACCESS call would return ERR_NOENT.  Also, if there were a permissions problem, I would expect that to show up during the initial cd into the directory.

I have not found anything like this in the archives.  Has anyone else seen this, and if so, are there any fixes/workarounds?  I can post all or part of the ethereal traces if that would help.
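
For reference, a capture along these lines (tcpdump writing a file that
ethereal can open; the interface name is an assumption about my setup)
should reproduce the trace I'm looking at:

  tcpdump -i eth0 -s 0 -w nfs-trace.pcap host solara and port 2049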

    Steve Falco

Peter Åstrand | 2 May 19:49 2005
Picon

Re: FC3 server -> Solaris client, directories come and go

On Mon, 2 May 2005, Steven A. Falco wrote:

> I'm having a problem with directories "disappearing" using a Fedora 3 NFS 
> server and a solaris 5.10 client (NFS v3).  When I first cd into the 
> directory, an ls shows what I expect.  If I then wait an hour and redo the 
> ls, I get:

This bug is in Red Hat's Bugzilla: 
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=150759. This is a 
big problem for us as well.

-- 
Peter Åstrand		Chief Developer
Cendio			www.thinlinc.com
Teknikringen 3		www.cendio.se
583 30 Linköping        Phone: +46-13-21 46 00
