Stefan Orth | 1 Dec 2003 07:10

Stefan Orth/Germany/IBM is out of the office.

I will be out of the office starting November 30, 2003 and will not return
until December 8, 2003.

I will respond to your message when I return.

-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive?  Does it
help you create better code?  SHARE THE LOVE, and help us help
YOU!  Click Here: http://sourceforge.net/donate/
_______________________________________________
NFS maillist  -  NFS <at> lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs

Bogdan Costescu | 1 Dec 2003 11:58

Re: NFS server not responding

On Sun, 30 Nov 2003, seth vidal wrote:

> I bet it's related to that problem.

Nope... The NFS server for this cluster is a single AMD Athlon with 256MiB
RAM. I did not see kscand or anything equivalent taking as much CPU as
described in the bug reports. When the load is high, all the top users are
nfsd threads.
One of the reports, however, reminded me of the readahead discussion. I did
some tests some time ago with a 2.4.20-based kernel and did not see much
difference, but I will try it again now and write back if I see any
advantage.

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu <at> IWR.Uni-Heidelberg.De


Bogdan Costescu | 1 Dec 2003 12:38

Re: Re: [PATCH] SGI 905314 (1/2): make NFSSVC_MAXBLKSIZE depend on PAGE_SIZE

On Mon, 1 Dec 2003, Greg Banks wrote:

> It's not that simple; the ia64 port can be configured for 64K pages,
> which would result in nfsd reporting 128K for wtmax on UDP.

But allocating only 32K in this case means half a page... A waste. Then
maybe TCP should get its own setting?

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu <at> IWR.Uni-Heidelberg.De


Greg Banks | 2 Dec 2003 01:26

Re: Re: [PATCH] SGI 905314 (1/2): make NFSSVC_MAXBLKSIZE depend on PAGE_SIZE

Bogdan Costescu wrote:
> 
> On Mon, 1 Dec 2003, Greg Banks wrote:
> 
> > It's not that simple; the ia64 port can be configured for 64K pages,
> > which would result in nfsd reporting 128K for wtmax on UDP.
> 
> But allocating only 32K in this case means half a page... A waste.

The 2.4 code now does a kmalloc(NFSSVC_MAXBLKSIZE+1024), which
ends up allocating nearly twice as many pages as we need, so it's
already that wasteful now.
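
A rough illustration of that arithmetic, as a sketch only. It assumes
NFSSVC_MAXBLKSIZE is 32K (the figure mentioned earlier in this thread), a
4K page size, and that kmalloc rounds each request up to the next
power-of-two size class. None of these values is taken from the patch
itself, and the helper name is made up for the example.

```python
def kmalloc_bucket(nbytes: int) -> int:
    """Smallest power of two >= nbytes (assumed kmalloc size-class rounding)."""
    size = 1
    while size < nbytes:
        size *= 2
    return size


PAGE_SIZE = 4096                  # assumption: 4K pages
NFSSVC_MAXBLKSIZE = 32 * 1024     # assumption: the 32K figure from this thread

request = NFSSVC_MAXBLKSIZE + 1024           # 33792 bytes asked for
bucket = kmalloc_bucket(request)             # 65536 bytes handed out
pages_needed = -(-request // PAGE_SIZE)      # ceiling division: 9 pages
pages_allocated = bucket // PAGE_SIZE        # 16 pages

print(f"requested {request} bytes, size class {bucket} bytes")
print(f"pages needed {pages_needed}, pages allocated {pages_allocated}")
```

Under those assumptions the extra 1024 bytes push the request just past a
32K size class, so 16 pages are handed out where 9 would do.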

Besides, machines with 64K pages are going to be wasting a lot
more RAM elsewhere.

> Then
> maybe TCP should get its own setting ?

I plan on later submitting a patch against 2.6 to do that.

Greg.

-- 
Greg Banks, R&D Software Engineer, SGI Australian Software Group.
I don't speak for SGI.


Shivaji Navale | 2 Dec 2003 12:26

Mailbox corruption on The NFS server

Hi,

We have a peculiar problem with users' mailboxes in
/var/spool/mail/username.
The mailboxes get corrupted: the first 20-26 lines of a mailbox get
DELETED.

We are using 2.4.20-18.8.um.1 kernel on the (LVS Director) which exports
the mail partition to 30 NFS/NIS clients.

I am not sure if this is the right place
to ask this question, but this problem has been bugging us for a long time.

I have Googled extensively, but couldn't work out a proper solution.

Could anybody suggest why this is happening and how it could be overcome?
Thanks a lot.

The portmapper gives the following results:

    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  32768  status
    100024    1   tcp  32768  status
    100004    2   udp    953  ypserv
    100004    1   udp    953  ypserv
    100004    2   tcp    956  ypserv
    100004    1   tcp    956  ypserv
    391002    2   tcp  32769  sgi_fam
    100003    2   udp   2049  nfs

Douglas Furlong | 2 Dec 2003 15:37

Re: NFS server not responding

On Fri, 2003-11-28 at 16:56, Trond Myklebust wrote:
> >>>>> " " == Bogdan Costescu <bogdan.costescu <at> iwr.uni-heidelberg.de> writes:
> 
> 
>      > I also see something like 0.8-1% retransmissions and these
>      > messages on newly installed Fedora Core 1 on some cluster
>      > nodes, using default r/wsize (8192). As I'm using root-NFS, the
>      > node is quite useless when this situation happens. I'm sure
> 
> Huh? Why should a 1% retransmission make a noticeable difference? Be
> realistic: we're talking about a delay of 100ms on 1/100 requests...

If this were the case then I would agree that there is no problem at all,
but I am noticing delays of three or four seconds when opening a new
mail in Evolution, or downloading new mail from the IMAP server (which
gets stored in the user's home directory on the NFS server). When typing
a mail I find the text freezes for several seconds, which is
fine for me (I touch type with accuracy), but other people who are less
secure working on a PC (read: most people I deal with) find this sort
of behaviour unacceptable (and I agree).

I have found that all of these errors coincide with the "NFS server not
responding" error messages. Before making the changes to the retrans
values, messages were appearing as "blank" in Evolution: the
initial download from the IMAP server would fail because it could not
write to disk, but Evolution would think that it had succeeded and would
just show empty emails (exceedingly annoying).
Now I am not receiving any error messages, just moments when applications
"freeze"; the rest of the system is fine, and I just have to give it a
few seconds and all is back to normal.

Trond Myklebust | 2 Dec 2003 16:37

Re: NFS server not responding

>>>>> " " == Douglas Furlong <douglas.furlong <at> firebox.com> writes:

     > If this was the case then I would agree that there is no
     > problem at all, but I am noticing delays of three or four
     > seconds when opening up a new mail in Evolution, or downloading
     > new mail off of the IMAP server (which get stored in the user's
     > home directory on the NFS server). When typing in to a mail I
     > will find the text freezes for several seconds, which is fine
     > for me (touch type with accuracy) but other people that are
     > less secure working on PC (read most people I deal with), they
     > find this sort of behaviour unacceptable (which I agree with).

Nobody on this list is directly responsible for the Fedora Core 1
kernel, so whining about what is or isn't acceptable in it won't help.

I have no problems on *any* of the machines in the test-rigs I have at
my disposal when using a standard 2.4.23 kernel.  For the record,
those few that I have used with the Fedora kernel have been fine too
(though I haven't made any detailed tests of that).

     > I have found that all of these errors coincide with the NFS
     > server not responding error messages. Before making the changes

That's no surprise, but a <1-2% retransmission frequency
_DOES_NOT_SUFFICE_ to explain an "NFS server not responding" message. If
those retransmissions are randomly distributed (as they should
normally be) then we're talking unnoticeable delays.
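
As a back-of-the-envelope check of the randomly distributed case, here is
a minimal sketch using the figures quoted in this thread: a delay of
roughly 100ms per retransmitted request and a retransmission rate of
1-2%. The function name is made up for the example, and the model (each
retransmitted request costs one fixed delay, all others cost nothing
extra) is a simplifying assumption.

```python
def mean_added_delay_ms(retrans_fraction: float, delay_ms: float) -> float:
    """Average extra latency per request when a fraction of requests
    each incur one fixed retransmission delay (simplified model)."""
    return retrans_fraction * delay_ms


DELAY_MS = 100.0  # per-retransmission delay quoted in the thread

for fraction in (0.01, 0.02):
    extra = mean_added_delay_ms(fraction, DELAY_MS)
    print(f"{fraction:.0%} retransmissions: about {extra:.1f} ms extra per request on average")
```

Averaged over all requests this comes to a millisecond or two, far from
the multi-second freezes being reported.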

If, OTOH, the retransmissions are all occurring at once, then that
might explain it ('cos retransmissions follow an exponential rule

Juri Haberland | 2 Dec 2003 19:51

Re: Mailbox corruption on The NFS server

Shivaji Navale <shivaji <at> ee.iitb.ac.in> wrote:
> Hi,

Hi,

Please don't start a new topic/thread by just replying to another mail.
Thanks.

> We have this peculiar problem for the Mailboxes of users
> /var/spool/mail/username.
> The mailboxes get corrupted asto the first 20-26 lines of mailbox get
> DELETED.
> 
> We are using 2.4.20-18.8.um.1 kernel on the (LVS Director) which exports
> the mail partition to 30 NFS/NIS clients.

It's considered a bad idea to put mailboxes on an NFS share as there
might be locking issues if two applications simultaneously access the same
mailbox. Either use the maildir format or don't export your mailboxes
via NFS.

Regards,
Juri

-- 
Juri Haberland  <juri <at> koschikode.com> 


Trond Myklebust | 2 Dec 2003 20:35

Re: Mailbox corruption on The NFS server

>>>>> " " == Juri Haberland <list-linux.nfs <at> spoiled.org> writes:

     > It's considered a bad idea to put mailboxes on a NFS share as
     > there might be locking issues if two applications simultaneously
     > access the same mailbox. Either use the maildir format or don't
     > export your mailboxes via NFS.

Sort of. It can be made to work *provided* that you can guarantee that
your mail programs all agree to support the same file locking scheme.
Currently that means they must choose one (or both) of the following
schemes:

  - fcntl() (a.k.a. POSIX, a.k.a. lockf()) locking

  - dotlocking (a.k.a. creating a lock file using something like 'ln
    mailbox .mailbox.locked')

Note that the BSD flock() and use of O_EXCL in a dotlocking scheme are
not considered to be reliable within Linux NFS.
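
The two schemes above could be sketched roughly as follows. This is an
illustration only, not code from any particular mail program: the helper
names are made up, the lock file name simply follows the
'ln mailbox .mailbox.locked' example, and a real mail program would also
need retries, timeouts and stale-lock handling.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def fcntl_locked(path):
    """Hold an exclusive fcntl()/POSIX lock on the mailbox for the block."""
    with open(path, "a+b") as f:
        fcntl.lockf(f, fcntl.LOCK_EX)      # blocks until the lock is granted
        try:
            yield f
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


@contextmanager
def dotlocked(path):
    """Hold a dotlock made with link(), i.e. 'ln mailbox .mailbox.locked'.

    link() either creates the new name or fails because it already
    exists, so no O_EXCL is involved.
    """
    directory, name = os.path.split(path)
    lock = os.path.join(directory, "." + name + ".locked")
    os.link(path, lock)                    # FileExistsError if already locked
    try:
        yield
    finally:
        os.unlink(lock)


# Usage (hypothetical path):
#   with dotlocked("/var/spool/mail/username"):
#       with fcntl_locked("/var/spool/mail/username") as mbox:
#           mbox.write(b"...")
```

Taking both locks, as in the usage comment, is one way to cooperate with
programs that only know one of the two schemes; every program touching
the mailbox still has to follow the same convention.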

Cheers,
  Trond


Philippe Troin | 2 Dec 2003 20:56

2.4.23: Killed process on NFS client can result in lost lock on server

The problem described in the enclosed mail still occurs in 2.4.23. If
anybody cares.

Applying the enclosed patch from Trond makes the problem less
frequent, but it still occurs.

Phil.

From: Philippe Troin <phil <at> fifi.org>
Subject: [NFS] Killed process on NFS client can result in lost lock on server
Date: 2003-09-30 20:06:23 GMT
I've noticed this first with bogofilter, and was able to reproduce the
problem with the enclosed test program.

Setup: kernel 2.4.22 and nfs-utils 1.0.5

A (nfs) client mounts a file system from the (nfs) server with these
options (from /proc/mounts):

server:/fs /fs nfs rw,nodev,v3,rsize=8192,wsize=8192,hard,intr,udp,lock,addr=server

If a process running on the (nfs) client is killed by a signal while
holding a lock on a (nfs) file, the server might not relinquish the