John Goerzen | 2 Jul 15:53 2004

Starvation with find?

Hello,

I have JFS in a 2.6.4-based workstation.  My cron periodically runs
updatedb, which basically runs find to populate the database used by
locate.

When it runs, my computer gets *S L O W*.  I'm talking so slow that it
takes more than 30 seconds to start up mutt when it would normally take
only a fraction of a second.  The find also seems to take far longer
than it should.

If I kill the find process, things immediately return to normal.

CPU usage usually remains low but occasionally spends a couple of
seconds at 100%.  

Any thoughts?

-- John
John Goerzen | 2 Jul 16:04 2004

Re: Starvation with find?

On Fri, Jul 02, 2004 at 08:53:04AM -0500, John Goerzen wrote:
> If I kill the find process, things immediately return to normal.
> 
> CPU usage usually remains low but occasionally spends a couple of
> seconds at 100%.  
> 
> Any thoughts?

Incidentally, I'm having similar problems with amanda, and with it, I'm
seeing errors like this:

sendsize.20040628190003.debug:sendsize[21127]: time 3337.219: /bin/tar: ./src/kernel-source-2.6.6/arch/i386/boot/compressed/vmlinux.bin: Warning: Cannot seek to 0: Bad file descriptor

Strange, eh?
John Goerzen | 2 Jul 16:58 2004

Re: Starvation with find?

On Fri, Jul 02, 2004 at 03:54:57PM +0100, Antonio P. P. Almeida wrote:
> > CPU usage usually remains low but occasionally spends a couple of
> > seconds at 100%.  
> > 
> > Any thoughts?
> 
> This is just to say that when I installed 2.6.4 in my laptop it
> presented similar symptoms. IIRC, there was *heavy*, really *heavy*
> I/O activity, ditto for the CPU -- basically the machine was less than

Glad to know I am not alone :-)

> sluggish when doing the daily updatedb. That's when I decided to stay
> with 2.4 until the 2.6 problems -- or what seems to be problems -- get
> sorted out.

I should also add: I have not seen this problem in reiser, ext2, or
ext3.

-- John
Antonio P. P. Almeida | 2 Jul 16:54 2004

Re: Starvation with find?

From: John Goerzen <jgoerzen <at> complete.org>
Subject: [Jfs-discussion] Starvation with find?
Date: Fri, 2 Jul 2004 08:53:04 -0500

> Hello,
> 
> I have JFS in a 2.6.4-based workstation.  My cron periodically runs
> updatedb, which basically runs find to populate the database used by
> locate.
> 
> When it runs, my computer gets *S L O W*.  I'm talking so slow that it
> takes more than 30 seconds to start up mutt when it would normally take
> only a fraction of a second.  The find also seems to take far longer
> than it should.
> 
> If I kill the find process, things immediately return to normal.
> 
> CPU usage usually remains low but occasionally spends a couple of
> seconds at 100%.  
> 
> Any thoughts?

This is just to say that when I installed 2.6.4 in my laptop it
presented similar symptoms. IIRC, there was *heavy*, really *heavy*
I/O activity, ditto for the CPU -- basically the machine was less than
sluggish when doing the daily updatedb. That's when I decided to stay
with 2.4 until the 2.6 problems -- or what seems to be problems -- get
sorted out.

Regards,

Dave Kleikamp | 2 Jul 20:04 2004

Re: Starvation with find?

On Fri, 2004-07-02 at 09:58, John Goerzen wrote:
> On Fri, Jul 02, 2004 at 03:54:57PM +0100, Antonio P. P. Almeida wrote:
> > > CPU usage usually remains low but occasionally spends a couple of
> > > seconds at 100%.  
> > > 
> > > Any thoughts?
> > 
> > This is just to say that when I installed 2.6.4 in my laptop it
> > presented similar symptoms. IIRC, there was *heavy*, really *heavy*
> > I/O activity, ditto for the CPU -- basically the machine was less than
> 
> Glad to know I am not alone :-)
> 
> > sluggish when doing the daily updatedb. That's when I decided to stay
> > with 2.4 until the 2.6 problems -- or what seems to be problems -- get
> > sorted out.
> 
> I should also add: I have not seen this problem in reiser, ext2, or
> ext3.

updatedb is notorious for causing everything else to swap out of memory,
but I don't know why this would affect jfs more than any other file
system.  jfs does have a larger in-memory inode, but I would expect the
result to be that fewer jfs inodes would be cached.  I would think that
with updatedb running, you'd eventually push everything useful out of
memory anyway.

Anyway, I ran updatedb against my jfs volumes on 2.6.7 and 2.4.27-rc2,
both with and without noatime, and neither brought the system anywhere
near a crawl.  I don't know of any changes in jfs between 2.6.4 & 2.6.7

Dave Kleikamp | 2 Jul 20:16 2004

Re: Starvation with find?

On Fri, 2004-07-02 at 09:04, John Goerzen wrote:
> On Fri, Jul 02, 2004 at 08:53:04AM -0500, John Goerzen wrote:
> > If I kill the find process, things immediately return to normal.
> > 
> > CPU usage usually remains low but occasionally spends a couple of
> > seconds at 100%.  
> > 
> > Any thoughts?
> 
> Incidentally, I'm having similar problems with amanda, and with it, I'm
> seeing errors like this:
> 
> sendsize.20040628190003.debug:sendsize[21127]: time 3337.219: /bin/tar: ./src/kernel-source-2.6.6/arch/i386/boot/compressed/vmlinux.bin: Warning: Cannot seek to 0: Bad file descriptor
> 
> Strange, eh?

Yes, it is strange.  Is that file otherwise accessible?

I haven't responded to your earlier email about amanda.  I've been
trying to recall any similar problems I might have seen.  What 2.4
kernel are you running?
-- 
David Kleikamp
IBM Linux Technology Center
John Goerzen | 2 Jul 20:42 2004

Re: Starvation with find?

On Fri, Jul 02, 2004 at 01:16:51PM -0500, Dave Kleikamp wrote:
> > sendsize.20040628190003.debug:sendsize[21127]: time 3337.219: /bin/tar: ./src/kernel-source-2.6.6/arch/i386/boot/compressed/vmlinux.bin: Warning: Cannot seek to 0: Bad file descriptor
> > 
> > Strange, eh?
> 
> Yes, it is strange.  Is that file otherwise accessible?

Yes.  

> I haven't responded to your earlier email about amanda.  I've been
> trying to recall any similar problems I might have seen.  What 2.4
> kernel are you running?

Vanilla 2.4.26 plus vserver patch.

Why is it that updatedb would cause things to swap out?  It shouldn't be
using much RAM.  Do we have a bug in the VM system where it
over-aggressively maximizes cache size?

-- John
Dave Kleikamp | 2 Jul 20:59 2004

Re: Starvation with find?

On Fri, 2004-07-02 at 13:42, John Goerzen wrote:
> On Fri, Jul 02, 2004 at 01:16:51PM -0500, Dave Kleikamp wrote:
> > > sendsize.20040628190003.debug:sendsize[21127]: time 3337.219: /bin/tar: ./src/kernel-source-2.6.6/arch/i386/boot/compressed/vmlinux.bin: Warning: Cannot seek to 0: Bad file descriptor
> > > 
> > > Strange, eh?
> > 
> > Yes, it is strange.  Is that file otherwise accessible?
> 
> Yes.

Then I really don't understand it.

> Vanilla 2.4.26 plus vserver patch.

That's pretty current.  :^)

> Why is it that updatedb would cause things to swap out?  It shouldn't be
> using much RAM.  Do we have a bug in the VM system where it
> over-aggressively maximizes cache size?

I've observed that jfs's inode cache grows really big.  Look at the
jfs_ip entry in /proc/slabinfo.
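
A rough sketch of how the jfs_ip line could be pulled out of
/proc/slabinfo and turned into an approximate memory figure.  It assumes
only that the first three numeric columns are active objects, total
objects, and object size; the function names slab_usage and slab_bytes
are made up for illustration and are not part of any tool.

```python
def slab_usage(slabinfo_text, cache_name="jfs_ip"):
    """Return (active_objs, num_objs, objsize) for one slab cache, or None.

    Assumes the first three numeric columns of a slabinfo line are
    active objects, total objects, and object size in bytes.
    """
    for line in slabinfo_text.splitlines():
        fields = line.split()
        # Skip the version banner, the '#' column header, and other caches.
        if len(fields) < 4 or fields[0] != cache_name:
            continue
        active_objs, num_objs, objsize = (int(f) for f in fields[1:4])
        return active_objs, num_objs, objsize
    return None


def slab_bytes(usage):
    """Approximate bytes held by the cache's objects: num_objs * objsize."""
    _, num_objs, objsize = usage
    return num_objs * objsize
```

To use it, pass in the contents of /proc/slabinfo (reading that file may
require root) before and after an updatedb run and compare the results.
The figure ignores per-slab overhead, so treat it as a lower bound.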

The behavior is tunable with /proc/sys/vm/swappiness.  I'm not a vm
expert, but I found this, which explains it better than I could:
http://kerneltrap.org/node/view/3000
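
As a minimal sketch, the current setting can be read like this.  The
helper names are hypothetical.  Lower values make the kernel prefer
shrinking caches over swapping out application memory; higher values
make it swap more readily.

```python
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def parse_swappiness(text):
    """Parse the contents of the swappiness file into an integer."""
    return int(text.strip())


def read_swappiness(path=SWAPPINESS_PATH):
    """Read the current swappiness value from the given path."""
    with open(path) as f:
        return parse_swappiness(f.read())
```

Changing the value means writing a new number to the same file as root,
so it is worth recording the original value first.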

Markus Raab | 3 Jul 07:57 2004

Re: Starvation with find?

On Friday, 2 July 2004 15:53, John Goerzen wrote:

> CPU usage usually remains low but occasionally spends a couple of
> seconds at 100%.

What type of CPU usage? I had problems where io-wait sat at 100% and I saw
the behavior you described. io-wait is a new diagnostic CPU value; you may
have to update top (it is in procps) to see it.
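
If top is too old to show it, a rough sketch like the following can
estimate io-wait directly from two readings of /proc/stat.  It assumes
the aggregate "cpu" line lists user, nice, system, idle, and iowait in
that order (the layout used by 2.6 kernels); the function names are
made up for illustration.

```python
def cpu_times(stat_text):
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        # The aggregate line is exactly 'cpu'; per-CPU lines are 'cpu0', ...
        if fields and fields[0] == "cpu":
            return [int(f) for f in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before_text, after_text):
    """Percentage of CPU time spent in iowait between two samples."""
    before = cpu_times(before_text)
    after = cpu_times(after_text)
    # Use at most the first eight fields; later guest fields are
    # already counted in user time.
    deltas = [b - a for a, b in zip(before, after)][:8]
    total = sum(deltas)
    if total == 0 or len(deltas) < 5:
        return 0.0  # no elapsed time, or a kernel without an iowait field
    return 100.0 * deltas[4] / total
```

Read /proc/stat once, wait a second or so while the find is running,
read it again, and pass both strings to iowait_percent.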

Markus
Henrik Hellerstedt | 5 Jul 14:08 2004

trashed jfs system


I have a "small" problem with my jfs partition. The box is an HP nc-8000
laptop with a plain filesystem layout, only /boot and /.

It started with me running a find (which I doubt is the reason).
The box froze and some error messages appeared in dmesg. At the time
I didn't think the error was that big, so I didn't save that dmesg :(

I rebooted the box into single user and tried to run jfs fsck on
the partition. It failed, complaining that it was missing libuuid and
libacl. I checked and they existed in /lib, but I was not allowed to
access them. So I gave this up.

I installed RIP into the swapspace (currently converted to fat) 
and had great hope it would be able to fix my broken jfs partition,
but its fsck also fails.

All info I could think of is available at http://ecure.se/~henrik/jfs
If more info is needed I will gladly supply it, just tell me what's
needed.

I also made a "long" S.M.A.R.T. check to rule out any problems with
the hardware. smartctl -t long /dev/hda reports no errors at all.

Possible reason for the problem? Will it happen again?
Can I solve it? Should I take this to some other list?

Any help is welcome.

TIA / Henrik Hellerstedt <henrik <at> anka.org>