Andrew Morton | 4 Feb 01:08 2004

Fw: BUG in jfscode


Begin forwarded message:

Date: Tue, 3 Feb 2004 13:37:10 +0100
From: Tobias Bengtsson <tobbe <at> tobbe.nu>
To: linux-kernel <at> vger.kernel.org
Subject: BUG in jfscode

Hi!

I got this output from a 2.6.2-rc3 kernel compiled with the attached config.

cheers, Tobias

PS. Please cc me the replies to this thread as I'm not subscribed.

------------[ cut here ]------------
kernel BUG at fs/jfs/jfs_dmap.c:2686!
invalid operand: 0000 [#4]
CPU:    0
EIP:    0060:[<c02276d0>]    Not tainted
EFLAGS: 00210246
EIP is at dbBackSplit+0xe0/0x130
eax: 0000004b   ebx: 00000400   ecx: c0478e90   edx: 00200246
esi: 00000000   edi: ee587166   ebp: d264b90c   esp: d264b8e8
ds: 007b   es: 007b   ss: 0068
Process dpkg (pid: 22568, threadinfo=d264a000 task=ee623900)
Stack: c041d043 c041d031 00000a7e c041d400 0000000d 00000001 ee587001 0000000f 
       00000000 d264b960 c02274f0 ee587000 0000000f 0000000d 00000174 00000164 
       0000000f ee587000 ffffffff 09000000 efd1a904 0001e41a 00000000 00000000 

Dave Kleikamp | 5 Feb 22:22 2004

Re: BUG in jfscode

On Tue, 2004-02-03 at 06:37, Tobias Bengtsson wrote:
> Hi!

Hi, Sorry it's taken me so long to respond.

> kernel BUG at fs/jfs/jfs_dmap.c:2686!

I've seen a similar bug reported before, but it occurs only rarely.  I'm
not sure what the cause is.  My initial thought was that the block map
got corrupted, but after digging through the code I don't know if it's
that simple.

I would be interested if you can recreate the problem.  If so, I may be
able to put in some debug code to help determine where JFS is getting
confused.  Also, if it is block map corruption, running fsck -f against
the volume should fix that.

Thanks,
Shaggy
-- 
David Kleikamp
IBM Linux Technology Center

Dave Kleikamp | 11 Feb 17:35 2004

JFS default behavior (was: UTF-8 in file systems? xfs/extfs/etc.)

On Wed, 2004-02-11 at 00:39, Tim Connors wrote:
> I submitted a bug to the jfs people, because jfs incorrectly returns
> -EINVAL (this isn't even documented in man pages as a valid return
> from open()) from an open() on a filename with UTF-8 in it.
> 
> See http://www-124.ibm.com/developerworks/bugs/?func=detailbug&bug_id=3838&group_id=35
> and http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=229308
> 
> This was triggered just by upgrading the console-utils package in
> debian (the problem existed all along, except that when I first made
> the filesystem a jfs one, I reinstalled from backups, rather than
> reinstalling debian from scratch)

Yeah, JFS has poor default behavior based on CONFIG_NLS_DEFAULT.  I
attempted to explain why it works that way in the first bug listed above
if anyone is curious.
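
For anyone who wants to check whether a particular mount is affected, a small probe along the following lines might help. This is only a sketch and is not taken from either bug report: `probe_utf8_name` is a hypothetical helper, the sample file name is arbitrary, and it assumes the failure shows up as EINVAL from open(), as described in the quoted report.

```python
import errno
import os
import tempfile


def probe_utf8_name(directory, name="caf\u00e9.txt"):
    """Try to create a file whose name contains UTF-8 bytes.

    Returns None on success, or the errno reported by open() on failure.
    (Hypothetical helper, for illustration only.)
    """
    # Build the path as raw bytes so the name reaches the kernel as UTF-8
    # regardless of the Python locale settings.
    path = os.path.join(os.fsencode(directory), name.encode("utf-8"))
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    os.unlink(path)
    return None


if __name__ == "__main__":
    # Replace the temporary directory with one on the filesystem under test.
    with tempfile.TemporaryDirectory() as tmp:
        result = probe_utf8_name(tmp)
        if result is None:
            print("UTF-8 file name accepted")
        elif result == errno.EINVAL:
            print("open() returned EINVAL for a UTF-8 file name")
        else:
            print("open() failed with", errno.errorcode.get(result, result))
```

The probe removes the file it creates, so it should leave the directory as it found it.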

I think the right thing for JFS to do is to change the default behavior
to simply store the bytes as they are seen, and to only do charset
conversion when the iocharset mount option is explicitly set.  This may
impact some current users, but they will be able to get the old behavior
by setting iocharset to whatever CONFIG_NLS_DEFAULT is set to in the
running kernel.

I intend to make this change soon if there are no objections.
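
As a rough illustration of the workaround described above (keeping the old behavior by passing iocharset explicitly), the following sketch pulls CONFIG_NLS_DEFAULT out of kernel config text and builds the matching mount command. The helper names, the sample config fragment, the device and the mount point are all assumptions made for the example; where the running kernel's config actually lives depends on the distribution.

```python
import re


def nls_default(config_text):
    """Return the CONFIG_NLS_DEFAULT value from kernel config text, or None."""
    match = re.search(r'^CONFIG_NLS_DEFAULT="([^"]*)"', config_text, re.MULTILINE)
    return match.group(1) if match else None


def jfs_mount_command(device, mountpoint, charset):
    """Build a mount command line that sets iocharset explicitly."""
    return ["mount", "-t", "jfs", "-o", "iocharset=" + charset, device, mountpoint]


if __name__ == "__main__":
    # Illustrative config fragment; a real one would be read from the
    # running kernel's config file.
    sample = 'CONFIG_NLS=y\nCONFIG_NLS_DEFAULT="iso8859-1"\n'
    charset = nls_default(sample)
    # /dev/sdb1 and /home are placeholders, not values from this thread.
    print(" ".join(jfs_mount_command("/dev/sdb1", "/home", charset)))
```

The script only prints the command rather than running it, so it can be inspected before anything is mounted.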

Thanks,
Shaggy
-- 
David Kleikamp

John P Janosik | 19 Feb 16:44 2004

JFS bug?

We got a kernel error in /var/log/messages when cancelling an unmount of a
JFS filesystem with ctrl-c on kernel 2.4.24 + jfsutils 1.1.4 on Intel.  I
am wondering if we should not have cancelled the unmount.  It had been
hanging for a few minutes and there did not appear to be any IO.  Here is
what is in the log:

Feb 18 09:19:09 rchs93hd kernel: BUG at jfs_logmgr.c:1483 assert(list_empty(&log->synclist))
Feb 18 09:19:09 rchs93hd kernel: kernel BUG at jfs_logmgr.c:1483!
Feb 18 09:19:09 rchs93hd kernel: invalid operand: 0000
Feb 18 09:19:09 rchs93hd kernel: CPU:    0
Feb 18 09:19:09 rchs93hd kernel: EIP:    0010:[<f0908dca>]    Not tainted
Feb 18 09:19:09 rchs93hd kernel: EFLAGS: 00010282
Feb 18 09:19:09 rchs93hd kernel: eax: 0000003f   ebx: 00000320   ecx: 00000000   edx: 00000001
Feb 18 09:19:09 rchs93hd kernel: esi: eccd2000   edi: ef05fd18   ebp: ef05fc80   esp: eccd3eec
Feb 18 09:19:09 rchs93hd kernel: ds: 0018   es: 0018   ss: 0018
Feb 18 09:19:09 rchs93hd kernel: Process umount (pid: 8954, stackpage=eccd3000)
Feb 18 09:19:09 rchs93hd kernel: Stack: f09165ee f09165e1 000005cb f09166ba c1912248 ef88cca0 ef88cca0 ef88cca0
Feb 18 09:19:09 rchs93hd kernel:        eccd3f50 c015edc1 efffb470 ef88cca0 eefb4520 efc17d60 ef05fc80 eefd5400
Feb 18 09:19:09 rchs93hd kernel:        f08ed3f7 ef05fc80 00000002 00000000 eefb4120 eefd5400 efc17d60 f0918f60
Feb 18 09:19:09 rchs93hd kernel: Call Trace:    [<f09165ee>] [<f09165e1>] [<f09166ba>] [<c015edc1>] [<f08ed3f7>]

Output of ksymoops:

Chris Tusa | 19 Feb 17:16 2004

Re: JFS bug?

How did you determine that the system was not processing the umount?
You said it appeared to hang; did you use another virtual console to check the
system with iostat or top prior to pressing CTRL-C?

What size is the FS?

Were any files shared using NFS or SAMBA on the FS?

Did this FS store any user home dirs or other data that may have been in use?

This will help us troubleshoot somewhat. Also, it will assist us in our 
documentation for Shark Linux.

-- 
Chris Tusa

Linisys, Computing Evolution
ctusa <at> linisys.com
504.464.4610 x1
- Shark Linux Software
- Linux Hardware Products
- Consulting

On Thursday 19 February 2004 09:44, John P Janosik wrote:
> We got a kernel error in /var/log/messages when cancelling an unmount of a
> JFS filesystem with ctrl-c on kernel 2.4.24 + jfsutils 1.1.4 on Intel.  I
> am wondering if we should not have cancelled the unmount.  It had been
> hanging for a few minutes and there did not appear to be any IO.  Here is
> what is in the log:
>

John P Janosik | 19 Feb 17:55 2004

Re: JFS bug?


Yes, top was used in another virtual console.  The filesystem size is
284337856K.  This is the /home filesystem and contains home directories for
local admin users as well as home directories accessed only via Samba.
Before trying the unmount we stopped Samba, had the admin users log out or
killed their sessions, and then verified there were no open files with
lsof.

We ended up rebooting the box.  The log replayed successfully on the way
back up and the filesystem got mounted rw.  I guess I forgot to mention
that the unmount on the way down for reboot did not hang but complained
that the filesystem was already unmounted.

Thanks,

John

                                                                           
From: Chris Tusa <webmaster <at> sharklinux.com>
Date: 02/19/2004 10:16 AM
To: John P Janosik/Rochester/IBM <at> IBMUS, jfs-discussion <at> www-124.southbury.usf.ibm.com
cc:
Subject: Re: [Jfs-discussion] JFS bug?

How did you determine that the system was not processing the umount?

Dave Kleikamp | 19 Feb 18:09 2004

Re: JFS bug?

On Thu, 2004-02-19 at 09:44, John P Janosik wrote:
> We got a kernel error in /var/log/messages when cancelling an unmount of a
> JFS filesystem with ctrl-c on kernel 2.4.24 + jfsutils 1.1.4 on Intel.  I
> am wondering if we should not have cancelled the unmount.

I'm not sure you can cancel the unmount, since it's in a system call.
This failure will occur after a timeout as jfs waits for outstanding I/O
to complete.  Did it crash immediately when you hit ctrl-c, or might the
ctrl-c have had nothing to do with it?

> 
> We were trying to unmount the filesystem to replay the log and remount
> because we noticed that the filesystem had been remounted readonly the
> night before.  The filesystem was 99% full(~4Gig free out of ~280Gig) when
> were looking into this on 2/18, so maybe the filesystem filled up?  I don't
> see any indications of hardware problems on the raid array holding this
> filesystem.
> 
> Feb 17 19:32:50 rchs93hd kernel: ERROR: (device sd(8,17)): __get_metapage:
> mp->logical_size != size
> Feb 17 19:32:50 rchs93hd kernel: ERROR: (device sd(8,17)): remounting
> filesystem as read-only
> Feb 17 19:32:50 rchs93hd kernel:
> Feb 17 19:32:50 rchs93hd kernel: ERROR: (device sd(8,17)): __get_metapage:
> mp->logical_size != size
> Feb 17 19:32:50 rchs93hd last message repeated 73 times

This should never happen!  mp->logical_size and size should always be
4096.  The test is in there in case we ever add support for smaller
block sizes.  Something is very wrong here.

John P Janosik | 19 Feb 20:45 2004

Re: JFS bug?


jfs-discussion-admin <at> www-124.southbury.usf.ibm.com wrote on 02/19/2004
11:09:34 AM:

> On Thu, 2004-02-19 at 09:44, John P Janosik wrote:
> > We got a kernel error in /var/log/messages when cancelling an unmount of a
> > JFS filesystem with ctrl-c on kernel 2.4.24 + jfsutils 1.1.4 on Intel.  I
> > am wondering if we should not have cancelled the unmount.
>
> I'm not sure you can cancel the unmount since its in a system call.
> This failure will occur after a timeout as jfs waits for outstanding I/O
> to complete.  Did it crash immediately when you hit cntl-c, or may the
> cntl-c have had nothing to do with it?

I checked with the person actually at the console, and he says it happened
immediately, but I can believe it was a coincidence.

>
> --
> David Kleikamp
> IBM Linux Technology Center
>
> _______________________________________________
> Jfs-discussion mailing list
> Jfs-discussion <at> www-124.ibm.com
> http://www-124.ibm.com/developerworks/oss/mailman/listinfo/jfs-discussion

Sumit Narayan | 23 Feb 19:21 2004

(no subject)

Hi,

I am looking for a way to find out the JFS buffer cache size and how to
modify it. Could someone help me?

Thanks
Sumit

Peter Nelson | 2 Mar 05:46 2004

Desktop Filesystem Benchmarks in 2.6.3

I recently decided to reinstall my system and, at the same time, try a new
file system. While trying to decide which filesystem to use, I found a few
benchmarks, but they either don't compare all available filesystems, are too
synthetic (copying a source tree multiple times, or raw I/O), or are meant
for servers/databases (like Bonnie++). The two most filesystem-intensive
tasks I do regularly are `apt-get upgrade` (waiting for the packages to
extract and set themselves up) and messing around with the kernel, so I
benchmarked these. To make it more realistic, I installed ccache and did
two compiles: one to fill the cache and a second using the full cache.

The tests I timed (in order):
  * Debootstrap to install base Debian system
  * Extract the kernel source
  * Run `make all` using the defconfig and an empty ccache
  * Copy the entire new directory tree
  * Run `make clean`
  * Run `make all` again, this time using the filled ccache
  * Delete the entire directory tree

Here is a summary of the results based on what I am calling "dead" time,
calculated as `total time - user time`. As you can see in the full
results on my website, the user time is almost identical between
filesystems, so I believe this is an accurate comparison. The dead time
is then normalized using ext2 as a baseline (a value > 1 means it took
that many times longer than ext2).

FS      deb     tar     make    cp      clean   make2   rm      total
ext2    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
ext3    1.12    2.47    0.88    1.16    0.91    0.93    3.01    1.13
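
For reference, the normalization used for the table can be sketched in a few lines of Python. The timings in the example are made up purely to show the arithmetic; they are not measurements from these benchmarks, and the function names are only illustrative.

```python
def dead_time(total_seconds, user_seconds):
    """Dead time as defined above: total time minus user time."""
    return total_seconds - user_seconds


def normalize(dead_times, baseline="ext2"):
    """Divide every dead time by the baseline filesystem's dead time."""
    base = dead_times[baseline]
    return {fs: t / base for fs, t in dead_times.items()}


if __name__ == "__main__":
    # Made-up (total, user) timings in seconds, for illustration only.
    timings = {"ext2": (100.0, 80.0), "ext3": (105.0, 80.0)}
    dead = {fs: dead_time(total, user) for fs, (total, user) in timings.items()}
    for fs, ratio in normalize(dead).items():
        print(f"{fs}\t{ratio:.2f}")
```

With these invented numbers ext2 has 20 s of dead time and ext3 has 25 s, so ext3 would be reported as 1.25, meaning it spent 1.25 times as long as ext2 outside user time.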

