bugzilla-daemon | 1 Oct 2010 17:47

[Bug 19432] New: ext4 crashes in journal code/barrier

https://bugzilla.kernel.org/show_bug.cgi?id=19432

           Summary: ext4 crashes in journal code/barrier
           Product: File System
           Version: 2.5
    Kernel Version: Ubuntu Linux 2.6.32-24-generic
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4 <at> kernel-bugs.osdl.org
        ReportedBy: valerio <at> aimale.com
        Regression: No

I've seen several kernel crashes inside the ext4 journal/barrier code. The
machine is a MySQL database server. The RAID controller is an Adaptec
RAID 52445 with battery-backed write cache.

06:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
    Subsystem: Adaptec Device 02d0
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at fb200000 (64-bit, non-prefetchable) [size=2M]

Lukas Czerner | 1 Oct 2010 17:58

Re: [PATCH 0/6 v4] Lazy itable initialization for Ext4

On Wed, 29 Sep 2010, Lukas Czerner wrote:

> On Tue, 28 Sep 2010, Ted Ts'o wrote:
> 
> > On Thu, Sep 16, 2010 at 02:47:25PM +0200, Lukas Czerner wrote:
> > > 
> > > as Mike suggested I have rebased the patch #1 against Jens'
> > > linux-2.6-block.git 'for-next' branch and changed sb_issue_zeroout()
> > > to cope with the new blkdev_issue_zeroout(), and changed
> > > sb_issue_zeroout() to the new syntax everywhere I am using it.
> > > Also, some typos got fixed.
> > 
> > We may have a problem with the lazy_itable patches.  I've tried
> > running the XFSTESTS three times now.  This was with a system where
> > mke2fs was set up (via /etc/mke2fs.conf) to always format the file
> > system using lazy_itable_init.  This meant that any of the xfstests
> > which reformatted the scratch partition and then started a stress test
> > would stress the newly added itable initialization code.
> > Unfortunately the results weren't good.
> > 
> > The first time, I got the following soft lockup warning:
> > 
> > [ 2520.528745] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 2520.531445]  ef2b8e44 00000046 00000007 e29c1500 e29c1500 e29c1760 e29c175c c0b55500
> > [ 2520.534983]  c0b55500 e29c175c c0b55500 c0b55500 c0b55500 32423426 00000224 00000000
> > [ 2520.538270]  00000224 e29c1500 00000001 ef205000 00000005 ef2b8e74 ef2b8e80 c026eb2c
> > [ 2520.541743] Call Trace:
> > [ 2520.542742]  [<c026eb2c>] jbd2_log_wait_commit+0x103/0x14f
> > [ 2520.544291]  [<c01711dc>] ? autoremove_wake_function+0x0/0x34
> > [ 2520.545816]  [<c026bf95>] jbd2_log_do_checkpoint+0x1a8/0x458

Lukas Czerner | 1 Oct 2010 18:00

[PATCH 4/6 fixed] Use sb_issue_zeroout in setup_new_group_blocks

Use sb_issue_zeroout to zero out inode table and descriptor table
blocks instead of old approach which involves journaling.

Signed-off-by: Lukas Czerner <lczerner <at> redhat.com>
---
 fs/ext4/resize.c |   47 +++++++++++++++--------------------------------
 1 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index ca5c8aa..49c8aff 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -226,23 +226,16 @@ static int setup_new_group_blocks(struct super_block *sb,
 	}

 	/* Zero out all of the reserved backup group descriptor table blocks */
-	for (i = 0, bit = gdblocks + 1, block = start + bit;
-	     i < reserved_gdb; i++, block++, bit++) {
-		struct buffer_head *gdb;
-
-		ext4_debug("clear reserved block %#04llx (+%d)\n", block, bit);
+	ext4_debug("clear inode table blocks %#04llx -> %#04llx\n",
+			block, sbi->s_itb_per_group);
+	err = sb_issue_zeroout(sb, gdblocks + start + 1, reserved_gdb,
+			       GFP_NOFS, BLKDEV_IFL_WAIT);
+	if (err)
+		goto exit_bh;

-		if ((err = extend_or_restart_transaction(handle, 1, bh)))
-			goto exit_bh;

Theodore Ts'o | 1 Oct 2010 23:35

2.6.35-rc6 REGRESSION: Dirtiable inode bdi default != sb bdi ext2/ext3/ext4


I'm not sure this is related to:

http://bugzilla.kernel.org/show_bug.cgi?id=19062

(2.6.35-rc6 REGRESSION: Dirtiable inode bdi default != sb bdi btrfs)

or not, but this is a problem that did not exist in 2.6.36-rc3 and
showed up when I tried going to 2.6.36-rc6.

The symptoms are that if I mount a filesystem, whether it be ext2, ext3,
or ext4, modify it slightly (say, create a file or a directory), then
umount the filesystem, and run "e2fsck -f" on that filesystem, I get the
warning:

[  866.543173] WARNING: at /usr/projects/linux/ext4/fs/fs-writeback.c:87 ino_bdi+0x4e/0x5c()
[  866.546156] Hardware name: 
[  866.547415] Dirtiable inode bdi block != sb bdi block
[  866.556113] Modules linked in:
[  866.557522] Pid: 1993, comm: e2fsck Tainted: G        W   2.6.36-rc6-0004aa513 #722
[  866.560365] Call Trace:
[  866.561475]  [<c015a2e2>] warn_slowpath_common+0x6a/0x7f
[  866.563312]  [<c020a762>] ? inode_to_bdi+0x4e/0x5c
[  866.565047]  [<c015a36a>] warn_slowpath_fmt+0x2b/0x2f
[  866.566852]  [<c020a762>] inode_to_bdi+0x4e/0x5c
[  866.568457]  [<c020b6c3>] __mark_inode_dirty+0xaf/0x162
[  866.570242]  [<c0202305>] file_update_time+0xcc/0xe9
[  866.571924]  [<c01c68dd>] __generic_file_aio_write+0x136/0x28f
[  866.573770]  [<c02145a4>] blkdev_aio_write+0x33/0x72
[  866.575480]  [<c01f20da>] do_sync_write+0x8f/0xca

Ted Ts'o | 2 Oct 2010 04:31

Re: I/O topology fixes for big physical block size

On Fri, Oct 01, 2010 at 06:19:21PM -0400, Martin K. Petersen wrote:
> Since not all drives guarantee that read-modify-write cycle on a 4 KiB
> physical block won't clobber adjacent 512-byte logical blocks it may be
> a good idea to look at physical block size if there are atomicity
> concerns.  I.e. filesystems that depend on atomic journal writes may
> want to look at the reported physical block size.

OK, but what do we do when we start seeing devices with 8k or 16k
physical block sizes?  The VM doesn't deal well with block sizes >
page size.

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
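
The two concerns raised in this exchange (a file system write unit smaller
than the physical block, and a physical block larger than the page size) can
be sketched as a small user-space check. This is only an illustration: the
enum and the function classify_block_geometry() are hypothetical names, not
kernel interfaces, and the sizes are assumed to be powers of two. On Linux the
inputs could be read from the queue/logical_block_size and
queue/physical_block_size sysfs attributes of the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of a device's reported block geometry. */
enum atomic_unit_status {
    UNIT_OK,            /* each fs block covers whole physical blocks */
    UNIT_SUBPHYSICAL,   /* fs block < physical block: exposed to RMW */
    UNIT_EXCEEDS_PAGE   /* physical block is larger than a page */
};

/* Returns true if v is a nonzero power of two. */
static bool is_pow2(size_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Decide whether a write of one file system block can be treated as
 * covering whole physical blocks.  All sizes are in bytes and are
 * assumed to be powers of two, so "at least as large" also means
 * "a multiple of".
 */
enum atomic_unit_status
classify_block_geometry(size_t fs_block, size_t phys_block, size_t page_size)
{
    assert(is_pow2(fs_block) && is_pow2(phys_block) && is_pow2(page_size));

    if (phys_block > page_size)
        return UNIT_EXCEEDS_PAGE;   /* the 8k/16k case raised above */
    if (fs_block < phys_block)
        return UNIT_SUBPHYSICAL;    /* device must read-modify-write */
    return UNIT_OK;
}
```

With a 4 KiB page, a 4 KiB file system block on a 4 KiB physical block is
classified as UNIT_OK, a 1 KiB file system block on the same device as
UNIT_SUBPHYSICAL, and any device reporting a 16 KiB physical block as
UNIT_EXCEEDS_PAGE.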

Daniel Taylor | 2 Oct 2010 05:03

RE: I/O topology fixes for big physical block size


> -----Original Message-----
> From: linux-ext4-owner <at> vger.kernel.org 
> [mailto:linux-ext4-owner <at> vger.kernel.org] On Behalf Of Ted Ts'o
> Sent: Friday, October 01, 2010 7:31 PM
> To: Martin K. Petersen
> Cc: Mike Snitzer; Eric Sandeen; Jens Axboe; 
> James.Bottomley <at> hansenpartnership.com; 
> linux-scsi <at> vger.kernel.org; linux-ext4 <at> vger.kernel.org
> Subject: Re: I/O topology fixes for big physical block size
> 
> On Fri, Oct 01, 2010 at 06:19:21PM -0400, Martin K. Petersen wrote:
> > Since not all drives guarantee that read-modify-write cycle on a 4 KiB
> > physical block won't clobber adjacent 512-byte logical blocks it may be
> > a good idea to look at physical block size if there are atomicity
> > concerns.  I.e. filesystems that depend on atomic journal writes may
> > want to look at the reported physical block size.
> 
> OK, but what do we do when we start seeing devices with 8k or 16k
> physical block sizes?  The VM doesn't deal well with block sizes >
> page size.

This is a very real concern.

Those drives already exist, in essence, in RAID configurations, and
we have had to do a workaround that complicates our production process
to handle file systems for embedded devices where the file system block
size is 64K (the kernel block size for the device is also 64K), but

bugzilla-daemon | 2 Oct 2010 12:33

[Bug 19502] New: 'losetup -c' on a mounted loopN device with ext[34] causes soft lockup

https://bugzilla.kernel.org/show_bug.cgi?id=19502

           Summary: 'losetup -c' on a mounted loopN device with ext[34]
                    causes soft lockup
           Product: File System
           Version: 2.5
    Kernel Version: 2.6.35.6, 2.6.35.7
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext3
        AssignedTo: fs_ext3 <at> kernel-bugs.osdl.org
        ReportedBy: sliedes <at> cc.hut.fi
        Regression: No

(This log is from qemu, but I have reproduced this on real x86-64 hardware too)

*** Steps to reproduce ***

1. mkdir test
2. mount none test -t tmpfs
3. cd test
4. dd if=/dev/zero bs=1M count=128 >img
5. mkfs.ext3 img # OR ext4
6. mount img /media/ -o loop
7. cp ../.bashrc /media/
7b. OPTIONAL (here not done): dd if=/dev/zero bs=1M count=128 >>img

bugzilla-daemon | 2 Oct 2010 18:42

[Bug 17361] Watchdog detected hard LOCKUP in jbd2_journal_get_write_access

https://bugzilla.kernel.org/show_bug.cgi?id=17361

Theodore Tso <tytso <at> mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso <at> mit.edu

--- Comment #14 from Theodore Tso <tytso <at> mit.edu>  2010-10-02 16:42:45 ---
People may not be paying attention to this due to the subject line.

Except for the initial bug report, none of the other stack traces have anything
to do with ext4/jbd2.    And in the initial ext4 trace, we see the complaint
that we're calling might_sleep() in ext4_mark_inode_dirty(), in a code path
where we are manifestly not taking any spinlocks.    And in fact we don't see
any spinlocks being taken at the point where the complaint is made in
ext4_mark_inode_dirty().   Yet preempt_count > 1.

It looks to me like some unrelated piece of code is bumping preempt_count, and
not decrementing it.  Maybe in some code which is called from an interrupt
handler, in some device driver?   That might explain why you're getting
failures all over the kernel.
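
The failure mode described above (one code path leaks a preempt count, and an
unrelated path later gets blamed) can be modelled in a few lines of user-space
C. This is a toy sketch, not kernel code: every toy_* name is hypothetical,
and a single global integer stands in for the real preemption counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the preemption counter; not the kernel's. */
static int toy_preempt_count;

void toy_preempt_disable(void)
{
    toy_preempt_count++;
}

void toy_preempt_enable(void)
{
    assert(toy_preempt_count > 0);
    toy_preempt_count--;
}

/* True when a call that may sleep would produce a complaint. */
bool toy_might_sleep_would_warn(void)
{
    return toy_preempt_count != 0;
}

/* Hypothetical buggy driver path: the error return skips the enable. */
int toy_buggy_driver_path(bool error)
{
    toy_preempt_disable();
    if (error)
        return -1;          /* BUG: leaks one preempt count */
    toy_preempt_enable();
    return 0;
}

/*
 * Hypothetical file system path that takes no locks of its own.
 * It only reports whether the sleep check would fire here.
 */
bool toy_fs_mark_inode_dirty(void)
{
    return toy_might_sleep_would_warn();
}
```

After toy_buggy_driver_path(true) has run once, toy_fs_mark_inode_dirty()
reports a complaint even though it never touched the counter, which is why
the resulting stack traces point at code far from the real culprit.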

It may be worth closing this report, and opening several new ones, one for each
failure, and make it clear this is not an ext4-related problem, since the
subject line and component assigned for this bug are highly misleading.

Including your kernel config would also be useful when you do that.


Ted Ts'o | 2 Oct 2010 21:55

Re: [PATCH 0/6 v4] Lazy itable initialization for Ext4

On Fri, Oct 01, 2010 at 05:58:52PM +0200, Lukas Czerner wrote:
> 
> After extensive xfstest-ing I have not been able to reproduce it.
> However, after hammering it for a while with another stress test (the one
> I have proposed to test batched discard implementation with) I have
> got a panic due to not up-to-date buffer_head in submit_bh() :
> kernel BUG at fs/buffer.c:2910! - I have been able to reproduce it
> every time (on different BUG_ON sometimes)

I found it --- or at least I found one of the problems.

The call to ext4_unregister_li_request(sb) comes *after* the call to
jbd2_journal_destroy().  If we get unlucky and ext4_init_inode_table()
runs while the journal is being destroyed, it can create a handle after
the journal thread has been shut down, during the final call to
jbd2_journal_commit_transaction(), but before jbd2_journal_destroy()
calls jbd2_log_do_checkpoint().  We then end up waiting forever in
jbd2_log_wait_commit().

This shouldn't, however, lock up the system so tightly that it doesn't
respond to magic sysrq, but I haven't seen that problem since I moved
from 2.6.36-rc3 to 2.6.36-rc6.  I do still see the hang described above,
which is definitely a bug.
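
The ordering problem can be illustrated with a toy user-space model. It is
only a sketch under stated assumptions: the toy_* structures and functions
are hypothetical and are not the ext4 or jbd2 API, and the background work is
made to fire at the worst possible moment, right after the commit thread has
gone away.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; these names are hypothetical, not the jbd2 API. */
struct toy_journal {
    bool commit_thread_alive;  /* can anything still commit? */
    int  stuck_waiters;        /* handles nobody will ever commit */
};

struct toy_fs {
    struct toy_journal journal;
    bool lazyinit_registered;  /* background itable init may still run */
};

/* Background work: starts a handle if it is still registered. */
void toy_lazyinit_tick(struct toy_fs *fs)
{
    if (!fs->lazyinit_registered)
        return;
    if (!fs->journal.commit_thread_alive)
        fs->journal.stuck_waiters++;   /* would wait forever */
}

void toy_unregister_lazyinit(struct toy_fs *fs)
{
    fs->lazyinit_registered = false;
}

void toy_journal_destroy(struct toy_fs *fs)
{
    fs->journal.commit_thread_alive = false;
}

/* Returns the number of handles left waiting after teardown. */
int toy_umount(struct toy_fs *fs, bool unregister_first)
{
    if (unregister_first)
        toy_unregister_lazyinit(fs);
    toy_journal_destroy(fs);
    toy_lazyinit_tick(fs);     /* worst-case timing of background work */
    if (!unregister_first)
        toy_unregister_lazyinit(fs);
    return fs->journal.stuck_waiters;
}
```

Calling toy_umount() with unregister_first set to false leaves one stuck
waiter, which models the hang in jbd2_log_wait_commit(). Unregistering the
background work before the journal is destroyed leaves none.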

I am getting a lot of warnings from fs/writeback.c:76 (Dirtiable inode
bdi block != sb bdi block) which I have been commenting out for now,
since it seems to be noisy but otherwise relatively harmless.

I also found a bug in ext4_init_inode_table() where you compare
(num > EXT4_INODES_PER_GROUP(sb)), which

Theodore Ts'o | 2 Oct 2010 22:36

[PATCH] jbd2: Add sanity check for attempts to start handle during umount

If there is an attempt to modify the file system during the call to
jbd2_journal_destroy(), it can lead to a system lockup.  So add some
checking to make it much more obvious when this happens and to
determine where the offending code is located.

Signed-off-by: "Theodore Ts'o" <tytso <at> mit.edu>
---
 fs/jbd2/checkpoint.c  |   10 ++++++++++
 fs/jbd2/transaction.c |    1 +
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 5247e7f..524800d 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -299,6 +299,16 @@ static int __process_buffer(journal_t *journal, struct journal_head *jh,
 		transaction->t_chp_stats.cs_forced_to_close++;
 		spin_unlock(&journal->j_list_lock);
 		jbd_unlock_bh_state(bh);
+		if (unlikely(journal->j_flags & JBD2_UNMOUNT))
+			/*
+			 * The journal thread is dead; so starting and
+			 * waiting for a commit to finish will cause
+			 * us to wait for a _very_ long time.
+			 */
+			printk(KERN_ERR "JBD2: %s: "
+			       "Waiting for Godot: block %llu\n",
+			       journal->j_devname,
+			       (unsigned long long) bh->b_blocknr);
 		jbd2_log_start_commit(journal, tid);