Dave Chinner | 1 Apr 2011 01:40

Re: [RFC][PATCH] Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock

On Mon, Mar 28, 2011 at 05:06:28PM +0900, Toshiyuki Okajima wrote:
> Hi.
> 
> On Thu, 17 Feb 2011 11:45:52 +0100
> Jan Kara <jack <at> suse.cz> wrote:
> > On Thu 17-02-11 12:50:51, Toshiyuki Okajima wrote:
> > > (2011/02/16 23:56), Jan Kara wrote:
> > > >On Wed 16-02-11 08:17:46, Toshiyuki Okajima wrote:
> > > >>On Tue, 15 Feb 2011 18:29:54 +0100
> > > >>Jan Kara<jack <at> suse.cz>  wrote:
> > > >>>On Tue 15-02-11 12:03:52, Ted Ts'o wrote:
> > > >>>>On Tue, Feb 15, 2011 at 05:06:30PM +0100, Jan Kara wrote:
> > > >>>>>Thanks for detailed analysis. Indeed this is a bug. Whenever we do IO
> > > >>>>>under s_umount semaphore, we are prone to deadlock like the one you
> > > >>>>>describe above.
> > > >>>>
> > > >>>>One of the fundamental problems here is that the freeze and thaw
> > > >>>>routines are using down_write(&sb->s_umount) for two purposes.  The
> > > >>>>first is to prevent the resume/thaw from racing with a umount (which
> > > >>>>it could do just as well by taking a read lock), but the second is to
> > > >>>>prevent the resume/thaw code from racing with itself.  That's the core
> > > >>>>fundamental problem here.
> > > >>>>
> > > >>>>So I think we can solve this by introducing a new mutex, s_freeze, and
> > > >>>>having the resume/thaw first take the s_freeze mutex and then
> > > >>>>second take a read lock on the s_umount.
> > > >>>   Sadly this does not quite work because even down_read(&sb->s_umount)
> > > >>>in thaw_super() can block if there is another process that tries to acquire
> > > >>>s_umount for writing - a situation like:
> > > >>>   TASK 1 (e.g. flusher)		TASK 2	(e.g. remount)		TASK 3 (unfreeze)

Eric Sandeen | 1 Apr 2011 01:53

Re: [RFC][PATCH] Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock

On 3/31/11 6:40 PM, Dave Chinner wrote:
> On Mon, Mar 28, 2011 at 05:06:28PM +0900, Toshiyuki Okajima wrote:
>> Hi.
>>
>> On Thu, 17 Feb 2011 11:45:52 +0100
>> Jan Kara <jack <at> suse.cz> wrote:
>>> On Thu 17-02-11 12:50:51, Toshiyuki Okajima wrote:
>>>> (2011/02/16 23:56), Jan Kara wrote:
>>>>> On Wed 16-02-11 08:17:46, Toshiyuki Okajima wrote:
>>>>>> On Tue, 15 Feb 2011 18:29:54 +0100
>>>>>> Jan Kara<jack <at> suse.cz>  wrote:
>>>>>>> On Tue 15-02-11 12:03:52, Ted Ts'o wrote:
>>>>>>>> On Tue, Feb 15, 2011 at 05:06:30PM +0100, Jan Kara wrote:
>>>>>>>>> Thanks for detailed analysis. Indeed this is a bug. Whenever we do IO
>>>>>>>>> under s_umount semaphore, we are prone to deadlock like the one you
>>>>>>>>> describe above.
>>>>>>>>
>>>>>>>> One of the fundamental problems here is that the freeze and thaw
>>>>>>>> routines are using down_write(&sb->s_umount) for two purposes.  The
>>>>>>>> first is to prevent the resume/thaw from racing with a umount (which
>>>>>>>> it could do just as well by taking a read lock), but the second is to
>>>>>>>> prevent the resume/thaw code from racing with itself.  That's the core
>>>>>>>> fundamental problem here.
>>>>>>>>
>>>>>>>> So I think we can solve this by introducing a new mutex, s_freeze, and
>>>>>>>> having the resume/thaw first take the s_freeze mutex and then
>>>>>>>> second take a read lock on the s_umount.
>>>>>>>   Sadly this does not quite work because even down_read(&sb->s_umount)
>>>>>>> in thaw_super() can block if there is another process that tries to acquire
>>>>>>> s_umount for writing - a situation like:

Trond Myklebust | 1 Apr 2011 02:10

Re: [Lsf-pc] [LSF/FS TOPIC] Ext4 snapshots status update

On Wed, 2011-03-30 at 14:08 +0200, Amir Goldstein wrote:
> On Wed, Mar 30, 2011 at 1:50 PM, Chris Mason <chris.mason <at> oracle.com> wrote:
> > Excerpts from Amir Goldstein's message of 2011-03-30 00:16:45 -0400:
> >> On Wed, Mar 30, 2011 at 2:34 AM, Joel Becker <jlbec <at> evilplan.org> wrote:
> >> > On Wed, Mar 23, 2011 at 10:19:38PM +0200, Amir Goldstein wrote:
> >> >> On Fri, Feb 4, 2011 at 2:20 AM, Joel Becker <jlbec <at> evilplan.org> wrote:
> >> >> > On Fri, Feb 04, 2011 at 12:33:39AM +0200, Amir Goldstein wrote:
> >> >> >        I've already got a design for a front-end snapshot program that
> >> >> > implements a policy on top of this generic behavior.  This design would
> >> >> > cover both first-class and hidden style snapshots, because it assumes
> >> >> > snapshots are in a distinct namespace.  I haven't gotten around to
> >> >> > implementing it yet, but btrfs and other snapshottable filesystems were
> >> >> > part of the design goal.
> >> >>
> >> >> Any chance of getting a copy of that design of yours, to get a head start
> >> >> for LSF?
> >> >
> >> >        Yeah, I owe it to you.  It wasn't a written-down thing, it was a
> >> > hammered-out-in-our-heads thing among some ocfs2 developers.  I'm going
> >> > to braindump here to get us going.  First, I'll speak to your points.
> >> >
> >> >> Here are some other generic snapshot related topics we may want to discuss:
> >> >>
> >> >> 1. Collaborating the use of inode flags COW_FL, NOCOW_FL, suggested by Chris.
> >> >
> >> >        I'm unsure where these fit, perhaps because I missed the
> >> > discussion between Chris and you.  ocfs2 has the inode flag
> >> > OCFS2_REFCOUNTED_FL to signify a refcount tree is attached to the inode.
> >> > This is ocfs2's structure for maintaining extent reference counts.  Is
> >> > your COW_FL the same?  Or is it a permission flag?  NOCOW_FL sounds

Yongqiang Yang | 1 Apr 2011 10:44

[PATCH v0 RFC] ext4: Fix a bug in ext4_journal_start_sb().

ext4_journal_start_sb() should not prevent an active handle from being
started because of s_frozen.  Otherwise a deadlock can easily occur; one
such situation is shown below.

======================================================================
     freeze         |       truncate             |      kjournald
======================================================================
                    |  ext4_ext_truncate()       |
    freeze_super()  |   starts a handle          |
    sets s_frozen   |                            |
                    |  ext4_ext_truncate()       |
                    |  holds i_data_sem          |
  ext4_freeze()     |                            |   commit_transaction()
  waits for updates |                            |   waits for i_data_sem
                    |  ext4_free_blocks()        |
                    |  calls dquot_free_block()  |
                    |                            |
                    |  dquot_free_block()        |
                    |  calls ext4_dirty_inode()  |
                    |                            |
                    |  ext4_dirty_inode()        |
                    |  tries to start an active  |
                    |  handle                    |
                    |                            |
                    |  blocks due to s_frozen    |
======================================================================
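
The rule argued for above can be sketched in userspace C.  This is only a
toy model under stated assumptions, not the patch itself: struct model_sb,
struct model_task, journal_start_old(), journal_start_new() and
journal_stop() are all hypothetical names, the "frozen" flag stands in for
the s_frozen check, and a START_WOULD_BLOCK return value stands in for
sleeping.  The only point it illustrates is that a task which already holds
a handle is let through, while a task with no handle still waits for the
thaw.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the superblock freeze state. */
struct model_sb {
	bool frozen;		/* true once freeze has set s_frozen */
};

/* Hypothetical per-task journal state. */
struct model_task {
	int handle_refs;	/* > 0: the task already holds a handle */
};

enum start_result { START_OK, START_WOULD_BLOCK };

/* Behaviour described as buggy: every caller waits while frozen,
 * including a task that is in the middle of an update. */
enum start_result journal_start_old(const struct model_sb *sb,
				    struct model_task *t)
{
	if (sb->frozen)
		return START_WOULD_BLOCK;
	t->handle_refs++;
	return START_OK;
}

/* Behaviour argued for: only a task without a handle waits, so a
 * nested start can proceed and the outer update can finish. */
enum start_result journal_start_new(const struct model_sb *sb,
				    struct model_task *t)
{
	if (t->handle_refs == 0 && sb->frozen)
		return START_WOULD_BLOCK;
	t->handle_refs++;
	return START_OK;
}

/* Drop one handle reference; the caller must hold one. */
void journal_stop(struct model_task *t)
{
	assert(t->handle_refs > 0);
	t->handle_refs--;
}
```

In the scenario from the table, truncate starts its handle before the
freeze, then asks for a nested handle from ext4_dirty_inode() after
s_frozen is set.  With journal_start_old() that second request would block
forever; with journal_start_new() it succeeds, both handles are stopped,
and the freeze can see the update count drop to zero.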

Messages reported by Amir:
while running phoronix test suite and taking a snapshot every 10 seconds,
the following hang happened during tiobench [64MB Random Write - 32 Threads]:

Amir Goldstein | 1 Apr 2011 11:54

Re: kernel BUG at fs/jbd2/transaction.c:1086!

On Thu, Mar 31, 2011 at 3:23 PM, Jan Kara <jack <at> suse.cz> wrote:
> On Thu 31-03-11 22:59:24, Lukas Czerner wrote:
>> On Thu, 31 Mar 2011, Jan Kara wrote:
>>
>> > On Wed 30-03-11 17:32:06, Lukas Czerner wrote:
>> > > Hello,
>> > >
>> > > I have hit a BUG_ON while running xfstest 234 in a loop
>> > > on ext4. Backtrace below.
>> > >
>> > > I thought that this problem was solved with
>> > > 67eeb5685d2a211c0252ac7884142e503c759500 however it is still present.
>> > > It might be a bit hard to hit, but once you run
>> > >
>> > > while true; do sync; sleep 0.3; done
>> > >
>> > > concurrently it is reproducible almost all the time. I have tried to
>> > > understand what is going on but the only thing I can see is this course
>> > > of action:
>> > >
>> > > ext4_write_dquot
>> > >   ext4_journal_start <- h_buffer_credits = 2
>> > >   dquot_commit
>> > >     v2_write_dquot
>> > >       qtree_write_dquot
>> > >         ext4_quota_write
>> > >           ext4_handle_dirty_metadata      <- this takes one block reserved
>> > >                                      for journal
>> > >                                      h_buffer_credits--
>> > >     v2_write_file_info

Yongqiang Yang | 1 Apr 2011 16:07

Re: [Bug 32182] New: EXT4-fs error: bad header/extent

Can the bug be reproduced?
If so, how?

On Wed, Mar 30, 2011 at 12:47 AM,  <bugzilla-daemon <at> bugzilla.kernel.org> wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=32182
>
>           Summary: EXT4-fs error: bad header/extent
>           Product: File System
>           Version: 2.5
>    Kernel Version: 2.6.38.1
>          Platform: All
>        OS/Version: Linux
>              Tree: Mainline
>            Status: NEW
>          Severity: high
>          Priority: P1
>         Component: ext4
>        AssignedTo: fs_ext4 <at> kernel-bugs.osdl.org
>        ReportedBy: ramses.rommel <at> gmail.com
>        Regression: No
>
>
> I started noticing ext4-fs errors in my dmesg output, they look like this:
>
> EXT4-fs error (device sda2): ext4_ext_check_inode:428: inode #3760218: comm
> dropbox: bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth
> 0(0)
> EXT4-fs error (device sda2): ext4_ext_check_inode:428: inode #3760208: comm
> dropbox: bad header/extent: invalid magic - magic 0, entries 0, max 0(0), depth
> 0(0)

bugzilla-daemon | 1 Apr 2011 16:07

[Bug 32182] EXT4-fs error: bad header/extent

https://bugzilla.kernel.org/show_bug.cgi?id=32182

--- Comment #1 from Anonymous Emailer <anonymous <at> kernel-bugs.osdl.org>  2011-04-01 14:07:56 ---
Reply-To: xiaoqiangnk <at> gmail.com

Can the bug be reproduced?
If so, how?

On Wed, Mar 30, 2011 at 12:47 AM,  <bugzilla-daemon <at> bugzilla.kernel.org> wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=32182
>
>           Summary: EXT4-fs error: bad header/extent
>           Product: File System
>           Version: 2.5
>    Kernel Version: 2.6.38.1
>          Platform: All
>        OS/Version: Linux
>              Tree: Mainline
>            Status: NEW
>          Severity: high
>          Priority: P1
>         Component: ext4
>        AssignedTo: fs_ext4 <at> kernel-bugs.osdl.org
>        ReportedBy: ramses.rommel <at> gmail.com
>        Regression: No
>
>
> I started noticing ext4-fs errors in my dmesg output, they look like this:
>
> EXT4-fs error (device sda2): ext4_ext_check_inode:428: inode #3760218: comm

Jan Kara | 1 Apr 2011 16:08

Re: [RFC][PATCH] Re: [BUG] ext4: cannot unfreeze a filesystem due to a deadlock

On Fri 01-04-11 10:40:50, Dave Chinner wrote:
> On Mon, Mar 28, 2011 at 05:06:28PM +0900, Toshiyuki Okajima wrote:
> > On Thu, 17 Feb 2011 11:45:52 +0100
> > Jan Kara <jack <at> suse.cz> wrote:
> > > On Thu 17-02-11 12:50:51, Toshiyuki Okajima wrote:
> > > > (2011/02/16 23:56), Jan Kara wrote:
> > > > >On Wed 16-02-11 08:17:46, Toshiyuki Okajima wrote:
> > > > >>On Tue, 15 Feb 2011 18:29:54 +0100
> > > > >>Jan Kara<jack <at> suse.cz>  wrote:
> > > > >>>On Tue 15-02-11 12:03:52, Ted Ts'o wrote:
> > > > >>>>On Tue, Feb 15, 2011 at 05:06:30PM +0100, Jan Kara wrote:
> > > > >>>>>Thanks for detailed analysis. Indeed this is a bug. Whenever we do IO
> > > > >>>>>under s_umount semaphore, we are prone to deadlock like the one you
> > > > >>>>>describe above.
> > > > >>>>
> > > > >>>>One of the fundamental problems here is that the freeze and thaw
> > > > >>>>routines are using down_write(&sb->s_umount) for two purposes.  The
> > > > >>>>first is to prevent the resume/thaw from racing with a umount (which
> > > > >>>>it could do just as well by taking a read lock), but the second is to
> > > > >>>>prevent the resume/thaw code from racing with itself.  That's the core
> > > > >>>>fundamental problem here.
> > > > >>>>
> > > > >>>>So I think we can solve this by introducing a new mutex, s_freeze, and
> > > > >>>>having the resume/thaw first take the s_freeze mutex and then
> > > > >>>>second take a read lock on the s_umount.
> > > > >>>   Sadly this does not quite work because even down_read(&sb->s_umount)
> > > > >>>in thaw_super() can block if there is another process that tries to acquire
> > > > >>>s_umount for writing - a situation like:
> > > > >>>   TASK 1 (e.g. flusher)		TASK 2	(e.g. remount)		TASK 3 (unfreeze)
> > > > >>>down_read(&sb->s_umount)

bugzilla-daemon | 1 Apr 2011 17:09

[Bug 25832] kernel crashes upon resume if usb devices are removed when suspended

https://bugzilla.kernel.org/show_bug.cgi?id=25832

--- Comment #53 from Theodore Tso <tytso <at> mit.edu>  2011-04-01 15:09:03 ---
Folks who are experiencing this problem might want to try 2.6.38.2.   There was
a fix that was committed to mainline and backported to the stable kernels that
may fix this problem:

commit 95f28604a65b1c40b6c6cd95e58439cd7ded3add
Author: Jens Axboe <jaxboe <at> fusionio.com>
Date:   Thu Mar 17 11:13:12 2011 +0100

    fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away

    We don't have proper reference counting for this yet, so we run into
    cases where the device is pulled and we OOPS on flushing the fs data.
    This happens even though the dirty inodes have already been
    migrated to the default_backing_dev_info.

    Reported-by: Torsten Hilbrich <torsten.hilbrich <at> secunet.com>
    Tested-by: Torsten Hilbrich <torsten.hilbrich <at> secunet.com>
    Cc: stable <at> kernel.org
    Signed-off-by: Jens Axboe <jaxboe <at> fusionio.com>

Sorry for not responding to this bug sooner, but I've been crazy busy in the
last couple of weeks; troubleshooting and discussion was taking place on LKML,
and I was pretty sure this wasn't an ext4 specific issue.

--

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------

Lukas Czerner | 1 Apr 2011 17:26

Re: breaking ext4 to test recovery

On Thu, 31 Mar 2011, Eric Sandeen wrote:

> On 3/31/11 5:21 PM, Andreas Dilger wrote:
> 
> > We have a kernel patch "dev_read_only" that we use with Lustre to
> > disable writes to the block device while the device is in use.  This
> > allows simulating crashes at arbitrary points in the code or test
> > scripts.  It was based on Andrew Morton's test harness that he used
> > for ext3 recovery testing back when it was being ported to the 2.4
> > kernel.
> > 
> > http://git.whamcloud.com/?p=fs/lustre-release.git;a=blob_plain;f=lustre/kernel_patches/patches/dev_read_only-2.6.32-rhel6.patch;hb=HEAD
> >
> >  The best part of this patch is that it works with any block device,
> > can simulate power failure w/o any need for automated power control,
> > and once the block device is unused (all buffers and references
> > dropped) it can be re-activated safely.
> 
> It won't simulate a lost write cache though, will it?

That's a very good question. I would like to know if there is any way at
all to force the device to drop its write cache. That would really help
with power-failure testing of filesystems.

-Lukas

> 
> -Eric
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in

