Michael Tokarev | 1 Aug 2010 12:57

sw raid array completely hangs during verify in 2.6.32

Hello.

This is the second time we have come across this issue
since switching from 2.6.27 to 2.6.32 about 3
months ago.

At some point, an md-raid10 array hangs - that
is, all the processes that try to access it,
either reading or writing, hang forever.

Here's a typical set of messages found in kern.log:

 INFO: task oracle:7602 blocked for more than 120 seconds.
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 oracle        D ffff8801a8837148     0  7602      1 0x00000000
  ffffffff813bc480 0000000000000082 0000000000000000 0000000000000001
  ffff8801a8b7fdd8 000000000000e1c8 ffff88003b397fd8 ffff88003f47d840
  ffff88003f47dbe0 000000012416219a ffff88002820e1c8 ffff88003f47dbe0
 Call Trace:
  [<ffffffffa018e8ae>] ? wait_barrier+0xee/0x130 [raid10]
  [<ffffffff8104f570>] ? default_wake_function+0x0/0x10
  [<ffffffffa0191852>] ? make_request+0x82/0x5f0 [raid10]
  [<ffffffffa007cb2c>] ? md_make_request+0xbc/0x130 [md_mod]
  [<ffffffff810c4722>] ? mempool_alloc+0x62/0x140
  [<ffffffff8117d26f>] ? generic_make_request+0x30f/0x410
  [<ffffffff8112eee4>] ? bio_alloc_bioset+0x54/0xf0
  [<ffffffff8112e28b>] ? __bio_add_page+0x12b/0x240
  [<ffffffff8117d3cc>] ? submit_bio+0x5c/0xe0
  [<ffffffff811313da>] ? dio_bio_submit+0x5a/0x90
  [<ffffffff81131d63>] ? __blockdev_direct_IO+0x5a3/0xcd0

wiebittewas | 1 Aug 2010 20:28

failed to re-assemble after cable-problem...


Well, ext4 failed with I/O errors, so I checked the disks of the raid
and found that something was wrong with the cables...

After exchanging the cables I tried "mdadm --assemble --force",
but the result was that the missing devices were only added as
spare drives, not as active ones.

Maybe there are errors in the filesystem, but the raid should be
ok, so how can I change the disks from spare to active?

Maybe by setting the right info directly in the superblock of each disk?

Any help would be appreciated.

wiebittewas
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Justin P. Mattock | 2 Aug 2010 00:23

Re: [PATCH]md:dm.c Fix warning: statement with no effect

On 07/31/2010 12:07 PM, Alasdair G Kergon wrote:
> On Sat, Jul 31, 2010 at 12:05:04PM -0700, Justin P. Mattock wrote:
>> haven't heard any feedback on this any ideas?
>>> Ive noticed that having CONFIG_BLK_DEV_INTEGRITY=n I get warning messages generated by GCC(below)
>>>     CC      drivers/md/dm.o
>>> drivers/md/dm.c: In function 'split_bvec':
>>> drivers/md/dm.c:1117:3: warning: statement with no effect
>>> drivers/md/dm.c: In function 'clone_bio':
>>> drivers/md/dm.c:1145:3: warning: statement with no effect
>
> I'd suggest hiding it inside the .h files and not trying to scatter #ifdefs
> throughout the code.
>
> Alasdair
>
>

OK, this one's a bit tricky, but I did finally get this to build clean.
Below is an updated patch. Let me know if something other than this
should be used (or if this is a good solution, let me know and I'll
resend).

<---cut--->

When building the kernel with CONFIG_BLK_DEV_INTEGRITY=n everything
gets sent (if I'm reading this correctly) to:
(line #660 bio.h)
#else /* CONFIG_BLK_DEV_INTEGRITY */

#define bio_integrity(a)		(0)
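
For readers following along, here is a minimal sketch of why a stub macro that expands to (0) produces "statement with no effect" when it is used as a statement, and how a static inline stub kept in a header avoids the warning without #ifdefs in the caller. The toy_* names are hypothetical; this is not the content of bio.h and not the patch from this message.

```c
/* Hypothetical stand-in for struct bio; not the kernel definition. */
struct toy_bio {
    int unused;
};

/* Stub in the style quoted above. Used as a statement, a call expands
 * to "(0);", which GCC reports as "statement with no effect". */
#define toy_integrity_clone_macro(dst, src) (0)

/* Alternative stub: calling an (empty) function is a valid statement,
 * so there is no warning, and the arguments are still type-checked. */
static inline int toy_integrity_clone(struct toy_bio *dst, struct toy_bio *src)
{
    (void)dst;
    (void)src;
    return 0;
}

/* A caller written without any #ifdef in the .c file. */
static int toy_clone_bio(struct toy_bio *dst, struct toy_bio *src)
{
    toy_integrity_clone(dst, src);
    return 0;
}
```

Whether the real stubs should become inline functions or stay as macros is a decision for the maintainers; the sketch only shows the mechanism behind the warning.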

Neil Brown | 2 Aug 2010 04:29

Re: Raid10 device hangs during resync and heavy I/O.

On Fri, 23 Jul 2010 11:47:01 -0400
Justin Bronder <jsbronder@gentoo.org> wrote:

> On 23/07/10 13:19 +1000, Neil Brown wrote:
> > On Thu, 22 Jul 2010 14:49:33 -0400

> > 
> > So the 'dd' process successfully waited for the barrier to be gone at
> > 189.021179, and thus set pending to '1'.  It then submitted the IO request.
> > We should then see swapper (or possibly some other thread) calling
> > allow_barrier when the request completes.  But we don't.
> > A request could possibly take many milliseconds to complete, but it shouldn't
> > take seconds and certainly not minutes.
> > 
> > It might be helpful if you could run this again, and in make_request(), after
> > the call to "wait_barrier()" print out:
> >   bio->bi_sector, bio->bi_size, bio->bi_rw
> > 
> > I'm guessing that the last request that doesn't seem to complete will be
> > different from the other in some important way.
> 
> Nothing stood out to me, but here's the tail end of a couple of different
> traces.

Thanks a lot!  Something does stand out for me!....

>            <...>-5047  [002]   207.051215: wait_barrier: out: dd - w:0 p:1 b:0
>            <...>-5047  [002]   207.051216: make_request: dd - sector:7472081 sz:20480 rw:0
>            <...>-4958  [003]   207.051218: raise_barrier: mid: md99_resync - w:0 p:1 b:1
>            <...>-5047  [002]   207.051227: wait_barrier: in:  dd - w:0 p:1 b:1

Neil Brown | 2 Aug 2010 04:58

Re: Raid10 device hangs during resync and heavy I/O.

On Mon, 2 Aug 2010 12:29:49 +1000
Neil Brown <neilb@suse.de> wrote:

> Ahhhh.... I see the problem.  Because a 'generic_make_request' is already
> active, the one called by raid10::make_request just queues the request until
> the top level one completes.   This results in a deadlock.
> 
> I'll have to ponder a bit to figure out the best way to fix this.
> 

So, one good strong cup of tea later I think I have a good solution.

Would you be able to test with this patch and confirm that you cannot
reproduce the hang?
Thanks.

NeilBrown
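
The sequence described above can be sketched as a toy model. It is only an illustration of the ordering problem, assuming that a nested submission is parked until the outer make_request returns; the toy_* names are hypothetical and none of this is the raid10 code or the patch below.

```c
#include <stdbool.h>

/* Toy counters named after the trace fields (p = pending, b = barrier). */
struct toy_conf {
    int nr_pending;  /* requests past wait_barrier that have not completed */
    int barrier;     /* non-zero once the resync thread raised the barrier */
    int queued;      /* nested submissions parked until the outer call returns */
};

/* Returns true if the caller may proceed, false if it would have to sleep. */
static bool toy_wait_barrier(struct toy_conf *c)
{
    if (c->barrier > 0)
        return false;
    c->nr_pending++;
    return true;
}

/* A nested submission is only queued, so it cannot complete yet. */
static void toy_nested_submit(struct toy_conf *c)
{
    c->queued++;
}

/* Models one bio split into two halves inside a single outer call.
 * Returns true when the second half would sleep forever. */
static bool toy_split_request_deadlocks(bool resync_raises_barrier)
{
    struct toy_conf c = { 0, 0, 0 };

    toy_wait_barrier(&c);       /* first half passes: p becomes 1 */
    toy_nested_submit(&c);      /* parked, not issued */

    if (resync_raises_barrier)
        c.barrier = 1;          /* resync now waits for p to reach 0 */

    if (toy_wait_barrier(&c)) { /* second half */
        toy_nested_submit(&c);
        return false;           /* outer call returns, both halves get issued */
    }

    /* The second half sleeps inside the outer call, so the parked first
     * half is never issued, p never drops, and the barrier never falls. */
    return c.nr_pending > 0 && c.queued > 0;
}
```

With the barrier raised between the two halves the model ends in the same state as the trace (p:1 b:1) with nothing able to make progress; without it both halves go through.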

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 42e64e4..d1d6891 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -825,11 +825,29 @@ static int make_request(mddev_t *mddev, struct bio * bio)
 		 */
 		bp = bio_split(bio,
 			       chunk_sects - (bio->bi_sector & (chunk_sects - 1)) );
+
+		/* Each of these 'make_request' calls will call 'wait_barrier'.
+		 * If the first succeeds but the second blocks due to the resync
+		 * thread raising the barrier, we will deadlock because the

Neil Brown | 2 Aug 2010 05:01

Re: sw raid array completely hangs during verify in 2.6.32

On Sun, 01 Aug 2010 14:57:56 +0400
Michael Tokarev <mjt@tls.msk.ru> wrote:

> Hello.
> 
> This is the second time we have come across this issue
> since switching from 2.6.27 to 2.6.32 about 3
> months ago.
> 
> At some point, an md-raid10 array hangs - that
> is, all the processes that try to access it,
> either reading or writing, hang forever.

Thanks for the report.
This is the same problem that has been reported recently in a thread with
subject "Raid10 device hangs during resync and heavy I/O."

I have just posted a patch which should address it - I will include it here
as well.

Note that you need to be careful when reading the stack traces.  A "?" means
that the function named may not be in the actual call trace - it may just be
an old address that happens to still be on the stack.
In this case, the "mempool_alloc" was stray - nothing was actually blocking
on that.
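
As a small illustration of that reading rule, the helper below flags call-trace lines that carry the "?" marker. It is a sketch that assumes the "[<address>] ? symbol" layout seen in the report; trace_entry_is_reliable is a hypothetical name, not a kernel or tool function.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when a call-trace line has a symbol directly after the
 * "[<address>]" field, i.e. no "?" marker. Lines without an address
 * field (such as the "Call Trace:" header) return false. */
static bool trace_entry_is_reliable(const char *line)
{
    const char *p = strstr(line, ">]");

    if (p == NULL)
        return false;
    p += 2;                 /* skip past ">]" */
    while (*p == ' ')
        p++;                /* skip the separating spaces */
    return *p != '?' && *p != '\0';
}
```

In the report quoted above every frame carries a "?", so by this rule none of them is guaranteed to be part of the real call chain, which is why the trace has to be read with some care.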

This is the patch that I have proposed.

Thanks,
NeilBrown

Tejun Heo | 2 Aug 2010 16:12

[PATCH 1/2 block#for-linus] bio, fs: update READA and SWRITE to match the corresponding BIO_RW_* bits

Commit a82afdf (block: use the same failfast bits for bio and request)
moved BIO_RW_* bits around such that they match up with REQ_* bits.
Unfortunately, fs.h hard coded READ, WRITE, READA and SWRITE as 0, 1,
2 and 3, and expected them to match with BIO_RW_* bits.  READ/WRITE
didn't change but BIO_RW_AHEAD was moved to bit 4 instead of bit 1,
breaking READA and SWRITE.

This patch updates READA and SWRITE such that they match the BIO_RW_*
bits again.  A follow up patch will update the definitions to directly
use BIO_RW_* bits so that this kind of breakage won't happen again.

Stable: The offending commit a82afdf was released with v2.6.32, so
this patch should be applied to all kernels since then but it must
_NOT_ be applied to kernels earlier than that.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-bisected-by: Vladislav Bolkhovitin <vst@vlnb.net>
Root-caused-by: Neil Brown <neilb@suse.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: stable@kernel.org
---
Aieee... thanks for root causing it Neil.  That was a stupid bug.  I
knew that READ/WRITE were hardcoded but forgot about READA.  :-(
Moving BIO_RW_AHEAD back to bit 1 might be a better solution but I'm
afraid that would cause more confusion downstream.  This patch
updates READA and SWRITE to match BIO_RW_AHEAD and should also appear
in -stable releases.  The next patch will create bio_types.h and
define all constants in terms of BIO_RW_*.

Thanks.
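
The mismatch described in the commit message can be shown with a few lines of arithmetic. This is a sketch using only the numbers given above (READA hard-coded as 2, the readahead bit moving from bit 1 to bit 4); the TOY_* and toy_* names are hypothetical and are not the kernel's definitions.

```c
#include <stdbool.h>

/* Bit positions of the readahead flag, as described in the message. */
enum {
    TOY_AHEAD_BIT_OLD = 1,  /* before the flag rearrangement */
    TOY_AHEAD_BIT_NEW = 4   /* after it */
};

/* The value that was hard-coded for READA: 2, i.e. 1 << 1. */
#define TOY_READA_HARDCODED 2UL

/* True if the rw word has the readahead bit set at the given position. */
static bool toy_is_readahead(unsigned long rw, int ahead_bit)
{
    return (rw & (1UL << ahead_bit)) != 0;
}

/* A READA value derived from the bit position stays correct when the
 * position changes. */
static unsigned long toy_reada(int ahead_bit)
{
    return 1UL << ahead_bit;
}
```

With the old layout the hard-coded 2 is recognised as readahead; with the new layout it is not, because the flag now lives at 1 << 4 = 16. Deriving the constant from the bit position, as the follow-up patch is described as doing, removes that dependency on a magic number.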

Tejun Heo | 2 Aug 2010 16:15

[PATCH RESEND 2/2 block#for-linus] bio, fs: separate out bio_types.h and define READ/WRITE constants in terms of BIO_RW_* flags

linux/fs.h hard coded READ/WRITE constants which should match BIO_RW_*
flags.  This is fragile and caused breakage during BIO_RW_* flag
rearrangement.  The hardcoding is to avoid include dependency hell.

Create linux/bio_types.h which contains definitions for bio data
structures and flags and include it from bio.h and fs.h, and make fs.h
define all READ/WRITE related constants in terms of BIO_RW_* flags.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
---
 include/linux/bio.h       |  153 +-----------------------------------------
 include/linux/bio_types.h |  164 ++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h        |   17 ++--
 3 files changed, 176 insertions(+), 158 deletions(-)

Index: work/include/linux/bio.h
===================================================================
--- work.orig/include/linux/bio.h
+++ work/include/linux/bio.h
@@ -9,7 +9,7 @@
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
-
+ *
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
  *
@@ -28,6 +28,9 @@

Doug Ledford | 2 Aug 2010 20:41

Re: mdadm 3.1.x and bitmap chunk

On 06/09/2010 05:05 PM, Piergiorgio Sartor wrote:
> Hi folks,
> 
> I just noticed that mdadm 3.1.x sets the default
> bitmap chunk to 64MB.
> 
> Searching around, I found the reason should be
> related with write performances, which seem to
> be less compromised with large bitmap chunks.
> 
> I've no problem with that, but is it possible
> to set the bitmap chunk to the minimum size,
> without trying all the possible dimensions?
> 
> Thanks a lot in advance,
> 
> bye,
> 

This depends somewhat on what you mean by "minimum size" for the bitmap
chunk.  The bitmap chunk size can never be smaller than the sector size
of the device, so 512 bytes for most drives, 4k for newer drives.  But
such a chunk size would be horrible for performance as it would make
every single write synchronous on the bitmap update.  However, the
minimum size can also be limited by the available space for the bitmap
when using an internal bitmap.  So, you can only make the chunk as small
as it will go while the bitmap still fits between the end of the
{array,superblock} and the start of the {superblock,array}.
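
To make the space limit concrete, here is a rough sketch of the arithmetic. It assumes the on-disk bitmap uses one bit per chunk, that chunk sizes are tried in powers of two starting at the sector size, and it ignores the bitmap's own header; min_bitmap_chunk_bytes and its parameters are hypothetical names, and this is not how mdadm itself picks a value.

```c
#include <stdint.h>

/* Smallest power-of-two multiple of sector_bytes such that a bitmap with
 * one bit per chunk fits into bitmap_space_bytes. Returns 0 when there
 * is no space or the sector size is invalid. */
static uint64_t min_bitmap_chunk_bytes(uint64_t device_bytes,
                                       uint64_t bitmap_space_bytes,
                                       uint64_t sector_bytes)
{
    uint64_t max_bits;
    uint64_t chunk;

    if (bitmap_space_bytes == 0 || sector_bytes == 0)
        return 0;

    max_bits = bitmap_space_bytes * 8;   /* one bit per chunk */
    chunk = sector_bytes;

    /* Double the chunk until the number of chunks (rounded up) fits. */
    while (device_bytes / chunk + (device_bytes % chunk != 0) > max_bits)
        chunk *= 2;

    return chunk;
}
```

For example, a 1 TiB device with 128 KiB available for the bitmap gives 2^20 bits, so under these assumptions the chunk cannot go below 1 MiB, even though the sector size alone would allow 512 bytes.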

However, in general, a bitmap that's small enough that you can sync the

Justin Bronder | 2 Aug 2010 22:37

Re: Raid10 device hangs during resync and heavy I/O.

On 02/08/10 12:58 +1000, Neil Brown wrote:
> On Mon, 2 Aug 2010 12:29:49 +1000
> Neil Brown <neilb@suse.de> wrote:
> 
> 
> > Ahhhh.... I see the problem.  Because a 'generic_make_request' is already
> > active, the one called by raid10::make_request just queues the request until
> > the top level one completes.   This results in a deadlock.
> > 
> > I'll have to ponder a bit to figure out the best way to fix this.
> > 
> 
> So, one good strong cup of tea later I think I have a good solution.
> 
> Would you be able to test with this patch and confirm that you cannot
> reproduce the hang?

I've been running with this patch on 2.6.34.1 all day and have yet to cause
the hang.  Given it took under 5 minutes earlier, feel free to add:

Tested-by:  Justin Bronder <jsbronder@gentoo.org>

I really appreciate you taking care of this.  Thanks.

> Thanks.
> 
> NeilBrown
> 
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index 42e64e4..d1d6891 100644