David Rees | 1 Jan 10:56 2008

Re: Last ditch plea on remote double raid5 disk failure

On Dec 31, 2007 2:39 AM, Marc MERLIN <merlin <at> gmail.com> wrote:
> new years eve :(  I was wondering if I can tell the kernel not to kick
> a drive out of an array if it sees a block error and just return the
> block error upstream, but continue otherwise (all my partitions are on
> a raid5 array, with lvm on top, so even if I were to lose a partition,
> I would still be likely to get the other ones back up if I can stop
> the auto kicking-out and killing the md array feature).

Your best bet is to get a new drive into the machine that is at least the
same size as the bad-sector disk, then use dd_rescue[1] to copy as much of
the bad-sector disk as possible to the new one.

Remove the bad-sector disk, reboot and hopefully you'll have a
functioning raid array with a bit of bad data on it somewhere.

I'm probably missing a step somewhere but you get the general idea...
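
Roughly, something along these lines (only a sketch -- /dev/sdX for the
failing disk and /dev/sdY for the new one are placeholder names, and the
exact assembly step depends on how the array is set up):

  # Copy as much as possible from the failing disk to the new one;
  # unlike plain dd, dd_rescue keeps going past read errors.
  dd_rescue /dev/sdX /dev/sdY

  # With the failing disk physically removed, reboot or reassemble the
  # array from the copy plus the remaining members; --force may be
  # needed if the event counts on the members no longer match.
  mdadm --assemble --force /dev/md0 /dev/sdY <remaining member devices>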

-Dave

[1] http://www.garloff.de/kurt/linux/ddrescue/

Jon Nelson | 2 Jan 22:15 2008

stopped array, but /sys/block/mdN still exists.

This isn't a high priority issue or anything, but I'm curious:

I --stop(ped) an array but /sys/block/md2 remained largely populated.
Is that intentional?

-- 
Jon

Neil Brown | 3 Jan 05:44 2008

Re: stopped array, but /sys/block/mdN still exists.

On Wednesday January 2, jnelson-linux-raid <at> jamponi.net wrote:
> This isn't a high priority issue or anything, but I'm curious:
> 
> I --stop(ped) an array but /sys/block/md2 remained largely populated.
> Is that intentional?

It is expected.
Because of the way that md devices are created (just by opening the
device-special file), it is very hard to make them disappear in a
race-free manner.  I tried once and failed.  It is probably getting
close to time to try again, but as you say: it isn't a high priority issue.

NeilBrown

Neil Brown | 3 Jan 22:32 2008

Re: stopped array, but /sys/block/mdN still exists.

On Thursday January 3, davidsen <at> tmr.com wrote:
> 
> So what happens if I try to _use_ that /sys entry? For instance run a 
> script which reads data, or sets the stripe_cache_size higher, or 
> whatever? Do I get back status, ignored, or system issues?

Try it:-)

The stripe_cache_size attribute will disappear (it is easy to remove
attributes, and stripe_cache_size is only meaningful for certain raid
levels).
Other attributes will return 0 or some equivalent, though I think
chunk_size will still show the old value.
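
For instance, a quick way to poke at it (md2 is just the example from the
original report; exactly which attributes survive may vary by kernel
version):

  mdadm --stop /dev/md2

  ls /sys/block/md2/md/                    # directory is still there
  cat /sys/block/md2/md/stripe_cache_size  # gone after the stop
  cat /sys/block/md2/md/raid_disks         # expect 0 or similar
  cat /sys/block/md2/md/chunk_size         # may still show the old value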

NeilBrown


Neil Brown | 3 Jan 23:37 2008

Re: PROBLEM: RAID5 reshape data corruption

On Monday December 31, nagilum <at> nagilum.org wrote:
> Ok, since my previous thread didn't seem to attract much attention,
> let me try again.

Thank you for your report and your patience.

> An interrupted RAID5 reshape will cause the md device in question to
> contain one corrupt chunk per stripe if resumed in the wrong manner.
> A testcase can be found at http://www.nagilum.de/md/ .
> The first testcase can be initialized with "start.sh"; the real test
> can then be run with "test.sh". The first testcase also uses dm-crypt
> and xfs to show the corruption.

It looks like this can be fixed with the patch:

Signed-off-by: Neil Brown <neilb <at> suse.de>

### Diffstat output
 ./drivers/md/raid5.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2008-01-04 09:20:54.000000000 +1100
+++ ./drivers/md/raid5.c	2008-01-04 09:21:05.000000000 +1100
@@ -2865,7 +2865,7 @@ static void handle_stripe5(struct stripe
 		md_done_sync(conf->mddev, STRIPE_SECTORS, 1);
 	}

-	if (s.expanding && s.locked == 0)
+	if (s.expanding && s.locked == 0 && s.req_compute == 0)

NeilBrown | 3 Jan 23:46 2008

[PATCH] md: Fix data corruption when a degraded raid5 array is reshaped.

This patch fixes a fairly serious bug in md/raid5 in 2.6.23 and 24-rc.
It would be great if it could get into 23.13 and 24.final.
Thanks.
NeilBrown

### Comments for Changeset

We currently do not wait for the block from the missing device
to be computed from parity before copying data to the new stripe
layout.

The change in the raid6 code is not technically needed as we
don't delay data block recovery in the same way for raid6 yet.
But making the change now is safer long-term.

This bug exists in 2.6.23 and 2.6.24-rc

Cc: stable <at> kernel.org
Cc: Dan Williams <dan.j.williams <at> intel.com>
Signed-off-by: Neil Brown <neilb <at> suse.de>

### Diffstat output
 ./drivers/md/raid5.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2008-01-04 09:42:05.000000000 +1100
+++ ./drivers/md/raid5.c	2008-01-04 09:42:27.000000000 +1100
@@ -2865,7 +2865,7 @@ static void handle_stripe5(struct stripe
 		md_done_sync(conf->mddev, STRIPE_SECTORS, 1);

Dan Williams | 4 Jan 00:00 2008

Re: [PATCH] md: Fix data corruption when a degraded raid5 array is reshaped.

On Thu, 2008-01-03 at 15:46 -0700, NeilBrown wrote:
> This patch fixes a fairly serious bug in md/raid5 in 2.6.23 and 24-rc.
> It would be great if it could get into 23.13 and 24.final.
> Thanks.
> NeilBrown
> 
> ### Comments for Changeset
> 
> We currently do not wait for the block from the missing device
> to be computed from parity before copying data to the new stripe
> layout.
> 
> The change in the raid6 code is not technically needed as we
> don't delay data block recovery in the same way for raid6 yet.
> But making the change now is safer long-term.
> 
> This bug exists in 2.6.23 and 2.6.24-rc
> 
> Cc: stable <at> kernel.org
> Cc: Dan Williams <dan.j.williams <at> intel.com>
> Signed-off-by: Neil Brown <neilb <at> suse.de>
> 
Acked-by: Dan Williams <dan.j.williams <at> intel.com>

Dan Williams | 4 Jan 00:40 2008

Re: [PATCH] md: Fix data corruption when a degraded raid5 array is reshaped.

On Thu, 2008-01-03 at 16:00 -0700, Williams, Dan J wrote:
> On Thu, 2008-01-03 at 15:46 -0700, NeilBrown wrote:
> > This patch fixes a fairly serious bug in md/raid5 in 2.6.23 and
> 24-rc.
> > It would be great if it could get into 23.13 and 24.final.
> > Thanks.
> > NeilBrown
> >
> > ### Comments for Changeset
> >
> > We currently do not wait for the block from the missing device
> > to be computed from parity before copying data to the new stripe
> > layout.
> >
> > The change in the raid6 code is not technically needed as we
> > don't delay data block recovery in the same way for raid6 yet.
> > But making the change now is safer long-term.
> >
> > This bug exists in 2.6.23 and 2.6.24-rc
> >
> > Cc: stable <at> kernel.org
> > Cc: Dan Williams <dan.j.williams <at> intel.com>
> > Signed-off-by: Neil Brown <neilb <at> suse.de>
> >
> Acked-by: Dan Williams <dan.j.williams <at> intel.com>
> 

On closer look the safer test is:

	!test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending).

Clayton Bell | 4 Jan 01:20 2008

md raid1 in active-active arrangement


Is it reasonable for a raid1 mirror to be momentarily active on two servers at the same time?

This question has come about from Xen virtual host live migrations.

Consider the case with shared storage between two servers:

Server A: /dev/md0 is raid1 to devices /dev/mapper/wwidX and /dev/mapper/wwidY
Server B: /dev/md0 is raid1 to devices /dev/mapper/wwidX and /dev/mapper/wwidY

/dev/md0 is assigned to the Xen virtual host (ie it becomes /dev/sda within the virtual host)

When live migrating the virtual host from Server A to Server B, /dev/md0 must be active on both server A and B
at the same time, at least momentarily.
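
In mdadm terms the window in question would look roughly like this (a
sketch of the scenario only, using the device names above -- not a claim
that doing this is safe, which is exactly the question):

  # Server A: md0 already assembled and in use by the virtual host
  mdadm --assemble /dev/md0 /dev/mapper/wwidX /dev/mapper/wwidY

  # Server B, during the live migration: the same members are
  # assembled a second time while Server A still has md0 active
  mdadm --assemble /dev/md0 /dev/mapper/wwidX /dev/mapper/wwidY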

Is md going to cope with this kind of setup?  Under what conditions will it fail dismally?

To me it would seem as though the migration can only occur when:
1. the mirrors are in sync
2. Server A and B use their own/separate external bitmap file for /dev/md0

Even if the time during which both /dev/md0's are active is kept to less than a second, what data
corruption could possibly occur?

Feedback much appreciated.  I hope the linux-raid group finds this situation at least a little interesting.

Thank you

Clayton


Neil Brown | 4 Jan 01:41 2008

Re: [PATCH] md: Fix data corruption when a degraded raid5 array is reshaped.

On Thursday January 3, dan.j.williams <at> intel.com wrote:
> 
> On closer look the safer test is:
> 
> 	!test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending).
> 
> The 'req_compute' field only indicates that a 'compute_block' operation
> was requested during this pass through handle_stripe so that we can
> issue a linked chain of asynchronous operations.
> 
> ---
> 
> From: Neil Brown <neilb <at> suse.de>

Technically that should probably be
  From: Dan Williams <dan.j.williams <at> intel.com>

now, and then I add
  Acked-by: NeilBrown <neilb <at> suse.de>

because I completely agree with your improvement.

We should keep an eye out for when Andrew commits this and make sure
the right patch goes in...

Thanks,
NeilBrown
