Daniel Pittman | 1 Mar 09:52 2009

MD software RAID1 vs suspend-to-disk

G'day.

I have a random desktop machine here, running Debian/sid with a 2.6.26
Debian kernel.  It has a two disk software RAID1, and apparently passes
through a suspend/resume cycle correctly, but...

The "but" is that while booting to resume, it warned (in the initrd) that
the RAID array was unclean and would be rebuilt.

After resuming, however, the array was listed as clean, but (damn) it
wasn't; checking the array[1] reported that there were 48800 errors, and
a repair claimed to fix them.
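
The check and repair mentioned here are presumably done through md's
sync_action interface; a minimal sketch, assuming the array is md0:

    echo check  > /sys/block/md0/md/sync_action   # read both copies of the mirror and compare them
    cat /sys/block/md0/md/mismatch_cnt            # count of sectors that did not match
    echo repair > /sys/block/md0/md/sync_action   # rewrite mismatched blocks so the copies agree

(The device name is an assumption, not taken from the post.)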

That makes me suspect that something went wrong with shutting down the
array during the suspend process: given it is the array with / mounted,
it could still be busy, and possibly unclean.

Then, resuming detects that, starts to correct it, switches back to the
previous kernel and ... voilà, the saved "clean" state is restored,
unaware that the array was out of sync or that anything had changed under
it.

I don't know quite enough about the suspend/resume implementation to
know if this is a problem, or just likely to be, or some quirk of this
system.

It does concern me, though, so: should I expect suspend on MD RAID1 to
work, cleanly, in all cases?

Regards,
(Continue reading)

Kasper Sandberg | 1 Mar 15:14 2009

Re: System hangs on raid md recovery/resync - revisit

On Sat, 2009-02-28 at 07:04 -0500, Justin Piszcz wrote:
> 
> On Sat, 28 Feb 2009, Brad wrote:
> 
> > On Sat, Feb 28, 2009 at 7:08 PM, Justin Piszcz <jpiszcz <at> lucidpixels.com> wrote:
> >>
> >> On Sat, 28 Feb 2009, Brad wrote:
> >>
> >>> Hi.  I'd like to revisit a problem I put to the mailing list on the
<snip>
> > I've had another problem with the Realtek network driver ... under network
> > load it seemed to miss interrupts and/or pass them to the IDE driver, which
> > would print out errors about unexpected/unknown interrupts.  I had to take
> > IDE out of my kernel.
> Correct; buy an Intel 1Gbps PCIe card. I do for all of my main machines
> that do not have Intel NICs, and it solves the problem.  They are $30-40, and
> then all of your network issues will be solved.
> 
> >
I have a Gigabyte X48 board with two of those Realtek NICs, and apart
from some driver troubles which the r8169 maintainer fixed for me, I've
had no issues with it.

I suggest contacting the maintainer if you really believe it's the NIC
and/or the driver.

<snip>
> Justin.

--
(Continue reading)

Andrey Falko | 1 Mar 19:27 2009

Re: Trouble recovering raid5 array

On Fri, Feb 27, 2009 at 11:56 PM, Andrey Falko <ma3oxuct <at> gmail.com> wrote:
> Hi everyone,
>
> I'm having some strange problems putting one of my raid5 arrays back
> together. Here is the background story:
>
> I have 4 drives partitioned into a bunch of raid arrays. One of the
> drives failed and I replaced it with a new one. I was able to get
> mdadm to recover all arrays, except one raid5 array. The array with
> troubles is /dev/md8 and it is supposed to have /dev/sd[abcd]13 under
> it.
>
> This command started the recovery process (same thing that worked for
> my other raid5 arrays):
> mdadm --manage --add /dev/md8 /dev/sdc13
>
> md8 : active raid5 sdc13[4] sdd13[3] sdb13[1] sda13[0]
>       117185856 blocks level 5, 64k chunk, algorithm 2 [4/3] [UU_U]
>       [>....................]  recovery =  1.6% (634084/39061952) finish=12.1min speed=52840K/sec
>
> However, sometime after 1.6% into the recovery, I did a "cat /proc/mdstat" and saw:
>
> md8 : active raid5 sdc13[4](S) sdd13[3] sdb13[1] sda13[5](F)
>       117185856 blocks level 5, 64k chunk, algorithm 2 [4/2] [_U_U]
>
> /dev/sda only "failed" for this array and not for any of the other
> arrays. I proceeded to try removing and re-adding /dev/sda13 and
> /dev/sdc13, but that did not work. I ran the following:
>
(Continue reading)

Michał Przyłuski | 1 Mar 20:12 2009

Re: Raid6 write performance

Hello,

2009/2/28 H. Peter Anvin <hpa <at> zytor.com>:
> Peter Rabbitson wrote:
>> Hi,
>>
>> I am experimenting with raid6 on 4 drives on 2.6.27.11. The problem I am
>> having is that no matter what chunk size I use, the write benchmark
>> always comes out at single drive speed, although I should be seeing
>> double drive speed (read speed is at near 4x as expected).
>
> I have no idea why you "should" be seeing double drive speed.  All
> drives have to be written, so you'd logically see single drive speed.

I'm afraid that might be incorrect.

Let's assume we want to write 100MB of data onto a 4-drive raid6, and
divide that 100MB into two parts, say A and B, each 50MB big.
Writing the data to the raid would mean writing:
* A on disk1
* B on disk2
* XOR(A,B) on disk3
* Q(A,B) on disk4
That is assuming a 50MB chunk size, whole-chunk writes, etc.
Each of the written portions would be 50MB in size. That sounds
reasonable to me, as with 2 data disks, only half of the data has to be
written to each. The fact that the disks are really striped with data, XOR
and Q doesn't change the picture in terms of the amount written.
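
A quick back-of-the-envelope version of that arithmetic (illustrative
numbers only, not a benchmark):

    ndisks=4; total_mb=100
    data_disks=$((ndisks - 2))                # raid6 spends two disks' worth on P and Q
    per_disk_mb=$((total_mb / data_disks))
    echo "each disk writes ${per_disk_mb}MB"  # 50MB, i.e. roughly 2x single-drive write throughput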

I do hope I had understood the situation correctly, but I'll be ever
(Continue reading)

Peter Rabbitson | 1 Mar 20:19 2009

Re: Raid6 write performance

H. Peter Anvin wrote:
> Peter Rabbitson wrote:
>> Hi,
>>
>> I am experimenting with raid6 on 4 drives on 2.6.27.11. The problem I am
>> having is that no matter what chunk size I use, the write benchmark
>> always comes out at single drive speed, although I should be seeing
>> double drive speed (read speed is at near 4x as expected).
> 
> I have no idea why you "should" be seeing double drive speed.  All
> drives have to be written, so you'd logically see single drive speed.
> 

Because with properly adjusted elevators and chunk sizes it is reasonable
to expect N * S write speed from _any_ raid, where N is the number of
distinct data-bearing disks in a stripe and S is the speed of a single
drive (assuming the drive speeds are equal). So for raid5 we have N =
numdisks-1, for raid6 numdisks-2, for raid10 -n4 -pf3 we get 4-(3-1), and
so on. I have personally verified the write behavior for raid10 and raid5,
and I don't see why it should be different for raid6.
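
A crude way to test that expectation, assuming hypothetical device names
and a scratch array (writing to the raw devices destroys their contents):

    # sequential write to the array, bypassing the page cache
    dd if=/dev/zero of=/dev/md0 bs=1M count=2048 oflag=direct
    # the same against a single member disk, to get the baseline S
    dd if=/dev/zero of=/dev/sdb bs=1M count=2048 oflag=direct

With a 4-drive raid6 and full-stripe writes, the first figure should come
out near twice the second.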

In any case, the problem I witnessed was due to a hardware misconfiguration,
which has not been resolved to this day. Thus this thread is dead :)

H. Peter Anvin | 1 Mar 21:35 2009

Re: Raid6 write performance

Michał Przyłuski wrote:
> 
> I'm afraid that might be incorrect.
> 
> Let's assume we want to write 100MB of data onto a 4 drive raid6.
> Let's divide 100MB of data into two parts, say A and B, each 50MB big.
> Writing the data on the raid, would mean writing:
> * A on disk1
> * B on disk2
> * XOR(A,B) on disk3
> * Q(A,B) on disk4
> That is actually assuming 50MB chunk, and whole chunk writes, etc.
> Each of written portions would have been 50MB in size. That sounds
> reasonable to me, as with 2 data disks, only half of data has to be
> written on each. The fact that disks are really striped with data, XOR
> and Q doesn't change the image in terms of amount written.
> 
> I do hope I had understood the situation correctly, but I'll be ever
> happy to be proved wrong.
> 

Ah, sorry, yes you're of course right.  I was thinking about latency,
not throughput, for some idiotic reason.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.

(Continue reading)

John Robinson | 2 Mar 01:42 2009

Re: MD software RAID1 vs suspend-to-disk

On 01/03/2009 08:52, Daniel Pittman wrote:
> I have a random desktop machine here, running Debian/sid with a 2.6.26
> Debian kernel.  It has a two disk software RAID1, and apparently passes
> through a suspend/resume cycle correctly, but...

I'm not sure if this is the same suspend/resume - there are after all 
several Googleable reasons why one might suspend or resume various 
things - but it might be worth a look at NeilB's recent post of a patch 
to "hopefully enable suspend/resume of md devices": 
http://marc.info/?l=linux-raid&m=123440845819870&w=2

Cheers,

John.


Daniel Pittman | 2 Mar 03:23 2009

Re: MD software RAID1 vs suspend-to-disk

John Robinson <john.robinson <at> anonymous.org.uk> writes:
> On 01/03/2009 08:52, Daniel Pittman wrote:
>
>> I have a random desktop machine here, running Debian/sid with a 2.6.26
>> Debian kernel.  It has a two disk software RAID1, and apparently passes
>> through a suspend/resume cycle correctly, but...
>
> I'm not sure if this is the same suspend/resume - there are after all several
> Googleable reasons why one might suspend or resume various things - but it
> might be worth a look at NeilB's recent post of a patch to "hopefully enable
> suspend/resume of md devices":
> http://marc.info/?l=linux-raid&m=123440845819870&w=2

No, that appears to be about suspending and resuming access to the
MD device while reconfiguring it; I don't /think/ that is accessed
during a system-wide suspend/resume (aka hibernate, or s2disk) cycle.

Certainly, from my reading of the code, it doesn't look like that path is
invoked for this.

Regards,
        Daniel


NeilBrown | 2 Mar 03:58 2009

Re: MD software RAID1 vs suspend-to-disk

On Mon, March 2, 2009 1:23 pm, Daniel Pittman wrote:
> John Robinson <john.robinson <at> anonymous.org.uk> writes:
>> On 01/03/2009 08:52, Daniel Pittman wrote:
>>
>>> I have a random desktop machine here, running Debian/sid with a 2.6.26
>>> Debian kernel.  It has a two disk software RAID1, and apparently passes
>>> through a suspend/resume cycle correctly, but...
>>
>> I'm not sure if this is the same suspend/resume - there are after all
>> several Googleable reasons why one might suspend or resume various
>> things - but it might be worth a look at NeilB's recent post of a patch
>> to "hopefully enable suspend/resume of md devices":
>> http://marc.info/?l=linux-raid&m=123440845819870&w=2
>
> No, that appears to be about suspending and resuming access to the
> MD device while reconfiguring it; I don't /think/ that is accessed
> during a system-wide suspend/resume (aka hibernate, or s2disk) cycle.
>
> Certainly, it doesn't look like the path is invoked for that from my
> reading of the code.

Correct, they are completely unrelated.

I have never tried hibernating to an md array, but I think others have,
though I don't have a lot of specifics.

One observation is that you really don't want resync to start before the
(Continue reading)

Brian Manning | 2 Mar 03:42 2009

Raid-5 Reshape Gone Bad

I've been running an MD three-drive raid-5 for a while now with no problems
on a CentOS 5.2 i386 box.  Yesterday I attempted to add a fourth drive to the
array and grow it.  This is where things got ugly....

It began the reshape as expected.  Some hours later I rebooted the box for
another reason entirely, forgetting about the reshape that was still going
on.  But it was a clean shutdown and md stopped just fine, so I wasn't too
worried about it; I knew it would just pick up again once it booted.

After startup the kernel found the md and said it was going to resume the
reshape... then it came time for the kernel to mount root, and it hung
scanning for Logical Volumes.  I left it for over an hour and it never
proceeded past this stage.  The disk I/O light was off, nothing going on.

My entire OS save /boot is on the raid-5, split across several LVM2 logical
volumes inside that md device.  It's always worked fine for me in the past.

But now LVM is hanging on boot and I can't even get into single-user mode or
anything like that.  So I bring out the boot disc and go into rescue mode.

I check the raid status and everything looks okay, so I manually start the MD
again from the boot CD, and that fires up as expected.  However... when I
look at /proc/mdstat, the speed is 0KB/sec and the ETA is growing by
hundreds of minutes a second.

I let this go for about 2 hours and nothing ever happens: the speed is 0,
the disk I/O light is off, nothing is happening.
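
A few things that may be worth checking from the rescue environment in a
situation like this (device names are assumptions, and this is a sketch
rather than a known fix):

    cat /proc/mdstat                                  # reshape position, speed and state
    cat /sys/block/md0/md/sync_action                 # should read "reshape" while it is running
    mdadm --examine /dev/sd[abcd]1 | grep -i reshape  # per-member reshape checkpoint in the superblocks
    dmesg | tail -n 50                                # any md or controller errors since assembly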

Any process that attempts to look at or use md0 will "freeze" just like at
(Continue reading)

