Brad Campbell | 1 Mar 06:34 2005

Re: Raid-6 hang on write.

Neil Brown wrote:
> On Friday February 25, brad <at> wasp.net.au wrote:
> 
>>Turning on debugging in raid6main.c and md.c makes it much harder to hit, so I'm assuming it is
>>something timing-related.
>>
>>raid6d --> md_check_recovery --> generic_make_request --> make_request --> get_active_stripe
> 
> 
> Yes, there is a real problem here.  I'll see if I can figure out the best
> way to remedy it...
> However, I think you reported this problem against a non "-mm" kernel,
> and the path from md_check_recovery to generic_make_request only
> exists in "-mm".
> 
> Could you please confirm if there is a problem with
>     2.6.11-rc4-bk4 -> bk10

There is (was). I have three kernels I was testing against: 2.6.11-rc4-bk4, 2.6.11-rc4-bk10 and 
2.6.11-rc4-mm1. I moved on to 2.6.11-rc4-mm1 for my main debugging (inserting lots of printks and 
generally doing stuff that was going to crash). I hope to reproduce the faults against the vanilla 
2.6.11-rc4 kernels, and I'm now testing with 2.6.11-rc5-bk2.

As per the original bug report, 2.6.11-rc4-bk(4/10) locked up in [<c0268574>] 
get_active_stripe+0x224/0x260, although unlike -mm1 I'm not sure of the sequence of events that 
caused it, and it's nowhere near as easy to hit. I am willing to investigate as time allows, however.

I'm testing 2.6.11-rc5-bk2 now and it is, of course, flatly refusing to misbehave. I'll keep beating 
on it for a couple of days, and after writing 3TB with 2.6.11-rc5-bk2 I'll go back to the older 
kernels and try to reproduce the failure there. It *did* lock up 4 times in a row in get_active_stripe on the

Brad Campbell | 1 Mar 07:58 2005

Re: Raid-6 hang on write.

Neil Brown wrote:
> 
> Could you please confirm if there is a problem with
>     2.6.11-rc4-bk4 -> bk10
> 
> as reported, and whether it seems to be the same problem.

Ok.. are we all ready? I had applied your development patches to all my vanilla 2.6.11-rc4-* 
kernels. Thus they all exhibited the same problem in the same way as -mm1. <Smacks forehead against 
wall repeatedly>

I had applied the patch to correct the looping resync issue with too many failed drives, and then 
just continued and applied all the other patches as well.

I have been unable to reproduce the fault using a vanilla 2.6.11-rc5-bk2 kernel.

Oh well, at least we now know about a bug in the -mm patches.

Regards,
Brad
-- 
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams


Neil Brown | 1 Mar 10:18 2005

Re: Raid-6 hang on write.

On Tuesday March 1, brad <at> wasp.net.au wrote:
> Neil Brown wrote:
> > 
> > Could you please confirm if there is a problem with
> >     2.6.11-rc4-bk4 -> bk10
> > 
> > as reported, and whether it seems to be the same problem.
> 
> Ok.. are we all ready? I had applied your development patches to all my vanilla 2.6.11-rc4-* 
> kernels. Thus they all exhibited the same problem in the same way as -mm1. <Smacks forehead against 
> wall repeatedly>

Thanks for following through with this so we know exactly where the
problem is ... and isn't.  And admitting your careless mistake in
public is a great example to all the rest of us who are too shy to do
so - thanks :-)

> 
> Oh well, at least we now know about a bug in the -mm patches.
> 

Yes, and it is very helpful to know.  Thanks again.

NeilBrown


Nicola Fankhauser | 1 Mar 14:24 2005

non-optimal RAID 5 performance with 8 drive array

Hi all

I have a RAID 5 array consisting of 8 300GB Maxtor SATA drives 
(6B300S0), hooked up to an Asus A8N-SLI Deluxe motherboard with 4 NForce4 
SATA ports and 4 SiI 3114 ports.

See [3] for a description of what I did and more details.

Each single disk in the array gives a read performance (tested with dd) 
of about 62MB/s (when reading the first 4GiB of the disk).

The array (reading the first 8GiB from /dev/md0 with dd, bs=1024K) 
performs at about 174MiB/s, while accessing the array through LVM2 
(still with bs=1024K) yields only 86MiB/s.
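
Roughly, the read tests looked like this (the device names below are only 
examples; the exact setup is described in [3]):

dd if=/dev/sda of=/dev/null bs=1024k count=4096      # one member disk, first 4GiB
dd if=/dev/md0 of=/dev/null bs=1024k count=8192      # the md array, first 8GiB
dd if=/dev/vg0/lv0 of=/dev/null bs=1024k count=8192  # the same read through the LVM2 volume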

My first conclusion was to leave LVM2 out of the loop and put a file 
system directly on /dev/md0.

However, when reading Neil Brown's article [1] I got the impression 
that my system should perform better, given that each disk has 
a transfer rate of at least 37MiB/s and at most 65MiB/s [2].

Or is there a flaw in my reasoning, and the current performance is 
normal? Are the controllers saturating after all?
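
As a back-of-the-envelope check (assuming an ideal sequential read that 
gets seven disks' worth of data per stripe out of the 8-disk RAID 5, which 
is certainly optimistic):

7 data disks * 62MB/s   ~= 434MB/s   theoretical ceiling
measured on /dev/md0    ~= 174MiB/s  (well under half of that)
measured through LVM2   ~=  86MiB/s  (about half of the md figure again)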

Any suggestions are welcome.

regards
nicola


Robin Bowes | 1 Mar 18:53 2005

Re: non-optimal RAID 5 performance with 8 drive array

Nicola Fankhauser wrote:
> see [3] for a description of what I did and more details.

Hi Nicola,

I read your description with interest.

I thought I'd try some speed tests myself but dd doesn't seem to work 
the same for me (on FC3). Here's what I get:

[root <at> dude test]# dd if=/dev/zero of=/home/test/test.tmp bs=4096 
count=100000
100000+0 records in
100000+0 records out

Notice there is no timing information.

For the read test:

[root <at> dude test]# dd of=/dev/null if=/home/test/test.tmp bs=4096
100000+0 records in
100000+0 records out

Again, no timing information.

Anyone know if this is a quirk of the FC3 version of dd?

R.
-- 
http://robinbowes.com

Roberto Fichera | 1 Mar 19:04 2005

Re: non-optimal RAID 5 performance with 8 drive array

At 18.53 01/03/2005, Robin Bowes wrote:
>Nicola Fankhauser wrote:
>>see [3] for a description of what I did and more details.
>
>Hi Nicola,
>
>I read your description with interest.
>
>I thought I'd try some speed tests myself but dd doesn't seem to work the 
>same for me (on FC3). Here's what I get:
>
>[root <at> dude test]# dd if=/dev/zero of=/home/test/test.tmp bs=4096 count=100000
>100000+0 records in
>100000+0 records out
>
>Notice there is no timing information.
>
>For the read test:
>
>[root <at> dude test]# dd of=/dev/null if=/home/test/test.tmp bs=4096
>100000+0 records in
>100000+0 records out
>
>Again, no timing information.
>
>Anyone know if this is a quirk of the FC3 version of dd?

you have to use:

time dd if=/dev/zero of=/home/test/test.tmp bs=4096 count=100000
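
Then work the throughput out by hand from the byte count and the "real" time 
it reports, e.g. (the 5.2s below is only an example figure, not a measurement):

# 100000 blocks * 4096 bytes = 409600000 bytes written
# 409600000 bytes / 5.2s of "real" time  ~=  78MB/s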

Robin Bowes | 1 Mar 19:12 2005

Re: non-optimal RAID 5 performance with 8 drive array

Roberto Fichera wrote:
> At 18.53 01/03/2005, Robin Bowes wrote:
>> [root <at> dude test]# dd if=/dev/zero of=/home/test/test.tmp bs=4096 
>> count=100000
>> 100000+0 records in
>> 100000+0 records out
>>
>> Notice there is no timing information.
> 
> you have to use:
> 
> time dd if=/dev/zero of=/home/test/test.tmp bs=4096 count=100000

Roberto,

That's not what I meant - I know I can use "time" to get CPU usage 
information.

I meant that I don't get the disk-speed summary that Nicola got, e.g.:

me <at> beast:$ dd if=/dev/zero of=/storagearray/test.tmp bs=4096
1238865+0 records in
1238864+0 records out
5074386944 bytes transferred in 63.475536 seconds (79942404 bytes/sec)

R.
-- 
http://robinbowes.com


Roberto Fichera | 1 Mar 19:41 2005

Re: non-optimal RAID 5 performance with 8 drive array

At 19.12 01/03/2005, Robin Bowes wrote:
>Roberto Fichera wrote:
>>At 18.53 01/03/2005, Robin Bowes wrote:
>>>[root <at> dude test]# dd if=/dev/zero of=/home/test/test.tmp bs=4096 
>>>count=100000
>>>100000+0 records in
>>>100000+0 records out
>>>
>>>Notice there is no timing information.
>>you have to use:
>>time dd if=/dev/zero of=/home/test/test.tmp bs=4096 count=100000
>
>Robert,
>
>That's not what I meant - I know I can use "time" to get CPU usage 
>information.
>
>I was meaning that I don't get the disk speed summary like Nicola did, e.g.:
>
>me <at> beast:$ dd if=/dev/zero of=/storagearray/test.tmp bs=4096
>1238865+0 records in
>1238864+0 records out
>5074386944 bytes transferred in 63.475536 seconds (79942404 bytes/sec)

Oops!!! Sorry! Maybe some recent coreutils versions have this nice feature :-)!

>R.
>--
>http://robinbowes.com
>

Heinz Mauelshagen | 1 Mar 20:11 2005

*** Announcement: dmraid 1.0.0.rc6 ***


               *** Announcement: dmraid 1.0.0.rc6 ***

dmraid 1.0.0.rc6 is available at
http://people.redhat.com:/heinzm/sw/dmraid/ in source tarball,
source rpm and i386 rpms (shared, static and dietlibc).

This release introduces support for VIA Software RAID.

dmraid (Device-Mapper Raid tool) discovers, [de]activates and displays
properties of software RAID sets (i.e. ATARAID) and contained DOS
partitions using the device-mapper runtime of the 2.6 kernel.
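
Typical invocations look roughly like this (see the man page and the README 
in the tarball for the authoritative option list):

dmraid -l     # list the supported metadata formats
dmraid -r     # discover ATARAID devices and report their metadata
dmraid -s     # show properties of the discovered RAID sets
dmraid -ay    # activate all discovered RAID sets via device-mapper
dmraid -an    # deactivate them again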

The following ATARAID types are supported on Linux 2.6:

Highpoint HPT37X
Highpoint HPT45X
Intel Software RAID
LSI Logic MegaRAID
NVidia NForce
Promise FastTrack
Silicon Image Medley
VIA Software RAID	*** NEW ***

Please provide insight so that those metadata formats can be supported completely.

Thanks.

See the files README and CHANGELOG, which come with the source tarball, for
prerequisites to run this software and further instructions on installing

Nicola Fankhauser | 1 Mar 20:54 2005

dd version I used is 5.2.1, but...

hi

The version of dd I used is 5.2.1 (Debian testing), but does anybody 
have an idea about my performance question?

regards
nicola

