Keld Jørn Simonsen | 1 Mar 2008 03:05

Re: Understanding bonnie++ results

On Fri, Feb 29, 2008 at 09:24:42AM +0100, Franck Routier wrote:
> By the way, my server is not really in a production state, and the
> hardware is running, so I might take some time to do sensible tests if
> anyone has ideas of what a good test would be...

I would like you to investigate why random writes are so relatively
slow with raid10,f2. You could run the bonnie++ tests, and then watch
via iostat how each of the disks is performing, compared to a
HW RAID10.

And also see if it matters if the resync has completed or not.
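
For readers who want to script the per-disk comparison rather than watch iostat by hand, here is a minimal sketch. It assumes a Linux system exposing /proc/diskstats with the usual 14-or-more-field layout and 512-byte sector counts; the function names `parse_diskstats` and `throughput_mb_s` are made up for this illustration and are not part of any tool mentioned in the thread.

```python
# Sketch only: per-disk throughput from two snapshots of /proc/diskstats text.
SECTOR_BYTES = 512  # /proc/diskstats counts sectors of 512 bytes


def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written)} from diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank lines and short (old-style partition) lines
        name = fields[2]
        # field 5 = sectors read, field 9 = sectors written (0-based indexes)
        stats[name] = (int(fields[5]), int(fields[9]))
    return stats


def throughput_mb_s(before, after, seconds):
    """Per-device (read MB/s, write MB/s) between two parsed snapshots."""
    rates = {}
    for name, (r1, w1) in after.items():
        if name not in before:
            continue  # device appeared between snapshots
        r0, w0 = before[name]
        rates[name] = (
            (r1 - r0) * SECTOR_BYTES / 1e6 / seconds,
            (w1 - w0) * SECTOR_BYTES / 1e6 / seconds,
        )
    return rates
```

Reading /proc/diskstats once before and once after a bonnie++ phase, and passing both texts through these functions, gives one read/write rate per member disk, which makes an uneven load across the disks of the array easy to spot.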

Best regards
keld

> 
> > I like to get such results of comparison between HW and SW raid.
> > How advanced are Adaptec controllers considered these days?
> > My thoughts are that SW raid is faster than HW raid, because Neil and the
> > other people here together can develop more sophisticated algorithms,
> > but I would like some hard figures to back up that thought.
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo <at> vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Bill Davidsen | 1 Mar 2008 20:33

Re: Article aimed at general audience

Bruce Miller wrote:
> Carla Schroder has posted an article on setting up a RAID10 array (with
> LVM). I was one of the two readers who encouraged her to cover RAID10,
> an idea that I got from lurking on this list.
> http://www.enterprisenetworkingplanet.com/nethub/article.php/10950_3730176_1
>  

I got as far as the point where she said that "RAID 10 is shorthand for 
RAID1+0, a mirrored striped array." Linux software raid10 involves both 
stripes and mirrors, but it isn't the same as 1+0 thank you. A little 
learning is a dangerous thing.

--

-- 
Bill Davidsen <davidsen <at> tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 


Bill Davidsen | 1 Mar 2008 21:04

Re: Understanding bonnie++ results

Franck Routier wrote:
> Hi,
>
> I am experimenting with Adaptec 31205 hardware raid versus md raid on
> raid level 10 with 3 arrays of 4 disks each.
> md array was created with f2 option.
> I get some results with bonnie++ tests I would like to understand:
>
> - per char sequential output is consistently around 70k/sec for both
> setups
> - but block sequential output shows a huge difference between hw and sw
> raid: about 160k/sec for hw versus 60k/sec for md. Where can this come
> from?
>
>   
Do you have a base raw read speed for a single drive? That helps 
visualize things as percentages of single drive speed. Also, is the 
Adaptec hw raid10 really raid1+0 or the distributed raid10 done by Linux 
sw raid?
> On the contrary, md beat hw on inputs:
> - sequential input shows 360k/sec versus 220k/sec for hw
> - random seek 1350/sec for md versus 1150/sec for hw
>
> So, these bonnie++ tests show quite huge differences for the same
> hardware between adaptec's hardware setup and md driver.
>
> Does anyone have an explanation for this? (btw, the fs on top of this is
> xfs).
>
>   

Bill Davidsen | 1 Mar 2008 21:04

Re: very poor ext3 write performance on big filesystems?

Andreas Dilger wrote:
> I'm CCing the linux-raid mailing list, since I suspect they will be
> interested in this result.
>
> I would suspect that the "journal guided RAID recovery" mechanism
> developed by U.Wisconsin may significantly benefit this workload
> because the filesystem journal is already recording all of these
> block numbers and the MD bitmap mechanism is pure overhead.
>   

Thanks for sharing these numbers. I think use of a bitmap is one of 
those things which people have to configure to match their use. 
Using a larger bitmap certainly seems to reduce the delays, and using an 
external bitmap certainly helps, especially on an SSD. But on a large 
array without a bitmap, performance can be compromised for hours during 
recovery, so the administrator must decide whether normal-case performance 
is more important than worst-case performance.

--

-- 
Bill Davidsen <davidsen <at> tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Bill Davidsen | 1 Mar 2008 21:07

Re: 2x6 or 3x4 raid10 arrays ?

Janek Kozicki wrote:
> Janek Kozicki said:     (by the date of Thu, 28 Feb 2008 19:25:00 +0100)
>
> sorry about replying to myself.
>
> * two 6 disks raid10 arrays : theoretical max speed 6 times single disc
> * three 4 disks raid10 arrays : theoretical max speed 4 times single disc
> * single raid10 far=2  : theoretical max speed 12 times single disc (!)
>
> isn't that true?
>
>   
True for throughput, not for seek. Also, what I have seen and read 
indicates that smaller chunks help a lot for databases with lots of seeks.
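
The throughput figures in the question can be sketched as simple arithmetic. This is only an idealized model: it assumes sequential reads stripe perfectly across every disk of an array (as the far layout aims to do), and the 100 MB/s single-disk speed is a placeholder, not a measurement from this thread. The function name `ideal_read_throughput` is invented for the example.

```python
def ideal_read_throughput(n_arrays, disks_per_array, single_disk_mb_s):
    """Idealized sequential-read figures for a set of equal raid10 arrays.

    Returns (per_stream, aggregate) in MB/s:
      per_stream - one sequential stream, which lives on a single array
      aggregate  - all arrays kept busy at the same time
    """
    per_stream = disks_per_array * single_disk_mb_s
    aggregate = n_arrays * per_stream
    return per_stream, aggregate


SINGLE_DISK_MB_S = 100.0  # placeholder value, chosen only for illustration

for label, n_arrays, disks in [
    ("two 6-disk arrays", 2, 6),
    ("three 4-disk arrays", 3, 4),
    ("one 12-disk array", 1, 12),
]:
    per_stream, aggregate = ideal_read_throughput(n_arrays, disks, SINGLE_DISK_MB_S)
    print(f"{label}: {per_stream:.0f} MB/s per stream, {aggregate:.0f} MB/s aggregate")
```

Under this model the aggregate is the same for all three options; only the ceiling for a single stream changes. As noted above, none of this says anything about seek performance.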

--

-- 
Bill Davidsen <davidsen <at> tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 


Bill Davidsen | 1 Mar 2008 21:18

Re: 2x6 or 3x4 raid10 arrays ?

Franck Routier wrote:
> Hi,
>
> I am installing a database (postgresql) server.
> I am considering two options:
> - either setup two 6 disks raid10 arrays
> - or setup three 4 disks raid10 arrays
>   

One more thought on this: if your use is such that you have one table 
which is really getting heavy read use, have more than two copies. Bad 
for disk utilization, good for having the data on an idle spindle.

--

-- 
Bill Davidsen <davidsen <at> tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 


Franck Routier | 1 Mar 2008 21:25

Re: Understanding bonnie++ results


Le samedi 01 mars 2008 à 15:04 -0500, Bill Davidsen a écrit :
> Do you have a base raw read speed for a single drive? That helps 

Disk speed measured by hdparm -t shows :
Timing buffered disk reads:  270 MB in  3.02 seconds =  89.52 MB/sec
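
With that single-drive figure, the bonnie++ numbers quoted earlier in this thread can be expressed as multiples of one disk. The sketch below assumes those numbers ("160k/sec" and so on) are thousands of K/sec, i.e. roughly MB/s; the original post does not state the unit, so treat that reading as an assumption. The helper name `as_multiple_of_single_disk` is made up for the example.

```python
SINGLE_DISK_MB_S = 89.52  # hdparm -t result reported above

# bonnie++ figures from the thread, read here as MB/s (an assumption).
results_mb_s = {
    "block output, hw": 160,
    "block output, md": 60,
    "sequential input, hw": 220,
    "sequential input, md": 360,
}


def as_multiple_of_single_disk(mb_s, single_disk_mb_s=SINGLE_DISK_MB_S):
    """Express a throughput figure as a multiple of one disk's raw read speed."""
    return mb_s / single_disk_mb_s


for name, mb_s in results_mb_s.items():
    print(f"{name}: {as_multiple_of_single_disk(mb_s):.2f}x one disk")
```

Under that reading, md sequential input comes out at roughly four single disks on a 4-disk array, while md block output is about two thirds of one disk.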

> Adaptec hw raid10 really raid1+0 or the distributed raid10 done by Linux 
> sw raid?

I don't know, but RAID10 is an option on its own in the BIOS (I don't
have to build raid 1 then raid 0)

Franck


Bill Davidsen | 1 Mar 2008 21:31

Re: Corrupted RAID 50 (aparently?) array. Help needed.

nmella <at> kepler.cl wrote:
> Hi list,
>
> Hope someone can give me the right direction on how to resolve this.

Were you under the impression that this list is a professional service 
where people are paid to give you good response? You sent the first copy 
of your question at 10:42, got impatient and resent at 11:04, then 11:15 
and 11:22. Sorry we don't respond fast enough for you, perhaps you 
should take your business elsewhere.

--

-- 
Bill Davidsen <davidsen <at> tmr.com>
  "Woe unto the statesman who makes war without a reason that will still
  be valid when the war is over..." Otto von Bismark 


Keld Jørn Simonsen | 1 Mar 2008 21:40

Re: 2x6 or 3x4 raid10 arrays ?

On Thu, Feb 28, 2008 at 10:36:22PM +0000, Nat Makarevitch wrote:
> Franck Routier <franck.routier <at> axege.com> writes:
> 
> > database (postgresql) server.
> 
> AFAIK if the average size of an I/O operation as well as the corresponding
> variance are low... go for a single RAID10,f2 with a stripe size slightly
> larger than this average. This way you will have most requests mobilizing only a
> single spindle and all your spindles acting in parallel. If this average size
> varies between tables one may create a RAID (with the adequate stripe size) per
> database partition.

I believe that a full chunk is read for each read access.
Or, at least, if one operation can be done within one chunk,
not more than that chunk is operated upon.

And chunks are recommended to be between 256 kiB and 1 MiB.
Most random database reads are much smaller than 256 kiB.
So the probability that one random read can be done with just one 
seek + read operation is very high, as far as I understand it.
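
To put a rough number on "very high", here is a small sketch. It assumes read start offsets are uniformly distributed over aligned positions within a chunk, which is a simplification; the 8 KiB read size is used because it is PostgreSQL's default page size, and the function name `chunk_crossing_probability` is invented for this example.

```python
def chunk_crossing_probability(read_bytes, chunk_bytes, align_bytes=512):
    """Fraction of aligned start offsets at which a read spans two chunks.

    A read starting at offset `off` inside a chunk spills into the next
    chunk (and so possibly onto another disk) when off + read_bytes
    exceeds the chunk size.
    """
    starts = range(0, chunk_bytes, align_bytes)
    crossing = sum(1 for off in starts if off + read_bytes > chunk_bytes)
    return crossing / len(starts)


KIB = 1024
for chunk_kib in (256, 1024):
    p = chunk_crossing_probability(8 * KIB, chunk_kib * KIB)
    print(f"8 KiB read, {chunk_kib} KiB chunk: {p:.2%} of reads touch two chunks")
```

With sector-aligned 8 KiB reads only a few percent of requests cross a chunk boundary, and if the reads are themselves aligned to 8 KiB (pass `align_bytes=8 * KIB`) none do, since the chunk size is a multiple of the read size.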

This would imply that it is not important whether you use 
two arrays of 6 disks each, or 3 arrays of 4 disks each, 
or for that matter 1 array of 12 disks.

Some other factors may be more important: such as the ability to survive
disk crashes. raid10,f2 is good for surviving 1 disk crash. If you have
3 raids of 4 disks, it can survive a disk crash in each of these raids.
Furthermore some combinations of crashes of 2 disks within a raid can
also be survived. There are 16 combinations of failing disks, with 0 to

Bill Davidsen | 1 Mar 2008 21:45

Re: Severe slowdown with LVM on RAID, alignment problem?

Peter Rabbitson wrote:
> Michael Guntsche wrote:
>>
>> Is it possible that my computer is just too slow to get good read 
>> results?
> unlikely
>
>> While reading is a little bit faster it's nowhere near the speed I 
>> get on
>> md0 itself.
>>
>
> I would guess that you did not set the correct read-ahead values for 
> the LV. If you do not specify anything it will default to 128k (256 
> sectors), which is terribly small for sequential reads. On the 
> contrary the MD device will do some clever calculations and set its 
> read-ahead correctly depending on the raid level and the number of 
> disks. Do:
>
> blockdev --setra 65536 <your lv device>
>
> and run the tests again. You are almost certainly going to get the 
> results you are after.

I will just comment that really large readahead values may cause 
significant memory usage and transfer of unused data. My observations 
and some posts indicate that very large readahead and/or chunk size may 
reduce random access performance. I believe you said you had 512MB RAM; 
that may be a factor as well.
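
To make the sizes concrete: `blockdev --setra` takes its value in 512-byte sectors, so the two settings discussed above convert as in the sketch below. The `concurrent_streams` parameter is a hypothetical knob added for illustration; it crudely assumes every sequential reader holds a full read-ahead window in memory at once, which is a worst-case simplification.

```python
SECTOR_BYTES = 512  # blockdev --setra counts 512-byte sectors
MIB = 1024 * 1024


def readahead_bytes(sectors):
    """Size in bytes of a read-ahead window given in 512-byte sectors."""
    return sectors * SECTOR_BYTES


def readahead_fraction_of_ram(sectors, ram_bytes, concurrent_streams=1):
    """Worst-case share of RAM held by read-ahead windows (rough model)."""
    return readahead_bytes(sectors) * concurrent_streams / ram_bytes


for sectors in (256, 65536):
    size = readahead_bytes(sectors)
    share = readahead_fraction_of_ram(sectors, 512 * MIB)
    print(f"--setra {sectors}: {size // 1024} KiB per window, "
          f"{share:.2%} of 512 MiB RAM per stream")
```

So the default of 256 sectors is the 128 KiB mentioned in the quoted text, while 65536 sectors is 32 MiB per window, a noticeable slice of a 512 MB machine once several streams are active.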
