Alexander Motin | 4 Jul 2009 21:14

DFLTPHYS vs MAXPHYS

Hi.

Can somebody explain to me the difference between the DFLTPHYS and
MAXPHYS constants? As I understand it, the latter is the maximum amount
of memory that can be mapped into the kernel or passed to the hardware
drivers. But then why is DFLTPHYS used in so many places, and what does
it mean?

Isn't it time to review their values with a view to increasing them?
64KB looks funny compared to modern memory sizes and data rates. It just
increases interrupt rates, and I don't think it really needs to be so
small to improve interactivity now.

--

-- 
Alexander Motin
_______________________________________________
freebsd-arch <at> freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe <at> freebsd.org"

Gary Jennejohn | 5 Jul 2009 10:00

Re: DFLTPHYS vs MAXPHYS

On Sat, 04 Jul 2009 22:14:53 +0300
Alexander Motin <mav <at> FreeBSD.org> wrote:

> Can somebody explain to me the difference between the DFLTPHYS and
> MAXPHYS constants? As I understand it, the latter is the maximum amount
> of memory that can be mapped into the kernel or passed to the hardware
> drivers. But then why is DFLTPHYS used in so many places, and what does
> it mean?
> 

There's a pretty good comment on these in /sys/conf/NOTES.

> Isn't it time to review their values with a view to increasing them?
> 64KB looks funny compared to modern memory sizes and data rates. It just
> increases interrupt rates, and I don't think it really needs to be so
> small to improve interactivity now.
> 

Probably historical from the days when memory was scarce.

There's nothing preventing the user from upping these values in his
kernel config file.  But note the warning in NOTES about possibly
making the kernel unbootable.  It's not clear whether this warning is
still valid given today's larger memory footprints and the improved
VM system.

I wonder whether all drivers can correctly handle larger values for
DFLTPHYS.

---
Gary Jennejohn

Alexander Motin | 5 Jul 2009 10:38

Re: DFLTPHYS vs MAXPHYS

Gary Jennejohn wrote:
> On Sat, 04 Jul 2009 22:14:53 +0300
> Alexander Motin <mav <at> FreeBSD.org> wrote:
> 
>> Can somebody explain to me the difference between the DFLTPHYS and
>> MAXPHYS constants? As I understand it, the latter is the maximum amount
>> of memory that can be mapped into the kernel or passed to the hardware
>> drivers. But then why is DFLTPHYS used in so many places, and what does
>> it mean?
> 
> There's a pretty good comment on these in /sys/conf/NOTES.

But it does not explain why.

>> Isn't it time to review their values with a view to increasing them?
>> 64KB looks funny compared to modern memory sizes and data rates. It just
>> increases interrupt rates, and I don't think it really needs to be so
>> small to improve interactivity now.
> 
> Probably historical from the days when memory was scarce.
> 
> There's nothing preventing the user from upping these values in his
> kernel config file.  But note the warning in NOTES about possibly
> making the kernel unbootable.  It's not clear whether this warning is
> still valid given today's larger memory footprints and the improved
> VM system.
> 
> I wonder whether all drivers can correctly handle larger values for
> DFLTPHYS.

There will always be drivers/devices with limitations. They should
[...]

Bruce Evans | 5 Jul 2009 16:11

Re: DFLTPHYS vs MAXPHYS

On Sun, 5 Jul 2009, Alexander Motin wrote:

> Gary Jennejohn wrote:
>> On Sat, 04 Jul 2009 22:14:53 +0300
>> Alexander Motin <mav <at> FreeBSD.org> wrote:
>> 
>>> Can somebody explain to me the difference between the DFLTPHYS and
>>> MAXPHYS constants? As I understand it, the latter is the maximum amount
>>> of memory that can be mapped into the kernel or passed to the hardware
>>> drivers. But then why is DFLTPHYS used in so many places, and what does
>>> it mean?
>> 
>> There's a pretty good comment on these in /sys/conf/NOTES.
>
> But it does not explain why.

DFLTPHYS is the default -- the size to be used when the correct size is
not known.  However, this is mostly broken:

- the correct size should always be known at a low level.  You have to
   know the maximum size for a device to know that this size is larger
   than the default, else using the default size won't work.  Also, you
   have to know that the default size is a multiple of the minimum size.
   Both of these are usually true accidentally, so things sort of work.

- the default size is defaulted inconsistently.  Geom hides the device
   maximum i/o size (d_maxsize, which is normally either 64K or DFLTPHYS,
   which happen to be the same) from the top level of devices (it reblocks
   if necessary so that sizes up to s_iosize_max, which is always
   MAXPHYS, work), so it is difficult to see the low-level size or to
   use an i/o size that is a multiple of the device maximum i/o size if
[...]
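
The two conditions in the first point, and the reblocking mentioned in
the second, can be sketched roughly as follows. This is an illustrative
Python model only, not GEOM code: the function names
default_size_is_usable and split_request are hypothetical, and the
512-byte minimum and the 64K/512K sizes in the usage lines are example
values.

```python
def default_size_is_usable(default_size, device_min, device_max):
    """Check the two conditions from the first point.

    The default must not exceed the device maximum, and it must be a
    multiple of the device minimum (for example the sector size).
    """
    return default_size <= device_max and default_size % device_min == 0


def split_request(offset, length, max_chunk):
    """Split one large request into (offset, length) pieces.

    Each piece is at most max_chunk bytes, which models an upper layer
    reblocking a request so that it fits the device maximum i/o size.
    """
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    pieces = []
    end = offset + length
    while offset < end:
        size = min(max_chunk, end - offset)
        pieces.append((offset, size))
        offset += size
    return pieces


# Example values only: a 64K default on a device with 512-byte sectors.
print(default_size_is_usable(64 * 1024, 512, 64 * 1024))
# A 512K request reblocked for a device that accepts at most 64K at once.
print(len(split_request(0, 512 * 1024, 64 * 1024)))
```

With these example values the 64K default passes both checks, and the
512K request is cut into eight 64K pieces.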

Gary Jennejohn | 5 Jul 2009 16:16

Re: DFLTPHYS vs MAXPHYS

On Sun, 05 Jul 2009 11:38:23 +0300
Alexander Motin <mav <at> FreeBSD.org> wrote:

> Gary Jennejohn wrote:
> > I wonder whether all drivers can correctly handle larger values for
> > DFLTPHYS.
> 
> There will always be drivers/devices with limitations. They should
> just be able to report those limitations to the system. This is possible
> with GEOM, but it doesn't look well tuned for all providers. There are
> many places where DFLTPHYS is used just in the hope that it will work.
> IMHO if a driver is unable to adapt to any defined DFLTPHYS value, it
> should not use it, but instead should announce some specific value that
> it really supports.
> 

This would be the correct way to do things.

I remember back in the good-old-days, circa 1985, disk drivers _always_
did their own PHYS handling so that utilities could pass in whatever
value they wanted to use for the size.  Of course, that meant that each
driver reinvented the wheel.
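
As a rough illustration of the "announce a specific value" idea, here is
a minimal Python sketch. The function effective_max_io and its
parameters are hypothetical and do not correspond to any real kernel
interface. It only models the policy discussed above: use what the
driver announces, capped by the system-wide maximum, and fall back to
the default only when the driver says nothing.

```python
DFLTPHYS = 64 * 1024  # the 64KB default discussed in this thread


def effective_max_io(announced_max, system_max, default=DFLTPHYS):
    """Pick the largest transfer size the upper layers may send to a driver.

    announced_max is the limit the driver reports, or None if it reports
    nothing.  system_max plays the role of MAXPHYS.
    """
    if announced_max is None:
        # Nothing announced: this is the "hope that it will work" case.
        return min(default, system_max)
    return min(announced_max, system_max)


# Example with a system-wide maximum of 512K, as in the test system
# mentioned elsewhere in the thread.
print(effective_max_io(None, 512 * 1024))         # driver announces nothing
print(effective_max_io(1024 * 1024, 512 * 1024))  # driver allows more than the system
print(effective_max_io(32 * 1024, 512 * 1024))    # driver is the limiting factor
```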

---
Gary Jennejohn

Alexander Motin | 5 Jul 2009 16:37

Re: DFLTPHYS vs MAXPHYS

Bruce Evans wrote:
> On Sun, 5 Jul 2009, Alexander Motin wrote:
>>>> Isn't it time to review their values with a view to increasing
>>>> them? 64KB looks funny compared to modern memory sizes and data
>>>> rates. It just increases interrupt rates, and I don't think it
>>>> really needs to be so small to improve interactivity now.
> 
> 64K is large enough to bust modern L1 caches and old L2 caches.  Make the
> size bigger to bust modern L2 caches too.  Interrupt rates don't matter
> when you are transferring 64K items per interrupt.

How is cache size related to this, if DMA transfers data directly to RAM?
Sure, the CPU will invalidate the related cache lines, but why should it
invalidate everything?

Small transfers give more work to all levels from GEOM down to CAM/ATA,
controllers and drives. It is not just context switching.

>>> I wonder whether all drivers can correctly handle larger values for
>>> DFLTPHYS.
> 
> Most can't, since their hardware can't.  They can fake it (ata used to)
> but there is negative point in this for most drivers, since geom already
> reblocks for disk devices and reblocking would be wrong for devices like
> tapes.

I am not talking about reblocking. I am talking about the best possible
hardware usage. I can't speak for most of them, but at least AHCI and the
modern SiI SATA chips, which I have worked with closely, have practically
no limit on transaction size, except the amount of memory their drivers
allocate for
[...]

Bruce Evans | 5 Jul 2009 18:46

Re: DFLTPHYS vs MAXPHYS

On Sun, 5 Jul 2009, Alexander Motin wrote:

> Bruce Evans wrote:
>> On Sun, 5 Jul 2009, Alexander Motin wrote:
>>>>> Isn't it time to review their values with a view to increasing
>>>>> them? 64KB looks funny compared to modern memory sizes and data
>>>>> rates. It just increases interrupt rates, and I don't think it
>>>>> really needs to be so small to improve interactivity now.
>> 
>> 64K is large enough to bust modern L1 caches and old L2 caches.  Make the
>> size bigger to bust modern L2 caches too.  Interrupt rates don't matter
>> when you are transferring 64K items per interrupt.
>
> How is cache size related to this, if DMA transfers data directly to RAM?
> Sure, the CPU will invalidate the related cache lines, but why should it
> invalidate everything?

I was thinking more of transfers to userland.  Increasing user buffer
sizes above about half the L2 cache size guarantees busting the L2
cache, if the application actually looks at all of its data.  If the
data is read using read(), then the L2 cache will be busted twice (or
a bit less with nontemporal copying), first by copying out the data
and then by looking at it.  If the data is read using mmap(), then the
L2 cache will only be busted once.  This effect has always been very
noticeable using dd.  Larger buffer sizes are also bad for latency.

> Small transfers give more work to all levels from GEOM down to CAM/ATA,
> controllers and drives. It is not just context switching.

Yes, I can't see any cache busting below the level of copyout().  Also,
[...]

Alexander Motin | 5 Jul 2009 19:12

Re: DFLTPHYS vs MAXPHYS

Bruce Evans wrote:
> On Sun, 5 Jul 2009, Alexander Motin wrote:
>> Bruce Evans wrote:
>>> On Sun, 5 Jul 2009, Alexander Motin wrote:
>>> 64K is large enough to bust modern L1 caches and old L2 caches.  Make 
>>> the
>>> size bigger to bust modern L2 caches too.  Interrupt rates don't matter
>>> when you are transferring 64K items per interrupt.
>>
>> How is cache size related to this, if DMA transfers data directly to RAM?
>> Sure, the CPU will invalidate the related cache lines, but why should it
>> invalidate everything?
> 
> I was thinking more of transfers to userland.  Increasing user buffer
> sizes above about half the L2 cache size guarantees busting the L2
> cache, if the application actually looks at all of its data.  If the
> data is read using read(), then the L2 cache will be busted twice (or
> a bit less with nontemporal copying), first by copying out the data
> and then by looking at it.  If the data is read using mmap(), then the
> L2 cache will only be busted once.  This effect has always been very
> noticeable using dd.  Larger buffer sizes are also bad for latency.
> 
>> Small transfers give more work to all levels from GEOM down to
>> CAM/ATA, controllers and drives. It is not just context switching.
> 
> Yes, I can't see any cache busting below the level of copyout().  Also,
> after you convert all applications to use mmap() instead of read(),
> the cache busting should become per-CPU.

Since file data usually passes through the buffer cache, it will anyway
[...]

Bruce Evans | 5 Jul 2009 20:32

Re: DFLTPHYS vs MAXPHYS

On Sun, 5 Jul 2009, Alexander Motin wrote:

> Bruce Evans wrote:
>> I was thinking more of transfers to userland.  Increasing user buffer
>> sizes above about half the L2 cache size guarantees busting the L2
>> cache, if the application actually looks at all of its data.  If the
>> data is read using read(), then the L2 cache will be busted twice (or
>> a bit less with nontemporal copying), first by copying out the data
>> and then by looking at it.  If the data is read using mmap(), then the
>> L2 cache will only be busted once.  This effect has always been very
>> noticeable using dd.  Larger buffer sizes are also bad for latency.
> ...
> How can I reproduce that dd experiment? I have my system running with a
> MAXPHYS of 512K, and here is what I get:

I used a regular file with the same size as main memory (1G), and for
today's test, not quite dd, but a program that throws away the data
(so as to avoid overcall for write syscalls) and prints status info
in a more suitable form than even dd's ^T.
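
The test program itself is not shown in this thread. As a rough idea of
what such a tool can look like, here is a minimal Python sketch that
reads a file in blocks of a chosen size, throws the data away, and
reports the rate. The names read_and_discard and report are
hypothetical, the block sizes are example values, and this is an
assumption about the general shape of such a test, not a reconstruction
of the program described above.

```python
import os
import time


def read_and_discard(path, block_size):
    """Read path sequentially in block_size chunks, discarding the data.

    Returns (total_bytes, elapsed_seconds).
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    start = time.perf_counter()
    try:
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # an empty result means end of file
                break
            total += len(chunk)
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return total, elapsed


def report(path, block_sizes=(16 * 1024, 64 * 1024, 512 * 1024)):
    """Print the rate seen for each block size (example sizes only)."""
    for bs in block_sizes:
        total, elapsed = read_and_discard(path, bs)
        rate = total / elapsed if elapsed > 0 else float("inf")
        print(f"bs={bs:>7}  {total} bytes in {elapsed:.6f} s  ({rate:.0f} bytes/s)")
```

Calling report() on a small file mostly measures the buffer cache after
the first pass, which is presumably why the test above used a file as
large as main memory.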

Your results show that physio() behaves quite differently from reading a
regular file with copying.  I see similar behaviour for input from a disk
file.

> # dd if=/dev/ada0 of=/dev/null bs=512k count=1000
> 1000+0 records in
> 1000+0 records out
> 524288000 bytes transferred in 2.471564 secs (212128024 bytes/sec)

512MB would be too small with buffering for a regular file, but should
be OK with a disk file.
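
For reference, the arithmetic behind the quoted dd figures, and what
that rate means in requests per second at different maximum transfer
sizes, can be checked with a few lines of Python. This is only a sketch:
the helper names are hypothetical, and it assumes the whole transfer is
issued as back-to-back requests of exactly the given size.

```python
def throughput(total_bytes, seconds):
    """Average transfer rate in bytes per second."""
    return total_bytes / seconds


def requests_per_second(rate, request_size):
    """Requests of request_size bytes needed per second to sustain rate."""
    return rate / request_size


# Figures from the dd run quoted above: 1000 records of 512K each.
total = 1000 * 512 * 1024           # 524288000 bytes
rate = throughput(total, 2.471564)  # close to the 212128024 bytes/sec dd reported

for size_kb in (64, 128, 512):
    n = requests_per_second(rate, size_kb * 1024)
    print(f"{size_kb:>4}K per request: about {n:.0f} requests/s")
```

At this rate the difference is roughly 3200 requests per second with 64K
requests against roughly 400 with 512K requests, a factor of eight.
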
[...]

Alexander Motin | 5 Jul 2009 20:51

Re: DFLTPHYS vs MAXPHYS

Bruce Evans wrote:
> On Sun, 5 Jul 2009, Alexander Motin wrote:
> 
>> Bruce Evans wrote:
>>> I was thinking more of transfers to userland.  Increasing user buffer
>>> sizes above about half the L2 cache size guarantees busting the L2
>>> cache, if the application actually looks at all of its data.  If the
>>> data is read using read(), then the L2 cache will be busted twice (or
>>> a bit less with nontemporal copying), first by copying out the data
>>> and then by looking at it.  If the data is read using mmap(), then the
>>> L2 cache will only be busted once.  This effect has always been very
>>> noticeable using dd.  Larger buffer sizes are also bad for latency.
>> ...
>> How can I reproduce that dd experiment? I have my system running with
>> a MAXPHYS of 512K, and here is what I get:
> 
> I used a regular file with the same size as main memory (1G), and for
> today's test, not quite dd, but a program that throws away the data
> (so as to avoid overcall for write syscalls) and prints status info
> in a more suitable form than even dd's ^T.
> 
> Your results show that physio() behaves quite differently from reading a
> regular file with copying.  I see similar behaviour for input from a disk
> file.
> 
>> # dd if=/dev/ada0 of=/dev/null bs=512k count=1000
>> 1000+0 records in
>> 1000+0 records out
>> 524288000 bytes transferred in 2.471564 secs (212128024 bytes/sec)
> 
> 512MB would be too small with buffering for a regular file, but should
[...]

