Cliff Wickman | 1 Feb 23:37 2010
Picon

Re: Fix to numa_node_to_cpus_v2


Hi Sharyathi,

  Thanks for both patch and test case.

  The patch needs one more change I think.
  The target buffer may be bigger, so the copy of the map needs
  to be zero-extended.
  Would you review it?

Thx.
-Cliff

> Date: Thu, 28 Jan 2010 11:23:05 +0530
> From: Sharyathi Nagesh <sharyath <at> in.ibm.com>
> To: linux-numa <at> vger.kernel.org, Andi Kleen <andi <at> firstfloor.org>,
> 	Christoph Lameter <clameter <at> sgi.com>, Cliff Wickman <cpw <at> sgi.com>,
> 	Lee Schermerhorn <lee.schermerhorn <at> hp.com>,
> 	Amit K Arora <amitarora <at> in.ibm.com>, deepti.kalra <at> in.ibm.com
> Subject: Fix to numa_node_to_cpus_v2
> 
> Hi
> 
> We observed that numa_node_to_cpus api() api converts a node number to a 
> bitmask of CPUs. The user must pass a long enough buffer. If the buffer is not
> long enough errno will be set to ERANGE and -1 returned. On success 0 is returned.
> This api has been changed in numa version 2.0. It has new implementation (_v2)
> 
> Analysis:
> Now within the numa_node_to_cpus code there is a check if the size of buffer
(Continue reading)

Sharyathi Nagesh | 2 Feb 06:16 2010
Picon

Re: Fix to numa_node_to_cpus_v2

Cliff
	Thank you for providing the correction. It looks like the best thing to do with changed 
buffer size context after adding earlier patch that I sent.
	I had one observation, though it doesn't impact this issue directly. In function 
copy_bitmask_to_bitmask() 3rd condition looked redundant to me. Since first 2 conditions 
cover all the cases, in that situation would these conditions make sense ?
-----------------------------------------------------------------------------------------------
else {
                 bytes = CPU_BYTES(bmpfrom->size);
                 memcpy(bmpto->maskp, bmpfrom->maskp, bytes);
         }
-----------------------------------------------------------------------------------------------
Do let us know when can we expect these patches upstream.
Thank you
Sharyathi

On 02/02/2010 04:07 AM, Cliff Wickman wrote:
> Hi Sharyathi,
>
>    Thanks for both patch and test case.
>
>    The patch needs one more change I think.
>    The target buffer may be bigger, so the copy of the map needs
>    to be zero-extended.
>    Would you review it?
>
> Thx.
> -Cliff
>
>> Date: Thu, 28 Jan 2010 11:23:05 +0530
(Continue reading)

Cliff Wickman | 2 Feb 14:40 2010
Picon

Re: Fix to numa_node_to_cpus_v2

Hi Sharyathi

On Tue, Feb 02, 2010 at 10:46:41AM +0530, Sharyathi Nagesh wrote:
> Cliff
> 	Thank you for providing the correction. It looks like the best thing to 
> do with changed buffer size context after adding earlier patch that I 
> sent.
> 	I had one observation, though it doesn't impact this issue directly. In 
> function copy_bitmask_to_bitmask() 3rd condition looked redundant to me. 
> Since first 2 conditions cover all the cases, in that situation would 
> these conditions make sense ?
> -----------------------------------------------------------------------------------------------
> else {
>                 bytes = CPU_BYTES(bmpfrom->size);
>                 memcpy(bmpto->maskp, bmpfrom->maskp, bytes);
>         }
> -----------------------------------------------------------------------------------------------

Indeed extraneous. I removed them. Thanks.

> Do let us know when can we expect these patches upstream.
> Thank you
> Sharyathi

Your patch is included in numactl-2.0.4-rc2.tar.gz
(at ftp://oss.sgi.com/www/projects/libnuma/download/ )

-Cliff

>
(Continue reading)

Scott Lurndal | 9 Feb 22:48 2010

Library versioning issues.


Hi,

  I've a third-party application that's designed to be NUMA aware when
running on a NUMA system, but since this application also runs on non-NUMA
systems and a wide variety of operating systems, they don't link the
application directly with the libnuma shared object.   Instead, they use
dlopen() to open the library at run-time, if it exists.

  The issue is that the application expects the 1.0 interfaces, however the
run time linker dlsym() is returning a function pointer to the _v2 interface
for numa_node_to_cpus_v2, thus the call fails and the application believes
that there is no NUMA configuration available.

  We've fixed this internally (and temporarily) with the patch below,
but would like some guidance as to a longer-term fix that accomodates
applications that use dlopen to access libnuma and preserves versioning
so that older applications will continue to function correctly.

  Perhaps the API for a given function should not change, but rather
new functions should be implemented and the older deprecated?

  thanks,

scott

$ cvs -q diff -ub -r1.1 -r1.2 libnuma.c
Index: libnuma.c
===================================================================
RCS file: /home/slurndal/cvs/opensource/numactl/libnuma.c,v
(Continue reading)

Andi Kleen | 10 Feb 08:34 2010

Re: Library versioning issues.

On Tue, Feb 09, 2010 at 01:48:30PM -0800, Scott Lurndal wrote:
> 
> Hi,
> 
>   I've a third-party application that's designed to be NUMA aware when
> running on a NUMA system, but since this application also runs on non-NUMA
> systems and a wide variety of operating systems, they don't link the
> application directly with the libnuma shared object.   Instead, they use
> dlopen() to open the library at run-time, if it exists.

I think you really need to ask this question the dlopen() developers.
I don't think it's a problem that should be somehow kludged around
in libnuma. Simply dlsym() et.al. need a way to specify the expected
version. Otherwise it's a don't do that if it hurts(tm).

-Andi

Andi Kleen | 10 Feb 13:45 2010

Re: Library versioning issues II


As a addendum (after looking at the documentation) it seems like
glibc has already anticipated that problem and 
your user simply needs to switch to dlvsym() and specify the expected
version.

-Andi
--

-- 
ak <at> linux.intel.com -- Speaking for myself only.
Scott Lurndal | 10 Feb 20:28 2010

Re: Library versioning issues II

On Wed, Feb 10, 2010 at 01:45:12PM +0100, Andi Kleen wrote:
> 
> As a addendum (after looking at the documentation) it seems like
> glibc has already anticipated that problem and 
> your user simply needs to switch to dlvsym() and specify the expected
> version.

Thanks Andi,

   I'll convey this to the vendor.  They may be reluctant to use a
non-posix interface, but c'est la vie.

scott
MacCana, Mike | 11 Feb 15:26 2010

Possible man numactl error?

Hi NUMA folks,

I've spotted what appears to be an inconsistency between man numactl and
/proc & /sys.

       --physcpubind=cpus, -C cpus
              Only  execute  process  on cpus.  This accepts physical
cpu numbers as shown in the processor fields of /proc/cpuinfo.

However the processor field of /proc/cpuinfo refers to logical
processors (ie, run queues), not physical ones. Eg:

$ grep processor /proc/cpuinfo 
processor       : 0
processor       : 1
... 
Compared to:

$ cat /sys/devices/system/cpu/cpu0/topology/physical_package_id 
0

$ cat /sys/devices/system/cpu/cpu1/topology/physical_package_id 
0

I.e., CPUs 0 and 1 are clearly on the same physical package. 

Futhermore, physical CPU numbers are shown with the string 'physical id'
not 'processor' in /proc/cpuinfo. 

Is the numactl man page incorrect, or am I missing something? Does -C
(Continue reading)

Andi Kleen | 11 Feb 15:40 2010

Re: Possible man numactl error?

On Thu, Feb 11, 2010 at 02:26:14PM -0000, MacCana, Mike wrote:
> I've spotted what appears to be an inconsistency between man numactl and
> /proc & /sys.
>  
>        --physcpubind=cpus, -C cpus
>               Only  execute  process  on cpus.  This accepts physical
> cpu numbers as shown in the processor fields of /proc/cpuinfo.

"physical CPU" in this case means CPUs as shown by cpuinfo,
not APIC IDs or package ids or anything like that. 

> However the processor field of /proc/cpuinfo refers to logical
> processors (ie, run queues), not physical ones. Eg:

I don't think the term 'logical CPU' or 'run queues'
is used anywhere in the kernel for CPU numbers.

Perhaps the man page could be clarified (mostly for people
who think too complicated like you :-) and yes perhaps
it is better in this age with all kinds of virtual CPUs

-Andi
MacCana, Mike | 11 Feb 15:49 2010

RE: Possible man numactl error?

-----Original Message-----
From: Andi Kleen [mailto:andi <at> firstfloor.org]
Sent: 11 February 2010 14:41
To: MacCana, Mike
Cc: linux-numa <at> vger.kernel.org
Subject: Re: Possible man numactl error?

>On Thu, Feb 11, 2010 at 02:26:14PM -0000, MacCana, Mike wrote:
>> I've spotted what appears to be an inconsistency between man numactl
>> and /proc & /sys.
>> 
>>        --physcpubind=cpus, -C cpus
>>               Only  execute  process  on cpus.  This accepts physical
>> cpu numbers as shown in the processor fields of /proc/cpuinfo.

>"physical CPU" in this case means CPUs as shown by cpuinfo, not APIC
IDs or package ids or anything like that.
> Perhaps the man page could be clarified (mostly for people who think
too complicated like you :-) and yes perhaps it is better in this age
with all kinds of virtual CPUs

Thanks for the quick reply. I agree, the man page (and the long switch
name) are confusing - I've never heard the term 'physical CPU' to mean
anything other than a CPU die / package. 

Mike

=============================================================================== 
 Please access the attached hyperlink for an important electronic communications disclaimer: 
 http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html 
(Continue reading)


Gmane