Hefty, Sean | 1 Nov 2011 01:40
Picon
Favicon

RE: Question about a corner case in the CMA

> If a user tried to join a multicast group and the variable "mc" is being added
> to the list of the mc_list:
> 1) If the write() fails, the value of mc->next may be changed by another
> thread (for example, removing the mcast which is pointed by mc->next)
>     will the "mc->next" still point to the right location? (since i don't see
> any "volatile" or something like this).

This shouldn't matter.  On a failure, the code searches the mc_list for the entry and removes it.  mc->next
can change between entering the mc entry and removing it.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ira Weiny | 1 Nov 2011 07:24
Gravatar

[PATCH] infiniband-diags: ibqueryerrors: don't exit on reset error


IBERROR exits the process.  ibqueryerrors should continue despite this error.

Signed-off-by: Ira Weiny <weiny2@...>
---
 src/ibqueryerrors.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/ibqueryerrors.c b/src/ibqueryerrors.c
index d37d268..e070baf 100644
--- a/src/ibqueryerrors.c
+++ b/src/ibqueryerrors.c
 <at>  <at>  -629,7 +629,7  <at>  <at>  static void clear_port(ib_portid_t * portid, uint16_t cap_mask,
 	if (mask)
 		if (!performance_reset_via(pc, portid, port, mask, ibd_timeout,
 					   IB_GSI_PORT_COUNTERS, ibmad_port))
-			IBERROR("Failed to reset errors %s port %d", node_name,
+			fprintf(stderr, "Failed to reset errors %s port %d\n", node_name,
 				port);

 	if (clear_errors && details) {
 <at>  <at>  -653,8 +653,8  <at>  <at>  static void clear_port(ib_portid_t * portid, uint16_t cap_mask,

 		if (!reset_pc_ext(pc, portid, port, mask, ibd_timeout,
 		    ibmad_port))
-			IBERROR("Failed to reset extended data counters %s, "
-				"%s port %d", node_name, portid2str(portid),
+			fprintf(stderr, "Failed to reset extended data counters %s, "
+				"%s port %d\n", node_name, portid2str(portid),
 				port);
(Continue reading)

Or Gerlitz | 1 Nov 2011 07:40
Picon

Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs

On Mon, Oct 31, 2011 at 9:38 PM, Roland Dreier <roland@...> wrote:

> Sorry for the late review here

Oh yes... BTW this is patch 4/5, I don't see patches 1,2,3 on your for-next
tree/branch  <at>  kernel.org, have you accepted them?

> Sorry for the late review here, but does it seem like the best
> approach to have a separate "counters_ext" directory for
> some subset of performance counters?  Instead we could
> have two attribute_groups, one the basic counters and one
> the basic and extended counters, and basically do
>
>        if (is_pma_class_cap_ext_width(device, port_num) == 0)
>                sysfs_create_group(...basic and extended counters...)
>       else
>                sysfs_create_group(...basic counters...)

Basically, I don't see a problem to have one directory along the lines
of your suggestion

> Or is there some reason why users would want to make the
> distinction between basic and extended counters?

Today, e.g in some IBoE perf monitoring scripts we wrote, the
distinction is done by if (the ext counter directory exists) then go
and read the counters from there, else read from the non extended
counters directory. With the change you propose, that if (.) would
become a little less elegant and would check if this or that --file--
exists (e.g the 64 bit tx data counter) and if yes, would read the
(Continue reading)

Hal Rosenstock | 1 Nov 2011 13:37
Picon

[PATCH][TRIVIAL] opensm: Fix typo in routing section in man page and doc


Signed-off-by: Hal Rosenstock <hal@...>
---
 doc/current-routing.txt |    2 +-
 man/opensm.8.in         |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/current-routing.txt b/doc/current-routing.txt
index f97bc3f..ae2c65f 100644
--- a/doc/current-routing.txt
+++ b/doc/current-routing.txt
 <at>  <at>  -69,7 +69,7  <at>  <at>  a decision is made as to what port should be used to get to that LID.
    This step is common to standard and Up/Down routing. Each port has a
 counter counting the number of target LIDs going through it.
    When there are multiple alternative ports with same MinHop to a LID,
-the one with less previously assigned ports is selected.
+the one with less previously assigned LIDs is selected.
    If LMC > 0, more checks are added: Within each group of LIDs assigned to
 same target port,
    a. use only ports which have same MinHop
diff --git a/man/opensm.8.in b/man/opensm.8.in
index 042bee3..029e3b8 100644
--- a/man/opensm.8.in
+++ b/man/opensm.8.in
 <at>  <at>  -756,7 +756,7  <at>  <at>  decision is made as to what port should be used to get to that LID.
    This step is common to standard and Up/Down routing. Each port has a
 counter counting the number of target LIDs going through it.
    When there are multiple alternative ports with same MinHop to a LID,
-the one with less previously assigned ports is selected.
+the one with less previously assigned LIDs is selected.
(Continue reading)

Vlad Weinbaum | 1 Nov 2011 14:25

Re: cq error timeout issue

Hi,

packet_life_time is 12 in sm.conf.

I'm debugging the polling process and see that ibv_poll_cq returns
error when mlx4_cqe->owner_sr_opcode gets MLX4_CQE_OPCODE_ERROR value
(in libmlx4, mlx4_poll_one). Where this value came from?

Thanks,
Vlad

On Mon, Oct 31, 2011 at 3:00 PM, Or Gerlitz <or.gerlitz@...> wrote:
> On Mon, Oct 31, 2011 at 9:08 AM, Vlad Weinbaum
> <vlad.weinbaum@...> wrote:
>> [...] I found detail that I cannot explain. I query the QP after connect and get timeout value 16,
>>  that must be 4 us * 2^16 = 256 ms, but I get about 800 ms.
>
> As Sean indicated, the timeout is **based** on the packet_life_time,
> in case you're configuring
> your QP through the rdma-cm, the IB stack code actually adds one to
> the packet_life_time quantity
> as a rough estimate for the local hca ack delay, still maybe there is
> a hole here and the query qp verb
> isn't reporting correctly, what was the value you configured on the sm
> side, and how many retries did
> you use for the qp? each retry will double the time you should be
> observing in practice.
>
> Or.
>
(Continue reading)

Hal Rosenstock | 1 Nov 2011 14:46
Picon

Re: [PATCH] opensm: Add the precreation of multicast groups

On 10/31/2011 4:12 PM, Ira Weiny wrote:
> On Thu, 27 Oct 2011 12:41:53 -0700
> Hal Rosenstock <hal@...> wrote:
> 
>> On 10/6/2011 8:13 PM, Ira Weiny wrote:
>>>
>>> Allow for the pre-creation of these groups on a partition by partition basis.
>>
>> This looks good to me and has been needed for some time now; Thanks for
>> doing this!
>>
>> Just some minor comments below from scanning through the patch. I
>> haven't tried it yet...
>>
>>> P_Key is taken from the partition specification.  Q_Key, TClass, rate, and mtu
>>> can be specified.
>>
>> If TClass is added, should FlowLabel also be added ?
> 
> Yes, added in next version.  Also added Q_Key, tclass, and FlowLabel to BC
> group config which uses common parsing code.
> 
>>
>>> For IP groups, rate and mtu are verified to match the broadcast groups
>>> parameters.  P_Key can be specified in the mgid itself and is verified to match
>>> the P_Key of the partition if != 0x0000.  If pkey == 0x0000 then the P_Key is
>>> taken from the partition specification.
>>
>> Do you mean the pkey in the IPoIB MGID here by "partition specification" ?
> 
(Continue reading)

Hal Rosenstock | 1 Nov 2011 15:04
Picon

Re: cq error timeout issue

Just to clarify:

On 11/1/2011 9:25 AM, Vlad Weinbaum wrote:
> packet_life_time is 12 in sm.conf.

OpenSM packet_life_time setting has nothing to do with this (but is used
for SwitchInfo:LifeTimeValue setting):

# The code of maximal time a packet can live in a switch
# The actual time is 4.096usec * 2^<packet_life_time>
# The value 0x14 disables this mechanism

The rate in SA PathRecord comes from subnet timeout in the absence of QoS:
        /*
         * Set packet lifetime.
         * According to spec definition IBA 1.2 Table 205
         * PacketLifeTime description, for loopback paths,
         * packetLifeTime shall be zero.
         */
        if (p_src_port == p_dest_port)
                pkt_life = 0;
        else if (p_qos_level && p_qos_level->pkt_life_set)
                pkt_life = p_qos_level->pkt_life;
        else
                pkt_life = sa->p_subn->opt.subnet_timeout;

-- Hal

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
(Continue reading)

Hefty, Sean | 1 Nov 2011 17:17
Picon
Favicon

RE: Question about a corner case in the CMA

> My question is: will the mc->next contain the right value?
> (since the compiler may cache the value of mc->next within that function and
> not read the updated value from memory,
> so if the mc_list will be changed by another thread, we may get bad results
> ...)

Access to the list is protected by a mutex.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jason Gunthorpe | 1 Nov 2011 17:22
Favicon

Re: [PATCH 4/5] ib/core: add support for extended performance counters in sysfs

On Tue, Nov 01, 2011 at 08:40:12AM +0200, Or Gerlitz wrote:

> Today, e.g in some IBoE perf monitoring scripts we wrote, the
> distinction is done by if (the ext counter directory exists) then go
> and read the counters from there, else read from the non extended
> counters directory. With the change you propose, that if (.) would
> become a little less elegant and would check if this or that --file--
> exists (e.g the 64 bit tx data counter) and if yes, would read the
> four 64 bit counters (rx/tx packets/data) else the four 32 bits
> counters, so from our user standpoint, diff dirs seems better, but we
> can get along with same dir with diff contents depending on the
> device.

Is there any reason to expose the 32 and 64 bit version of the same
counter? That seems needless. Emit the largest version available and
prepend 0's to fill out to the available width so that userspace can
know the counter size.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alex Netes | 1 Nov 2011 17:26
Favicon

Re: [PATCH][TRIVIAL] opensm/osm_sa_mcmember_record.c: Commentary changes relating to SA assigned MGIDs

Hi Hal,

On 13:46 Mon 17 Oct     , Hal Rosenstock wrote:
> 
> Update one comment to be more accurate
> and remove comment which appears to no longer apply
> 
> Signed-off-by: Hal Rosenstock <hal@...>
> ---

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Gmane