Roland Dreier | 1 Jun 2010 05:19

Re: bug report: dereferencing before check

 > drivers/infiniband/hw/mlx4/cq.c +401 mlx4_ib_resize_cq(56)
 > 	warn: variable dereferenced before check 'cq->resize_buf'
 > 
 >    385          err = mlx4_cq_resize(dev->dev, &cq->mcq, entries, &cq->resize_buf->buf.mtt);
 >                                                                   ^^^^^^^^^^^^^^^^^^^^^^^^
 > 	Dereference "cq->resize_buf" here.  (Ok.  Technically we
 > 	dereference it inside the function).
 > 
 >    386          if (err)
 >    387                  goto err_buf;
 >    388
 >    389          mlx4_mtt_cleanup(dev->dev, &mtt);
 >    390          if (ibcq->uobject) {
 >    391                  cq->buf      = cq->resize_buf->buf;
 >    392                  cq->ibcq.cqe = cq->resize_buf->cqe;
 >    393                  ib_umem_release(cq->umem);
 >    394                  cq->umem     = cq->resize_umem;
 >    395
 >    396                  kfree(cq->resize_buf);
 >    397                  cq->resize_buf = NULL;
 >    398                  cq->resize_umem = NULL;
 >    399          } else {
 >    400                  spin_lock_irq(&cq->lock);
 >    401                  if (cq->resize_buf) {
 >                             ^^^^^^^^^^^^^^
 > 	Check here.
 > 
 >    402                          mlx4_ib_cq_resize_copy_cqes(cq);
 > 
 > Can "cq->resize_buf" be NULL here?
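
The pattern the checker complains about can be avoided by testing the pointer before the first access. What follows is only a minimal sketch with made-up stand-in structures and a hypothetical helper name (`cq_commit_resize`); it is not the mlx4 code and does not answer whether the pointer can really be NULL at that point:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures, not the mlx4 types. */
struct resize_buf {
	int cqe;
};

struct cq {
	int cqe;
	struct resize_buf *resize_buf;
};

/* Returns 0 on success, -1 if there is no pending resize buffer. */
int cq_commit_resize(struct cq *cq)
{
	assert(cq != NULL);

	/* Test the pointer before the first access, not after it. */
	if (cq->resize_buf == NULL)
		return -1;

	cq->cqe = cq->resize_buf->cqe;
	return 0;
}
```

With the check placed first, a later test of the same pointer is either redundant or a sign that the earlier access was unsafe, which is what the warning is hinting at.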

Yevgeny Kliteynik | 1 Jun 2010 13:58

Re: [PATCH] opensm: fixing compilation issues in some header files

Sasha,

On 24-Mar-10 5:50 PM, Yevgeny Kliteynik wrote:
> All the compilation issues refer to implicit casting
> from "void*" to "some_struct_t*"

This was detected when compiling code that includes these
headers with g++. The headers should compile cleanly
under g++ (they have 'extern "C"'). But the problem is not only
with g++ - it is with typing in general. I may be wrong, but I
feel that every new gcc version is more strictly typed.
Moreover, it looks like it will become stricter still in the
future, as the GCC folks themselves are starting to compile GCC
with G++:

http://gcc.gnu.org/ml/gcc/2010-05/msg00705.html

-- Yevgeny
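
To illustrate the kind of change involved: C converts "void*" to any object pointer type implicitly, while C++ rejects the conversion without a cast. The sketch below uses a made-up structure and function names (item_t, item_new, item_value), not anything from the OpenSM headers; with the explicit casts it should build with both gcc and g++:

```c
#include <stdlib.h>

/* Hypothetical structure, used only for illustration. */
typedef struct item {
	int value;
} item_t;

item_t *item_new(int value)
{
	/* calloc() returns void *; the cast is redundant in C but required in C++. */
	item_t *p = (item_t *)calloc(1, sizeof(*p));

	if (p)
		p->value = value;
	return p;
}

/* Callback-style accessor that receives its context as void *. */
int item_value(void *ctx)
{
	item_t *p = (item_t *)ctx;	/* without the cast g++ refuses to compile this */

	return p->value;
}
```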

  
> Signed-off-by: Yevgeny Kliteynik<kliteyn@...>
> ---
>   opensm/include/opensm/osm_pkey.h   |    8 +++++---
>   opensm/include/opensm/osm_port.h   |    4 ++--
>   opensm/include/opensm/osm_subnet.h |    2 +-
>   3 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/opensm/include/opensm/osm_pkey.h b/opensm/include/opensm/osm_pkey.h
> index d10479d..53e9657 100644
> --- a/opensm/include/opensm/osm_pkey.h
[...]

Tziporet Koren | 1 Jun 2010 15:28

Re: [ewg] libibverbs without an HCA

On 6/1/2010 2:39 PM, Albert Strasheim wrote:
> Hello all
>
> Having just reviewed the various on-host shared memory mechanisms
> available in Linux and being quite unimpressed, I have the following
> question:
>
> Has anyone considered making a kernel and userspace driver so that
> libibverbs can be used on a single machine without any Infiniband
> hardware?
>
> The Verbs API seems like a very nice way to handle shared memory
> between processes where the number of shared buffers can vary in size
> and quantity.
>
> Regards
>
> Albert
>
>    
These kinds of questions should be addressed to the linux-rdma list
(already CC'ed).

Tziporet

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@...
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Mike Heinz | 1 Jun 2010 16:35

[PATCH] ipoib pkey race condition

IPoIB is coded to use the first PKey in the PKey table for its ib0 interface.
Additional ib0.pkey interfaces may be created using the /sys/class/...
add_child interface.

However, there is a race.  During normal boot, IPoIB will be started before the
port is Active.  Hence the pkey table has not yet been programmed and still
holds the default pkey table (with 0xffff as the only pkey).

Later when the SM moves the port to Active, the SM may program the pkey table
differently.  However at this point IPoIB has already started using the
incorrect pkey.

It appears that the initially formatted 'broadcast' mgid is never updated to
carry the actual pkey value if ipoib comes up before the hca port. The proposed
patch targets two issues:

1. Suppress activation of the interface and the multicast group join queries
(they will fail anyway) until the hca port is initialized. When the port becomes
active, update the pkey value and move on.
2. Update the broadcast mgid based on the actual pkey, then issue the join
broadcast group request.
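
For readers unfamiliar with how the pkey ends up in the broadcast mgid, here is a rough sketch of step 2. It assumes the IPv4 broadcast layout from RFC 4391 (ff12:401b:<pkey>:0000:0000:0000:ffff:ffff, i.e. the pkey in bytes 4 and 5 of the 16-byte mgid); the helper name fill_bcast_mgid is made up and this is not the code from the patch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed template: ff12:401b:<pkey>:0000:0000:0000:ffff:ffff */
static const uint8_t bcast_template[16] = {
	0xff, 0x12, 0x40, 0x1b, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff
};

/* Hypothetical helper: rebuild the broadcast mgid for the given pkey. */
void fill_bcast_mgid(uint8_t mgid[16], uint16_t pkey)
{
	assert(mgid != NULL);

	memcpy(mgid, bcast_template, sizeof(bcast_template));
	mgid[4] = (uint8_t)(pkey >> 8);		/* pkey, high byte */
	mgid[5] = (uint8_t)(pkey & 0xff);	/* pkey, low byte */
}
```

If the mgid is formatted once at startup with the default 0xffff and never rebuilt after the SM programs a different pkey, the join request goes out for the wrong group, which matches the symptom described above.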

Signed-Off-By: Michael Heinz <michael.heinz@...>

-------
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index ec6b4fb..496d96c 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -51,6 +51,7 @@ MODULE_PARM_DESC(data_debug_level,
[...]

Jim Schutt | 1 Jun 2010 16:56

[PATCH v2] opensm/qos.c: Revert port ranges for calls to sl2vl_update_table().

Before commit 051a1dd5 (opensm/osm_qos.c: split switch external and end
ports setup), osm_qos_setup() would end up calling sl2vl_update_table()
for output ports 1-N, and input ports 0-N.

Commit 051a1dd5 changed this around to be output ports 0-N, and input
ports 1-N, and an InfiniScale IV-based fabric would log lots of errors
like these:

  log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
  SubnGetResp(SLtoVLMappingTable), attr_mod 0x2300, TID 0xad069
  Initial path: 0,1,1,4,13 Return path: 0,25,1,7,10

The attr_mod in every such message has 0x00 in the least significant
byte, which specifies the output port.
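
The attr_mod reading above can be checked with a small decoder. This is a sketch that assumes the input port sits in the byte directly above the output port; the text here only states that the least significant byte is the output port. The struct and function names are made up for illustration:

```c
#include <stdint.h>

/* Hypothetical holder for the two port numbers packed into attr_mod. */
struct sl2vl_ports {
	uint8_t in_port;
	uint8_t out_port;
};

/* Assumed layout: bits 15:8 = input port, bits 7:0 = output port. */
struct sl2vl_ports sl2vl_decode_attr_mod(uint32_t attr_mod)
{
	struct sl2vl_ports p;

	p.in_port = (uint8_t)((attr_mod >> 8) & 0xff);
	p.out_port = (uint8_t)(attr_mod & 0xff);
	return p;
}
```

Under that assumption, the 0x2300 from the log decodes to input port 0x23 and output port 0, i.e. an attempt to set up a mapping towards port 0.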

With the port ranges restored to their old values, the above log messages
stop.  Hal Rosenstock pointed out that we should not be attempting
to program a base SP0 with SL2VL maps; see, e.g.,  IBA 1.2.1, section
14.2.5.8, page 844.  So, this patch is a full reversion for
switches supporting base SP0, but only a partial reversion for
switches supporting enhanced SP0.

Signed-off-by: Jim Schutt <jaschut@...>
---
 opensm/opensm/osm_qos.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/opensm/opensm/osm_qos.c b/opensm/opensm/osm_qos.c
index f814ea8..cce59ee 100644
--- a/opensm/opensm/osm_qos.c
[...]

Hefty, Sean | 1 Jun 2010 17:18

RE: [ANNOUNCE] librdmacm 1.0.12

>   gcc -DHAVE_CONFIG_H -I. -I. -I. -I./include -g -Wall -D_GNU_SOURCE -O2 -g
> -pipe -m64 -MT src_librdmacm_la-acm.lo -MD -MP -MF .deps/src_librdmacm_la-
> acm.Tpo -c src/acm.c  -fPIC -DPIC -o
> .libs/src_librdmacm_la-acm.o
> In file included from src/acm.c:44:
> ./include/infiniband/ib.h:49: error: syntax error before "__be16"

At least the following definitions are missing from types.h in Red Hat 4.x,
which is based on 2.6.9.  (These are from RH 5.x.)

#ifdef __CHECKER__
#define __bitwise__ __attribute__((bitwise))
#else
#define __bitwise__
#endif
#ifdef __CHECK_ENDIAN__
#define __bitwise __bitwise__
#else
#define __bitwise
#endif

typedef __u16 __bitwise __le16;
typedef __u16 __bitwise __be16;
typedef __u32 __bitwise __le32;
typedef __u32 __bitwise __be32;
#if defined(__GNUC__) && !defined(__STRICT_ANSI__)
typedef __u64 __bitwise __le64;
typedef __u64 __bitwise __be64;
#endif


Sasha Khapyorsky | 1 Jun 2010 17:32

Re: [PATCH] opensm: Add a rate based mechanism for SMP transactions

Hi Hal,

On 10:11 Wed 16 Dec     , Hal Rosenstock wrote:
> 
> In order to better handle non responsive SMAs (when link is physically up
> but the SMA does not respond), a rate based mechanism for SMPs is added
> to better enable forward progress in a more timely fashion. So rather than
> wait for timeouts and outstanding wire SMPs to drop below some configured
> value, there is also a periodic rate for transaction based SMPs. These
> rate based SMPs are capped at a configured maximum value. In order to
> accommodate these, the vendor layer ibumad match table is increased by
> that number in order not to overflow due to these added transactions.
> 
> Two new options are added for this:
> rate_based_smp_usecs indicates the number of microseconds between rate
> based SMPs. 
> max_rate_based_smps indicates the maximum number of rate based SMPs
> supported. When this limit is reached, rate based SMPs are no longer
> sent (until the number of outstanding ones drops below this limit).
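
For reference, the mechanism described in the quoted text boils down to a timer check plus a cap on outstanding transactions. The sketch below is not taken from the patch; the state structure and function names are hypothetical, and only the two option names come from the description:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for rate based SMPs. */
struct rate_smp_state {
	uint64_t rate_based_smp_usecs;		/* microseconds between rate based SMPs */
	unsigned int max_rate_based_smps;	/* cap on outstanding rate based SMPs */
	unsigned int outstanding;		/* rate based SMPs currently on the wire */
	uint64_t last_sent_usecs;		/* time the last one was sent */
};

/* May another rate based SMP be sent at time now_usecs (monotonic clock)? */
bool rate_smp_may_send(const struct rate_smp_state *s, uint64_t now_usecs)
{
	if (s->outstanding >= s->max_rate_based_smps)
		return false;	/* limit reached, wait for completions */
	return now_usecs - s->last_sent_usecs >= s->rate_based_smp_usecs;
}

/* Record that a rate based SMP was sent. */
void rate_smp_sent(struct rate_smp_state *s, uint64_t now_usecs)
{
	s->outstanding++;
	s->last_sent_usecs = now_usecs;
}

/* Record a response or timeout for a rate based SMP. */
void rate_smp_completed(struct rate_smp_state *s)
{
	if (s->outstanding > 0)
		s->outstanding--;
}
```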

As far as I understand the patch... Wouldn't something like the below do
the same job:

diff --git a/opensm/opensm/osm_vl15intf.c b/opensm/opensm/osm_vl15intf.c
index ff9e4db..a16d88e 100644
--- a/opensm/opensm/osm_vl15intf.c
+++ b/opensm/opensm/osm_vl15intf.c
@@ -113,6 +113,8 @@ static void vl15_poller(IN void *p_ptr)
 	osm_madw_t *p_madw;
 	osm_vl15_t *p_vl = p_ptr;
[...]

Hefty, Sean | 1 Jun 2010 19:33

[PATCH] librdmacm: support 2.6.9

Redhat 4.x is based on 2.6.9.  Add support for older kernels.

Signed-off-by: Sean Hefty <sean.hefty@...>
---
This should fix the OFED build errors on RH 4.x.  When testing this on a
RH 4.x system, I noticed additional build warnings on 32-bit systems.
I'll add a fix for these warnings separately. 

 include/infiniband/ib.h |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/include/infiniband/ib.h b/include/infiniband/ib.h
index 3a97322..2e5029a 100644
--- a/include/infiniband/ib.h
+++ b/include/infiniband/ib.h
@@ -43,6 +43,16 @@
 #define PF_IB AF_IB
 #endif

+#ifndef __be16
+#define __be16 __u16
+#endif
+#ifndef __be32
+#define __be32 __u32
+#endif
+#ifndef __be64
+#define __be64 __u64
+#endif
+
 struct ib_addr {
[...]

Sasha Khapyorsky | 1 Jun 2010 20:37

Re: [PATCH v2] opensm/osmeventplugin: added couple of events to monitor SM

Hi Yevgeny,

On 12:20 Wed 07 Apr     , Yevgeny Kliteynik wrote:
> 
> I've added a couple of new events that allow event
> plug-in to see what SM is doing, when it is sweeping
> and when it updates dump files:
> 
>   OSM_EVENT_ID_L_SWEEP_STARTED,
>   OSM_EVENT_ID_L_SWEEP_DONE,
>   OSM_EVENT_ID_H_SWEEP_STARTED,
>   OSM_EVENT_ID_H_SWEEP_DONE,
>   OSM_EVENT_ID_REROUTE_DONE,
>   OSM_EVENT_ID_ENTERING_STANDBY,
>   OSM_EVENT_ID_SM_PORT_DOWN,
>   OSM_EVENT_ID_SA_DB_DUMPED
> 
> The last event is reported when SA DB was actually dumped.
> I'm thinking of similar optimization for guid2lid file - it
> doesn't have to be dumped at the end of each heavy sweep,
> as many heavy sweeps don't really happen because of nodes
> appearing/disappearing.

I don't think that having a lot of events and spamming the OpenSM core code
with osm_opensm_report_event() calls was an original goal. The
plugin interface is done so that it has full access to OpenSM internal
data structures, etc. So only *really* important things (such as SUBNET
UP) will be delivered as events.

Also, when sending a patch like this it would be really nice to have
[...]

Sasha Khapyorsky | 1 Jun 2010 20:39

Re: [PATCH v2] opensm/qos.c: Revert port ranges for calls to sl2vl_update_table().

On 08:56 Tue 01 Jun     , Jim Schutt wrote:
> Before commit 051a1dd5 (opensm/osm_qos.c: split switch external and end
> ports setup), osm_qos_setup() would end up calling sl2vl_update_table()
> for output ports 1-N, and input ports 0-N.
> 
> Commit 051a1dd5 changed this around to be output ports 0-N, and input
> ports 1-N, and an InfiniScale IV-based fabric would log lots of errors
> like these:
> 
>   log_rcv_cb_error: ERR 3111: Received MAD with error status = 0x1C
>   SubnGetResp(SLtoVLMappingTable), attr_mod 0x2300, TID 0xad069
>   Initial path: 0,1,1,4,13 Return path: 0,25,1,7,10
> 
> The attr_mod in every such message has 0x00 in the least significant
> byte, which specifies the output port.
> 
> With the port ranges restored to their old values, the above log messages
> stop.  Hal Rosenstock pointed out that we should not be attempting
> to program a base SP0 with SL2VL maps; see, e.g.,  IBA 1.2.1, section
> 14.2.5.8, page 844.  So, this patch is a full reversion for
> switches supporting base SP0, but only a partial reversion for
> switches supporting enhanced SP0.
> 
> Signed-off-by: Jim Schutt <jaschut@...>

Applied. Thanks.

Sasha