Thanks for your reply. I have a few follow-up comments below, marked
<hnr>...</hnr>.
----- Original Message -----
Sent: Wednesday, July 28, 2004 6:33 PM
Subject: Re: [Ipoverib] Some Questions on the IPoIB - Connected Mode Draft
Hal,
Thanks for your comments. My replies are within <VK> below.
Vivek
--
Vivek Kashyap
Linux Technology Center,
IBM
vivk <at> us.ibm.com
kashyapv <at> us.ibm.com
Ph: 503 578 3422
T/L: 775 3422
Hi,
I just finished reading the "IPoIB connected mode" I-D
(http://www.ietf.org/internet-drafts/draft-kashyap-ipoib-connected-mode-02.txt)
and have some questions:
1. In 2.2 Outline of Address Resolution, it is stated that:
Every IPoIB-CM interface MUST have two QPs associated with it:
1) A connected mode QP
2) An unreliable datagram mode QP
I'm a little confused by the description of the first QP. Is this really a
UD QP used for connected mode address resolution (when the interface is not
in the same scope and partition as the UD one)? It is also later stated (in
3.0 Address Resolution) that IPoIB-CM can use the same UD QP as the one used
by the IPoIB-UD interface when they share the same partition and scope, so
this seems to contradict the earlier statement. It seems like the UD QP for
connected mode is only required when the same partition and scope are not
shared with the IPoIB-UD interface.
<VK>
Connected mode does not support multicasting. Address resolution
is best implemented as a broadcast/multicast service for IP. Therefore, the
idea is to use an unreliable datagram QP for address resolution - the 2nd QP
mentioned. The first QP is the connected mode QP (RC or UC), over which data
transfer occurs after address resolution. That is, there are two types of
QPs required when implementing IPoIB-CM: 1) a connected mode QP and 2) an
unreliable datagram QP.
It is further suggested that one MAY choose to use the
same UD QP as used for IPoIB-UD for address resolution but it is not required
that one do so. One can certainly use a different UD QP. This follows from the
broadcast GID being the same in all cases if the IP version, scope and P_Key
are the same.
<VK>
<hnr> It might be clearer as:
Every IPoIB-CM interface MUST have two or more QPs associated with it:
1) One or more connected mode QPs
2) An unreliable datagram mode QP for address resolution
</hnr>
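A minimal sketch of the sharing rule discussed above: the broadcast GID is fixed by IP version, scope and P_Key, so an IPoIB-CM interface MAY reuse the UD QP already joined to that group, or use a separate one. The names BroadcastKey and UdQpRegistry, the integer QPNs and the sample scope and P_Key values are hypothetical and are not taken from the draft.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BroadcastKey:
    """The three values that, per the thread, determine the broadcast GID."""
    ip_version: int  # 4 or 6
    scope: int       # illustrative scope value
    p_key: int       # 16-bit partition key


class UdQpRegistry:
    """Hypothetical table: broadcast group -> UD QPN joined to that group."""

    def __init__(self) -> None:
        self._by_group: Dict[BroadcastKey, int] = {}
        self._next_qpn = 0x100  # made-up starting QPN for the sketch

    def ud_qp_for(self, key: BroadcastKey, share: bool = True) -> int:
        """Return a UD QPN to use for address resolution on this group.

        share=True reuses an existing UD QP for the same group (the MAY case);
        share=False always allocates a separate UD QP.
        """
        if share and key in self._by_group:
            return self._by_group[key]
        qpn = self._next_qpn
        self._next_qpn += 1
        self._by_group.setdefault(key, qpn)
        return qpn


reg = UdQpRegistry()
ud_if = reg.ud_qp_for(BroadcastKey(4, 0x2, 0xFFFF))     # IPoIB-UD interface
cm_if = reg.ud_qp_for(BroadcastKey(4, 0x2, 0xFFFF))     # IPoIB-CM, same group
other = reg.ud_qp_for(BroadcastKey(4, 0x2, 0x8001))     # different P_Key
separate = reg.ud_qp_for(BroadcastKey(4, 0x2, 0xFFFF), share=False)
```

Either choice satisfies the two-QP requirement; the connected mode QP(s) for data transfer are tracked separately.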
2. In 3.1 Link Layer Address, it is stated that the RC and UC flags are
mutually exclusive. Is this a requirement or more of a configuration issue?
Is there something which would break with both flags on? The only issue I
see is if an IPoIB-CM subnet includes some nodes supporting only RC and
others only UC. Those couldn't be mixed. I suppose setting both opens the
door for that to occur.
<VK> Yes, exactly. Therefore the draft keeps the two separate. <VK>
3. In 3.2 IB Connection Setup, it is stated that the node SHOULD NOT
attempt another connection to the remote peer using the same service ID as
for an already existing connection. Wouldn't that potentially be useful for
support of QoS? Is this a recommendation (SHOULD NOT) just to reduce the
number of connections (and QPs)?
<VK> The IB level connection link is between two machines.
The peer might return the same QP, reject the request or grant another as per
IB. The caller needs to keep track of this in its internal tables.
Additionally, the respondent should do the same if it accepts multiple IB
connections. These aspects are hidden from the IP layer. Hence, though not
disallowed since it can be useful, multiple IB connections must be handled
carefully by the user. <VK>
<hnr> I don't understand how the peer could return the same QP, as RC and UC
QPs are 1:1. I think it either accepts the connection and uses another QP or
rejects it. </hnr>
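The bookkeeping described above can be sketched as a small table on the passive side. It assumes, as Hal notes, that a connected QP serves exactly one connection, so a second request for the same peer and service ID is either accepted on a fresh QP or rejected. ConnectionTable, the string peer identifiers and the "REP"/"REJ" return values are hypothetical stand-ins, not part of the draft or of any IB API.

```python
from typing import Dict, List, Set, Tuple

ConnKey = Tuple[str, int]  # (peer identifier, service ID), both illustrative


class ConnectionTable:
    """Hypothetical per-node table of IB connections, hidden from the IP layer."""

    def __init__(self, allow_multiple: bool = False) -> None:
        self.allow_multiple = allow_multiple
        self._by_peer: Dict[ConnKey, List[int]] = {}
        self._qps_in_use: Set[int] = set()

    def on_request(self, peer: str, service_id: int, free_qpn: int) -> str:
        """Decide how to answer a connection request; free_qpn is a local QP."""
        if free_qpn in self._qps_in_use:
            # RC and UC QPs are 1:1, so one QP cannot back two connections.
            raise ValueError("QP already bound to another connection")
        existing = self._by_peer.setdefault((peer, service_id), [])
        if existing and not self.allow_multiple:
            return "REJ"
        existing.append(free_qpn)
        self._qps_in_use.add(free_qpn)
        return "REP"

    def qps_for(self, peer: str, service_id: int) -> List[int]:
        """Local QPNs currently connected to this peer for this service ID."""
        return list(self._by_peer.get((peer, service_id), []))


single = ConnectionTable()
first = single.on_request("peer-a", 1, 10)    # accepted on QP 10
second = single.on_request("peer-a", 1, 11)   # rejected, one already exists

multi = ConnectionTable(allow_multiple=True)
multi.on_request("peer-a", 1, 10)
multi.on_request("peer-a", 1, 11)             # accepted on a different QP
```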
4. In 3.3 Service ID, wouldn't it be better if the QPN didn't overlap a
32-bit boundary?
<VK> The QPN is 24 bits, though we could move it by a byte. <VK>
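As a rough illustration of the alignment question, the check below tests whether a 24-bit field placed at a given bit offset crosses a 32-bit word boundary. The offsets are invented for the example and do not reflect the draft's actual Service ID layout; bits are counted from the least significant end, which is an arbitrary convention for this sketch.

```python
def straddles_32bit_boundary(bit_offset: int, width: int = 24) -> bool:
    """True if a field of `width` bits starting at `bit_offset` spans two words."""
    first_word = bit_offset // 32
    last_word = (bit_offset + width - 1) // 32
    return first_word != last_word


# Illustrative only: a 24-bit QPN at bit 16 occupies bits 16..39 and straddles,
# while shifting it by one byte to bit 8 (bits 8..31) keeps it in one word.
before_move = straddles_32bit_boundary(16)
after_move = straddles_32bit_boundary(8)
```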
5. In 5.1 Per Connection MTU, why does the peer need to REJ the connection
if the REQ desired MTU is not acceptable (and is more than the minimum MTU)?
Couldn't we REP with the "MTU granted" for that case? The active side which
issued the REQ could always REJ the REP if it didn't like the "MTU granted".
<VK> I'd suggest the following to make it a very simple transaction.
The private data field includes only one value, the 'Desired MTU'. The
'minimum MTU' is the 'link MTU', i.e. the value received on joining the
broadcast-GID (it will be known to all nodes). If the receiver cannot accept
the Desired MTU then it responds with the 'link MTU'. If it can then it
responds with the 'Desired MTU'.
An alternative, based on your comment above, is to let the response be any
value between the minimum and the 'desired'. Then the original requester has
the option of either accepting that value or responding with the 'Minimum'.
REJ is a bad option since the nodes belong to the same
'link' and hence should be able to talk to one another. REJ can be used for
error cases only.
<VK>
<hnr> Your idea is simpler and better. The problem with my idea is that the
original requester will need to send private data in the RTU indicating the
MTU he accepts, and the RTU cannot be relied upon. It may be lost, so it is a
bad idea to rely on it. </hnr>
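The simple exchange Vivek proposes can be sketched as a single decision on the passive side: reply with the Desired MTU if it is acceptable, otherwise reply with the link MTU, and never REJ for an MTU mismatch. The function name rep_mtu, the max_acceptable parameter (the largest MTU the responder is willing to use) and the numeric values are assumptions made for illustration.

```python
def rep_mtu(desired_mtu: int, link_mtu: int, max_acceptable: int) -> int:
    """MTU the responder returns in its REP private data.

    desired_mtu    -- the single value carried in the REQ private data
    link_mtu       -- the minimum MTU, learned when joining the broadcast-GID
    max_acceptable -- hypothetical local limit of the responder
    """
    if desired_mtu <= max_acceptable:
        return desired_mtu  # the responder can accept the Desired MTU
    return link_mtu         # fall back to the value every node already knows


# Illustrative numbers only.
granted_large = rep_mtu(desired_mtu=16384, link_mtu=2044, max_acceptable=32768)
granted_small = rep_mtu(desired_mtu=65536, link_mtu=2044, max_acceptable=32768)
```

Because the reply is always either the Desired MTU or the link MTU, the requester learns the outcome from the REP alone and nothing needs to be carried in the RTU.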
-- Hal