A Couple of IPoIB Questions
2004-11-16 21:11:25 GMT
_______________________________________________ IPoverIB mailing list IPoverIB <at> ietf.org https://www1.ietf.org/mailman/listinfo/ipoverib
Hal Rosenstock wrote:
> Hi,
Hi,
> I have a couple of questions relative to IPoIB:
>
> 1. draft-ietf-ipoib-ip-over-infiniband-07.txt states:
> "Every IPoIB interface MUST "FullMember" join the IB multicast group
> defined by the broadcast-GID."
>
> Isn't the broadcast group for IPv4 ? When the IPoIB interface is IPv6
> only, does this group still need be joined ?
> If not, where do the parameters for any IPv6 groups come from ? I am
> presuming that this group needs to be joined in
> the IPv6 only case. I just want to be sure.
Previously on the WG, we went through a discussion on this, and the
consensus was that all interfaces (irrespective of ipv4 only, ipv6 only,
or ipv4 and ipv6) MUST join the broadcast-GID and obtain parameters for
all IPv4 and IPv6 groups from this one single broadcast-GID. We further
discussed changing the signature part of the address of the broadcast
group to reflect that it was IPv4 and IPv6 agnostic, but maintained the
IPv4 signature to make it easier for current implementations to make any
required changes to adapt to this rule.
Thanks.
Kanoj
> 2. Also, what is the latest status of Vivek's connected mode draft ?
> Will it be moving forward ?
>
> Thanks.
>
> -- Hal
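To make the "one broadcast-GID for both IPv4 and IPv6" rule concrete, here is a minimal Python sketch that builds the group address. It assumes the layout later published in RFC 4391 (ff1<scope>:401b:<P_Key>:0:0:0:ffff:ffff, where 0x401B is the IPv4 signature); the -07 draft discussed here may differ in detail. The function name and the default P_Key and scope values are illustrative only.

```python
import ipaddress

# Assumption: 0x401B is the IPv4 signature retained in the broadcast-GID,
# as laid out in RFC 4391 (which postdates this thread).
IPV4_SIGNATURE = 0x401B


def broadcast_gid(p_key: int = 0xFFFF, scope: int = 0x2) -> ipaddress.IPv6Address:
    """Build ff1<scope>:401b:<P_Key>:0:0:0:ffff:ffff as a 128-bit GID."""
    if not 0 <= scope <= 0xF:
        raise ValueError("scope must fit in 4 bits")
    if not 0 <= p_key <= 0xFFFF:
        raise ValueError("P_Key must fit in 16 bits")
    words = [0xFF10 | scope, IPV4_SIGNATURE, p_key, 0, 0, 0, 0xFFFF, 0xFFFF]
    value = 0
    for word in words:
        value = (value << 16) | word  # pack eight 16-bit words, most significant first
    return ipaddress.IPv6Address(value)


# The same group is joined whether the interface runs IPv4, IPv6, or both.
print(broadcast_gid().exploded)
```

The address does not depend on which IP versions the interface runs, which is the point of Kanoj's answer: an IPv6-only interface joins the same group and takes its multicast parameters from it.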
See below in <VK>
Vivek
--
Vivek Kashyap
Linux Technology Center, IBM
vivk <at> us.ibm.com
kashyapv <at> us.ibm.com
Ph: 503 578 3422 T/L: 775 3422
"Hal Rosenstock" <hnrose <at> earthlink.net>
Sent by: ipoverib-bounces <at> ietf.org
11/16/2004 01:11 PM
Please respond to Hal Rosenstock
To: "IPoverIB" <ipoverib <at> ietf.org>
cc:
Subject: [Ipoverib] A Couple of IPoIB Questions
Hi,
I have a couple of questions relative to IPoIB:
1. draft-ietf-ipoib-ip-over-infiniband-07.txt states:
"Every IPoIB interface MUST "FullMember" join the IB multicast group defined by the broadcast-GID."
Isn't the broadcast group for IPv4 ? When the IPoIB interface is IPv6 only, does this group still need be joined ?
If not, where do the parameters for any IPv6 groups come from ? I am presuming that this group needs to be joined in
the IPv6 only case. I just want to be sure.
<VK> Yes, the broadcast-GID is at the InfiniBand layer and MUST be joined whether you are running at v4 or v6 layer. <VK>
2. Also, what is the latest status of Vivek's connected mode draft ? Will it be moving forward ?
<VK> I'll be submitting it as draft-ietf-ipoib-connected-mode-00.txt by the end of the month. There were some interesting suggestions that were made during the IETF WG meeting. Two of the suggestions of consequence are given below. The others we can discuss when the minutes are published (they include some additional requests on clarification on the transmission draft too).
a. The current draft makes the various modes mutually exclusive i.e. RC, UC and UD are not allowed simultaneously in the same IP subnet. The thought is that it is a link characteristic and hence different per connection mode. It was suggested that one be allowed to mix up RC/UC. This goes back to the original suggestion in the first draft which was:
IPoIB-UD must always be supported. Additionally, the interface can also support either both of RC and UC, or one of them. Or neither of them.
b. Another suggestion was to allow multiple connected mode links (i.e. at IB UC/RC level) between peers.
One thought can be 'yes, but user beware': The IB connections are made using the service ID that is derived from the QPN as described in the draft. If a second attempt succeeds then there are two links. It is up to the implementation to either allow or disallow multiple links.
Hi,
I have a couple of questions relative to IPoIB:
1. draft-ietf-ipoib-ip-over-infiniband-07.txt states:
"Every IPoIB interface MUST "FullMember" join the IB multicast group defined by the broadcast-GID."
Isn't the broadcast group for IPv4 ? When the IPoIB interface is IPv6 only, does this group still need be joined ?
If not, where do the parameters for any IPv6 groups come from ? I am presuming that this group needs to be joined in the IPv6 only case. I just want to be sure.
<VK> Yes, the broadcast-GID is at the InfiniBand layer and MUST be joined whether you are running at v4 or v6 layer. <VK>
2. Also, what is the latest status of Vivek's connected mode draft ? Will it be moving forward ?
<VK> I'll be submitting it as draft-ietf-ipoib-connected-mode-00.txt by the end of the month. There were some interesting suggestions that were made during the IETF WG meeting. Two of the suggestions of consequence are given below. The others we can discuss when the minutes are published (they include some additional requests on clarification on the transmission draft too).
a. The current draft makes the various modes mutually exclusive i.e. RC, UC and UD are not allowed simultaneously in the same IP subnet. The thought is that it is a link characteristic and hence different per connection mode. It was suggested that one be allowed to mix up RC/UC. This goes back to the original suggestion in the first draft which was:
IPoIB-UD must always be supported. Additionally, the interface can also support either both of RC and UC, or one of them. Or neither of them.
UD MUST always be supported.
<VK> That is and has always been the requirement right from the first draft. <VK>
I personally don't care whether one does RC or UC but I don't think both are required as a MAY option. The advantage of RC is the send credit algorithm. The advantage of UC is the lack of ACK packets. ACK is noise in the fabric while send credits provide a simple method to maintain bandwidth / injection control on a per flow basis.
I see no problems with supporting both UD and *C on the same subnet; it is rather foolish to attempt to mandate these be on separate subnets.
<VK> As per the connected-mode draft the UD mechanism is *always* required; address resolution depends on it.
The only point of discussion is whether all nodes must support the same link characteristics in the subnet i.e. all are RC (and UD), or all are UC (and UD), or all are UD only.
The alternative is to allow all the nodes to be mixed up, with some nodes being RC/UD, others UC/UD, a third set UD only, and yet others probably supporting all, within the same IP subnet. [Can the same serviceID be used by both RC and UC ?]
The third alternative is to associate UD only, or UD + one of RC or UC, with the same interface. In such a case, if the connected modes supported by two nodes are mismatched or unsupported, then they fall back to UD. This option is not too different from the UD QP + RC or UC mechanism.
<VK>
b. Another suggestion was to allow multiple connected mode links (i.e. at IB UC/RC level) between peers. One thought can be 'yes, but user beware': The IB connections are made using the service ID that is derived from the QPN as described in the draft. If a second attempt succeeds then there are two links. It is up to the implementation to either allow or disallow multiple links.
Again, this has been suggested in the past (though most who were involved in the original discussions years gone by are likely gone since much of this discussion occurred before the IETF workgroup was established).
<VK> I'm one of the vestiges of those early times along with you and a few others...so we have hope :). <VK>
There is obvious benefit to supporting multiple RC per endnode pair. I do not see any technical reason to oppose nor any issue from an interoperability perspective. There is no reason for a "user beware".
<VK> It is not opposed. The 'user beware' is only underscoring that the peer interface might not support multiple links; it might enforce a limited number of connections (maybe only one) between a pair of GIDs. Similarly, an implementation not wanting to support multiple links MUST take steps to deny multiple requests.
<VK>
The work is rather straightforward to do and implement, and the benefit to customers is, again, rather obvious when one considers what the IB fabric offers and how connections can enable flows through multipath as well as transparent fail-over, flow scheduling, mapping of DiffServ to different arbitration / paths, etc.
<VK> In addition, Large MTU and APM are two of the main reasons why I've been proposing IPoIB-connected mode for so long. In terms of IPoIB itself, except for the Large MTU, the parameters are hidden from it. <VK>
> Mike the format is really off in the last mail from you -
> making it difficult to follow.
Vivek, I think that if you used standard quoting in your replies
instead of your own "<VK>" format, it would be much easier to follow
email threads involving your replies.
Thanks,
Roland
Mike the format is really off in the last mail from you - making it difficult
to follow.
Other than that let us discuss in the context of the draft. The draft is
built upon the following:
1. IPoIB-RC and IPoIB-UC are optional.
2. IPoIB connected mode depends on a UD QP for address resolution and multicast.
As far as I know, there has been an agreement since the earliest connected mode
draft I posted.
I'd like the WG to give input on the following issues:
3. Where does the UD QP come from? Choose one of:
a. It is a UD QP that is associated with the interface at startup.
b. It is a UD QP that is shared with IPoIB-UD.
3a is more generic. It can be considered to include the case 3b. The original
proposal was limited to 3b.
4. Link characteristics
The broadcast domain for IPoIB-RC/UC is determined exactly as the
IPoIB-UD case i.e. through the broadcast-GID. A UD as per 3 is used in this
step.
Do all interfaces in the IPoIB-connected mode (CM) have the same link characteristics? i.e.
a. all are either IPoIB-RC or IPoIB-UC.
-- There is also a UD QP associated. The UD QP will be either 3a or 3b
based on WG consensus.
-- All unicast transmission is on the IPoIB mode i.e. RC or UC.
b. all are IPoIB-UD. Additionally they can be one of IPoIB-RC or IPoIB-UC
or both.
-- The presence of the flags indicates the type of communication possible.
-- The decision of communicating using a specific mode is determined by
the supported modes and the local policy. Note that incompatible
policies imply that the fallback is communication over UD.
-- fallback mode of communication is UD
4b adds a lot of flexibility at the expense of a simple decision. 4a. by
contrast is straightforward.
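Option 4b can be sketched as a small mode-selection routine. This is only an illustration under assumptions: the draft's actual flag encoding is not given in this mail, so the `Mode` flags, the function name, and modelling "local policy" as an ordered preference tuple are all hypothetical.

```python
from enum import Flag, auto


class Mode(Flag):
    """Hypothetical capability flags; the draft's real encoding is not shown here."""
    UD = auto()
    RC = auto()
    UC = auto()


def select_unicast_mode(local: Mode, peer: Mode,
                        preference: tuple = (Mode.RC, Mode.UC)) -> Mode:
    """Pick the unicast mode for one peer pair under option 4b.

    `preference` stands in for local policy: connected modes in the order
    this node is willing to try them. Anything not usable by both sides
    falls back to UD, which every interface must support.
    """
    if Mode.UD not in local or Mode.UD not in peer:
        raise ValueError("UD must always be supported")
    common = local & peer
    for mode in preference:
        if mode in common:
            return mode
    return Mode.UD  # incompatible modes or policies: communicate over UD


# An RC-capable node talking to a UC-capable node has no connected mode in common.
print(select_unicast_mode(Mode.UD | Mode.RC, Mode.UD | Mode.UC))
```

Under 4a the loop would be unnecessary, since every interface on the link would advertise the same single connected mode.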
5. MTU negotiation
In the private data field of the CM message the desired MTU is
included.
It was suggested during the IPoIB meeting at IETF that it need not be
symmetric. That is a good idea. Thus each peer declares the max MTU it
prefers
REQ: <my desired MTU>
REP: <my desired MTU>
RTU:
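A minimal sketch of the asymmetric exchange, assuming "my desired MTU" means the largest packet the declaring peer is willing to receive (one possible reading; the mail does not spell it out). The class and function names are hypothetical, and the private-data encoding in the CM messages is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionMtu:
    send_limit: int     # largest packet this side may send to its peer
    receive_limit: int  # largest packet this side agreed to receive


def exchange(req_mtu: int, rep_mtu: int):
    """Derive per-direction limits from the MTUs declared in REQ and REP.

    Returns (requester_view, responder_view). Each side's send limit is
    what the other side declared, so the two directions may differ.
    """
    if req_mtu <= 0 or rep_mtu <= 0:
        raise ValueError("declared MTU must be positive")
    requester = ConnectionMtu(send_limit=rep_mtu, receive_limit=req_mtu)
    responder = ConnectionMtu(send_limit=req_mtu, receive_limit=rep_mtu)
    return requester, responder


# Illustrative values only: the requester declares 32768, the responder 4096.
requester, responder = exchange(32768, 4096)
print(requester, responder)
```

No third value needs to travel in the RTU under this reading, which matches the empty RTU line above.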
6. Multiple connections for the same IP address
Local decision. Note that the peer might choose to not honour multiple
connections.
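The "local decision" can be illustrated with a responder-side admission check. This is a sketch, not anything taken from the draft: the class name, the per-GID counter, and the default limit of one connection are assumptions made for the example.

```python
from collections import Counter


class ConnectionPolicy:
    """Responder-side limit on connected-mode links per peer GID (hypothetical)."""

    def __init__(self, max_per_peer: int = 1):
        self.max_per_peer = max_per_peer
        self._active = Counter()  # peer GID -> number of open connections

    def on_request(self, peer_gid: str) -> bool:
        """Return True to accept a new connection request, False to reject it."""
        if self._active[peer_gid] >= self.max_per_peer:
            return False
        self._active[peer_gid] += 1
        return True

    def on_disconnect(self, peer_gid: str) -> None:
        """Release one slot when a connection to the peer is torn down."""
        if self._active[peer_gid] > 0:
            self._active[peer_gid] -= 1


policy = ConnectionPolicy(max_per_peer=1)
gid = "fe80::2:c903:0:1"  # made-up peer GID for the example
print(policy.on_request(gid), policy.on_request(gid))
```

A requester that wants several links simply issues several requests and uses however many are accepted; setting `max_per_peer` to 0 models a node that offers no connected-mode links at all.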
Vivek
On Wed, 17 Nov 2004, Michael Krause wrote:
> At 11:38 PM 11/16/2004, Vivek Kashyap wrote:
>
>
>
> >Hi, I have a couple of questions relative to IPoIB: 1.
> >draft-ietf-ipoib-ip-over-infiniband-07.txt states: "Every IPoIB interface
> >MUST "FullMember" join the IB multicast group defined by the
> >broadcast-GID." Isn't the broadcast group for IPv4 ? When the IPoIB
> >interface is IPv6 only, does this group still need be joined ? If not,
> >where do the parameters for any IPv6 groups come from ? I am presuming
> >that this group needs to be joined in the IPv6 only case. I just want to
> >be sure.
> ><VK> Yes, the broadcast-GID is at the InfiniBand layer and MUST be joined
> >whether you are running at v4 or v6 layer. <VK> 2. Also, what is the
> >latest status of Vivek's connected mode draft ? Will it be moving
> >forward ? <VK> I'll be submitting it as
> >draft-ietf-ipoib-connected-mode-00.txt by the end of the month. There were
> >some interesting suggestions that were made during the IETF WG meeting.
> >Two of the suggestions of consequence are given below. The others we can
> >discuss when the minutes are published (they include some additional
> >requests on clarification on the transmission draft too). a. The current
> >draft makes the various modes mutually exclusive i.e. RC, UC and UD are
> >not allowed simultaneously in the same IP subnet. The thought is that it
> >is a link characteristic and hence different per connection mode. It was
> >suggested that one be allowed to mix up RC/UC. This goes back to the
> >original suggestion in the first draft which was: IPoIB-UD must always be
> >supported. Additionally, the interface can also support either both of RC
> >and UC, or one of them. Or neither of them.
> >
> >UD MUST always be supported.
> >
> ><VK> That is and has always been the requirement right from the first
> >draft. <VK>
> >
> >I personally don't care whether one does RC or UC but I don't think both
> >are required as a MAY option. The advantage of RC is the send credit
> >algorithm. The advantage of UC is the lack of ACK packets. ACK is noise in
> >the fabric while send credits provide a simple method to maintain
> >bandwidth / injection control on a per flow basis.
> >
> >I see no problems with supporting both UD and *C on the same subnet; it is
> >rather foolish to attempt to mandate these be on separate subnets.
> ><VK> As per the connected-mode draft the UD mechanism is *always*
> >required; address resolution depends on it.
> >
> >The only point of discussion is whether all nodes must support the same
> >link characteristics in the subnet i.e. all are RC (and UD), or all are UC
> >(and UD), or all are UD only.
>
> Obviously I would oppose such a solution as it creates artificial
> constraints with little benefit.
>
> >The alternative is to allow all the nodes to be mixed up with some nodes
> >being RC/UD, others UC/UD and a third set UD only and yet others probably
> >supporting all, within the same IP subnet. [Can the same serviceID be used
> >by both RC and UC ?]
> >
> >The third alternative is to associate UD only or UD + one of RC or UC with
> >the same interface. In such a case if mismatched/unsupported connected
> >modes are supported by two nodes then they fall back to UD. This option is
> >not too different from UD QP + RC or UC mechanism.
>
> KISS:
>
> - UD universal
> - *C opportunistic
> - Local management issue to control what is sent on the *C
> interface. No need to specify
> - Advertise whether one or more ports are supported by UD or *C
> - Advertise whether one or more QP are supported by UD or *C
> - Let local management determine policy for what services are
> mapped where - no need to specify
>
> This is both an interoperable approach and simple to implement. There may
> be some desire to add a policy interface to state preference for specific
> types of traffic over a given QP. I would not oppose this but would view
> this as a separate draft once the basics are worked out.
>
>
>
> ><VK>
> >b. Another suggestion was to allow multiple connected mode links (i.e. at
> >IB UC/RC level) between peers. One thought can be 'yes, but user beware':
> >The IB connections are made using the service ID that is derived from the
> >QPN as described in the draft. If a second attempt succeeds then there are
> >two links. It is up to the implementation to either allow or disallow
> >multiple links.
> >
> >Again, this has been suggested in the past (though most who were involved
> >in the original discussions years gone by are likely gone since much of
> >this discussion occurred before the IETF workgroup was established).
> >
> ><VK> I'm one of the vestiges of those early times along with you and a few
> >others...so we have hope :). <VK>
> >
> >There is obvious benefit to supporting multiple RC per endnode pair. I do
> >not see any technical reason to oppose nor any issue from an
> >interoperability perspective. There is no reason for a "user beware".
> >
> ><VK> It is not opposed. The 'user beware' is only underscoring that the
> >peer interface might not support multiple links; it might enforce a
> >limited number of connections (maybe only one) between a pair of GIDs.
> >Similarly, an implementation not wanting to support multiple links MUST
> >take steps to deny multiple requests.
>
> *C requires CM to operate thus it is a local issue whether additional CM
> operations are accepted or not. A given requester node may issue N and a
> given responder may state 0-N as an implementation may limit the number of
> *C available for IP traffic.
>
>
> ><VK>
> >
> >The work is rather straightforward to do and implement and the benefit to
> >customers is, again, rather obvious when one considers what the IB fabric
> >offers and how connections can enable flows through multipath as well
> >as transparent fail-over, flow scheduling, mapping of DiffServ to
> >different arbitration / paths, etc.
> >
> ><VK> In addition Large MTU and APM are two of the main reasons why I've
> >been proposing IPoIB-connected mode for so long. In terms of IPoIB itself,
> >except for the Large MTU, the parameters are hidden from it.<VK>
>
> Mike
__
Vivek Kashyap
Linux Technology Center, IBM
In the last IETF61 IPoIB meeting I made several comments on the connected mode draft. I'm sending them to the list for a general discussion. (Yes, I saw some discussion on the connected mode draft already. I'll try to catch up with the thread after this mail.)
1. The draft makes a distinction between IPoIB-CM interfaces and IPoIB-UD interfaces, and portrays IPoIB-UC or IPoIB-RC as separate subnets superimposed on top of an IPoIB-UD subnet.
For the above to work, due to a lack of multicast support, a fully connected network by itself can't meet the requirement of an IP link unless multicast is fully emulated through the use of multiple unicasts. The latter is complex and cumbersome.
A much simpler model, which I think was presented in earlier drafts, is to fold the use of IB connections fully into a regular IPoIB-UD subnet, allowing any two IPoIB nodes to optionally negotiate the use of an IB connection between themselves.
This much simplified model is not without its drawbacks. Some nice IP link attributes are no longer unique within a link. E.g., the link MTU now becomes a per-node-pair MTU. Moreover, the MTU size for multicast will be different from the MTU size for unicast if IB connections are used. IB UC/RC may exhibit different RAS, flow control, QoS or other link characteristics than UD. But I consider these problems a reasonable price to pay for seamless support of UC/RC mode in an IPoIB link defined by UD.
2. The negotiation of the per-connection MTU seems more complicated than necessary. I think all that is needed is for a node to advertise its own "receive MTU". That is, the MTU size its peer should never go over when sending packets to the local interface. Yes, this may break the traditional concept of "symmetric" MTUs. But we're already breaking the notion of a per-link MTU, requiring a lot of changes in the host stack anyway. This additional breakage doesn't seem like much. I haven't verified whether this asymmetric MTU matches well with IBA connections though.
3. Regarding allowing multiple IB connections between a node pair: given an IP address there is only one link-address for it, implying one QPN and hence one service-ID. If a single service-ID can be used to create multiple IB connections then this can happen transparently. Otherwise we've got a problem.
Jerry