Internet-Drafts | 6 Jul 21:55 2015

I-D ACTION:draft-ietf-rtgwg-dt-encap-00.txt

A new Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the Routing Area Working Group of the IETF.

    Title         : Encapsulation Considerations

    Author(s)     : E. Nordmark, et al
    Filename      : draft-ietf-rtgwg-dt-encap
    Pages         : 42 
    Date          : 2015-07-06 

   The IETF Routing Area director has chartered a design team to look at
   common issues for the different data plane encapsulations being
   discussed in the NVO3 and SFC working groups and also in the BIER
   BoF, and also to look at the relationship between such encapsulations
   in the case that they might be used at the same time.  The purpose of
   this design team is to discover, discuss and document considerations
   across the different encapsulations in the different WGs/BoFs so that
   we can reduce the number of wheels that need to be reinvented in the
   future.

A URL for this Internet-Draft is:
https://www.ietf.org/internet-drafts/draft-ietf-rtgwg-dt-encap-00.txt

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/

Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

Erik Nordmark | 6 Jul 19:17 2015

Re: RtgDir review: draft-rtg-dt-encap-02

On 6/30/15 2:20 PM, Jamal Hadi Salim wrote:
> I have been selected as the Routing Directorate reviewer for this draft.
> The Routing Directorate seeks to review all routing or routing-related
> drafts as they pass through IETF last call and IESG review, and sometimes
> on special request. The purpose of the review is to provide assistance to
> the Routing ADs. For more information about the Routing Directorate, 
> please see http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir
>
> Although these comments are primarily for the use of the Routing ADs, it
> would be helpful if you could consider them along with any other IETF
> Last Call comments that you receive, and strive to resolve
> them through discussion or by updating the draft.
>
> Document: draft-rtg-dt-encap-02
> Reviewer: Jamal Hadi Salim
> Review Date: 6/30/15 (later than requested, sorry)
> Intended Status: Informational
> WG LC End Date: unknown
>
> Summary:
>
> The document has significant good work and recommendations for
> encapsulation design. Many years of experience in issues found
> with encapsulation deployments are discussed. There were times
> where I lost track of what the document was about because issues
> were being discussed without making recommendations on what is needed
> from an encapsulation perspective to deal with those issues; otoh,
> a good read is section 18, which mentions an issue and in the
> same breath suggests how a design should handle said issue.
Jamal,

Erik Nordmark | 6 Jul 18:01 2015

Fwd: RtgDir review: draft-rtg-dt-encap-02




-------- Forwarded Message --------
Subject: RtgDir review: draft-rtg-dt-encap-02
Resent-Date: Tue, 30 Jun 2015 05:21:17 -0700 (PDT)
Resent-From: hadi <at> mojatatu.com
Resent-To: draft-rtg-dt-encap <at> ietf.org
Date: Tue, 30 Jun 2015 08:20:46 -0400
From: Jamal Hadi Salim <hadi <at> mojatatu.com>
To: rtg-ads <at> tools.ietf.org <rtg-ads <at> tools.ietf.org>
CC: rtg-dir <at> ietf.org <rtg-dir <at> ietf.org>, draft-rtg-dt-encap <at> tools.ietf.org


I have been selected as the Routing Directorate reviewer for this draft.
The Routing Directorate seeks to review all routing or routing-related
drafts as they pass through IETF last call and IESG review, and sometimes
on special request. The purpose of the review is to provide assistance to
the Routing ADs. For more information about the Routing Directorate,
please see http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir

Although these comments are primarily for the use of the Routing ADs, it
would be helpful if you could consider them along with any other IETF
Last Call comments that you receive, and strive to resolve
them through discussion or by updating the draft.

Document: draft-rtg-dt-encap-02
Reviewer: Jamal Hadi Salim
Review Date: 6/30/15 (later than requested, sorry)
Intended Status: Informational
WG LC End Date: unknown

Summary:

The document has significant good work and recommendations for
encapsulation design. Many years of experience in issues found
with encapsulation deployments are discussed. There were times
where I lost track of what the document was about because issues
were being discussed without making recommendations on what is needed
from an encapsulation perspective to deal with those issues; otoh,
a good read is section 18, which mentions an issue and in the
same breath suggests how a design should handle said issue.

The document needs at least one more pass.

I have some minor concerns about this document that I believe are resolvable.
Annotated comments attached.

cheers,
jamal



> 
> 
> 
> RTGWG                                                   E. Nordmark (ed)
> Internet-Draft                                           Arista Networks
> Intended status: Informational                                   A. Tian
> Expires: November 22, 2015                                 Ericsson Inc.
>                                                                 J. Gross
>                                                                   VMware
>                                                                J. Hudson
>                                          Brocade Communications Systems,
>                                                                     Inc.
>                                                               L. Kreeger
>                                                      Cisco Systems, Inc.
>                                                                  P. Garg
>                                                                Microsoft
>                                                                P. Thaler
>                                                     Broadcom Corporation
>                                                               T. Herbert
>                                                                   Google
>                                                             May 21, 2015
> 
> 
>                       Encapsulation Considerations
>                          draft-rtg-dt-encap-02
> 

[...]

> 2.  Overview

[..]

> 
>    [I-D.wijnands-bier-architecture], and
>    [I-D.wijnands-mpls-bier-encapsulation].  We assume the reader has
>    some basic familiarity with those proposed encapsulations.  The
>    Related Work section points at some prior work that relates to the
>    encapsulation considerations in this document.
> 
>    Encapsulation protocols typically have some unique information that
>    they need to carry.  In some cases that information might be modified
>    along the path and in other cases it is constant.  The in-flight
>    modifications has impacts on what it means to provide security for
>    the encapsulation headers.
>    o  NVO3 carries a VNI Identifier edge to edge which is not modified.
>       There has been OAM discussions in the WG and it isn't clear
>       whether some of the OAM information might be modified in flight.
>    o  SFC carries service meta-data which might be modified or
>       unmodified as the packets follow the service path.  SFC talks of

Being a little picky, how about:
"SFC carries service meta-data which might be modified as the packets
follow the service path."

>       some loop avoidance mechanism which is likely to result in
>       modifications for for each hop in the service chain even if the
>       meta-data is unmodified.
>    o  BIER carries a bitmap of egress ports to which a packet should be
>       delivered, and as the packet is forwarded down different paths
>       different bits are cleared in that bitmap.
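
The bit-clearing behaviour described in the BIER bullet can be illustrated
with a small sketch. This is not taken from the draft or from the BIER
documents; the function name and the per-next-hop masks are hypothetical,
and the bitmap is modelled as a plain integer.

```python
from typing import Dict


def bier_replicate(bitmap: int, next_hop_masks: Dict[str, int]) -> Dict[str, int]:
    """Split a BIER-style egress bitmap across next hops.

    next_hop_masks maps a (hypothetical) next-hop name to the set of
    egress bits reachable through it. Each copy keeps only the bits
    served by its next hop; those bits are cleared from what remains,
    so no egress receives the packet twice.
    """
    copies = {}
    remaining = bitmap
    for next_hop, mask in next_hop_masks.items():
        out = remaining & mask
        if out:
            copies[next_hop] = out
            remaining &= ~mask
    return copies


copies = bier_replicate(0b1011, {"A": 0b0011, "B": 0b1100})
print({nh: bin(bits) for nh, bits in copies.items()})
```

Each downstream copy carries a strictly smaller bitmap than the packet that
arrived, which is why this field cannot be covered by an end-to-end
integrity check without extra care.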
> 
>    Even if information isn't modified in flight there might be devices
>    that wish to inspect that information.  For instance, one can
>    envision future NVO3 security devices which filter based on the
>    virtual network identifier.
> 
>    The need for extensibility is different across the protocols
>    o  NVO3 might need some extensions for OAM and security.
>    o  SFC is all about carrying service meta-data along a path, and
>       different services might need different types and amount of meta-
>       data.
>    o  BIER might need variable number of bits in their bitmaps, or other
>       future schemes to scale up to larger network.
>    The extensibility needs and constraints might be different when
>    considering hardware vs. software implementations of the
>    encapsulation headers.  NIC hardware might have different constraints
>    than switch hardware.
> 

[...]
[..]

> 
> 6.  Terminology
> 
>    The capitalized keyword MUST is used as defined in
>    http://en.wikipedia.org/wiki/Julmust
> 

Missing the context on what looks like a delicious high-calorie drink.
And should that be https? ;->

>    TBD: Refer to existing documents for at least NVO3 and SFC
>    terminology.  We use at least the VNI ID in this document.
> 
> 
> 7.  Entropy
> 
>    In many cases the encapsulation format needs to enable ECMP in
>    unmodified routers.  Those routers might use different fields in TCP/
>    UDP packets to do ECMP without a risk of reordering a flow.
> 
>    The common way to do ECMP-enabled encapsulation over IP today is to
>    add a UDP header and to use UDP with the UDP source port carrying
>    entropy from the inner/original packet headers as in LISP [RFC6830].
>    The total entropy consists of 14 bits in the UDP source port (using
>    the ephemeral port range) plus the outer IP addresses which seems to
>    be sufficient for entropy; using outer IPv6 headers would give the
>    option for more entropy should it be needed in the future.
> 
>    In some environments it might be fine to use all 16 bits of the port
>    range.  However, middleboxes might make assumptions about the system
>    ports or user ports.  But they should not make any assumptions about
>    the ports in the Dynamic and/or Private Port range, which have the
>    two MSBs set to 11b.
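
A minimal sketch of the 14-bit scheme described in the two quoted
paragraphs above: hash the inner flow, keep 14 bits, and force the two MSBs
to 11b so the result falls in the Dynamic/Private range (49152-65535). The
use of CRC32 as the hash and the function name are assumptions made for
illustration; the draft does not prescribe a hash function.

```python
import zlib


def entropy_source_port(inner_flow_key: bytes) -> int:
    """Derive a UDP source port carrying 14 bits of flow entropy.

    inner_flow_key is whatever bytes identify the inner flow (for
    example the inner 5-tuple, serialized by the caller). CRC32 is an
    assumed hash choice, used only because it is in the standard library.
    """
    flow_hash = zlib.crc32(inner_flow_key)
    # 0xC000 sets the two most significant bits to 11b; the low
    # 14 bits come from the hash.
    return 0xC000 | (flow_hash & 0x3FFF)


port = entropy_source_port(b"10.0.0.1|10.0.0.2|6|12345|80")
print(port, 49152 <= port <= 65535)
```

Because the port is a pure function of the inner flow key, all packets of
one inner flow hash to the same outer 5-tuple and stay on one ECMP path.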
> 
>    The UDP source port might change over the lifetime of an encapsulated
>    flow, for instance for DoS mitigation or re-balancing load across
>    ECMP.

Shouldn't the above statement bear a little more discussion/comment?
What happens to packet ordering then?

> 
>    There is some interaction between entropy and OAM and extensibility
>    mechanism.  It is desirable to be able to send OAM packets to follow
>    the same path as network packets.  Hence OAM packets should use the
>    same entropy mechanism as data packets.  While routers might use
>    information in addition the entropy field and outer IP header, they
>    can not use arbitrary parts of the encapsulation header since that
>    might result in OAM frames taking a different path.  Likewise if
>    routers look past the encapsulation header they need to be aware of
>    the extensibility mechanism(s) in the encapsulation format to be able
>    to find the inner headers in the presence of extensions; OAM frames
>    might use some extensions e.g. for timestamps.
> 

[..]

>    Note that in the proposed BIER encapsulation
>    [I-D.wijnands-mpls-bier-encapsulation], there is an an 8-bit field
>    which specifies an entropy value that can be used for load balancing
>    purposes.  This entropy is for the BIER forwarding decisions, which
>    is independent of any outer delivery ECMP between BIER routers.  Thus
>    it is not part of the delivery ECMP discussed in this section.
>       [Note: For any given bit in BIER (that identifies an exit from the
>       BIER domain) there might be multiple immediate next hops.  The
>       BIER entropy field is used to select that next hop as part of BIER
>       processing.  The BIER forwarding process may do equal cost load
>       balancing, but the load balancing procedure MUST choose the same
>       path for any two packets have the same entropy value.]

"... two packets that have the same ..."

> 
>    In summary:
>    o  The entropy is associated with the transport, that is an outer IP
>       header or MPLS.
>    o  In the case of IP transport use >=14 bits of UDP source port, plus
>       outer IPv6 flowid for entropy.
> 

Looks like a typo.  <=14 bits?

> 
> 8.  Next-protocol indication
> 

[..]

> 
>    Secondly, the encapsulation needs to indicate the type of its
>    payload, which is in scope for the design of the encapsulation.  We
>    have existing protocols which use Ethernet types (such as GRE).  Here
>    each encapsulation header can potentially makes its own choices
>    between:
>    o  Reuse Ethernet types - makes it easy to carry existing L2 and L3
>       protocols including IPv6, IPv6, and Ethernet.  Disadvantages are
>       that it is a 16 bit number and we probably need far less than 100
>       values, and the number space is controlled by the IEEE 802 RAC
>       with its own allocation policies.

If I understood correctly what "reuse" implies: you are suggesting a new
super-ethertype whose content space will carry an additional type
semantic so you never have to go back to IEEE?

>    o  Reuse IP protocol numbers - makes it easy to carry e.g., ESP in
>       addition to IP and Etnernet but brings in all existing protocol

Run a spell checker: "Ethernet" above.

>       numbers many of which would never be used directly on top of the
>       encapsulation protocol.  IANA managed eight bit values, presumably
>       more difficult to get an assigned number than to get a transport
>       port assignment.
>    o  Define their own next-protocol number space, which can use fewer
>       bits than an Ethernet type and give more flexibility, but at the
>       cost of administering that numbering space (presumably by the
>       IANA).
> 
>    Thirdly, if the IETF ends up defining multiple encapsulations at
>    about the same time, and there is some chance that multiple such
>    encapsulations can be combined in the same packet, there is a
>    question whether it makes sense to use a common approach and
>    numbering space for the encapsulation across the different protocols.
>    A common approach might not be beneficial as long as there is only
>    one way to indicate e.g., SFC inside NVO3.
> 
>    Many Internet protocols use fixed values (typically managed by the
>    IANA function) for their next-protocol field.  That facilitates
>    interpretation of packets by middleboxes and e.g., for debugging
>    purposes, but might make the protocol evolution inflexible.  Our
>    collective experience with MPLS shows an alternative where the label
>    can be viewed as an index to a table containing processing
>    instructions and the table content can be managed in different ways.

Would it not be useful to provide a reference here? Just reading this
raises questions for me: who populates this tag-indexed table of
instructions, and could interop be impacted?

>    Encapsulations might want to consider the tradeoffs between such more
>    flexible versus more fixed approaches.
> 
>    In summary:
>    o  Would it be useful for the IETF come up with a common scheme for
>       encapsulation protocols?  If not each encapsulation can define its
>       own scheme.
> 

In my view it would be hard to come up with a ring to rule them all.
There are cases where simple is good enough and asking someone to carry
a Christmas tree is the wrong answer. And, yes, there are cases where
(to quote Mencken) the answer is clear, simple and wrong (especially
in one-off use cases which are then refactored to fit into square pegs).
My suggestion is to not be too clever in answering the question above.

> 
> 9.  MTU and Fragmentation
> 
>    A common approach today is to assume that the underlay have
>    sufficient MTU to carry the encapsulated packets without any
>    fragmentation and reassembly at the tunnel endpoints.  That is
>    sufficient when the operator of the ingress and egress have full
>    control of the paths between those endpoints.  And it makes for
>    simpler (hardware) implementations if fragmentation and reassembly
>    can be avoided.
> 

[..]

>    Encapsulations could also define an optional tunnel fragmentation and
>    reassembly mechanism which would be useful in the case when the
>    operator doesn't have full control of the path, or when the protocol
>    gets deployed outside of its original intended context.  Such a
>    mechanism would be required if the underlay might have a path MTU
>    which makes it impossible to carry at least 1518 bytes (if offering
>    Ethernet service), or at least 1280 (if offering IPv6 service).  The
>    use of such a protocol mechanism could be triggered by receiving a
>    PTB.  But such a mechanism might not be implemented by all
>    encapsulators and decapsulators.  [Aerolink is one example of such a
>    protocol.]
> 

A reference to Aerolink and the sins committed would be useful.
I googled Aerolink and found references to some radio thing running over IP.
Given IP provides the fragmentation service above, why is Aerolink not capable
of this mechanism? I think there's a simple answer; just reading this didn't help.
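
The thresholds in the quoted section 9 text (1518 bytes for Ethernet
service, 1280 for IPv6 service) can be turned into a quick check. This is
only a sketch: the function name and the 36-byte overhead (outer IPv4 20 +
UDP 8 + an assumed 8-byte encapsulation header) are illustrative
assumptions, not values from the draft.

```python
ETHERNET_SERVICE_MIN = 1518  # bytes, from the quoted draft text
IPV6_SERVICE_MIN = 1280      # bytes, from the quoted draft text

# Assumed overhead: outer IPv4 header + UDP header + 8-byte encap header.
ASSUMED_OVERHEAD = 20 + 8 + 8


def needs_tunnel_fragmentation(path_mtu: int, overhead: int, service_min: int) -> bool:
    """Return True if the underlay path cannot carry service_min bytes
    of payload once the encapsulation overhead is added."""
    return path_mtu - overhead < service_min


print(needs_tunnel_fragmentation(1500, ASSUMED_OVERHEAD, ETHERNET_SERVICE_MIN))  # True
print(needs_tunnel_fragmentation(1500, ASSUMED_OVERHEAD, IPV6_SERVICE_MIN))      # False
```

With a 1500-byte underlay path MTU, only 1464 bytes remain for the payload
under these assumptions: enough for the IPv6 minimum, not enough for a full
Ethernet frame.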

>    Depending on the payload carried by the encapsulation there are some
>    additional possibilities:
> 

[..]

>    In summary:
>    o  In some deployments an encapsulation can assume well-managed MTU
>       hence no need for fragmentation and reassembly related to the
>       encapsulation.
>    o  Even so, it makes sense for ingress to track any ICMP packet too
>       big addressed to ingress to be able to log any MTU
>       misconfigurations.
>    o  Should an encapsulation protocol be depoyed outside of the

spell checker: deployed?

>       original context it might very well need support for fragmentation
>       and reassembly.
> 
> 
> 10.  OAM
> 
>    The OAM area is seeing active development in the IETF with
>    discussions (at least) in NVO3 and SFC working groups, plus the new
>    LIME WG looking at architecture and YANG models.
> 
>    The design team has take a narrow view of OAM to explore the
>    potential OAM implications on the encapsulation format.
> 
>    In terms of what we have heard from the various working groups there
>    seem to be needs to:
>    o  Be able to send out-of-band OAM messages - that potentially should
>       follow the same path through the network as some flow of data
>       packets.
>       *  Such OAM messages should not accidentally be decapsulated and
>          forwarded to the end stations.
>       *  Be able to add OAM information to data packets that are
>          encapsulated.  Discussions have been around

Add a colon so it reads "Discussions have been around:" and then more
indentation is needed for the next two bullets below to fit under the above bullet.

>       *  Using a bit in the OAM to synchronize sampling of counters
>          between the encapsulator and decapsulator.
>       *  Optional timestamps, sequence numbers, etc for more detailed
>          measurements between encapsulator and decapsulator.
>    o  Usable for both proactive monitoring (akin to BFD) and reactive
>       checks (akin to traceroute to pin-point a failure)
> 
>    To ensure that the OAM messages can follow the same path the OAM
>    messages need to get the same ECMP (and LAG hashing) results as a
>    given data flow.  An encapsulator can choose between one of:
> 
> 

[..]

> 
> 
>    o  Limit ECMP hashing to not look past the UDP header i.e. the
>       entropy needs to be in the source/destination IP and UDP ports
>    o  Make OAM packets look the same as data packets i.e. the initial
>       part of the OAM payload has the inner Ethernet, IP, TCP/UDP
>       headers as a payload.  (This approach was taken in TRILL out of
>       necessity since there is no UDP header.)  Any OAM bit in the
>       encapsulation header must in any case be excluded from the
>       entropy.
> 

Does it make sense to have inband OAM info, i.e. carried alongside the
data? (Sure, a request for a path trace doesn't fit, but inband
health info may fit.) In such a case OAM info could be carried in something
like a TLV.

>    There can be several ways to prevent OAM packets from accidentally
>    being forwarded to the end station using:
>    o  A bit in the frame (as in TRILL) indicating OAM
>    o  A next-protocol indication with a designated value for "none" or
>       "oam".
>    This assumes that the bit or next protocol, respectively, would not
>    affect entropy/ECMP in the underlay.  However, the next-protocol
>    field might be used to provide differentiated treatement of packets
>    based on their payload; for instance a TCP vs. IPsec ESP payload
>    might be handled differently.  Based on that observation it might be
>    undesirable to overload the next protocol with the OAM drop behavior,
>    resulting in a preference for having a bit to indicate that the
>    packet should be forwarded to the end station after decapsulation.

[..]

> 
> 11.  Security Considerations
> 
>    Different encapsulation use cases will have different requirements
>    around security.  For instance, when encapsulation is used to build
>    overlay networks for network virtualization, isolation between
>    virtual networks may be paramount.  BIER support of multicast may
>    entail different security requirements than encapsulation for
>    unicast.
> 
>    In real deployment, the security of the underlying network may be
>    considered for determining the level of security needed in the
>    encapsulation layer.  However for the purposes of this discussion, we
>    assume that network security is out of scope and that the underlying
>    network does not itself provide adequate or as least uniform security
>    mechanisms for encapsulation.

I found the above paragraph awkward to read.  How about simplifying:
"This document assumes that the underlying network does not itself 
provide adequate or at least uniform security mechanisms for encapsulation. 
The authors understand that the underlying network security could provide
useful input into the security needs of the encapsulation layer but ignore
it to provide a focus on the discussion."

> 
>    There are at least three considerations for security:
>    o  Anti-spoofing/virtual network isolation
>    o  Interaction with packet level security such as IPsec or DTLS

So would IPsec not be considered "underlying network security"?

>    o  Privacy (e.g., VNI ID confidentially for NVO3)
> 

Confidentiality is one - but what about integrity of the VNI?

>    This section uses a VNI ID in NVO3 as an example.  A SFC or BIER
>    encapsulation is likely to have fields with similar security and
>    privacy requirements.
> 
> 11.1.  Encapsulation-specific considerations
> 
>    Some of these considerations appear for a new encapsulation, and
>    others are more specific to network virtualization in datacenters.
>    o  New attack vectors:
>       *  DDOS on specific queued/paths by attempting to reproduce the
>          5-tuple hash for targeted connections.
>       *  Entropy in outer 5-tuple may be too little or predictable.
>       *  Leakage of identifying information in the encapsulation header
>          for an encrypted payload.
>       *  Vulnerabilities of using global values in fields like VNI ID.
>    o  Trusted versus untrusted tenants in network virtualization:
>       *  The criticality of virtual network isolation depends on whether
>          tenants are trusted or untrusted.  In the most extreme cases,
>          tenants might not only be untrusted but may be considered
>          hostile.

So would confidentiality then become a requirement to address this?
It would be more readable if each issue came with a suggestion on what
needs to be done.

>       *  For a trusted set of users (e.g. a private cloud) it may be
>          sufficient to have just a virtual network identifier to provide
>          isolation.  Packets inadvertently crossing virtual networks
>          should be dropped similar to a TCP packet with a corrupted port
>          being received on the wrong connection.
>       *  In the presence of untrusted users (e.g. a public cloud) the
>          virtual network identifier must be adequately protected against
>          corruption and verified for integrity.  This case may warrant
>          keyed integrity.

OK, I guess integrity does show up here; should it have been mentioned earlier?

>    o  Different forms of isolation:
>       *  Isolation could be blocking all traffic between tenants (or
>          except as allowed by some firewall)
>       *  Could also be about performance isolation i.e. one tenant can
>          overload the network in a way that affects other tenants
>       *  Physical isolation of traffic for different tenants in network
>          may be required, as well as required restrictions that tenants
>          may have on where their packets may be routed.
>    o  New attack vectors from untrusted tenants:
>       *  Third party VMs with untrusted tenants allows internally borne
>          attacks within data centers
>       *  Hostile VMs inside the system may exist (e.g. public cloud)
>       *  Internally launched DDOS
>       *  Passive snooping for mis-delivered packets
>       *  Mitigate damage and detection in event that a VM is able to
>          circumvent isolation mechanisms

[...]
[..]

> 11.4.  In summary:
> 
>    o  Encapsulations need extensibility mechanisms to be able to add
>       security features like cookies and secure hashes protecting the
>       encapsulation header.
>    o  NVO3 proably has specific higher requirements relating to
>       isolation for network virtualization, which is in scope for the
>       NVO3 WG/

Remove the "/"

>    o  Our collective IETF experience is that succesful protocols get
>       deployed outside of the original intended context, hence the
>       initial assumptions about the threat model might become invalid.
>       That needs to be considered in the standardization of new
>       encapsulations.

So what's the recommendation here? Over-engineer in case something is needed
later?

> 
> 
> 12.  QoS
> 
>    In the Internet architecture we support QoS using the Differentiated
>    Services Code Points (DSCP) in the formerly named Type-of-Service
>    field in the IPv4 header, and in the Traffic-Class field in the IPv6

It's been at least a decade since the change; do you really need to say
"formerly named ToS"?

>    header.  The ToS and TC fields also contain the two ECN bits.

Provide a cross-reference to section 13 for ECN?

> 
>    We have existing specifications how to process those bits.  See
>    [RFC2983] for diffserv handling, which specifies how the received
>    DSCP value is used to set the DSCP value in an outer IP header when
>    encapsulating.  (There are also existing specifications how DSCP can
>    be mapped to layer2 priorities.)
> 

[..]

> 
> 13.  Congestion Considerations
> 
>    Additional encapsulation headers does not introduce anything new for
>    Explicit Congestion Notification.  It is just like IP-in-IP and IPsec
>    tunnels which is specified in [RFC6040] in terms of how the ECN bits
>    in the inner and outer header are handled when encapsulating and
>    decapsulating packets.  Thus new encapsulations can more or less
>    include that by reference.
>    There are additional considerations around carrying non-congestion
>    controlled traffic.  These details have been worked out in
>    [I-D.ietf-mpls-in-udp].  As specified in [RFC5405]: "IP-based traffic
>    is generally assumed to be congestion-controlled, i.e., it is assumed
>    that the transport protocols generating IP-based traffic at the
>    sender already employ mechanisms that are sufficient to address
>    congestion on the path Consequently, a tunnel carrying IP-based

"." needed between "path" and "Consequently"

>    traffic should already interact appropriately with other traffic
>    sharing the path, and specific congestion control mechanisms for the
>    tunnel are not necessary".  Those considerations are being captured
>    in [I-D.ietf-tsvwg-rfc5405bis].
> 

[..]

> 
>    One could make the encapsulation header be extensible to that it can
>    carry sufficient information to be able to measure resource usage,
>    delays, and congestion.  The suggestions in the OAM section about a
>    single bit for counter synchronization, and optional timestamps
>    and/or sequence numbers, could be part of such an approach.  There
>    might also be additional congestion-control extensions to be carried
>    in the encapsulation.  Overall this results in a consideration to be
>    able to have sufficient extensibility in the encapsulation to be
>    handle to handle potential future developments in this space.
> 

get rid of "to be handle" so it reads:
"...extensibility in the encapsulation to handle ..."

>    Coarse measurements are likely to suffice, at least for circuit-
>    breaker-like purposes, see [I-D.wei-tsvwg-tunnel-congestion-feedback]
>    and [I-D.briscoe-conex-data-centre] for examples on active work in
>    this area via use of ECN.  [RFC6040] Appendix C is also relevant.
>    The outer ECN bits seem sufficient (at least when everything uses
>    ECN) to do this course measurements.  Needs some more study for the
>    case when there are also drops; might need to exchange counters
>    between ingress and egress to handle drops.
> 
>    Circuit breakers are not sufficient to make a network with different
>    congestion control when the goal is to provide a predictable service
>    to different tenants.  The fallback would be to rate limit different
>    traffic.
> 
>    In summary:
>    o  Leverage the existing approach in [RFC6040] for ECN handling.
>    o  If the encapsulation can carry non-IP, hence non-congestion
>       controlled traffic, then leverage the approach in
>       [I-D.ietf-mpls-in-udp].
>    o  "Watch this space" for circuit breakers.
> 

 Hopefully coming soon ;->

> 
> 14.  Header Protection
> 
>    Many UDP based encapsulations such as VXLAN [RFC7348] either
>    discourage or explicitly disallow the use of UDP checksums.  The
>    reason is that the UDP checksum covers the entire payload of the
>    packet and switching ASICs are typically optimized to look at only a
>    small set of headers as the packet passes through the switch.  In
>    these case, computing a checksum over the packet is very expensive.
>    (Software endpoints and the NICs used with them generally do not have
>    the same issue as they need to look at the entire packet anyways.)
> 

[..]

>    verify that checksum or, if incapable, drop the packet.  The
>    assumption is that configuration and/or control-plane capability
>    exchanges can be used when different receiver have different checksum
>    validation capabilities.
> 
>    In summary:
>    o  Encapsulations need extensibility to be able to add checksum/CRC
>       for the encapsulation header itself.
>    o  When the encapsulation has a checksum/CRC, include the IPv6
>       pseudo-header in it.
>    o  The checksum/CRC can potentially be avoided when cryptographic
>       protection is applied to to the encapsulation.
> 

get rid of one of the "to"
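
The second summary bullet in section 14 (include the IPv6 pseudo-header in
the header checksum) could look roughly like the sketch below. Several
things here are assumptions: the draft says "checksum/CRC" without picking
an algorithm, so the 16-bit Internet checksum (RFC 1071) is used as a
stand-in; the pseudo-header layout follows RFC 2460 section 8.1; the
function names are hypothetical; and the length field covers only the
encapsulation header, not the payload.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encap_header_checksum(src: str, dst: str, next_header: int,
                          encap_header: bytes) -> int:
    """Checksum over an IPv6 pseudo-header plus the encapsulation header
    only, so a switch does not have to touch the payload."""
    pseudo = (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        # 32-bit length, three zero bytes, 8-bit next-header value
        + struct.pack("!I3xB", len(encap_header), next_header)
    )
    return internet_checksum(pseudo + encap_header)


header = bytes(8)  # assumed 8-byte encapsulation header, checksum field zeroed
print(hex(encap_header_checksum("2001:db8::1", "2001:db8::2", 17, header)))
```

Covering the outer addresses means a packet misdelivered because of a
corrupted destination address fails verification even though the
encapsulation header itself is intact.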

> 
> 15.  Extensibility Considerations
> 
>    Protocol extensibility is the concept that a networking protocol may
>    be extended to include new use cases or functionality that were not
>    part of the original protocol specification.  Extensibility may be
>    used to add security, control, management, or performance features to
>    a protocol.  A solution may allow private extensions for
>    customization or experimentation.
> 

[..]

> 
>    In some cases it might be more appropriate to define a new inner
>    protocol which can carry the new functionality instead of extending
>    the outer protocol.  Examples where this works well is in the IP/
>    transport split, where the earlier architecture had a single NCP

Is a ref for NCP needed?

>    protocol which carried both the hop-by-hop semantics which are now in
>    IP, and the end-to-end semantics which are now in TCP.  Such a split
>    is effective when different nodes need to act upon the different
>    information.  Applying this for general protocol extensibility
>    through nesting is not well understood, and does result in longer
>    header chains.  Furthermore, our experience with IPv6 extension
>    headers [RFC2460] in middleboxes indicates that the approach does not

"...indicates that the header chaining approach does not"

Is this bad experience documented somewhere? A reference or some clarification
would help.

>    help with middlebox traversal.

> 
>    Many protocol definitions include some number of reserved fields or
>    bits which can be used for future extension.  VXLAN is an example of
>    a protocol that includes reserved bits which are subsequently being
> 
> 
[..]

> 
>    Extending a protocol header with new fields can be done in several
>    ways.
>    o  TLVs are a very popular method used in such protocols as IP and
>       TCP.  Depending on the type field size and structure, TLVs can
>       offer a virtually unlimited range of extensions.  A disadvantage
>       of TLVs is that processing them can be verbose, quite complicated,
>       several validations must often be done for each TLV, and there is

I think if you make such strong comments you need to quantify them.
A TLV is a formal structure with well defined characteristics. You could
write efficient code to parse, identify and validate TLVs. How is it
verbose to process etc?

>       no deterministic ordering for a list of TLVs.  TCP serves as an

Deterministic ordering would matter if there are dependencies between the
TLVs. If that is a significant need, then the document needs to provide
examples or an explanation of why it is important.

>       example of a protocol where TLVs have been successfully used (i.e.
>       required for protocol operation).  IP is an example of a protocol
>       that allows TLVs, but they are rarely used in practice (router fast
>       paths usually assume no IP options).  Note that TCP TLVs are
>       implemented in software as well as (NIC) hardware handling various
>       forms of TCP offload.
>    o  Extension headers are closely related to TLVs.  These also carry
>       type/value information, but instead of being a list of TLVs within
>       a single protocol header, each one is in its own protocol header.

The main difference seems to be that in a list of extension headers, the
current header describes the next one, whereas in TLVs there is no such
relationship; otherwise the T in TLV is an extension header. One imposes
ordering, the other doesn't really.

>       IPv6 extension headers and SFC NSH are examples of this technique.
>       Similar to TLVs these offer a wide range of extensibility, but
>       have similarly complex processing.  Another difference with TLVs
>       is that each extension header is idempotent.  This is beneficial
>       in cases where a protocol implements a push/pop model for header
>       elements like service chaining, but makes it more difficult to group
>       correlated information within one protocol header.
> 

[..]

>    o  Flag-fields are a non-TLV like method of extending a protocol
>       header.  The basic idea is that the header contains a set of
>       flags, where each set flag corresponds to an optional field that is
>       present in the header.  GRE is an example of a protocol that
>       employs this mechanism.  The fields are present in the header in
>       the order of the flags, and the length of each field is fixed.
>       Flag-fields are simpler to process compared to TLVs, having fewer
>       validations and the order of the optional fields is deterministic.
>       A disadvantage is that range of possible extensions with flag-
>       fields is smaller than TLVs.

Qualify with "much smaller" maybe?
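
To make the "simpler to process" claim concrete, here is a rough sketch of
flag-field parsing. It assumes a GRE-like layout (C, K and S flag bits, each
optional field 4 bytes, fields present in flag order); the function name and
the returned dictionary are my own illustration.

```python
import struct

# (flag mask, field name, field length in bytes), in on-the-wire order.
# Assumed GRE-like layout: C = checksum + reserved, K = key, S = sequence.
OPTIONAL_FIELDS = [
    (0x8000, "checksum", 4),
    (0x2000, "key", 4),
    (0x1000, "sequence", 4),
]

def parse_flag_fields(header: bytes) -> dict:
    """Walk the flags once; each set flag implies one fixed-size field."""
    flags, protocol = struct.unpack_from("!HH", header, 0)
    fields = {"protocol": protocol}
    offset = 4                                # optional fields follow the base header
    for mask, name, length in OPTIONAL_FIELDS:
        if flags & mask:
            if offset + length > len(header):
                raise ValueError(f"truncated header: missing {name}")
            fields[name] = header[offset:offset + length]
            offset += length
    fields["header_len"] = offset
    return fields
```

The offset of every optional field is a function of the flags word alone,
which is why the ordering is deterministic; the price is that the set of
possible extensions is capped by the number of flag bits.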

> 
>    The requirements for receiving unknown or unimplemented extensible
>    elements in an encapsulation protocol (flags, TLVs, optional fields)
>    need to be specified.  There are two parties to consider, middle
>    boxes and terminal endpoints of encapsulation (at the decapsulator).
> 

[..]

>    For handling unknown options at terminal nodes, there are two
>    possibilities: drop packet or accept while ignoring the unknown
>    options.  Many Internet protocols specify that reserved flags must be
>    set to zero on transmission and ignored on reception.  L2TP is an
>    example of a data protocol that has such flags.  GRE is a notable
>    exception to this rule, reserved flag bits 1-5 cannot be ignored
>    [RFC2890].  For TCP and IPv4, implementations must ignore optional
>    TLVs with unknown type; however in IPv6 if a packet contains an
>    unknown extension header (unrecognized next header type) the packet
>    must be dropped with an ICMP error message returned.  The IPv6
>    options themselves (encoded inside the destinations options or hop-
>    by-hop options extension header) have more flexibility.  There bits

sub/There/The

>    in the option code are used to instruct the receiver whether to
>    ignore, silently drop, or drop and send error if the option is
>    unknown.  Some protocols define a "mandatory bit" that can be set
>    with TLVs to indicate that an option must not be ignored.
>    Conceptually, optional data elements can only be ignored if they are
>    idempotent and do not alter how the rest of the packet is parsed or
>    processed.
> 
>    Depending on what type of protocol evolution one can predict, it
>    might make sense to have an way for a sender to express that the

"... have a way..."

> 
> 
> 
[...]

> 
> 16.  Layering Considerations
> 

[...]

>    The layering also has some implications for middleboxes.
>    o  A device on the path between the ingress and egress is allowed to
>       transparently inspect all layers of the protocol stack and drop or
>       forward, but not transparently modify anything but the layer in
>       which they operate.  What this means is that an IP router is
>       allowed to modify the outer IP TTL and ECN bits, but not the
>       encapsulation header or inner headers and payload.  And a BIER
>       router is allowed to modify the BIER header.
>    o  Alternatively such a device can become visible at a higher layer.
>       E.g., a middlebox could become an decapsulate + function +
>       encapsulate which means it will generate a new encapsulation
>       header.

"a middlebox could first decapsulate, perform some function then encapsulate;
which means it will generate a new encapsulation header."

> 
>    The design team asked itself some additional questions:
>    o  Would it make sense to have a common encapsulation base header
>       (for OAM, security?, etc) and then followed by the specific
>       information for NVO3, SFC, BIER?  Given that there are separate
>       proposals and the set of information needing to be carried
>       differs, and the extensibility needs might be different, it would
>       be difficult and not that useful to have a common base header.
>    o  With a base header in place, one could view the different
>       functions (NVO3, SFC, and BIER) as different extensions to that
>       base header resulting in encodings which are more space optimal by
>       not repeating the same base header.  The base header would only be
>       repeated when there is an additional IP (and hence UDP) header.
>       That could mean a single length field (to skip to get to the
>       payload after all the encapsulation headers).  That might be
>       technically feasible, but it would create a lot of dependencies
>       between different WGs making it harder to make progress.  Compare
>       with the potential savings in packet size.
> 

Agreed.

> 
> 17.  Service model
> 
>    The IP service is lossy and subject to reordering.  In order to avoid
>    a performance impact on transports like TCP the handling of packets
>    is designed to avoid reordering packets that are in the same
>    transport flow (which is typically identified by the 5-tuple).  But
>    across such flows the receiver can see different ordering for a given
>    sender.  That is the case for a unicast vs. a multicast flow from the
>    same sender.
> 
>    There is a general tussle between the desire for high capacity
>    utilization across a multipath network and the import on packet

where you say "import" did you mean:
"importance" or "impact"?

>    ordering within the same flow (which results in lower transport
>    protocol performance).  That isn't affected by the introduction of an
>    encapsulation.  However, the encapsulation comes with some entropy,
>    and there might be cases where folks want to change that in response
>    to overload or failures.  For instance, might want to change UDP

"For instance, one might want ..."

>    source port to try different ECMP route.  Such changes can result in
>    packet reordering within a flow, hence would need to be done
>    infrequently and with care e.g., by identifying packet trains.
> 

Is there a reference to work which says quiet periods (which is what I am
implicitly reading into the text above) can be used to change the hash
selection? I would think that one needs to closely observe packet trends to
make such a decision. So please provide a reference to some scholarly or
engineering work.

>    There might be some applications/services which are not able to
>    handle reordering across flows.  The IETF has defined pseudo-wires
>    [RFC3985] which provides the ability to ensure ordering (implemented
>    using sequence numbers and/or timestamps).
> 

What are you recommending? To use techniques defined in RFC3985?

>    Architecturally such services would make sense, but as a separate layer
>    on top of an encapsulation protocol.  They could be deployed between
>    ingress and egress of a tunnel which uses some encaps.  Potentially
>    the tunnel control points at the ingress and egress could become a
>    platform for fixing suboptimal behavior elsewhere in the network.
>    That would clearly be undesirable in the general case.  However,
>    handling encapsulation of non-IP traffic hence non-congestion-
>    controlled traffic is likely to be required, which implies some
>    fairness and/or QoS policing on the ingress and egress devices.
> 
>    But the tunnels could potentially do more like increase reliability
>    (retransmissions, FEC) or load spreading using e.g.  MP-TCP between
>    ingress and egress.
> 
> 
> 18.  Hardware Friendly
> 
>    Hosts, switches and routers often leverage capabilities in the
>    hardware to accelerate packet encapsulation, decapsulation and
>    forwarding.
> 
>    Some design considerations in encapsulation that leverage these
>    hardware capabilities may result in more efficient packet
>    processing and higher overall protocol throughput.
> 
>    While "hardware friendliness" can be viewed as unnecessary
>    considerations for a design, part of the motivation for considering
>    this is ease of deployment; being able to leverage existing NIC and
>    switch chips for at least a useful subset of the functionality that
>    the new encapsulation provides.  The other part is the ease of
>    implementing new NICs and switch/router chips that support the
>    encapsulation at ever increasing line rates.
> 
>    [disclaimer] There are many different types of hardware in any given
>    network, each maybe better at some tasks while worse at others.  We
>    would still recommend protocol designers to examine the specific
>    hardware that are likely to be used in their networks and make
>    decisions on a case by case basis.
> 
>    Some considerations are:
>    o  Keep the encap header small.  Switches and routers usually only
>       read the first small number of bytes into the fast memory for
>       quick processing and easy manipulation.  The bulk of the packets
> 
> 
> 
> Nordmark (ed), et al.   Expires November 22, 2015              [Page 27]
> 
> Internet-Draft        Encapsulation Considerations              May 2015
> 
> 
>       are usually stored in slow memory.  A big encap header may not fit
>       and additional read from the slow memory will hurt the overall
>       performance and throughput.
>    o  Put important information at the beginning of the encapsulation
>       header.  The reasoning is similar as explained in the previous
>       point.  If important information are located at the beginning of
>       the encapsulation header, the packet may be processed with smaller
>       number of bytes to be read into the fast memory and improve
>       performance.
>    o  Avoid full packet checksums in the encapsulation if possible.
>       Encapsulations should instead consider adding their own checksum
>       which covers the encapsulation header and any IPv6 pseudo-header.
>       The motivation is that most of the switch/router hardware make
>       switching/forwarding decisions by reading and examining only the
>       first certain number of bytes in the packet.  Most of the body of
>       the packet does not need to be processed normally.  If we are
>       concerned about packets being misdelivered due to memory
>       errors, consider performing only header checksums.  Note that NIC
>       chips can typically already do full packet checksums for TCP/UDP,
>       while adding a header checksum might require adding some hardware
>       support.
>    o  Place important information at fixed offset in the encapsulation
>       header.  Packet processing hardware may be capable of parallel
>       processing.  If important information can be found at fixed
>       offset, different part of the encapsulation header may be
>       processed by different hardware units in parallel (for example
>       multiple table lookups may be launched in parallel).  It is easier
>       for hardware to handle optional information when the information,
>       if present, can be found in ideally one place, but in general, in
>       as few places as possible.  That facilitates parallel processing.
>       TLV encoding with unconstrained order typically does not have that
>       property.
>    o  Limit the number of header combinations.  In many cases the
>       hardware can explore different combinations of headers in
>       parallel, however there is some added cost for this.
> 

I think this section is well done.

Regarding TLVs, I now understand a little better where the earlier
comments come from (IMO you will need a pointer from the earlier text to
this section).

Having said that, let's weigh the pros and cons:
pro:
TLVs are very flexible; they almost future-proof a protocol in terms of extensibility.
cons:
Harder to parallelize in hardware.

I think the pro side should be driving things.
I would say to the hardware folks - get busy now!

I am still unsure why this is hard to do in hardware given all the benefits.
At the risk of getting tomatoes thrown at me:
it sounds like there's an extra parsing step in hardware processing to find
each individual TLV's "fixed offset", and after that you can parallelize.
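
Roughly what I have in mind, as a software sketch only: one sequential pass
that records where each TLV value lives, after which the lookups are
independent of each other. The 1-byte type / 1-byte length encoding and the
one-TLV-per-type rule are assumptions I made up for the illustration.

```python
def index_tlvs(options: bytes) -> dict:
    """Single pass over a TLV list, returning {type: (value_offset, value_len)}.

    Assumed encoding (hypothetical): 1-byte type, 1-byte length counting the
    value only, at most one TLV per type.
    """
    index = {}
    offset = 0
    while offset < len(options):
        if offset + 2 > len(options):
            raise ValueError("truncated TLV header")
        tlv_type, length = options[offset], options[offset + 1]
        start = offset + 2
        if start + length > len(options):
            raise ValueError("TLV value runs past end of options")
        if tlv_type in index:
            raise ValueError(f"duplicate TLV type {tlv_type}")
        index[tlv_type] = (start, length)
        offset = start + length               # next TLV depends on this length
    return index
```

The loop is the serial part: the position of TLV n is unknown until the
lengths of TLVs 0..n-1 have been read. Whether that serial step is cheap
enough at line rate is exactly the question for the hardware folks.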

> 18.1.  Considerations for NIC offload
> 
>    This section provides guidelines to provide support of common
>    offloads for encapsulation in Network Interface Cards (NICs).
>    Offload mechanisms are techniques that are implemented separately
>    from the normal protocol implementation of a host networking stack
>    and are intended to optimize or speed up protocol processing.
>    Hardware offload is performed within a NIC device on behalf of a
>    host.
> 
>    There are three basic offload techniques of interest:
>    o  Receive multi queue
>    o  Checksum offload
>    o  Segmentation offload
> 
> 18.1.1.  Receive multi-queue
> 
>    Contemporary NICs support multiple receive descriptor queues (multi-
>    queue).  Multi-queue enables load balancing of network processing for
>    a NIC across multiple CPUs.  On packet reception, a NIC must select
>    the appropriate queue for host processing.  Receive Side Scaling
>    (RSS) is a common method which uses the flow hash for a packet to
>    index an indirection table where each entry stores a queue number.
> 
>    UDP encapsulation, where the source port is used for entropy, should
>    be compatible with multi-queue NICs that support five-tuple hash
>    calculation for UDP/IP packets as input to RSS.  The source port
>    ensures classification of the encapsulated flow even in the case that
>    the outer source and destination addresses are the same for all flows
>    (e.g. all flows are going over a single tunnel).
> 

And the recommendation is to do what?
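
For reference, this is how I read the mechanism, as a small sketch. The
table size, queue count and the use of CRC32 are my own stand-ins (real NICs
typically use a Toeplitz hash for RSS); only the "hash, then index an
indirection table of queue numbers" structure is from the text above.

```python
import ipaddress
import struct
import zlib

NUM_QUEUES = 4
# Indirection table: each entry stores a receive queue number.
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(128)]

def flow_hash(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
              proto: int = 17) -> int:
    """Hash of the outer five-tuple (CRC32 used here purely as a stand-in)."""
    key = (ipaddress.ip_address(src_ip).packed
           + ipaddress.ip_address(dst_ip).packed
           + struct.pack("!HHB", src_port, dst_port, proto))
    return zlib.crc32(key)

def select_queue(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                 proto: int = 17) -> int:
    """Use the flow hash to index the indirection table."""
    h = flow_hash(src_ip, dst_ip, src_port, dst_port, proto)
    return INDIRECTION_TABLE[h % len(INDIRECTION_TABLE)]
```

With a single tunnel the addresses and destination port are constant, so the
UDP source port is the only input that varies between inner flows.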

> 18.1.2.  Checksum offload
> 
>    Many NICs provide capabilities to calculate standard ones complement
>    payload checksum for packets in transmit or receive.  When using
>    encapsulation over UDP there are at least two checksums that may be
>    of interest: the encapsulated packet's transport checksum, and the
>    UDP checksum in the outer header.
> 
> 18.1.2.1.  Transmit checksum offload

[...]

> 18.1.3.  Segmentation offload
> 
>    Segmentation offload refers to techniques that attempt to reduce CPU
>    utilization on hosts by having the transport layers of the stack
>    operate on large packets.  In transmit segmentation offload, a
>    transport layer creates large packets greater than MTU size (Maximum
>    Transmission Unit).  It is only at a much lower point in the stack, or
>    possibly the NIC, that these large packets are broken up into MTU
>    sized packets for transmission on the wire.  Similarly, in receive
>    segmentation offload, small packets are coalesced into large, greater
>    than MTU size packets at a point low in the stack receive path or
>    possibly in a device.  The effect of segmentation offload is that the
>    number of packets that need to be processed in various layers of the
>    stack is reduced, and hence CPU utilization is reduced.
> 

What is the recommendation for the protocol design?

> 18.1.3.1.  Transmit Segmentation Offload
> 
>    Transmit Segmentation Offload (TSO) is a NIC feature where a host
>    provides a large (larger than MTU size) TCP packet to the NIC, which
>    in turn splits the packet into separate segments and transmits each
>    one.  This is useful to reduce CPU load on the host.
> 
>    The process of TSO can be generalized as:
>    o  Split the TCP payload into segments which allow packets with size
>       less than or equal to MTU.
>    o  For each created segment:
>       1.  Replicate the TCP header and all preceding headers of the
>           original packet.
>       2.  Set payload length fields in any headers to reflect the length
>           of the segment.
>       3.  Set TCP sequence number to correctly reflect the offset of the
>           TCP data in the stream.
>       4.  Recompute and set any checksums that either cover the payload
>           of the packet or cover header which was changed by setting a
>           payload length.
> 
>    Following this general process, TSO can be extended to support TCP
>    encapsulation in UDP.  For each segment the Ethernet, outer IP, UDP
>    header, encapsulation header, inner IP header if tunneling, and TCP
>    headers are replicated.  Any packet length header fields need to be
>    set properly (including the length in the outer UDP header), and
>    checksums need to be set correctly (including the outer UDP checksum
>    if being used).
> 
>    To facilitate TSO with encapsulation it is recommended that optional
>    fields should not contain values that must be updated on a per
>    segment basis-- for example an encapsulation header should not
>    include checksums, lengths, or sequence numbers that refer to the
>    payload.  If the encapsulation header does not contain such fields
>    then the TSO engine only needs to copy the bits in the encapsulation
>    header when creating each segment and does not need to parse the
>    encapsulation header.

Thanks - that was crystal clear.
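
A small sketch of the generalized process, mostly to check my own reading.
Headers are treated as an opaque byte string that is copied verbatim, and
the per-segment values (payload length, TCP sequence number) are returned
alongside rather than patched into the bytes; checksum recomputation is left
out. The Segment type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Segment:
    headers: bytes      # all headers up to and including TCP, replicated as-is
    payload_len: int    # value the length fields must be set to
    tcp_seq: int        # sequence number of the first payload byte
    payload: bytes

def tso_segment(headers: bytes, tcp_seq: int, payload: bytes,
                mss: int) -> List[Segment]:
    """Split one large TCP payload into segments of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        segments.append(Segment(
            headers=headers,                       # step 1: replicate headers
            payload_len=len(chunk),                # step 2: per-segment length
            tcp_seq=(tcp_seq + offset) % 2**32,    # step 3: sequence number wraps
            payload=chunk,
        ))
    return segments
```

If the encapsulation header carries nothing that depends on the segment, it
rides along inside `headers` untouched, which is the recommendation above.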

> 
> 18.1.3.2.  Large Receive Offload
> 
>    Large Receive Offload (LRO) is a NIC feature where packets of a TCP
>    connection are reassembled, or coalesced, in the NIC and delivered to
>    the host as one large packet.  This feature can reduce CPU
>    utilization in the host.
> 
>    LRO requires significant protocol awareness to be implemented
>    correctly and is difficult to generalize.  Packets in the same flow
>    need to be unambiguously identified.  In the presence of tunnels or
>    network virtualization, this may require more than a five-tuple match
>    (for instance packets for flows in two different virtual networks may
>    have identical five-tuples).  Additionally, a NIC needs to perform
>    validation over packets that are being coalesced, and needs to
>    fabricate a single meaningful header from all the coalesced packets.
> 
>    The conservative approach to supporting LRO for encapsulation would
>    be to assign packets to the same flow only if they have identical
>    five-tuple and were encapsulated the same way.  That is the outer IP
>    addresses, the outer UDP ports, encapsulated protocol, encapsulation
>    headers, and inner five tuple are all identical.

Another excellent section.
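
The conservative rule could be written down as a flow key along these lines.
The type and field names are mine, chosen only for illustration; the list of
fields is the one given in the paragraph above.

```python
from typing import NamedTuple

class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int

class LroFlowKey(NamedTuple):
    outer_src_ip: str
    outer_dst_ip: str
    outer_udp_sport: int
    outer_udp_dport: int
    encap_protocol: int
    encap_header: bytes     # compared byte for byte, e.g. includes any VN identifier
    inner: FiveTuple

def same_lro_flow(a: LroFlowKey, b: LroFlowKey) -> bool:
    """Packets may be coalesced only if every component is identical."""
    return a == b
```

Two virtual networks that reuse the same inner addresses and ports then
differ in `encap_header` and are never coalesced together.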

> 
> 18.1.3.3.  In summary:
> 
>    In summary, for NIC offload:
>    o  The considerations for using full UDP checksums are different for
>       NIC offload than for implementations in forwarding devices like
>       routers and switches.
>    o  Be judicious about encapsulations that change fields on a per-
>       packet basis, since such behavior might make it hard to use TSO.
> 
> 

_______________________________________________
rtgwg mailing list
rtgwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/rtgwg
internet-drafts | 3 Jul 19:43 2015

I-D Action: draft-ietf-rtgwg-mrt-frr-architecture-06.txt


A New Internet-Draft is available from the on-line Internet-Drafts directories.
 This draft is a work item of the Routing Area Working Group Working Group of the IETF.

        Title           : An Architecture for IP/LDP Fast-Reroute Using Maximally Redundant Trees
        Authors         : Alia Atlas
                          Robert Kebler
                          Chris Bowers
                          Gabor Sandor Enyedi
                          Andras Csaszar
                          Jeff Tantsura
                          Russ White
	Filename        : draft-ietf-rtgwg-mrt-frr-architecture-06.txt
	Pages           : 41
	Date            : 2015-07-03

Abstract:
   With increasing deployment of Loop-Free Alternates (LFA) [RFC5286],
   it is clear that a complete solution for IP and LDP Fast-Reroute is
   required.  This specification provides that solution.  IP/LDP Fast-
   Reroute with Maximally Redundant Trees (MRT-FRR) is a technology that
   gives link-protection and node-protection with 100% coverage in any
   network topology that is still connected after the failure.

   MRT removes all need to engineer for coverage.  MRT is also extremely
   computationally efficient.  For any router in the network, the MRT
   computation is less than the LFA computation for a node with three or
   more neighbors.

The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-mrt-frr-architecture/

There's also a htmlized version available at:
https://tools.ietf.org/html/draft-ietf-rtgwg-mrt-frr-architecture-06

A diff from the previous version is available at:
https://www.ietf.org/rfcdiff?url2=draft-ietf-rtgwg-mrt-frr-architecture-06

Please note that it may take a couple of minutes from the time of submission
until the htmlized version and diff are available at tools.ietf.org.

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/
Shraddha Hegde | 1 Jul 05:42 2015

RE: New Version Notification for draft-hegde-rtgwg-virtual-multi-instance-00.txt

All,

A new internet draft is posted which provides a mechanism to contain flooding information 
for link state protocols  for specific topologies.

Kindly review and provide your suggestions.

Rgds
Shraddha

-----Original Message-----
From: internet-drafts <at> ietf.org [mailto:internet-drafts <at> ietf.org] 
Sent: Wednesday, July 01, 2015 9:03 AM
To: Shraddha Hegde; Ross Callon; Alia Atlas; Shraddha Hegde; Alia Atlas; Mannan Venkatesan; Ross Callon;
Mannan Venkatesan
Subject: New Version Notification for draft-hegde-rtgwg-virtual-multi-instance-00.txt

A new version of I-D, draft-hegde-rtgwg-virtual-multi-instance-00.txt
has been successfully submitted by Shraddha Hegde and posted to the IETF repository.

Name:		draft-hegde-rtgwg-virtual-multi-instance
Revision:	00
Title:		Virtual multi-instancing for link state protocols
Document date:	2015-06-30
Group:		Individual Submission
Pages:		15
URL:            https://www.ietf.org/internet-drafts/draft-hegde-rtgwg-virtual-multi-instance-00.txt
Status:         https://datatracker.ietf.org/doc/draft-hegde-rtgwg-virtual-multi-instance/
Htmlized:       https://tools.ietf.org/html/draft-hegde-rtgwg-virtual-multi-instance-00

Abstract:
   In networks with routers of different capabilities, some routers may
   not be able to participate fully in supporting new features or
   handling large databases and the associated flooding.  In some cases,
   these restrictions can cause severe scalability issues for the
   network in general.  This document proposes virtual multi-instances,
   a generic mechanism for OSPF and IS-IS, that allows groups of routers
   in specific topologies to have a reduced database and reduces the
   topology changes that are seen.  The virtual multi-instances are
   specified so that no software or protocol changes are required in the
   restricted routers.  Due to the potential number of virtual multi-
   instances in a network, the configuration is limited and is not
   specified per virtual instance.

Please note that it may take a couple of minutes from the time of submission until the htmlized version and
diff are available at tools.ietf.org.

The IETF Secretariat
Chris Bowers | 30 Jun 22:42 2015

changes to pseudocode and explanation of Select_Alternates_Internal( ) in draft-ietf-rtgwg-mrt-frr-algorithm

All,
 
I found the existing text and pseudo-code for Select_Alternates_Internal to be difficult to understand.  So I rewrote them in a more systematic way that hopefully makes it easier to reason about.  Anil also had some questions a few weeks ago on this part of the code, which I hope this helps to clarify. 
 
The diff of the changes can be found at:
 
 
The new text is also copied below.
 
Please tell me if there are any errors or things that can be improved.
 
Chris
 
-------------
 
  For each primary next-hop node F to each destination D, S can call
   Select_Alternates(S, D, F, primary_intf) to determine whether to use
   the MRT-Blue or MRT-Red next-hops as the alternate next-hop(s) for
   that primary next hop.  The algorithm is given in Figure 24 and
   discussed afterwards.
 
  Select_Alternates_Internal(D, F, primary_intf,
                                 D_lower, D_higher, D_topo_order):
      if D_higher and D_lower
          if F.HIGHER and F.LOWER
              if F.topo_order < D_topo_order
                  return USE_RED
              else
                  return USE_BLUE
          if F.HIGHER
              return USE_RED
          if F.LOWER
              return USE_BLUE
      else if D_higher
          if F.HIGHER and F.LOWER
              return USE_BLUE
          if F.LOWER
              return USE_BLUE
          if F.HIGHER
              if (F.topo_order > D_topo_order)
                  return USE_BLUE
              if (F.topo_order < D_topo_order)
                  return USE_RED
      else if D_lower
          if F.HIGHER and F.LOWER
              return USE_RED
          if F.HIGHER
              return USE_RED
          if F.LOWER
              if F.topo_order > D_topo_order
                  return USE_BLUE
              if F.topo_order < D_topo_order
                  return USE_RED
      else  //D is unordered wrt S
          if F.HIGHER and F.LOWER
              if primary_intf.OUTGOING and primary_intf.INCOMING
                  // this case should not occur
              if primary_intf.OUTGOING
                  return USE_BLUE
              if primary_intf.INCOMING
                  return USE_RED
          if F.LOWER
              return USE_RED
          if F.HIGHER
              return USE_BLUE
 
  Select_Alternates(D, F, primary_intf)
      if (D is F) or (D.order_proxy is F)
          return PRIM_NH_IS_D_OR_OP_FOR_D
      D_lower = D.order_proxy.LOWER
      D_higher = D.order_proxy.HIGHER
      D_topo_order = D.order_proxy.topo_order
      return Select_Alternates_Internal(D, F, primary_intf,
                                        D_lower, D_higher, D_topo_order)
 
                                 Figure 24
 
   It is useful to first handle the case where F is also D, or F
   is the order proxy for D.  In this case, only link protection is
   possible.  The MRT that doesn't use the failed primary next-hop is
   used.  If both MRTs use the primary next-hop, then the primary next-
   hop must be a cut-link, so either MRT could be used but the set of
   MRT next-hops must be pruned to avoid the failed primary next-hop
   interface.  To indicate this case, Select_Alternates returns
   PRIM_NH_IS_D_OR_OP_FOR_D.  Explicit pseudocode to handle the three
   sub-cases above is not provided.
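
   For anyone who wants to exercise the cases, here is an unofficial
   Python transcription of Figure 24. It is a sketch, not part of the
   draft: the Node and Intf classes are hypothetical stand-ins for the
   draft's data structures, the three assignments in Select_Alternates
   are taken to run when the early return is not taken, and raising an
   error in the "should not occur" case is my own choice.

```python
from dataclasses import dataclass
from typing import Optional

USE_RED, USE_BLUE = "USE_RED", "USE_BLUE"
PRIM_NH_IS_D_OR_OP_FOR_D = "PRIM_NH_IS_D_OR_OP_FOR_D"

@dataclass(eq=False)            # compare nodes by identity, as "D is F" implies
class Node:
    topo_order: int
    HIGHER: bool = False        # node is ordered higher than S
    LOWER: bool = False         # node is ordered lower than S
    order_proxy: Optional["Node"] = None   # None means the node is its own proxy

@dataclass
class Intf:
    OUTGOING: bool = False
    INCOMING: bool = False

def select_alternates_internal(D, F, primary_intf, D_lower, D_higher, D_topo_order):
    if D_higher and D_lower:
        if F.HIGHER and F.LOWER:
            return USE_RED if F.topo_order < D_topo_order else USE_BLUE
        if F.HIGHER:
            return USE_RED
        if F.LOWER:
            return USE_BLUE
    elif D_higher:
        if F.HIGHER and F.LOWER:
            return USE_BLUE
        if F.LOWER:
            return USE_BLUE
        if F.HIGHER:
            if F.topo_order > D_topo_order:
                return USE_BLUE
            if F.topo_order < D_topo_order:
                return USE_RED
    elif D_lower:
        if F.HIGHER and F.LOWER:
            return USE_RED
        if F.HIGHER:
            return USE_RED
        if F.LOWER:
            if F.topo_order > D_topo_order:
                return USE_BLUE
            if F.topo_order < D_topo_order:
                return USE_RED
    else:  # D is unordered wrt S
        if F.HIGHER and F.LOWER:
            if primary_intf.OUTGOING and primary_intf.INCOMING:
                raise ValueError("this case should not occur")
            if primary_intf.OUTGOING:
                return USE_BLUE
            if primary_intf.INCOMING:
                return USE_RED
        if F.LOWER:
            return USE_RED
        if F.HIGHER:
            return USE_BLUE
    return None                 # no branch matched (not covered by the figure)

def select_alternates(D, F, primary_intf):
    proxy = D.order_proxy if D.order_proxy is not None else D
    if D is F or proxy is F:
        return PRIM_NH_IS_D_OR_OP_FOR_D
    return select_alternates_internal(D, F, primary_intf,
                                      proxy.LOWER, proxy.HIGHER, proxy.topo_order)
```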
 
   The logic behind Select_Alternates_Internal is described in
   Figure 25.  As an example, consider the first case described in the
   table, where D>>S and D<<S.  If this is true, then either S or D
   must be the block root, R.  If F>>S and F<<S, then S is the block
   root.  So the blue path from S to D is the increasing path to D, and
   the red path from S to D is the decreasing path to D.  If
   F.topo_order>D.topo_order, then either F is ordered higher than D or
   F is unordered with respect to D.  Therefore, F is either on a
   decreasing path from S to D, or it is on neither an increasing nor a
   decreasing path from S to D.  In either case, it is safe to take an
   increasing path from S to D to avoid F.  We know that when S is R,
   the increasing path is the blue path, so it is safe to use the blue
   path to avoid F.
 
   If instead F.topo_order<D.topo_order, then either F is ordered lower
   than D, or F is unordered with respect to D.  Therefore, F is either
   on an increasing path from S to D, or it is on neither an increasing
   nor a decreasing path from S to D.  In either case, it is safe to
   take a decreasing path from S to D to avoid F.  We know that when S
   is R, the decreasing path is the red path, so it is safe to use the
   red path to avoid F.
 
   If F>>S or F<<S (but not both), then D is the block root.  We then
   know that the blue path from S to D is the increasing path to R, and
   the red path is the decreasing path to R.  When F>>S, we deduce that
   F is on an increasing path from S to R.  So in order to avoid F, we
   use a decreasing path from S to R, which is the red path.  Instead,
   when F<<S, we deduce that F is on a decreasing path from S to R.  So
   in order to avoid F, we use an increasing path from S to R, which is
   the blue path.
 
   All possible cases are systematically described in the same manner in
   the rest of the table.
 
+------+------------+------+-----------------+------------+------------+
| D    | MRT blue   | F    | additional      | F          | Alternate  |
| wrt  | and red    | wrt  | criteria        | wrt        |            |
| S    | path       | S    |                 | MRT        |            |
|      | properties |      |                 | (deduced)  |            |
+------+------------+------+-----------------+------------+------------+
| D>>S | Blue path: | F>>S | additional      | F on an    | Use Red    |
| and  | Increasing | only | criteria        | increasing | to avoid   |
| D<<S,| path to R. |      | not needed      | path from  | F          |
| D is | Red path:  |      |                 | S to R     |            |
| R,   | Decreasing +------+-----------------+------------+------------+
|      | path to R. | F<<S | additional      | F on a     | Use Blue   |
|      |            | only | criteria        | decreasing | to avoid   |
|      |            |      | not needed      | path from  | F          |
| or   |            |      |                 | S to R     |            |
|      |            +------+-----------------+------------+------------+
|      |            | F>>S | topo(F)>topo(D) | F on a     | Use Blue   |
| S is | Blue path: | and  | implies that    | decreasing | to avoid   |
| R    | Increasing | F<<S | F>>D or F??D    | path from  | F          |
|      | path to D. |      |                 | S to D or  |            |
|      | Red path:  |      |                 | neither    |            |
|      | Decreasing |      +-----------------+------------+------------+
|      | path to D. |      | topo(F)<topo(D) | F on an    | Use Red    |
|      |            |      | implies that    | increasing | to avoid   |
|      |            |      | F<<D or F??D    | path from  | F          |
|      |            |      |                 | S to D or  |            |
|      |            |      |                 | neither    |            |
+------+------------+------+-----------------+------------+------------+
| D>>S | Blue path: | F<<S | additional      | F on       | Use Blue   |
| only | Increasing | only | criteria        | decreasing | to avoid   |
|      | shortest   |      | not needed      | path from  | F          |
|      | path from  |      |                 | S to R     |            |
|      | S to D.    +------+-----------------+------------+------------+
|      | Red path:  | F>>S | topo(F)>topo(D) | F on       | Use Blue   |
|      | Decreasing | only | implies that    | decreasing | to avoid   |
|      | shortest   |      | F>>D or F??D    | path from  | F          |
|      | path from  |      |                 | R to D     |            |
|      | S to R,    |      |                 | or         |            |
|      | then       |      |                 | neither    |            |
|      | decreasing |      +-----------------+------------+------------+
|      | shortest   |      | topo(F)<topo(D) | F on       | Use Red    |
|      | path from  |      | implies that    | increasing | to avoid   |
|      | R to D.    |      | F<<D or F??D    | path from  | F          |
|      |            |      |                 | S to D     |            |
|      |            |      |                 | or         |            |
|      |            |      |                 | neither    |            |
|      |            +------+-----------------+------------+------------+
|      |            | F>>S | additional      | F on Red   | Use Blue   |
|      |            | and  | criteria        |            | to avoid   |
|      |            | F<<S,| not needed      |            | F          |
|      |            | F is |                 |            |            |
|      |            | R    |                 |            |            |
+------+------------+------+-----------------+------------+------------+
| D<<S | Blue path: | F>>S | additional      | F on       | Use Red    |
| only | Increasing | only | criteria        | increasing | to avoid   |
|      | shortest   |      | not needed      | path from  | F          |
|      | path from  |      |                 | S to R     |            |
|      | S to R,    +------+-----------------+------------+------------+
|      | then       | F<<S | topo(F)>topo(D) | F on       | Use Blue   |
|      | increasing | only | implies that    | decreasing | to avoid   |
|      | shortest   |      | F>>D or F??D    | path from  | F          |
|      | path from  |      |                 | R to D     |            |
|      | R to D.    |      |                 | or         |            |
|      | Red path:  |      |                 | neither    |            |
|      | Decreasing |      +-----------------+------------+------------+
|      | shortest   |      | topo(F)<topo(D) | F on       | Use Red    |
|      | path from  |      | implies that    | increasing | to avoid   |
|      | S to D.    |      | F<<D or F??D    | path from  | F          |
|      |            |      |                 | S to D     |            |
|      |            |      |                 | or         |            |
|      |            |      |                 | neither    |            |
|      |            +------+-----------------+------------+------------+
|      |            | F>>S | additional      | F on Blue  | Use Red    |
|      |            | and  | criteria        |            | to avoid   |
|      |            | F<<S,| not             |            | F          |
|      |            | F is | needed          |            |            |
|      |            | R    |                 |            |            |
+------+------------+------+-----------------+------------+------------+
| D??S | Blue path: | F<<S | additional      | F on a     | Use Red    |
|      | Decr. from | only | criteria        | decreasing | to avoid   |
|      | S to first |      | not needed      | path from  | F          |
|      | node H>>D, |      |                 | S to H.    |            |
|      | then incr. +------+-----------------+------------+------------+
|      | to D.      | F>>S | additional      | F on an    | Use Blue   |
|      | Red path:  | only | criteria        | increasing | to avoid   |
|      | Incr. from |      | not needed      | path from  | F          |
|      | S to first |      |                 | S to G     |            |
|      | node G<<D, |      |                 |            |            |
|      | then decr. |      |                 |            |            |
|      |            +------+-----------------+------------+------------+
|      |            | F>>S | GADAG link      | F on an    | Use Blue   |
|      |            | and  | direction       | incr. path | to avoid   |
|      |            | F<<S,| S->F            | from S     | F          |
|      |            | F is +-----------------+------------+------------+
|      |            | R    | GADAG link      | F on a     | Use Red    |
|      |            |      | direction       | decr. path | to avoid   |
|      |            |      | S<-F            | from S     | F          |
|      |            |      +-----------------+------------+------------+
|      |            |      | GADAG link      | Implies F is the order  |
|      |            |      | direction       | proxy for D, which has  |
|      |            |      | S<-->F          | already been handled.   |
+------+------------+------+-----------------+------------+------------+
 
 
     Figure 25: determining MRT next-hops and alternates based on the
       partial order and topological sort relationships between the
    source(S), destination(D), primary next-hop(F), and block root(R).
       topo(N) indicates the topological sort value of node N.  X??Y
     indicates that node X is unordered with respect to node Y.  It is
   assumed that the case where F is D, or where F is the order proxy for
                       D, has already been handled.
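
   The table in Figure 25 can be read as a decision procedure.  The
   following is a minimal Python sketch of that reading, not the
   draft's own pseudocode.  The names Rel, Alternate and
   select_alternate, and the s_to_f / f_to_s flags for the GADAG link
   direction, are hypothetical and introduced here for illustration
   only.  The sketch assumes that F is ordered with respect to S and
   that the cases where F is D, or F is the order proxy for D, have
   already been handled.

```python
from dataclasses import dataclass
from enum import Enum


class Alternate(Enum):
    USE_BLUE = "blue"
    USE_RED = "red"


@dataclass
class Rel:
    """Partial-order relationship of a node with respect to S."""
    higher: bool  # node >> S
    lower: bool   # node << S


def select_alternate(d, f, topo_f, topo_d, s_to_f=False, f_to_s=False):
    """Pick the MRT color that avoids F, following the rows of Figure 25."""
    if not (f.higher or f.lower):
        raise ValueError("F is assumed to be ordered with respect to S")

    def by_topo():
        # topo(F) > topo(D) implies F>>D or F??D -> use Blue
        # topo(F) < topo(D) implies F<<D or F??D -> use Red
        if topo_f == topo_d:
            raise ValueError("F and D must have distinct topological order")
        return Alternate.USE_BLUE if topo_f > topo_d else Alternate.USE_RED

    f_is_root = f.higher and f.lower

    if d.higher and d.lower:        # either S or D is the block root R
        if f_is_root:               # F>>S and F<<S: S is R
            return by_topo()
        # D is R: avoid the direction that F lies on
        return Alternate.USE_RED if f.higher else Alternate.USE_BLUE

    if d.higher:                    # D>>S only
        if f_is_root or f.lower:
            return Alternate.USE_BLUE
        return by_topo()            # F>>S only

    if d.lower:                     # D<<S only
        if f_is_root or f.higher:
            return Alternate.USE_RED
        return by_topo()            # F<<S only

    # D??S: D is unordered with respect to S
    if f_is_root:
        if s_to_f and f_to_s:
            raise ValueError("F is the order proxy for D; handled earlier")
        if s_to_f:
            return Alternate.USE_BLUE
        if f_to_s:
            return Alternate.USE_RED
        raise ValueError("GADAG link direction between S and F is required")
    return Alternate.USE_BLUE if f.higher else Alternate.USE_RED
```

   For the first case discussed in the text (D>>S and D<<S, F>>S and
   F<<S, so S is R), calling select_alternate with topo_f greater than
   topo_d selects the blue path, and with topo_f less than topo_d it
   selects the red path.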
 
_______________________________________________
rtgwg mailing list
rtgwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/rtgwg
Jeff Tantsura | 25 Jun 21:33 2015

request for RTGWG slot at IETF 93 in Prague

Dear RTGWG,

If you would like an agenda slot for the upcoming RTGWG meetings
at IETF 93 in Prague, please send the chairs a request.

Thanks!

Jeff & Chris
internet-drafts | 25 Jun 14:13 2015

I-D Action: draft-ietf-rtgwg-lfa-manageability-11.txt


A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the Routing Area Working Group of the IETF.

        Title           : Operational management of Loop Free Alternates
        Authors         : Stephane Litkowski
                          Bruno Decraene
                          Clarence Filsfils
                          Kamran Raza
                          Martin Horneffer
                          Pushpasis Sarkar
        Filename        : draft-ietf-rtgwg-lfa-manageability-11.txt
        Pages           : 29
        Date            : 2015-06-25

Abstract:
   Loop Free Alternates (LFA), as defined in RFC 5286, is an IP Fast
   ReRoute (IP FRR) mechanism enabling traffic protection for IP traffic
   (and MPLS LDP traffic by extension).  Following first deployment
   experiences, this document provides operational feedback on LFA,
   highlights some limitations, and proposes a set of refinements to
   address those limitations.  It also proposes required management
   specifications.

   This proposal is also applicable to the remote LFA solution.

The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-lfa-manageability/

There's also a htmlized version available at:
https://tools.ietf.org/html/draft-ietf-rtgwg-lfa-manageability-11

A diff from the previous version is available at:
https://www.ietf.org/rfcdiff?url2=draft-ietf-rtgwg-lfa-manageability-11

Please note that it may take a couple of minutes from the time of submission
until the htmlized version and diff are available at tools.ietf.org.

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/
Joel Jaeggli | 25 Jun 07:14 2015

Joel Jaeggli's No Objection on draft-ietf-rtgwg-lfa-manageability-09: (with COMMENT)

Joel Jaeggli has entered the following ballot position for
draft-ietf-rtgwg-lfa-manageability-09: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-lfa-manageability/

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Ron Bonica's Opsdir review.

Folks,

I have reviewed this document as part of the Operational directorate's
ongoing effort to review all IETF documents being processed by the IESG. 
These comments were written with the intent of improving the operational
aspects of the IETF drafts. Comments that are not addressed in last call
may be included in AD reviews during the IESG review.  Document editors
and WG chairs should treat these comments just like any other last call
comments.

This document is on the Standards Track. It provides operational feedback
on LFA, highlights some limitations, and proposes a set of refinements to
address those limitations.  It also proposes required management
specifications.

The document is well-written and nearly ready for publication.

Major Issues
----------------
None

Minor Issues
---------------
- Please run this document through the NIT checker and address the NITS

- I am not sure how the sitting IESG feels about the use of lowercase
"must", "should" and "may". You may want to check this before the IESG
review.

Ron Bonica

---

An example that I would cite as a good candidate for all caps:

6.1
...

   o  Per prefixes: prefix protection SHOULD have a better priority
      compared to interface protection.  This means that if a specific
      prefix must be protected due to a configuration request, LFA must
      be computed and installed for this prefix even if the primary
      outgoing interface is not configured for protection.

LFA MUST

since it's a requirement

In most other cases where I see a lowercase "must", what is being
described is the logic that draws you to a conclusion, and those are ok.
Ben Campbell | 23 Jun 23:32 2015

Ben Campbell's No Objection on draft-ietf-rtgwg-lfa-manageability-09: (with COMMENT)

Ben Campbell has entered the following ballot position for
draft-ietf-rtgwg-lfa-manageability-09: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-lfa-manageability/

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I've got nothing that rises to the level of a discuss, but I do have a
few comments:

*** Substantive Comments: ***

-- section 4, last paragraph:

This is an odd thing to make normative. It seems more a question of
business and ecosystem decisions.

-- 6.2.4.2:

Can you explain the "color" metaphor, or reference an explanation?

-- 8:

The entirety of the security considerations is of the form of "No new
considerations compared to [RFC5286]". Please offer supporting arguments
for that. For example, a high-level description of the nature of the
changes, new behaviors, or clarifications introduced in this draft, and
how those do or do not impact the security considerations.

*** Editorial Comments: ***

-- General:

There are pervasive grammatical errors, especially incorrect use of
singular and plural forms, missing articles, and incorrect use of commas.
The RFC editor will fix these, but you could save them quite a bit of
work by making another proofreading pass.

Also, throughout the draft, there are lists of examples in the form of
":A,B,C...". I suggest using the form "A,B,C etc.", or even better "For
example, A, B, and C" or "e.g., A, B, and C".

-- section 2, first bullet:

Last sentence is a fragment.

-- 3.1, last paragraph:

This seems to say that, in addition to the technical issues mentioned
here, service providers simply might not like it. Is that really what you
mean to say?

-- 4, last bullet:

Missing "to" . Otherwise, this renders to "... may be able monitor
constantly..."

-- 5, first paragraph: "As all FRR mechanism, ..."

I think there's one or more missing words. Do you mean "As [in/for] all
other FRR mechanisms, ..."?

"Depending of the hardware..."

s/of/on

"... compared to the amount of destinations in RIB."

I suggest " ... compared to the number of destinations in the RIB."

-- 2nd paragraph " ... permit to compute ... "

"Permit" needs a direct object. That is "... permits [something] to
compute...". What is the something?

(Note: This construction appears several times)

-- 6.1, first paragraph: "The granularity of LFA activation should be
controlled ..."

-- 6.1, last bullet in first list: "... SHOULD have a better priority ..."

s/better/higher

-- 6.2, item 3 in the numbered list: "... an implementation SHOULD pick
only one based on its own decision, as a default behavior."

A "default behavior" implies that people might make non-default choices.
I suggest striking the phrase ", as a default behavior".

I'm not sure what that means.

-- 2nd paragraph: " ... following criteria:"

I suggest s/criteria/granularities

(Note: I see quite a bit of the use of the word "criteria" when I think
you mean something else. )

-- 6.2.3:

What does "enhanced criteria" mean? Also the word "Downstreamness" seems
a bit of a stretch, but if the RFC editor lets that get by then more
power to you :-)

-- 6.2.4.1:

Please expand SRLG on first mention.

-- 6.2.5.4: 5th paragraph "... it is needed to limit..."

perhaps "... implementations need to limit..."

-- 9 and 10:

Section 9 is out of place (should probably go right before references),
and 10 is empty.
Alvaro Retana (aretana | 23 Jun 15:44 2015

Re: Alvaro Retana's No Objection on draft-ietf-rtgwg-lfa-manageability-09: (with COMMENT)

On 6/23/15, 5:22 AM, "stephane.litkowski <at> orange.com"
<stephane.litkowski <at> orange.com> wrote:

Stephane:

I'm looping everyone else back in..I know Brian had the same comment.

>For the point #3, we had a comment from Alia on the list saying that we
>needed to point to some existing solutions.
>
>We propose to change the text as follows :
>BEFORE :
>Link color information SHOULD be signalled in the IGP.  How
>   signalling is done is out of scope of the document but it may be
>   useful to reuse existing admin-groups from traffic-engineering
>   extensions or link attributes extensions like in
>   [I-D.ietf-ospf-prefix-link-attr].
>NEW TEXT :
> Link color information SHOULD be signalled in the IGP in order to limit
>configuration effort.  e.g.
>   [I-D.ietf-ospf-prefix-link-attr], [RFC5305], [RFC3630] ...
>
>Does it work ?

Honestly, both options point at the same thing: the suggestion (at least)
of a solution.  I am completely in favor of reusing
technology/ideas/drafts if they will help solve the problem.  My point is
that this document is not specifying solutions, just requirements..if that
is true, then don't point at the solutions.  OTOH, if this document is to
specify solutions, then let's do that.

Having said all that, I can defer to Alia.  However, please at least make
it clear that the solutions you are pointing to are just that (pointers).
In the text above I would rather keep the original text that clearly
states that solutions are out of scope.  The text in 6.2.4.4 doesn't
explicitly say that the solutions are not in scope.

Thanks!

Alvaro.

>3. In Section 6.2.4.2 the document talks about signaling color
>information, it includes a set of requirements..and it reads "How
>signaling is done is out of scope of the document", but then you go on
>and point to a specific solution.  Even if there might be a high
>certainty that the solution you point at is moving on in the process, is
>good, should be used, etc..  I think this document would be better served
>by just defining the requirements (especially if you're pointing at the
>solution as out of scope).   You do the same in 6.2.4.4.
