RE: draft-bryant-filsfils-fat-pw
Yaakov Stein <yaakov_s <at> rad.com>
2007-12-03 22:13:40 GMT
I am not sure what you mean by "IP PW".
Do you mean simply IP over MPLS?
In that case, yes, the CW first nibble is designed to help with load
balancing.
As I understand this draft, the question is load balancing of PW
traffic, not IP over MPLS.
Not being an operator, I am not sure how important this is.

Y(J)S
OK, so speaking as yet another operator....

there's a clear need to support fat PWEs, but I'm yet to be convinced
that this draft is the correct solution to the problem.

The intro to the draft talks about the application being to
interconnect IP routers. If that's the case then why not use an IP
pseudowire? If you do that then there will just be one label, but
(AFAIK) many routers will spot the 0x4 (or 0x6) in the first nibble of
the payload and do a hash on the IP header - giving optimum traffic
distribution and also preserving the order of each flow.
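
The first-nibble behaviour described above can be sketched roughly as
follows. This is a minimal illustration in Python, not any vendor's
actual algorithm: the function names, the use of CRC32, the choice of
header fields, and the fallback to hashing the labels are all
assumptions made for the example.

```python
import struct
import zlib


def split_label_stack(packet: bytes):
    """Return (labels, payload), reading 4-byte entries until the S bit."""
    labels, offset = [], 0
    while True:
        (entry,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        labels.append(entry >> 12)      # top 20 bits: label value
        if (entry >> 8) & 1:            # S bit: bottom of stack reached
            return labels, packet[offset:]


def pick_link(packet: bytes, n_links: int) -> int:
    """Choose an egress link index in range(n_links) for one packet."""
    labels, payload = split_label_stack(packet)
    nibble = payload[0] >> 4
    if nibble == 4:                     # looks like IPv4
        ihl = (payload[0] & 0x0F) * 4
        # addresses, protocol, and the first 4 bytes after the IP header
        # (the port numbers, for TCP and UDP)
        key = payload[12:20] + payload[9:10] + payload[ihl:ihl + 4]
    elif nibble == 6:                   # looks like IPv6
        # next header plus source and destination addresses
        key = payload[6:7] + payload[8:40]
    else:                               # not IP: only the labels are usable
        key = b"".join(struct.pack("!I", label) for label in labels)
    return zlib.crc32(key) % n_links


def mpls_entry(label: int, s: int, ttl: int = 64) -> bytes:
    """Build one label stack entry (hypothetical helper for the demo)."""
    return struct.pack("!I", (label << 12) | (s << 8) | ttl)


# A made-up UDP-over-IPv4 packet under two labels.
ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 24, 0, 0, 64, 17, 0,
                   bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
udp_ports = struct.pack("!HH", 5000, 6000)
pkt = mpls_entry(100, 0) + mpls_entry(200, 1) + ipv4 + udp_ports
link = pick_link(pkt, 4)
```

Every packet of the same IP flow yields the same key and so the same
link, which is what preserves ordering. A payload whose first nibble is
neither 4 nor 6 (for example, one that starts with a control word)
falls through to the label-only branch, so all of its packets follow
one path.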

If the payload is not IP then I think we have a problem at any rate, as
we don't necessarily know how to identify a "flow". Sure, you could do
a MAC hash for an Ethernet pseudowire, but in many cases you see
precisely one pair of MAC addresses on the PWE.

Giles
On Nov 28, 2007 2:47 PM, Shane Amante <shane <at> castlepoint.net> wrote:
Hi Yaakov,

Yaakov Stein wrote:
> Stewart and other authors
>
> I just finished reading the FAT-PW draft, and have a few
> comments/questions.
>
> 1. The draft says "Operators have requested the ability..."
> Since I have never heard this request from any of the operators with
> which we work, can this be changed to "Some operators have
> requested ..." ?
> Since there is one operator on the author list, I guess we can guess
> which operator has requested this feature !

Speaking as /another/ operator, I can say there is an absolutely strong
need to solve this problem (and there has been for quite a long time,
actually). Consider the fact that 10 GbE has become (is becoming?) a
pretty common access circuit to Backbones and that within most SP
networks the dominant Backbone link size is 10G. As you're likely well
aware, the IEEE HSSG is working on both 40 GbE and 100 GbE. Once 40 GbE
is available (and assuming it's used for WAN connectivity, perhaps
similar to 10 GbE LAN PHY), then OC-768c Backbone links will suffer the
same problem. 100 GbE will, eventually, be used as both core and access
links. In short, this problem is not going away. We need to solve it.

> 2. The example given is for Ethernet PWs. Is this draft limited to
> this case?
> There is discussion of whether it is limited to IP over Ethernet,
> but this more basic question is not addressed.
> For example, could this load balancing be performed for ATM PWs
> based on the AAL5 flows?

From my perspective, Ethernet is far and away the biggest "problem
child" out there today, due to the size of access to Backbone links,
(see above). While it may be admirable to look at making this draft
"generic" for a variety of PW types, I wouldn't lose any sleep if this
draft remained focused on just Ethernet.

> 3. PWs are an emulation of the native service.
> Why is this emulation being called upon to deliver a feature NOT
> present in the native service ?
> Doesn't this break the model a bit?
>
> 4. A native service processing function is required for
> differentiating between different flows at ingress. If this draft is
> indeed limited to Ethernet PWs, such a processing function already
> exists in the native service. 802.3 clause 43 (LAG) defines
> conversations for exactly this purpose (commonly implemented by
> hashing IP addresses and port numbers), and even mentions the use of
> load balancing in the distribution of conversations over links.
> I think this function should be at least referenced.
>
> 5. My greatest problem is with the preferred mode of section 1.1,
> which builds a PW label stack under the MPLS label stack.
> The proposal is for 2 PW labels (once again, somewhat breaking
> RFC3985).
> Figure 2 is not completely clear about the label structure.
> There are two possibilities:
>   1) both load balancing label and PW label have stack bit set. (I
>      hope not !)
>   2) the load balancing label has S=1, and the PW label has S=0.
>      So formally, the PW label seems to be an MPLS label.
> Both possibilities break the standard model.
>
> I would certainly like to see more justification of the problem
> before breaking the model in this way.
> Perhaps a short requirements document is in order?

When I read the draft, this is the part I also had the most concern
with. In particular, I like the "simplicity" of the LB Label approach
(i.e.: savings on FIB space, no need to signal first and last labels
for each PW, etc.); however, I am concerned about the implications of,
or potential need to, define a 'generic' MPLS PW label.

My primary concern is future extensibility. Specifically, in case there
are /other/ applications, which may or may not have been brought to the
surface, yet, that may have similar needs/desire for a 2nd PW label. If
that ultimately means we gain consensus to amend the PWE3 Architecture,
I'm OK with that, but certainly we would need to have more discussion
to see whether or not it is a good approach and, more importantly, what
are the other implications that go along with it?
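
For readers trying to picture the two label-structure possibilities
raised in point 5, here is a small Python sketch of the label stack
entry layout from RFC 3032 (20-bit label, 3-bit TC/EXP, 1-bit S, 8-bit
TTL). The label values and function names are hypothetical, and placing
the load balancing label below the PW label follows the reading in
point 5 rather than anything confirmed by the draft's Figure 2.

```python
import struct


def encode_entry(label: int, s: int, tc: int = 0, ttl: int = 255) -> bytes:
    """Pack one label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    return struct.pack("!I", (label << 12) | (tc << 9) | (s << 8) | ttl)


def decode_stack(data: bytes):
    """Return a list of (label, s) pairs, stopping at the first S=1."""
    entries = []
    for offset in range(0, len(data), 4):
        (word,) = struct.unpack_from("!I", data, offset)
        entries.append((word >> 12, (word >> 8) & 1))
        if entries[-1][1]:              # bottom of stack: stop parsing
            break
    return entries


# Hypothetical label values, chosen only for illustration.
TUNNEL, PW, LB = 1000, 2000, 3000

# Possibility 2: PW label has S=0, load balancing label has S=1.
option2 = encode_entry(TUNNEL, 0) + encode_entry(PW, 0) + encode_entry(LB, 1)
print(decode_stack(option2))   # [(1000, 0), (2000, 0), (3000, 1)]

# Possibility 1: both PW label and load balancing label have S=1.
option1 = encode_entry(TUNNEL, 0) + encode_entry(PW, 1) + encode_entry(LB, 1)
print(decode_stack(option1))   # [(1000, 0), (2000, 1)]
```

With possibility 1, a parser that stops at the first S=1 never sees the
load balancing label as a label at all; with possibility 2 the PW label
is no longer at the bottom of the stack.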

> 6. The draft recommends generating a load balancing label in such
> fashion that the entropy is high. This assumes that the precise form
> of the label is used to determine the load balancing path (possibly a
> hash of some sort).
> Could this mechanism, even if beyond the scope of the document, be
> explained a bit more ?

Load-balancing over LAG and ECMP paths, using some number of MPLS
labels as input to a load-balancing hash algorithm, is common across
all vendors. However, such algorithms are 'proprietary' to each vendor.
I'm not sure how much more can be said other than the fact that one
would strongly prefer that the output of a LAG or ECMP hashing
algorithm is spread out among the largest number of hash buckets (as is
practical), to get the most even distribution of flows across a set of
N links in a LAG or ECMP path. And, I think the draft already makes
this point, in Section 3:
---snip---
   It is recommended that the method chosen to generate the load
   balancing labels introduces a high degree of entropy in their
   values, to maximise the entropy presented to the ECMP path
   selection mechanism in the LSRs in the PSN, and hence distribute
   the flows as evenly as possible over the available PSN ECMP paths.
---snip---
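
As a rough illustration of why label entropy matters to a hash-based
LAG or ECMP selector, the Python sketch below maps labels to one of N
links and compares two label-generation strategies. Since real vendor
hashes are proprietary, SHA-256 is used here purely as a stand-in; the
function names, the flow count, and the label values are assumptions.

```python
import hashlib
import random
from collections import Counter


def link_for_label(label: int, n_links: int) -> int:
    """Stand-in for a vendor hash: map a 20-bit label to a link index."""
    digest = hashlib.sha256(label.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big") % n_links


def spread(labels, n_links):
    """Count how many flows land on each link."""
    return Counter(link_for_label(label, n_links) for label in labels)


rng = random.Random(0)                   # fixed seed, repeatable run
# Labels 0-15 are reserved, so draw from 16 upward.
high_entropy = [rng.randrange(16, 2**20) for _ in range(1000)]
low_entropy = [16] * 1000                # every flow gets the same label

print(sorted(spread(high_entropy, 4).items()))
print(sorted(spread(low_entropy, 4).items()))
```

When every flow carries the same label, all of them hash to a single
link regardless of how good the hash is; varied labels give the hash
something to distribute.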
Is there something else you had in mind?
-shane
> 7. With the optional mode of section 1.2 several PW labels are
> mapped to a single AC.
> I have no problem with this approach. In fact, I feel that it is
> somewhat similar to the solutions being proposed for PW protection.
> For PW protection two labels are mapped to the AC or end-user
> application, where one label belongs to the active PW, and the other
> to the backup PW (not being used).
> For load balancing two or more PWs, all in active state, are mapped
> to the same AC.
> Would it be possible to integrate the two features into one mechanism
> for mapping multiple PW labels in either active or backup state to
> one AC or end-user identifier?
>
> 8. The term VC as opposed to PW is used in various places in the
> document. I am not sure what is meant here. Is the intent that a "VC"
> is one of the paths of the load-balanced "PW" ?
>
> The first paragraph of section 4 seems to imply that the authors are
> willing to settle on either of the modes rather than both. I would
> support the PW label mode.
> If some entropy-rich information needs to be placed in the packet,
> perhaps the flags in the CW could be used (if 16 paths is sufficient).
>
> Y(J)S