Jan Novak (janovak | 1 Feb 2010 13:06

Re: Descriptive paragraph on flow export benchmarking

Hi Al,

Not quite sure if I am supposed to comment but just two details:

'characterize the both' seems to be somehow out of context ??

'multipurpose routing' - I would prefer 'forwarding ' instead
of routing here - Flow Monitoring can (optionally) do both L2 and L3.

Tx, Jan

The climate of Edinburgh is such that the weak
succumb young .... and the strong envy them.

                                 Dr. Johnson  

> -----Original Message-----
> From: bmwg-bounces <at> ietf.org [mailto:bmwg-bounces <at> ietf.org] On Behalf Of Al Morton
> Sent: 29 January 2010 15:02
> To: bmwg <at> ietf.org
> Subject: [bmwg] Descriptive paragraph on flow export benchmarking
> 
> BMWG,
> 
> At our interim meeting, we suspended the discussion of flow export
> benchmarking with the goals to work out how we would include this
> work in our charter and reach consensus on the detailed discussion
> points raised.

Benoit Claise | 1 Feb 2010 14:27

Re: Descriptive paragraph on flow export benchmarking

Al,

What about this?

Flow Export and Collection: Develop terminology and methods to
characterize network devices flow monitoring, export, and collection.
The goal is a methodology to assess the maximum IP flow rate that a
network device can sustain without losing any IP flow information or
compromising the accuracy of information exported on the IP flows,
and to assess the forwarding plane performance (if the forwarding
function is present) in the presence of Flow Monitoring.

Regards, Jan and Benoit
> BMWG,
>
> At our interim meeting, we suspended the discussion of flow export
> benchmarking with the goals to work out how we would include this
> work in our charter and reach consensus on the detailed discussion
> points raised.
>
> To get things going again, let me suggest a starting paragraph to
> describe this work area (below) for future re-chartering.
>
> To pick-up where the discussion stopped, please see the meeting notes
> http://home.comcast.net/~acmacm/BMWG/BMWG-Int-Meet.html
> (see item 2. in the Detailed Notes)
>
> Please make your comments on the bmwg-list.
>

Al Morton | 1 Feb 2010 16:08

Re: Descriptive paragraph on flow export benchmarking

I like it, the goal has a better flow...

Other comments?

Al
bmwg chair

At 08:27 AM 2/1/2010, Benoit Claise wrote:
>Al,
>
>What about this?
>
>Flow Export and Collection: Develop terminology and methods to
>characterize network devices flow monitoring, export, and collection.
>The goal is a methodology to assess the maximum IP flow rate that a
>network device can sustain without losing any IP flow information or
>compromising the accuracy of information exported on the IP flows,
>and to assess the forwarding plane performance (if the forwarding
>function is present) in the presence of Flow Monitoring.
>
>Regards, Jan and Benoit
Kris Michielsen | 4 Feb 2010 16:04

Re: WGLC: draft-ietf-bmwg-igp-dataplane drafts

Anuj,
 
Comments in green, marked with [Kris1:].
I also added comments in the attached draft and attached another document with accuracy interval calculations (see below).
Please provide your feedback.
 

From: Dewangan, Anuj [mailto:Anuj.Dewangan <at> spirent.com]
Sent: 24 January 2010 02:20
To: Kris Michielsen; Al Morton; sporetsky <at> allot.com; bimhoff <at> planetspork.com
Subject: RE: [bmwg] WGLC: draft-ietf-bmwg-igp-dataplane drafts

Hi Kris,

 

I have commented inline in red and marked as [Anuj1:]. Also I have added comments/additions to the draft and have attached it. Please look for “[Anuj:]” to find the comments. However, the draft still does not answer some problems/issues seen when such a test is performed practically:

 

1. Due to inherent jitter in the traffic forwarded by the DUT, the graph is never as smooth as in theory. Even without a convergence event, the traffic rate is seen fluctuating due to a combination of jitter in the forwarded traffic and the resolution of the sampling interval, which is supposed to be as small as possible (subject to the definition of at least one packet per route) and should usually be in milliseconds for any useful/accurate measurement. As an example, if there are only a few routes in the test, then even a couple of extra packets seen in a sampling interval (due to forwarding jitter) will cause a major fluctuation in the convergence graph. In such a case, the convergence instants are very difficult (or impossible) to calculate. This is a problem even with a “normal” number of routes but a very small sampling interval – which is possible if the offered rate (=DUT throughput) is high. This is not addressed anywhere in the draft.

[Kris1:] This is mainly a problem in the cases where variations in rate need to be observed. For cases where there is a transition rcv rate X -> rcv rate 0 or rcv rate 0 -> rcv rate X it is less of a problem, jitter will add to the error interval which is already fairly large.
Assuming jitter is symmetric around an average forwarding delay and this average forwarding delay is constant, and assuming that jitter == n*1/Offered Load, and packet sampling interval is of duration N*#routes/Offered Load, the expected amount of packets under steady state in a packet sampling interval is between N*#routes-2n and N*#routes+2n.
If N>2n and the #received packets is outside of the above #received packets interval under steady state, one can decide a variation in rate has occurred.
If convergence has not yet completed for >=1 route during a sampling interval, the #received packets in a sampling interval is <= N*(#routes-1). So, under the above assumptions, and if N>2n one can decide the convergence recovery instant was not reached if #received packets < N*#routes-2n. So the larger the jitter, the larger the packet sampling interval needs to be to derive the convergence recovery instant.
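
The steady-state bound and the recovery check above can be sketched in Python. This is only an illustration under the same assumptions (symmetric jitter of n packet times around a constant average forwarding delay, sampling interval of N packet cycles over all routes); the function and parameter names are hypothetical and do not come from the draft.

```python
def steady_state_bounds(N, routes, n):
    """Expected packet count per sampling interval under steady state.

    N      -- sampling interval length, in units of #routes/Offered Load
    routes -- number of routes receiving traffic
    n      -- jitter, in units of 1/Offered Load (packet times)
    """
    expected = N * routes
    return expected - 2 * n, expected + 2 * n


def recovery_not_reached(received, N, routes, n):
    """True if the received count is below the steady-state lower bound,
    i.e. the convergence recovery instant was not reached in this interval.

    Only meaningful when N > 2n; otherwise jitter alone could explain a
    shortfall of one route's worth of packets.
    """
    if N <= 2 * n:
        raise ValueError("need N > 2n for the check to be conclusive")
    lower, _ = steady_state_bounds(N, routes, n)
    return received < lower
```

With N=5, 100 routes and n=2, the steady-state window is 496 to 504 packets, so a count of 495 would indicate that recovery was not yet complete.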
I would propose the following change:

"If the Packet Sampling Interval is large compared to the time
   between the convergence time instants, then the different time
   instants may not be easily identifiable from the Forwarding Rate
   observation.  Using a small Packet Sampling Interval in the
   presence of jitter may cause fluctuations of the Forwarding Rate
   observation and can prevent accurate measurement of the different
   time instants.  The requirements for the Packet Sampling Interval
   are specified in [Po09t].  The Packet Sampling Interval MUST be
   larger than or equal to the time between two consecutive packets
   to the same route.  For maximum accuracy the value for the Packet
   Sampling Interval SHOULD be as small as possible, but the presence
   of jitter may enforce using a larger Packet Sampling Interval.
   The Packet Sampling Interval MUST be reported."
 

2. Sampling interval just as a function of the number of routes and the offered rate is not sufficient, as is seen above. For the ECMP tests, because the Sampling Interval value is set on each egress port and it is calculated as being the time for sending one packet per route, and each ECMP egress port receives part of the traffic (corresponding to the partial routes corresponding to that egress port on the DUT FIB), the convergence graph on each of the ports fluctuates even more. Hence, while adding up the rates from these ports, it becomes even more difficult to determine convergence instants, especially the recovery instant. Again, this has not been addressed in the draft.

[Kris1:] Apart from different jitter characteristics by having multiple egress interfaces, can you explain why the convergence graph would fluctuate more?

 

3. Please give special attention to my comments on Offered Load and the measurement accuracy in the attached document. I have additional comments on things that I find missing in the draft and will comment on it when required.

 

The answers to these may need changes to meth and the term drafts. Are you willing to work with me on this?

 

Thanks,

Anuj

 

From: Kris Michielsen [mailto:kmichiel <at> cisco.com]
Sent: Thursday, January 21, 2010 6:29 AM
To: Dewangan, Anuj; 'Al Morton'; sporetsky <at> allot.com; bimhoff <at> planetspork.com
Subject: RE: [bmwg] WGLC: draft-ietf-bmwg-igp-dataplane drafts

Anuj,

 

Thank you for taking the time to review these drafts.

Replies below in green.

 

From: Dewangan, Anuj [mailto:Anuj.Dewangan <at> spirent.com]
Sent: 20 January 2010 17:26
To: Kris Michielsen; Al Morton; sporetsky <at> allot.com; bimhoff <at> planetspork.com
Subject: RE: [bmwg] WGLC: draft-ietf-bmwg-igp-dataplane drafts

Answers inline in red.

 

From: Kris Michielsen [mailto:kmichiel <at> cisco.com]
Sent: Thursday, January 14, 2010 8:52 AM
To: Dewangan, Anuj; 'Al Morton'; sporetsky <at> allot.com; bimhoff <at> planetspork.com
Subject: RE: [bmwg] WGLC: draft-ietf-bmwg-igp-dataplane drafts

 

Hi Anuj,

 

Many thanks for your very valuable comments and suggestions. See comments and questions below.

From: Dewangan, Anuj [mailto:Anuj.Dewangan <at> spirent.com]
Sent: 23 November 2009 17:54
To: Al Morton; sporetsky <at> allot.com; bimhoff <at> planetspork.com; kmichiel <at> cisco.com
Subject: RE: [bmwg] WGLC: draft-ietf-bmwg-igp-dataplane drafts

Hi All,

 

Some comments on the dataplane drafts (http://tools.ietf.org/html/draft-ietf-bmwg-igp-dataplane-conv-meth-19 and http://tools.ietf.org/html/draft-ietf-bmwg-igp-dataplane-conv-term-19) that I sent to the authors last year:

 

1. Section 3.1 and Section 3.2:

 

For the first topology, the Tester emulates two routers and routes for traffic destinations. If the Tester is assumed to be able to do that, why is it assumed that R2 cannot be emulated by the Tester? By doing that, the convergence time on R1 can be calculated. The whole point is that, instead of making assumptions about a Tester's capabilities, the standard should talk about what topologies should look like to measure convergence on a particular device. How that is done should be left to the Tester.

There are differences between R2 being emulated by the Tester and R2 being a real device: a real device needs time to detect the failure, schedule, generate and transmit the LSP/LSA. These may sum up to a significant part of the total convergence time equation, which is lacking or not matching reality when emulating that device.

 

The Tester emulation can run an implementation of the same routing protocols in question here and should be capable of performing routing functions like a “real” router. Assumptions on tester capabilities/incapabilities should be avoided. 

Obviously a Tester can perfectly emulate routing protocols. But the timing of a real device R2 is crucial in this testcase. If R2 is not a real router of the same type as R1 then you are measuring only a part of the convergence time equation, and you get a testcase equivalent to the IGP metric change in 8.3.2.

 

[Anuj1:]  Could you elaborate on why R2 should be the same device model as R1 and how the “timing” of R2 influences the convergence times? The role of control plane in a “dataplane” only convergence test is restricted to signaling changes in the topology or simulating some fail-overs (like stopping hellos to simulate router down etc). This can be done equally well by a tester as any other device because it would be running the same IGP protocol. The additional hop between R2 and the Tester can be simulated by the Tester too. The only case where this may be important is the presence of lots of routes where the protocol parameters like LSA/LSP update intervals may determine the resulting traffic patterns. But again these “Protocol timings on the Tester SHOULD be made equal to the timings on R1.” – a condition like this will ensure that a convergence test is performed using a single router (DUT) like R1. 

[Kris1:] The goal is to benchmark a real router behaviour. This is very clear in a  local link failure test with a single router where all convergence operations will be done on this single box. Convergence time roughly comes down to this equation: convergence time = failure detection time + LSA/LSP gen hold time + LSA/LSP gen duration + [LSA/LSP flood time] + SPT hold time + SPT duration + RIB update time + FIB update time. The flooding term between []s is only present for remote failure. For remote failure the terms before the flooding are handled by R2 while the terms after the flooding are handled by R1. Replacing R2 by an emulated router removes measurement of the first half of the equation. This could be an interesting measurement, but not the goal of this test.
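
As a rough illustration of the equation above, the terms can be grouped by which router handles them. This is a sketch only: the class and function names are hypothetical, all values are in seconds, and any numbers fed into it would be placeholders rather than measurements.

```python
from dataclasses import dataclass


@dataclass
class ConvergenceTerms:
    """Terms of the convergence time equation, all in seconds."""
    failure_detection: float
    lsa_gen_hold: float
    lsa_gen_duration: float
    lsa_flood: float      # only counted for a remote failure
    spt_hold: float
    spt_duration: float
    rib_update: float
    fib_update: float


def before_flooding(t):
    """Terms handled by R2 in the remote failure case."""
    return t.failure_detection + t.lsa_gen_hold + t.lsa_gen_duration


def after_flooding(t):
    """Terms handled by R1 in the remote failure case."""
    return t.spt_hold + t.spt_duration + t.rib_update + t.fib_update


def convergence_time(t, remote):
    """Rough total; the flooding term is included only for a remote failure."""
    flood = t.lsa_flood if remote else 0.0
    return before_flooding(t) + flood + after_flooding(t)
```

Replacing R2 by an emulated router corresponds to leaving before_flooding() out of the measurement, which is the part of the equation the test is meant to include.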

 

 

Also, the topologies are fundamentally very restrictive. There is a possibility of a topology where multiple egress interfaces are present. Each interface except the Preferred Egress Interface advertises the same route cost. So effectively there can be N Next-best Egress Interfaces. When a Convergence Event takes place, the traffic should move from the Preferred Egress Interface to load distribution across the Next-best Interfaces till total convergence is achieved in the network. If such a topology is not acceptable, then it should be clearly mentioned and the reason for it stated. If such a topology is not acceptable, then the same reasoning should be applied to the N Interfaces for Section 3.3. If the focus of this standard is not on such cases, then these should be mentioned as out of scope.

Same applies to the topology in Section 3.4 for remote events. Tester capability is assumed and not documented.

 

The likelihood of having N to N-1 convergence is much higher than that of a 1 to N convergence. But I have no objections against such a topology.

 

Does this not mean that such a topology should either be addressed in the benchmark or a reason given as to why this is not addressed? 

I added it for now, but I'm not yet fully convinced if these cases are needed.

 

[Anuj1:]  Figure 1 now is just a specific case of Figure 4 i.e. Figure 4 with number of members in next-best ECMP set = 1 is equivalent to Figure 1. This is what I originally meant with my comment, that Figure 1 is too specific and should be generalized like you have in Figure 4. The only case is that either topology for Figure 1 should be removed or should be highlighted as a specific case of topology in Figure 4.

 

Similar argument as above applies to Figure 5 and Figure 2.

[Kris1:] I would rather specify that Figure 1 and 2 are the tests one SHOULD do while one MAY do the tests of Figure 4 and 5 since I think the 1-to-N case is far less important than 1-to-1 and N-to-N-1.

 

 

 

2. Measurement accuracy for the loss-derived method (6.1.3) should specify which metric it is referring to, e.g. if it is the metric calculated as "Connectivity Packet Loss/Offered Load" then the convergence time may be up to 1 inter-packet arrival period more. This is because the first packet in the sequence of packets that got dropped could possibly have been dropped had it arrived anywhere in the interval between the last packet to this route and this packet. The inter-packet arrival period is calculated as "1/Offered Load". Also, the convergence time could be just greater than the inter-packet arrival period for "Connectivity Packet Loss - 1" packets = "(Connectivity Packet Loss - 1)/Offered Load". Again, this is possible if there were a packet following the last packet dropped in the sequence of dropped packets to the route, in the interval between the last packet dropped and the packet following it (= inter-packet arrival period). Hence in this case, the range of the metric is "Connectivity Packet Loss/Offered Load +- 1/Offered Load". Ranges should be specified for each of the reported metrics.

I agree, the accuracy needs to be corrected.
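
The range suggested in point 2 can be written out as a small Python sketch. The function name is hypothetical, and the units (packets and packets per second, giving seconds) are an assumption for illustration.

```python
def loss_derived_range(lost_packets, offered_load):
    """Return (low, nominal, high) loss-derived convergence time in seconds.

    lost_packets -- Connectivity Packet Loss, in packets
    offered_load -- Offered Load in packets per second, as used in the
                    "Connectivity Packet Loss/Offered Load" formula
    """
    inter_packet = 1.0 / offered_load      # inter-packet arrival period
    nominal = lost_packets / offered_load
    return nominal - inter_packet, nominal, nominal + inter_packet
```

For 50 lost packets at 100 packets/second this gives a nominal 0.5 s with a range of roughly 0.49 s to 0.51 s.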

 

3. Section 6.2.1 recommends a Sampling Interval. There is no discussion of the relationship between the offered traffic rate and the sampling interval, e.g. if the offered traffic rate is 10 packets/second and the Sampling Interval is 10 ms, then 1 packet is received every 10 Sampling Intervals. This means that 9 out of 10 Sampling Intervals have a traffic rate of 0 because no packets were received during those Sampling Intervals. This will have a profound impact on the convergence graph. The argument of the offered rate being equal to the DUT throughput and hence not being a small value would be a generic assumption on all DUTs and should not be resorted to in a standard, because there is no knowledge about the DUT throughput and anything else would be an assumption. Instead of recommending a fixed value, the sampling interval should be recommended to be a function of the following:

 

i. Offered Traffic rate:

 

This would mean that the Sampling Interval would be calculated based on the Offered traffic rate or the Received Traffic rate (as argued below) at the egress ports. This can be done by benchmarking minimum number of packets per sampling interval. Hence if x packets per Sampling interval is benchmarked, then the Sampling Interval will become a function of the offered traffic rate - which is benchmarked as the DUT throughput. Hence the Sampling Interval for each test may be different but at the same time ensure that the convergence graph is "smoother" and the problem stated at the head of this section is solved.

-Note that this may not apply to the ECMP test cases as traffic is distributed across the egress interfaces and the smoothness of the graph will be lost because of the traffic distribution and consequent smaller number of packets per sampling interval. So this standard MAY be based on RECEIVED traffic rate on the egress ports and not the offered traffic rate.

To make sure I understand what you're saying here: "for the ECMP testcases we should base sampling rate on the traffic rate received per egress port since the total offered load is distributed over multiple egress interfaces". Correct?

 

 

This is true for all testcases not only ECMP test-cases. The sampling rate then can only be calculated per port. The inaccuracy of the entire test can then be a function of the sampling rates on each port.

 

Don't we only care about total received rate? Even if traffic is received over more than one port, we should add all port stats together and sample that total. Or sample all port stats and sum up the sampled stats.

 

Because sampling rate and sampling is per port, only summing up would work. 

Only the aggregate load on the ECMP members is of importance, otherwise one has to make assumptions/requirements on how the router distributes the load over the ECMP members.

 

[Anuj1:] Yes. Only the aggregate is important but sampling still remains per port and it has practical implications. This is discussed in Point 2 in my email from 23rd Jan. 

[Kris1:] This is internal to the Tester. I don't think the purpose is to describe coping with specific Tester implementations. If I look at sampled stats of a Tester I expect that all ports are sampled at the same time or at least in a very small time window. Adding up the collected stats will give the aggregate.

 

 

ii. Number of routes:

 

There should be at least one packet per route in the sampling interval. This has been addressed by the standard. However, if the number of routes in the test is very large, then the Sampling Interval again becomes a function of the Offered Traffic Rate, e.g. if the number of routes is 10000 and the offered rate (=DUT throughput) is 10000 fps, then the Sampling Interval becomes 1 second. This example is based on the present specification. In this case, the Sampling Interval cannot be set to 10 ms, because then it does not make sense in two ways:

-There are far too many fluctuations in the convergence graph. This is because there are only 100 packets per Sampling Interval.

-Setting it to 10ms does not increase the accuracy of the test because of the fact that one packet is not being sent to each route. Hence the gating factor for the test accuracy becomes the interval between consecutive packets to the same route and not the Sampling Interval.

 

Because the number of routes is already considered a parameter in the sampling interval and the value recommended is 10 ms, this is calling for scale troubles. Suppose there are 10000 routes (not an unreasonable assumption); hence 10000 packets per 10 ms need to be offered to the DUT. This is 1000000 packets/second, which is greater than most DUT throughputs in the market now. Hence with the current specifications, measuring convergence times in a scaled environment is an issue.

 

You have a point. 10ms seemed to be a fair accuracy goal, but low-end devices, where a 10ms sampling interval is a stretch given their limited throughput, were not taken into account. An equation such as "sampling interval >= #routes/offered load" (but still as small as possible) would be better.

 

The equation above needs to be factored for the minimum number of packets per sampling interval as stated in i) above. 

The requirement to have >= 1 packet to each route per sampling interval is absolute.
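
The arithmetic used in the examples of point 3 can be reproduced with a short Python sketch. The function names are hypothetical, rates are in packets per second and intervals in seconds, and evenly spaced packets are assumed.

```python
def min_sampling_interval(routes, offered_load):
    """Smallest interval (s) that still carries >= 1 packet to each route,
    i.e. the bound "sampling interval >= #routes/offered load"."""
    return routes / offered_load


def required_offered_load(routes, sampling_interval):
    """Offered load (packets/s) needed for one packet per route per interval."""
    return routes / sampling_interval


def empty_interval_fraction(offered_load, sampling_interval):
    """Fraction of sampling intervals that see no packet at all,
    assuming evenly spaced packets."""
    packets_per_interval = offered_load * sampling_interval
    if packets_per_interval >= 1:
        return 0.0
    return 1.0 - packets_per_interval
```

For the numbers quoted above: 10000 routes at 10000 fps give a minimum sampling interval of 1 second; keeping a 10 ms interval with 10000 routes would need about 1000000 packets/second; and 10 packets/second sampled every 10 ms leaves about 9 out of 10 intervals empty.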

 

4. Section 6.2.3 talks about measurement accuracy. The measurement accuracy stated as an addition of the Sampling Interval and the time between consecutive packets to the same route may be a generalization. This is not true for a case where the offered traffic has packets generated to each route in a round-robin fashion and the DUT has FCFS queue processing for forwarding. In this case the inaccuracy would be the MAX of the Sampling Interval and the time to offer consecutive packets to the same route. Note that these values may be the same if the Sampling Interval is set as a function of the number of routes as described in the previous section. Also the

I agree the accuracy statements may be a generalization. The accuracy for the different instants can be better specified separately:

 

If sampling interval is calculated as per the arguments in 3., it will be the only factor influencing the accuracy of the test. 

 If sampling interval == time between consecutive packets to the same route then the highest accuracy can be achieved, but it's not a requirement, it can be >=.

 

1) convergence event instant:

This is instantaneous for all routes by definition (otherwise a timestamp needs to be collected).

accuracy interval: -(sampling interval), +0

 

This should have been: -(sampling interval + 1/offered load), +0. But if 1/offered load <<sampling interval then the 1/offered load term can be ignored.

 

2) first route convergence instant and convergence recovery instant

 

The accuracy interval for these two also needs to be specified as is for convergence event instant and is pretty trivial. 

I did, but for these instants one can distinguish situations a) and b) below. 

 

a) convergence recovery transition is non-instantaneous for all routes

accuracy interval: -(time between consecutive packets to the same route + sampling interval), +0

The "time between consecutive packets to the same route" term is the uncertainty when traffic is sent to a destination.

 

"time between consecutive packets to the same route” can be a certainty if the traffic packet scheduling algorithm is round-robin and DUT is FCFS processing (discussed below). This value will then be equal to the sampling interval.

"Uncertainty" in the sense that one doesn't know when a packet is sent to the 1st, 2nd, ... last route to complete convergence, since that also depends on the order of convergence which is unknown before the test.

 

[Anuj1:] Discussed in my comments in the attached draft

 

[Kris1:] The convergence recovery instant accuracy interval I calculated before was incorrect. Here are the corrected ones.

The calculations to derive them are attached, such that you can keep me honest.

 

The real instant falls within the indicated interval around the measured value:

convergence event instant: [-(S+1/O), +0]

first route convergence event instant: [-(S+I), +0]

convergence recovery instant: [-2S, -(S-I)]

 

derived metrics:

first route convergence time: [-(S+I), +(S+1/O)]

full convergence time: [-2S, +(I+1/O)]
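
The derived-metric intervals follow from the instant intervals by interval subtraction, which can be checked with the Python sketch below. The symbols are read here as S = Packet Sampling Interval, I = time between two consecutive packets to the same route, and O = Offered Load, which matches the written-out form of the same intervals elsewhere in this thread; the function names are hypothetical.

```python
def instant_intervals(S, I, O):
    """Accuracy intervals (low, high) of the measured instants, in seconds."""
    return {
        "event": (-(S + 1.0 / O), 0.0),
        "first_route": (-(S + I), 0.0),
        "recovery": (-2.0 * S, -(S - I)),
    }


def duration_interval(end, start):
    """Interval of (end - start) when each instant is only known to an interval."""
    return end[0] - start[1], end[1] - start[0]


def derived_intervals(S, I, O):
    """Accuracy intervals of the two derived convergence times."""
    inst = instant_intervals(S, I, O)
    return {
        "first_route_convergence_time":
            duration_interval(inst["first_route"], inst["event"]),
        "full_convergence_time":
            duration_interval(inst["recovery"], inst["event"]),
    }
```

The lower bound of each duration is the end instant's lower bound minus the start instant's upper bound, and the upper bound is the reverse, which reproduces [-(S+I), +(S+1/O)] and [-2S, +(I+1/O)].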

 

Thanks,

Kris

 

 

b) convergence recovery transition is instantaneous for all routes, they're equal so only measuring first route convergence instant is enough

I don't think this is a realistic case for IGP convergence.

accuracy interval: -(sampling interval), +0

 

This should have been: -(sampling interval + 1/offered load), +0. But if 1/offered load << sampling interval then the 1/offered load term can be ignored

 

The above equations will not be true if “sampling interval > #routes/offered load”. They will only be true if the traffic data packet scheduling algorithm sends data packets to the routes in a round-robin fashion (or uses an algorithm that ensures that one packet is sent to each route before a second packet is sent to any route) and the DUT strictly follows FCFS queue processing. These conditions MUST be met in the test.

 

These traffic/forwarding assumptions are implied.

Can you show why "The above equations will not be true if “sampling interval > #routes/offered load”"? I think they are correct as they are.

 

[Anuj1:] Discussed in my comments in the attached draft

 

 

Specifying forces a change of:

"   When using the Rate-Derived Method, the Convergence Recovery Instant
   falls within the Packet Sampling Interval preceding the first
   interval where the observed Forwarding Rate on the Next-Best Egress
   Interface equals the Offered Load."

Since under the assumption quoted here the accuracy would be -(time between consecutive packets to the same route), +(sampling interval)

 

measurement accuracy should be a range and is per metric. These metrics even include Convergence Event Instant, Convergence recovery instant, First Route Convergence Instant. The derived metrics from these like the rate-derived convergence time, first route convergence time, convergence recovery transition, convergence event transition have a different range because they are derived from a range itself. These I feel should be part of the specification.  

The accuracy intervals I reported previously (below) were incorrect. These are the correct ones:

 

The accuracy interval of the metrics Rate-Derived Convergence Time and First Route Convergence Time is: -(Packet Sampling Interval + time between two consecutive packets to the same destination), +(Packet Sampling Interval + 1/Offered Load).

 

If the Convergence Recovery Transition is instantaneous for all routes then the accuracy interval of the metrics Rate-Derived Convergence Time and First Route Convergence Time is: -(Packet Sampling Interval + 1/Offered Load), +(Packet Sampling Interval + 1/Offered Load).

 

If 1/Offered Load is much smaller than Packet Sampling Interval the term "1/Offered Load" can be ignored in the accuracy intervals above.

 

[Anuj1:] Discussed in my comments in the attached draft

 

 Are your accuracy algorithms different from the following:

a) convergence recovery transition is non-instantaneous for all routes

rate-derived convergence time and first route convergence time accuracy:

-(sampling interval), +(time between consecutive packets to the same route)

 

These are by definition functions of the instants (convergence event instant, convergence recovery instant, etc). As the instants themselves are intervals, the intervals for these derived values should “engulf” the intervals which they are a function of. This again is very trivial once we know the intervals of the instants as we discussed above.

 

convergence recovery transition duration accuracy:

-(time between consecutive packets to the same route), +(time between consecutive packets to the same route)

 

b) convergence recovery transition is instantaneous for all routes

-(sampling interval), +(sampling interval)

 

Discussed above.

 

5. The above three sections of this email discuss how some things in the specification conflict and do not address convergence test requirements for many devices in the market now. One possible approach for Sampling Interval, Offered rate, number of routes and measurement accuracy could be to make the Sampling Interval a function of just the Received Rate on the egress port, validate the minimum offered rate, and address the problem of having one packet to each route in the measurement accuracy of the metrics.

 

6. Sustained convergence validation time: What is the rationale behind setting it to a constant value of 5 seconds? This value may again spell trouble if there is a test where the ratio of the number of routes to the offered traffic rate is greater than 5 seconds, leading to not even a single packet being sent to each route during the convergence test. An approach where n consecutive packets are sent to each route and the forwarded traffic rate is constant and on the next-best egress port seems more logical.

It probably needs to be a combination of a number of packet transmission cycles and a 5 second interval, otherwise there is a similar issue on the lower end of packet cycle intervals.

 

If sampling interval is calculated as we discussed above (and hence ensuring one packet per route is sent in the interval), then this value could just be a multiple of the sampling interval. The multiplier though needs to be benchmarked. 

I chose sustained convergence validation time to be max(5sec, 5*(time between consecutive packets to the same route)). If one just takes n*(time between consecutive packets to the same route) or n*(sampling interval) it may end up being a very small duration.

 

[Anuj1:] Sounds good, as long as it is a function of time between consecutive packets to the same route.
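
The rule chosen above, max(5sec, 5*(time between consecutive packets to the same route)), is a one-liner in Python. The function and parameter names are hypothetical; times are in seconds.

```python
def sustained_convergence_validation_time(time_between_packets_same_route):
    """max(5 s, 5 * time between consecutive packets to the same route)."""
    return max(5.0, 5.0 * time_between_packets_same_route)
```

With 10 ms between packets to the same route the 5 second floor applies; with 2 seconds between packets the result is 10 seconds, so at least five packets reach each route.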

 

 

7. It has not been mentioned in the standard that traffic is just a means of measuring convergence times and hence traffic rate is a factor in the accuracy of the test. This should be highlighted in the beginning of the draft to lend better understanding to the user. 

I'll see how it can be emphasized more.

 

Many thanks again,

 

Kris

 

As stated earlier I would add value to the benchmarking draft and would love to be a contributing author. Please give it a thought and let me know.

 

I don't think it is needed at this point.

 

I attached new versions of the drafts addressing the comments so far. Can you review?

[Anuj1:] Reviewed and attached.

 

Thanks,

Kris

 

Thanks,

Anuj

 

 

Please write back to me with responses/discussions/questions.

 

I will have limited email access in the next few weeks and will not be able to reply to the responses immediately.

 

Thanks,

Anuj Dewangan

Spirent Communications,

Raleigh, NC 27560

 

From: bmwg-bounces <at> ietf.org [mailto:bmwg-bounces <at> ietf.org] On Behalf Of Al Morton
Sent: Monday, November 02, 2009 9:52 AM
To: bmwg <at> ietf.org
Subject: [bmwg] WGLC: draft-ietf-bmwg-igp-dataplane drafts

 

BMWG,

This message begins a WG Last Call on the IGP-Dataplane Convergence
Time Benchmarking drafts.

http://tools.ietf.org/html/draft-ietf-bmwg-igp-dataplane-conv-term-19

http://tools.ietf.org/html/draft-ietf-bmwg-igp-dataplane-conv-meth-19

The Last Call will end on November 16, 2009, at 5PM US EST, 2300 GMT.

This is a topic we've been discussing in BMWG 
as long as I have been chairman.  The state of the art advanced
while we were developing these drafts, and hopefully now they
are fully in-sync and relevant.  The term and meth drafts
have been substantially revised in the -19- versions.

We also need to decide whether we need this expired draft:
http://tools.ietf.org/html/draft-ietf-bmwg-igp-dataplane-conv-app-17
It may be that the revisions to bring this in sync with the terms
and meth drafts are fairly trivial.  Comments on this are welcome.

Please weigh in on whether or not these Internet-Drafts
should be given to the Area Directors and IESG for consideration and
publication as Informational RFCs.  Send your comments
to this list or acmorton <at> att.com.

Al
bmwg chair

 



E-mail confidentiality.
--------------------------------
This e-mail contains confidential and / or privileged information belonging to Spirent Communications plc, its affiliates and / or subsidiaries. If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution and / or the taking of any action based upon reliance on the contents of this transmission is strictly forbidden. If you have received this message in error please notify the sender by return e-mail and delete it from your system. If you require assistance, please contact our IT department at helpdesk <at> spirent.com.

Spirent Communications plc
Northwood Park, Gatwick Road, Crawley, West Sussex, RH10 9XN, United Kingdom.
Tel No. +44 (0) 1293 767676
Fax No. +44 (0) 1293 767677

Registered in England Number 470893
Registered at Northwood Park, Gatwick Road, Crawley, West Sussex, RH10 9XN, United Kingdom.

Or if within the US,

Spirent Communications,
26750 Agoura Road, Calabasas, CA, 91302, USA.
Tel No. 1-818-676-2300

 




S=sampling interval
I=time between two packets to the same route
O=offered load

1) convergence event instant
first sampling interval where received rate < offered load
we can observe traffic to all routes (instantaneous loss for all routes)
a) best case (smallest deviation between real and measured value)
T0-S+2dT: sample (received rate == offered load)
T0-1/O+dT: packet, received
T0: convergence event instant
T0+dT: packet, dropped
T0+2dT: sample (received rate < offered load)

real - measured = (T0) - (T0+2dT) ~ 0

b) worst case (largest deviation between real and measured value)
T0-dT: packet, received
T0: convergence event
T0+1/O-2dT: sample (received rate == offered load)
T0+1/O-dT: packet, dropped
T0+1/O+S-2dT: sample (received rate < offered load)

real - measured = (T0) - (T0+1/O+S-2dT) ~ -(S+1/O)
Note: for ECMP member failures, less traffic is sent on the preferred egress interface
(in the extreme case only traffic to one route), so the deviation can become -(S+I)

=> measured - (S+1/O) < real < measured

2) first route convergence event instant
first sampling interval where received rate starts increasing
we can focus the observation to the first route x (yet unknown) converging
a) best case
T0: first route convergence event instant
T0+dT: packet to x, received
T0+2dT: sample (received rate increased)

real - measured = (T0) - (T0+2dT) ~ 0

b) worst case
T0-dT: packet to x, dropped
T0: first route convergence event instant
T0+I-2dT: sample (received rate not increased)
T0+I-dT: packet to x, received
T0+I+S-2dT: sample (received rate increased)

real - measured = (T0) - (T0+I+S-2dT) ~ -(S+I)

=> measured - (S+I) < real < measured

 
3) convergence recovery instant
first sampling interval where received rate == offered load
we can focus on the last route y (yet unknown) converging
a) best case
T0-I+dT: packet to y, dropped
T0-I+dT: sample (received rate < offered load)
T0: convergence recovery instant
T0+dT: packet to y, received
T0-I+S+dT: sample (received rate == offered load)

real - measured = (T0) - (T0-I+S+dT) ~ -(S-I)

b) worst case
T0-I-dT: packet to y, dropped
T0-2dT: sample (received rate < offered load)
T0-dT: packet to y, received
T0: convergence recovery instant
T0+S-2dT: sample (received rate < offered load)
T0+2S-2dT: sample (received rate == offered load)

real - measured = (T0) - (T0+2S-2dT) ~ -2S

=> measured - 2S < real < measured - (S-I)

Derived metrics
===============

convergence event instant: [-(S+1/O), +0]
first route convergence event instant: [-(S+I), +0]
convergence recovery instant: [-2S, -(S-I)]

first route convergence time
= first route convergence event instant - convergence event instant
accuracy interval: [-(S+I)-0, 0-(-(S+1/O))] = [-(S+I), +(S+1/O)]

full convergence time
= convergence recovery instant - convergence event instant
accuracy interval: [-2S-0, -(S-I)-(-(S+1/O))] = [-2S, +(I+1/O)]
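
The interval arithmetic used for the derived metrics can be checked with a
short sketch. This is only an illustration of the bounds listed above, not
part of the analysis itself; the names Interval and derived_accuracy are
hypothetical, and S, I and O are assumed to be given in seconds, seconds and
packets per second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] bounding (real - measured) for one instant."""
    lo: float
    hi: float

    def __sub__(self, other: "Interval") -> "Interval":
        # Error bounds of a difference (a - b): the smallest value pairs
        # a.lo with b.hi, the largest pairs a.hi with b.lo.
        return Interval(self.lo - other.hi, self.hi - other.lo)


def derived_accuracy(S: float, I: float, O: float) -> dict:
    """Accuracy intervals of the derived metrics.

    S: sampling interval, I: time between two packets to the same route,
    O: offered load (hypothetical units: seconds, seconds, packets/s).
    """
    event = Interval(-(S + 1 / O), 0.0)      # convergence event instant
    first_route = Interval(-(S + I), 0.0)    # first route convergence event instant
    recovery = Interval(-2 * S, -(S - I))    # convergence recovery instant
    return {
        "first_route_convergence_time": first_route - event,
        "full_convergence_time": recovery - event,
    }
```

For any S, I and O the result reproduces [-(S+I), +(S+1/O)] for the first
route convergence time and [-2S, +(I+1/O)] for the full convergence time.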


Network Working Group                                        S. Poretsky
Internet-Draft                                      Allot Communications
Intended status: Informational                                 B. Imhoff
Expires: July 25, 2010                                  Juniper Networks
                                                           K. Michielsen
                                                           Cisco Systems
                                                        January 21, 2010


Benchmarking Methodology for Link-State IGP Data Plane Route Convergence
               draft-ietf-bmwg-igp-dataplane-conv-meth-20

Abstract

   This document describes the methodology for benchmarking Link-State
   Interior Gateway Protocol (IGP) Route Convergence.  The methodology
   is to be used for benchmarking IGP convergence time through
   externally observable (black box) data plane measurements.  The
   methodology can be applied to any link-state IGP, such as ISIS and
   OSPF.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 25, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.



Poretsky, et al.          Expires July 25, 2010                 [Page 1]

Internet-Draft    IGP Convergence Benchmark Methodology     January 2010


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.






























Poretsky, et al.          Expires July 25, 2010                 [Page 2]

Internet-Draft    IGP Convergence Benchmark Methodology     January 2010


Table of Contents

   1.  Introduction and Scope . . . . . . . . . . . . . . . . . . . .  5
   2.  Existing Definitions . . . . . . . . . . . . . . . . . . . . .  5
   3.  Test Topologies  . . . . . . . . . . . . . . . . . . . . . . .  5
     3.1.  Test topology for local changes  . . . . . . . . . . . . .  5
     3.2.  Test topology for remote changes . . . . . . . . . . . . .  6
     3.3.  Test topologies for local changes with ECMP  . . . . . . .  7
     3.4.  Test topologies for remote changes with ECMP . . . . . . .  8
     3.5.  Test topology for Parallel Link changes  . . . . . . . . . 10
   4.  Convergence Time and Loss of Connectivity Period . . . . . . . 11
     4.1.  Convergence Events without instant traffic loss  . . . . . 12
     4.2.  Loss of Connectivity . . . . . . . . . . . . . . . . . . . 14
   5.  Test Considerations  . . . . . . . . . . . . . . . . . . . . . 15
     5.1.  IGP Selection  . . . . . . . . . . . . . . . . . . . . . . 15
     5.2.  Routing Protocol Configuration . . . . . . . . . . . . . . 15
     5.3.  IGP Topology . . . . . . . . . . . . . . . . . . . . . . . 15
     5.4.  Timers . . . . . . . . . . . . . . . . . . . . . . . . . . 16
     5.5.  Interface Types  . . . . . . . . . . . . . . . . . . . . . 16
     5.6.  Offered Load . . . . . . . . . . . . . . . . . . . . . . . 16
     5.7.  Measurement Accuracy . . . . . . . . . . . . . . . . . . . 17
     5.8.  Measurement Statistics . . . . . . . . . . . . . . . . . . 17
     5.9.  Tester Capabilities  . . . . . . . . . . . . . . . . . . . 17
   6.  Selection of Convergence Time Benchmark Metrics and Methods  . 18
     6.1.  Loss-Derived Method  . . . . . . . . . . . . . . . . . . . 18
       6.1.1.  Tester capabilities  . . . . . . . . . . . . . . . . . 18
       6.1.2.  Benchmark Metrics  . . . . . . . . . . . . . . . . . . 18
       6.1.3.  Measurement Accuracy . . . . . . . . . . . . . . . . . 19
     6.2.  Rate-Derived Method  . . . . . . . . . . . . . . . . . . . 19
       6.2.1.  Tester Capabilities  . . . . . . . . . . . . . . . . . 19
       6.2.2.  Benchmark Metrics  . . . . . . . . . . . . . . . . . . 19
       6.2.3.  Measurement Accuracy . . . . . . . . . . . . . . . . . 19
     6.3.  Route-Specific Loss-Derived Method . . . . . . . . . . . . 20
       6.3.1.  Tester Capabilities  . . . . . . . . . . . . . . . . . 20
       6.3.2.  Benchmark Metrics  . . . . . . . . . . . . . . . . . . 20
       6.3.3.  Measurement Accuracy . . . . . . . . . . . . . . . . . 21
   7.  Reporting Format . . . . . . . . . . . . . . . . . . . . . . . 21
   8.  Test Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 22
     8.1.  Interface failures . . . . . . . . . . . . . . . . . . . . 23
       8.1.1.  Convergence Due to Local Interface Failure . . . . . . 23
       8.1.2.  Convergence Due to Remote Interface Failure  . . . . . 24
       8.1.3.  Convergence Due to ECMP Member Local Interface
               Failure  . . . . . . . . . . . . . . . . . . . . . . . 26
       8.1.4.  Convergence To ECMP set Due to Local Interface
               Failure  . . . . . . . . . . . . . . . . . . . . . . . 27
       8.1.5.  Convergence Due to ECMP Member Remote Interface
               Failure  . . . . . . . . . . . . . . . . . . . . . . . 28
       8.1.6.  Convergence To ECMP set Due to Remote Interface



Poretsky, et al.          Expires July 25, 2010                 [Page 3]

Internet-Draft    IGP Convergence Benchmark Methodology     January 2010


               Failure  . . . . . . . . . . . . . . . . . . . . . . . 29
       8.1.7.  Convergence Due to Parallel Link Interface Failure . . 30
     8.2.  Other failures . . . . . . . . . . . . . . . . . . . . . . 31
       8.2.1.  Convergence Due to Layer 2 Session Loss  . . . . . . . 31
       8.2.2.  Convergence Due to Loss of IGP Adjacency . . . . . . . 33
       8.2.3.  Convergence Due to Route Withdrawal  . . . . . . . . . 34
     8.3.  Administrative changes . . . . . . . . . . . . . . . . . . 36
       8.3.1.  Convergence Due to Local Administrative Shutdown . . . 36
       8.3.2.  Convergence Due to Cost Change . . . . . . . . . . . . 37
   9.  Security Considerations  . . . . . . . . . . . . . . . . . . . 38
   10. IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 39
   11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 39
   12. Normative References . . . . . . . . . . . . . . . . . . . . . 39
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 40

1.  Introduction and Scope

   This document describes the methodology for benchmarking Link-State
   Interior Gateway Protocol (IGP) convergence.  The motivation and
   applicability for this benchmarking is described in [Po09a].  The
   terminology to be used for this benchmarking is described in [Po09t].

   IGP convergence time is measured on the data plane at the Tester by
   observing packet loss through the DUT.  All factors contributing to
   convergence time are accounted for by measuring on the data plane, as
   discussed in [Po09a].  The test cases in this document are black-box
   tests that emulate the network events that cause convergence, as
   described in [Po09a].

   The methodology described in this document can be applied to IPv4 and
   IPv6 traffic and link-state IGPs such as ISIS [Ca90][Ho08], OSPF
   [Mo98][Co08], and others.


2.  Existing Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119
   [Br97].  RFC 2119 defines the use of these key words to help make the
   intent of standards track documents as clear as possible.  While this
   document uses these keywords, this document is not a standards track
   document.

   This document uses much of the terminology defined in [Po09t] and
   uses existing terminology defined in other BMWG work.  Examples
   include, but are not limited to:

      Throughput                         [Ref.[Br91], section 3.17]
      Device Under Test (DUT)            [Ref.[Ma98], section 3.1.1]
      System Under Test (SUT)            [Ref.[Ma98], section 3.1.2]
      Out-of-order Packet                [Ref.[Po06], section 3.3.2]
      Duplicate Packet                   [Ref.[Po06], section 3.3.3]
      Stream                             [Ref.[Po06], section 3.3.2]
      Loss Period                        [Ref.[Ko02], section 4]


3.  Test Topologies

3.1.  Test topology for local changes

   Figure 1 shows the test topology to measure IGP convergence time due
   to local Convergence Events such as Local Interface failure
   (Section 8.1.1), layer 2 session failure (Section 8.2.1), and IGP
   adjacency failure (Section 8.2.2).  This topology is also used to
   measure IGP convergence time due to the route withdrawal
   (Section 8.2.3), and route cost change (Section 8.3.2) Convergence
   Events.  IGP adjacencies MUST be established between Tester and DUT,
   one on the Preferred Egress Interface and one on the Next-Best Egress
   Interface.  For this purpose the Tester emulates two routers, each
   establishing one adjacency with the DUT.  An IGP adjacency SHOULD be
   established on the Ingress Interface between Tester and DUT.

            ---------       Ingress Interface         ----------
            |       |<--------------------------------|        |
            |       |                                 |        |
            |       |    Preferred Egress Interface   |        |
            |  DUT  |-------------------------------->| Tester |
            |       |                                 |        |
            |       |-------------------------------->|        |
            |       |    Next-Best Egress Interface   |        |
            ---------                                 ----------

         Figure 1: IGP convergence test topology for local changes

3.2.  Test topology for remote changes

   Figure 2 shows the test topology to measure IGP convergence time due
   to Remote Interface failure (Section 8.1.2).  In this topology the
   two routers R1 and R2 are considered System Under Test (SUT) and
   SHOULD be identically configured devices of the same model.  IGP
   adjacencies MUST be established between Tester and SUT, one on the
   Preferred Egress Interface and one on the Next-Best Egress Interface.
   For this purpose the Tester emulates one or two routers.  An IGP
   adjacency SHOULD be established on the Ingress Interface between
   Tester and SUT.  In this topology there is a possibility of a
   transient microloop between R1 and R2 during convergence.

                       ------                      ----------
                       |    |  Preferred           |        |
              ------   | R2 |--------------------->|        |
              |    |-->|    |  Egress Interface    |        |
              |    |   ------                      |        |
              | R1 |                               | Tester |
              |    |           Next-Best           |        |
              |    |------------------------------>|        |
              ------           Egress Interface    |        |
                 ^                                 ----------
                 |                                     |
                 ---------------------------------------
                             Ingress Interface

        Figure 2: IGP convergence test topology for remote changes

3.3.  Test topologies for local changes with ECMP

   Figure 3 shows the test topology to measure IGP convergence time due
   to local Convergence Events of a member of an Equal Cost Multipath
   (ECMP) set (Section 8.1.3).  In this topology, the DUT is configured
   with each egress interface as a member of a single ECMP set and the
   Tester emulates N next-hop routers, one router for each member.  IGP
   adjacencies MUST be established between Tester and DUT, one on each
   member of the ECMP set.  For this purpose each of the N routers
   emulated by the Tester establishes one adjacency with the DUT.  An
   IGP adjacency SHOULD be established on the Ingress Interface between
   Tester and DUT.

            ---------       Ingress Interface         ----------
            |       |<--------------------------------|        |
            |       |                                 |        |
            |       |     ECMP set interface 1        |        |
            |       |-------------------------------->|        |
            |  DUT  |               .                 | Tester |
            |       |               .                 |        |
            |       |               .                 |        |
            |       |-------------------------------->|        |
            |       |     ECMP set interface N        |        |
            ---------                                 ----------

      Figure 3: IGP convergence test topology for local N to N-1 ECMP
                                convergence

   Figure 4 shows the test topology to measure IGP convergence time due
   to local Convergence Events with a non-ECMP Preferred Egress
   Interface and ECMP Next-Best Egress Interfaces (Section 8.1.4).  In
   this topology, the DUT is configured with each Next-Best Egress
   interface as a member of a single ECMP set.  The Preferred Egress
   Interface is not a member of an ECMP set.  The Tester emulates N
   next-hop routers, one router for the Preferred Egress Interface and
   N-1 routers for the members of the ECMP set.  IGP adjacencies MUST
   be established between Tester and DUT, one on the Preferred Egress
   Interface, and one on each member of the ECMP set.  For this purpose
   each of the N routers emulated by the Tester establishes one
   adjacency with the DUT.  An IGP adjacency SHOULD be established on
   the Ingress Interface between Tester and DUT.

            ---------       Ingress Interface         ----------
            |       |<--------------------------------|        |
            |       |    Preferred Egress Interface   |        |
            |       |-------------------------------->|        |
            |       |     ECMP set interface 1        |        |
            |  DUT  |-------------------------------->| Tester |
            |       |               .                 |        |
            |       |               .                 |        |
            |       |-------------------------------->|        |
            |       |     ECMP set interface N-1      |        |
            ---------                                 ----------

    Figure 4: IGP convergence test topology for local non-ECMP to ECMP
                                convergence

3.4.  Test topologies for remote changes with ECMP

   Figure 5 shows the test topology to measure IGP convergence time due
   to remote Convergence Events of a member of an Equal Cost Multipath
   (ECMP) set (Section 8.1.5).  In this topology the two routers R1 and
   R2 are considered System Under Test (SUT) and MUST be identically
   configured devices of the same model.  Router R1 is configured with
   each egress interface as a member of a single ECMP set and the Tester
   emulates N next-hop routers, one router for each member.  IGP
   adjacencies MUST be established between Tester and SUT, one on each
   egress interface of SUT.  For this purpose each of the N routers
   emulated by the Tester establishes one adjacency with the SUT.  An
   IGP adjacency SHOULD be established on the Ingress Interface between
   Tester and SUT.  In this topology there is a possibility of a
   transient microloop between R1 and R2 during convergence.

                                        ------     ----------
                                        |    |     |        |
              ------      ECMP set      | R2 |---->|        |
              |    |------------------->|    |     |        |
              |    |      Interface 1   ------     |        |
              |    |                               |        |
              |    |      ECMP set interface 2     |        |
              | R1 |------------------------------>| Tester |
              |    |               .               |        |
              |    |               .               |        |
              |    |               .               |        |
              |    |------------------------------>|        |
              ------      ECMP set interface N     |        |
                 ^                                 ----------
                 |                                     |
                 ---------------------------------------
                             Ingress Interface

     Figure 5: IGP convergence test topology for remote N to N-1 ECMP
                                convergence

   Figure 6 shows the test topology to measure IGP convergence time due
   to remote Convergence Events with a non-ECMP Preferred Egress
   Interface and ECMP Next-Best Egress Interfaces (Section 8.1.6).  In
   this topology the two routers R1 and R2 are considered System Under
   Test (SUT) and MUST be identically configured devices of the same
   model.  Router R1 is configured with each Next-Best Egress interface
   as a member of the same ECMP set.  The Preferred Egress Interface of
   R1 is not a member of an ECMP set.  The Tester emulates N next-hop
   routers, one for R2 and one for each member of the ECMP set.  IGP
   adjacencies MUST be established between Tester and SUT, one on each
   egress interface of SUT.  For this purpose each of the N routers
   emulated by the Tester establishes one adjacency with the SUT.  An
   IGP adjacency SHOULD be established on the Ingress Interface between
   Tester and SUT.  In this topology there is a possibility of a
   transient microloop between R1 and R2 during convergence.

                                        ------     ----------
                                        |    |     |        |
              ------      Preferred     | R2 |---->|        |
              |    |------------------->|    |     |        |
              |    |  Egress Interface  ------     |        |
              |    |                               |        |
              |    |      ECMP set interface 1     |        |
              | R1 |------------------------------>| Tester |
              |    |               .               |        |
              |    |               .               |        |
              |    |               .               |        |
              |    |------------------------------>|        |
              ------      ECMP set interface N     |        |
                 ^                                 ----------
                 |                                     |
                 ---------------------------------------
                             Ingress Interface

    Figure 6: IGP convergence test topology for remote non-ECMP to ECMP
                                convergence

3.5.  Test topology for Parallel Link changes

   Figure 7 shows the test topology to measure IGP convergence time due
   to local Convergence Events with members of a Parallel Link
   (Section 8.1.7).  In this topology, the DUT is configured with each
   egress interface as a member of a Parallel Link and the Tester
   emulates the single next-hop router.  IGP adjacencies MUST be
   established on all N members of the Parallel Link between Tester and
   DUT.  For this purpose the router emulated by the Tester establishes
   N adjacencies with the DUT.  An IGP adjacency SHOULD be established
   on the Ingress Interface between Tester and DUT.

            ---------       Ingress Interface         ----------
            |       |<--------------------------------|        |
            |       |                                 |        |
            |       |     Parallel Link Interface 1   |        |
            |       |-------------------------------->|        |
            |  DUT  |               .                 | Tester |
            |       |               .                 |        |
            |       |               .                 |        |
            |       |-------------------------------->|        |
            |       |     Parallel Link Interface N   |        |
            ---------                                 ----------

     Figure 7: IGP convergence test topology for Parallel Link changes

4.  Convergence Time and Loss of Connectivity Period

   Two concepts will be highlighted in this section: convergence time
   and loss of connectivity period.

   The Route Convergence [Po09t] time indicates the period in time
   between the Convergence Event Instant [Po09t] and the instant in time
   the DUT is ready to forward traffic for a specific route on its Next-
   Best Egress Interface and maintains this state for the duration of
   the Sustained Convergence Validation Time [Po09t].  To measure Route
   Convergence time, the Convergence Event Instant and the traffic
   received from the Next-Best Egress Interface need to be observed.

   The Route Loss of Connectivity Period [Po09t] indicates the time
   during which traffic to a specific route is lost following a
   Convergence Event until Full Convergence [Po09t] completes.  This
   Route Loss of Connectivity Period can consist of one or more Loss
   Periods [Ko02].  For the test cases described in this document it is
   expected to have a single Loss Period.  To measure Route Loss of
   Connectivity Period, the traffic received from the Preferred Egress
   Interface and the traffic received from the Next-Best Egress
   Interface need to be observed.

   The Route Loss of Connectivity Period is most important since it
   has a direct impact on the network user's application performance.

   In general the Route Convergence time is larger than or equal to the
   Route Loss of Connectivity Period.  Depending on which Convergence
   Event occurs and how this Convergence Event is applied, traffic for a
   route may still be forwarded over the Preferred Egress Interface
   after the Convergence Event Instant, before converging to the Next-
   Best Egress Interface.  In that case the Route Loss of Connectivity
   Period is shorter than the Route Convergence time.
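
   The relation between the two quantities can be illustrated with a
   minimal sketch.  It is not part of the methodology: it assumes the
   Tester has per-route receive timestamps (in seconds) on both egress
   interfaces, ignores the Sustained Convergence Validation Time, and
   approximates the instants by packet arrival times.  All function and
   parameter names are hypothetical.

```python
def route_convergence_time(event_instant: float,
                           first_rx_next_best: float) -> float:
    """Time from the Convergence Event Instant until traffic for the
    route is first received on the Next-Best Egress Interface."""
    return first_rx_next_best - event_instant


def route_loss_of_connectivity(last_rx_preferred: float,
                               first_rx_next_best: float) -> float:
    """Time during which traffic for the route is lost, approximated as
    the gap between the last packet received on the Preferred Egress
    Interface and the first received on the Next-Best Egress Interface."""
    return max(0.0, first_rx_next_best - last_rx_preferred)


# Example: the DUT keeps forwarding on the Preferred Egress Interface
# for 0.25 s after the event, then converges 0.25 s later.
event = 10.0
last_preferred = 10.25
first_next_best = 10.5
convergence = route_convergence_time(event, first_next_best)
loss_period = route_loss_of_connectivity(last_preferred, first_next_best)
```

   In this example the Route Loss of Connectivity Period is shorter than
   the Route Convergence time because forwarding on the Preferred Egress
   Interface continued after the Convergence Event Instant.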

   At least one condition needs to be fulfilled for Route Convergence
   time to be equal to Route Loss of Connectivity Period.  The condition
   is that the Convergence Event causes an instantaneous traffic loss
   for the measured route.  A fiber cut on the Preferred Egress
   Interface is an example of such a Convergence Event.

   A second condition applies to Route Convergence time measurements
   based on Connectivity Packet Loss [Po09t].  This second condition is
   that there is only a single Loss Period during Route Convergence.
   For the test cases described in this document this is expected to be
   the case.

4.1.  Convergence Events without instant traffic loss

   To measure convergence time benchmarks for Convergence Events caused
   by a Tester, such as an IGP cost change, the Tester MAY start to
   discard all traffic received from the Preferred Egress Interface at
   the Convergence Event Instant, or MAY separately observe packets
   received from the Preferred Egress Interface prior to the Convergence
   Event Instant.  This way these Convergence Events can be treated the
   same as Convergence Events that cause instantaneous traffic loss.

   To measure convergence time benchmarks without instantaneous traffic
   loss (either real or induced by the Tester) at the Convergence Event
   Instant, such as a reversion of a link failure Convergence Event, the
   Tester SHALL only observe packet statistics on the Next-Best Egress
   Interface.  If using the Rate-Derived method to benchmark convergence
   times for such Convergence Events, the Tester MUST collect a
   timestamp at the Convergence Event Instant.  If using a loss-derived
   method to benchmark convergence times for such Convergence Events,
   the Tester MUST measure the period in time between the Start Traffic
   Instant and the Convergence Event Instant.  To measure this period in
   time the Tester can collect timestamps at the Start Traffic Instant
   and the Convergence Event Instant.

   The Convergence Event Instant together with the receive rate
   observations on the Next-Best Egress Interface allow the convergence
   time benchmarks to be derived using the Rate-Derived Method [Po09t].

   When lost packets are observed on the Next-Best Egress Interface
   only, the observed packet loss is the number of packets lost between
   the Start Traffic Instant and the Convergence Recovery Instant.  To
   measure convergence times using a loss-derived method, the packet
   loss between the Convergence Event Instant and the Convergence
   Recovery Instant is needed.  The time between the Start Traffic
   Instant and the Convergence Event Instant must therefore be accounted
   for.  An example may clarify this.

   Figure 8 illustrates a Convergence Event without instantaneous
   traffic loss for all routes.  The top graph shows the Forwarding Rate
   over all routes, the bottom graph shows the Forwarding Rate for a
   single route Rta. Some time after the Convergence Event Instant,
   Forwarding Rate observed on the Preferred Egress Interface starts to
   decrease.  In the example, route Rta is the first route to experience
   packet loss at time Ta.  Some time later, the Forwarding Rate
   observed on the Next-Best Egress Interface starts to increase.  In
   the example, route Rta is the first route to complete convergence at
   time Ta'.

                ^
           Fwd  |
           Rate |-------------                    ............
                |             \                  .
                |              \                .
                |               \              .
                |                \            .
                |.................-.-.-.-.-.-.----------------
                +----+-------+---------------+----------------->
                ^    ^       ^               ^             time
               T0   CEI      Ta              Ta'

                ^
           Fwd  |
           Rate |-------------               .................
           Rta  |            |               .
                |            |               .
                |.............-.-.-.-.-.-.-.-.----------------
                +----+-------+---------------+----------------->
                ^    ^       ^               ^             time
               T0   CEI      Ta              Ta'

                Preferred Egress Interface: ---
                Next-Best Egress Interface: ...

   With T0 the Start Traffic Instant; CEI the Convergence Event Instant;
   Ta the time instant traffic loss for route Rta starts; Ta' the time
   instant traffic loss for route Rta ends.

                                 Figure 8

   If only packets received on the Next-Best Egress Interface are
   observed, the duration of the packet loss period for route Rta can be
   calculated from the received packets as in Equation 1.  Since the
   Convergence Event Instant is the start time for convergence time
   measurement, the period in time between T0 and CEI needs to be
   subtracted from the calculated result to obtain the convergence time,
   as in Equation 2.

   Next-Best Egress Interface packet loss period
       = (packets transmitted
           - packets received from Next-Best Egress Interface) / tx rate
       = Ta' - T0

                                Equation 1

      convergence time
          = Next-Best Egress Interface packet loss period - (CEI - T0)
          = Ta' - CEI

                                Equation 2
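
As a rough illustration of Equations 1 and 2, the following Python sketch
derives the convergence time from packet counters.  The function and
parameter names are hypothetical and not taken from the draft.  It assumes
a constant transmit rate and that only packets received on the Next-Best
Egress Interface are counted.

```python
def next_best_loss_period(packets_tx, packets_rx_next_best, tx_rate_pps):
    """Equation 1: loss period seen on the Next-Best Egress Interface (s)."""
    if tx_rate_pps <= 0:
        raise ValueError("tx rate must be positive")
    return (packets_tx - packets_rx_next_best) / tx_rate_pps


def convergence_time(packets_tx, packets_rx_next_best, tx_rate_pps, t0, cei):
    """Equation 2: subtract the time between T0 and CEI from the loss period."""
    period = next_best_loss_period(packets_tx, packets_rx_next_best, tx_rate_pps)
    return period - (cei - t0)


# Illustrative numbers only (assumptions, not measurements): 1000 pps,
# traffic starts at T0 = 0 s, the Convergence Event is applied at
# CEI = 2 s, and 2500 packets never arrive on the Next-Best Egress
# Interface.
loss_period = next_best_loss_period(10_000, 7_500, 1000.0)
conv_time = convergence_time(10_000, 7_500, 1000.0, t0=0.0, cei=2.0)
```

With these numbers the loss period is Ta' - T0 = 2.5 seconds and the
convergence time is Ta' - CEI = 0.5 seconds.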

4.2.  Loss of Connectivity

   Route Loss of Connectivity Period SHOULD be measured using the Route-
   Specific Loss-Derived Method.  Since the start instant and end
   instant of the Route Loss of Connectivity Period can be different for
   each route, these cannot be accurately derived by only observing
   global statistics over all routes.  An example may clarify this.

   Following a Convergence Event, route Rta is the first route for which
   packet loss starts; the Route Loss of Connectivity Period for route
   Rta starts at time Ta.  Route Rtb is the last route for which packet
   loss starts; the Route Loss of Connectivity Period for route Rtb
   starts at time Tb with Tb>Ta.

                  ^
             Fwd  |
             Rate |--------                       -----------
                  |        \                     /
                  |         \                   /
                  |          \                 /
                  |           \               /
                  |            ---------------
                  +------------------------------------------>
                           ^   ^             ^    ^      time
                          Ta   Tb           Ta'   Tb'
                                            Tb''  Ta''


            Figure 9: Example Route Loss Of Connectivity Period

   Suppose the DUT implementation is such that route Rta is the first
   route for which traffic loss ends, at time Ta' with Ta'>Tb, and route
   Rtb is the last route for which traffic loss ends, at time Tb' with
   Tb'>Ta'.  By observing only global traffic statistics over all
   routes, the minimum Route Loss of Connectivity Period would be
   measured as Ta'-Ta and the maximum as Tb'-Ta.  The real minimum and
   maximum Route Loss of Connectivity Periods are Ta'-Ta and Tb'-Tb.
   Illustrating this with the numbers Ta=0, Tb=1, Ta'=3, and Tb'=5 gives
   a LoC Period between 3 and 5 derived from the global traffic
   statistics, versus a real LoC Period between 3 and 4.

   If the DUT implementation were such that route Rtb is the first
   route for which packet loss ends, at time Tb'', and route Rta is the
   last, at time Ta'', then the minimum and maximum Route Loss of
   Connectivity Periods derived by observing only global traffic
   statistics would be Tb''-Ta and Ta''-Ta.  The real minimum and
   maximum Route Loss of Connectivity Periods are Tb''-Tb and Ta''-Ta.
   Illustrating this with the numbers Ta=0, Tb=1, Ta''=5, and Tb''=3
   gives a LoC Period between 3 and 5 derived from the global traffic
   statistics, versus a real LoC Period between 2 and 5.

   The two implementation variations in the above example would result
   in the same derived minimum and maximum Route Loss of Connectivity
   Periods when only observing the global packet statistics, while the
   real Route Loss of Connectivity Periods are different.
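
The two variations can be checked numerically.  The sketch below is only
an illustration with hypothetical helper names.  It assumes that global
statistics reveal just three instants: when the first route starts losing
packets, when the first route recovers, and when the last route recovers.

```python
def real_loc_periods(loss_start, loss_end):
    """Per-route LoC periods from per-route loss start and end instants."""
    return {route: loss_end[route] - loss_start[route] for route in loss_start}


def global_loc_bounds(loss_start, loss_end):
    """Minimum and maximum LoC as derived from global statistics only."""
    first_loss = min(loss_start.values())
    return (min(loss_end.values()) - first_loss,
            max(loss_end.values()) - first_loss)


def real_loc_bounds(loss_start, loss_end):
    """Real minimum and maximum LoC over all routes."""
    periods = real_loc_periods(loss_start, loss_end).values()
    return (min(periods), max(periods))


start = {"Rta": 0, "Rtb": 1}           # Ta=0, Tb=1
variant_1_end = {"Rta": 3, "Rtb": 5}   # Ta'=3, Tb'=5
variant_2_end = {"Rta": 5, "Rtb": 3}   # Ta''=5, Tb''=3

global_1 = global_loc_bounds(start, variant_1_end)
real_1 = real_loc_bounds(start, variant_1_end)
global_2 = global_loc_bounds(start, variant_2_end)
real_2 = real_loc_bounds(start, variant_2_end)
```

Both variants give the same global bounds (3, 5), while the real bounds
are (3, 4) and (2, 5), matching the numbers in the text.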


5.  Test Considerations

5.1.  IGP Selection

   The test cases described in Section 8 MAY be used for link-state
   IGPs, such as ISIS or OSPF.  The IGP convergence time test
   methodology is identical.

5.2.  Routing Protocol Configuration

   The obtained results for IGP convergence time may vary if other
   routing protocols are enabled and routes learned via those protocols
   are installed.  IGP convergence times SHOULD be benchmarked without
   routes installed from other protocols.

5.3.  IGP Topology

   The Tester emulates a single IGP topology.  The DUT establishes IGP
   adjacencies with one or more of the emulated routers in this single
   IGP topology emulated by the Tester.  See test topology details in
   Section 3.  The emulated topology SHOULD only be advertised on the
   DUT egress interfaces.

   The number of IGP routes will impact the measured IGP route
   convergence time.  [Anuj:] The number of IGP routers simulated/emulated
   by the Tester and the IGP topology advertised also impact the measured
   convergence metrics. This is because the IGP topology influences the
   SPF calculations performed by the DUT, which in turn influences the FIB
   updates.
[Kris:] I'll make it: "The number of IGP routes, number of nodes, and type
of topology will impact the measured IGP route convergence time."
   
   To obtain results similar to those that would be
   observed in an operational network, it is RECOMMENDED that the number
   of installed routes and nodes closely approximate that of the network
   (e.g. thousands of routes with tens or hundreds of nodes).

   The number of areas (for OSPF) and levels (for ISIS) can impact the
   benchmark results.

5.4.  Timers

   There are timers that may impact the measured IGP convergence times.
   The benchmark metrics MAY be measured at any fixed values for these
   timers.  To obtain results similar to those that would be observed in
   an operational network, it is RECOMMENDED to configure the timers
   with the values as configured in the operational network.

   Examples of timers that may impact measured IGP convergence time
   include, but are not limited to:

      Interface failure indication

      IGP hello timer

      IGP dead-interval or hold-timer

      LSA or LSP generation delay

      LSA or LSP flood packet pacing

      SPF delay

5.5.  Interface Types

   All test cases in this methodology document MAY be executed with any
   interface type.  The type of media may dictate which test cases may
   be executed.  Each interface type has a unique mechanism for
   detecting link failures and the speed at which that mechanism
   operates will influence the measurement results.  All interfaces MUST
   be of the same media and Throughput [Br91][Br99] for each test case.
   All interfaces SHOULD be configured as point-to-point.

5.6.  Offered Load

   The Throughput of the device, as defined in [Br91] and benchmarked in
   [Br99] at a fixed packet size, needs to be determined over the
   preferred path and over the next-best path.  The Offered Load SHOULD
   be the minimum of the measured Throughput of the device over the
   preferred path and over the next-best path.  The packet size is selectable
   and MUST be recorded.  Packet size is measured in bytes and includes
   the IP header and payload.
   
   The destination addresses for the Offered Load MUST be distributed
   such that all routes or a statistically representative subset of all
   routes are matched and each of these routes is offered an equal share
   of the Offered Load.  The traffic rate offered to each route is
   constant without bursts.  It is RECOMMENDED to send traffic matching
   all routes, but a statistically representative subset of all routes
   can be used if required.

   In the Remote Interface failure test cases using topologies 2, 5 and 6
   there is a possibility of a transient microloop between R1 and R2
   during convergence.  The TTL or Hop Limit value of the packets sent
   by the Tester may influence the benchmark measurements since it
   determines which device in the topology may send an ICMP Time
   Exceeded Message for looped packets.

   The duration of the Offered Load MUST be greater than the convergence
   time. [Anuj:] plus the Sustained Convergence Validation Time.
[Kris:] OK.
   
   [Anuj:] Offered Load should send a packet to each destination address
   before sending another packet to the same destination. It is 
   RECOMMENDED that such packet dispatching for the Offered Load be done
   in a round-robin fashion. This is important as the sequence of packets
   bounds the measurement accuracy.
[Kris:] I'll make it: "Offered Load should send a packet to each destination address
   before sending another packet to the same destination. It is 
   RECOMMENDED that such packet dispatching for the Offered Load be done
   in a round-robin fashion with an even interpacket delay."
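
A minimal sketch of such round-robin dispatching with an even interpacket
delay follows.  The function name and the example addresses are
hypothetical.  A real Tester would schedule packets in hardware, so this
only illustrates the ordering and spacing.

```python
from itertools import cycle, islice


def round_robin_schedule(destinations, offered_load_pps, packet_count):
    """Return (send_time, destination) pairs with an even interpacket delay."""
    if offered_load_pps <= 0:
        raise ValueError("offered load must be positive")
    gap = 1.0 / offered_load_pps  # seconds between consecutive packets
    ordered = islice(cycle(destinations), packet_count)
    return [(index * gap, dst) for index, dst in enumerate(ordered)]


# Three example destinations at 1000 packets per second: every destination
# receives one packet before any destination receives a second one.
schedule = round_robin_schedule(["10.0.1.1", "10.0.2.1", "10.0.3.1"], 1000.0, 6)
```

With this ordering, the time between two packets to the same destination
is the number of destinations divided by the Offered Load.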

5.7.  Measurement Accuracy

   Since packet loss is observed to measure the Route Convergence Time,
   the time between two successive packets offered to each individual
   route is the highest possible accuracy of any packet loss based
   measurement.  The higher the traffic rate offered to each route, the
   higher the possible measurement accuracy.  When packet jitter is much
   less than the convergence time, it is a negligible source of error
   and therefore it will be ignored here.
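
To make the accuracy bound concrete, the sketch below computes the time
between two successive packets to the same route.  It assumes that every
measured route receives an equal share of the Offered Load, as required in
Section 5.6.  The function name is hypothetical.

```python
def per_route_packet_interval(offered_load_pps, routes_measured):
    """Time between two successive packets to the same route (seconds)."""
    if offered_load_pps <= 0 or routes_measured <= 0:
        raise ValueError("offered load and route count must be positive")
    return routes_measured / offered_load_pps


# Illustrative values: 100,000 packets per second spread over 1000 routes.
interval = per_route_packet_interval(100_000.0, 1000)
```

In this example each route sees one packet every 10 milliseconds, so no
packet-loss-based measurement can resolve a route's convergence time more
finely than that.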
   
5.8.  Measurement Statistics

   The benchmark measurements may vary for each trial, due to the
   statistical nature of timer expirations, CPU scheduling, etc.
   Evaluation of the test data must be done with an understanding of
   generally accepted testing practices regarding repeatability,
   variance and statistical significance of a small number of trials.

5.9.  Tester Capabilities

   It is RECOMMENDED that the Tester used to execute each test case has
   the following capabilities:

   1.  Ability to establish IGP adjacencies and advertise a single IGP
       topology to one or more peers.

   2.  Ability to insert a timestamp in each data packet's IP payload.
       [Anuj:] All required benchmarks for the test (Event Instant,
       Recovery Instant, First Route Convergence Instant etc) are
       measurable by observing the traffic rates on the egress ports.
       Hence this condition is not necessary for the measurements and
       there is no need for such a recommendation.
[Kris:] That is not the purpose of this packet timestamp. There is a need to measure
forwarding delay. I can modify it to: "Ability to measure forwarding delay,
duplicate packets and out-of-order packets."
	   
   3.  An internal time clock to control timestamping, time
       measurements, and time calculations. [Anuj:] This clock needs to be
       synchronized across all the interfaces of the Tester so that
       the measured metrics are comparable.
[Kris:] This is internal to the Tester. Externally the Tester just needs
to be able to give correct time measurements.

   4.  Ability to distinguish traffic load received on the Preferred and
       Next-Best Interfaces [Po09t].

   5.  Ability to disable or tune specific Layer-2 and Layer-3 protocol
       functions on any interface(s).
	   	   
   The Tester MAY be capable of making non-data plane convergence
   observations and using those observations for measurements.  The
   Tester MAY be capable of sending and receiving multiple traffic
   Streams [Po06].

   Also see Section 6 for method-specific capabilities.


6.  Selection of Convergence Time Benchmark Metrics and Methods

   Different convergence time benchmark methods MAY be used to measure
   convergence time benchmark metrics.  The Tester capabilities are
   important criteria to select a specific convergence time benchmark
   method.  The criteria to select a specific benchmark method include,
   but are not limited to:

   Tester capabilities:               Sampling Interval, number of
                                      Stream statistics to collect,
                                      [Anuj:] Sampling Duration
[Kris:] Sampling Duration <> Sampling Interval?
   Measurement accuracy:              Sampling Interval, Offered Load,
                                      [Anuj:] number of routes
[Kris:] OK.
   Test specification:                number of routes,
                                      [Anuj:] IGP topology
[Kris:] IGP topology is not a measurement method selection criterion.
   DUT capabilities:                  Throughput,
                                      [Anuj:] IGP route scale
[Kris:] The DUT's IGP route scale (i.e. how many routes it can support) is
not a method selection criterion.

6.1.  Loss-Derived Method

6.1.1.  Tester capabilities

   The Offered Load SHOULD consist of a single Stream [Po06].  If
   sending multiple Streams, the measured packet loss statistics for all
   Streams MUST be added together.

   In order to verify Full Convergence completion and the Sustained
   Convergence Validation Time, the Tester MUST measure Forwarding Rate
   each Packet Sampling Interval.

   The total number of packets lost between the start of the traffic and
   the end of the Sustained Convergence Validation Time is used to
   calculate the Loss-Derived Convergence Time.

6.1.2.  Benchmark Metrics

   The Loss-Derived Method can be used to measure the Loss-Derived
   Convergence Time, which is the average convergence time over all
   routes, and to measure the Loss-Derived Loss of Connectivity Period,
   which is the average Route Loss of Connectivity Period over all
   routes.

6.1.3.  Measurement Accuracy

   The measurement accuracy interval of the Loss-Derived Method is
   -(1/Offered Load), +(1/Offered Load).

6.2.  Rate-Derived Method

6.2.1.  Tester Capabilities

   The Offered Load SHOULD consist of a single Stream. [Anuj:] Because a
   single stream is defined as having a unique source and destination IP,
   how will this work if Offered load needs to send traffic to each route?
   Just sending traffic to one route will give convergence metric just
   for that route. 
[Kris:] It's Stream as defined in "[Po06], section 3.3.2":
"3.3.2.  Stream

   Definition:
      A group of packets tracked as a single entity by the traffic
      receiver.  A stream MUST share common content, such as type (IP,
      UDP), IP SA/DA, packet size, or payload.
"

   If sending multiple Streams, the measured traffic rate statistics for
   all Streams MUST be added together.

   The Tester measures Forwarding Rate each Packet Sampling Interval.
   The Packet Sampling Interval influences the observation of the
   different convergence time instants.  If the Packet Sampling Interval
   is large compared to the time between the convergence time instants,
   then the different time instants may not be easily identifiable from
   the Forwarding Rate observation.  The requirements for the Packet
   Sampling Interval are specified in [Po09t].  The Packet Sampling
   Interval MUST be larger than or equal to the time between two
   consecutive packets to the same route.  For maximum accuracy the
   value for the Packet Sampling Interval SHOULD be as small as
   possible.  The Packet Sampling Interval MUST be reported.

6.2.2.  Benchmark Metrics

   The Rate-Derived Method SHOULD be used to measure First Route
   Convergence Time and Full Convergence Time.  It SHOULD NOT be used to
   measure Loss of Connectivity Period (see Section 4).

6.2.3.  Measurement Accuracy

   The measurement accuracy interval of the Rate-Derived Method depends
   on the metric being measured or calculated and the characteristics of
   the related transition.

   If the Convergence Event Instant is observed on the dataplane using
   the Rate-Derived Method, it needs to be instantaneous for all routes
   (see Section 4.1).  The accuracy interval for measuring the
   Convergence Event Instant using the Rate-Derived Method is: -(Packet
   Sampling Interval + 1/Offered Load), +0.
   
   [Anuj:] I assume Packet Sampling Interval >= time between two 
   consecutive packets to the same destination, as per definition in 
   terminology document. The range is better represented as 
   (Convergence Event Instant as observed by the Tester - Packet Sampling 
   Interval - 1/Offered Load, Convergence Event Instant as observed by
   the Tester)
[Kris:] I can make it: "The real Convergence Event Instant is within
the accuracy interval [-(Packet Sampling Interval + 1/Offered Load), +0]
around the Convergence Event Instant as measured using the Rate-Derived Method."

   [Anuj:] Also required is the accuracy interval for the instant when 
   packet loss is started for the last route. We need to coin a term
   for this. The interval would be -0, +(Packet Sampling Interval +
   1/Offered Load). The range is better represented as 
   (Packet loss start for the last route as observed by the Tester, 
   Packet loss start for the last route as observed by the Tester + 
   Packet Sampling Interval + 1/offered load)
[Kris:] Measuring that instant is not needed to measure convergence time.
See also discussion in section 4.

   If the Convergence Recovery Transition is non-instantaneous for all
   routes then the accuracy interval for measuring the First Route
   Convergence Instant and Convergence Recovery Instant using the Rate-
   Derived Method is: -(Packet Sampling Interval + time between two
   consecutive packets to the same destination), +0.
   
   [Anuj:] How? This should be derived from the accuracy of Convergence
   Event Instant and instant when packet loss is seen for the last route.
   In effect, it should be an addition of the inaccuracies of both the 
   ranges.
[Kris:] First Route Convergence Instant and Convergence Recovery Instant
are measurements in absolute time, not derived metrics. Mentioning
"Convergence Recovery Instant" was just to indicate that the Rate-Derived
measurement accuracy interval depends on the characteristics of the transition.


   The term "time between two consecutive packets to the same
   destination" is added in the above accuracy interval since packets
   are sent in a particular order to all destinations in a stream and
   when part of the routes experience packet loss, it is unknown where
   in the transmit cycle packets to these routes are sent.  This
   uncertainty adds to the error.

   The accuracy interval of the derived metrics Rate-Derived Convergence
   Time and First Route Convergence Time is: -(Packet Sampling Interval
   + time between two consecutive packets to the same destination),
   +(Packet Sampling Interval + 1/Offered Load).

   [Anuj:] How? This is again derived from two instants - event instant
   and recovery instant. The range should incorporate the ranges for the
   two instants. Also it is not clear in the format you have represented
   it.
[Kris:] I'll use abbreviations to clarify the formulas (such as PSI: Packet
Sampling Interval; OL: Offered Load; ...)
It does incorporate the ranges for the two instants. I calculate the
error range between real and measured value (correct me if I've made a mistake):
convergence recovery instant (R) - convergence event instant (E)
suffix "r": real value; suffix "m": measured value

Ra, Rb, Ea, Eb are the errors:
Rm-Ra<Rr<Rm+Rb
Em-Ea<Er<Em+Eb

PSI=Packet Sampling Interval
I=time between two consecutive packets to the same destination
OL=Offered Load

convergence event instant: [-(PSI+1/OL), +0]
convergence recovery instant: [-2PSI, -(PSI-I)]

minimal possible Rr-Er for a measured Rm-Em: (Rm-Ra) - (Em+Eb) = (Rm-2PSI) - (Em+0) = (Rm-Em) - 2PSI
maximal possible Rr-Er for a measured Rm-Em: (Rm+Rb) - (Em-Ea) = (Rm-(PSI-I)) - (Em-(PSI+1/OL)) = (Rm-Em) + (I+1/OL)

full convergence time accuracy interval: [-2PSI, +(I+1/OL)]
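
The arithmetic above can be reproduced with a short Python sketch.  It
only combines the two per-instant intervals exactly as written in the
calculation above and does not by itself confirm those intervals.  The
function name and the numeric values of PSI, I and 1/OL are illustrative
assumptions.

```python
def interval_difference(recovery_err, event_err):
    """Error interval of (recovery - event) from the two instants' intervals.

    Each argument is (lower, upper): the offsets of the real value relative
    to the measured value.  The smallest difference pairs the lowest
    recovery offset with the highest event offset, and vice versa.
    """
    r_lo, r_hi = recovery_err
    e_lo, e_hi = event_err
    return (r_lo - e_hi, r_hi - e_lo)


psi = 0.010       # PSI: Packet Sampling Interval, seconds (example value)
i = 0.001         # I: time between two packets to the same destination
inv_ol = 0.00001  # 1/OL: one over the Offered Load

event = (-(psi + inv_ol), 0.0)      # convergence event instant
recovery = (-2 * psi, -(psi - i))   # convergence recovery instant
low, high = interval_difference(recovery, event)
```

For these values the result is [-2PSI, +(I+1/OL)], which agrees with the
full convergence time accuracy interval stated above.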


   If the Convergence Recovery Transition is instantaneous for all
   routes then the accuracy interval for measuring the First Route
   Convergence Instant and Convergence Recovery Instant using the Rate-
   Derived Method is: -(Packet Sampling Interval + 1/Offered Load), +0.

   [Anuj:] Same as above.
   
   The accuracy interval of the derived metrics Rate-Derived Convergence
   Time and First Route Convergence Time is: -(Packet Sampling Interval
   + 1/Offered Load), +(Packet Sampling Interval + 1/Offered Load).

   [Anuj:] same as above.
   
   If 1/Offered Load is much smaller than Packet Sampling Interval the
   term 1/Offered Load can be ignored in the accuracy interval
   calculations in this section.

6.3.  Route-Specific Loss-Derived Method

6.3.1.  Tester Capabilities

   The Offered Load consists of multiple Streams.  The Tester MUST
   measure packet loss for each Stream separately.

   In order to verify Full Convergence completion and the Sustained
   Convergence Validation Time, the Tester MUST measure packet loss each
   Packet Sampling Interval.  This measurement at each Packet Sampling
   Interval MAY be per Stream.

   Only the total packet loss measured per Stream at the end of the
   Sustained Convergence Validation Time is used to calculate the
   benchmark metrics with this method.

6.3.2.  Benchmark Metrics

   The Route-Specific Loss-Derived Method SHOULD be used to measure
   Route-Specific Convergence Times.  It is the RECOMMENDED method to
   measure Route Loss of Connectivity Period.

   Under the conditions explained in Section 4, First Route Convergence
   Time and Full Convergence Time as benchmarked using the Rate-Derived
   Method may be equal to the minimum and maximum, respectively, of the
   Route-Specific Convergence Times.

6.3.3.  Measurement Accuracy

   The measurement accuracy of the Route-Specific Loss-Derived Method is
   equal to the time between two consecutive packets to the same route.


7.  Reporting Format

   For each test case, it is recommended that the reporting tables below
   be completed.  All time values SHOULD be reported with the resolution
   specified in [Po09t].

        Parameter                           Units
        ----------------------------------- -----------------------
        Test Case                           test case number
        Test Topology                       (1, 2, 3, 4, or 5)
        IGP                                 (ISIS, OSPF, other)
        Interface Type                      (GigE, POS, ATM, other)
        Packet Size offered to DUT          bytes
        Offered Load                        packets per second
        IGP Routes advertised to DUT        number of IGP routes
        Nodes in emulated network           number of nodes
        Number of Routes measured           number of routes
        Packet Sampling Interval on Tester  seconds
        Forwarding Delay Threshold          seconds

        Timer Values configured on DUT:
         Interface failure indication delay seconds
         IGP Hello Timer                    seconds
         IGP Dead-Interval or hold-time     seconds
         LSA Generation Delay               seconds
         LSA Flood Packet Pacing            seconds
         LSA Retransmission Packet Pacing   seconds
         SPF Delay                          seconds

   Test Details:

      If the Offered Load matches a subset of routes, describe how this
      subset is selected.

      Describe how the Convergence Event is applied; does it cause
      instantaneous traffic loss or not.

   Complete the table below for the initial Convergence Event and the
   reversion Convergence Event.

     Parameter                                  Units
     ------------------------------------------ ----------------------
     Convergence Event                          (initial or reversion)

     Traffic Forwarding Metrics:
      Total number of packets offered to DUT    number of Packets
      Total number of packets forwarded by DUT  number of Packets
      Connectivity Packet Loss                  number of Packets
      Convergence Packet Loss                   number of Packets
      Out-of-Order Packets                      number of Packets
      Duplicate Packets                         number of Packets

     Convergence Benchmarks:
      Rate-Derived Method:
       First Route Convergence Time             seconds
       Full Convergence Time                    seconds
      Loss-Derived Method:
       Loss-Derived Convergence Time            seconds
      Route-Specific Loss-Derived Method:
       Route-Specific Convergence Time[n]       array of seconds
       Minimum R-S Convergence Time             seconds
       Maximum R-S Convergence Time             seconds
       Median R-S Convergence Time              seconds
       Average R-S Convergence Time             seconds

     Loss of Connectivity Benchmarks:
      Loss-Derived Method:
       Loss-Derived Loss of Connectivity Period seconds
      Route-Specific Loss-Derived Method:
       Route LoC Period[n]                      array of seconds
       Minimum Route LoC Period                 seconds
       Maximum Route LoC Period                 seconds
       Median Route LoC Period                  seconds
       Average Route LoC Period                 seconds
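
The minimum, maximum, median and average rows of the table can be derived
from the per-route arrays.  The sketch below uses Python's standard
statistics module.  The function name and the sample values are
hypothetical and do not come from any measurement.

```python
from statistics import mean, median


def summarize(per_route_seconds):
    """Minimum, maximum, median and average of a per-route array of times."""
    if not per_route_seconds:
        raise ValueError("at least one route is required")
    return {
        "minimum": min(per_route_seconds),
        "maximum": max(per_route_seconds),
        "median": median(per_route_seconds),
        "average": mean(per_route_seconds),
    }


# Example Route-Specific Convergence Times for four routes, in seconds.
report = summarize([0.25, 0.5, 0.75, 1.5])
```

The same helper applies unchanged to the Route LoC Period[n] array.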


8.  Test Cases

   It is RECOMMENDED that all applicable test cases be performed for
   best characterization of the DUT.  The test cases follow a generic
   procedure tailored to the specific DUT configuration and Convergence
   Event [Po09t].  This generic procedure is as follows:

   1.   Establish DUT and Tester configurations and advertise an IGP
        topology from Tester to DUT.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is routed correctly. [Anuj:] Verify traffic is not
        getting dropped, reordered, or packets are delayed.
[Kris:] "Verify if traffic is forwarded without drops, without out-of-order
packets, and without exceeding the Forwarding Delay threshold."

   4.   Introduce Convergence Event [Po09t].

   5.   Measure First Route Convergence Time [Po09t].
   [Anuj:] why is this required to be done at real time? the tester may
   just choose to record the rates and number of packets in the sampling 
   intervals and then measure the first route convergence time as an
   end-of-test metric analysis.
[Kris:] The measurement can be done at this point. If you choose to postpone
processing of the measurements taken at that point, you can do that but I don't
think it's needed to indicate that.  The steps are not intended as an implementation
algorithm.

   6.   Measure Full Convergence Time [Po09t].

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC Period
        [Po09t].

   9.   Wait sufficient time for queues to drain.
   [Anuj:] How is this defined?
[Kris:] "The duration of this time period is equal to the Forwarding
Delay threshold. In the absence of a Forwarding Delay threshold
specification, the duration of this time period is 2 seconds [RFC2544]."
I should probably add Forwarding Delay Threshold to the terms doc.

   10.  Restart Offered Load.
   [Anuj:] Why should this step and the steps beyond it be performed if
   not all routes have converged by this point? If the test continues
   without convergence having been achieved, the reversion test is
   bound to produce wrong metrics.
[Kris:] If convergence has not been achieved, the test would be stuck
at step 6. I think it's clear that in that case convergence times
cannot be reported, or are reported as "infinite".
   
   11.  Reverse Convergence Event.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.
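
The ordering of the generic procedure can be sketched as a small driver. This is only an illustration, not an implementation algorithm: RecordingTester and its do() method are hypothetical stand-ins that merely log each action, and the 2-second fallback for step 9 is the value suggested in the review comment above, applied when no Forwarding Delay threshold is specified.

```python
import time
from typing import Callable, List, Optional

DEFAULT_DRAIN_S = 2.0  # fallback from the step 9 comment: 2 seconds [RFC2544]

STEPS_BEFORE_DRAIN = [
    "advertise IGP topology",                              # step 1
    "send Offered Load",                                   # step 2
    "verify forwarding",                                   # step 3
    "introduce Convergence Event",                         # step 4
    "measure First Route Convergence Time",                # step 5
    "measure Full Convergence Time",                       # step 6
    "stop Offered Load",                                   # step 7
    "measure route-specific and loss-derived benchmarks",  # step 8
]
STEPS_AFTER_DRAIN = [
    "restart Offered Load",                                # step 10
    "reverse Convergence Event",                           # step 11
    "measure First Route Convergence Time",                # step 12
    "measure Full Convergence Time",                       # step 13
    "stop Offered Load",                                   # step 14
    "measure route-specific and loss-derived benchmarks",  # step 15
]


def drain_wait_seconds(forwarding_delay_threshold_s: Optional[float] = None) -> float:
    """Step 9: wait for the Forwarding Delay threshold, else 2 seconds."""
    if forwarding_delay_threshold_s is None:
        return DEFAULT_DRAIN_S
    return forwarding_delay_threshold_s


class RecordingTester:
    """Hypothetical stand-in for a real Tester; it only logs each action."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def do(self, action: str) -> None:
        self.calls.append(action)


def run_generic_procedure(
    tester: RecordingTester,
    forwarding_delay_threshold_s: Optional[float] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Walk through steps 1 to 15 in order, draining queues at step 9."""
    for action in STEPS_BEFORE_DRAIN:
        tester.do(action)
    sleep(drain_wait_seconds(forwarding_delay_threshold_s))  # step 9
    for action in STEPS_AFTER_DRAIN:
        tester.do(action)


tester = RecordingTester()
waits: List[float] = []
# Record the requested wait instead of actually sleeping.
run_generic_procedure(tester, sleep=waits.append)
```

Passing the sleep function as a parameter keeps the sketch testable without real delays; a real Tester would replace do() with its own control interface.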

8.1.  Interface failures

8.1.1.  Convergence Due to Local Interface Failure

   Objective

   To obtain the IGP convergence times due to a Local Interface failure
   event.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the topology
        shown in Figure 1.

   2.   Send Offered Load from Tester to DUT on ingress interface.




   3.   Verify traffic is forwarded over Preferred Egress Interface.

   4.   Remove link on DUT's Preferred Egress Interface.  This is the
        Convergence Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times and Loss-Derived
        Convergence Time.

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore link on DUT's Preferred Egress Interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   Results

   The measured IGP convergence time may be influenced by the link
   failure indication time, LSA/LSP delay, LSA/LSP generation time, LSA/
   LSP flood packet pacing, SPF delay, SPF execution time, and routing
   and forwarding tables update time [Po09a].

8.1.2.  Convergence Due to Remote Interface Failure

   Objective

   To obtain the IGP convergence time due to a Remote Interface failure
   event.

   Procedure






   1.   Advertise an IGP topology from Tester to SUT using the topology
        shown in Figure 2.

   2.   Send Offered Load from Tester to SUT on ingress interface.

   3.   Verify traffic is forwarded over Preferred Egress Interface.

   4.   Remove link on Tester's interface [Po09t] connected to SUT's
        Preferred Egress Interface.  This is the Convergence Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times and Loss-Derived
        Convergence Time.

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore link on Tester's interface connected to SUT's Preferred
        Egress Interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   Results

   The measured IGP convergence time may be influenced by the link
   failure indication time, LSA/LSP delay, LSA/LSP generation time, LSA/
   LSP flood packet pacing, SPF delay, SPF execution time, and routing
   and forwarding tables update time.  This test case may produce Stale
   Forwarding [Po09t] due to a transient microloop between R1 and R2
   during convergence, which may increase the measured convergence times
   and loss of connectivity periods.






8.1.3.  Convergence Due to ECMP Member Local Interface Failure

   Objective

   To obtain the IGP convergence time due to a Local Interface link
   failure event of an ECMP Member.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the test
        setup shown in Figure 3.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is forwarded over the DUT's ECMP member interface
        that will be failed in the next step.

   4.   Remove link on one of the DUT's ECMP member interfaces.  This is
        the Convergence Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times and Loss-Derived
        Convergence Time.  At the same time measure Out-of-Order Packets
        [Po06] and Duplicate Packets [Po06].

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore link on DUT's ECMP member interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.  At the same time measure Out-of-Order Packets [Po06]
        and Duplicate Packets [Po06].
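
One way a Tester might count Out-of-Order Packets and Duplicate Packets from received sequence numbers is sketched below. The counting rules are simplifying assumptions made for illustration (a packet is counted as out of order when its sequence number is lower than the highest one already received, and as a duplicate when its sequence number was seen before); [Po06] remains the authoritative definition, and the function name is hypothetical.

```python
def count_out_of_order_and_duplicates(sequence_numbers):
    """Return (out_of_order, duplicates) for one flow's received sequence numbers."""
    seen = set()
    highest = None
    out_of_order = 0
    duplicates = 0
    for seq in sequence_numbers:
        if seq in seen:
            # Same sequence number received again: count as a duplicate.
            duplicates += 1
            continue
        seen.add(seq)
        if highest is not None and seq < highest:
            # Arrived after a packet with a higher sequence number.
            out_of_order += 1
        else:
            highest = seq
    return out_of_order, duplicates


# Hypothetical arrival order on the egress interfaces of one flow.
counts = count_out_of_order_and_duplicates([1, 2, 4, 3, 3, 5, 7, 6])
```

In this example packets 3 and 6 arrive late and packet 3 arrives twice, so the function returns two out-of-order packets and one duplicate.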

   Results



   The measured IGP Convergence time may be influenced by link failure
   indication time, LSA/LSP delay, LSA/LSP generation time, LSA/LSP
   flood packet pacing, SPF delay, SPF execution time, and routing and
   forwarding tables update time [Po09a].

8.1.4.  Convergence To ECMP set Due to Local Interface Failure

   Objective

   To obtain the IGP convergence time due to a Local Interface link
   failure event from the Preferred Egress Interface.  The Next-Best
   Egress Interfaces are members of a single ECMP set.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the test
        setup shown in Figure 4.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is forwarded over Preferred Egress Interface.

   4.   Remove link on Tester's interface connected to DUT's Preferred
        Egress Interface.  This is the Convergence Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times and Loss-Derived
        Convergence Time.  At the same time measure Out-of-Order Packets
        [Po06] and Duplicate Packets [Po06].

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore link on Tester's interface connected to DUT's Preferred
        Egress Interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.




   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.  At the same time measure Out-of-Order Packets [Po06]
        and Duplicate Packets [Po06].

   Results

   The measured IGP Convergence time may be influenced by link failure
   indication time, LSA/LSP delay, LSA/LSP generation time, LSA/LSP
   flood packet pacing, SPF delay, SPF execution time, and routing and
   forwarding tables update time [Po09a].

8.1.5.  Convergence Due to ECMP Member Remote Interface Failure

   Objective

   To obtain the IGP convergence time due to a Remote Interface link
   failure event for an ECMP Member.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the test
        setup shown in Figure 5.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is forwarded over the DUT's ECMP member interface
        that will be failed in the next step.

   4.   Remove link on Tester's interface to R2.  This is the
        Convergence Event Trigger.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times and Loss-Derived
        Convergence Time.  At the same time measure Out-of-Order Packets
        [Po06] and Duplicate Packets [Po06].

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore link on Tester's interface to R2.




   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.  At the same time measure Out-of-Order Packets [Po06]
        and Duplicate Packets [Po06].

   Results

   The measured IGP convergence time may be influenced by the link
   failure indication time, LSA/LSP delay, LSA/LSP generation time, LSA/
   LSP flood packet pacing, SPF delay, SPF execution time, and routing
   and forwarding tables update time.  This test case may produce Stale
   Forwarding [Po09t] due to a transient microloop between R1 and R2
   during convergence, which may increase the measured convergence times
   and loss of connectivity periods.

8.1.6.  Convergence To ECMP set Due to Remote Interface Failure

   Objective

   To obtain the IGP convergence time due to a Remote Interface link
   failure event from the Preferred Egress Interface.  The Next-Best
   Egress Interfaces are members of a single ECMP set.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the test
        setup shown in Figure 6.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is forwarded over Preferred Egress Interface.

   4.   Remove link on Tester's interface to R2.  This is the
        Convergence Event Trigger.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.





   8.   Measure Route-Specific Convergence Times and Loss-Derived
        Convergence Time.  At the same time measure Out-of-Order Packets
        [Po06] and Duplicate Packets [Po06].

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore link on Tester's interface to R2.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.  At the same time measure Out-of-Order Packets [Po06]
        and Duplicate Packets [Po06].

   Results

   The measured IGP convergence time may be influenced by the link
   failure indication time, LSA/LSP delay, LSA/LSP generation time, LSA/
   LSP flood packet pacing, SPF delay, SPF execution time, and routing
   and forwarding tables update time.  This test case may produce Stale
   Forwarding [Po09t] due to a transient microloop between R1 and R2
   during convergence, which may increase the measured convergence times
   and loss of connectivity periods.

8.1.7.  Convergence Due to Parallel Link Interface Failure

   Objective

   To obtain the IGP convergence time due to a local link failure event
   for a member of a parallel link.  The links can be used for data load
   balancing.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the test
        setup shown in Figure 7.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is forwarded over the parallel link member that
        will be failed in the next step.



   4.   Remove link on one of the DUT's parallel link member interfaces.
        This is the Convergence Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times and Loss-Derived
        Convergence Time.  At the same time measure Out-of-Order Packets
        [Po06] and Duplicate Packets [Po06].

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore link on DUT's Parallel Link member interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.  At the same time measure Out-of-Order Packets [Po06]
        and Duplicate Packets [Po06].

   Results

   The measured IGP convergence time may be influenced by the link
   failure indication time, LSA/LSP delay, LSA/LSP generation time, LSA/
   LSP flood packet pacing, SPF delay, SPF execution time, and routing
   and forwarding tables update time [Po09a].

8.2.  Other failures

8.2.1.  Convergence Due to Layer 2 Session Loss

   Objective

   To obtain the IGP convergence time due to a local layer 2 loss.

   Procedure





   1.   Advertise an IGP topology from Tester to DUT using the topology
        shown in Figure 1.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is routed over Preferred Egress Interface.

   4.   Remove Layer 2 session from DUT's Preferred Egress Interface.
        This is the Convergence Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore Layer 2 session on DUT's Preferred Egress Interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   Results

   The measured IGP Convergence time may be influenced by the Layer 2
   failure indication time, LSA/LSP delay, LSA/LSP generation time, LSA/
   LSP flood packet pacing, SPF delay, SPF execution time, and routing
   and forwarding tables update time [Po09a].

   Discussion

   Configure IGP timers such that the IGP adjacency does not time out
   before layer 2 failure is detected.




   To measure convergence time, traffic SHOULD start dropping on the
   Preferred Egress Interface on the instant the layer 2 session is
   removed.  Alternatively the Tester SHOULD record the time the instant
   layer 2 session is removed and traffic loss SHOULD only be measured
   on the Next-Best Egress Interface.  For loss-derived benchmarks the
   time of the Start Traffic Instant SHOULD be recorded as well.  See
   Section 4.1.
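
The alternative described above (record the instant of the Convergence Event, then observe only the Next-Best Egress Interface) amounts to a timestamp subtraction. The sketch below is illustrative only: the function and parameter names are hypothetical, and the loss-derived formula (lost packets divided by the Offered Load rate) is an assumption about how [Po09t] defines the benchmark, so the terminology document should be checked before relying on it.

```python
def convergence_time_from_timestamps(event_instant_s, first_rx_next_best_s):
    """Time from the recorded event instant to the first packet seen on the
    Next-Best Egress Interface, both in seconds on the Tester's clock."""
    if first_rx_next_best_s < event_instant_s:
        raise ValueError("traffic seen on Next-Best Egress Interface before the event")
    return first_rx_next_best_s - event_instant_s


def loss_derived_convergence_time(packets_lost, offered_load_pps):
    """Assumed loss-derived calculation: lost packets / Offered Load (packets/s)."""
    if offered_load_pps <= 0:
        raise ValueError("Offered Load must be positive")
    return packets_lost / offered_load_pps


# Hypothetical values: event at t=10.0 s, first packet on next-best at t=10.25 s.
first_route_s = convergence_time_from_timestamps(10.0, 10.25)
# Hypothetical values: 5000 packets lost at 10000 packets per second.
loss_derived_s = loss_derived_convergence_time(5000, 10000)
```

The first function depends on the Tester recording the event instant itself, which is why the text asks for that timestamp when traffic loss is measured only on the Next-Best Egress Interface.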

8.2.2.  Convergence Due to Loss of IGP Adjacency

   Objective

   To obtain the IGP convergence time due to loss of an IGP Adjacency.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the topology
        shown in Figure 1.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is routed over Preferred Egress Interface.

   4.   Remove IGP adjacency from the Preferred Egress Interface.  The
        layer 2 session MUST be maintained.  This is the Convergence
        Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore IGP session on DUT's Preferred Egress Interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.





   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   Results

   The measured IGP Convergence time may be influenced by the IGP Hello
   Interval, IGP Dead Interval, LSA/LSP delay, LSA/LSP generation time,
   LSA/LSP flood packet pacing, SPF delay, SPF execution time, and
   routing and forwarding tables update time [Po09a].

   Discussion

   Configure layer 2 such that layer 2 does not time out before IGP
   adjacency failure is detected.

   To measure convergence time, traffic SHOULD start dropping on the
   Preferred Egress Interface on the instant the IGP adjacency is
   removed.  Alternatively the Tester SHOULD record the time the instant
   the IGP adjacency is removed and traffic loss SHOULD only be measured
   on the Next-Best Egress Interface.  For loss-derived benchmarks the
   time of the Start Traffic Instant SHOULD be recorded as well.  See
   Section 4.1.

8.2.3.  Convergence Due to Route Withdrawal

   Objective

   To obtain the IGP convergence time due to route withdrawal.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the topology
        shown in Figure 1.  The routes that will be withdrawn MUST be a
        set of leaf routes advertised by at least two nodes in the
        emulated topology.  The topology SHOULD be such that before the
        withdrawal the DUT prefers the leaf routes advertised by a node
        "nodeA" via the Preferred Egress Interface, and after the
        withdrawal the DUT prefers the leaf routes advertised by a node
        "nodeB" via the Next-Best Egress Interface.

   2.   Send Offered Load from Tester to DUT on Ingress Interface.

   3.   Verify traffic is routed over Preferred Egress Interface.





   4.   The Tester withdraws the set of IGP leaf routes from nodeA.
        This is the Convergence Event.  The withdrawal update message
        SHOULD be a single unfragmented packet.  If the routes cannot be
        withdrawn by a single packet, the messages SHOULD be sent using
        the same pacing characteristics as the DUT.  The Tester MAY
        record the time it sends the withdrawal message(s).

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Re-advertise the set of withdrawn IGP leaf routes from nodeA
        emulated by the Tester.  The update message SHOULD be a single
        unfragmented packet.  If the routes cannot be advertised by a
        single packet, the messages SHOULD be sent using the same pacing
        characteristics as the DUT.  The Tester MAY record the time it
        sends the update message(s).

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   Results

   The measured IGP convergence time is influenced by SPF or route
   calculation delay, SPF or route calculation execution time, and
   routing and forwarding tables update time [Po09a].

   Discussion

   To measure convergence time, traffic SHOULD start dropping on the
   Preferred Egress Interface on the instant the routes are withdrawn by
   the Tester.  Alternatively the Tester SHOULD record the time the
   instant the routes are withdrawn and traffic loss SHOULD only be
   measured on the Next-Best Egress Interface.  For loss-derived
   benchmarks the time of the Start Traffic Instant SHOULD be recorded
   as well.  See Section 4.1.

8.3.  Administrative changes

8.3.1.  Convergence Due to Local Administrative Shutdown

   Objective

   To obtain the IGP convergence time due to taking the DUT's Local
   Interface administratively out of service.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the topology
        shown in Figure 1.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is routed over Preferred Egress Interface.

   4.   Take the DUT's Preferred Egress Interface administratively out
        of service.  This is the Convergence Event.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  Restore Preferred Egress Interface by administratively enabling
        the interface.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.




   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   16.  It is possible that no measured packet loss will be observed for
        this test case.

   Results

   The measured IGP Convergence time may be influenced by LSA/LSP delay,
   LSA/LSP generation time, LSA/LSP flood packet pacing, SPF delay, SPF
   execution time, and routing and forwarding tables update time
   [Po09a].

8.3.2.  Convergence Due to Cost Change

   Objective

   To obtain the IGP convergence time due to route cost change.

   Procedure

   1.   Advertise an IGP topology from Tester to DUT using the topology
        shown in Figure 1.

   2.   Send Offered Load from Tester to DUT on ingress interface.

   3.   Verify traffic is routed over Preferred Egress Interface.

   4.   The Tester, emulating the neighbor node, increases the cost for
        all IGP routes at DUT's Preferred Egress Interface so that the
        Next-Best Egress Interface becomes the preferred path.  The
        update message advertising the higher cost MUST be a single
        unfragmented packet.  This is the Convergence Event.  The Tester
        MAY record the time it sends the update message advertising the
        higher cost on the Preferred Egress Interface.

   5.   Measure First Route Convergence Time.

   6.   Measure Full Convergence Time.

   7.   Stop Offered Load.

   8.   Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.



   9.   Wait sufficient time for queues to drain.

   10.  Restart Offered Load.

   11.  The Tester, emulating the neighbor node, decreases the cost for
        all IGP routes at DUT's Preferred Egress Interface so that the
        Preferred Egress Interface becomes the preferred path again.
        The update message advertising the lower cost MUST be a single
        unfragmented packet.

   12.  Measure First Route Convergence Time.

   13.  Measure Full Convergence Time.

   14.  Stop Offered Load.

   15.  Measure Route-Specific Convergence Times, Loss-Derived
        Convergence Time, Route LoC Periods, and Loss-Derived LoC
        Period.

   Results

   The measured IGP Convergence time may be influenced by SPF delay, SPF
   execution time, and routing and forwarding tables update time
   [Po09a].

   Discussion

   To measure convergence time, traffic SHOULD start dropping on the
   Preferred Egress Interface on the instant the cost is changed by the
   Tester.  Alternatively the Tester SHOULD record the time the instant
   the cost is changed and traffic loss SHOULD only be measured on the
   Next-Best Egress Interface.  For loss-derived benchmarks the time of
   the Start Traffic Instant SHOULD be recorded as well.  See Section
   4.1.


9.  Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization using controlled stimuli in a laboratory
   environment, with dedicated address space and the constraints
   specified in the sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network, or misroute traffic to the test
   management network.



   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
   benchmarking purposes.  Any implications for network security arising
   from the DUT/SUT SHOULD be identical in the lab and in production
   networks.


10.  IANA Considerations

   This document requires no IANA considerations.


11.  Acknowledgements

   Thanks to Sue Hares, Al Morton, Kevin Dubray, Ron Bonica, David Ward,
   Peter De Vriendt, Anuj Dewangan and the BMWG for their contributions
   to this work.


12.  Normative References

   [Br91]   Bradner, S., "Benchmarking terminology for network
            interconnection devices", RFC 1242, July 1991.

   [Br97]   Bradner, S., "Key words for use in RFCs to Indicate
            Requirement Levels", BCP 14, RFC 2119, March 1997.

   [Br99]   Bradner, S. and J. McQuaid, "Benchmarking Methodology for
            Network Interconnect Devices", RFC 2544, March 1999.

   [Ca90]   Callon, R., "Use of OSI IS-IS for routing in TCP/IP and dual
            environments", RFC 1195, December 1990.

   [Co08]   Coltun, R., Ferguson, D., Moy, J., and A. Lindem, "OSPF for
            IPv6", RFC 5340, July 2008.

   [Ho08]   Hopps, C., "Routing IPv6 with IS-IS", RFC 5308,
            October 2008.

   [Ko02]   Koodli, R. and R. Ravikanth, "One-way Loss Pattern Sample
            Metrics", RFC 3357, August 2002.

   [Ma98]   Mandeville, R., "Benchmarking Terminology for LAN Switching
            Devices", RFC 2285, February 1998.

   [Mo98]   Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.



   [Po06]   Poretsky, S., Perser, J., Erramilli, S., and S. Khurana,
            "Terminology for Benchmarking Network-layer Traffic Control
            Mechanisms", RFC 4689, October 2006.

   [Po09a]  Poretsky, S., "Considerations for Benchmarking Link-State
            IGP Data Plane Route Convergence",
            draft-ietf-bmwg-igp-dataplane-conv-app-17 (work in
            progress), March 2009.

   [Po09t]  Poretsky, S. and B. Imhoff, "Terminology for Benchmarking
            Link-State IGP Data Plane Route Convergence",
            draft-ietf-bmwg-igp-dataplane-conv-term-18 (work in
            progress), July 2009.


Authors' Addresses

   Scott Poretsky
   Allot Communications
   67 South Bedford Street, Suite 400
   Burlington, MA  01803
   USA

   Phone: + 1 508 309 2179
   Email: sporetsky <at> allot.com


   Brent Imhoff
   Juniper Networks
   1194 North Mathilda Ave
   Sunnyvale, CA  94089
   USA

   Phone: + 1 314 378 2571
   Email: bimhoff <at> planetspork.com


   Kris Michielsen
   Cisco Systems
   6A De Kleetlaan
   Diegem, BRABANT  1831
   Belgium

   Email: kmichiel <at> cisco.com









_______________________________________________
bmwg mailing list
bmwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/bmwg
Vijay K. Gurbani | 8 Feb 2010 16:09

BMWG SIP work items (-01) submitted

Al: I have just submitted

   draft-ietf-bmwg-sip-bench-term-01
   draft-ietf-bmwg-sip-bench-meth-01

These drafts include all the comments we have received on this
work so far.

Carol, Scott and I believe that this work can be progressed
forward in the working group.

The URL to the drafts is:

http://www.ietf.org/id/draft-ietf-bmwg-sip-bench-term-01.txt
http://www.ietf.org/id/draft-ietf-bmwg-sip-bench-meth-01.txt

Thank you and apologies for the delay in getting this out.

- vijay

-- 
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
Email: vkg <at> {alcatel-lucent.com,bell-labs.com,acm.org}
Web:   http://ect.bell-labs.com/who/vkg/
Internet-Drafts | 8 Feb 2010 16:15

I-D Action:draft-ietf-bmwg-sip-bench-term-01.txt

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the Benchmarking Methodology Working Group of the IETF.

	Title           : Terminology for Benchmarking Session Initiation Protocol (SIP) Networking Devices
	Author(s)       : S. Poretsky, et al.
	Filename        : draft-ietf-bmwg-sip-bench-term-01.txt
	Pages           : 34
	Date            : 2010-02-08

This document provides a terminology for benchmarking SIP performance
in networking devices.  Terms are included for test components, test
setup parameters, and performance benchmark metrics for black-box
benchmarking of SIP networking devices.  The performance benchmark
metrics are obtained for the SIP control plane and media plane.  The
terms are intended for use in a companion methodology document for
complete performance characterization of a device in a variety of
conditions making it possible to compare performance of different
devices.  It is critical to provide test setup parameters and a
methodology document for SIP performance benchmarking because SIP
allows a wide range of configuration and operational conditions that
can influence performance benchmark measurements.  It is necessary to
have terminology and methodology standards to ensure that reported
benchmarks have consistent definition and were obtained following the
same procedures.  Benchmarks can be applied to compare performance of
a variety of SIP networking devices.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-bmwg-sip-bench-term-01.txt

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/

Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.
_______________________________________________
bmwg mailing list
bmwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/bmwg
Internet-Drafts | 8 Feb 2010 16:15

I-D Action:draft-ietf-bmwg-sip-bench-meth-01.txt

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the Benchmarking Methodology Working Group of the IETF.

	Title           : Methodology for Benchmarking SIP Networking Devices
	Author(s)       : S. Poretsky, et al.
	Filename        : draft-ietf-bmwg-sip-bench-meth-01.txt
	Pages           : 20
	Date            : 2010-02-08

This document describes the methodology for benchmarking Session
Initiation Protocol (SIP) performance as described in SIP
benchmarking terminology document.  The methodology and terminology
are to be used for benchmarking signaling plane performance with
varying signaling and media load.  Both scale and establishment rate
are measured by signaling plane performance.  The SIP Devices to be
benchmarked may be a single device under test (DUT) or a system under
test (SUT).  Benchmarks can be obtained and compared for different
types of devices such as SIP Proxy Server, SBC, and server paired
with a media relay or Firewall/NAT device.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-bmwg-sip-bench-meth-01.txt

Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/

Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.
_______________________________________________
bmwg mailing list
bmwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/bmwg
Al Morton | 8 Feb 2010 16:23

Re: BMWG SIP work items (-01) submitted

Thanks for posting the latest status, Vijay.

It would be good if several people could read/comment on
the new drafts so we have a view of where this work stands
- the wg hasn't seen drafts on this topic in a while.

Al
bmwg chair

At 10:09 AM 2/8/2010, Vijay K. Gurbani wrote:
>Al: I have just submitted
>
>   draft-ietf-bmwg-sip-bench-term-01
>   draft-ietf-bmwg-sip-bench-meth-01
>
>These drafts include all the comments we have received on this
>work so far.
>
>Carol, Scott and I believe that this work can be progressed
>forward in the working group.
>
>The URL to the drafts is:
>
>http://www.ietf.org/id/draft-ietf-bmwg-sip-bench-term-01.txt
>http://www.ietf.org/id/draft-ietf-bmwg-sip-bench-meth-01.txt
>
>Thank you and apologies for the delay in getting this out.
>
>- vijay
>--
>Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
>1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
>Email: vkg <at> {alcatel-lucent.com,bell-labs.com,acm.org}
>Web:   http://ect.bell-labs.com/who/vkg/
>_______________________________________________
>bmwg mailing list
>bmwg <at> ietf.org
>https://www.ietf.org/mailman/listinfo/bmwg
Al Morton | 10 Feb 2010 21:46

Re: WGLC: draft-ietf-bmwg-igp-dataplane drafts

Adding my thoughts to one of the key issues:
Al
(as a participant)

At 10:04 AM 2/4/2010, Kris Michielsen wrote:
Anuj,
 
Comments in green, marked with [Kris1:].
I also added comments in the attached draft and attached another document with accuracy interval calculations (see below).
Please provide your feedback.
 

From: Dewangan, Anuj [mailto:Anuj.Dewangan <at> spirent.com]
Sent: 24 January 2010 02:20
To: Kris Michielsen; Al Morton; sporetsky <at> allot.com; bimhoff <at> planetspork.com
Subject: RE: [bmwg] WGLC: draft-ietf-bmwg-igp-dataplane drafts

Hi Kris,
 
I have commented inline in red and marked as [Anuj1:]. Also I have added comments/additions to the draft and have attached it. Please look for “[Anuj:]” to find the comments. However, the draft still does not answer some problems/issues seen when such a test is performed practically:
 
1. Due to inherent jitter in the traffic forwarded by the DUT, the graph is never as smooth as in theory. Even without a convergence event, the traffic rate fluctuates due to a combination of jitter in the forwarded traffic and the resolution of the sampling interval, which is supposed to be as small as possible (subject to the definition of at least one packet per route) and should usually be in milliseconds for any useful, accurate measurement. As an example, if there are only a few routes in the test, then even a couple of extra packets seen in a sampling interval (due to forwarding jitter) will cause a major fluctuation in the convergence graph. In such a case, the convergence instants are very difficult (or impossible) to calculate. This is a problem even with a “normal” number of routes but a very small sampling interval, which is possible if the offered rate (= DUT throughput) is high. This is not addressed anywhere in the draft.

[Kris1:] This is mainly a problem in cases where variations in rate need to be observed. For cases where there is a transition rcv rate X -> rcv rate 0 or rcv rate 0 -> rcv rate X it is less of a problem; jitter will add to the error interval, which is already fairly large.
Assuming jitter is symmetric around an average forwarding delay and this average forwarding delay is constant, and assuming that jitter == n*1/Offered Load and the packet sampling interval is of duration N*#routes/Offered Load, the expected number of packets under steady state in a packet sampling interval is between N*#routes-2n and N*#routes+2n.
If N>2n and the #received packets is outside of the above #received packets interval under steady state, one can decide a variation in rate has occurred.
If convergence has not yet completed for >=1 route during a sampling interval, the #received packets in a sampling interval is <= N*(#routes-1). So, under the above assumptions, and if N>2n, one can decide the convergence recovery instant was not reached if #received packets < N*#routes-2n. So the larger the jitter, the larger the packet sampling interval needs to be to derive the convergence recovery instant.
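A minimal sketch of the arithmetic above, under the same assumptions (symmetric jitter of n packet times around a constant average forwarding delay, and a sampling interval carrying N packets per route). The class SamplingCheck and its field names are hypothetical and appear in neither draft; the sketch only restates the N*#routes-2n and N*#routes+2n bounds and the N>2n condition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingCheck:
    """Hypothetical helper restating the bounds discussed in this thread."""
    offered_load_pps: float  # Offered Load, in packets per second
    routes: int              # #routes receiving traffic
    n_sample: int            # N: packets per route per sampling interval
    n_jitter: int            # n: jitter in packet times (jitter = n / Offered Load)

    @property
    def interval_s(self) -> float:
        # Packet sampling interval duration: N * #routes / Offered Load
        return self.n_sample * self.routes / self.offered_load_pps

    @property
    def jitter_s(self) -> float:
        # Jitter expressed in seconds: n * 1 / Offered Load
        return self.n_jitter / self.offered_load_pps

    def steady_state_bounds(self) -> tuple:
        # Expected packets per interval lie in [N*#routes - 2n, N*#routes + 2n]
        nominal = self.n_sample * self.routes
        return nominal - 2 * self.n_jitter, nominal + 2 * self.n_jitter

    def decidable(self) -> bool:
        # The reasoning only holds when N > 2n
        return self.n_sample > 2 * self.n_jitter

    def rate_varied(self, received: int) -> bool:
        # A count outside the steady-state window indicates a rate variation
        low, high = self.steady_state_bounds()
        return self.decidable() and not (low <= received <= high)

    def not_yet_recovered(self, received: int) -> bool:
        # Recovery instant not reached if #received < N*#routes - 2n
        low, _ = self.steady_state_bounds()
        return self.decidable() and received < low
```

For example, with 100 routes, N=5 and n=2, the steady-state window is 496 to 504 packets. A single unconverged route yields at most N*(#routes-1) = 495 packets, so that interval is flagged as not yet recovered.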
I would propose the following change:
 
"If the Packet Sampling Interval is large
   compared to the time between the convergence time instants, then the
   different time instants may not be easily identifiable from the
   Forwarding Rate observation.  Using a small Packet Sampling Interval
   in the presence of jitter may cause fluctuations of the Forwarding
   Rate observation and can prevent accurate measurement of the
   different time instants.  The requirements for the Packet
   Sampling Interval are specified in [Po09t].  The Packet Sampling
   Interval MUST be larger than or equal to the time between two
   consecutive packets to the same route.  For maximum accuracy the
   value for the Packet Sampling Interval SHOULD be as small as
   possible, but the presence of jitter may enforce using a larger
   Packet Sampling Interval.  The Packet Sampling Interval MUST be
   reported."
 
[ACM]  The only guidance on jitter I found in the preliminary -20 draft was in Section 5.7:
   ... When packet jitter is much
   less than the convergence time, it is a negligible source of error
   and therefore it will be ignored here.

It seems that a reasonable sanity check on the test configuration might be to measure the packet rate variation during the Packet Sampling Interval (or jitter, above) and compare the jitter with the value for N (the number of packets sent on each route during a sampling interval). The jitter should be accounted for when determining if the rate in a particular interval is sufficient to declare the convergence recovery instant.
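One way such a sanity check might look, as a rough sketch only: it assumes per-interval packet counts were collected under steady state (before any convergence event), and it reuses the earlier assumption in this thread that steady-state counts deviate from N*#routes by at most 2n, so that N>2n becomes "largest observed deviation < N". The function name jitter_sanity_check is hypothetical and not taken from any draft.

```python
def jitter_sanity_check(steady_counts, routes, n_sample):
    """Compare the observed steady-state rate variation with N.

    steady_counts: packets received in each sampling interval, measured
                   with no convergence event in progress.
    routes:        number of routes receiving traffic (#routes).
    n_sample:      N, packets sent on each route per sampling interval.

    Returns (max_deviation, ok), where ok is True when the largest
    deviation from the nominal N*#routes count is smaller than N.
    """
    if not steady_counts:
        raise ValueError("need at least one steady-state sample")
    nominal = n_sample * routes
    max_deviation = max(abs(count - nominal) for count in steady_counts)
    return max_deviation, max_deviation < n_sample
```

If the check fails, the thread's reasoning suggests enlarging the Packet Sampling Interval (a larger N) before trying to locate the convergence recovery instant.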

------------------

Speaking as chair, this rate jitter issue appears to affect the
methodologies in a few other bmwg drafts, so folks should take a look
at this and form their own opinions or add their experiences.

Al

_______________________________________________
bmwg mailing list
bmwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/bmwg
Fernando Calabria (fcalabri | 19 Feb 2010 18:02

New Version Notification for draft-asati-bmwg-reset-03

Hello WG,

We just submitted a new revision of draft-asati-bmwg-reset. It incorporates all the comments and feedback received from you so far, so please take some time, if possible, to go over the document and let us know your thoughts and any questions or concerns you may have.

Kind Regards,

Fernando Calabria & Authors

-----Original Message-----
From: IETF I-D Submission Tool [mailto:idsubmission <at> ietf.org]
Sent: Friday, February 19, 2010 10:05 AM
To: Carlos Pignataro (cpignata)
Cc: Rajiv Asati (rajiva); Fernando Calabria (fcalabri); cesar.olvera <at> consulintel.es
Subject: New Version Notification for draft-asati-bmwg-reset-03

A new version of I-D, draft-asati-bmwg-reset-03.txt has been successfully submitted by Carlos Pignataro and posted to the IETF repository.

Filename:        draft-asati-bmwg-reset
Revision:        03
Title:           Device Reset Characterization
Creation_date:   2010-02-19
WG ID:           Independent Submission
Number_of_pages: 19

Abstract:
An operational forwarding device may need to be re-started
(automatically or manually) for a variety of reasons, an event that
we call a "reset" in this document. Since there may be an
interruption in the forwarding operation during a reset, it is
useful to know how long a device takes to begin forwarding packets
again.

This document specifies a methodology for characterizing reset
during benchmarking of forwarding devices, and provides clarity and
consistency in reset test procedures beyond what's specified in
RFC2544. It therefore updates RFC2544.

The IETF Secretariat.

Fernando Calabria
Technical Leader AS
Customer Advocacy
fcalabri <at> cisco.com
Phone: +1 919 392 7931
Mobile: +1 919 345 5246
Pager: fcalabri <at> epage.cisco.com

CCIE - 4217

Cisco Systems, Inc.
United States
Cisco.com

_______________________________________________
bmwg mailing list
bmwg <at> ietf.org
https://www.ietf.org/mailman/listinfo/bmwg
