Re: Fast Retransmission in SCTP
Randy Stewart <randall <at> lakerest.net>
2012-01-17 12:01:20 GMT
Maxim:
As one of the authors of the paper and the research behind it, and having
done a few other things on SCTP, I will add my 2 cents, mostly
in-line, but some out of line here too.
In general, sending FRs to where the packets were originally sent is a good
idea. The reason is that the ACK clock is not broken; you just noticed
something was lost. Once you move a retransmission to another destination, you
are moving it onto the kind of path the first Michael below describes.
Furthermore, if you do that, you can *not* do more than one fast retransmit
per packet. So if you have an unlucky packet that gets dropped twice but you
have not lost the ACK clock, you are going to take a full-scale timeout in
order to get it through.
I have seen this happen in real networks and have found it best then to
keep the packet on the path it was originally sent on in general.
Now all of that is *generally true* when
a) You have NOT switched primaries
<and>
b) You still have an ACK clock.
If, on the other hand, the user has switched primaries, or the primary has
died and traffic has moved to another path, you may not wish to follow that advice.
The RFC very deliberately left the wording loose, not adding any MUST, so that an
implementation has some freedom if it wants to worry about these corner cases. This is a
good thing, since one can dream up several corner cases where the standard advice
(keep the packet on the path it was originally sent on) would *not*
be a good choice.
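
The rule of thumb above can be sketched in a few lines of Python. This is only
an illustration under assumptions: the Path and Chunk records and the
pick_fr_destination helper are hypothetical names, not taken from any SCTP
stack, and falling back to the current primary is just one policy that the
RFC's loose wording would allow.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    active: bool = True           # False once the path has been marked down
    ack_clock_alive: bool = True  # SACKs still arrive for data sent on this path


@dataclass
class Chunk:
    tsn: int
    sent_to: Path                 # destination of the original transmission


def pick_fr_destination(chunk: Chunk, current_primary: Path) -> Path:
    """Pick the destination for a fast retransmit of `chunk` (illustrative policy)."""
    orig = chunk.sent_to
    # General case: the original path is up and its ACK clock is intact,
    # so the fast retransmit stays on the path it was first sent on.
    if orig.active and orig.ack_clock_alive:
        return orig
    # Corner case (assumed policy): the original path is down or nothing
    # is being acked there any more, so use the current primary instead.
    return current_primary
```

With this policy a fast retransmit follows the original path even after the
user swaps primaries, as long as data sent on the old path is still being acked.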
On Jan 17, 2012, at 5:11 AM, Maxim Proshin wrote:
> Hi,
>
> it seems that your point is reasonable but it still requires some clarifications:
>
> 1. You are talking about "current primary path failovers". What if the user defined a destination? Should
> SCTP send fast retransmissions to the primary destination anyway?
I personally would still hold the original path if the user swapped paths, since the
ACK clock will slowly shut down. Unless, of course, you know that there is nothing
left on that old path. Again, this is quite a corner case. Most FRs will
happen with no switches, no deaths, just a momentary drop by a router in the network
to slow us down a bit.
>
> 2. It is unclear how Karn's algorithm can affect RTO calculation during FR. If the primary destination
> has just been changed and gaps are detected for the previous primary destination, it means that the RTO could not have
> changed (at least not due to Karn's algorithm). Could you please clarify scenarios where it could happen? In
> your item 1 you are talking about an alternative path and a conservative RTO. I think you mean the previous
> primary destination, but then it is unclear why its RTO is conservative.
>
> 3. I agree with the drawback about CWND. But as usual there are pros and cons. How do you propose to
> calculate CWND in this case? Should it be recalculated for the previous primary destination and the new one
> (or the one defined by the user and the primary one)? It looks like it should be recalculated, but in that case it impacts
> SCTP performance and conflicts with the RFC. It seems there are also problematic scenarios if CWND is
> recalculated. What if the number of bytes in flight for the new primary destination is equal to CWND?
> How should fast retransmissions from the previous primary one be made in that case? Should they be
> postponed? Moreover, changing the primary destination is not normally done often, and the amount of fast
> retransmissions to the previous primary destination will not have a big impact on CWND in comparison with
> all data.
I think this is again splitting hairs. The loss of one opportunity to grow the CWND in one
obscure corner case where the user switched primaries (or the path died) is not important.
If the user switches primaries he is going to take a hit anyway, since the old path
with a (possibly) very open cwnd will of course have much more data going through it
than the new path that is sitting at the initial window. In the case of path death, well,
again the new path will probably be in slow start, and it's just a consequence of the path
switch...
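
To put a rough number on that hit: RFC 4960 (section 7.2.1) sets the initial
cwnd to min(4*MTU, max(2*MTU, 4380 bytes)). A small Python sketch; the function
name and the example MTU values are made up for illustration.

```python
def initial_cwnd(mtu: int) -> int:
    """Initial congestion window in bytes, per RFC 4960 section 7.2.1."""
    return min(4 * mtu, max(2 * mtu, 4380))


# Illustrative values only: a typical Ethernet MTU and a jumbo-frame MTU.
ethernet_cwnd = initial_cwnd(1500)   # 4380 bytes
jumbo_cwnd = initial_cwnd(9000)      # 18000 bytes
```

A path that has been carrying bulk data for a while can easily have a cwnd many
times larger than this, which is the gap the new primary has to close through
slow start.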
In fact, the more I think on this, it must be that the user switched the primary,
since if the path exceeded its strike count and was marked as down, all the packets,
in-flight or otherwise, must be moved to the new primary.
Unless of course you get a double switch back, i.e. the path restores, in which case
we are again walking a very narrow corner.
Again, this is why the wording is loose: to allow implementations to
add code to handle obscure corner cases if they wish.
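
The path-death case can be sketched like this in Python. It is a minimal
illustration, not how any particular stack does it: the Path record, its
`queued` list and the on_timeout helper are hypothetical, and the only
protocol facts assumed are the per-destination error counter and the
Path.Max.Retrans threshold (suggested default 5) from RFC 4960.

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    name: str
    max_retrans: int = 5   # Path.Max.Retrans (RFC 4960 suggested default)
    error_count: int = 0   # the "strike count" for this destination
    active: bool = True
    queued: list = field(default_factory=list)  # TSNs in flight or awaiting retransmit


def on_timeout(path: Path, new_primary: Path) -> None:
    """Record one strike; once the path is down, move everything to the new primary."""
    path.error_count += 1
    if path.error_count > path.max_retrans:
        path.active = False
        # Once the path is marked down, nothing stays behind on it:
        # all chunks, in flight or otherwise, go to the new primary.
        new_primary.queued.extend(path.queued)
        path.queued.clear()
```

After this runs there is nothing left on the dead path to fast retransmit, so
the "which destination" question only arises when the user switched the primary.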
R
>
> BR, Maxim
>
> 2011/12/21 Michael Vittrup Larsen <michael.larsen <at> tieto.com>
> I don't believe this is a correct interpretation of the articles from Armando
> et al. ([4] of RFC4460). I believe the correct interpretation is
>
> "fast retransmissions should be sent to the destination to which
> we currently are sending data"
>
> Note that this is not the same as "...to the same destination where initial
> transmissions were sent" in case of current primary path failovers. In fact,
> if we send fast retransmissions "...to the same destination where initial
> transmissions were sent" we may in fact be sending these on a path different
> from where we currently are sending data - and this is exactly (to my
> understanding) what the change in SCTP fast retransmission in RFC4960 is all
> about.
>
> To see why, we need to look behind the words used by RFC4960 and into the
> articles and the reasoning in the articles:
>
> 1) Sending fast retransmissions (FR) on a path different from where we
> currently are sending new data may be subject to excessive RTO values on the
> alternative path. Successful FRs do not update the RTT of the alternative path
> due to Karn's algorithm, and the RTO on the alternative path is therefore
> conservative for the majority of the association and sending FRs on such a
> path is not optimal.
>
> 2) FRs do not benefit the cwnd of the path where we are sending new data but
> instead the cwnd of the alternative path, which is less attractive.
>
> 3) Possible timeouts of FRs sent on the alternative path will delay cwnd
> increments on the path where we are sending new data. Since these timeouts are
> conservative the delay may be excessive.
>
> Neither [4] nor RFC4960 explicitly describes the FR policy in the case
> where a change of the current primary data transfer path has occurred. But in
> essence the above reasoning/discovery of [4] clearly dictates what the correct
> interpretation should be.
>
>
> Regards,
> Michael
>
> On Friday, December 09, 2011 11:34:34 Michael Tüxen wrote:
> > On Dec 9, 2011, at 8:27 AM, Maxim Proshin wrote:
> > > Hi experts,
> > >
> > > here I would like to discuss the Fast Retransmission algorithm in RFC
> > > 4960. Frankly this RFC does not describe clearly how to send (to which
> > > destination) fast retransmissions but we can find some information in
> > > RFC 4460 which says:
> > >
> > > "
> > > 2.39. Retransmission Policy
> > >
> > > 2.39.1. Description of the Problem
> > >
> > > The current retransmission policy (send all retransmissions to an
> > > alternate destination) in the specification has performance issues
> > > under certain loss conditions with multihomed endpoints. Instead,
> > > fast retransmissions should be sent to the same destination, and only
> > > timeout retransmissions should be sent to an alternate destination [4].
> > > "
> > >
> > > Reading this article my first feeling was that fast retransmissions
> > > should be sent to the same destination where initial transmissions were
> > > sent.
> > >
> > > So my first question is
> > >
> > > is my RFC interpretation correct or not?
> >
> > Yes. Fast retransmissions should be sent to the same destination as the
> > initial transmission. Timer based retransmissions should use an alternate
> > destination.
> >
> > > Meanwhile, there are some more articles about retransmission policies
> > > where RFC's authors also participated. For instance, "Retransmission
> > > Policies With Transport Layer Multihoming" describes some improvements
> > > in fast retransmission policy in respect to RFC 2960 and I think that
> > > they were considered during RFC 4960 preparation. This analysis also
> > > proposes to send fast retransmissions to the same destination: "
> > > In this solution, all retransmissions are sent to the
> > > same destination as their original transmissions.
> > > ...
> > > Instead of all
> > > retransmissions following the Retransmit to Same Destination
> > > policy of Solution 1, only Fast Retransmissions
> > > should follow this policy.
> > > "
> > >
> > > It seems that this statement confirms my initial feeling described above.
> > > But at the same time it says about primary path: "
> > > Using the same destination for retransmissions has
> > > the added advantage that the primary destination's cwnd
> > > benefits from successful retransmissions.
> > > "
> > > and it results in some doubts. On the one hand fast retransmissions
> > > should be sent to the same destination, on the other it should be sent
> > > to the primary destination. In usual case "the same" and primary
> > > destinations are equal. But RFC 4960 doesn't prevent primary destination
> > > changing by the user so it can be changed in run-time.
> > >
> > > So my second question is
> > >
> > > if the primary destination is changed in run-time, where should SCTP
> > > send fast retransmissions which were detected for the previous primary
> > > destination?
> >
> > Changing the primary is normally not done at a high rate. So you are
> > looking at the case where a fast retransmission occurs and the user changed
> > the primary in between.
> >
> > I guess there is no special case in the RFC for that. So the fast
> > retransmission would go to the destination to which it was initially sent.
> > The same applies if the user specified a destination and therefore the
> > primary was not used.
> >
> > > The article mentioned above describes one advantage of sending
> > > to the primary destination: the cwnd benefits from successful
> > > retransmission. But I think that in the case of fast retransmission it will
> > > not be observed, or the advantage is very, very low.
> > >
> > > At the same time sending to the primary destination results in a big
> > > disadvantage: SCTP should recalculate cwnd if it sends fast
> > > retransmissions to the primary destination because RFC 4960 says: "
> > > When a Fast Retransmit is being performed, the sender SHOULD ignore the
> > > value of cwnd "
> > > I think the reason is that it was considered during initial transmission.
> >
> > If you do a fast retransmit, you assume that the packet has left the
> > network. So you would have to subtract and add it. This means ignoring...
> >
> > Best regards
> > Michael
> >
> > > What do you think about this?
> > >
> > > I'm not sure whether it was discussed already or not. If yes, please
> > > provide me with discussion's results.
>
> --
> *****************************************************************************************
> Michael Vittrup Larsen
> Software Architect
>
> Tieto Denmark A/S
> IP Solutions, Telecom & Media
> Skanderborgvej 232, 8260 Viby J, DK-Denmark
> Direct Phone / Mobile +45 3091 8469
> E-mail: michael.larsen <at> tieto.com
> *****************************************************************************************
> www.tieto.com
>
>
> --
> BR, Max
-----
Randall Stewart
randall <at> lakerest.net