rfc-editor | 1 Oct 2005 01:27

RFC 4129 on Digital Private Network Signaling System (DPNSS)/Digital Access Signaling System 2 (DASS 2) Extensions to the IUA Protocol


A new Request for Comments is now available in online RFC libraries.

        RFC 4129

        Title:      Digital Private Network Signaling System (DPNSS)/
                    Digital Access Signaling System 2 (DASS 2)
                    Extensions to the IUA Protocol
        Author(s):  R. Mukundan, K. Morneault, N. Mangalpally
        Status:     Standards Track
        Date:       September 2005
        Mailbox:    ranjith.mukundan <at> wipro.com, kmorneau <at> cisco.com,
                    narsim <at> nortelnetworks.com
        Pages:      15
        Characters: 29034
        Updates/Obsoletes/SeeAlso:    None

        I-D Tag:    draft-ietf-sigtran-dua-08.txt

        URL:        ftp://ftp.rfc-editor.org/in-notes/rfc4129.txt

This document defines a mechanism for backhauling Digital Private
Network Signaling System 1 (DPNSS 1) and Digital Access Signaling
System 2 (DASS 2) messages over IP by extending the ISDN User
Adaptation (IUA) Layer Protocol defined in RFC 3057.  DPNSS 1,
specified in ND1301:2001/03 (formerly BTNR 188), is used to
interconnect Private Branch Exchanges (PBX) in a private network.
DASS 2, specified in BTNR 190, is used to connect PBXs to the
PSTN.  This document aims to become an Appendix to IUA and to be
the base for a DPNSS 1/DASS 2 User Adaptation (DUA) implementation.

Brian F. G. Bidulock | 1 Oct 2005 03:38

Re: recommendations for M3UA/SUA BEATs when running over SCTP

Jeff,

Jeff Morriss wrote:                                                                   (Fri, 30 Sep 2005 17:35:49)
> 
> Hi Brian,
> 
> Brian F. G. Bidulock wrote:
> > Jeff,
> > 
> > Jeff Morriss wrote:                         (Fri, 30 Sep 2005 14:41:35)
> > 
> >>Hi list,
> >>
> >>Neither the M3UA nor the SUA RFC recommend sending BEATs when used over 
> >>SCTP.  ETSI goes a bit further and precludes the use of BEATs.
> >>
> >>However, SCTP (in particular, the I-G) allows the association to get 
> >>"stuck" such that it will pass no data: a receiver is allowed to hold 
> >>its window closed "for an indefinite time" (new text for section 6.1 A).
> > 
> > 
> > No, no, no.  SCTP is not permitted to hold its window closed
> > indefinitely just for the sake of it.  Only while its receive buffer is
> > indeed full and the user is not servicing it.  This was always the case
> > for SCTP.  The IG merely adds the zero window probe procedure to keep
> > the sender from stalling if the data happens to be unidirectional and
> > SACKs are being lost from the receiver.
> 
> RFC 2960 doesn't specifically mention that the receiver is allowed to 
> hold it closed indefinitely while the I-G specifically mentions it:

Michael Tuexen | 1 Oct 2005 21:01

Re: recommendations for M3UA/SUA BEATs when running over SCTP

Hi Jeff,

if the receiving SCTP endpoint gives you an a_rwnd = 0, the application is
too slow to handle all the traffic. Sending application-level HBs does not
help here; it hurts because it puts even more messages on the receiver.

I think the sender has to supervise its sending queue and start some
application-level flow control mechanism, like sending SIBs to its local
user in the SS7 world.

So I think not sending these messages AND supervising the send queue is
the way to go.

Best regards
Michael

On Sep 30, 2005, at 20:41, Jeff Morriss wrote:

>
> Hi list,
>
> Neither the M3UA nor the SUA RFC recommend sending BEATs when used 
> over SCTP.  ETSI goes a bit further and precludes the use of BEATs.
>
> However, SCTP (in particular, the I-G) allows the association to get 
> "stuck" such that it will pass no data: a receiver is allowed to hold 
> its window closed "for an indefinite time" (new text for section 6.1 

varadaraj.yatirajula | 3 Oct 2005 15:10

SUA: Routing Indicator in SUA RFC3868 and Q.713

All,

The routing indicator values for SCCP and SUA differ.

As per ITU Q.713, Sec 3.4.1:

Bit 7 of the address indicator octet contains routing information identifying
which address element shall be used for routing, and is encoded as follows:

      Bit 7
        1     Route on SSN
        0     Route on GT

As per RFC 3868, Section 3.10.2.1, Routing Indicator:

The following values are valid for the routing indicator:

      Reserved                      0
      Route on Global Title         1
      Route on SSN + PC             2
      Route on Hostname             3
      Route on SSN + IP Address     4

Question:

At the SUA layer, when we receive a message from the SCCP user we decode the
routing indicator as per Q.713, and when we receive a message from SCTP we
decode the routing indicator as per RFC 3868.

However, in an IPSP-IPSP configuration, how does an SCCP user send a message
to the SUA with a destination IP address or hostname, given that the SCCP
user can only fill in the Q.713 routing indicators (1 - Route on SSN /
0 - Route on GT)?  What routing indicator would be used?

_______________________________________________
Sigtran mailing list
Sigtran <at> ietf.org
https://www1.ietf.org/mailman/listinfo/sigtran
Jeff Morriss | 3 Oct 2005 16:12

Re: recommendations for M3UA/SUA BEATs when running over SCTP


Hi Michael,

Michael Tuexen wrote:
> Hi Jeff,
> 
> if the receiving SCTP endpoint gives you an a_rwnd = 0 the application is
> too slow to handle all traffic. Sending application level HB does not 
> help here,
> it hurts because it puts even more messages on the receiver.

Actually the application is fine.  In fact I believe, in this case, it 
has a second assoc which is in good shape.  It's just got one assoc 
which is "stuck."

> I think the sender has to supervise its sending queue and start some 
> application
> level flow control mechanisms like sending SIBs to its local user in the 
> SS7 world.
> 
> So I think not sending these messages AND supervising the send queue is 
> the way to go.

The problem then becomes one of detecting when congestion gets to be 
"too much."  I guess a timer is needed.

Regards,
-Jeff

> On Sep 30, 2005, at 20:41 Uhr, Jeff Morriss wrote:
> 
>>
>> Hi list,
>>
>> Neither the M3UA nor the SUA RFC recommend sending BEATs when used 
>> over SCTP.  ETSI goes a bit further and precludes the use of BEATs.
>>
>> However, SCTP (in particular, the I-G) allows the association to get 
>> "stuck" such that it will pass no data: a receiver is allowed to hold 
>> its window closed "for an indefinite time" (new text for section 6.1 A).
>>
>> I have encountered a number of problems in peer SCTPs which have 
>> caused those peers to close their windows and keep them closed 
>> indefinately. This leads to very serious traffic loss since M3UA will 
>> continue to queue mesages to the "stuck" association until the queues 
>> fill up and/or overflow.
>>
>> Using an end-to-end health check mechanism (such as M3UA or SUA BEATs) 
>> solves this problem pretty nicely (similar to the way periodic SLTMs 
>> do in MTP3).  However, I suspect many M3UA/SUA implementors may not 
>> implement or may not turn on (by default) BEATs under the (false, due 
>> to the reasons listed above) pretense that SCTP heartbeats are 
>> sufficient to ensure the viability of the association.
>>
>> What to do?  Should the RFCs recommend (or even require) using BEATs, 
>> even when run over SCTP?  (This is just a recommendation change for 
>> the RFCs but is a complete reversal for ETSI--but I guess that's not 
>> this list's problem.)
>>
>> Regards,
>> -Jeff
>>
>> _______________________________________________
>> Sigtran mailing list
>> Sigtran <at> ietf.org
>> https://www1.ietf.org/mailman/listinfo/sigtran
>>
> 
Jeff Morriss | 3 Oct 2005 16:29

Re: recommendations for M3UA/SUA BEATs when running over SCTP


Hi Brian,

Brian F. G. Bidulock wrote:
> Jeff,
> 
> Jeff Morriss wrote:                                                                   (Fri, 30 Sep 2005 17:35:49)
> 
>>Hi Brian,
>>
>>Brian F. G. Bidulock wrote:
>>
>>>Jeff,
>>>
>>>Jeff Morriss wrote:                         (Fri, 30 Sep 2005 14:41:35)
>>>
>>>
>>>>Hi list,
>>>>
>>>>Neither the M3UA nor the SUA RFC recommend sending BEATs when used over 
>>>>SCTP.  ETSI goes a bit further and precludes the use of BEATs.
>>>>
>>>>However, SCTP (in particular, the I-G) allows the association to get 
>>>>"stuck" such that it will pass no data: a receiver is allowed to hold 
>>>>its window closed "for an indefinite time" (new text for section 6.1 A).
>>>
>>>
>>>No, no, no.  SCTP is not permitted to hold its window closed
>>>indefinitely just for the sake of it.  Only while its receive buffer is
>>>indeed full and the user is not servicing it.  This was always the case
>>>for SCTP.  The IG merely adds the zero window probe procedure to keep
>>>the sender from stalling if the data happens to be unidirectional and
>>>SACKs are being lost from the receiver.
>>
>>RFC 2960 doesn't specifically mention that the receiver is allowed to 
>>hold it closed indefinitely while the I-G specifically mentions it:
>>
>>~~
>>       If the sender continues to receive new packets from the receiver
>>       while doing zero window probing, the unacknowledged window probes
>>       should not increment the error counter for the association or any
>>       destination transport address.The reason is that the receiver MAY
>>       keep its window closed for an indefinite time.  Refer to
>>~~
>>
>>(more below)
> 
> 
> Because the application is stuck.

No, the application is fine.  But (one of) the SCTP assocs it is using 
is stuck (due to a bug or whatever).

>>>Also, this has nothing to do with UA BEATs.  The UA BEAT message cannot
>>>arrive at the peer UA if the peer UA is not servicing its receive buffer
>>>and the buffer is full (rwnd = 0).
>>
>>Yes, that's exactly the point.  If the BEATs fail that means the assoc 
>>isn't carrying traffic.
>>
>>
>>>>I have encountered a number of problems in peer SCTPs which have caused 
>>>>those peers to close their windows and keep them closed indefinately. 
> 
> 
> There is no point in sending the messages.  SCTP provides a lifetime
> capability at the sender which allows the association to be aborted
> if a message ages in the send buffer beyond an interval.  Use that
> instead if you simply want to abort when the receiver sticks.
> 
> 
>>>
>>>Not a recommended practice.  Some implementations might artificially
>>>adjust rwnd.  This is not correct, and has been counter-recommended on
>>>TSVWG many times.  The only reason for closing rwnd is the actually
>>>filling of the receive buffer.  Anything else is an incorrect indication
>>>to the peer and any resulting performance or reliability problem is the
>>>fault of the SCTP implementation artificially closing the window.
>>
>>That's all well and good, but "problems" (read: bugs) do crop up and 
>>"the system" should be able to detect the problem, reset, and keep going 
>>(or at least try real hard to do so).
> 
> 
> A similar bug in SS7 would cause problems too.  

Hmmm, I'm not so sure...

MTP2 has T7 whose expiry will kill the link for excessive delay of ACK 
and T6 which will kill the link due to excessive congestion.  SCTP has 
neither, though it does, as you say, have an optional (and barely 
documented) data lifetime.  (I'll have to look at how many SCTPs 
implement that...)

MTP3 has (optional) periodic SLTMs in case those two MTP2 safeguards 
fail.  M3UA has (optional) BEATs but doesn't recommend them; neither 
does it recommend using SCTP's lifetime feature.

These differences mean, to me, that M3UA (as specified) has a big hole 
in it (through which someone could drive several hundred thousand call 
failures).

 > For SS7, validation and
> interworking testing is performed to ensure that the protocol stacks are
> not so poorly designed.

No amount of testing and validation will uncover every single bug in the 
system.  (And I do hope the above was a typo and that you do know the 
difference between a bug and a design problem.)

>>>>This leads to very serious traffic loss since M3UA will continue to 
>>>>queue mesages to the "stuck" association until the queues fill up and/or 
>>>>overflow.
>>>
>>>
>>>The sending M3UA can always monitor its send buffer occupancy to such a
>>>peer and respond accordingly.  If you look at the same ETSI spec that
>>>did away with BEATs, it also says that congestion procedures must be
>>>implemented.  An M3UA SG queuing messages to such a stuck ASP would,
>>>when following ETSI, have to indicate congestion back to the SS7 network
>>>or other sending ASPs.
>>>
>>>Then queues neither fill up, nor overflow.
>>
>>True, but what if the window being closed is due to a bug in the peer 
>>and that peer won't ever recover until the assoc is reset (read: is torn 
>>down and gets a fresh start on life)?
> 
> 
> So abort it.  Set a lifetime or a buffer threshold to abort.  But your
> still going to loose all buffered messages or risk duplicating them if
> you don't follow the corrid procedures.

I'll look into that.  But should the RFCs be updated to have some kind 
of recommendation?  I'd hate for other people/implementations to run 
into the same problem (isn't that part of what recommendations are for?).

>>>>Using an end-to-end health check mechanism (such as M3UA or SUA BEATs) 
>>>>solves this problem pretty nicely (similar to the way periodic SLTMs do 
>>>>in MTP3).
>>>
>>>
>>>SLTMs do not do this.  They are for detecting circuit assignment
>>>problems more than anything else.  A link will should never be taken
>>>down from a failure to acknowledge an SLTM due to queuing delay.  I have
>>>see cascading network failures from the failure to follow this
>>>principle.
>>
>>As per above, we're not talking about congestion nor queuing delay. 
>>We're talking about a "stuck" assoc.  Those BEATs/SLTMs will _never_ be 
>>responded to; in MTP3 2 such failures will cause the link to be failed 
>>(T1.111.7 section 2.2).
> 
> 
> No.  Not while in service, only during activation.  It is optional to
> even send SLTM while in service.

Agreed (that's why I specified that I was talking about periodic 
SLTMs).  But MTP3 can afford to make that optional since MTP2 has T6 and T7.

> I remember in the early days of SS7 the switch line modules used to
> nicely send an echo SLTMs even though MTP on the front-end was dead.
> Sending heartbeats does not cure anything.  Proper design of the
> application does.

Sounds like that MTP was poorly designed; there's no point in having 
health checks if they're responded to by the wrong layer.

>>>>However, I suspect many M3UA/SUA implementors may not implement or may
>>>>not turn on (by default) BEATs under the (false, due to the reasons
>>>>listed above) pretense that SCTP heartbeats are sufficient to ensure
>>>>the viability of the association.
> 
> 
> SCTP heartbeats do ensure the viability of the SCTP association.  If an
> application fails, yet does so without closing the association, it is
> simply in error.  I don't see the need for the end operating correctly
> to compensate for the end operating in error.

So that there are never, ever, sustained call failures?  Or at least not 
avoidable ones?

If SIGTRAN hopes to be as reliable as SS7, "the other side is broke, 
it's their problem" is the wrong attitude to have.

(And again: the application is fine.  But one assoc is stuck.)

> Nevertheless, set a lifetime on each sent message.  Abort the
> association if a lifetime expires without being acknowledged by SCTP
> and consult the corrid draft for how to handle messages in transit.
> 
> 
>>>
>>>If you follow ETSI's congestion requirements you will not have a
>>>problem.  (Note that the BEATs will not get through under those
>>>conditions anyway.)
>>
>>Except that the bug has now caused the peer to be permanently congested 
>>(unless there's a failure due to excessive congestion--I haven't checked).
> 
> 
> Which is precisely why congestion should be signalled, so the rest
> of the network can avoid sending to the overloaded component.

But it still misses the point that the application itself is _not_ 
congested.  It could even have a second assoc which is perfectly fine. 
Congestion would tell the network to stop sending to the application 
(meaning few if any calls would go through); that could actually be 
worse than the current problem which, if the application had, say, 2 
(loadshared) assocs would "only" lead to 50% call failure.

A timer guarding the congestion would solve all that, so that's probably 
the way to go.
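
A timer guarding the congested state of a single association could be sketched as follows.  This is only an illustration under stated assumptions: the class name, the thresholds, the guard interval and the returned strings are all hypothetical and come from neither M3UA, SCTP nor ETSI.  The clock is passed in explicitly so that the logic can be exercised without waiting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CongestionGuard:
    """Tracks send-queue depth for one association (all values hypothetical)."""
    onset_depth: int = 100       # queue depth at which congestion is declared
    abate_depth: int = 50        # queue depth at which congestion is cleared
    guard_seconds: float = 5.0   # how long congestion may last before abort
    congested_since: Optional[float] = None

    def update(self, queue_depth: int, now: float) -> str:
        """Return 'ok', 'congested' or 'abort' for the current sample."""
        if self.congested_since is None:
            if queue_depth >= self.onset_depth:
                # Congestion onset: start the guard timer.
                self.congested_since = now
                return "congested"
            return "ok"
        if queue_depth <= self.abate_depth:
            # Queue drained: congestion abated, stop the guard timer.
            self.congested_since = None
            return "ok"
        if now - self.congested_since >= self.guard_seconds:
            # Congestion outlived the guard interval: treat the assoc as stuck.
            return "abort"
        return "congested"
```

On "abort" the caller would presumably reset only the affected association and move its traffic to a healthy one, which keeps the application reachable over the other assoc instead of marking the whole application as congested.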

>>>>What to do?  Should the RFCs recommend (or even require) using BEATs, 
>>>>even when run over SCTP?  (This is just a recommendation change for the 
>>>>RFCs but is a complete reversal for ETSI--but I guess that's not this 
>>>>list's problem.)
>>>
>>>
>>>Just implement congestion control.  Also if you have an alternative ASP
>>>or SGP (e.g. in a loadsharing arrangement) switch over to that.  When
>>>switching traffic, use the proceedures of draft-bidulock-sigtran-corrid
>>>and you will avoid message loss or duplication, even to a non-corrid
>>>aware host (but to use all the corrid procedures you must be able to
>>>send BEATs).
>>
>>I suppose that the peer being marked as congested will certainly raise 
>>alarms throughout the network (which would at least make the problem 
>>visible to everybody, not just the adjacent node) but in my experience, 
>>there are a fair number of bugs which can be automatically (no manual 
>>intervention--which may not arrive for several hours) fixed by simply 
>>resetting the troubled thing.  Unless there is an "excessive congestion" 
>>clause, this won't help.
> 
> 
> So set lifetime and abort the assocation.  But that is surely not only
> implementation dependent but is an operational consideration that has
> not place in the protocol specification.

Hmmm, I disagree.  Reading the specs would lead one to believe that one 
is reasonably safe in letting SCTP manage the viability of the assocs 
(and M3UA's ability to send data on them).  This could lead implementors 
to ignore this potential problem until they run into it.

(But obviously it's not my call; I just thought people here might be 
interested...)

Regards,
-Jeff
Brian F. G. Bidulock | 3 Oct 2005 18:30

Re: SUA: Routing Indicator in SUA RFC3868 and Q.713

varadaraj.yatirajula,

An SCCP User can always specify destination point code, that is
the choices for SCCP are PC+SSN or GT.

An SCCP User never specifies Hostname or SSN+IP Address, only
an SUAP does.  An SUAP is a specialized kind of user that
understands how to specify things like Hostname and SSN+IP
Address.  It has no parallel in any standards specification.
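
As an illustration only, the mapping an SUA layer might apply to an address received from a plain SCCP user can be sketched like this.  The enum values are the ones listed in RFC 3868, Section 3.10.2.1; the function name is hypothetical, and the mapping of "Route on SSN" to "Route on SSN + PC" is an assumption based on the PC+SSN or GT choice described above.  The 0x40 mask assumes the usual ITU convention that bit 1 is the least significant bit of the octet.

```python
from enum import IntEnum


class SuaRoutingIndicator(IntEnum):
    """Routing indicator values listed in RFC 3868, Section 3.10.2.1."""
    RESERVED = 0
    ROUTE_ON_GT = 1
    ROUTE_ON_SSN_PC = 2
    ROUTE_ON_HOSTNAME = 3
    ROUTE_ON_SSN_IP = 4


# Bit 7 of the Q.713 address indicator octet, counting bit 1 as the LSB.
Q713_ROUTING_BIT = 0x40


def sccp_to_sua_routing_indicator(address_indicator: int) -> SuaRoutingIndicator:
    """Map the Q.713 routing bit to an SUA routing indicator (assumed mapping)."""
    if not 0 <= address_indicator <= 0xFF:
        raise ValueError("address indicator must be a single octet")
    if address_indicator & Q713_ROUTING_BIT:
        # Q.713 "Route on SSN" -> SUA "Route on SSN + PC"
        return SuaRoutingIndicator.ROUTE_ON_SSN_PC
    # Q.713 "Route on GT" -> SUA "Route on Global Title"
    return SuaRoutingIndicator.ROUTE_ON_GT
```

Values 3 and 4 never come out of this function, which matches the point that only an SUAP would specify a hostname or SSN + IP address.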

--brian

varadaraj.yatirajula <at> wipro.com wrote:                                 (Mon, 03 Oct 2005 18:40:49)
> 
>    All,
> 
> 
>    The routing indicator values for SCCP and SUA differ:
> 
> 
>    As per ITU Q.713, Sec 3.4.1:
> 
>    Bit  7  of  the  address  indicator octet contains routing information
>    identifying
> 
>    which  address  element  shall  be used for routing, and is encoded as
>    follows:
> 
>    Bit
> 
>    7
> 
>    1 Route on SSN
> 
>    0 Route on GT
> 
> 
>    As per RFC 3868, Section 3.10.2.1. Routing Indicator
> 
>    The following values are valid for the routing indicator:
> 
>          Reserved                      0
> 
>          Route on Global Title         1
> 
>          Route on SSN + PC             2
> 
>          Route on Hostname             3
> 
>          Route on SSN + IP Address     4
> 
> 
> 
>    Question:
> 
>    At  the  SUA  layer  when  we  receive a message from the SCCP user we
>    decode the routing indicator as per Q.713
> 
>    and  when  we  receive  a  message from the SCTP we decode the routing
>    indicator as per RFC 3868.
> 
> 
>    However  in  a  IPSP-IPSP  configuration  how does an SCCP user send a
>    message to the SUA with dest IP Address/Hostname.
> 
>    Since  the  SCCP user can only fill routing indicators(1-Route on SSN/
>    0-Route on GT)? What routing indicator would be used?

> _______________________________________________
> Sigtran mailing list
> Sigtran <at> ietf.org
> https://www1.ietf.org/mailman/listinfo/sigtran

-- 
Brian F. G. Bidulock
bidulock <at> openss7.org
http://www.openss7.org/
Brian F. G. Bidulock | 3 Oct 2005 19:14

Re: recommendations for M3UA/SUA BEATs when running over SCTP

Jeff,

Jeff Morriss wrote:                                                                   (Mon, 03 Oct 2005 10:29:40)
> > 
> > 
> > Because the application is stuck.
> 
> No, the application is fine.  But (one of) the SCTP assocs it is using 
> is stuck (due to a bug or whatever).

Splitting hairs.  Either the SCTP implementation or the application is
broken.  Fix it.

> 
> > 
> > 
> > A similar bug in SS7 would cause problems too.  
> 
> Hmmm, I'm not so sure...
> 
> MTP2 has T7 whose expiry will kill the link for excessive delay of ACK 
> and T6 which will kill the link due to excessive congestion.  SCTP has 
> neither, though it does, as you say, have an optional (and barely 
> documented) data lifetime.  (I'll have to look at how many SCTPs 
> implement that...)

No.  T7 is equivalent to the SCTP RTO.  In your situation the messages
have passed SCTP and been delivered to the user (receive buffer).  The
user is not collecting them.

If an SS7 link acknowledges messages and places them in the RB and the
MTP3 does not collect them and local processor outage is not signalled,
the same thing will occur.  There would be no T7 to expire.  The
implementation might SIB at some point, but that would be implementation
dependent.

Careful and correct design (and, yes, of course, accurate realization)
ensures that this cannot happen.  SS7 Level 2 does not acknowledge
messages until they have actually been delivered to MTP3, not just the
receive buffer.

If you need such an acknowledgement, use CorrelationId/CorrelationId Ack
in M3UA.

> 
> MTP3 has (optional) periodic SLTMs in case those two MTP2 safeguards 
> fail.

SLTM is not used in this way.

> M3UA has (optional) BEATs but doesn't recommend them; neither 
> does it recommend using SCTP's lifetime feature.

BEATs are more useful for flushing data, like COO/COA, CBD/CBA.
See the corrid draft.

> 
> These differences mean, to me, that M3UA (as specified) has a big hole 
> in it (through which someone could drive several hundred thousand call 
> failures).

I disagree.  I also agree with Michael.  BEAT should not be used in this
fashion.

> 
>  > For SS7, validation and
> > interworking testing is performed to ensure that the protocol stacks are
> > not so poorly designed.
> 
> No amount of testing and validation will uncover every single bug in the 
> system.  (And I do hope the above was a typo and you do you know the 
> difference between a bug and a design problem.)

If you know the difference between one kind of defect and another, perhaps
you know the difference between validation testing and ad hoc testing.

> > 
> > So abort it.  Set a lifetime or a buffer threshold to abort.  But your
> > still going to loose all buffered messages or risk duplicating them if
> > you don't follow the corrid procedures.
> 
> I'll look into that.  But should the RFCs be updated to have some kind 
> of recommendation?  I'd hate for other people/implementations to run 
> into the same problem (isn't that part of what recommendations are for?).

That's what this mailing list is for.  I also doubt that anyone else has
your application sticky bug problem.

> 
> >>As per above, we're not talking about congestion nor queuing delay. 
> >>We're talking about a "stuck" assoc.  Those BEATs/SLTMs will _never_ be 
> >>responded to; in MTP3 2 such failures will cause the link to be failed 
> >>(T1.111.7 section 2.2).
> > 
> > 
> > No.  Not while in service, only during activation.  It is optional to
> > even send SLTM while in service.
> 
> Agreed (that's why I specified that I was talking about periodioc 
> SLTMs).  But MTP3 can afford to make that optional since MTP2 has T6 and T7.

Yes, and SCTP provides M3UA with many ways to monitor the health and
performance of an SCTP association.  You can even retrieve the SRTT,
current RTO, congestion window and other values that MTP3 cannot obtain
from MTP2.

At what level of degradation of the association the application wishes
to abort it is an operational consideration that will vary from application
to application, network to network, and operator to operator.

If it is USSD e-mail that is being transferred, I might not care if the
application sticks until Tuesday.  If it is an ISUP call setup, I might
care if it sticks for more than two seconds.

Set your SCTP lifetimes accordingly.
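
A minimal sketch of the idea, simulated in plain Python and not tied to any real SCTP stack: each queued message gets a lifetime chosen by traffic class, and the association is flagged for abort once any unacknowledged message outlives it.  All names are hypothetical.  Only the two-second ISUP figure echoes the example above; the USSD value is invented for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical per-traffic-class lifetimes in seconds.
LIFETIME_SECONDS = {"isup": 2.0, "ussd": 3600.0}


@dataclass
class QueuedMessage:
    traffic_class: str   # key into LIFETIME_SECONDS
    queued_at: float     # time the message entered the send buffer
    acked: bool = False  # set once the transport has acknowledged it


def expired(msg: QueuedMessage, now: float) -> bool:
    """True if the message outlived its lifetime without being acknowledged."""
    lifetime = LIFETIME_SECONDS[msg.traffic_class]
    return (not msg.acked) and (now - msg.queued_at > lifetime)


def should_abort(send_buffer: List[QueuedMessage], now: float) -> bool:
    """True if any message in the send buffer has expired."""
    return any(expired(m, now) for m in send_buffer)
```

The point of keeping the table per traffic class is that the abort threshold stays an operational setting rather than something fixed by the protocol.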

> 
> > I remember in the early days of SS7 the switch line modules used to
> > nicely send an echo SLTMs even though MTP on the front-end was dead.
> > Sending heartbeats does not cure anything.  Proper design of the
> > application does.
> 
> Sounds like that MTP was poorly designed; there's no point in having 
> health checks if they're responded to by the wrong layer.

You do not understand the SLTM.  It is not intended to be used to check
the health of MTP3.  It is intended to be used to check for digital
cross connect and loopback problems.

If you ever worked with SS7 you might know that an MTP Level 2 link will
align nicely against itself if you loop back transmit onto receive.  The
mistaken presence of a loopback on an SS7 circuit could endanger an STP
or SP aligning links, particularly during emergencies where a
significant level of traffic would otherwise be applied to the link.

The SLTM detects loopbacks on links and is performed just after
alignment and will fail the link if a loopback is detected.

Another common cross-connect problem is attaching the transmit of one
SS7 link to the receive of the other.  Another is connecting the wrong
links in a linkset to each other (that will cause problems when a
changeover, changeback or other link management is performed).  SLTM
detects these cross-connect problems too.
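
The kind of check involved can be sketched as below.  This is a simplification for illustration, not the signalling link test procedure itself: the field names, the order of the checks and the result strings are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkTestMessage:
    opc: int        # originating point code
    dpc: int        # destination point code
    slc: int        # signalling link code
    pattern: bytes  # test pattern carried in the message


def check_reply(sent: LinkTestMessage, reply: LinkTestMessage) -> str:
    """Classify what came back on the link after sending a test message."""
    if reply.opc == sent.opc:
        # Our own message came back: transmit is looped onto receive.
        return "loopback"
    if reply.opc != sent.dpc or reply.dpc != sent.opc:
        # The reply came from, or was addressed to, an unexpected node.
        return "wrong-node"
    if reply.slc != sent.slc:
        # Right node, but a different link of the linkset answered.
        return "wrong-link"
    if reply.pattern != sent.pattern:
        return "bad-pattern"
    return "ok"
```

None of these checks says anything about whether the layers above are servicing their queues; they only confirm that the far end of this circuit is the expected link on the expected node.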

SLTM could be performed periodically as well to detect cross-connect
problems that occurred without causing the link to fail, but, because
it is very unlikely that a cross-connect change would leave the SS7 link
standing, taking the link out of service due to a periodic SLTM is not
done.  It is more likely that the SLTM just got delayed.  If a
cross-connect change is made while the link is in service, it will
fail and the cross-connect problem will be discovered by the SLTM if
the link can still successfully align.  Most switches only generate a
minor alarm condition if a periodic SLTM fails.

Your insistence on BEAT is similar.  If your M3UA data messages are
stuck between M3UA and the application, BEAT will not detect that,
because it will be responded to below the blockage.  If the blockage is
below that, the BEAT will also be delayed.

Use SCTP lifetimes or the other mechanisms afforded by SCTP to monitor
that the grade of service of the association meets your expectations.

> 
> So that there are never, ever, sustained call failures?  Or at least not 
> avoidable ones?

What you consider an avoidable failure by resetting an association, a
correct implementation sees as an unnecessary failure (of the
association).

> 
> If SIGTRAN hopes to be as reliable as SS7, "the other side is broke, 
> it's their problem" is the wrong attitude to have.

But that is SS7's attitude.  Attitude and reliability, it seems, have
nothing to do with each other.

> > 
> > Which is precisely why congestion should be signalled, so the rest
> > of the network can avoid sending to the overloaded component.
> 
> But it still misses the point that the application itself is _not_ 
> congested.

Yes it is.  Due to a bug maybe, but it is congested.

> > 
> > So set lifetime and abort the assocation.  But that is surely not only
> > implementation dependent but is an operational consideration that has
> > not place in the protocol specification.
> 
> Hmmm, I disagree.  Reading the specs would lead one to believe that one 
> is reasonably safe in letting SCTP manage the viability of the assocs 
> (and M3UA's ability to send data on them).  This could lead implementors 
> to ignore this potential problem until they run into it.

Read the SCTP Applicability RFC.

--brian

-- 
Brian F. G. Bidulock
bidulock <at> openss7.org
http://www.openss7.org/
Haresign Lincoln | 3 Oct 2005 19:49

RE: recommendations for M3UA/SUA BEATs when running over SCTP

To whom it may concern:

I believe Jeff has a valid point.  "Either the SCTP implementation or the
application is broken.  Fix it." is not really a valid argument.

When you deploy in the field against other people's switches, you
cannot fix their switches.  And if there is a scenario that can lead to a
stuck condition because one side is badly behaved, then we should add
protection to the implementation.

In our implementation, we are taking down the associations if the other
side is misbehaving.  We are operating outside the specification in this
case.  But I don't think any customers will complain when we explain to
them that we are protecting against lost revenue.

I agree that we should modify the recommendations to protect against all
bad scenarios.  The ITU/ANSI specifications have undergone 25 years of
editing to improve them.  However, if you fail to get agreement, I
would operate outside the recommendation in this case.  You won't find
any interoperability problems and you will keep your customers happy.

Regards,
Lincoln

-----Original Message-----
From: sigtran-bounces <at> ietf.org [mailto:sigtran-bounces <at> ietf.org] On
Behalf Of Brian F. G. Bidulock
Sent: Monday, October 03, 2005 1:15 PM
To: Jeff Morriss
Cc: sigtran <at> ietf.org
Subject: Re: [Sigtran] recommendations for M3UA/SUA BEATs when running
overSCTP

Jeff,

Jeff Morriss wrote:
(Mon, 03 Oct 2005 10:29:40)
> > 
> > 
> > Because the application is stuck.
> 
> No, the application is fine.  But (one of) the SCTP assocs it is using

> is stuck (due to a bug or whatever).

Splitting hairs.  Either the SCTP implementation of appplication are
broken.  Fix them.

> 
> > 
> > 
> > A similar bug in SS7 would cause problems too.  
> 
> Hmmm, I'm not so sure...
> 
> MTP2 has T7 whose expiry will kill the link for excessive delay of ACK

> and T6 which will kill the link due to excessive congestion.  SCTP has

> neither, though it does, as you say, have an optional (and barely
> documented) data lifetime.  (I'll have to look at how many SCTPs 
> implement that...)

No.  T7 is equivalent to the SCTP RTO.  In your situation the messages
have passed SCTP and been delivered to the user (receive buffer).  The
user is not collecting them.

If an SS7 link acknowledges messages and places them in the RB and the
MTP3 does not collect them and local processor outage is not signalled,
the same thing will occur.  There would be no T7 to expire.  The
implementation might SIB at some point, but that would be implementation
dependent.

Careful and correct design (and, yes, of course, accurate realization)
ensures that this cannot happen.  SS7 Level 2 does not acknowledge
message until they have actually been delivered to MTP3, not just the
receive buffer.

If you need such an acknowegement, use CorrelationId/CorrelationId Ack
in M3UA.

> 
> MTP3 has (optional) periodic SLTMs in case those two MTP2 safeguards 
> fail.

SLTM is not used in this way.

> M3UA has (optional) BEATs but doesn't recommend them; neither does it 
> recommend using SCTP's lifetime feature.

BEATs are more useful for flushing data, like COO/COA, CBD/CBA.
See the corrid draft.

> 
> These differences mean, to me, that M3UA (as specified) has a big hole

> in it (through which someone could drive several hundred thousand call

> failures).

I disagree.  I also agree with Michael.  BEAT should not be used in this
fashion.

> 
> > For SS7, validation and interworking testing is performed to ensure
> > that the protocol stacks are not so poorly designed.
> 
> No amount of testing and validation will uncover every single bug in
> the system.  (And I do hope the above was a typo and you do know
> the difference between a bug and a design problem.)

If you know the difference between one kind of defect and another,
perhaps you know the difference between validation testing and ad hoc
testing.

> > 
> > So abort it.  Set a lifetime or a buffer threshold to abort.  But
> > you're still going to lose all buffered messages or risk duplicating
> > them if you don't follow the corrid procedures.
> 
> I'll look into that.  But should the RFCs be updated to have some kind
> of recommendation?  I'd hate for other people/implementations to run
> into the same problem (isn't that part of what recommendations are
> for?).

That's what this mailing list is for.  I also doubt that anyone else has
your application sticky bug problem.

> 
> >>As per above, we're not talking about congestion nor queuing delay.
> >>We're talking about a "stuck" assoc.  Those BEATs/SLTMs will _never_
> >>be responded to; in MTP3 2 such failures will cause the link to be
> >>failed (T1.111.7 section 2.2).
> > 
> > 
> > No.  Not while in service, only during activation.  It is optional 
> > to even send SLTM while in service.
> 
> Agreed (that's why I specified that I was talking about periodic
> SLTMs).  But MTP3 can afford to make that optional since MTP2 has T6
> and T7.

Yes, and SCTP provides M3UA with many ways to monitor the health and
performance of an SCTP association.  You can even retrieve the SRTT,
current RTO, congestion window and other values that MTP3 cannot obtain
from MTP2.

At what level of degradation of the association the application wishes
to abort it is an operational consideration that will vary from
application to application, network to network, and operator to
operator.

If it is USSD e-mail that is being transferred, I might not care if the
application sticks until Tuesday.  If it is an ISUP call setup, I might
care if it sticks for more than two seconds.

Set your SCTP lifetimes accordingly.
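
To make the point above concrete, here is a minimal sketch (Python, not
taken from any stack) of choosing a lifetime per traffic class and
deciding when a stuck association has outlived it.  The class names,
the numbers and the helper names lifetime_for/should_abort are all
assumptions for illustration only; real values are an operational
choice.

```python
# Hypothetical per-traffic-class lifetimes in milliseconds.  The numbers
# are illustrative only and are not taken from any specification.
LIFETIME_MS = {
    "ISUP": 2_000,               # call setup: give up quickly
    "USSD": 24 * 3600 * 1000,    # delay-tolerant traffic: be patient
}
DEFAULT_LIFETIME_MS = 10_000     # assumed fallback for unlisted classes


def lifetime_for(traffic_class):
    """Return the configured lifetime for a traffic class."""
    return LIFETIME_MS.get(traffic_class, DEFAULT_LIFETIME_MS)


def should_abort(queued, now_ms):
    """Decide whether the association should be aborted.

    queued is an iterable of (traffic_class, enqueue_time_ms) pairs for
    messages still waiting to be sent.  The association is considered
    unacceptable as soon as one message has waited past its lifetime.
    """
    return any(now_ms - enqueued > lifetime_for(traffic_class)
               for traffic_class, enqueued in queued)
```

With these assumed values, an ISUP message queued for 2.5 seconds
triggers an abort while a USSD message of the same age does not.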

> 
> > I remember in the early days of SS7 the switch line modules used to
> > nicely echo SLTMs even though MTP on the front-end was dead.
> > Sending heartbeats does not cure anything.  Proper design of the 
> > application does.
> 
> Sounds like that MTP was poorly designed; there's no point in having 
> health checks if they're responded to by the wrong layer.

You do not understand the SLTM.  It is not intended to be used to check
the health of MTP3.  It is intended to be used to check for digital
cross connect and loopback problems.

If you ever worked with SS7 you might know that an MTP Level 2 link will
align nicely against itself if you loop back transmit onto receive.  The
mistaken presence of a loopback on an SS7 circuit could endanger an STP
or SP aligning links, particularly during emergencies where a
significant level of traffic would otherwise be applied to the link.

The SLTM detects loopbacks on links and is performed just after
alignment and will fail the link if a loopback is detected.

Another common cross-connect problem is attaching the transmit of one
SS7 link to the receive of the other.  Another is connecting the wrong
links in a linkset to each other (that will cause problems when a
changeover, changeback or other link management is performed).  SLTM
detects these cross-connect problems too.

SLTM could be performed periodically as well to detect cross-connect
problems that occurred without causing the link to fail, but, because
it is very unlikely that a cross-connect change would leave the SS7 link
standing, taking the link out of service due to a periodic SLTM is not
done.  It is more likely that the SLTM just got delayed.  If a
cross-connect change is made while the link is in service, it will fail
and the cross-connect problem will be discovered by the SLTM if the link
can still successfully align.  Most switches only generate a minor alarm
condition if a periodic SLTM fails.

Your insistence on BEAT is similar.  If your M3UA data messages are
stuck between M3UA and the application, BEAT will not detect that,
because it will be responded to below the blockage.  If the blockage is
below that, the BEAT will also be delayed.
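
A toy model (Python, a sketch only) of the first case: the BEAT is
answered by the M3UA layer itself, so it keeps succeeding while DATA
piles up unread above it.  The class M3uaEndpoint and its fields are
hypothetical names invented for this illustration and do not describe
any real implementation.

```python
from collections import deque


class M3uaEndpoint:
    """Toy receiver: M3UA layer with an application sitting above it."""

    def __init__(self, app_is_reading=True):
        self.app_is_reading = app_is_reading
        self.to_application = deque()   # queue between M3UA and the user
        self.delivered = []             # what the application actually read

    def receive(self, msg_type, payload=None):
        if msg_type == "BEAT":
            # Answered by the layer itself, below the application queue,
            # echoing the heartbeat data back to the sender.
            return ("BEAT_ACK", payload)
        if msg_type == "DATA":
            self.to_application.append(payload)
            if self.app_is_reading:
                self.delivered.append(self.to_application.popleft())
            return None
        raise ValueError(f"unknown message type: {msg_type}")
```

With app_is_reading=False the endpoint still returns a BEAT_ACK for
every BEAT, although nothing ever reaches the application.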

Use SCTP lifetimes or the other mechanisms afforded by SCTP to monitor
that the grade of service of the association meets your expectations.

> 
> So that there are never, ever, sustained call failures?  Or at least 
> not avoidable ones?

What you consider an avoidable failure by resetting an association, a
correct implementation sees as an unnecessary failure (of the
association).

> 
> If SIGTRAN hopes to be as reliable as SS7, "the other side is broke, 
> it's their problem" is the wrong attitude to have.

But that is SS7's attitude.  Attitude and reliability it seems have
nothing to do with each other.

> > 
> > Which is precisely why congestion should be signalled, so the rest 
> > of the network can avoid sending to the overloaded component.
> 
> But it still misses the point that the application itself is _not_ 
> congested.

Yes it is.  Due to a bug maybe, but it is congested.

> > 
> > So set lifetime and abort the association.  But that is surely not
> > only implementation dependent but is an operational consideration
> > that has no place in the protocol specification.
> 
> Hmmm, I disagree.  Reading the specs would lead one to believe that 
> one is reasonably safe in letting SCTP manage the viability of the 
> assocs (and M3UA's ability to send data on them).  This could lead 
> implementors to ignore this potential problem until they run into it.

Read the SCTP Applicability RFC.

--brian

--
Brian F. G. Bidulock
bidulock <at> openss7.org
http://www.openss7.org/

_______________________________________________
Sigtran mailing list
Sigtran <at> ietf.org
https://www1.ietf.org/mailman/listinfo/sigtran

Michael Tuexen | 3 Oct 2005 20:11

Re: recommendations for M3UA/SUA BEATs when running over SCTP

Hi Jeff,

see my comments in-line.

Best regards
Michael

On Oct 3, 2005, at 16:12, Jeff Morriss wrote:

>
> Hi Michael,
>
> Michael Tuexen wrote:
>> Hi Jeff,
>> if the receiving SCTP endpoint gives you an a_rwnd = 0 the
>> application is too slow to handle all traffic. Sending application
>> level HB does not help here, it hurts because it puts even more
>> messages on the receiver.
>
> Actually the application is fine.  In fact I believe, in this case, it 
> has a second assoc which is in good shape.  It's just got one assoc 
> which is "stuck."
This means that you have a broken SCTP implementation. What does it
mean in particular: Does the SCTP SACK everything but not deliver
things? Does it stop SACKing?
>
>> I think the sender has to supervise its sending queue and start some
>> application level flow control mechanisms like sending SIBs to its
>> local user in the SS7 world.
>> So I think not sending these messages AND supervising the send queue
>> is the way to go.
>
> The problem then becomes one of detecting when congestion gets to be 
> "too much."  I guess a timer is needed.
As far as I understand things, I would supervise the SCTP send queue
and depending on the size perform similar actions as in the SS7 case.
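
A minimal sketch of such send-queue supervision (Python, illustration
only), using two thresholds so that the congestion indication does not
flap.  The class SendQueueSupervisor, the onset/abatement parameters
and the returned event names are assumptions made for this example; how
the queue size is obtained from the SCTP stack is left open.

```python
class SendQueueSupervisor:
    """Track the send queue size and report congestion transitions."""

    def __init__(self, onset, abatement):
        if not 0 <= abatement < onset:
            raise ValueError("need 0 <= abatement < onset")
        self.onset = onset            # queue size that starts congestion
        self.abatement = abatement    # queue size that ends congestion
        self.congested = False

    def update(self, queued_bytes):
        """Return 'congestion_onset', 'congestion_abated' or None."""
        if not self.congested and queued_bytes >= self.onset:
            self.congested = True
            return "congestion_onset"
        if self.congested and queued_bytes <= self.abatement:
            self.congested = False
            return "congestion_abated"
        return None
```

The caller would react to 'congestion_onset' by throttling or
signalling its local user, and could start a timer at that point if it
also wants to abort an association that never abates.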
>
> Regards,
> -Jeff
>
>> On Sep 30, 2005, at 20:41, Jeff Morriss wrote:
>>>
>>> Hi list,
>>>
>>> Neither the M3UA nor the SUA RFC recommends sending BEATs when used 
>>> over SCTP.  ETSI goes a bit further and precludes the use of BEATs.
>>>
>>> However, SCTP (in particular, the I-G) allows the association to get 
>>> "stuck" such that it will pass no data: a receiver is allowed to 
>>> hold its window closed "for an indefinite time" (new text for 
>>> section 6.1 A).
>>>
>>> I have encountered a number of problems in peer SCTPs which have 
>>> caused those peers to close their windows and keep them closed 
>>> indefinitely. This leads to very serious traffic loss since M3UA 
>>> will continue to queue messages to the "stuck" association until the 
>>> queues fill up and/or overflow.
>>>
>>> Using an end-to-end health check mechanism (such as M3UA or SUA 
>>> BEATs) solves this problem pretty nicely (similar to the way 
>>> periodic SLTMs do in MTP3).  However, I suspect many M3UA/SUA 
>>> implementors may not implement or may not turn on (by default) BEATs 
>>> under the (false, due to the reasons listed above) pretense that 
>>> SCTP heartbeats are sufficient to ensure the viability of the 
>>> association.
>>>
>>> What to do?  Should the RFCs recommend (or even require) using 
>>> BEATs, even when run over SCTP?  (This is just a recommendation 
>>> change for the RFCs but is a complete reversal for ETSI--but I guess 
>>> that's not this list's problem.)
>>>
>>> Regards,
>>> -Jeff
>>>
>
