Content
1 Overview
1.1 Topic and Goal of Lesson
1.2 Exercises
1.3 Audience and Preconditions
2 The best way to proceed
2.1 Recommendation of the Lecturer
2.2 Schedule and Timing
3 Event State Publication
3.1 The state publication model
3.2 Protocol overview
3.3 Publish framework applied for presence
4 Event Packages
4.1 Presence Event Package
4.1.1 Package definition
4.1.2 Presence information
4.2 Watcher Information Event Template-Package
4.3 An INVITE-Initiated Dialog Event Package for SIP
4.4 Further event packages
4.4.1 Message Summary and Message Waiting Indication Event
4.4.2 Event Package for Conference State
4.4.3 Event Package for Registrations
4.4.4 Refer event
4.4.5 Debug Event
5 The UPDATE method
6 Resource Management
6.1 Protocol overview
6.2 SDP parameters and attributes
6.3 Option Tag
7 Third Party Session Control
8 REFER Method
8.1 Referred-By header field
8.2 Replaces header field
9 Conferencing
9.1 Tightly Coupled SIP Conference
9.1.1 Creation of an Ad-hoc conference
9.1.2 Immediate Conference creation with a URI list
9.1.3 Floor Control
9.2 Decentralized Conferencing
9.3 Joining a conference
9.4 Join header field
10 SIP Based Messaging
10.1 Page Mode Instant Messaging
10.2 Session Mode Instant Messaging with MSRP
11 INFO method
12 Service Configuration
12.1 Overview on XML
12.2 The XML Configuration Access Protocol (XCAP)
12.2.1 XCAP Overview
12.2.2 XCAP Application usage
12.2.3 XCAP URIs
12.2.4 Entity Tags and conditional operations
12.2.5 Subscriptions to changes in XML documents
13 NAT and Firewall Traversal
13.1 Network Address Translation
13.2 Firewalls
13.3 Problems caused by NAT and Firewall Traversal
13.4 SIP Protocol Enhancements
13.4.1 Symmetric Response Routing
13.4.2 Symmetric RTP/RTCP
13.4.3 RTCP attribute in SDP
13.5 Classical NAT and FW Traversal Solutions
13.5.1 NAT and FW categorisation
13.5.2 (Classic) STUN protocol
13.6 The perfect NAT and FW Traversal Solution
13.6.1 NAT and FW Behavior Requirements
13.6.2 The new STUN protocol
13.6.3 Traversal Using Relays around NAT (TURN)
13.6.4 Interactive Connectivity establishment
13.6.5 Client initiated connections
13.7 External and proprietary Solutions
13.7.1 Application Layer Gateways
13.7.2 UPnP
13.7.3 Skype
13.7.4 SIP Express Router
14 Session Timer
15 Caller Preferences and UA Capabilities
15.1 User Agent Capabilities
15.1.1 Feature tags
15.1.2 Expression of capabilities
15.2 Caller Preferences
15.2.1 Feature preferences
15.2.2 Request handling preferences
16 Global Routable User URI (GRUU)
17 Identity Management
18 ENUM
19 Privacy Mechanism
20 Reason
21 Path
22 Service-Route
23 Request History
24 SIP-Connected-Id
25 Questions
1 Overview
1.1 Topic and Goal of Lesson
The main topic of the lecture is a basic understanding of the Session Initiation Protocol (SIP). This protocol has its origin in Internet standardization but was later also accepted by traditional network operators as the basis for a modern IP based network architecture. This is the second lecture note on SIP. It builds on the content of the first lecture note (basic SIP protocol) and covers some of the most important protocol extensions to SIP, with a specific view to the application of SIP in commercial operator networks including IMS [1]. At the end of the lesson the student will have a good understanding of SIP and will be able to analyze SIP message flows, find bugs and identify misbehavior in implementations. Ideally the lesson is accompanied by practical exercises in the lab using, e.g., open source implementations of SIP servers [2] and free SIP clients. A VMware [3] image is always available at the University institute, which enables the student to verify and deepen the basic knowledge of SIP by running a SIP server on his or her own notebook computer. The lecture also encourages the interested student to consult the RFCs in certain situations, both to get first-hand information on the details of the protocol and to become acquainted with reading an RFC.
1.2 Exercises
The last chapter of the lecture note includes a list of questions on each chapter which the student should be able to answer after the lecture. These questions are also a good basis for preparing for the final examination.
[1] IMS = IP Multimedia Subsystem; a SIP based network architecture used by fixed and mobile network operators
[2] Examples of open source SIP servers are: Kamailio - the Open Source SIP Server at http://www.kamailio.org/; the OpenSIPS Project at http://www.opensips.org/; SER - SIP Express Router (the mother of the above projects) at http://www.iptel.org/
[3] VMware: a software virtualization product to run, e.g., a GNU/Linux server on a notebook computer on top of Windows
[Figure 1: Mutual subscription to presence state information (users shown: David, Frank, Alice, Bob, Charly, Esther)]

To overcome the scalability issue a framework for publishing event states [5] has been defined. Without this extension a resource has to send all NOTIFY requests itself and will probably run into performance problems when the group of watchers becomes large. The event publishing framework enables an event publication agent (EPA) to publish its state changes to an event state compositor (ESC), which aggregates the state from the various EPAs of a resource. The aggregated state is offered to a state agent, which acts on behalf of the resource, processes SUBSCRIBE requests from watchers and sends NOTIFY requests in response. The resources themselves never receive any of the SUBSCRIBE requests. Thus the task of sending NOTIFY requests is delegated to a state agent, which is implemented on a powerful server. A further advantage of the publishing mechanism is that the state agent may correlate the state information of a distributed resource and compose it into a single NOTIFY request, which reduces network traffic.

[4] RFC 3265: SIP-Specific Event Notification
[5] RFC 3903: SIP Extension for Event State Publication
[Figure 2: State publication model (elements shown: PUBLISH requests towards the event state compositor; SUBSCRIBE and NOTIFY exchanged with Watcher 1 and Watcher 2)]

In this example the event state compositor receives state information about a distributed resource, aggregates this information and offers it to a state agent. The state agent acts on behalf of the resources, receives the SUBSCRIBE requests of watchers and sends NOTIFY requests when the (composite) state changes.

As a practical example we may apply this concept to the well-known presence state. This means: A resource (the presence state of a person [6]) may use three user agents which publish presence state (e.g. notebook, PDA and phone). Each of the user agents publishes its current state (on-line/off-line) to the event state compositor using a PUBLISH request.

[6] For the presence entity of a person the term presentity has also been defined.
The event state compositor aggregates the presence state information and offers the composite state to the state agent. When a watcher requests presence state information for a person, it sends a SUBSCRIBE request, which is forwarded to the state agent. The watcher then receives the composite state information in a single NOTIFY request sent by the state agent.
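The aggregation step can be illustrated with a small sketch. It is not taken from RFC 3903; the class name, the method names and the composition policy (the person counts as on-line if at least one user agent is on-line) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EventStateCompositor:
    """Toy ESC: keeps the last published state per EPA and composes one view."""
    published: dict = field(default_factory=dict)  # EPA name -> "on-line"/"off-line"

    def publish(self, epa: str, state: str) -> None:
        # Each user agent (EPA) overwrites only its own entry.
        self.published[epa] = state

    def composite(self) -> dict:
        # Hypothetical composition policy: the person is reachable
        # if at least one user agent reports "on-line".
        reachable = any(s == "on-line" for s in self.published.values())
        return {"state": "on-line" if reachable else "off-line",
                "devices": dict(self.published)}


esc = EventStateCompositor()
esc.publish("notebook", "off-line")
esc.publish("pda", "off-line")
esc.publish("phone", "on-line")
print(esc.composite()["state"])
```

A state agent would place the result of composite() into a single NOTIFY body instead of sending one notification per user agent.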
The state publication model may be applied to any event for which an event package has been defined. A further example is the message waiting event. The state resources in this case are the different mailboxes (e-mail, voice-mail, etc.) of a user. These resources send PUBLISH requests whenever, e.g., a new message arrives. The event state compositor aggregates the state information and the state agent sends the composite state of all mailboxes in a NOTIFY request. Figure 3 shows an example message flow of combined SUBSCRIBE, PUBLISH and NOTIFY operations. Details of the message flow are explained in the next chapter.
[Figure 3: message flow diagram (labels shown: Watcher; Initial publication; State refresh; State modification)]
SIP-ETag: This header field is generated by the ESC and contains an entity tag. Whenever an ESC receives a PUBLISH request it marks its current state with a SIP-ETag value and returns this value in a 200 (OK) response. The entity tag is then used by the EPA to distinguish an initial state publication from refreshes and modifications.

SIP-If-Match: The EPA re-uses the latest SIP-ETag value received from the ESC and repeats that value in a new PUBLISH request. The first (initial) PUBLISH request of an EPA does not contain a SIP-If-Match header field.
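The different publication operations are distinguished by the presence of the SIP-If-Match header field, the presence of a message body and the value of the Expires header field, according to the table below (as defined in RFC 3903).

Operation   Message body   SIP-If-Match   Expires value
Initial     yes            no             > 0
Refresh     no             yes            > 0
Modify      yes            yes            > 0
Remove      no             yes            0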
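The three criteria can be turned into a small classifier. The following is only a sketch: the function name and its parameters are hypothetical, and the removal case (no body, Expires value of 0) follows RFC 3903 rather than the figures of this chapter. Combinations that the rules do not cover yield None.

```python
from typing import Optional


def classify_publish(sip_if_match: Optional[str], has_body: bool,
                     expires: int) -> Optional[str]:
    """Return the publication operation implied by a PUBLISH request."""
    if sip_if_match is None:
        # Without an entity tag only an initial publication is possible.
        return "initial" if (has_body and expires > 0) else None
    if expires == 0:
        # Removal of previously published state carries no body.
        return "remove" if not has_body else None
    # Entity tag present and Expires > 0: the body decides.
    return "modify" if has_body else "refresh"


print(classify_publish(None, True, 3600))
print(classify_publish("dx200xyz", False, 1800))
print(classify_publish("kwj449x", True, 1800))
```

The three calls correspond to the initial publication, the refresh and the modification shown in the figures of this chapter.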
If the entity tag in the SIP-If-Match header field of a PUBLISH request does not contain the expected value, the ESC rejects the request with the failure response 412 (Conditional Request Failed). This is a new failure response code defined by RFC 3903.
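The entity tag check on the ESC side might be sketched as follows. This is a simplified illustration under stated assumptions: the class EtagStore, its methods and the tag format are hypothetical, state expiry is not modelled, and a real ESC chooses its own opaque tag values.

```python
import itertools


class EtagStore:
    """Toy ESC bookkeeping for one resource: valid entity tag -> stored state."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._bodies = {}

    def _store(self, body):
        # Every accepted PUBLISH gets a fresh entity tag (hypothetical format).
        tag = f"etag-{next(self._counter)}"
        self._bodies[tag] = body
        return 200, tag

    def initial(self, body):
        """Initial publication: no SIP-If-Match, body required."""
        return self._store(body)

    def update(self, sip_if_match, body=None):
        """Refresh (body is None) or modification (body given)."""
        if sip_if_match not in self._bodies:
            return 412, None  # Conditional Request Failed
        old = self._bodies.pop(sip_if_match)
        return self._store(old if body is None else body)


store = EtagStore()
code, tag1 = store.initial("open")
code, tag2 = store.update(tag1)    # refresh: accepted, new tag issued
print(store.update(tag1))          # stale tag is rejected
```

Because every accepted request invalidates the previous tag, an EPA that repeats an old SIP-If-Match value is rejected and has to start again with an initial publication.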
More details on the presence event package can be found in chapter 4.1 on page 15. Figure 4 shows the content of an initial PUBLISH request and the 200 (OK) response for a presence event. The actual state information for presence within the message body is not shown in this figure. It is an XML formatted Presence Information Data Format (PIDF) document.
PUBLISH sip:alice@example.com SIP/2.0
Via: SIP/2.0/UDP pua.example.com;branch=z9hG4bK652hsge
To: <sip:alice@example.com>
From: <sip:alice@example.com>;tag=1234wxyz
Call-ID: 81818181@pua.example.com
CSeq: 1 PUBLISH
Max-Forwards: 70
Expires: 3600
Event: presence
Content-Type: application/pidf+xml
Content-Length: ...

[Published PIDF document]

SIP/2.0 200 OK
Via: SIP/2.0/UDP pua.example.com;branch=z9hG4bK652hsge;received=192.0.2.3
To: <sip:alice@example.com>;tag=1a2b3c4d
From: <sip:alice@example.com>;tag=1234wxyz
Call-ID: 81818181@pua.example.com
CSeq: 1 PUBLISH
SIP-ETag: dx200xyz
Expires: 1800

Figure 4: PUBLISH initial state publication

The initial PUBLISH request does not include a SIP-If-Match header field, but the 200 (OK) response contains a SIP-ETag header field as expected. The example also shows that the presence server has reduced the Expires header field value in the 200 (OK) response from 3600 to 1800 (seconds).

The next figure (Figure 5) shows a state refresh cycle when the presence user agent (PUA) determines that the previously published state of Figure 4 is about to expire. The PUBLISH request now uses the value of the previously received SIP-ETag in the SIP-If-Match header field. It does not contain a message body because the state did not change. In the 200 (OK) response the presence server inserts a new SIP-ETag value. As no state change has occurred, the presence server in this case does not send any NOTIFY requests (refer also to Figure 3).
PUBLISH sip:alice@example.com SIP/2.0
Via: SIP/2.0/UDP pua.example.com;branch=z9hG4bK771ash02
To: <sip:alice@example.com>
From: <sip:alice@example.com>;tag=1234kljk
Call-ID: 98798798@pua.example.com
CSeq: 1 PUBLISH
Max-Forwards: 70
SIP-If-Match: dx200xyz
Expires: 1800
Event: presence
Content-Length: 0

SIP/2.0 200 OK
Via: SIP/2.0/UDP pua.example.com;branch=z9hG4bK771ash02;received=192.0.2.3
To: <sip:alice@example.com>;tag=2affde434
From: <sip:alice@example.com>;tag=1234kljk
Call-ID: 98798798@pua.example.com
CSeq: 1 PUBLISH
SIP-ETag: kwj449x
Expires: 1800

Figure 5: PUBLISH state refresh

Figure 6 shows the situation of a state change of a presence user agent. When the PUA detects a change of state it sends a PUBLISH request with the updated state information in the message body. The SIP-If-Match header field again refers to the last received entity tag value.
PUBLISH sip:alice@example.com SIP/2.0
Via: SIP/2.0/UDP pua.example.com;branch=z9hG4bKcdad2
To: <sip:alice@example.com>
From: <sip:alice@example.com>;tag=54321mm
Call-ID: 5566778@pua.example.com
CSeq: 1 PUBLISH
Max-Forwards: 70
SIP-If-Match: kwj449x
Expires: 1800
Event: presence
Content-Type: application/pidf+xml
Content-Length: ...

[Published PIDF Document]

SIP/2.0 200 OK
Via: SIP/2.0/UDP pua.example.com;branch=z9hG4bKcdad2;received=192.0.2.3
To: <sip:alice@example.com>;tag=effe22aa
From: <sip:alice@example.com>;tag=54321mm
Call-ID: 5566778@pua.example.com
CSeq: 1 PUBLISH
SIP-ETag: qwi982ks
Expires: 1800
4 Event Packages
The event notification framework of RFC 3265 is only the framework for implementing event handling. Every specific event has to be specified in a separate RFC as a so-called event package. An event package is a specific instantiation of the event notification framework: it defines the specific event and its characteristics, such as the name of the event, the message bodies of NOTIFY and SUBSCRIBE, state information etc. [7] This chapter introduces some frequently used event packages. A current list of the event packages specified so far (with references to their specific RFCs) can be found at IANA, the Internet Assigned Numbers Authority [8]. Currently 20 event packages are specified and some others are under discussion in various IETF working groups.
XML document server (XDMS). Centralised XML documents can be provisioned by the XCAP [9] protocol. The authorisation rules may define different levels of authorisation, so that not every watcher gets the same amount of information. Geographic location information, for example, can be part of the presence state but will perhaps not be offered to everybody.

If an authorisation cannot be resolved immediately via policy rules, the SUBSCRIBE request is answered with a 202 (Accepted) response and the Subscription-State header field of the subsequent NOTIFY request is set to pending. This NOTIFY request (which must always be sent when a SUBSCRIBE has been received) contains only neutral or dummy state information. The owner of the presentity may be notified of the new authorisation request by subscribing to the watcher information event on its own presence. The watcher information event is a kind of meta event which can be applied to every event package. Further details on the watcher information event can be found in chapter 4.2 on page 18.

Message Bodies

The SUBSCRIBE request may contain a message body describing filter information. A filter may reduce the amount of state information to the specific aspect in which the watcher is interested, e.g. the possibility to send instant messages. The NOTIFY request contains the presence state information, the format of which may have different levels. It may be simple PIDF based information or be enriched by various extensions (see next chapter). In any case the format of the NOTIFY message body must match a format the watcher is able to understand (Accept header field of the SUBSCRIBE request).
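The handling of an unresolved authorisation described above can be sketched as a small decision function. All names are hypothetical, the dummy value is an assumption, and the 200 (OK) with an active subscription for an already authorised watcher follows the general SUBSCRIBE/NOTIFY framework rather than this section.

```python
NEUTRAL_STATE = "closed"  # hypothetical dummy value sent while authorisation is pending


def handle_subscribe(watcher: str, authorised: set, presence_state: str) -> dict:
    """Decide the response to a presence SUBSCRIBE and the content of the first NOTIFY."""
    if watcher in authorised:
        return {"response": 200, "subscription_state": "active",
                "notify_state": presence_state}
    # No policy decision yet: accept, but reveal nothing about the presentity.
    return {"response": 202, "subscription_state": "pending",
            "notify_state": NEUTRAL_STATE}


print(handle_subscribe("sip:A@example.com", set(), "open"))
```

In this sketch the pending watcher receives the same neutral value regardless of the real presence state, so the first NOTIFY leaks no information.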
[9] RFC 4825: The XML Configuration Access Protocol (XCAP)
[10] RFC 3863: Presence Information Data Format (PIDF)
[11] RFC 3859: Common Profile for Presence (CPP)
<?xml version="1.0" encoding="UTF-8"?>
<presence xmlns="urn:ietf:params:xml:ns:pidf"
          entity="pres:someone@example.com">
  <tuple id="sg89ae">
    <status>
      <basic>open</basic>
    </status>
    <contact priority="0.8">tel:+09012345678</contact>
  </tuple>
</presence>

Figure 7: Simple presence data in PIDF structure
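A document like the one in Figure 7 can be read with the XML parser of the Python standard library. The following is a minimal sketch; the function name read_pidf and the shape of the returned data are assumptions made for illustration, and only the elements that occur in Figure 7 are evaluated.

```python
import xml.etree.ElementTree as ET

PIDF_NS = {"pidf": "urn:ietf:params:xml:ns:pidf"}

PIDF_DOC = """<?xml version="1.0" encoding="UTF-8"?>
<presence xmlns="urn:ietf:params:xml:ns:pidf"
          entity="pres:someone@example.com">
  <tuple id="sg89ae">
    <status>
      <basic>open</basic>
    </status>
    <contact priority="0.8">tel:+09012345678</contact>
  </tuple>
</presence>"""


def read_pidf(document: str):
    """Return the presentity URI and a list of per-tuple summaries."""
    root = ET.fromstring(document.encode("utf-8"))
    tuples = []
    for t in root.findall("pidf:tuple", PIDF_NS):
        contact = t.find("pidf:contact", PIDF_NS)  # <contact> is optional
        tuples.append({
            "id": t.get("id"),
            "basic": t.findtext("pidf:status/pidf:basic", namespaces=PIDF_NS),
            "contact": None if contact is None else contact.text,
            "priority": None if contact is None else contact.get("priority"),
        })
    return root.get("entity"), tuples


entity, tuples = read_pidf(PIDF_DOC)
print(entity, tuples)
```

All PIDF elements live in the namespace urn:ietf:params:xml:ns:pidf, so every lookup has to be namespace-qualified; a plain search for "tuple" would find nothing.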
[Figure 8: structure of the PIDF data format. The annotations of the figure are listed below; elements marked #n in the figure may occur more than once.]

- <presence>: contains an entity attribute with the name of the presentity and the namespace declaration. Its children are <tuple>, <note> and <extension> elements.
- <tuple>: tuples provide a way of segmenting presence information. Each <tuple> element must contain an id attribute. Its children are <status>, <extension>, <contact>, <note> and <timestamp>.
- <status>: contains the <basic> element and <extension> elements.
- <basic>: optional; contains either open or closed, expressing the ability to receive instant messages.
- <contact>: optional; contains a communications address and may contain a priority attribute.
- <note>: optional; may contain a human-readable comment.
- <timestamp>: optional; contains the date and time of the status change of this tuple.
- <extension>: the Presence Data Model (RFC 4479) uses two such extension elements for the <person> and <device> components.
Data model for presence

The PIDF data format for presence has been used by the SIMPLE group as the basis of a presence data model [12]. This data model offers the possibility to map real-world communications systems, in particular those built around SIP, into a presence document. There are three components assigned to a presentity in the data model: the person, the service and the device. Each attribute in a presence document is associated with the person, a service or a device, because it describes a facet of that component. Figure 9 shows the model and the possible relationships between the components.

The person component models information about the presentity under consideration. A person may represent a group such as a help desk. Examples of presence attributes related to a person are her/his activity, her/his willingness to communicate and her/his picture. The model supports only one person component per presentity.

The service data components model the forms of communication for interacting with the presentity. Examples of services through which a presentity may communicate are sessions (audio, video), Instant Messaging, E-mail etc.

The device data components model the physical equipment on which services execute: for instance a PC, a PDA, or a mobile phone. A given service may execute on more than one device, and a device may host several services, so the mapping of services to devices is many-to-many. Devices are uniquely identified with a device ID.
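The relationships between the three components can be pictured with a few data classes. This is only an illustrative sketch: the class names, field names and example URIs are hypothetical and not part of the data model specification.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """One form of communication, together with the devices it runs on."""
    contact: str
    device_ids: list = field(default_factory=list)


@dataclass
class Presentity:
    uri: str
    person: dict                                   # exactly one person component
    services: list = field(default_factory=list)
    device_ids: set = field(default_factory=set)   # all known device IDs

    def add_service(self, service: Service) -> None:
        self.services.append(service)
        self.device_ids.update(service.device_ids)

    def services_on(self, device_id: str) -> list:
        # Many-to-many: a device may host several services.
        return [s.contact for s in self.services if device_id in s.device_ids]


alice = Presentity("pres:alice@example.com", {"activity": "meeting"})
alice.add_service(Service("sip:alice@example.com", ["pc-1", "mobile-1"]))
alice.add_service(Service("im:alice@example.com", ["pc-1"]))
print(alice.services_on("pc-1"))
print(alice.services_on("mobile-1"))
```

The link between a service and its devices is kept as a list of device IDs on the service side, which mirrors the way a service tuple refers to devices by ID.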
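[Figure 9: data model for presence (elements shown: Presentity URI, one Person component, three Service components, three Device components)]

[12] RFC 4479: A Data Model for Presence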
The presence data model of Figure 9 has been mapped to the PIDF data format of Figure 8. The solution was to use the existing <tuple> element to represent the service and to add the <person> and <device> elements as extension elements.

Extension to PIDF: RPID

PIDF does not define presence attributes beyond the <basic> status element. RFC 4480 [13] therefore defines Rich Presence Extensions to PIDF (RPID). These are additional presence attributes that extend the PIDF <tuple> element and the <device> and <person> elements defined in the data model. The extensions have been chosen to provide features common in existing presence systems, in addition to elements that could readily be derived automatically from existing sources of presence, such as calendaring systems or communication devices, or sources describing the user's current physical environment.

Table 1 shows which components of the data model for presence can be enriched by the elements defined in RPID. It also indicates whether from/until attributes are applicable and whether a <note> element can be included in the element. Elements that do not have from/until parameters must not appear more than once in each <person>, <tuple>, or <device>. The additional data elements defined by RPID are briefly explained below. This should give an impression of the detailed presence information that may be offered.
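[Table 1: RPID elements and the data model components they can enrich. Columns: from/until attributes, <note>, <person>, <tuple> (service), <device>. Rows: <activities>, <class>, <deviceID>, <mood>, <place-is>, <place-type>, <privacy>, <relationship>, <service-class>, <sphere>, <status-icon>, <time-offset>, <user-input>. The individual cell markings (x) are not reproduced here.]

[13] RFC 4480: RPID: Rich Presence Extensions to the Presence Information Data Format (PIDF)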
The <activities> element describes what the person is currently doing. A person can be engaged in multiple activities at the same time, e.g., traveling and having a meal. This information enables a watcher to evaluate how appropriate a communication attempt is and which way of communicating is best. Some examples of activities are: away, appointment, meeting, meal, breakfast, lunch, dinner, busy, holiday, in-transit, looking-for-work (for paid work), sleeping, travel. Most of them can be derived from calendar information.

The <class> element describes the class of the service, device, or person. Multiple elements can have the same class name within a presence document, but each person, service, or device can only have one class label. The naming of classes is left to the presentity.

The <deviceID> element represents a way to map a service component to a device component. One service can be provided by multiple devices, so each service tuple may contain zero or more <deviceID> elements.

The <mood> element describes the mood of the person, for example: confused, amazed.

The <place-is> element describes properties of the place the person is currently at. This offers the watcher an indication of what kind of communication is likely to be successful. Each major media type has its own set of attributes:
- audio (noisy, ok, quiet, unknown)
- video (toobright, ok, dark, unknown)
- text (uncomfortable, inappropriate, ok, unknown)

The <place-type> element describes the type of place the person is currently at. This offers the watcher an indication of what kind of communication is likely to be appropriate. The initial set of values is defined in RFC 4589 [14].

The <privacy> element indicates which types of communication third parties in the vicinity of the presentity are unlikely to be able to intercept accidentally or intentionally.

The <relationship> element extends <tuple> and designates the type of relationship an alternate contact has with the presentity. This element is provided only if the tuple refers to somebody other than the presentity. Relationship values include "family", "friend", "associate" (e.g., for a colleague), "assistant", "supervisor", "self", and "unknown". The default is "self".

The <service-class> element extends <tuple> and designates the type of service offered: electronic, postal, courier, freight, in-person.

The <sphere> element designates the current state and role that the person plays. For example, it might describe whether the person is in a work mode, at home, or participating in activities related to some other organization such as the IETF or a church. RFC 4480 does not define names for these spheres except for two common ones, "work" and "home", as well as "unknown". Spheres allow the person to easily turn on or off certain rules that depend on what groups of people should be made aware of the person's status.

The <status-icon> element includes a URI pointing to an image (icon) representing the current status of the person or service. The watcher may use this information to represent the status in a graphical user interface.

The <time-offset> element describes the number of minutes of offset from UTC at the person's current location. A positive number indicates that the local time-of-day is ahead of (i.e., east of) Universal Time, while a negative number indicates that the local time-of-day is behind (i.e., west of) Universal Time.

The <user-input> element records the user-input or usage state of the service or device, based on human user input, e.g., keyboard, pointing device, or voice.

[14] RFC 4589: Location Types Registry
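As a worked example for one of these elements, the value of <time-offset> can be computed from a timezone-aware timestamp. The sketch below uses only the Python standard library; the function names are hypothetical, and the namespace URI is the RPID namespace as understood from RFC 4480.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

RPID_NS = "urn:ietf:params:xml:ns:pidf:rpid"  # RPID namespace (per RFC 4480)


def time_offset_minutes(local_time: datetime) -> int:
    """Minutes of offset from UTC: positive east of UTC, negative west of it."""
    offset = local_time.utcoffset()
    if offset is None:
        raise ValueError("local_time must be timezone-aware")
    return int(offset.total_seconds() // 60)


def time_offset_element(local_time: datetime) -> ET.Element:
    """Build a namespace-qualified <time-offset> element."""
    elem = ET.Element(f"{{{RPID_NS}}}time-offset")
    elem.text = str(time_offset_minutes(local_time))
    return elem


berlin_summer = datetime(2024, 7, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
new_york_winter = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=-5)))
print(time_offset_minutes(berlin_summer))
print(time_offset_minutes(new_york_winter))
```

The fixed offsets of +2 hours and -5 hours are example values only; a real client would take the offset from the local time zone settings of the device.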
Further extensions to PIDF

The following extensions to PIDF are mentioned only briefly. The interested student may look into the referenced documents.

Timed Presence
The indication of status information for time intervals, either in the past or in the future, can be achieved via the <timed-status> element, defined in RFC 4481 [15] as a child of the <tuple> element.

Contact Information
RFC 4482 [16] describes elements for providing a "business card", including references to the homepage, map, representative sound, display name, and an icon.

Geographic Location
RFC 4119 [17] describes an object format for carrying geographical information. It extends the 'status' element of PIDF with a complex element called 'geopriv'.

SIP User Agent Capabilities
The SIP User Agent Capabilities defined in RFC 3840 (see also chapter 15.1 on page 101) can be added to RPID.

[15] RFC 4481: Timed Presence Extensions to PIDF
[16] RFC 4482: CIPID: Contact Information in PIDF
[17] RFC 4119: A Presence-based GEOPRIV location object format
[Figure 10 diagram not reproduced. Elements shown: Presentity, PUA and Watchers, connected by SUBSCRIBE/NOTIFY exchanges for the presence event and for the presence.winfo event.]
Figure 10: Application of Watcher Info event package to the presence event
The application of the watcher info event package to the presence event is illustrated in Figure 10. An example SUBSCRIBE and NOTIFY request for the presence.winfo package is shown in Figure 11. It shows the SUBSCRIBE request of the presentity B to its own watcher information event state and the NOTIFY request it receives when A subscribes to B's presence. In this case the presence subscription of A requires authorisation (status pending).

    SUBSCRIBE sip:B@example.com SIP/2.0
    Via: SIP/2.0/UDP pc34.example.com;branch=z9hG4bKnashds7
    From: sip:B@example.com;tag=123s8a
    To: sip:B@example.com
    Call-ID: 9987@pc34.example.com
    Max-Forwards: 70
    CSeq: 9887 SUBSCRIBE
    Contact: sip:B@pc34.example.com
    Event: presence.winfo

    NOTIFY sip:B@pc34.example.com SIP/2.0
    Via: SIP/2.0/UDP server.example.com;branch=z9hG4bKna66g
    From: sip:B@example.com;tag=xyz887
    To: sip:B@example.com;tag=123s8a
    Call-ID: 9987@pc34.example.com
    Max-Forwards: 70
    CSeq: 1288 NOTIFY
    Contact: sip:B@server.example.com
    Event: presence.winfo
    Content-Type: application/watcherinfo+xml
    Content-Length: ...

    <?xml version="1.0"?>
    <watcherinfo xmlns="urn:ietf:params:xml:ns:watcherinfo"
                 version="0" state="full">
      <watcher-list resource="sip:B@example.com" package="presence">
        <watcher id="7768a77s" event="subscribe"
                 status="pending">sip:A@example.com</watcher>
      </watcher-list>
    </watcherinfo>

Figure 11: SUBSCRIBE and NOTIFY on presence.winfo event package

The message body of the NOTIFY request contains a watcher information document. This document describes some or all of the watchers for a resource within a given package, and the state of their subscriptions. The format of the document is named application/watcherinfo+xml and is defined in RFC 3858.
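To make the structure of the watcher information document more tangible, the following sketch shows how a presentity's user agent might extract the watchers that still await authorisation. It is only an illustration: the function name pending_watchers is an assumption, and the document is the NOTIFY body from Figure 11.

```python
import xml.etree.ElementTree as ET

# Namespace of the watcherinfo document, as used in Figure 11.
WINFO_NS = {"wi": "urn:ietf:params:xml:ns:watcherinfo"}

NOTIFY_BODY = """<?xml version="1.0"?>
<watcherinfo xmlns="urn:ietf:params:xml:ns:watcherinfo"
             version="0" state="full">
  <watcher-list resource="sip:B@example.com" package="presence">
    <watcher id="7768a77s" event="subscribe"
             status="pending">sip:A@example.com</watcher>
  </watcher-list>
</watcherinfo>"""


def pending_watchers(body):
    """Return the URIs of all watchers whose subscription status is 'pending'."""
    root = ET.fromstring(body)
    return [
        (watcher.text or "").strip()
        for watcher in root.findall("wi:watcher-list/wi:watcher", WINFO_NS)
        if watcher.get("status") == "pending"
    ]


print(pending_watchers(NOTIFY_BODY))  # ['sip:A@example.com']
```

A user agent could present this list to the user, who then decides whether to authorise the pending subscriptions.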
Figure 12 shows an example of minimum dialog state information carried in a dialog state XML document.

    <?xml version="1.0"?>
    <dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info"
                 version="0" state="full" entity="sip:alice@example.com">
      <dialog id="as7d900as8">
        <state>confirmed</state>
      </dialog>
    </dialog-info>

Figure 12: Minimum dialog state information XML document

The state information in Figure 12 shows the actual state of a dialog: confirmed. This means that the user agent sending this information has received 200 (OK) and is engaged in a session. When the session is finished the state information will change to terminated and again a NOTIFY request will be sent. RFC 4235 defines a dialog state machine which specifies when a certain state transition happens.

The XML document also contains a version, a state and an entity attribute. The version contains a number which is incremented with every NOTIFY request, the state attribute describes the information as either full or partial, and the entity contains the URI that identifies the user whose dialog information is provided.

The next example shows more detailed dialog state information. Figure 13 shows an INVITE request sent by a UAC which is monitored by an INVITE-Initiated Dialog Event. This INVITE request evokes a NOTIFY request with the XML document shown in Figure 14.

    INVITE sip:bob@example.com SIP/2.0
    Via: SIP/2.0/UDP pc33.example.com;branch=z9hG4bKnashds8
    Max-Forwards: 70
    To: Bob <sip:bob@example.com>
    From: Alice <sip:alice@example.com>;tag=1928301774
    Call-ID: a84b4c76e66710
    CSeq: 314159 INVITE
    Contact: <sip:alice@pc33.example.com>
    Content-Type: application/sdp
    Content-Length: 142

    [SDP not shown]

Figure 13: Example INVITE request

    <?xml version="1.0"?>
    <dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info"
                 version="0" state="full" entity="sip:alice@example.com">
      <dialog id="as7d900as8" call-id="a84b4c76e66710"
              local-tag="1928301774" direction="initiator">
        <state>trying</state>
      </dialog>
    </dialog-info>

Figure 14: Corresponding XML document in a NOTIFY request

The XML document in Figure 14 also includes details of the dialog ID. When the dialog setup proceeds, additional NOTIFY requests are sent with the state elements early and confirmed, and the remote-tag attribute will also be included. The event package definition also allows sending partial state information where only the changed parts are included. The application of the INVITE-initiated dialog event to implement an automatic call-back service is shown in Figure 15.
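As an illustration of how a subscriber might read such a document, the following sketch parses the dialog state document from Figure 14 and collects the state and the dialog identifiers per dialog. The function name dialog_states and the shape of the returned dictionary are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the dialog-info document, as used in Figures 12 and 14.
DLG_NS = {"di": "urn:ietf:params:xml:ns:dialog-info"}

DOC = """<?xml version="1.0"?>
<dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info"
             version="0" state="full" entity="sip:alice@example.com">
  <dialog id="as7d900as8" call-id="a84b4c76e66710"
          local-tag="1928301774" direction="initiator">
    <state>trying</state>
  </dialog>
</dialog-info>"""


def dialog_states(body):
    """Map each dialog id to its state and the dialog identifiers seen so far."""
    root = ET.fromstring(body)
    result = {}
    for dialog in root.findall("di:dialog", DLG_NS):
        result[dialog.get("id")] = {
            "state": dialog.findtext("di:state", default="", namespaces=DLG_NS),
            "call-id": dialog.get("call-id"),
            "local-tag": dialog.get("local-tag"),
            # The remote-tag is absent until the dialog setup has proceeded.
            "remote-tag": dialog.get("remote-tag"),
        }
    return result


print(dialog_states(DOC)["as7d900as8"]["state"])  # trying
```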
Subscription to the dialog event. The XML document delivered immediately in the NOTIFY message body may look like:

    <?xml version="1.0"?>
    <dialog-info entity="sip:b@b.com">
      <dialog id="as7d900as8">
        <state event="2xx">confirmed</state>
      </dialog>
    </dialog-info>

After some time the session of User Agent B ends and User Agent B sends a NOTIFY request to User Agent A, which is answered with 200 OK.

When the dialog terminates, the subscription also ends automatically. Now an automatic call-back may be started by User Agent A (INVITE, 180 Ringing, ...).
RFC 4235 also defines two new media feature tags (see chapter 15.1 on page 101), which are sometimes used in combination with an INVITE-initiated dialog event:

sip.rendering: This feature tag indicates whether the user agent is actually rendering any media stream. It may take the values "yes", "no", and "unknown". The feature tag sip.rendering=no indicates that the user agent ignores the media stream it receives. This is typically used when putting a session partner on hold.

sip.byeless: This feature tag indicates that the user agent is not able to terminate a session on its own. This may be used by an announcement machine continuously playing an announcement.
In this case the Content-Type of the message body is defined as application/simple-message-summary, which is plain text and not an XML format. The numbers in the last line of the message body show the count of new/old messages and, in parentheses, the count of new/old urgent messages. A User Agent may also explicitly fetch the current status by sending a SUBSCRIBE request with the Expires header field set to zero.

[21] RFC 3842: A Message Summary and Message Waiting Indication Event Package for SIP
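The plain-text format can be read with a few lines of code. The sketch below is only an illustration: the sample body follows the general layout defined in RFC 3842 but its values and account URI are assumptions, and the function name parse_message_summary is hypothetical.

```python
import re

# One summary line, e.g. "Voice-Message: 2/8 (0/2)" = new/old (new urgent/old urgent).
SUMMARY_LINE = re.compile(
    r"^(?P<cls>[A-Za-z-]+)-Message:\s*(?P<new>\d+)/(?P<old>\d+)"
    r"(?:\s*\((?P<new_urgent>\d+)/(?P<old_urgent>\d+)\))?\s*$"
)

# Assumed sample body (values invented for illustration).
SAMPLE_BODY = """Messages-Waiting: yes
Message-Account: sip:alice@vmail.example.com
Voice-Message: 2/8 (0/2)"""


def parse_message_summary(body):
    """Return the waiting flag and the message counts per message class."""
    summary = {"waiting": None, "counts": {}}
    for raw_line in body.splitlines():
        line = raw_line.strip()
        if line.lower().startswith("messages-waiting:"):
            summary["waiting"] = line.split(":", 1)[1].strip().lower() == "yes"
            continue
        match = SUMMARY_LINE.match(line)
        if match:
            counts = {
                key: int(value)
                for key, value in match.groupdict().items()
                if key != "cls" and value is not None
            }
            summary["counts"][match.group("cls").lower()] = counts
    return summary


print(parse_message_summary(SAMPLE_BODY))
```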
[22] RFC 4575: Event Package for Conference State
[23] IMS defines the SIP-based network architecture for carrier networks
[24] P-CSCF: Proxy Call Session Control Function
[25] RFC 3680: A SIP Event Package for Registrations
[27] RFC 3311: SIP UPDATE Method
[28] This example is probably more hypothetical and is only meant to show that UPDATE requests can be sent from either side. In practical situations usually only one UPDATE transaction is used.
[29] This is realistic in the case of mobile networks (IMS), but in that case the second offer/answer exchange is carried in the PRACK request.
Figure 17: UPDATE Call Flow

A user agent should be sure that the peer user agent supports the UPDATE method. Therefore the INVITE request and the 180 (Ringing) response should contain an Allow header field showing support for the UPDATE method. An UPDATE request may also be used within confirmed dialogs (after the INVITE transaction is finished), but in that case a re-INVITE is recommended. A re-INVITE leaves time for approval by the user, because an INVITE transaction may take a long time to complete, while an UPDATE request has to be answered immediately. The main application of UPDATE requests is Resource Management, as explained in the next chapter.
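The Allow check mentioned above can be sketched as follows. This is a minimal illustration that assumes the header fields of a received message are already available as a dictionary; the function name supports_method and both sample header sets are assumptions.

```python
def supports_method(headers, method):
    """Return True if the Allow header field lists the given SIP method."""
    allow = headers.get("Allow", "")
    allowed = {item.strip().upper() for item in allow.split(",") if item.strip()}
    return method.upper() in allowed


# Hypothetical header sets of two peers.
peer_with_update = {"Allow": "INVITE, ACK, CANCEL, OPTIONS, BYE, UPDATE"}
peer_without_update = {"Allow": "INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY"}

print(supports_method(peer_with_update, "UPDATE"))     # True
print(supports_method(peer_without_update, "UPDATE"))  # False
```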
6 Resource Management
Some networks (e.g. mobile networks using SIP) require that at session establishment time, once the callee has been alerted, the chances of a session establishment failure are minimal. One major source of failure, in particular in mobile networks, is the inability to reserve network resources for a session. This could lead to so-called ghost rings, where the callee is alerted but the session cannot be set up successfully due to lack of resources. In order to minimize "ghost rings", it is necessary to reserve network resources for the session before the callee is alerted.

However, the reservation of network resources frequently requires knowledge about the session parameters from the callee. This information is obtained as a result of the initial offer/answer exchange carried in SIP. This exchange normally causes the "phone to ring", thus introducing a chicken-and-egg problem: resources cannot be reserved without performing an initial offer/answer exchange, and the initial offer/answer exchange always causes alerting, which might not be appropriate as long as the necessary resources are not reserved.

The solution to this problem is the concept of preconditions. Preconditions are a set of constraints about the session which are introduced in the offer. The recipient of an offer including preconditions generates an answer, but does not alert the user or otherwise proceed with session establishment until the preconditions are met. The session setup is suspended until an event indicates that the preconditions are met. This can be a local event (such as a confirmation of a resource reservation) or a new offer sent by the caller.

The precondition issue is media stream specific. Therefore the solution is based on extending SDP rather than on extending SIP. The solution is specified in RFC 3312 [30].

Additional remark on Updates: The original RFC 3312 based solution is QoS specific. In the meantime two additional applications for preconditions were identified:
- Usage of preconditions to enable mobility solutions [31]
- Usage of preconditions to enable protection of media streams (media security) [32]

[30] RFC 3312: Integration of Resource Management and SIP
[31] RFC 4032: Update to SIP preconditions Framework
[32] RFC 5027: Security preconditions for SDP media streams
On the other hand, the values of the example below would make session establishment resume:
    current status = resources reserved in both (sendrecv) directions
    desired status = resources reserved in the send direction
These two state variables are mapped to new attributes for the media stream in SDP and are exchanged with the offer/answer cycle. Thus both session partners have a shared view on the resource situation and they know when they have to stop session setup to wait for a condition to be met. Figure 18 shows a basic session setup using SDP preconditions as it is applied in mobile networks. User Agent A includes quality of service preconditions in the SDP of the initial INVITE. User Agent A does not want User Agent B to be alerted until there are network resources reserved in both directions end-to-end. User Agent B agrees to reserve network resources for this session before alerting the callee. Both user agents will handle resource reservation in their local access segment. This is the segment where in fact resources have to be reserved in a mobile network (radio link).
Figure 18: Basic Session Setup using preconditions

User Agent B returns a 183 (Session Progress) response to User Agent A asking A to confirm when resources have been reserved in the local segment of A. In mobile networks it is necessary to agree on a specific codec before resource reservation can start, due to the different bandwidth requirements of different codecs. In SDP answer 1 of the 183 (Session Progress) response there might be more than one codec available. Now User Agent A decides on the codec to be used (because he/she probably has to pay for the session) and communicates its decision to User Agent B in a PRACK request containing offer 2. With sending / receiving the PRACK request both sides start the resource reservation mechanism [33]. User Agent A finishes resource reservation and informs User Agent B with an UPDATE request (with SDP offer 3), which is answered with 200 OK (with SDP answer 3). User Agent B has already finished resource reservation in the above example and now alerts the user and sends a 180 (Ringing) response. Then the session setup proceeds as usual.
[33] The resource reservation mechanism is independent from SDP signaling and depends on the transport network technology in place. A transport layer mechanism for QoS supported by most routers is RSVP (RFC 2205: Resource Reservation Protocol).
The current status attribute curr carries the current status of network resources for a particular media stream.

The desired status attribute des carries the preconditions for a particular media stream. When the direction-tag of the current status attribute, with a given precondition-type/status-type for a particular stream, is equal to (or better than) the direction-tag of the desired status attribute with the same precondition-type/status-type for that stream, then the preconditions are considered to be met for that stream.

The confirmation status attribute conf carries threshold conditions for a media stream. When the status of network resources reaches these conditions, the peer user agent must send an update of the session description containing an updated current status attribute for this particular media stream (a confirmation).

The attributes use the following parameters:
- precondition-type: RFC 3312 defines only one type, for Quality of Service: qos. RFC 5027 additionally defines a precondition type for security: sec.
- status-type: This parameter indicates whether preconditions have to be met end-to-end or only per segment (values are "e2e", "local", "remote").
- strength-tag: This tag indicates whether the callee can be alerted in case the network fails to meet the preconditions (values are "mandatory", "optional", "none", "failure", "unknown").
- direction-tag: This parameter indicates the direction to which a particular attribute applies (values are "none", "send", "recv", "sendrecv").
Coming back to the example session setup with preconditions in Figure 18, the precondition attributes in the SDP parts may look as follows:

Offer 1:
    a=curr:qos local none
    a=curr:qos remote none
    a=des:qos mandatory local sendrecv
    a=des:qos none remote sendrecv
This is the initial position of User Agent A. It will take care of QoS on the local segment but cannot do that for the remote segment.

Answer 1:
    a=curr:qos local none
    a=curr:qos remote none
    a=des:qos mandatory local sendrecv
    a=des:qos mandatory remote sendrecv
    a=conf:qos remote sendrecv
User Agent B will take care of its own local segment but requires a confirmation when resources are reserved at the remote side. Otherwise it will not alert the user.

Offer 2:
    a=curr:qos local none
    a=curr:qos remote none
    a=des:qos mandatory local sendrecv
    a=des:qos mandatory remote sendrecv
Offer 2 reflects the qos mandatory condition from the remote side. In addition the list of codecs has been reduced to exactly one (not shown here).

Answer 2:
    a=curr:qos local none
    a=curr:qos remote none
    a=des:qos mandatory local sendrecv
    a=des:qos mandatory remote sendrecv
    a=conf:qos remote sendrecv
Nothing has changed since Answer 1.

Offer 3:
    a=curr:qos local sendrecv
    a=curr:qos remote none
    a=des:qos mandatory local sendrecv
    a=des:qos mandatory remote sendrecv
Now User Agent A confirms QoS readiness in its local segment.

Answer 3:
    a=curr:qos local sendrecv
    a=curr:qos remote sendrecv
    a=des:qos mandatory local sendrecv
    a=des:qos mandatory remote sendrecv
User Agent B reflects the availability of QoS on the remote and local side. The user will be alerted now.
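The rule for deciding whether preconditions are met can be sketched in code. The following is a simplified illustration, not an implementation of RFC 3312: it assumes that only preconditions with strength "mandatory" block alerting, it ignores the conf attribute, and the names COVERS, parse_preconditions and preconditions_met are hypothetical. The SDP lines are those of Offer 1, Offer 3 and Answer 3 above.

```python
# Directions covered by each direction-tag; "sendrecv" is better than "send" or "recv".
COVERS = {
    "none": set(),
    "send": {"send"},
    "recv": {"recv"},
    "sendrecv": {"send", "recv"},
}


def parse_preconditions(sdp_lines):
    """Collect curr and des attributes, keyed by (precondition-type, status-type)."""
    curr, des = {}, {}
    for line in sdp_lines:
        if line.startswith("a=curr:"):
            ptype, status, direction = line[len("a=curr:"):].split()
            curr[(ptype, status)] = direction
        elif line.startswith("a=des:"):
            ptype, strength, status, direction = line[len("a=des:"):].split()
            des[(ptype, status)] = (strength, direction)
    return curr, des


def preconditions_met(sdp_lines):
    """True if every mandatory desired status is covered by the current status."""
    curr, des = parse_preconditions(sdp_lines)
    for key, (strength, wanted) in des.items():
        if strength != "mandatory":
            continue  # assumption of this sketch: only mandatory entries block
        have = curr.get(key, "none")
        if not COVERS[wanted] <= COVERS[have]:
            return False
    return True


OFFER_1 = ["a=curr:qos local none", "a=curr:qos remote none",
           "a=des:qos mandatory local sendrecv", "a=des:qos none remote sendrecv"]
OFFER_3 = ["a=curr:qos local sendrecv", "a=curr:qos remote none",
           "a=des:qos mandatory local sendrecv", "a=des:qos mandatory remote sendrecv"]
ANSWER_3 = ["a=curr:qos local sendrecv", "a=curr:qos remote sendrecv",
            "a=des:qos mandatory local sendrecv", "a=des:qos mandatory remote sendrecv"]

print(preconditions_met(OFFER_1))   # False
print(preconditions_met(OFFER_3))   # False
print(preconditions_met(ANSWER_3))  # True
```

Only for Answer 3 does the check succeed, which matches the point in the flow where the user is alerted.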
[Figure 19 diagram not reproduced. Participants: Party A, Controller, Party B. The annotated steps are:]
- The controller sets up a call to A with no SDP in the INVITE. A responds with connection SDP data in 200 OK. The controller sends hold SDP in the ACK. (INVITE with no SDP, 200 OK with SDP from A, ACK)
- The controller sets up a call to B with no SDP in the INVITE. B responds with connection SDP data in 200 OK. (INVITE with no SDP, 200 OK with SDP from B)
- The controller re-INVITEs A with SDP data from B. A responds with connection SDP data in 200 OK (again). (INVITE with SDP from B, 200 OK with SDP from A)
- The controller sends SDP from A to B in the ACK and sends an ACK to A. (ACK with SDP from A, ACK)
- The media stream flows between A and B.
- A terminates the session with BYE and the controller sends BYE to B. Both transactions are confirmed. (BYE, 200 OK on each side)

Figure 19: Third party call-control

The message flow in Figure 19 makes use of the fact that an INVITE may be sent without an SDP. In this case the SDP offer/answer has to be exchanged in 200 OK and ACK. The message flow above is only one possibility for third party session set-up. RFC 3725 [34] gives some more examples. The example above once again shows the flexibility of SIP and its nature as a toolbox of functions which may be combined to create a service. Note that all signaling is originated/terminated at the controller, but media is sent directly between party A and party B. No additional SIP protocol extensions are required for the above behavior, just basic SIP.
[34] RFC 3725: Best Current Practices for Third Party Call Control (3pcc) in SIP.
8 REFER Method
The REFER [35] method is a SIP extension that requests that the recipient refer to a resource provided in the request. This can be used to enable many applications, including call transfer. The REFER method also implicitly establishes (without sending a SUBSCRIBE request) a short-lived subscription to the refer event. The refer event allows the party sending the REFER to be notified of the outcome of the referenced request.

The NOTIFY body of a REFER has the Content-Type message/sipfrag, which is defined in RFC 3420 [36]. Compared with the Content-Type message/sip, sipfrag allows inserting only specific parts of a SIP message. In case of the refer event the message body of NOTIFY typically contains only the status line in case of provisional responses and the full response including the dialog data in case of 200 OK. The dialog data allow the recipient of the NOTIFY to take control of the session later and get the session partner back again via the Replaces header field of an INVITE request (see chapter 8.2 on page 43).

The REFER request uses a new header field Refer-To which indicates the target to be referred to. When a User Agent sends a REFER request, the recipient will contact the resource addressed by the Refer-To header field in the request and it will also notify the referrer of the outcome (success or no success) of the operation. In case of a call transfer service (the usual case for REFER) the address in the Refer-To header contains the SIP URI of the person to whom the call will be transferred. But the semantics of the Refer-To header are much broader: it may also contain the address of a web site. In addition, various URI parameters in the Refer-To address may further define conditions for how the addressed resource should be contacted (e.g. the URI parameter method=INVITE causes the referee to use the INVITE method).

The REFER method may be used within a dialog or outside of a dialog, but the most common case is to transfer existing calls, and in this case it is sent within an existing dialog. User Agents often do not accept REFER requests outside of a dialog. If REFER is not used within a dialog, a dialog is created. REFER and NOTIFY requests are part of the dialog. SUBSCRIBE is not used due to the implicit subscription.

Figure 20 shows a typical message flow of a simple unattended call transfer service using the REFER request. The call transfer is called unattended because Alice (the referrer) does not set up a session with the refer target (Carol) before the transfer to explain the reason why the call is transferred. This is perhaps impolite but simpler to explain from a call flow perspective. Later we will see the more realistic example of an attended call transfer (see chapter 8.2 on page 43).
[35] RFC 3515: The SIP Refer Method
[36] RFC 3420: The media type message/sipfrag
The REFER request in Figure 20 is sent within an existing dialog. The Refer-To header field contains the address of the Refer-Target (Carol). The REFER request is executed as a simple transaction causing the referee (Bob) to respond immediately. At this time the referee does not know the result of the action initiated by REFER request (in above example an INVITE request to the refer target). Therefore the referee responds with 202 Accepted and sends NOTIFY requests to the referrer (Alice) to keep her informed about the result of the initiated action.
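How the referrer might evaluate the NOTIFY requests of the refer event can be sketched as follows. This is a minimal illustration that assumes the message/sipfrag body starts with a SIP status line; the function name refer_outcome and the three result labels are assumptions.

```python
def refer_outcome(sipfrag_body):
    """Classify the referenced request's progress from a message/sipfrag body."""
    status_line = sipfrag_body.strip().splitlines()[0]
    parts = status_line.split()
    if len(parts) < 2 or not parts[0].startswith("SIP/"):
        raise ValueError("body does not start with a SIP status line")
    code = int(parts[1])
    if code < 200:
        return "pending"   # provisional response, e.g. 180 Ringing
    if code < 300:
        return "success"   # the refer target has accepted
    return "failure"       # the referenced request failed


print(refer_outcome("SIP/2.0 180 Ringing"))    # pending
print(refer_outcome("SIP/2.0 200 OK"))         # success
print(refer_outcome("SIP/2.0 486 Busy Here"))  # failure
```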
[Figure 20 diagram not reproduced. Participants: Alice (Referrer) and Bob (Referee). The annotated steps are:]
- A dialog and session exist between Alice and Bob.
- Alice starts a call transfer. A subscription to the refer event is created implicitly. (REFER with Refer-To: Carol, 202 Accepted)
- The first NOTIFY informs Alice that the User Agent of Carol is ringing. (NOTIFY, 200 OK)
- Alice's user agent automatically terminates the session. (BYE, 200 OK, end of session and dialog)
- The second NOTIFY informs Alice that Carol has accepted the session. (NOTIFY, 200 OK)

Figure 20: Call transfer example based on a REFER request

There are some situations where the implicit subscription to the refer event is not necessary. In this case a further extension allows the referrer to suppress the implicit subscription.
[Figure 21 diagram not reproduced. Participants: Referrer, Referee, Refer Target.]
Figure 21: Referred-By mechanism

The Referrer adds a Referred-By header field to the REFER request containing the identity of the referrer. This header field is copied into the referenced request (INVITE). One may detect a security issue in the simple mechanism shown in Figure 21, because in this case it is easy for a man-in-the-middle to forge a Referred-By header field. Imagine the boss of a company who might only accept calls referred by his secretary. This would be easy to fake. RFC 3892 addresses this situation and offers a solution based on an Authenticated Identity Body [39] (AIB). The AIB offers a signature which is included in the message body.

Figure 22 shows how the Referred-By header field is secured by an AIB. A content-identifier parameter (cid) is added to the Referred-By header field and the identifier points to a separate part of the message body which contains a signature on the Referred-By header field. The Refer Target is now able to verify the authenticity of the Referred-By header field.

[38] RFC 3892: The SIP Referred-By Mechanism
[39] RFC 3893: SIP Authenticated Identity Body (AIB) Format
[Figure 22 diagram not reproduced. Participants: Referrer, Referee, Refer Target. Messages shown:]
- Referrer to Referee: REFER referee, with Refer-To: target and Referred-By: referrer; cid=X, plus an additional message body part (MIME) with Content-ID: X carrying the <Referred-By Token>.
- Referee to Refer Target: INVITE target, with Referred-By: referrer; cid=X, plus an additional message body part (MIME) with Content-ID: X carrying the <Referred-By Token>.

Figure 22: Referred-By header field secured by an AIB

The refer target may also reject a referenced request that lacks a valid referrer identity with the response 429 (Provide Referrer Identity).
[Figure 23 diagram not reproduced. Participants: Alice, Bob, Carol. The recoverable annotations, in order, are:]
- INVITE, 180 Ringing, 200 OK, ACK: session and dialog are established.
- Bob puts Alice on hold (no RTP).
- Bob transfers the existing session with Carol to Alice including the Replaces header field.
- INVITE (Replaces: dialog with Bob), 200 OK, ACK: session and dialog are established between Alice and Carol.
- Carol terminates the session with Bob. (BYE, 200 OK)
- Alice reports the successful transfer to Bob. (NOTIFY, 200 OK)
- Bob terminates the session with Alice. (BYE, 200 OK)
Figure 23: Attended call transfer using REFER and Replaces

The INVITE request (1) of Bob to put Alice on hold is shown in Figure 24. To put a session partner on hold the SDP attribute a=sendonly is used. In addition the media feature tag sip.rendering="no" in the Contact header field is used to make sure that during hold no received media will be rendered. The a=sendonly attribute of an SDP offer is reflected in the SDP answer with the attribute a=recvonly.

    INVITE sips:alice@client.atlanta.example.com SIP/2.0
    Via: SIP/2.0/TLS client.biloxi.example.com:5061
     ;branch=z9hG4bKnashds7
    Max-Forwards: 70
    From: Bob <sips:bob@biloxi.example.com>;tag=23431
    To: Alice <sips:alice@atlanta.example.com>;tag=1234567
    Call-ID: 12345600@atlanta.example.com
    CSeq: 1024 INVITE
    Contact: <sips:bob@client.biloxi.example.com>;+sip.rendering="no"
    Content-Type: application/sdp
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
    Supported: replaces
    Content-Length: ...

    v=0
    o=bob 2890844527 2890844528 IN IP4 client.biloxi.example.com
    s=
    c=IN IP4 client.biloxi.example.com
    t=0 0
    m=audio 3456 RTP/AVP 0
    a=rtpmap:0 PCMU/8000
    a=sendonly

Figure 24: INVITE request to put Alice on hold (1)

The next interesting request is the REFER request (2) sent by Bob to Alice.

    REFER sips:alice@client.atlanta.example.com SIP/2.0
    Via: SIP/2.0/TLS client.biloxi.example.com:5061
     ;branch=z9hG4bKnashds2g
    Max-Forwards: 70
    From: Bob <sips:bob@biloxi.example.com>;tag=23431
    To: Alice <sips:alice@atlanta.example.com>;tag=1234567
    Call-ID: 12345600@atlanta.example.com
    CSeq: 1025 REFER
    Refer-To: <sips:39itp34klkd@chicago.example.com?Replaces=
     sdjfdjfskdf%40biloxi.example.com%3Bto-tag%3D5f35a3
     %3Bfrom-tag%3D8675309&Require=replaces>
    Referred-By: <sips:bob@biloxi.example.com>
    Contact: <sips:bob@client.biloxi.example.com>
    Content-Length: 0

Figure 25: REFER request from Bob to Alice (2)
In contrast to the REFER request of an unattended call transfer this REFER request contains a Replaces header field referring to the existing dialog between Bob and Carol and a Referred-By header field. Figure 25 shows the REFER request. The Refer-To header field contains the refer target address, which is the Contact address of Carol (not the AoR) to guarantee that the right instance of the user agent is addressed. The Contact URI is amended by the Replaces header field (after the question mark). One will notice that within the Replaces header field control characters are escaped (%HEX notation for @, = and ;). This is a syntax rule in SIP to avoid ambiguity. The Replaces header field contains the three parameters of the dialog-id: Call-ID, To-tag and From-Tag. The INVITE request (3) sent from Alice to Carol includes the (now unescaped) Replaces header field as shown in Figure 26.
    INVITE sips:39itp34klkd@chicago.example.com;gr SIP/2.0
    Via: SIP/2.0/TLS chicago.example.com:5061
     ;branch=z9hG4bKadfe4ko
    To: Carol <sips:39itp34klkd@chicago.example.com>
    Max-Forwards: 70
    From: Alice <sips:alice@atlanta.example.com>;tag=3461
    Call-ID: 9435674543@atlanta.example.com
    CSeq: 1 INVITE
    Require: replaces
    Referred-By: <sips:bob@biloxi.example.com>
    Replaces: sdjfdjfskdf@biloxi.example.com
     ;to-tag=5f35a3;from-tag=8675309
    Contact: <sips:alice@client.atlanta.example.com>
    Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY
    Supported: replaces
    Content-Type: application/sdp
    Content-Length: ...

    v=0
    o=alice 2890844989 2890844989 IN IP4 client.atlanta.example.com
    s=
    c=IN IP4 client.atlanta.example.com
    t=0 0
    m=audio 3458 RTP/AVP 0
    a=rtpmap:0 PCMU/8000

Figure 26: INVITE with Replaces header field (3)
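The escaping of the Replaces header field inside the Refer-To URI, and its unescaping for the INVITE request, can be reproduced with a short sketch. It uses percent-encoding from the Python standard library as a stand-in for the SIP escaping rules, which is sufficient for the characters in this example; the function names refer_to_with_replaces and replaces_from_refer_to are assumptions.

```python
from urllib.parse import quote, unquote


def refer_to_with_replaces(target_uri, call_id, to_tag, from_tag):
    """Build a Refer-To value that embeds an escaped Replaces header field."""
    replaces = f"{call_id};to-tag={to_tag};from-tag={from_tag}"
    # safe="" makes sure that "@", ";" and "=" are all escaped.
    return f"<{target_uri}?Replaces={quote(replaces, safe='')}&Require=replaces>"


def replaces_from_refer_to(refer_to):
    """Recover the unescaped Replaces value for the triggered INVITE request."""
    query = refer_to.strip("<>").split("?", 1)[1]
    for item in query.split("&"):
        name, _, value = item.partition("=")
        if name == "Replaces":
            return unquote(value)
    return None


refer_to = refer_to_with_replaces(
    "sips:39itp34klkd@chicago.example.com",
    "sdjfdjfskdf@biloxi.example.com", "5f35a3", "8675309",
)
print(refer_to)
print(replaces_from_refer_to(refer_to))
```

With the values from Figure 25 the sketch yields the same escaped string as in the REFER request, and the recovered value matches the Replaces header field of Figure 26 (without line folding).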
The full message flow with all details of the message contents can be studied in RFC 5359. This RFC contains many examples of possible services and may be interesting to analyse. One important remark has to be made: the service examples are to be considered as examples only, because in some cases there are different choices for implementing a service. This is the result of the toolbox nature of SIP, where all functions (e.g. the Replaces header field extension) have to be seen as tools (service primitives) within the SIP toolbox. As long as the syntax and semantics of an extension like the REFER method or the Replaces header field are implemented correctly, interoperability should not be an issue.
9 Conferencing
The core SIP specification provides a way to set up and manage sessions between two User Agents. It is possible to create and control a multi-party conference using this specification. However, in such a scenario, referred to as the loosely coupled conference model, there does not exist a relationship between every participant in the conference. Such conference situations can be accomplished by using multicast. Alternatively, a UA can maintain multiple dialogs with multiple User Agents while also acting as a media mixer. While the User Agent that is acting as the conference controller/mixer has knowledge of the other User Agents involved in the conference, the other User Agents do not know about each other. Additionally this scenario puts extra strain on the resources of the controlling User Agent by forcing it to both control signaling and mix media streams.

RFC 4353 [42] introduces an architecture by which a central entity, called a focus, provides a variety of conference functions and mixing of media. In this type of conference, referred to as the tightly coupled conference model, each UA involved in the conference connects to the focus and maintains its own SIP dialog with it. This specification also defines a logical function called a conference policy server that stores the conference policy, which is simply a set of rules governing a particular conference. The focus must be able to access this conference policy to determine how the conference should operate, such as whether a particular UA is allowed to join the conference. The specification also defines a second logical function called a conference notification service. This is a service that a conference participant can subscribe to in order to receive notifications when changes in conference state occur. In this model, a UA participating in a conference can SUBSCRIBE to the conference URI and be alerted via SIP NOTIFY messages when the state of the conference changes, such as when participants enter and leave the conference. Often the conference focus, policy server, and notification service are located in the same physical entity.

RFC 4575 [43] defines an event package for notifying participants of a tightly coupled conference of the conference state. RFC 4579 [44] uses the concepts from RFC 4353 and RFC 4575 to define a set of recommended practices for creating and controlling
[42] RFC 4353: A Framework for Conferencing with SIP
[43] RFC 4575: A Session Initiation Protocol (SIP) Event Package for Conference State
[44] RFC 4579: SIP Call Control - Conferencing for User Agents
[Figure not reproduced. Elements shown: a Conference Focus and a Conference Mixer connected via H.248; Participants A, B and C, each with a SIP connection and an RTP connection towards the conference.]
Tightly coupled conferences are hosted by a central point of control, the conference focus, to which every participant has a signaling connection. The conference focus uses a conference-specific SIP address which is shared among the participants. Closely coupled with the conference focus is a conference mixer. The mixer terminates and re-originates the media streams. The conference focus controls the conference mixer. It knows the SDP parameters for mixing media streams contained in the SIP signaling messages. The focus controls the mixer via the H.248 [45] protocol, also known as the MEGACO protocol [46]. The mixing simply does the following:
- it receives the media stream from A and sends the combined media from B+C to A,
- it receives the media stream from B and sends the combined media from A+C to B,
- it receives the media stream from C and sends the combined media from A+B to C.
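The mixing rule can be expressed in a few lines. The sketch below is a strong simplification made for illustration: it treats each media stream as a list of integer samples of equal length and simply adds them, whereas a real mixer also has to decode, synchronise and limit the signal. The function name mix_for_each and the sample values are assumptions.

```python
def mix_for_each(streams):
    """For every participant, combine the samples of all other participants."""
    mixed = {}
    for listener in streams:
        others = [samples for name, samples in streams.items() if name != listener]
        # Add the samples of the other participants position by position.
        mixed[listener] = [sum(values) for values in zip(*others)]
    return mixed


# Hypothetical sample values for three participants.
streams = {"A": [1, 2], "B": [10, 20], "C": [100, 200]}
print(mix_for_each(streams))
# {'A': [110, 220], 'B': [101, 202], 'C': [11, 22]}
```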
The message body of the INVITE request contains all media streams that the user wants to establish for this conference. The conference-focus then checks if the user is allowed to create an ad-hoc conference (via the policy server) and if resources for that conference are available at the mixer. If it the focus accepts the ad-hoc conference it sends a dedicated conference URI back to the user within the Contact header field of the 200 OK response.
SIP/2.0 200 (OK)
Contact: sip:litalk-adhoc5853@mrfc25.home1.fr;isfocus
The conference focus indicates in this address that it will act as a focus for the ad-hoc conference by adding an isfocus feature parameter (see chapter 15, page 101). The next step for the creator of the conference is to get the participants invited to the conference. This can be accomplished, for example, by sending the conference URI to the participants via messaging (see chapter 10, page 54). The participants then individually set up the session with the conference focus using the dedicated conference URI.
[45] H.248: Gateway control protocol Version 3
[46] RFC 3525: Gateway Control Protocol Version 1
INVITE sip:create-adhoc-litalk@focus.home1.fr SIP/2.0
Content-Type: multipart/mixed;boundary="boundary1"

--boundary1
Content-Type: application/sdp

//SDP Information not shown here

--boundary1
Content-Type: application/resource-lists+xml
Content-Disposition: recipient-list

<?xml version="1.0" encoding="UTF-8"?>
<resource-lists xmlns="urn:ietf:params:xml:ns:resource-lists"
                xmlns:cp="urn:ietf:params:xml:ns:copycontrol">
  <list>
    <entry uri="sip:peter@home8.de" cp:copyControl="to" />
    <entry uri="sip:kevin@home5.co.uk" cp:copyControl="to" />
    <entry uri="sip:cathy@foreign.com" cp:copyControl="to" />
  </list>
</resource-lists>
--boundary1--
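As a rough illustration of how a receiver might read the recipient list carried in such a request, the following Python sketch parses the resource-lists body with the standard library. The function name `recipients` is an assumption for this example; extraction of the body part from the multipart message is not shown.

```python
import xml.etree.ElementTree as ET

RESOURCE_LIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<resource-lists xmlns="urn:ietf:params:xml:ns:resource-lists"
                xmlns:cp="urn:ietf:params:xml:ns:copycontrol">
  <list>
    <entry uri="sip:peter@home8.de" cp:copyControl="to" />
    <entry uri="sip:kevin@home5.co.uk" cp:copyControl="to" />
    <entry uri="sip:cathy@foreign.com" cp:copyControl="to" />
  </list>
</resource-lists>"""

NS = {"rl": "urn:ietf:params:xml:ns:resource-lists"}
# ElementTree exposes namespaced attributes as "{namespace}name".
COPY_CONTROL = "{urn:ietf:params:xml:ns:copycontrol}copyControl"


def recipients(xml_bytes):
    """Return (uri, copyControl) pairs for every <entry> in the list."""
    root = ET.fromstring(xml_bytes)
    return [(e.get("uri"), e.get(COPY_CONTROL))
            for e in root.findall("rl:list/rl:entry", NS)]


for uri, copy_control in recipients(RESOURCE_LIST):
    print(uri, copy_control)
```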
[47] RFC 5366: Conference Establishment Using Request-Contained Lists in SIP
[48] RFC 5364: Extensible Markup Language (XML) Format Extension for Representing Copy Control Attributes in Resource Lists
BFCP uses TCP connections between the participants and binary coded information. For more details see RFC 4582.
The first two methods were already explained in the previous chapter on centralized and tightly coupled conferences. The REFER-based method is an alternative that lets the conference initiator bring other participants into the conference directly. The initiator simply sends a REFER request to the participant which includes a Refer-To header field with the conference URI. Another method is to use a Join header field, as explained in the next chapter.
[Figure: Alice, Bob, and Carol, connected pairwise by media streams]
[51] Both features, Presence and Instant Messaging, have been defined by the dedicated IETF working group SIMPLE (SIP for Instant Messaging and Presence Leveraging Extensions).
[52] RFC 3428: SIP Extension for Instant Messaging
MESSAGE sip:user2@domain.com SIP/2.0
Via: SIP/2.0/TCP user1pc.domain.com;branch=z9hG4bK776sgdkse
Max-Forwards: 70
From: sip:user1@domain.com;tag=49583
To: sip:user2@domain.com
Call-ID: asd88asd77a@1.2.3.4
CSeq: 1 MESSAGE
Content-Type: text/plain
Content-Length: 18

Watson, come here.

Figure 30: Example of a SIP MESSAGE method

The MESSAGE method can be used inside and outside of a dialog. Inside a dialog (e.g. within an INVITE-based dialog) the MESSAGE request may be sent directly between the user agents (end-to-end); otherwise the routing path (inbound and perhaps also an outbound proxy, etc.) is used. For longer messaging sessions (e.g. chat) the session-based messaging method should be used.
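The structure of such a request can be sketched in Python as follows. This is a minimal sketch, not a SIP stack: `build_message` is a hypothetical helper, all parameter values are taken from the example, and the only computed field is Content-Length, which is derived from the UTF-8 encoded body.

```python
def build_message(from_uri, to_uri, text, call_id, tag, branch, host):
    """Assemble a pager-mode SIP MESSAGE request as bytes (illustrative only)."""
    body = text.encode("utf-8")
    lines = [
        f"MESSAGE {to_uri} SIP/2.0",
        f"Via: SIP/2.0/TCP {host};branch={branch}",
        "Max-Forwards: 70",
        f"From: {from_uri};tag={tag}",
        f"To: {to_uri}",
        f"Call-ID: {call_id}",
        "CSeq: 1 MESSAGE",
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",  # length of the body in bytes
        "",                              # empty line separates header and body
        "",
    ]
    return "\r\n".join(lines).encode("utf-8") + body


request = build_message(
    from_uri="sip:user1@domain.com",
    to_uri="sip:user2@domain.com",
    text="Watson, come here.",
    call_id="asd88asd77a@1.2.3.4",
    tag="49583",
    branch="z9hG4bK776sgdkse",
    host="user1pc.domain.com",
)
print(request.decode("utf-8"))
```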
INVITE sip:bob@biloxi.example.com SIP/2.0
To: <sip:bob@biloxi.example.com>
From: <sip:alice@atlanta.example.com>;tag=786
Call-ID: 3413an89KU
Content-Type: application/sdp

c=IN IP4 atlanta.example.com
m=message 7654 TCP/MSRP *
a=accept-types:text/plain
a=path:msrp://atlanta.example.com:7654/jshA7weztas;tcp

Figure 31: Example session setup of a messaging session (MSRP)
Similar to audio or video sessions the offer/answer model of SDP is applied. The offer is shown in Figure 31 and it tells where Alice is willing to receive the instant messaging stream. An MSRP-URI with a path attribute describes the endpoint of the session. Bob responds to the INVITE request with a 200 OK response containing the SDP answer as shown in Figure 32.
Based on the MSRP offer/answer exchange, an additional TCP connection is set up between Bob and Alice for the MSRP session.
SIP/2.0 200 OK
To: <sip:bob@biloxi.example.com>;tag=087js
From: <sip:alice@atlanta.example.com>;tag=786
Call-ID: 3413an89KU
Content-Type: application/sdp

c=IN IP4 biloxi.example.com
m=message 12763 TCP/MSRP *
a=accept-types:text/plain
a=path:msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp

Figure 32: Example SDP answer for MSRP session.
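To show how the path attribute identifies the endpoint of the MSRP session, the following Python sketch extracts host and port from the SDP answer. It assumes a single MSRP URI in the path attribute, as in the example; `msrp_endpoint` is a hypothetical helper name.

```python
from urllib.parse import urlsplit

SDP_ANSWER = "\r\n".join([
    "c=IN IP4 biloxi.example.com",
    "m=message 12763 TCP/MSRP *",
    "a=accept-types:text/plain",
    "a=path:msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp",
])


def msrp_endpoint(sdp):
    """Return (host, port, uses_tls) from the a=path attribute, or None."""
    for line in sdp.splitlines():
        if line.startswith("a=path:"):
            # Assumption: exactly one MSRP URI in the path (no relays).
            uri = urlsplit(line[len("a=path:"):].strip())
            return uri.hostname, uri.port, uri.scheme == "msrps"
    return None


print(msrp_endpoint(SDP_ANSWER))
```

The returned host and port are where the TCP connection for the MSRP session would be opened.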
MSRP session

The Message Session Relay Protocol defines two request types (also called methods): SEND and REPORT. SEND requests are used to deliver a complete message or a chunk (a portion of a complete message), while REPORT requests report on the status of a previously sent message. When Alice receives Bob's answer, she checks whether she already has an existing connection to Bob. If not, she opens a new TCP connection to Bob using the MSRP URI Bob provided in the SDP. Alice then delivers a SEND request to Bob with her initial message, and Bob replies indicating that Alice's request was received successfully.
A typical SEND request is shown in Figure 33 below. SEND and REPORT requests start MSRP transactions, which are answered with a 200 OK response.

MSRP a786hjs2 SEND
To-Path: msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp
From-Path: msrp://atlanta.example.com:7654/jshA7weztas;tcp
Message-ID: 87652491
Byte-Range: 1-25/25
Content-Type: text/plain

Hey Bob, are you there?
-------a786hjs2$

MSRP a786hjs2 200 OK
To-Path: msrp://atlanta.example.com:7654/jshA7weztas;tcp
From-Path: msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp
-------a786hjs2$

Figure 33: Example MSRP SEND request

Alice's request begins with the MSRP start line, which contains a transaction identifier (a786hjs2) that is also used for request framing. Next she includes the path of MSRP URIs to the destination in the To-Path header field, and her own MSRP URI in the From-Path header field. In this typical case there is just one "hop", so there is only one URI in each path header field (if MSRP relays are in between, additional MSRP URIs are included in the path header fields). Alice also includes a message ID (87652491), which she can use to correlate status reports with the original message. Next she puts the actual content. Finally, she closes the request with an end-line of seven hyphens, the transaction identifier, and a "$" to indicate that this request contains the end of a complete message, in contrast to a chunk only.

The main purpose of the hard-to-guess MSRP URIs is to provide a degree of security. Alice and Bob choose their MSRP URIs in such a way that it is difficult to guess the exact URI. Alice and Bob can reject requests to URIs they are not expecting to service and which they cannot correlate with the probable sender. Alice and Bob can also use TLS to provide channel security over this hop. To receive MSRP requests over a TLS-protected connection, Alice or Bob could advertise URIs with the "msrps" scheme instead of "msrp".
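The framing of a SEND request can be sketched in Python as follows. This is a simplified illustration: `build_msrp_send` is a hypothetical helper, the message is always sent as one complete message (never chunked), and the Byte-Range value is simply derived from the UTF-8 length of the body passed in, which is an assumption of this sketch.

```python
def build_msrp_send(tid, to_path, from_path, message_id, text):
    """Frame one complete (unchunked) MSRP SEND request as bytes."""
    body = text.encode("utf-8")
    n = len(body)
    head = "\r\n".join([
        f"MSRP {tid} SEND",             # start line with transaction identifier
        f"To-Path: {to_path}",
        f"From-Path: {from_path}",
        f"Message-ID: {message_id}",
        f"Byte-Range: 1-{n}/{n}",       # whole message in a single request
        "Content-Type: text/plain",
        "",
        "",
    ]).encode("utf-8")
    # End-line: seven hyphens, transaction identifier, "$" = end of message.
    end_line = f"\r\n-------{tid}$\r\n".encode("utf-8")
    return head + body + end_line


send = build_msrp_send(
    tid="a786hjs2",
    to_path="msrp://biloxi.example.com:12763/kjhd37s2s20w2a;tcp",
    from_path="msrp://atlanta.example.com:7654/jshA7weztas;tcp",
    message_id="87652491",
    text="Hello",
)
print(send.decode("utf-8"))
```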
As already mentioned, intermediary network nodes can be included in a messaging session: MSRP relays as defined in RFC 4976 [54]. Intermediate message session relays may be used for enhanced security and authentication. An additional AUTH request is defined for MSRP relays. A typical application for MSRP relays is instant messaging for trading, where a trusted record of the message exchange is necessary.
[54] RFC 4976
An example MSRP address header (path header fields) including security (TLS) and two messaging relays is shown below:

To-Path: msrps://B.example.com/bbb;tcp msrps://Bob.example.com/bob;tcp
From-Path: msrps://A.example.com/aaa;tcp msrps://Alice.example.com/alice;tcp
11 INFO method
The purpose of the INFO message is to carry application-level information between endpoints, using the SIP dialog signaling path. The INFO method does not, however, update the characteristics of a SIP dialog or session. It only allows the applications that use the SIP session to exchange information. The INFO method was originally defined in RFC 2976 [55], which is now called the legacy INFO method. One of the first applications of this INFO method was to exchange encapsulated ISUP signaling between PSTN-SIP gateways, as shown in Figure 34. This was a typical scenario in the early days of Voice over IP, when an operator offered cheap long-distance calls via the Internet (so-called toll bypass).
Figure 34: Transport of ISUP messages within SIP

The above figure shows the principle of the long-distance toll bypass method. PSTN-SIP gateways on the left and right side, which take the role of SIP user agents, map the ISUP messages for session set-up and release to equivalent SIP messages [56][57]. It is not possible, however, to map all information elements of ISUP without loss, because not all ISUP information fields have an equivalent representation within SIP. Therefore an additional mechanism has been defined to transport all ISUP information unaltered through the SIP network: the encapsulation of ISUP messages within SIP message bodies [58]. With this, all ISUP messages for call set-up and release can be mapped to equivalent SIP messages, but there are additional signaling messages for so-called mid-call services for which an appropriate SIP message was missing. For this case the legacy INFO message is used.
[55] RFC 2976: The SIP INFO Method
[56] RFC 3398: ISUP to SIP Mapping
[57] RFC 3372: SIP for Telephones (SIP-T): Context and Architectures
[58] RFC 3204: MIME media types for ISUP and QSIG Objects
Several other applications for the SIP INFO method have been defined, which are not discussed further here. Eventually a drawback of the legacy INFO method became obvious: an INFO message carried no indication of the application for which it was used.
After some discussion the INFO method was redefined in a backward-compatible manner. The new INFO method [59] also includes an Info Package mechanism. The Info Package specification defines the content and semantics of the information carried in an INFO message associated with the Info Package.
The Info Package mechanism also enables the User Agents to indicate which Info Packages they are willing to receive and for which Info Package a specific INFO request is used. For that purpose two new header fields have been defined:
- The Recv-Info header field lists the names of the Info Packages for which a User Agent is willing to receive INFO requests. The Recv-Info header field may also be empty if the User Agent does not want to receive any INFO request. It is included in a dialog-initiating request (typically INVITE). The receiver also includes a Recv-Info header field in the response. Then both sides know which Info Packages the partner is able to process.
- The Info-Package header field is included in an INFO request to indicate which Info Package is associated with the request.
Figure 35 shows an exchange of Recv-Info header fields in an INVITE request and the corresponding 200 (OK) response. The UAC sends an initial INVITE request, where the UAC indicates that it is willing to receive INFO requests for Info Packages P and R. The UAS sends a 200 (OK) response back to the UAC, where the UAS indicates that it is willing to receive INFO requests for Info Packages R and T.
Figure 36 shows an INFO request with a single payload. It refers to the Info Package foo. The corresponding specification of the Info Package foo must also describe the syntax and semantics of the content type application/foo. As an alternative to a single payload, an INFO request may also contain multiple message body parts.
[59] RFC 6086: Session Initiation Protocol (SIP) INFO Method and Package Framework
INVITE sip:bob@example.com SIP/2.0
Via: SIP/2.0/TCP pc33.example.com;branch=z9hG4bK776
Max-Forwards: 70
To: Bob <sip:bob@example.com>
From: Alice <sip:alice@example.com>;tag=1928301774
Call-ID: a84b4c76e66710@pc33.example.com
CSeq: 314159 INVITE
Recv-Info: P, R
Contact: <sip:alice@pc33.example.com>
Content-Type: application/sdp
Content-Length: ...

...

SIP/2.0 200 OK
Via: SIP/2.0/TCP pc33.example.com;branch=z9hG4bK776;received=192.0.2.1
To: Bob <sip:bob@example.com>;tag=a6c85cf
From: Alice <sip:alice@example.com>;tag=1928301774
Call-ID: a84b4c76e66710@pc33.example.com
CSeq: 314159 INVITE
Contact: <sip:bob@pc33.example.com>
Recv-Info: R, T
Content-Type: application/sdp
Content-Length: ...

...
INFO sip:alice@pc33.example.com SIP/2.0
Via: SIP/2.0/UDP 192.0.2.2:5060;branch=z9hG4bKnabcdef
To: Bob <sip:bob@example.com>;tag=a6c85cf
From: Alice <sip:alice@example.com>;tag=1928301774
Call-Id: a84b4c76e66710@pc33.example.com
CSeq: 314333 INFO
Info-Package: foo
Content-type: application/foo
Content-Disposition: Info-Package
Content-length: 24

I am a foo message type
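The effect of the Recv-Info exchange can be sketched in a few lines of Python. The helpers `parse_recv_info` and `may_send` are hypothetical names for this illustration; the header values are those of the INVITE and 200 (OK) example, where the UAC advertises P and R and the UAS advertises R and T.

```python
def parse_recv_info(value):
    """Split a Recv-Info header value into a set of Info Package names."""
    return {name.strip() for name in value.split(",") if name.strip()}


def may_send(package, peer_recv_info):
    """A side may send INFO for a package only if the peer advertised it."""
    return package in peer_recv_info


uac_recv = parse_recv_info("P, R")  # Recv-Info in the INVITE
uas_recv = parse_recv_info("R, T")  # Recv-Info in the 200 (OK)

# The UAC looks at what the UAS is willing to receive, and vice versa.
print("UAC may send:", sorted(uas_recv))
print("UAS may send:", sorted(uac_recv))
print("UAC may send P:", may_send("P", uas_recv))
```

An empty Recv-Info value yields an empty set in this sketch, so no INFO request may be sent to that side.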
12 Service Configuration
Many services require a mechanism to allow users to manage configuration parameters. The presence service, which is presented in chapter 4.1, is a prominent candidate for parameter configuration, e.g. buddy lists and authorizations. For service configuration the user has to manipulate data on some server. These data are nowadays stored in XML documents because of their platform independence and the possibility of structuring the data well.
Figure 37: Example of a simple XML document

After the XML declaration in the first line, the data follow in a tree-like structure. Each node in the tree is called an XML element. An XML element starts with an opening tag containing the name of the element enclosed in angle brackets, e.g. <status>, and terminates with a closing tag that contains a slash / and the name of the element, e.g. </status>. XML elements can contain other child elements. XML elements usually contain a text node that represents a value. In the example the value of the <note> element is I'm in London at the moment. XML elements can also be empty, in which case a compact notation can combine the beginning and end tags of the empty element by including a slash / at the end of the element name. For example <test/> is an empty element. XML elements can also contain attributes that further characterize the element by defining its metadata. In the above example the presence element contains two attributes: xmlns and entity. Unlike elements, attributes must always be written with a value.
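The terms used above (element, text node, empty element, attribute) can be tried out with Python's standard XML parser. The document below is an assumed sample modelled on the description; it is not a reproduction of Figure 37, its entity value is invented, and the <test/> element is included only to show the empty-element notation.

```python
import xml.etree.ElementTree as ET

# Assumed sample document (not the original figure).
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<presence xmlns="urn:ietf:params:xml:ns:pidf" entity="pres:alice@example.com">
  <tuple id="t1">
    <status><basic>open</basic></status>
    <note>I'm in London at the moment</note>
  </tuple>
  <test/>
</presence>"""

NS = {"p": "urn:ietf:params:xml:ns:pidf"}
root = ET.fromstring(DOC)

note = root.find("p:tuple/p:note", NS)   # element with a text node
empty = root.find("p:test", NS)          # empty element <test/>

# ElementTree folds the xmlns declaration into the tag name,
# so only "entity" appears in the attribute dictionary.
print(root.tag)
print(root.attrib)
print(note.text)
print(empty.text, len(empty))
```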
[60] XML = Extensible Markup Language, specified by the World Wide Web Consortium (http://www.w3.org/XML/)
XML documents are usually structured according to predefined rules. These rules are typically defined in separate documents such as a Document Type Definition (DTD) or an XML schema. An important attribute is the namespace attribute xmlns. Referring to a globally unique namespace avoids ambiguity. In IETF documents the namespaces are usually URNs [61].
[61] RFC 3986: Uniform Resource Identifier (URI): Generic Syntax
[62] RFC 4825: The Extensible Markup Language (XML) Configuration Access Protocol (XCAP)
The above XCAP operation (HTTP PUT request) is used by Alice to create a new presence list named family. The list contains the members of her family. Two URIs are initially added to the list: Bob's and Cynthia's. XCAP defines two new functional elements: an XCAP client and an XCAP server. They are depicted in Figure 40.
[Figure: the XCAP client sends HTTP requests to the XCAP server and receives HTTP responses]

Figure 40: XCAP functional elements

An XCAP client is an HTTP 1.1 compliant client that supports the rules and conventions specified by XCAP. It sends HTTP requests and receives HTTP responses. An XCAP server is an HTTP 1.1 compliant server that supports the rules and conventions specified by XCAP. It receives HTTP requests and sends HTTP responses.
In the case of a centralized conferencing service the creator of a conference can use XCAP to configure the list of participants. Because of this versatility XCAP uses the concept of application usage. An application usage defines how a particular application uses XCAP to interact with an XCAP server. Each application usage is identified by an AUID (Application Unique ID). The AUID is a string which is included in the HTTP URIs that identify XCAP resources (see next chapter). There are standardized and vendor-proprietary AUIDs. For the above-mentioned use of XCAP for the presence service the following application usages have been standardized:
- XCAP application usage for resource lists [63]
- XCAP application usage for presence authorization [64]
- XCAP application usage for manipulating presence documents [65]
[63] RFC 4826: Extensible Markup Language (XML) Formats for Representing Resource Lists
[64] RFC 5025: Presence Authorization Rules
[65] RFC 4827: An XCAP Usage for Manipulating Presence Document Contents
http://xcap-root@net1.test/root/resource-lists/users/sip:alice@net1.test/resource-list.xml
- XCAP root locator (hostname and directory): http://xcap-root@net1.test/root
- Document selector: AUID (resource-lists), subtree (users), XML document (resource-list.xml)

http://xcap-root@net1.test/rules/org.openmobilealliance.pres-rules/users/sip:alice@net1.test/pres-rules
- XCAP root locator: http://xcap-root@net1.test/rules
- Document selector: org.openmobilealliance.pres-rules/users/sip:alice@net1.test/pres-rules
Figure 41: XCAP URI consisting of XCAP root and document selector

The above XCAP URIs address whole XML documents. When a specific element within the XML document is selected (not shown above), a node separator /~~/ follows, together with the node hierarchy and optionally the addressed attribute. A valid XCAP URI which addresses a specific entry in a resource list might be (all in one line):

http://xcap-root@net1.test/root/resource-lists/users/sip:alice@net1.test/resource-list.xml/~~/list[3]/entry[@uri="sip:dave@net1.test"]
This XCAP URI addresses, within the third <list> element, the <entry> element whose uri attribute has the value sip:dave@net1.test. It might be used in an HTTP PUT request to create or replace the specific element or in an HTTP DELETE request to delete the element.
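The composition of an XCAP URI from its parts can be sketched as follows. `xcap_uri` is a hypothetical helper written for this illustration; it only concatenates the parts for the users subtree and omits the percent-encoding that a real HTTP request would need for characters such as brackets and quotes.

```python
def xcap_uri(root, auid, user, document, node_selector=None):
    """Join XCAP root, document selector and optional node selector."""
    uri = f"{root}/{auid}/users/{user}/{document}"
    if node_selector:
        # "/~~/" separates the document selector from the node selector.
        uri += "/~~/" + node_selector
    return uri


document_uri = xcap_uri(
    "http://xcap-root@net1.test/root",
    "resource-lists",
    "sip:alice@net1.test",
    "resource-list.xml",
)
entry_uri = xcap_uri(
    "http://xcap-root@net1.test/root",
    "resource-lists",
    "sip:alice@net1.test",
    "resource-list.xml",
    'list[3]/entry[@uri="sip:dave@net1.test"]',
)
print(document_uri)
print(entry_uri)
```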
[Figure: the client sends GET resource; the server answers 200 OK with ETag: 1 and the MIME body]

Figure 42: Entity tags in HTTP

Entity tags are used in conditional HTTP requests and also in XCAP operations. To avoid unnecessary download of data (MIME body), conditional HTTP requests can be used. Figure 43 shows conditional HTTP requests which use If-Match and If-None-Match header fields. The response to the first GET request in the figure does not include a MIME body because the client's assumption of the ETag value 2 is correct (304 Not Modified response). When the client then updates the resource, it gets a new ETag value in the response. Then some other change happens and a new ETag is assigned. When the client now updates the resource again while referring to the outdated ETag, it gets an error response (412 Precondition Failed) and has to fetch the resource again before updating.

The mechanism of conditional HTTP requests can be re-used by XCAP, which uses HTTP as the underlying protocol. Conditional XCAP requests are very useful. Before the client adds a new friend to the presence list, the client should make sure that it already has the latest version of the presence list. If it does not, the operation might lead to an undesired result.
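The conditional request logic can be simulated with a small in-memory model in Python. This is a sketch under simplifying assumptions: the class `Resource` and its methods are hypothetical, entity tags are modelled as integers that increase with every change (real entity tags are opaque strings), and no HTTP is actually spoken.

```python
class Resource:
    """In-memory stand-in for a server resource guarded by an entity tag."""

    def __init__(self, body, etag):
        self.body = body
        self.etag = etag

    def get(self, if_none_match=None):
        # If-None-Match: skip the body when the client's copy is current.
        if if_none_match == self.etag:
            return 304, None
        return 200, self.body

    def put(self, body, if_match=None):
        # If-Match: refuse the update when the client's copy is outdated.
        if if_match is not None and if_match != self.etag:
            return 412, None
        self.body = body
        self.etag += 1
        return 200, self.etag


res = Resource("presence list v1", etag=2)
print(res.get(if_none_match=2))            # client copy is current
print(res.put("list v2", if_match=2))      # update accepted, new tag
print(res.put("list v3"))                  # another change, unconditional
print(res.put("list v4", if_match=3))      # outdated tag, update refused
```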
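[Figure 43: Conditional HTTP requests between client and server. GET resource with If-None-Match: 2 is answered with 304 Not Modified. PUT resource with If-Match: 2 and a MIME body is answered with 200 OK and ETag: 3. Afterwards the content changes and the new ETag is 4.]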
A more accurate solution is offered by the combination of two specifications:
- The XCAP-Diff Format [66]
- The XCAP-Diff event package [67]
These two specifications provide a subscription/notification mechanism to keep one or more XML documents synchronized with those stored on an XCAP server. The XCAP-Diff Format specifies an XML format to express changes in an XML document, and the XCAP-Diff event package enables automatic notification when the content changes. The XCAP-Diff mechanism allows the terminal to subscribe not only to changes in the whole document, but also to changes in a particular element or attribute of an XML document. Furthermore, the subscriber can issue a subscription to a collection of XML documents, elements and attributes, even if they are contained in different XML documents. The list of resources to be watched may be maintained in another XML document called a resource list. This list is then referred to in the message body of the SUBSCRIBE request.

There are several different ways in which the server can express the differences. The client may select a specific handling by using a diff-processing parameter specified for the Event header field. The diff-processing parameter may take one of three values: no-patching, xcap-patching and aggregate.
- The value no-patching means that, if the subscription is made to a whole XML document, the document is not included in the notification, only the new entity tag.
- The value xcap-patching means that the client is interested in the actual changes, also in the case of a subscription to whole documents.
- The value aggregate means that the server may aggregate several updates into a single notification. Whether to apply aggregation, and how many updates to aggregate, is decided by local policy.

An example subscription to XML document changes is shown in Figure 44. Please note that XCAP and SIP operations are combined in this operation.
[66] RFC 5874: An XML Document Format for Indicating a Change in XCAP Resources
[67] RFC 5875: An XCAP Diff Event Package
[Figure 44: The XCAP/SIP client sends a SUBSCRIBE request (Event: xcap-diff, message body: resource list) to the XCAP/SIP server, which answers with 200 OK]
The use of the private address space offered by RFC 1918 requires NAT. It was initially considered to be a short-term solution only, but the technique is now used extensively because of its inherent security advantages. With private address space the main issue of address exhaustion has been reduced so significantly that the introduction of IPv6 (the long-term solution) is still progressing at a very slow pace. The consequence is the loss of the transparency of the Internet, with the resulting issues for multimedia protocols and SIP in real life.
[68] RFC 1958: Architectural Principles of the Internet
[69] RFC 4632: Classless Inter-domain Routing (CIDR)
[70] RFC 1918: Address Allocation for Private Internets
[71] RFC 3022: Traditional IP Network Address Translator (Traditional NAT)
This limitation is avoided by also using different port numbers as an additional addressing layer. This method is called NAPT (Network Address and Port Translation) and is the predominant NAT mechanism used today. The principle of NAPT is well known and shown in Figure 45. Two clients in the private network with different IP addresses (192.168.0.2 and 192.168.0.3) use the same port (5060) to communicate with the SIP server on the public Internet. Requests sent by both clients are mapped to the public IP address of the NAPT router (85.127.217.158), and the NAPT mechanism assigns a different source port to each (5060, 23544). The NAPT router keeps a mapping table to forward responses from the SIP server to the correct client. The task of a NAPT box is to create a mapping if required and to exchange IP addresses and port numbers in IP packet headers accordingly when packets traverse the NAPT box. The NAPT mapping usually has a restricted lifetime (e.g. 2 minutes) and must be refreshed regularly by packets sent between client and server.
[Figure 45: NAPT principle]
External IP: 85.127.217.158
Internal IP: 192.168.0.0/255
Source address mapping:
- Src1: 192.168.0.2:5060 is mapped to 85.127.217.158:5060
- Src2: 192.168.0.3:5060 is mapped to 85.127.217.158:23544
Client 2 request: Src: 192.168.0.3:5060, Dst: 195.245.225.190:5060
Server responses:
- Src: 195.245.225.190:5060, Dst1: 85.127.217.158:5060
- Src: 195.245.225.190:5060, Dst2: 85.127.217.158:23544
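The mapping table of a NAPT router can be modelled in a few lines of Python. This is a simplified sketch: the class `Napt` is hypothetical, the port allocation strategy (keep the source port if it is free, otherwise take the next free port starting at 23544) is an assumption chosen to reproduce the numbers of the example, and mapping lifetimes are not modelled.

```python
class Napt:
    """Toy NAPT mapping table (no timeouts, one external address)."""

    def __init__(self, external_ip, first_dynamic_port=23544):
        self.external_ip = external_ip
        self.next_port = first_dynamic_port
        self.out = {}    # (private_ip, private_port) -> external_port
        self.back = {}   # external_port -> (private_ip, private_port)

    def _next_free(self):
        while self.next_port in self.back:
            self.next_port += 1
        return self.next_port

    def outbound(self, src):
        """Translate the source of an outgoing packet, creating a mapping."""
        if src not in self.out:
            # Assumed policy: keep the port if free, else allocate a new one.
            port = src[1] if src[1] not in self.back else self._next_free()
            self.out[src] = port
            self.back[port] = src
        return self.external_ip, self.out[src]

    def inbound(self, dst_port):
        """Find the private destination of a response; None means drop."""
        return self.back.get(dst_port)


napt = Napt("85.127.217.158")
print(napt.outbound(("192.168.0.2", 5060)))
print(napt.outbound(("192.168.0.3", 5060)))
print(napt.inbound(23544))
print(napt.inbound(40000))
```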
13.2 Firewalls
Firewalls are typically implemented in the same equipment as NAT. That means a NAT box usually also has firewalling capabilities, and the two functions cannot be controlled independently. This fact led to a specific categorization model in the past (see chapter 13.5.1 on page 78), which unfortunately did not hold. The behavior of NAT boxes turned out to be unpredictable in many cases, and therefore the method of determining a NAT characteristic as used in STUN (see chapter 13.5.2 on page 79) should no longer be used.
The consequence is that signaling and media streams are impacted, and this sometimes results in:
- Signaling only in one direction
- Session setup without media connection (ghost ring)
- Unidirectional media, etc.

Without additional effort it is impossible to use SIP in NATed network environments. During the several years of SIP standardization various solutions have been proposed and some add-ons to SIP have been defined. The next chapters explain the most important enhancements and protocol extensions in this area. There are enhancements on the user agent side, enhancements within the network, and solutions impacting both sides, including the use of additional servers.
[72] RFC 2663: IP Network Address Translator (NAT) Terminology and Considerations; RFC 3027: Protocol Complications with the IP Network Address Translator
[73] From an expert point of view one can argue that the layering rules of protocols have been violated by SIP. Each layer should use its own addressing mechanism and not re-use addresses of a lower layer in an upper layer.
INVITE sip:franz.edler@technikum-wien.at SIP/2.0
Via: SIP/2.0/TCP 192.168.0.2:28176;branch=z9hG4bK-f275c654cd6b756d
Max-Forwards: 70
To: "Franz Edler"<sip:franz.edler@technikum-wien.at>
From: "Klaus Berner"<sip:klaus.berner@iptel.org>;tag=3f542204
Call-ID: NWU4MDQwNzVjODBhMjhkMjdhMDkxMjhlODkxMGE3NDI.
Contact: <sip:klaus.berner@192.168.0.2:28176;transport=TCP>
CSeq: 1 INVITE
Allow: INVITE, ACK, CANCEL, OPTIONS, BYE, REFER, NOTIFY, MESSAGE, SUBSCRIBE
Content-Type: application/sdp
User-Agent: eyeBeam release 1100z stamp 47739
Content-Length: 417

v=0
o=- 7 2 IN IP4 192.168.0.2
s=CounterPath eyeBeam 1.5
c=IN IP4 192.168.0.2
t=0 0
m=audio 63912 RTP/AVP 107 100 106 6 0 105 8 18 3 5 101
a=fmtp:18 annexb=yes
a=fmtp:101 0-15
a=rtpmap:107 BV32/16000
a=rtpmap:100 SPEEX/16000
a=rtpmap:106 SPEEX-FEC/16000
a=rtpmap:105 SPEEX-FEC/8000
a=rtpmap:18 G729/8000
a=rtpmap:101 telephone-event/8000
a=sendrecv
a=x-rtp-session-id:DB819CF67F894567AC7C1F0F9763A62C
Figure 46: SIP INVITE request originated from behind NAT

The major drawback of most of these solutions is their brittleness: some of the solutions do not work in all environments; some need special configuration and experience on the client side and/or special resources (servers) in the network; and it usually requires a lot of expertise to predict which solution will work in which network constellation. This is the bad news. It is therefore no surprise that the efforts of recent years have concentrated on finding and defining an easy-to-configure one-size-fits-all solution for NAT and firewall traversal. The good news is that such a solution has now been defined. It is a combination of two RFCs, which will be explained later:
- Client-initiated connections: see chapter 13.6.5 on page 92
- Interactive Connectivity Establishment: see chapter 13.6.3 on page 86
Before this solution is explained, some of the other methods will be described, because they are still in use today and are also applied in IMS [74].
74
The rule for a server that receives a request (the next hop, which has to re-use the Via header field for routing responses) is as follows: When the server receives a request, it must examine the value of the "sent-by" parameter in the top Via header field value. If the host portion of the "sent-by" parameter contains a domain name, or if it contains an IP address that differs from the packet source address, the server must add a "received" parameter to that Via header field. This parameter must contain the source address from which the packet was received. This assists the server in sending the response, since the response must be sent to the source IP address from which the request came.

In the case of NAT this means that an outbound proxy server will always add the external IP address from which it received the request to the Via header. But one important parameter is still missing for successfully routing responses back: the port number. RFC 3581 [75] fills this gap. This protocol extension defines a new parameter for the Via header field, called rport. When a client adds an empty rport parameter to the Via header field of a request, the server must also enter the port from which the request originated. The combination of the received and rport parameters then contains both the IP address and the port number of the external address in the case of NAT.
Applied to an example, this means the following: a client behind NAT adds an empty rport parameter to the Via header field of its request.
Via: SIP/2.0/UDP 192.168.0.2:5060;rport;branch=z9hG4bK87asdks7
The server then complements the Via header field with the IP address and port number from which it received the request, as follows:

Via: SIP/2.0/UDP 192.168.0.2:5060;received=85.172.63.100;rport=9876;branch=z9hG4bK87asdks7

A response will then be sent back to 85.172.63.100:9876. The rport procedure is initiated by the client when it adds an empty rport parameter. Therefore the addition of an rport parameter is usually controlled via a provisioning parameter of a user agent. Note that this extension is only necessary in the case of UDP. If TCP is used as transport protocol, responses are always sent on the connection set up by the request.

[75] RFC 3581: An Extension to the Session Initiation Protocol (SIP) for Symmetric Response Routing
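The server-side handling of received and rport can be sketched in Python. This is a minimal illustration, not a SIP parser: `add_received_rport` is a hypothetical helper that works on a single Via header value, assumes an IPv4 address or host name in sent-by, and places received before rport as in the example.

```python
def add_received_rport(via, src_ip, src_port):
    """Complete the top Via value with received and rport (illustrative)."""
    sent, *params = [p.strip() for p in via.split(";")]
    # sent looks like "SIP/2.0/UDP host[:port]"; IPv6 is not handled here.
    host = sent.split()[1].split(":")[0]
    has_empty_rport = "rport" in params

    out = []
    # received is added when sent-by differs from the packet source,
    # and in any case when the client asked for rport.
    if has_empty_rport or host != src_ip:
        out.append(f"received={src_ip}")
    for p in params:
        # Fill the empty rport with the source port of the request.
        out.append(f"rport={src_port}" if p == "rport" else p)
    return ";".join([sent] + out)


via_in = "SIP/2.0/UDP 192.168.0.2:5060;rport;branch=z9hG4bK87asdks7"
via_out = add_received_rport(via_in, "85.172.63.100", 9876)
print(via_out)
```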
[Figure: NAT with the private IP-address/port on the inside and the external and public IP-address/port on the outside]
Be aware that this trick only works if one session partner is on the public Internet (or uses a correctly mapped public address). If it is also behind NAT (green line), the trick won't work because it will then not receive any media packets. Symmetric RTP/RTCP behavior is also a parameter which is usually provisioned at a SIP client.
The three cone NAT categories are shown in Figure 48. The common characteristic of all three flavors is that the NAT mapping only depends on the IP address and port on the private side (IP: 10.0.0.1, Port: 8000). Each host on the public side may reach this IP address and port using the same mapping (IP: 202.123.211.25, Port: 12345). The difference between the three flavors is the firewall behavior. In the case of Full Cone there is no firewall at all and therefore no restriction: the inside host is always reachable from any outside host. Restricted Cone means that the inside host is only reachable from an outside host if the inside host has sent a packet to that outside host within a certain time interval. The IP address of the outside host alone is sufficient for traversing the firewall. If the port of a previously sent packet must also match, we have the Port Restricted Cone behavior.
Full Cone: A host on the public side may always reach the host on the private side using the mapped address/port.
Restricted Cone: Hosts on the public side may only reach the host on the private side if it has sent a packet to their IP address before.
Port Restricted Cone: Hosts on the public side may only reach the host on the private side if it has sent a packet to their IP address and port before.
Figure 48: Cone NAT behavior

All three Cone behavior variants are SIP friendly, because the mapping of an internal IP-address/port to an outside IP-address/port is always the same once it is known.
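The different filtering rules of the three cone flavors can be written down as a small Python function. This is a sketch under stated assumptions: `accepts` is a hypothetical helper, `sent_to` stands for the set of outside (IP, port) pairs to which the inside host has recently sent packets, the addresses are example values, and the time interval of the mapping is not modelled.

```python
def accepts(nat_type, sent_to, packet_from):
    """Decide whether an inbound packet to the mapped address is let through.

    sent_to: set of (ip, port) pairs the inside host has recently sent to.
    packet_from: (ip, port) of the outside host sending the packet.
    """
    ip, _port = packet_from
    if nat_type == "full":
        return True  # no firewall restriction at all
    if nat_type == "restricted":
        # Only the IP address of the outside host must match.
        return any(ip == dst_ip for dst_ip, _ in sent_to)
    if nat_type == "port-restricted":
        # Both IP address and port must match.
        return packet_from in sent_to
    raise ValueError(f"unknown NAT type: {nat_type}")


sent_to = {("198.51.100.7", 5060)}
for nat_type in ("full", "restricted", "port-restricted"):
    print(nat_type, accepts(nat_type, sent_to, ("198.51.100.7", 9999)))
```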
In contrast to that we have the Symmetric NAT behavior as shown in Figure 49.
[Figure: a NAT-enabled firewall between the private and the public address space, with two different external mappings (IP: 202.123.211.25, Port: 12345 and IP: 202.123.211.25, Port: 45678) towards Host A and Host B]

Symmetric NAT: The IP-address/port mapping of the same inside IP-address/port is destination dependent. The mapping cannot be predicted. If, for example, Host B used the mapping of Host A, the packets would be dropped by the firewall, and vice versa.
Figure 49: Symmetric NAT behavior

In the case of Symmetric NAT behavior the mapping is not predictable. This is very SIP unfriendly. As long as we do not know the IP-address/port of the peer, every IP-address/port put into a Via header field, a Contact header field or the SDP will be incorrect. Unfortunately an increasing number of NAT boxes follow a more symmetric behavior because of its stronger firewall characteristic.
[78] RFC 3489: STUN - Simple Traversal of UDP Through Network Address Translators (NATs)
situation [79] so that, seen from an outside server (registrar or proxy server), the client appears not to be behind NAT.
STUN server: A STUN server is located on the outside (public) network and uses two IP addresses. During NAT behavior discovery the client asks the STUN server, for example, to reflect a packet back from its other IP address. If this packet does not arrive, the client concludes that it is behind a symmetric NAT. If the packet arrives, the client concludes that it is behind a cone NAT; the server mirrors the mapped address in the reflected packet.
[Figure labels: NAT enabled firewall; STUN server]
Figure 50: STUN server and algorithm
The (classic) STUN protocol defines different packet formats to control a STUN server. It uses the dedicated port number 3478 and can be used with UDP or TCP as transport protocol.
Both solutions need additional protocol support by a redefined STUN protocol (chapter 13.6.2 on page 82).
79
With all these ingredients every NAT situation can now be solved automatically in the most economical way. If SIP unfriendly NAT implementations are involved, a TURN server (chapter 13.6.3 on page 86) will automatically be inserted into the media path, but only as a last resort if other solutions fail.
Hairpinning: The NAT box should support hairpinning (see Figure 51 below).
Deterministic properties: The NAT algorithm should be deterministic and should not change on the fly.
ICMP Destination Unreachable behavior: The receipt of an ICMP message should not terminate the NAT mapping.
Packet fragmentation: The NAT box should honor the DF (Don't Fragment) bit set by internal traffic. The NAT box should be able to receive in-order and out-of-order packets.
Hairpinning behavior: If the internal client with IP-address/port A:a sends a packet to a second internal client B:b using the external mapped address Y:y of its peer as destination, the packet should not be sent to the outside domain but rather be hairpinned internally in the NAT box. It is questionable whether a packet sent to the outside domain would be returned to the same IP address; a router usually discards such packets instead of sending them back.
[Figure labels: NAT enabled firewall; A:a; X:x; B:b; Y:y]
STUN is not a NAT traversal solution by itself. Rather, it is a tool to be used in the context of a NAT traversal solution. The new STUN protocol enables four different applications, called STUN usages:
a) It enables an endpoint to determine the IP address/port mapping used by a NAT.
b) It provides a way for an endpoint to keep a NAT binding alive.
c) The protocol can be used to execute connectivity checks between two endpoints.
d) The protocol can be used to relay packets between two endpoints.
In keeping with its tool nature, the new STUN protocol defines an extensible packet format, defines operation over several transport protocols, and provides for two forms of authentication.
STUN usages
STUN is intended to be used in the context of one or more NAT traversal solutions. These solutions are known as STUN usages. Each usage describes how STUN is utilized to achieve a NAT traversal solution. A usage typically indicates when STUN messages are sent, which optional attributes to include, what server is used, and what authentication mechanism is to be used. Three usages of STUN are further defined below:
Interactive Connectivity Establishment (ICE)
Client initiated connections (SIP outbound)
Traversal Using Relays around NAT (TURN)
STUN protocol structure
STUN is a client/server protocol supporting two types of transactions. One is a request/response transaction in which a client sends a request to a server and the server returns a response. The second is an indication transaction in which either agent (client or server) sends an indication that generates no response. STUN is a binary protocol. All STUN messages start with a fixed header (see Figure 52 below) that includes a STUN message type (comprising class and method), the message length, a magic cookie and a transaction ID. The class indicates whether this is a request, a success response, an error response, or an indication. The method indicates which of the various requests or indications this is. The basic STUN specification defines just one method, Binding, but the TURN usage also defines the methods Allocate, Refresh, ChannelBind, Send and Data. The transaction ID is used to match requests and responses.
[Figure 52 labels: bit positions 16, 31; leading bits 00; Message Length]
Figure 52: Format of the STUN message header
After the STUN header zero or more attributes may follow. Each attribute is TLV encoded (Type-Length-Value) as shown in Figure 53.
[Figure 53 labels: bit positions 16, 31; Length]
Figure 53: Format of a STUN attribute (TLV)
The STUN protocol defines a basic set of attributes, and some usages define additional extension attributes.
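The header and attribute layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete STUN stack: the function names `build_message` and `parse_message` are made up for this example, while the constants (magic cookie 0x2112A442, Binding request type 0x0001, 20-byte header, attribute values padded to 32-bit boundaries) are taken from the STUN specification RFC 5389.

```python
import os
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined in RFC 5389
BINDING_REQUEST = 0x0001    # class "request", method "Binding"


def build_message(msg_type, attributes=(), transaction_id=None):
    """Serialize a STUN message: 20-byte header followed by TLV attributes."""
    transaction_id = transaction_id or os.urandom(12)   # 96-bit transaction ID
    body = b""
    for attr_type, value in attributes:
        padding = (-len(value)) % 4                     # pad each value to a 32-bit boundary
        body += struct.pack("!HH", attr_type, len(value)) + value + b"\x00" * padding
    # Message type, message length (without header), magic cookie, transaction ID
    header = struct.pack("!HHI12s", msg_type, len(body), MAGIC_COOKIE, transaction_id)
    return header + body


def parse_message(data):
    """Return (msg_type, transaction_id, [(attr_type, value), ...])."""
    msg_type, length, cookie, transaction_id = struct.unpack("!HHI12s", data[:20])
    if cookie != MAGIC_COOKIE or len(data) != 20 + length:
        raise ValueError("not a well-formed STUN message")
    attributes, offset = [], 20
    while offset < len(data):
        attr_type, attr_len = struct.unpack("!HH", data[offset:offset + 4])
        attributes.append((attr_type, data[offset + 4:offset + 4 + attr_len]))
        offset += 4 + attr_len + (-attr_len) % 4        # skip the value and its padding
    return msg_type, transaction_id, attributes
```

Because the message type is at most 14 bits wide, the first two bits of every message built this way are zero, which is one of the properties used to tell STUN apart from other protocols on the same port.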
In the specific case of a Binding request/transaction, a Binding Request is sent from a STUN client to a STUN server (see Figure 54). The STUN client is embedded in an application and multiplexed with the application protocol. When the Binding Request arrives at the STUN server, it may have passed through one or more NATs between the STUN client and the STUN server (in Figure 54 there are two such NATs). As the Binding Request message passes through a NAT, the NAT modifies the source transport address (that is, the source IP address and the source port) of the packet. As a result, the source transport address of the request received by the server is the public IP address and port created by the NAT closest to the server. This is called a reflexive transport address. The STUN server copies that source transport address into an XOR-MAPPED-ADDRESS attribute in the STUN Binding Response and sends the Binding Response back to the STUN client. As this packet passes back through a NAT, the NAT modifies the destination transport address in the IP header, but the transport address in the XOR-MAPPED-ADDRESS attribute within the body of the STUN response remains untouched. In this way, the client can learn the reflexive transport address allocated by the outermost NAT for a specific protocol.
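How the reflexive transport address travels inside the XOR-MAPPED-ADDRESS attribute can be illustrated as follows. The sketch handles IPv4 only; the function names are chosen for this example, whereas the encoding itself (port XORed with the upper 16 bits of the magic cookie, address XORed with the full cookie) follows RFC 5389.

```python
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined in RFC 5389


def encode_xor_mapped_address(ip, port):
    """Server side: build the attribute value for an IPv4 reflexive address."""
    x_port = port ^ (MAGIC_COOKIE >> 16)
    x_addr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    # Layout: one reserved byte, address family (0x01 = IPv4), X-Port, X-Address
    return struct.pack("!BBHI", 0, 0x01, x_port, x_addr)


def decode_xor_mapped_address(value):
    """Client side: recover (ip, port) from the attribute value."""
    _, family, x_port, x_addr = struct.unpack("!BBHI", value[:8])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    port = x_port ^ (MAGIC_COOKIE >> 16)
    ip = socket.inet_ntoa(struct.pack("!I", x_addr ^ MAGIC_COOKIE))
    return ip, port
```

The XOR step means the plain address bytes do not appear in the payload, so a NAT or ALG that searches packets for addresses to rewrite does not find the value.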
[Figure 54 labels: Application; STUN client; STUN server; Binding Request; Binding Response; NAT; NAT; Private network 1; Private network 2; Public network; Client's Server Reflexive Transport Address]
Figure 54: STUN Binding Request/Response
By using the Binding method of STUN a client can acquire its reflexive transport addresses for all its communication protocols. It may then use these addresses inside the payload of protocols and can thus communicate with its peers even if it is located behind a NAT. The reflexive transport addresses are only usable if the NAT behaves well. If the NAT mapping depends on the destination IP address/port, the NAT behaves badly and the acquired reflexive transport addresses will not be usable.
A very interesting feature of the new STUN protocol is that it has been designed to be included (multiplexed) in other application protocols using the same port as the application protocol. This is necessary because at some NATs different destination ports of a packet get a different address mapping. To detect the specific NAT mapping of an application protocol (e.g. SIP with port 5060) the STUN Binding request/response must be sent within the same protocol. Therefore the STUN protocol elements have to be multiplexed within the particular application protocol. The challenge has been to design protocol characteristics into STUN that guarantee no code collision and allow an application to discriminate a STUN packet from other packets of the application protocol. STUN provides the following characteristics in the STUN header for this purpose:
The protocol header starts with two zero bits.
The message type must contain reasonable values.
The message length must be correct.
A specific magic cookie with a fixed value must be present at the correct position.
If these characteristics are not sufficient to distinguish the packets, STUN packets can also contain a fingerprint value carried in a protocol extension field. STUN further defines a set of optional procedures (mechanisms) that may be applied in a specific usage of the protocol. These mechanisms include DNS discovery to locate a STUN server, a redirection technique to an alternate server, and two authentication and message integrity exchanges. The authentication mechanisms are based on a username, a password, and a message-integrity value. The two mechanisms are the long-term credential mechanism and the short-term credential mechanism. The long-term credential mechanism is based on a pre-provisioned username and password and a digest challenge/response exchange similar to HTTP. The short-term credential mechanism uses some out-of-band method (e.g. SIP signaling) to exchange a username and password between client and server prior to the STUN exchange.
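The header characteristics used for demultiplexing can be checked with a few comparisons. The following sketch assumes datagram transport (one STUN message per packet) and uses a function name invented for this example; the check of "reasonable" message type values and the optional fingerprint are left out.

```python
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined in RFC 5389


def looks_like_stun(packet: bytes) -> bool:
    """Heuristic check whether a received datagram is a STUN message."""
    if len(packet) < 20:                       # shorter than the fixed header
        return False
    msg_type, length, cookie = struct.unpack("!HHI", packet[:8])
    if msg_type & 0xC000:                      # the first two bits must be zero
        return False
    if cookie != MAGIC_COOKIE:                 # magic cookie at the fixed position
        return False
    if length % 4 or len(packet) != 20 + length:
        return False                           # message length must be consistent
    return True
```

A SIP message starts with an ASCII letter such as "R" or "I" (0x52, 0x49), whose top two bits are 01, so the first test already separates SIP from STUN on a shared port.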
RFC 5766: Traversal Using Relays around NAT (TURN) - Relay Extensions to STUN
addresses and ports on the relay. When the relay server receives a packet on one of these allocated addresses, the relay server forwards it toward the client. The TURN specification makes use of an extension to the STUN protocol. It allows a client to request a relayed transport address on a TURN server. Figure 55 shows the relayed transport address of a client (yellow) together with its server reflexive address obtained via the STUN Binding method (blue) and the physical transport address (light green). The client behind NAT may in principle offer three different addresses to a potential peer. This is also the basis of the ICE methodology. A relayed transport address of a TURN server will work in any case, irrespective of how bad a NAT may be. Why does the relay method always work? Because the relay server uses a fixed address, a potential issue with an IP address/port dependent mapping is avoided. The variability of destination addresses is shielded by the TURN server, which then relays packets to variable destinations. But there is an obvious drawback of a TURN server: it costs network resources and causes additional delay for the media stream. Therefore a TURN server is usually used as a last resort solution.
[Figure 55 labels: NAT; NAT; Private network; Public network; Peer A; Peer B]
Client's Host Transport Address: 10.1.1.2:17240
Client's Server Reflexive Transport Address: 192.0.2.1:7000
Client's Relayed Transport Address: 192.0.2.15:9000
TURN Server Address: 192.0.2.15:3478
Peer A's Server Reflexive Transport Address: 192.0.2.17:1234
Peer B Host Transport Address: 192.0.2.210:18200
Figure 55: Client's Host, Server Reflexive and Relayed Transport Address
Exchanging Data with Peers
Figure 56 shows the setup of relay ports and the exchange of data between clients, peers and a TURN server. The client requests the allocation of relay ports from the TURN server. After that it may use the allocated ports to send data to its peers. The address of the peer is an attribute in the Send indication. Data received from the peer are delivered in a Data indication. Allocate request/response, Send indication and Data indication are TURN commands (STUN extensions). The overhead of the Send and Data indications (36 bytes) is relatively high for applications like voice transport. Therefore an optimized procedure may be used in this case: the setup of data channels. Figure 57 shows the setup of a data channel on a TURN server. Data channels have to be set up via a ChannelBind request/response (in Figure 57 for peer A). The ChannelBind request maps the destination address to a slim 4-byte header, thus enabling a more efficient data transport.
[Figure 56 participants: TURN client; TURN server; Peer A; Peer B. Messages: Send (Peer A); Data (Peer A); Send (Peer B); Data (Peer B)]
Figure 56: Setup of Relay Ports and Data Exchange with Send and Data Indication
[Figure 57 participants: TURN client; TURN server; Peer A; Peer B. Messages: Allocate Request; Allocate Response; ChannelBind Request (Peer A to 0x4001); ChannelBind Response; [0x4001] Data; Send (Peer B); Data (Peer B)]
Figure 57: Setup of Relay Ports and Data Channels
The TURN protocol can be used in isolation, but is more properly used as part of the ICE (Interactive Connectivity Establishment) approach to NAT traversal. Some final remarks on the TURN protocol: in addition to the principal mechanism of relaying packets, the protocol also applies authentication to Allocate transactions to avoid security issues (DoS attacks). Refresh mechanisms are also defined for allocations so that resources are not occupied endlessly if control data are lost.
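The saving achieved by a data channel becomes visible when the 4-byte ChannelData header is written out. The sketch below uses function names invented for this example; the header layout (16-bit channel number followed by a 16-bit length) and the channel number range 0x4000 to 0x7FFF are taken from RFC 5766. Padding, which applies only on stream transports, is ignored.

```python
import struct


def frame_channel_data(channel: int, payload: bytes) -> bytes:
    """Prefix application data with the 4-byte ChannelData header."""
    if not 0x4000 <= channel <= 0x7FFF:
        raise ValueError("channel numbers are taken from the range 0x4000-0x7FFF")
    return struct.pack("!HH", channel, len(payload)) + payload


def parse_channel_data(frame: bytes):
    """Return (channel number, payload) of a ChannelData message."""
    channel, length = struct.unpack("!HH", frame[:4])
    return channel, frame[4:4 + length]
```

For a 160-byte voice payload (20 ms of G.711, used here purely as an example) the frame grows to 164 bytes, compared with the 36 bytes of overhead quoted above for Send and Data indications.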
[83] RFC 5245: Interactive Connectivity Establishment (ICE) - A Protocol for NAT Traversal for Offer/Answer Protocols
Figure 58 shows the deployment scenario of an ICE based solution. Two clients are located behind different NATs and want to set up a media session. A precondition for applying ICE is an established signaling connection for each of the clients (blue arrows). This precondition can be fulfilled by following the method of client initiated connections (chapter 13.6.5 on page 92). Each client additionally has access to a STUN and a TURN server [84], which typically may be collocated. The clients use STUN Binding and TURN Allocate requests to obtain additional IP address/port combinations at which they may be reachable. Together with the physical address on the interface, each client will have three candidate addresses. These candidate addresses are required for each media stream the client wants to use (for RTP and RTCP).
[Figure 58 labels: Alice; Bob; NAT; NAT; SIP server; SIP server; SIP signalling]
Figure 58: ICE deployment scenario
Figure 59 shows again the three address types and their relationship. When Alice wants to set up a session, she gathers all the usable (candidate) addresses on which she can receive media and sends an INVITE request. This request contains a modified SDP which includes additional attributes for all the candidate addresses. Such an additional attribute is shown below (line folded for readability):
[84] If only a STUN server is available, this is also a valid scenario, but some NAT situations may then not be covered.
a=candidate:2 1 UDP 1694498815 192.0.2.3 45664 typ srflx
    raddr 10.0.1.1 rport 8998
The a=candidate line contains various parameters such as the IP address and port (192.0.2.3:45664), a type (srflx = server reflexive) and a priority (1694498815), among others [85]. When Bob receives the INVITE request, his user agent does the same: it also gathers all candidate addresses and includes the a=candidate lines in its SDP answer.
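The fields of an a=candidate line, and the way its priority value comes about, can be shown with a short Python sketch. The function names and the returned dictionary layout are assumptions for this example. The priority formula and the recommended type preferences (host 126, peer reflexive 110, server reflexive 100, relayed 0) are those of RFC 5245; they are not explained in this lesson.

```python
# Recommended type preferences from RFC 5245
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, component_id, local_preference=65535):
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)"""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_preference << 8) + (256 - component_id)


def parse_candidate(line):
    """Split an unfolded a=candidate line into its fields."""
    fields = line.split(":", 1)[1].split()
    foundation, component, transport, priority, ip, port = fields[:6]
    extras = dict(zip(fields[6::2], fields[7::2]))   # typ, raddr, rport, ...
    return {
        "foundation": foundation,
        "component": int(component),                 # 1 = RTP, 2 = RTCP
        "transport": transport,
        "priority": int(priority),
        "address": (ip, int(port)),
        "type": extras["typ"],
        "related": (extras.get("raddr"), extras.get("rport")),
    }
```

Applied to the example above, the parser yields type srflx for component 1, and `candidate_priority("srflx", 1)` reproduces the value 1694498815 carried in the line.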
[Figure 59 labels: STUN/TURN server; Relayed address; NAT; Local address]
Figure 59: Candidate addresses and their relationship
When both peers have exchanged their candidate addresses, they both set up possible address pairs and start connectivity tests. The list of possible address pairs is prioritized on both sides so that the most preferable pairs are tested for connectivity before the others (local addresses are preferred over server reflexive addresses, and relayed addresses have the lowest priority). The connectivity test uses a STUN Binding request/response. This means that each peer must listen with an internal STUN server on all advertised candidate addresses/ports and respond accordingly. When the connectivity test succeeds, a usable address pair has been found and further tests are stopped. The detailed procedure is much more complex and should be read in RFC 5245 if required.
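The ordering of the address pairs can be sketched as follows. The pair priority formula is the one given in RFC 5245 (G is the candidate priority of the controlling agent, D that of the controlled agent); the function names and the simple tuple representation of candidates are assumptions for this example. The real procedure additionally pairs only candidates of the same component and address family and prunes redundant pairs.

```python
from itertools import product


def pair_priority(g: int, d: int) -> int:
    """pair priority = 2^32 * MIN(G, D) + 2 * MAX(G, D) + (1 if G > D else 0)"""
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def check_list(local, remote, controlling=True):
    """Form all local/remote pairs and sort them, highest pair priority first.

    local and remote are lists of (address, candidate priority) tuples.
    """
    pairs = []
    for (l_addr, l_prio), (r_addr, r_prio) in product(local, remote):
        g, d = (l_prio, r_prio) if controlling else (r_prio, l_prio)
        pairs.append((pair_priority(g, d), l_addr, r_addr))
    return sorted(pairs, reverse=True)
```

With one host, one server reflexive and one relayed candidate per side, the host/host pair ends up first in the list and the relay/relay pair last, which matches the preference order described above.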
[85] Explaining all details is out of scope of this lesson, as the ICE RFC is a complex one.
[86] RFC 5626: Managing Client Initiated Connections in SIP
[87] In the simple scenario we assume that the registrar also takes the role of the inbound proxy.
[88] For media feature tags see chapter 15.1 on page 17.
their flow to the proxy or registrar alive. The ping message has to be answered by a pong message from the server, which makes it possible to detect when a flow has failed. For connection oriented transports such as TCP the ping is a double carriage-return and line-feed sequence (CRLF CRLF) and the pong is a single CRLF. For transports that are not connection oriented the ping-pong messages are realized by a STUN Binding request/response transaction.
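For connection oriented transports the ping-pong handling is simple enough to sketch in a few lines. The function names and the buffer handling below are assumptions for this example, and the 10 second default for the pong timeout is the value recalled from RFC 5626 rather than something stated in this lesson.

```python
PING = b"\r\n\r\n"   # double CRLF sent by the client
PONG = b"\r\n"       # single CRLF returned by the server


def answer_pings(receive_buffer: bytes):
    """Server side: consume keep-alive pings that precede the next SIP message.

    Returns the pong bytes to send back and the remaining buffer content.
    """
    pongs = b""
    while receive_buffer.startswith(PING):
        receive_buffer = receive_buffer[len(PING):]
        pongs += PONG
    return pongs, receive_buffer


def flow_failed(seconds_since_ping: float, pong_received: bool, timeout: float = 10.0) -> bool:
    """Client side: treat the flow as failed if no pong arrived within the timeout."""
    return not pong_received and seconds_since_ping > timeout
```

A client that detects a failed flow would open a new connection and register again, so that the registrar learns the new flow.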
Figure 60 shows an example of a REGISTER request creating a signaling flow according to the SIP outbound specification (RFC 5626):

REGISTER sip:example.com SIP/2.0
Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bK-bad0ce-11-1036
Max-Forwards: 70
From: Bob <sip:bob@example.com>;tag=d879h76
To: Bob <sip:bob@example.com>
Call-ID: 8921348ju72je840.204
CSeq: 1 REGISTER
Supported: outbound
Contact: <sip:line1@192.0.2.2;transport=tcp>;reg-id=1
  ;+sip.instance="<urn:uuid:00000000-0000-1000-8000-000A95A0E128>"
Content-Length: 0

Figure 60: REGISTER request creating a signaling flow
SIP outbound further defines an option tag outbound, which the UA must insert in the Supported header field at registration. Finally, it should be mentioned that the above mechanism describes only the simple scenario of a co-located registrar and inbound proxy. The SIP outbound specification may also be applied to more complex scenarios, e.g. allowing a pair of edge proxies to be located between the UA and the registrar/proxy.
Figure 61: Protocol structure of an Application Layer Gateway
An ALG in principle works as a B2BUA, because it has to modify many parameters in the user agent domain. At first sight it makes sense to implement such functionality in a NAT/FW box. The main reason is that the NAT/FW box (e.g. a DSL router for a home network) knows the address mapping created by NAT and also the firewall rules that have been set up. It therefore seems quite natural that the NAT/FW box should make the necessary modifications in SIP requests and responses by replacing the address/port values accordingly and opening ports in the firewall. But this also means that the NAT/FW box becomes application aware. The drawback of ALGs is twofold:
The NAT/FW box must implement and fully understand the SIP protocol. In view of the many extensions, the probability is high that some extensions are not properly supported and the end-to-end communication is broken.
The business role of the NAT/FW vendor is a different one. The vendor usually does not have a big interest in permanently evolving the product according to the innovations produced by the SIP protocol groups.
13.7.2 UPnP
Another solution outside of standards is the Universal Plug and Play (UPnP) industry initiative (http://www.upnp.org/). The UPnP standards were mainly pushed by Microsoft and enable a client to control a residential gateway, typically a NAT/FW router. A UPnP enabled client can query the address mapping from the gateway via the SOAP protocol. Compared with an Application Layer Gateway, the advantage of UPnP is that the gateway does not need to implement all the protocol logic of SIP. The drawback of UPnP may be the security risk of having a home gateway that is controlled by different applications. In addition, UPnP uses broadcast messages to advertise network infrastructure information, a fact that is not well accepted by some security managers.
13.7.3 Skype
After all the complexity of NAT/FW traversal for SIP shown in the previous chapters, a question might arise: how does Skype, the well-known VoIP application, handle these issues? Skype does an excellent job in this area. No NAT/FW configuration is known in which Skype media streams are blocked. But the details of how this is handled are not fully transparent, because Skype is a proprietary peer-to-peer technology and uses an encrypted protocol. There has been some research and reverse engineering on the protocol in the past [89]. The main outcome in the area of NAT/FW was: Skype tunnels signaling and media streams through port 8080 (http) as a kind of last resort if no other method succeeds. Skype also uses some kind of TURN server in case of a bad NAT, but the servers that the media streams traverse are not provisioned by Skype; they are super-nodes. These are the hosts of some high performance users with high bandwidth access and a fixed IP address (with and without their knowledge).
Also with Skype we can conclude that there is no easier way to traverse bad NATs.
[89] http://www1.cs.columbia.edu/~library/TR-repository/reports/reports-2004/cucs-039-04.pdf
A software module within the server which corrects the unusable private IP addresses and wrong SDP data (nathelper- or mediaproxy module).
This solution is interesting insofar as the TURN-like servers of SER are included automatically (SDP data are modified) without any knowledge of the user agents.
14 Session Timer
SIP does not define a keep-alive mechanism for the sessions it establishes. Although the user agents may be able to determine whether the session has timed out by using session-specific mechanisms, SIP proxy servers in the signaling path are not able to do so. As a result, a call stateful proxy server cannot always determine whether a session is still active. For instance, when a user agent fails to send a BYE message at the end of a session, or when the BYE message gets lost due to network problems, a call stateful proxy will not know that the session has ended. In this situation the proxy retains state for the call and has no way to determine whether the call state information is still valid.
To resolve this problem the SIP session timer extension, defined in RFC 4028 [90], adds a keep-alive mechanism for SIP sessions. UAs send periodic re-INVITE or UPDATE requests (referred to as session refresh requests) to keep the session alive. The interval for the session refresh requests is determined through a negotiation mechanism at session setup. If a session refresh request is not received before the interval passes, the session is considered to be terminated. Both UAs are supposed to send a BYE, and call stateful proxies can remove any state for the call. The solution works as long as at least one of the two participants in a dialog understands the extension.
Two new header fields (Session-Expires and Min-SE) and a new response code (422) are defined. The Session-Expires header field defines the time interval for refreshing the session and also carries a parameter indicating who is the refresher (uac or uas). The Min-SE header field defines a minimum value for this time interval to avoid overloading network elements. The response code 422 Session Interval Too Small prevents session setup with refresh timer values that are too short. Via the option tag timer a User Agent can signal whether it supports (Supported header field) or even requires (Require header field) this protocol extension. The default time interval proposed for refreshing audio sessions is 30 min.
The session timer extension allows the user agents or proxy servers to negotiate the timer value and to define which of the two User Agents is responsible for session refreshing. Both user agents and any SIP proxy server in between may activate the session timer procedure. The only precondition is that one of the two user agents supports the extension. SIP proxies in between are able to reject the session timer procedure if the refresh intervals are too short.
Figure 62 and Figure 63 show an example of a session setup with the SIP session timer extension. In this example, both the UAC and the UAS support the session timer extension, and the assumption is that both proxy servers use Record-Route to stay within the signaling path.
90
The session starts when Alice sends an INVITE request to Bob. Alice starts the session with a proposed session timer (Session-Expires header field, SE) of 120 seconds. This session timer is rejected by proxy P1 because the timer value is too short for the proxy. Proxy P1 requires a minimum session timer of 1800 seconds (Min-SE header field, MSE). Alice sends a new INVITE request, this time with a Session-Expires header field (SE) value of 1800. Proxy P1 forwards the INVITE to proxy P2, where the same rejection happens: proxy P2 requires a minimum session timer of 3600 seconds. The third INVITE request of Alice, with an SE and MSE value of 3600, now succeeds and arrives at Bob. Bob's user agent decides that Alice is to be the refresher of the session (it adds a refresher=uac parameter to the Session-Expires header field). After some time (before the session timer at Alice expires) Alice's user agent refreshes the session with an UPDATE request.
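The retry behavior of Alice's user agent in this walk-through can be reproduced with a small simulation. The function name and the list of proxy minimums are assumptions for this example, and the sketch models only the case in which the proxies reject a too short interval with 422 and the UAC retries with the Min-SE value it was given.

```python
def negotiate(session_expires, proxy_minimums):
    """Simulate the INVITE retries of a UAC that supports the session timer.

    proxy_minimums lists the Min-SE value of each proxy in path order.
    Returns the accepted interval and the (SE, MSE) pair of every attempt.
    """
    min_se = None
    attempts = []
    while True:
        attempts.append((session_expires, min_se))
        # The first proxy on the path whose minimum exceeds SE rejects the request.
        rejecting = next((m for m in proxy_minimums if session_expires < m), None)
        if rejecting is None:
            return session_expires, attempts
        # 422 Session Interval Too Small carries the Min-SE of the rejecting proxy;
        # the retry copies it into Min-SE and raises Session-Expires accordingly.
        min_se = max(min_se or 0, rejecting)
        session_expires = max(session_expires, min_se)
```

Calling `negotiate(120, [1800, 3600])` yields three attempts, (120, none), (1800, 1800) and (3600, 3600), which is the message sequence of the example.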
[Figure 62 participants: Alice; Proxy P1; Proxy P2; Bob. Messages: INVITE (SE: 120); 422 Session Interval Too Small (MSE: 1800); INVITE (SE: 1800, MSE: 1800); 422 Session Interval Too Small (MSE: 3600); INVITE (SE: 3600, MSE: 3600)]
[Figure 63 participants: Alice; Proxy P1; Proxy P2; Bob. Events: UA crashes; Timeout; State is removed in SIP proxies after session timeout]
Figure 63: SIP Session timer extension (part 2)
After some time the user agent of Alice crashes. No SIP signaling is sent anymore. After the session timeout at the user agent of Bob, that user agent sends a BYE request. This BYE request times out at proxy P1 (and perhaps also at proxy P2 and at the UA of Bob) because the UA of Alice does not answer anymore. Independently of Bob's BYE request, the activated SIP session timer extension causes the session state to be removed in both SIP proxy servers. The following messages show the SIP session timer extension for the above example (bold font). The first INVITE request sent by Alice may look like:
INVITE sips:bob@biloxi.example.com SIP/2.0
Via: SIP/2.0/TLS pc33.atlanta.example.com;branch=z9hG4bKnashds8
Supported: timer
Session-Expires: 120
Max-Forwards: 70
To: Bob <sips:bob@biloxi.example.com>
From: Alice <sips:alice@atlanta.example.com>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Contact: <sips:alice@pc33.atlanta.example.com>
Content-Type: application/sdp
Content-Length: 142

(Alice's SDP not shown)
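How a user agent might evaluate the Session-Expires header field of such a message is sketched below. The function names are assumptions for this example. The timing rule (refresh once half of the interval has elapsed; the non-refreshing side sends BYE shortly before expiry, earlier by the smaller of 32 seconds and one third of the interval) follows the recommendation in RFC 4028 as recalled here and is not stated in this lesson.

```python
def parse_session_expires(value: str):
    """'3600;refresher=uac' -> (3600, 'uac'); the refresher is None if absent."""
    interval, *params = [part.strip() for part in value.split(";")]
    refresher = None
    for param in params:
        name, _, val = param.partition("=")
        if name.strip().lower() == "refresher":
            refresher = val.strip().lower()
    return int(interval), refresher


def timers(interval: int):
    """Return (seconds until refresh is sent, seconds until BYE is sent)."""
    refresh_after = interval // 2                     # refresher: half the interval
    bye_after = interval - min(32, interval // 3)     # other side: shortly before expiry
    return refresh_after, bye_after
```

For the negotiated value of 3600 seconds this gives a refresh after 1800 seconds, well before the proxies would discard their call state.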
[91] RFC 3840: Indicating User Agent Capabilities in SIP
[92] RFC 2703: Protocol-independent Content Negotiation Framework
sip.data
sip.control
X and higher. This does not imply that a phone has to reject calls of lower priority. As always, the decision on handling of such calls is a matter of local policy.
sip.methods: Each value of the sip.methods (note the plurality) feature tag indicates a SIP method supported by this UA. In this case, "supported" means that the UA can receive requests with this method. In that sense, it has the same connotation as the Allow header field.
sip.extensions: Each value of the sip.extensions feature tag is a SIP extension (each of which is defined by an option-tag registered with IANA) that is understood by the UA. Understood, in this context, means that the option tag would be included in a Supported header field in a request.
sip.schemes: Each value of the sip.schemes (note the plurality) media feature tag indicates a URI scheme that is supported by a UA. Supported implies, for example, that the UA would know how to handle a URI of that scheme in the Contact header field of a redirect response.
sip.actor: This feature tag indicates the type of entity that is available at this URI.
sip.isfocus: This feature tag indicates that the UA is a conference server, also known as a focus, and will mix together the media for all calls to the same URI.
sip.byeless: The feature tag is a boolean flag. When set it indicates that the device is incapable of terminating a session autonomously.
sip.rendering: This feature tag contains one of three tokens indicating if the device is rendering any media from the current session ("yes"), none of the media from the current session ("no"), or if this status is not known to the device ("unknown").
sip.message: This feature tag indicates that the device supports message as a streaming media type.
The above example means that the UA supports audio and video sessions, can also act as a mailbox (actor=msg-taker and automata=true), is a fixed client and supports the methods INVITE, BYE, OPTIONS, ACK and CANCEL. In a REGISTER request the above example is expressed by the following predicate included in the Contact header field:
Contact: <sip:user@host.example.com>;audio;video; actor="msg-taker";automata;mobility="fixed"; methods="INVITE,BYE,OPTIONS,ACK,CANCEL"
In some cases there is (unfortunately) some overlap with existing header fields (e.g. the Allow and Supported header fields). In case of overlap the specific header field has higher priority. In case of a REGISTER request the feature tags are stored in the location database, where they are available to a filter mechanism for terminating requests.
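What a registrar has to extract from such a Contact header field before storing the capabilities in the location database can be sketched as follows. The function names and the resulting dictionary layout are assumptions for this example, and the parser is deliberately simplified: it splits on semicolons outside of quotes and angle brackets and does not implement the full SIP grammar.

```python
import re


def split_params(header_value: str):
    """Split on ';' but not inside double quotes or angle brackets."""
    return [p.strip() for p in re.findall(r'(?:"[^"]*"|<[^>]*>|[^;])+', header_value)]


def parse_contact(header_value: str):
    """Return (contact URI, {feature tag: True | string | list of strings})."""
    uri, *params = split_params(header_value)
    features = {}
    for param in params:
        name, sep, value = param.partition("=")
        if not sep:
            features[name] = True                      # boolean tag such as audio
        else:
            value = value.strip('"')
            features[name] = value.split(",") if "," in value else value
    return uri.strip("<>"), features
```

For the Contact value shown above, the result contains audio, video and automata as boolean tags, actor as "msg-taker", mobility as "fixed", and methods as a list of five method names.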
93
a specific selection of a UA can be reached. This offers several possibilities to create services like [94]:
Routing of INVITE and MESSAGE to Different UA
Audio/Video vs. Audio Only
Forcing Audio/Video
Third-Party Call Control: Forcing Media
Maximizing Media Overlaps
Multilingual Lines
I Hate Voicemail!
I Hate People!
Prefer Voicemail
Routing to an Executive
Speak to the Executive
Mobile Phone Only
Simultaneous Languages
The Number You Have Called
The Number You Have Called, Take Two
Forwarding to a Colleague
Besides the above examples, the Caller Preferences and User Agent Capabilities extension is used within IMS to express the requirements and capabilities to support certain service sets.
[94] These examples are use cases defined in RFC 4596: Guidelines for Usage of the SIP Caller Preferences Extension
the server SHOULD proxy the request to the "best" address (generally the one with the highest q-value). If there are multiple addresses with the highest q-value, the server chooses one based on its local policy. The directive is ignored if "redirect" has been requested.
recurse-directive: This type of directive indicates whether a proxy server receiving a 3xx response should send requests to the addresses listed in the response ("recurse") or forward the list of addresses upstream towards the caller ("no-recurse"). The directive is ignored if "redirect" has been requested.
parallel-directive: For a forking proxy server this type of directive indicates whether the caller would like the proxy server to proxy the request to all known addresses at once ("parallel") or go through them sequentially, contacting the next address only after it has received a non-2xx or non-6xx final response for the previous one ("sequential"). The directive is ignored if "redirect" has been requested.
queue-directive: If the called party is temporarily unreachable, e.g. because it is in another call, the caller can indicate that it wants to have its call queued ("queue") or rejected immediately ("no-queue"). If the call is queued the server returns "182 Queued".
An example for request handling directives might be:
Request-Disposition: proxy, recurse, parallel
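A proxy evaluating the Request-Disposition header field has to sort the tokens into directive types and apply the "ignored if redirect" rule. The following sketch shows one way to do this; the function name and the result format are assumptions for this example, while the six directive types and their token pairs are those defined in RFC 3841.

```python
# Directive types and their two possible tokens (RFC 3841)
DIRECTIVE_PAIRS = {
    "proxy": ("proxy", "redirect"),
    "cancel": ("cancel", "no-cancel"),
    "fork": ("fork", "no-fork"),
    "recurse": ("recurse", "no-recurse"),
    "parallel": ("parallel", "sequential"),
    "queue": ("queue", "no-queue"),
}


def parse_request_disposition(value: str):
    """Map each directive type found in the header field value to its token."""
    chosen = {}
    for token in (t.strip().lower() for t in value.split(",")):
        kind = next((k for k, options in DIRECTIVE_PAIRS.items() if token in options), None)
        if kind is None:
            continue                                   # unknown token: skipped in this sketch
        if kind in chosen and chosen[kind] != token:
            raise ValueError(f"conflicting {kind} directives")
        chosen[kind] = token
    if chosen.get("proxy") == "redirect":
        # Fork, recurse and parallel directives are ignored if "redirect" was requested.
        for kind in ("fork", "recurse", "parallel"):
            chosen.pop(kind, None)
    return chosen
```

For the example header field above the result simply contains the three requested tokens; a value such as "redirect, parallel, queue" keeps only the redirect and queue directives.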
[Figure 64 labels: Alice; Bob; Clara; Reachable via SIP outbound]
Figure 64: Reachability situation solved by GRUU
RFC 5627 [95] defines such an address (URI) with the GRUU mechanism. This mechanism enables a User Agent to request a globally routable User Agent URI (GRUU) from the registrar during registration. The registrar returns such a GRUU as a parameter in the Contact header field of the 200 OK response to the REGISTER request. If the user agent uses this GRUU in the Contact header field, it can be sure that it is globally reachable via that address. The basic idea behind a GRUU is simple. GRUUs are issued by SIP domains and always route back to a SIP proxy in that domain. The domain maintains the binding between the GRUU and the particular User Agent instance. When a GRUU is used in a request URI, that request arrives at the SIP proxy. The proxy maps the GRUU to the contact for the particular User Agent instance and sends the request there. Therefore it is the registrar that has to provide a globally reachable GRUU; such a URI cannot be generated by the User Agent. Two different types of GRUUs are defined:
A temporary GRUU, which does not reveal the identity of the user agent.
A public GRUU, which may reflect the identity of the user agent.
A temporary GRUU must be used whenever privacy requires hiding the underlying AoR. A precondition for requesting a GRUU is an instance identifier, which the User Agent has to provide as a feature tag [96] at registration. A User Agent that wants to obtain a GRUU at registration must provide an instance ID in the "+sip.instance" Contact header field parameter, for example:
Contact: <sip:callee@192.0.2.2>
  ;+sip.instance="<urn:uuid:f81d4fae-7dec-11d0-a765-0>"
When the registrar detects this header field parameter (in addition to a Supported: gruu header field), it provides two GRUUs in the REGISTER response: a temporary GRUU and a public GRUU. The two GRUUs are returned in the "temp-gruu" and "pub-gruu" Contact header field parameters of the response. For example:
Contact: <sip:callee@192.0.2.2>
  ;pub-gruu="sip:callee@example.com;gr=urn:uuid:f81d4fae-7dec-11d0-a765-0"
  ;temp-gruu="sip:tgruu.7hs==jd7vnzga5w7fajsc7-jd6fabz0f8g5@example.com;gr"
  ;+sip.instance="<urn:uuid:f81d4fae-7dec-11d0-a765-0>"
  ;expires=3600
Note that all parameters of the Contact header field are sent in one line. The line breaks are shown only for better readability.
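How a User Agent might pick the two GRUUs out of the Contact header field of the REGISTER response can be sketched as follows. This is a simplified illustration, not a complete SIP parser: it assumes a single Contact value in the form <uri>;param=value and does not resolve escapes inside quoted strings. The function name parse_contact is hypothetical.

```python
import re

# One header field parameter: ;name or ;name=value, where the value is
# either a quoted string (which may itself contain ';') or a plain token.
_PARAM = re.compile(r';\s*([^=;\s]+)\s*(?:=\s*("(?:[^"\\]|\\.)*"|[^;\s]*))?')


def parse_contact(value):
    """Split a Contact value into its URI and a dict of parameters."""
    match = re.match(r'\s*<([^>]*)>', value)
    if match is None:
        raise ValueError("expected a URI in angle brackets")
    uri = match.group(1)
    params = {}
    for name, raw in _PARAM.findall(value[match.end():]):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]  # strip the surrounding quotes
        params[name.lower()] = raw
    return uri, params


if __name__ == "__main__":
    contact = ('<sip:callee@192.0.2.2>'
               ';pub-gruu="sip:callee@example.com;gr=urn:uuid:f81d4fae-7dec-11d0-a765-0"'
               ';temp-gruu="sip:tgruu.7hs==jd7vnzga5w7fajsc7-jd6fabz0f8g5@example.com;gr"'
               ';+sip.instance="<urn:uuid:f81d4fae-7dec-11d0-a765-0>"'
               ';expires=3600')
    uri, params = parse_contact(contact)
    print(uri, params["pub-gruu"], params["temp-gruu"])
```

The quoted-string branch of the pattern matters here because both GRUU values contain semicolons that must not be mistaken for parameter separators.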
[95] RFC 5627: Obtaining and Using Globally Routable User Agent URIs (GRUUs) in SIP
[96] See chapter 15.1.1 Feature tags
REGISTER sip:example.com SIP/2.0
Via: SIP/2.0/UDP 192.0.2.1;branch=z9hG4bKnashds7
Max-Forwards: 70
From: Callee <sip:callee@example.com>;tag=a73kszlfl
Supported: gruu
To: Callee <sip:callee@example.com>
Call-ID: 1j9FpLxk3uxtm8tn@192.0.2.1
CSeq: 1 REGISTER
Contact: <sip:callee@192.0.2.1>
  ;+sip.instance="<urn:uuid:f81d4fae-7dec-11d0-a765-0>"
Content-Length: 0

SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.0.2.1;branch=z9hG4bKnashds7
From: Callee <sip:callee@example.com>;tag=a73kszlfl
To: Callee <sip:callee@example.com>;tag=b88sn
Call-ID: 1j9FpLxk3uxtm8tn@192.0.2.1
CSeq: 1 REGISTER
Contact: <sip:callee@192.0.2.1>
  ;pub-gruu="sip:callee@example.com;gr=urn:uuid:f81d4fae-7dec-11d0-a765-0"
  ;temp-gruu="sip:tgruu.7hs==jd7vnzga5w7fajsc7-ajd6fabz0f8@example.com;gr"
  ;+sip.instance="<urn:uuid:f81d4fae-7dec-11d0-a765-0>"
  ;expires=3600
Content-Length: 0
An example of GRUU allocation during registration is shown in Figure 65 above. A temporary and a public GRUU are assigned by the registrar in the 200 (OK) response. The Contact header field returned in the response from the registrar contains two additional header field parameters, pub-gruu and temp-gruu. Both GRUUs are valid SIP URIs. The difference is that the public GRUU contains the AoR in readable form and the attached gr parameter reflects the instance ID, while the temporary GRUU only reveals the domain where the GRUU has to be resolved to the physical address of the User Agent instance. The user part of a temp-gruu parameter contains a cryptographic string. The temporary GRUU is valid for the duration of a registration including refreshes, but the public GRUU persists across registrations, assuming that the instance identifier does not change.
Finally, the question of when a GRUU should be used has to be answered. Remember that a GRUU represents the Contact address, but without any limitation regarding global reach. Therefore a User Agent should use a GRUU whenever it populates the Contact header field of dialog-initiating (and target-refreshing [97]) requests and responses. These are
- INVITE and its 2xx response
- SUBSCRIBE and its 2xx response
- REFER and its 2xx response
- UPDATE and its 2xx response
[97] Target refresh requests are requests that may change the remote target address within a dialog. These are re-INVITE and UPDATE.
RFC 5627 shows in chapter 9 an example call flow where GRUUs are used extensively. The interested reader may look at this call flow. A further SIP protocol extension related to GRUUs is specified in RFC 5628 [98]. In IMS a specific event package is defined which allows some nodes to learn about information stored by a SIP registrar, including the registered Contact addresses. Now that the Contact addresses have been enhanced by GRUUs, it is reasonable to also enhance the registration event package with GRUUs. This simply means that the message body of the NOTIFY requests now also contains two new elements, <pub-gruu> and <temp-gruu>.
[98] RFC 5628: Registration Event Package Extension for SIP Globally Routable User Agent URIs (GRUUs)
17 Identity Management
According to the basic SIP standard (RFC 3261), the From and To header fields of a SIP request contain the addresses of the initiator and the recipient of the request. The format of these addresses is an address-of-record (public address). Both header fields are significant only for the user agents (end-to-end), and therefore they are not checked (screened) anywhere in the network. The recipient of a SIP request has no way to verify that the From header field has been populated correctly unless some cryptographic authentication mechanism has been applied. SIP offers some security mechanisms including digest authentication, TLS and S/MIME. None of the three mechanisms provides the comfort and easy handling of authenticated identities which we are used to from the PSTN, because:
- Digest authentication requires a shared secret between session partners. This is not easy to arrange due to the number of different session partners, and in particular it cannot be arranged in advance.
- TLS is only a hop-by-hop mechanism and furthermore requires all intermediate nodes to be trusted.
- S/MIME suffers from the lack of end-user certificates.
In the PSTN one can be sure that the calling line identity offered in a call is correct and can be used to call back. But the PSTN operates under conditions which do not hold on the Internet: the PSTN is based on a network of operators with a trust relationship. The home network of a user is responsible for verifying the calling line identity, and all other networks rely on the correct verification of the identity in the home network. A similar model is used in the IMS (IP Multimedia Subsystem), which is based on an additional header field P-Asserted-Identity that is inserted at the edge of the network and forwarded towards the destination. Between network operators of IMS there is a strict mutual trust relationship. But this model is not applicable on the Internet, where different networks are interconnected without any trust relationship. RFC 4474 [99] defines a viable solution to offer an asserted identity in SIP similar to the one we are used to from the PSTN, without big effort and without a trust relationship between service providers. The solution is based on
- two new functional roles of servers: an authentication server and a verification server
- two new header fields: the Identity header field and the Identity-Info header field
- some additional failure response codes
Figure 66 shows the identification architecture, which is quite simple. It shows a request which is sent from Alice in domain atlanta.com to Bob in domain biloxi.com. The goal of the SIP identity management extension is that Bob can be sure that the From header field in the request is correct (that it corresponds to the identity of the originator).
[99] RFC 4474: Enhancements for Authenticated Identity Management in the Session Initiation Protocol (SIP)
Figure 66: Identification architecture
The following steps show how this is done:
1. Alice sends the request to an authentication server, which could be configured as an outbound proxy for Alice. The authenticator is located in the home domain of Alice and is able to authenticate the originator (e.g. via digest authentication).
2. After successfully authenticating the request, the authenticator checks the content of the From header field and verifies that the address in the From header field corresponds to the Address-of-Record of Alice.
3. The authenticator then calculates a hash code on some parts of different header fields (including the AoR of the From header field) and uses an asymmetric cryptographic algorithm to sign the hash code with its private key. It then includes the signature in an Identity header field and additionally includes an Identity-Info header field with an address where the public key may be obtained by the verification server at the target domain.
4. The request may now traverse the untrusted area of the Internet and arrives at the target domain biloxi.com, where it is handled by the verification server (verifier).
5. The verifier takes the public key of atlanta.com (according to the domain part of the AoR of the From header field). If it does not have the public key of atlanta.com in its cache, it gets the key via the reference address contained in the Identity-Info header field. The verifier calculates the hash code on the same header fields and verifies the signature based on the public key of atlanta.com.
6. If the signature is correct, the verifier concludes that the From header field contains the authenticated identity of the originator of the request and forwards the request to Bob. Otherwise it rejects the request with the failure code 438 (Invalid Identity Header).
Figure 67 shows an INVITE request forwarded by the authenticator after inserting the Identity and Identity-Info header fields (bold font).
INVITE sip:bob@biloxi.example.org SIP/2.0
Via: SIP/2.0/TLS pc33.atlanta.example.com;branch=z9hG4bKnashds8
To: Bob <sip:bob@biloxi.example.org>
From: Alice <sip:alice@atlanta.example.com>;tag=1928301774
Call-ID: a84b4c76e66710
CSeq: 314159 INVITE
Max-Forwards: 70
Date: Thu, 21 Feb 2002 13:02:03 GMT
Contact: <sip:alice@pc33.atlanta.example.com>
Identity: "ZYNBbHC00VMZr2kZt6VmCvPonWJMGvQTBDqghoWeLxJfzB2a1pxAr3VgrB0SsSAa
  ifsRdiOPoQZYOy2wrVghuhcsMbHWUSFxI6p6q5TOQXHMmz6uEo3svJsSH49thyGn
  FVcnyaZ++yRlBYYQTLqWzJ+KVhPKbfU/pryhVn9Yc6U="
Identity-Info: <https://atlanta.example.com/atlanta.cer>;alg=rsa-sha1
Content-Type: application/sdp
Content-Length: 147

v=0
o=UserA 2890844526 2890844526 IN IP4 pc33.atlanta.example.com
s=Session SDP
c=IN IP4 pc33.atlanta.example.com
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000
Figure 67: INVITE request including Identity and Identity-Info header field
RFC 4474 defines exactly how the hash code has to be calculated. It is calculated on a concatenated string comprising parts of the following header fields:
- From: AoR part
- To: AoR part
- Call-ID: value
- CSeq: numeric value and method name
- Date: value
- Contact: AoR part
- Message body: whole content
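The concatenation listed above can be sketched in a few lines. The sketch assumes that the parts are joined with a "|" separator, as the digest-string in RFC 4474 is usually described, and that SHA-1 is used as in the alg=rsa-sha1 parameter of Figure 67. The RSA signing step with the private key of the authenticator is deliberately left out because it requires a cryptographic library; the function names are hypothetical.

```python
import hashlib


def identity_digest_string(from_aor, to_aor, call_id, cseq_number, method,
                           date, contact_uri, body):
    """Concatenate the protected parts in the order listed in the text."""
    parts = [
        from_aor,                        # From: AoR part
        to_aor,                          # To: AoR part
        call_id,                         # Call-ID: value
        f"{cseq_number} {method}",       # CSeq: numeric value and method name
        date,                            # Date: value
        contact_uri,                     # Contact: AoR part
        body,                            # Message body: whole content
    ]
    return "|".join(parts)               # separator is an assumption, see above


def identity_hash(digest_string):
    """SHA-1 hash over the digest string (input to the RSA signature)."""
    return hashlib.sha1(digest_string.encode("utf-8")).digest()


if __name__ == "__main__":
    digest_string = identity_digest_string(
        "sip:alice@atlanta.example.com",
        "sip:bob@biloxi.example.org",
        "a84b4c76e66710",
        314159,
        "INVITE",
        "Thu, 21 Feb 2002 13:02:03 GMT",
        "sip:alice@pc33.atlanta.example.com",
        "v=0\r\n",  # shortened body, for illustration only
    )
    print(identity_hash(digest_string).hex())
```

A verifier would rebuild the same string from the received request and check the signature over it, so any modification of one of the listed parts in transit makes the verification fail.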
The Address-of-Record of the From header field is the primary goal of protection, but the other parts included increase the level of protection. The Date header field (it comprises a date and a time string) protects against replay attacks. In practical scenarios both new functional components (authenticator and verifier) will be implemented as additional tasks of existing SIP proxy servers. The authenticator will be part of an outbound proxy server and the verifier will be part of an inbound proxy server. The authenticator has to guarantee the authenticity of the originator. This can best be done via authentication of each request (via proxy authentication using digest algorithm) or by using a TLS connection towards the user agent in which case a one-time authentication of the user is sufficient. The verifier function of the inbound proxy server may differentiate requests including a verified identity header field from those not containing such a header field. It may use an Alert-Info header field to mark all identity verified INVITE requests and thus let the user decide how to handle unauthenticated requests.
18 ENUM
SIP based sessions and telephone calls on the traditional PSTN use different addressing mechanisms:
- SIP based sessions use SIP URIs, usually including an alphanumeric user part.
- Telephone calls use telephone numbers according to the ITU-T standard E.164 [100].
We already have seen that a telephone number can be mapped into the user part of a SIP URI using the user=phone parameter or it can be expressed as a TEL-URI. This mechanism allows using a telephone number in a SIP session. In the legacy telecom world a user addressed with a telephone number is expected to reside in the PSTN and it may be reached from a SIP user via a SIP/PSTN gateway as shown in Figure 68 (step 1).
Figure 68: E.164 based subscriber addressed with SIP using ENUM
But network operators are now gradually migrating their subscribers to SIP based networks. After migration the subscriber has to be addressed with a SIP URI. That means that operator B has to forward all calls via a gateway to the SIP network (step 2). Obviously it is not economical to use two gateways and the PSTN as a transit network when the subscriber could be reached directly via the Internet. But in this case a mapping from a telephone number (E.164) to a SIP URI is required.
[100] http://www.itu.int/rec/T-REC-E.164/en
Such a mapping is also necessary when the SIP subscriber has to be reached from a PSTN user, because legacy telephone sets only allow dialing numbers. This is where the ENUM system comes in. ENUM represents a DNS based database where E.164 numbers can be mapped to other services (to a SIP service in our example). ENUM is an abbreviation for E.164 NUmber Mapping and is defined in RFC 6116 [101] and RFC 6117 [102]. It is based on the abstract Dynamic Delegation Discovery System (DDDS), which is implemented on DNS. The first step of this mapping is to convert a telephone number into an FQDN (Fully Qualified Domain Name). The rule for this transformation is simple. A specific root within DNS has been reserved (e164.arpa); the order of the telephone digits is reversed to reflect their implicit hierarchy, and a dot is put between each pair of digits. A simple example: the international telephone number
+442079460148
is transformed to the following FQDN
8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa
DNS NAPTR (Naming Authority Pointer) resource records are used for this mapping. Various ENUM services may be mapped to a telephone number. In our example we focus only on a protocol based mapping for SIP. In this case the NAPTR RR is effectively a rewrite rule exchanging the FQDN with a SIP URI. This may look like:
$ORIGIN 8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa
NAPTR 100 10 "u" "E2U+sip" "!^.*!sip:bob@example.com!" .
This NAPTR record describes that the domain 8.4.1.0.6.4.9.7.0.2.4.4.e164.arpa can be contacted by SIP using the SIP URI sip:bob@example.com. There is a list of other ENUM services to which an E.164 number can be mapped, defined according to RFC 6117 (see the IANA registration [103]), but the most important mapping is towards the SIP service.
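Both steps of the ENUM mapping can be sketched in a few lines: building the FQDN from the telephone number, and applying the rewrite rule of the NAPTR record. The sketch performs no DNS query; it assumes the NAPTR regexp field has already been retrieved, that the delimiter does not occur inside the pattern, and that the rule is applied to the telephone number in its "+" form. Both function names are hypothetical.

```python
import re


def e164_to_enum_domain(number, root="e164.arpa"):
    """Reverse the digits, separate them with dots and append the ENUM root."""
    digits = [c for c in number if c in "0123456789"]
    if not digits:
        raise ValueError("no digits in telephone number")
    return ".".join(reversed(digits)) + "." + root


def apply_naptr_regexp(regexp_field, number):
    """Apply a NAPTR rewrite rule of the form !pattern!replacement!flags."""
    delimiter = regexp_field[0]
    parts = regexp_field.split(delimiter)
    if len(parts) != 4:
        raise ValueError("expected <delim>pattern<delim>replacement<delim>flags")
    _, pattern, replacement, flags = parts
    match = re.search(pattern, number, re.IGNORECASE if "i" in flags else 0)
    if match is None:
        return None  # the rule does not apply to this number
    return match.expand(replacement)  # resolves back-references such as \1


if __name__ == "__main__":
    print(e164_to_enum_domain("+442079460148"))
    print(apply_naptr_regexp("!^.*!sip:bob@example.com!", "+442079460148"))
```

With the record shown above the pattern matches any input, so the result is always sip:bob@example.com; rules with back-references can instead carry digits of the number into the resulting URI.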
[101] RFC 6116: The E.164 to Uniform Resource Identifiers (URI) Dynamic Delegation Discovery System (DDDS) Application (ENUM)
[102] RFC 6117: IANA Registration of Enumservices: Guide, Template, and IANA Considerations
[103] http://www.iana.org/assignments/enum-services/enum-services.xml
19 Privacy Mechanism
Sometimes a SIP user does not want his identity to be revealed to other session partners. Not only the From and Contact header fields disclose his identity; other header fields such as Via or Route may also give hints about the origin or destination of a user. While the From header field may easily be modified by the user to obfuscate his identity, all the other header fields mentioned above are system header fields which are required for correct routing within the network. To enable privacy of a user, RFC 3323 [104] defines rules and a header field which expresses the privacy preferences of a user. The RFC proposes that a user populate the From header field with an anonymous SIP URI
From: "Anonymous" <sip:anonymous@anonymous.invalid>
and in addition add a Privacy header field to express his privacy preferences. As the user cannot modify the other system header fields, he relies on a network service which honours his preferences and obfuscates these header fields before requests and responses are forwarded to the session partners. The Privacy header field may carry different values, which are defined as follows:
- header: The user requests that a privacy service obscure those headers which cannot be completely expunged of identifying information without the assistance of intermediaries (such as Via and Contact).
- session: The user requests that a privacy service provide anonymization also for session data (SDP).
- user: In this case the user delegates the privacy to a network service because the user agent is unable to provide privacy.
- none: The user requests that a privacy service apply no privacy functions to this message.
- critical: The user declares that the privacy services requested for this message are critical, and that therefore, if these privacy services cannot be provided by the network, this request should be rejected.
RFC 3325 [105] adds another privacy value which can be used in certain network architectures (IMS), where the identity of a user is asserted by the network (a new P-Asserted-Identity header field is defined for that).
- id: With this value the user requests that the network-asserted identity not be disclosed.
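The interplay of the privacy values, in particular "none" and "critical", can be illustrated with a small decision function for a privacy service. This is a sketch under assumptions: the values of the Privacy header field are taken to be separated by semicolons, the set of services the network can provide is passed in by the caller, and the function name and return values are hypothetical.

```python
def privacy_action(privacy_value, available_services):
    """Decide how a privacy service treats a request.

    Returns ("forward", services_to_apply) or ("reject", set()).
    """
    requested = {v.strip().lower() for v in privacy_value.split(";") if v.strip()}
    if "none" in requested:
        # The user explicitly asks for no privacy functions.
        return ("forward", set())
    critical = "critical" in requested
    services = requested - {"critical"}
    missing = services - set(available_services)
    if missing and critical:
        # Requested services are critical but cannot all be provided.
        return ("reject", set())
    return ("forward", services & set(available_services))


if __name__ == "__main__":
    print(privacy_action("header; session; critical", {"header"}))
    print(privacy_action("header; session", {"header"}))
```

Without "critical" the request is forwarded with whatever subset of the requested services the network can provide.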
[104] RFC 3323: A Privacy Mechanism for SIP
[105] RFC 3325: Private Extensions to SIP for Asserted Identity within Trusted Networks
20 Reason
For creating services it is often useful to know why a SIP request was issued. Take as an example a CANCEL request and consider two different situations in which a CANCEL request is sent:
- The User Agent client uses CANCEL when the caller gives up after listening to the ringing tone for some time.
- A forking Proxy Server uses CANCEL to terminate pending transactions when the session set-up was successful on another branch.
In both cases the system behavior is the same, but for the User Agent server the appropriate handling might differ. In the first case a missed call may show up on the display, but in the second case this would be misleading. Also in case of responses the existing mechanism based on status code and reason phrase is sometimes not sufficient to transport all information required for proper handling of a session failure. For both applications (requests and responses) a Reason header field was defined [106]. A Reason header field contains one or more reason values, each consisting of a protocol and a reason description consisting of cause and text, as shown in the examples below:
Reason: SIP ;cause=200 ;text="Call completed elsewhere"
Reason: Q.850 ;cause=16 ;text="Terminated"
Reason: SIP ;cause=600 ;text="Busy Everywhere"
Reason: SIP ;cause=580 ;text="Precondition Failure"
The protocol Q.850 refers to an ITU-T standard for the PSTN which defines different cause values for unsuccessful calls. In case of interworking with the PSTN via a gateway, the Reason header field makes it possible to transport and process the cause value within the SIP domain. The Reason header field may appear in any request within a dialog, in any CANCEL request and in any response whose status code explicitly allows the presence of this header field.
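How a User Agent server might evaluate the Reason header field of a CANCEL request can be sketched as follows. The parser handles a single reason value only (no comma-separated lists), and the missed-call rule at the end is a hypothetical policy derived from the CANCEL example in this chapter, not something mandated by the RFC.

```python
import re

# ;name=value where the value is a quoted string or a plain token
_PARAM = re.compile(r';\s*([\w.-]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^;,\s]+)')


def parse_reason(value):
    """Parse one reason value into protocol, cause and text."""
    protocol, _, rest = value.strip().partition(";")
    params = {}
    for name, raw in _PARAM.findall(";" + rest):
        if raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]
        params[name.lower()] = raw
    cause = int(params["cause"]) if "cause" in params else None
    return {"protocol": protocol.strip(), "cause": cause, "text": params.get("text")}


def show_missed_call(cancel_reason):
    """Hypothetical policy: no missed-call entry if answered on another branch."""
    if cancel_reason is None:
        return True
    reason = parse_reason(cancel_reason)
    return not (reason["protocol"].upper() == "SIP" and reason["cause"] == 200)


if __name__ == "__main__":
    print(parse_reason('SIP ;cause=200 ;text="Call completed elsewhere"'))
    print(show_missed_call('SIP ;cause=200 ;text="Call completed elsewhere"'))
```

A CANCEL without a Reason header field is treated here like a caller who gave up, which is why None leads to a missed-call entry.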
[106] RFC 3326: The Reason Header Field for the Session Initiation Protocol
21 Path
When a SIP request whose R-URI contains an AoR (Address-of-Record) reaches the inbound proxy server, this server replaces the R-URI with the Contact address of the terminal received during registration. This model assumes that the destination user is directly reachable by the inbound proxy server, but there are SIP network architectures where this is not the case. An example is the IMS network architecture, where the serving network node (S-CSCF) cannot reach the user terminal directly and always has to use a proxy network node (P-CSCF). RFC 3327 [107] defines a SIP protocol extension which allows additional network nodes to be included in the signaling path of terminating requests. It defines a Path header field, which may be used by SIP proxy servers during registration. Network nodes which need to be included in the routing of terminating requests (after re-targeting by the inbound proxy server) only need to add a Path header field during registration. This is illustrated in Figure 69. The mechanism can also be regarded as a sort of Record-Route mechanism for REGISTER requests. During registration the SIP Proxy Server inserts a Path header field with its own address. The address is stored in the location database and inserted in a Route header field automatically whenever a terminating request arrives.
Figure 69 (message labels): the User Agent registers with Supported: path; the proxy adds Path: sip:proxy7.domain.com; terminating requests are routed via the registered path.
[107] RFC 3327: SIP Extension Header Field for Registering Non-Adjacent Contacts
The user agent includes a Supported header field with the option tag path in the REGISTER request. It also receives the Path header field in the 200 OK response to the REGISTER request, which it usually ignores.
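The role of the location database in this mechanism can be sketched with a small in-memory model. The class and method names, the AoR and the Contact address are hypothetical; only the Path value is taken from Figure 69.

```python
class LocationDatabase:
    """Toy registrar storage: AoR -> (Contact address, Path entries)."""

    def __init__(self):
        self._bindings = {}

    def register(self, aor, contact, path=()):
        # The Path header field values received in the REGISTER request
        # are stored together with the Contact address.
        self._bindings[aor] = (contact, list(path))

    def route_terminating_request(self, aor):
        # Re-targeting: the Request-URI becomes the Contact address and
        # the stored Path entries become the Route header field.
        contact, path = self._bindings[aor]
        return {"request_uri": contact, "route": list(path)}


if __name__ == "__main__":
    db = LocationDatabase()
    db.register("sip:bob@domain.com", "sip:bob@192.0.2.4",
                path=["sip:proxy7.domain.com"])
    print(db.route_terminating_request("sip:bob@domain.com"))
```

Because the Route entries are taken from the stored binding, the terminating request first visits the proxy that inserted the Path header field before it reaches the terminal.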
22 Service-Route
RFC 3608 [108] defines a SIP protocol extension which enables a registrar server to inform a user agent about a service route which the user agent may use when requesting originating services. The Service-Route header field is included by the registrar in the 200 OK response to a REGISTER request. The user agent stores the content of the Service-Route header field and uses the addresses contained within it as a preloaded route. Figure 70 shows the usage of the Service-Route header field in an IMS environment. The user agent (IMS-Terminal) registers at the S-CSCF via two additional SIP proxy servers (P-CSCF and I-CSCF). The S-CSCF is the registrar. It inserts a Service-Route header field into the 200 OK, which is stored at the IMS terminal and used as a preloaded route whenever the user agent sends an initial request into the network.
Figure 70 (message flow): The IMS-Terminal sends REGISTER via the P-CSCF and the I-CSCF to the S-CSCF. The 200 OK responses return along the same path and carry Service-Route: sip:orig@scscf.domain.com. For an originating request the IMS-Terminal uses the Service-Route as preloaded Route header field: Route: sip:pcscf.domain.com, sip:orig@scscf.domain.com
[108] RFC 3608: SIP Extension Header Field for Service Route Discovery During Registration
By this mechanism the user agent dynamically receives the address of a server which it should use as a preloaded route whenever it requests a service from the network.
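On the user agent side the mechanism amounts to remembering the Service-Route of the last successful registration and prepending the outbound proxy when an initial request is built. The following sketch uses the addresses of Figure 70; the class and method names are hypothetical.

```python
class UserAgent:
    """Toy user agent that keeps the service route learned at registration."""

    def __init__(self, outbound_proxy):
        self.outbound_proxy = outbound_proxy
        self.service_route = []

    def on_register_ok(self, service_route_values):
        # Store the Service-Route values from the 200 OK of the REGISTER.
        self.service_route = list(service_route_values)

    def preloaded_route(self):
        # Route header field values for an initial originating request:
        # first the outbound proxy, then the stored service route.
        return [self.outbound_proxy, *self.service_route]


if __name__ == "__main__":
    ua = UserAgent("sip:pcscf.domain.com")
    ua.on_register_ok(["sip:orig@scscf.domain.com"])
    print(ua.preloaded_route())
```

A new registration simply overwrites the stored list, so a changed service route takes effect with the next initial request.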
23 Request History
SIP offers simple mechanisms to redirect or retarget [109] a request. This can be done at a proxy server or an application server by changing the request URI. For some applications it is important to determine why and how a session arrived at a specific application and to recognize that the request has been diverted. The request history extension allows the receiving application to get this information. It is specified in RFC 4244 [110]. The extension is based on a new SIP header field History-Info and an option tag histinfo. The History-Info header field may be added to a request when it is created by the User Agent client or by a SIP Proxy Server. The History-Info header field carries the following information:
- Targeted-to-URI: This parameter captures the Request-URI before it is overwritten and forwarded.
- Index: This parameter reflects the chronological order of the Targeted-to-URIs if more than one retarget operation has been performed. It is based on a string of digits separated by dots to indicate the number of forwarding hops and retargets. It also reflects forking and nesting of requests.
- Reason: This is an optional parameter and only added when retargeting occurs.
- Privacy: This is an optional parameter with the privacy value history which may be added to the Targeted-to-URI or to the Privacy header field. It indicates whether a specific or all History-Info header fields should be forwarded.
- Extensions: These optional parameters allow for future extensions.
An example of a History-Info header field is shown below. Note that usually more than one History-Info header field is included in a request (to reflect the routing history step by step). As with, for example, the Route header field, the History-Info header fields may be separate header fields, or the values of several header fields may be accumulated in one History-Info header field with the different values separated by commas, as shown below.
History-Info: <sip:Bob@P1.example.com>;index=1,
  <sip:Bob@P2.example.com>;index=1.1,
  <sip:User3@UA3.example.com?Reason=SIP;cause=486;text="Busy Here">;index=1.2,
  <sip:User5@UA5.example.com>;index=1.3
There are several levels of indices separated by dots. In the above example only two levels are shown.
[109] Retarget means that the request URI is changed during processing of the request.
[110] RFC 4244: An Extension to SIP for Request History Information
The indexing rules are roughly as follows:
- The index starts at 1.
- Each forwarding hop adds a new index level. Each dot in the index reflects a hop or level of nesting, so the number of hops is reflected in the number of dots within the index.
- In case of forking, a SIP Proxy Server creates a new index for each branch. In the above example the indices 1.1, 1.2 and 1.3 reflect three branches created by a forking SIP Proxy Server.
A simple example is shown in Figure 71. It shows the principle of indexing History-Info header fields.
Alice -> Proxy 1: INVITE
Proxy 1 -> Proxy 2: INVITE
  Supported: histinfo
  History-Info: <sip:Bob@P1.example.com>;index=1, <sip:Bob@P2.example.com>;index=1.1
Proxy 2 -> Bob UA1: INVITE
  Supported: histinfo
  History-Info: <sip:Bob@P1.example.com>;index=1, <sip:Bob@P2.example.com>;index=1.1, <sip:Bob@UA1.example.com>;index=1.1.1
Proxy 2 -> Bob UA2: INVITE
  Supported: histinfo
  History-Info: <sip:Bob@P1.example.com>;index=1, <sip:Bob@P2.example.com>;index=1.1, <sip:Bob@UA2.example.com>;index=1.1.2
Proxy 2 -> Bob UA3: INVITE
  Supported: histinfo
  History-Info: <sip:Bob@P1.example.com>;index=1, <sip:Bob@P2.example.com>;index=1.1, <sip:Bob@UA3.example.com>;index=1.1.3
Figure 71: History-Info header field - indexing example
Alice sends an INVITE request to Bob. Proxy 1 starts the History-Info chain by including the Supported header field and by creating two History-Info header fields as it is re-targeting to Proxy 2. Proxy 2 is a forking SIP Proxy which forks the INVITE request to three User Agents of Bob. The History-Info header field is also included in responses. It enables upstream SIP Proxy Servers and the User Agent client to make more intelligent decisions in case of failure responses, because the response reflects all routes where the request was sent, including reasons.
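The indexing of Figure 71 can be reproduced with a few lines of code. The sketch follows the simplified rules given above (a new index level per hop, one index per forked branch) and ignores the Reason and Privacy parameters; the function names are hypothetical.

```python
def retarget(history, new_target):
    """Single forwarding hop: append an entry one index level deeper."""
    parent_index = history[-1][1]
    return history + [(new_target, parent_index + ".1")]


def fork(history, targets):
    """Forking: one copy of the history per branch, numbered from 1."""
    parent_index = history[-1][1]
    return [history + [(target, f"{parent_index}.{n}")]
            for n, target in enumerate(targets, start=1)]


def format_history_info(history):
    """Render the entries as one comma-separated History-Info value."""
    return ", ".join(f"<{uri}>;index={index}" for uri, index in history)


if __name__ == "__main__":
    # Proxy 1 records the original target and retargets to Proxy 2.
    history = [("sip:Bob@P1.example.com", "1")]
    history = retarget(history, "sip:Bob@P2.example.com")
    # Proxy 2 forks to the three User Agents of Bob.
    branches = fork(history, ["sip:Bob@UA1.example.com",
                              "sip:Bob@UA2.example.com",
                              "sip:Bob@UA3.example.com"])
    for branch in branches:
        print("History-Info:", format_history_info(branch))
```

Each branch carries the complete chain up to the forking point, which is what allows a failure response to report where the request was sent.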
The request history extension is used in IMS to control all aspects of the communication diversion service. It should be mentioned that historically a Diversion header field was used to carry information about a call diversion. This header field was never standardized, but due to the lack of standards it was widely used [112]. There even exists a mapping rule on how to map between both header fields (Diversion and History-Info) in case of interworking [113]. Another aspect of diversion services towards a voicemail system is that, when a diversion is done, not only the address (SIP URI) of the voicemail system is relevant but also the address of the user who is responsible for the diversion and the cause. Both parameters can be added to a SIP URI as defined in RFC 4458 [114]. Such a SIP URI will look like
sip:voicemail@example.com;target=bob%40example.com;cause=486
This URI shows that the mailbox of bob@example.com is the target and that the reason for the diversion was user busy.
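A voicemail system could extract the two parameters from such a URI as sketched below. The function is a simplified illustration that only splits URI parameters at semicolons and undoes the percent-encoding; URI headers after a "?" are not handled, and the function name is hypothetical.

```python
from urllib.parse import unquote


def parse_uri_parameters(uri):
    """Split a SIP URI into its base part and a dict of URI parameters."""
    base, *raw_params = uri.split(";")
    params = {}
    for raw in raw_params:
        name, _, value = raw.partition("=")
        params[name.lower()] = unquote(value)  # e.g. %40 becomes @
    return base, params


if __name__ == "__main__":
    base, params = parse_uri_parameters(
        "sip:voicemail@example.com;target=bob%40example.com;cause=486")
    print(base, params["target"], params["cause"])
```

The "@" in the target has to be percent-encoded inside the parameter value, which is why the decoding step is needed before the mailbox can be looked up.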
[112] RFC 5806: Diversion Indication in SIP (historic)
[113] RFC 6044: Mapping and Interworking of Diversion Information between Diversion and History-Info Headers in SIP
[114] RFC 4458: SIP URIs for Applications such as Voicemail and Interactive Voice Response (IVR)
24 SIP-Connected-Id
Chapter 17 (Identity Management) described a method to authenticate the identity of the originator of a session. But how can the identity of the participant at the terminating side be authenticated, or the identity of a new session partner when a participant hands over the session to another person? RFC 4916 [115] offers a solution for this problem. It starts with the fact that the SIP URI of the To header field reflects only the initial target of the originator but not the final destination. Because of re-targeting (changing the value of the request URI) during dialog-initiating requests, the User Agent that receives the session can have an identity different from the one in the To header field. This may happen due to features like call forwarding, call distribution (call centre), call transfer, etc. The solution is based on an UPDATE request and an option tag from-change. It is applicable only to dialogs (usually INVITE based dialogs) and requires that the User Agents include the from-change option tag in the Supported header field of an INVITE request and of the dialog-creating [116] response. This is depicted in Figure 72.
Figure 72 (message flow between Alice and Bob): INVITE (From: Alice ..., To: Bob ..., Supported: from-change); 180 Ringing (From: Alice ..., To: Bob ..., Supported: from-change); UPDATE (From: Bob ..., To: Alice ...); 200 OK; 200 OK; ACK; media stream
[115] RFC 4916: Connected Identity in SIP
[116] A dialog-creating response is the first response from the User Agent Server to a dialog-initiating request. In case of INVITE this is e.g. a 180 (Ringing) response or a 200 (OK) response, whichever comes first.
When the UAS also supports the from-change option, an UPDATE request has to be sent during session set-up irrespective of whether the identity addressed in the To header field corresponds to the targeted user or not. In the above example the identity has not been changed. The UPDATE transaction (red arrows) may also be authenticated using the Identity and Identity-Info header fields presented in chapter 17. Figure 73 shows the same example, but now the identity has been changed due to some retargeting action within the network.
Message flow (Alice, SIP network): 180 Ringing (From: Alice ..., To: Bob ..., Supported: from-change); UPDATE (From: Carol ..., To: Alice ...); 200 OK; 200 OK; ACK; media stream
Figure 73: Application of Connected Identity during session set-up with identity change
Please note that even if the SIP URI in the From header field has been changed, the associated tag (From-tag) must be kept, otherwise the UPDATE cannot be associated with the dialog. The connected identity mechanism can also be applied during a session and also when the identity of the initiator is changed. It simply offers the feature to inform the session partner whenever the identity of a participant has changed. It can also be used in re-INVITE requests.
25 Questions
After studying the relevant chapter of the lesson you should be able to answer the following questions:
Chapter 3: Event State Publication
- Explain the principle of event state publication based on the principal message flow!
- What network components are involved in event state publication?
- What is the advantage of event state publication compared to event notification alone?
- What is the purpose of the PUBLISH request and does it use a message body?
- What is the purpose of the ETag and SIP-If-Match header fields?
Chapter 4: Event Packages
- What is an event package in relation to the event notification framework?
- Why has PIDF been defined as a technology-neutral data format?
- Draw and explain the main components of the presence architecture!
- Why is authorisation of presence subscription usually required?
- What mechanisms for authorisation of presence subscription are available?
- For SIP an enhanced data model for presence was defined. Explain its components!
- Explain some enhancements to the PIDF data structure defined for SIP!
- What is a watcher information event? Explain its usage in case of the presence event!
- What information does the INVITE initiated dialog event offer? For which applications might it be used?
- Explain (based on the message flow) the call-back service implemented with the INVITE initiated dialog event!
- List some event packages and explain their purpose!
Chapter 5: The UPDATE method
- Which problem does the UPDATE method solve?
- What is the difference between an INVITE request and an UPDATE request?
- Draw and comment a typical message flow showing session setup including an UPDATE request!
Chapter 6: Resource Management Describe the principle of resource management signaling in SIP! Which problem does resource management solve? How does resource management impact the setup of a session? Explain the additional SDP attributes used for resource management! Draw and comment an example message flow showing session setup with preconditions! How can a user agent tell its peer that resource management should be used?
Chapter 7: Third Party Session Control What is a typical application for third party session control? Explain the message flow of third party session control and show how SDP data may be exchanged between both User Agents!
Chapter 8: REFER Method Explain the purpose of the REFER method! What is a refer event and how is it related to the REFER method? What information is carried in the NOTIFY body of a refer event? Draw and comment on a message flow example of an unattended call transfer based on REFER! What is the purpose of the Referred-By header field? Why is there a security issue with the Referred-By header field and how can it be solved? What is the purpose of the Replaces header field? In which request is it typically used? How can the Replaces header field be used in an attended call transfer?
Chapter 9: Conferencing What is the advantage of using a central entity (conference focus) for conferencing? What does the Event Package for Conference State offer? What is the role of the mixer? What is the role of the policy server? Explain the steps of creating an ad-hoc conference! What is the conference-factory URI used for? Explain the principle of using a URI list!
What is the floor control protocol used for? By which methods may participants join a conference?
Chapter 10: SIP Based Messaging Explain the two different modes of instant messaging in SIP! What is the drawback of page mode messaging? What does the SDP for the setup of session mode instant messaging look like? What is MSRP? What is an MSRP relay server?
Chapter 11: INFO method What is the INFO method used for and what are its characteristics? For which application is the INFO method used very often? What are Info Packages and how are they referred to in an INFO request?
Chapter 12: Service Configuration Explain the principles of the XCAP protocol! What is an XCAP application usage and how is it related to an XCAP URI? Explain the principal structure of an XCAP URI including document and node selector! Why are entity tags necessary in XCAP protocol handling? Explain the XCAP-Diff event!
Chapter 13: NAT and Firewall Traversal Why is NAT so bad for SIP compared with other protocols? What are the critical points in an INVITE request where wrong addresses might be included when NAT is in use? The classical STUN protocol classifies NAT/FW mechanisms into four categories. Explain these categories! Which one cannot be solved by classical STUN? Why is symmetric NAT so bad? What is the principle of the classical STUN server? Why has the classical STUN approach been re-worked? What is the difference between classical STUN and new STUN? What is a TURN server and why does it always help in case of sophisticated NAT/FW situations?
Explain the principal concept of ICE! Which are the three different address categories that are used in ICE? What is the purpose of the SIP outbound mechanism? What are the drawbacks of application layer gateways?
Chapter 14: Session Timer Which problem does the session timer extension solve? Explain the principle of the session timer extension!
Chapter 15: Caller Preferences and UA Capabilities Explain the principle of User Agent Capabilities! How are UA capabilities (media feature tags) included in SIP signaling? Give some examples of media feature tags! By which header fields can a caller make use of User Agent capabilities? What is the purpose of the Request-Disposition header field?
Chapter 16: Global Routable User URI (GRUU) What is the purpose of GRUUs? Explain the two types of GRUUs! Explain the mechanism by which a GRUU is assigned!
Chapter 17: Identity Management Why is there a problem with the identity of the caller and the callee in basic SIP? How can the problem be solved in basic SIP? What is the drawback of the S/MIME based solution for offering secure identities? Explain the principal mechanism of the authenticated identity management solution! Draw and comment on the identity management solution! What is the purpose of the Identity and the Identity-Info header fields?
Chapter 18: ENUM Which problems does ENUM solve? How can a PSTN user be addressed by a SIP user?
How can a SIP user be addressed by a PSTN user? Explain the ENUM mechanism!
Chapter 19: Privacy Mechanism What problem does the privacy mechanism solve? Which header field is the basis of the privacy mechanism?
Chapter 20: Reason What is the purpose of the Reason header field? Where can it be used? Give an example!
Chapter 21: Path Which problem does the Path mechanism solve? Explain the principle of the Path mechanism including message flow!
Chapter 22: Service-Route Which problem does the Service-Route mechanism solve? Explain the principle of the Service-Route mechanism including message flow!
Chapter 23: Request History What is the purpose of the History-Info header field? What information does the history-index provide?
Chapter 24: SIP-Connected-ID Which problem does the SIP-Connected-ID mechanism solve? Explain the principal mechanism of the extension! What happens with the From- and To-tag when the identity is changed? How can the identity of the User Agent server be authenticated?