
The SIP Servlet specification defines an application programming interface for writing server-side
components (in Java) that can converse in SIP. In other words, components that accept, send,
inspect, create, and modify SIP messages. With those capabilities, a component can act as
a UAC, a UAS, or a proxy. Its design is quite similar to the HTTP servlet that’s been widely used for
implementing web applications. In fact, the specification was derived from the servlet specification, as is
apparent from the definition of the central class of the SIP Servlet spec, SipServlet: it extends
GenericServlet, the same base class that HttpServlet extends.

Just like an HTTP servlet, a SIP servlet executes inside a container: the SIP Servlet container. This
container takes care of various lower-level tasks, for instance keeping a socket open on the
container host for receiving SIP messages. Another example – and this one is specific to SIP
Servlet – is managing the lifecycle of SIP transactions.

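You know, according to the protocol specification (of SIP), every request such as an INVITE or a BYE
must be answered with a final response, for example an OK; for an INVITE, the caller then confirms
that final response with an ACK. A request and all the responses sent back for it (for example an
INVITE, the RINGING in between, and the OK) make up a single SIP transaction.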

In the transport layer, things can go wrong (of course): messages don’t get delivered, or don’t get
delivered in time, and that may lead to some ugly consequences. To compensate for those
circumstances, the protocol defines a retransmission mechanism for SIP messages. If you read the SIP
protocol specification, you’ll understand that it’s quite a difficult task to get right (on our own).
Thankfully, the SIP container takes care of it, so it won’t distract us.
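
To give a feel for what the container does on our behalf, here is a minimal sketch of one small piece of that mechanism: the schedule on which an unanswered INVITE is re-sent over UDP. The timer values follow RFC 3261 (T1 defaults to 500 ms, the interval doubles after each retransmission, and the transaction gives up after 64*T1). The class and method names are made up for this illustration; they are not part of the SIP Servlet API.

```java
import java.util.ArrayList;
import java.util.List;

final class InviteRetransmitSchedule {
    /** RFC 3261 default for T1 (round-trip time estimate), in milliseconds. */
    static final long T1_MILLIS = 500;

    /**
     * Offsets, in milliseconds after the first send, at which an INVITE that
     * received no response at all would be retransmitted over UDP.
     */
    static List<Long> retransmitOffsets(long t1Millis) {
        long timeout = 64 * t1Millis; // Timer B: the transaction gives up here
        long interval = t1Millis;     // Timer A: starts at T1, doubles every time
        long elapsed = interval;
        List<Long> offsets = new ArrayList<>();
        while (elapsed < timeout) {
            offsets.add(elapsed);
            interval *= 2;
            elapsed += interval;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // prints [500, 1500, 3500, 7500, 15500, 31500]
        System.out.println(retransmitOffsets(T1_MILLIS));
    }
}
```

With the default T1 that is six retransmissions before the transaction times out at 32 seconds. Any response, even a provisional one, stops these retransmissions.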

Please allow me to take you through a little detour. You just picked up a new term, SIP transaction.
Now I’d like to introduce you to another important term, SIP dialog. Simply put, a SIP dialog
represents a call-leg. An INVITE and the corresponding success response (e.g. OK) mark the
beginning of a dialog. On the other hand, a BYE message marks the end of it.

So that was the SIP dialog, or call-leg. What about the “call” itself? Well, in case the user-agents are
directly interconnected, we wouldn’t really be able to distinguish a call from a call-leg, because
the call-leg is the call itself. The distinction is more visible in the cases where the call is being
brokered by a B2B user agent. Let’s have a look at the (following) diagram:

This line stretching from party 1 to party 2 represents what people normally would refer to as a
“call”. The line connecting party 1 and the B2BUA is a call-leg, and so is the line connecting
party 2 and the B2BUA. Throughout the lifetime of a call-leg there can be several transactions, as
shown in the following diagram:

All right, that was the little detour, but please keep the definitions of those terms in mind. I can’t
stress enough the importance of using the correct terminology to express your domain.
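
If it helps to see those terms as data, here is a toy model of one call-leg that collects its transactions. It is a deliberately simplified sketch: the names are invented for this illustration, and real dialog handling (early dialogs, failures, timeouts) is richer than this.

```java
import java.util.ArrayList;
import java.util.List;

final class CallLegModel {
    enum State { NOT_ESTABLISHED, ESTABLISHED, ENDED }

    /** A request together with the final status code that answered it. */
    static final class Transaction {
        final String method;
        final int finalStatus;

        Transaction(String method, int finalStatus) {
            this.method = method;
            this.finalStatus = finalStatus;
        }
    }

    /** A dialog (call-leg) that records the transactions exchanged during its lifetime. */
    static final class Dialog {
        private State state = State.NOT_ESTABLISHED;
        private final List<Transaction> transactions = new ArrayList<>();

        void record(String method, int finalStatus) {
            transactions.add(new Transaction(method, finalStatus));
            boolean success = finalStatus >= 200 && finalStatus < 300;
            if (state == State.NOT_ESTABLISHED && method.equals("INVITE") && success) {
                state = State.ESTABLISHED; // INVITE plus success response begins the dialog
            } else if (state == State.ESTABLISHED && method.equals("BYE")) {
                state = State.ENDED;       // BYE marks the end of it
            }
        }

        State state() { return state; }
        int transactionCount() { return transactions.size(); }
    }
}
```

A B2BUA call would then be two such Dialog objects, one per leg, each with its own list of transactions.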

Let’s go back to SIP Servlet. Now that most lower-level details are handled by the SIP servlet
container, what is left to us as servlet programmers is writing code that reacts to incoming SIP
messages, be it a request or a response. It’s a relatively simple task; yet another case of
event-driven component programming.

It’s time for us to know where to handle those events in our code. So, let’s implement our first SIP
Servlet. This servlet is going to answer every call request it receives. It’s not very useful, but it’s as
simple as we can get.

[Open NetBeans]

Open up your NetBeans and create a project of type web-application. [Click-click]. Now let’s import
the additional libraries required to compile SIP Servlet code. The jar comes with the installation of BEA
WLSS [show it, click, click].

Now, create a class that extends SipServlet. [click-click]. Let’s examine what we have in the
SipServlet class by reading the API doc.

[Scroll to doService(…)]. The method doService(…) is the entry point to the SIP Servlet; (that is to
say) the container passes the incoming messages to the SIP Servlet through this method. We
normally wouldn’t want to modify this method, whose default behavior is to determine whether
the message is a request or a response, and then pass the message further to the method
doRequest(…) or doResponse(…) accordingly.

Then either doRequest(…) or doResponse(…) inspects the message further, (by) reading the method of
the request or the status code of the response. For example, if the request is
an INVITE, then the method doInvite(…) will be invoked, passing along the request to it. Similarly,
if the response is an OK, then the method doSuccessResponse(…) will be invoked.

You can see now, these are the slots that we might have to fill in with our message-handling
logic. [Show all the doXXX methods in the API].
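
The dispatch chain just described can be sketched in plain Java as follows. This is a toy model, not the real SipServlet class: Message and ToyServlet are stand-ins invented here (the real doService(…) takes a request and a response argument, one of which is null), but the handler names are the ones you will find in the API doc.

```java
final class DispatchSketch {
    /** Toy stand-in for a SIP message: requests carry a method, responses a status code. */
    static final class Message {
        final String method; // null for a response
        final int status;    // 0 for a request

        private Message(String method, int status) {
            this.method = method;
            this.status = status;
        }

        static Message request(String method) { return new Message(method, 0); }
        static Message response(int status) { return new Message(null, status); }
        boolean isRequest() { return method != null; }
    }

    /** Mirrors the dispatch order described above; each handler returns its own name. */
    static class ToyServlet {
        String doService(Message m) {
            return m.isRequest() ? doRequest(m) : doResponse(m);
        }

        String doRequest(Message m) {
            switch (m.method) {
                case "INVITE": return doInvite(m);
                case "BYE":    return doBye(m);
                default:       return "unhandled " + m.method;
            }
        }

        String doResponse(Message m) {
            if (m.status < 200) return doProvisionalResponse(m); // 1xx
            if (m.status < 300) return doSuccessResponse(m);     // 2xx
            if (m.status < 400) return doRedirectResponse(m);    // 3xx
            return doErrorResponse(m);                           // 4xx, 5xx, 6xx
        }

        // The "slots": a subclass overrides only the ones it cares about.
        String doInvite(Message m) { return "doInvite"; }
        String doBye(Message m) { return "doBye"; }
        String doProvisionalResponse(Message m) { return "doProvisionalResponse"; }
        String doSuccessResponse(Message m) { return "doSuccessResponse"; }
        String doRedirectResponse(Message m) { return "doRedirectResponse"; }
        String doErrorResponse(Message m) { return "doErrorResponse"; }
    }
}
```

For example, new DispatchSketch.ToyServlet().doService(DispatchSketch.Message.response(180)) ends up in doProvisionalResponse, without the subclass ever touching doService or doResponse.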

So, we want to deal with INVITE requests, here’s what we have to do: [write down the doInvite(…)
method, start with comments / pseudo-code]:

doInvite(…) {
//log the message [to demonstrate the structure of a message obj]
//send ringing then sleep for 5 seconds
//send OK
}

We also want to handle the BYE message, because we have to respond to a BYE with an OK so that the
sender can do a proper clean-up. Here’s what to do [write down doBye]:

doBye(…) {
//log the message [to demonstrate the structure of a message obj]
//send OK
}
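
Filled in, the two handlers could look roughly like the sketch below. To keep it runnable without the container jar, Request and Response here are tiny stand-ins written for this illustration; they mimic only createResponse(int) on SipServletRequest and send() on SipServletResponse (the real send() can throw IOException, which this sketch ignores). RecordingRequest is a hypothetical test double, not part of any API.

```java
import java.util.ArrayList;
import java.util.List;

final class AnswerEverythingSketch {
    /** Stand-in for the one SipServletRequest method used here. */
    interface Request {
        Response createResponse(int status);
    }

    /** Stand-in for the one SipServletResponse method used here. */
    interface Response {
        void send();
    }

    static void doInvite(Request invite, long ringMillis) {
        System.out.println("INVITE received: " + invite); // log the message
        invite.createResponse(180).send();                // 180 Ringing
        try {
            Thread.sleep(ringMillis);                     // the demo lets it ring for 5 seconds
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        invite.createResponse(200).send();                // 200 OK: the call is answered
    }

    static void doBye(Request bye) {
        System.out.println("BYE received: " + bye);       // log the message
        bye.createResponse(200).send();                   // 200 OK so the sender can clean up
    }

    /** Test double that records the status codes that were "sent". */
    static final class RecordingRequest implements Request {
        final List<Integer> sent = new ArrayList<>();

        @Override
        public Response createResponse(int status) {
            return () -> sent.add(status);
        }
    }
}
```

Sleeping inside a handler is fine for a demo like this one, but it keeps a container thread busy for the whole ringing period.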

That’s all the Java code needed for now, but we still have one more thing to do: writing the
descriptor. Think of it as SIP Servlet’s equivalent of web.xml in an HTTP Servlet application. The
filename of the descriptor is sip.xml, and it must be placed in the WEB-INF folder. Let me just
copy-and-paste it and I’ll explain the content.

[Create file, copy-paste]

In the sip.xml you define the SIP servlets that should be made available. We do it the same way as
we would do it for HTTP servlet in web.xml:

[Highlight the <servlet> section]

We also have to define a mapping for the servlet.

[Highlight the <servlet-mapping> section]

The difference from the mapping in web.xml is: instead of specifying the pattern of the URL for
invoking the servlet, we specify the pattern of the SIP message that would invoke the (SIP) servlet.

The pattern is a Boolean expression that will be matched against the value of various fields in the
initial SIP message. Please take notice: the pattern only filters __initial__ SIP messages, which are
messages that don’t belong to an existing dialog.
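
The “initial messages only” point is easy to miss, so here is a toy model of it in plain Java. Everything in it is an assumption made for illustration: the real container evaluates the pattern written in sip.xml and identifies a dialog by more than the Call-ID; here a dialog is just a Call-ID string and the pattern is a Predicate that accepts any INVITE.

```java
import java.util.Set;
import java.util.function.Predicate;

final class InitialRequestRouting {
    /** Toy request: just the two fields this sketch looks at. */
    static final class Request {
        final String method;
        final String callId;

        Request(String method, String callId) {
            this.method = method;
            this.callId = callId;
        }
    }

    /** The mapping rule, expressed as a Boolean test: accept any INVITE. */
    static final Predicate<Request> ANY_INVITE = r -> "INVITE".equals(r.method);

    /** Decides what happens to a request, given the dialogs that already exist. */
    static String route(Request r, Set<String> existingDialogs, Predicate<Request> rule) {
        if (existingDialogs.contains(r.callId)) {
            // Not an initial message: the rule is not consulted at all.
            return "delivered to the servlet that owns the dialog";
        }
        return rule.test(r) ? "rule matched, servlet invoked" : "no rule matched";
    }
}
```

So a BYE for an ongoing call reaches our servlet even though the rule only mentions INVITE, while a stray initial OPTIONS does not.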

In this example we’re pretty loose with our filtering; we accept any INVITE.

Now let’s build the WAR. Yes, I mean it, let’s make war. [Click-click]. Finally, we can deploy our
application to BEA WLSS, watch me doing it: [Click-click].

It’s time for a call [start X-Lite]. Remember this is a demo of a P2P, direct call from the phone to the
SIP application server. Therefore, we don’t need the requests coming out of our SIP phone
to go through any proxy. Let’s make sure of it by checking its configuration [open X-Lite account
config].

Hang on, almost there. We just need to add a SIP URI (of the application server) in the SIP phone’s
address book. [Click-click]…, and fire.

[Look at the log and explain]

Now let’s make another call, this time with Wireshark turned on in the machine running the
application server.

[Do it again, now with Wireshark] The trace here is similar to the one we’ve seen before (in the
part where the SIP protocol was briefly explained, in the direct-call case), so I won’t explain it
again [scroll down for 1 minute; fill in commentary: 1.5 minutes of clicking through the messages].

Let’s move on to our second SIP Servlet. This time, our SIP Servlet will act as a B2B user agent. So,
any call coming in to this SIP Servlet will be bridged to, uhm… let’s say a person named Alice, who
has her SIP user-agent running on 192.168.22.34.

A common practice here is to draw signaling diagrams that depict the possible scenarios. For
now we’ll just consider one scenario, the normal one, where everything goes smoothly. So here
it is.

[Draw the signaling diagram with ArtRage]

Now, we’re facing the design challenge. First, we should notice that we have two SIP dialogs
running side by side. Let’s label the one on the left here CallerDialog (because it’s facing the caller).
And… for the one on the right, we’ll name it CalleeDialog.

In SIP Servlet programming, the object that roughly represents a SIP dialog is SipSession.

[Show the API doc of SipSession].

What’s the use of it? Well, it can be used to store session-wide data…, akin to the use of
HttpSession in HTTP servlet programming.

The other use – that we will employ here – is the creation of requests. You see, at one point (here)
[hover your mouse over the BYE], the SIP Servlet will have to create and send a BYE request. That
can be achieved by calling the createRequest(…) method on the SipSession object [show the method
doc in the API doc] that represents the dialog you want to tear down.

Now back to the diagram. By now it’s clear that our SIP Servlet will have to handle two SipSessions
throughout the scenario. Further, we also (should) realize that we need to logically link the two
sessions. By this I mean: we must be able to navigate from one session to the other. Just keep that
in mind for now, as we’ll jump to the next step… that is, mapping the diagram
to the various doXXX methods of SipServlet.

We start from the top, the INVITE request. We already know that we deal with it in the doInvite(…)
method. What exactly do we do there? Well, first we’ll have to send back a response to the caller,
to indicate that the INVITE has been received and to let the caller know that the SIP Servlet is
trying to fulfill it. The name of the response signal is…, surprise, TRYING (code 100).

Ok, another little detour.... Regarding sending this TRYING signal. It’s a good thing to do, in order
to avoid network congestion. You know, the caller, after sending an INVITE, will start a timer.
When it fires off, and no response has been received, it will re-send the INVITE. So, let’s keep that
from happening, and sending the TRYING signal is the way to go.

That’s all for doInvite(…)…, let me mark this part of the diagram, denoting that it falls within the
scope of the doInvite(…) method.

Next…. When the ringing signal is received from the callee, we have to relay it to the caller. We do
that inside the doProvisionalResponse() method.

[Again, highlight]

Later, as soon as the callee picks up, our SIP Servlet will receive an OK signal. This one, also, has to
be relayed to the caller. We do that inside the doSuccessResponse() method.

[Again, highlight]

Finally, after a while, either the caller or the callee will hang up first. Yep, time for total
annihilation. So, what we have here is an incoming BYE request. Again, what we have to do on that
event is relay it to the other side…, and we do it inside the doBye(…) method.

[Again, highlight]

All righty, we’re done with the mapping, now it’s time to do the actual coding.

[Start from doInvite:]

doInvite(inviteFromCaller):

inviteToCallee = SipFactory.createRequest(inviteFromCaller.getSession().getApplicationSession(),
"INVITE", from = copy from the inviteFromCaller, to = create using SipFactory).

…………….

Notice a new type just came up, SipApplicationSession (show SipApplicationSession API DOC).
Basically it’s an object that binds inter-related SipSessions together. The reason why we pass an
instance of SipApplicationSession when creating the request is that we want the request to be
in a completely new SipSession [different from that of the inviteFromCaller], but… (emphasis), that
new session must belong to the same application session (as that of the inviteFromCaller). Think of
it as a common storage for both sessions.

SipFactory is a helper object provided by the container. Another helper method it has is the one for
creating a SIP URI (like this one … highlight the code for the creation of the To header).

As for the From header of the inviteToCallee, we simply copy the value of the From header of the
inviteFromCaller. The idea is to let the callee know the identity of the originator of the call…, not
that of the application server.

In these lines [show the line where I copy the message] we copy the value of Content-Type header
from the inviteFromCaller to the inviteToCallee. The same goes for the message body here. By
doing so, the callee will send the voice data __directly__ -- without passing through the
application server -- to the caller’s user-agent when the conversation finally takes place. This is an
important aspect of SIP: there’s a separation between signaling space and media space. The voice
data can take a different routing, through a completely different network, from the signaling
messages.

Then here, we simply send the invite.

At this point we have two SIP sessions; our task now is to link them. The reason we’re doing this is
that at some point in the scenario we will need to do a cross-over. For example: at the point
we receive a BYE from the caller, what we’ll have is the SipSession that corresponds to the dialog
with the caller. However, we will also need to have the SipSession of the dialog with the callee in
order to be able to create a BYE request to be sent to the callee.

One way to link them is by using the SipSession. So basically in the SipSession of the caller we store
a reference to the SipSession of the callee, and vice versa. Like this [type the two lines of code].

Another approach for linking is storing both sessions in the SipApplicationSession, this way [show
code].
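
Both linking approaches boil down to attribute look-ups, so they can be sketched with plain maps. The classes below are stand-ins written for this illustration; they mimic only the setAttribute / getAttribute pair that SipSession and SipApplicationSession both offer. The attribute name “callerSession” is the one used later in this script; “calleeSession” and “peer” are names I made up here.

```java
import java.util.HashMap;
import java.util.Map;

final class SessionLinkingSketch {
    /** Stand-in for the attribute store offered by both kinds of session. */
    static class AttributeStore {
        private final Map<String, Object> attributes = new HashMap<>();

        void setAttribute(String name, Object value) { attributes.put(name, value); }
        Object getAttribute(String name) { return attributes.get(name); }
    }

    /** Stand-in for a SipSession that knows its application session. */
    static final class ToySession extends AttributeStore {
        final AttributeStore applicationSession;

        ToySession(AttributeStore applicationSession) {
            this.applicationSession = applicationSession;
        }
    }

    /** Approach 1: each session stores a reference to the other one. */
    static void linkDirectly(ToySession caller, ToySession callee) {
        caller.setAttribute("peer", callee);
        callee.setAttribute("peer", caller);
    }

    /** Approach 2: both sessions are stored in the shared application session. */
    static void linkThroughApplicationSession(ToySession caller, ToySession callee) {
        caller.applicationSession.setAttribute("callerSession", caller);
        caller.applicationSession.setAttribute("calleeSession", callee);
    }

    /** The cross-over for approach 2: from the session at hand to the other leg. */
    static ToySession otherSide(ToySession current) {
        Object caller = current.applicationSession.getAttribute("callerSession");
        String wanted = (current == caller) ? "calleeSession" : "callerSession";
        return (ToySession) current.applicationSession.getAttribute(wanted);
    }
}
```

With approach 1 the cross-over is a single getAttribute("peer") on the session at hand; with approach 2 it takes the small otherSide(…) helper, but either session is also reachable from anywhere that can see the application session.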

So, which approach is better? Well, in this case, there’s no technical merit of one over the other.
But of course, in other cases, it depends ☺.

That’s all, we’re done with doInvite(…), so we can move on now to doProvisionalResponse(…).
A provisional response is any response whose status code starts with 1. The TRYING response we just
made, for example, has status code 100. Other examples: 183 for early media, and 180 for
ringing.

Here we want to adhere strictly to the diagram; we won’t consider anything that is not in the
diagram. So we will be selective about the type of provisional response we handle…, we’ll only deal
with Ringing.

[Type the if block]

Now, to create a SipServletResponse object to be sent to the caller, we need to get hold of the SipSession
object corresponding to the dialog with the caller. The reason: it leads us back to the caller’s
original INVITE (kept, for instance, as a session attribute), and that request object is the one that has
the createResponse method we need.

[show the API doc of createResponse].

This is where the linking we did comes in handy; we simply look for it in the SipApplicationSession.

[Type response.getApplicationSession().getAttribute(“callerSession”)]

[Type createResponse]

And finally we send it [type send response].

Next…. The doSuccessResponse method. Success response is any response whose status starts
with 2. However here we’re only interested in the OK response, whose code is 200.

[type if-block code]

First thing to do: send an acknowledgement (ACK) to the callee; otherwise the callee will keep
re-sending the OK to us.

[type send acknowledgement]

Then we create the OK response to be sent to the caller…, the same way as we did it for the
TRYING response.

[Type create response].

Also, we copy the message body of the OK from the callee to the OK for the caller. This way,
during the conversation, the caller will send the voice data directly to the callee’s user-agent.

Finally, and this time I mean it, the doBye(…) method.

Let me just type it first. It’s very simple, just a repetition of what we’ve done in the other
methods: we relay the BYE.

That’s all for the second example. Now we’re ready to touch on the interesting subject in the VoIP
domain: convergence. Looks like many people in this domain are trying to profit from it.

To me, convergence is quite an overloaded term. I thought I knew what it meant, just by using the
fact that a SIP Servlet application can be viewed as “just another TCP/IP application” that can be
made to work together with other TCP/IP applications; add a little bit of imagination and, tada, we
have “convergence”. That was my initial impression of convergence. However, preparing the
material for this video forced me to re-evaluate my (old) understanding.

I found an interesting note in a whitepaper published by a company named AudioCodes, where
the author mentions three types of convergence:
- Device convergence
- Network convergence
- and Service convergence.

Here, let me show you the whitepaper. [show the whitepaper on the screen].

Network convergence is about enabling access to a common core IP network through various
types of access networks (such as GSM, PSTN, etc).

Device convergence… uhm… I guess it can be understood as the ability of a single device to use
multiple access networks.

Lastly, service convergence is more about enabling access to a common service in the core IP
network through various access networks and various devices. Well, uhm, I believe so.

It should be obvious that we, in this video, are more interested in the service convergence. So,
let’s think of a scenario of a “converged service”….. [pause 15 seconds].

All right, I have to admit I had difficulties finding a real-world example of such a service. Well, my
only resource is Google, and I couldn’t find anything (weird, heh?). I already asked one of my
friends who lives in Japan about this. I thought this kind of thing was already common there. He
commented that the realization of “converged services” requires a more advanced
infrastructure, that is 4G, which will be available in Japan starting in 2010. I was like: what?
More waiting? Why? I thought that with the availability of 3G networks half of the world’s problems,
including converged services, had been solved?!

Aaannyway, if we see this in positive light: we’re just in time for that. We start investing time
learning the concepts and the development skills now, and we’ll reap the benefits within one or
two years ☺.

Ok, let’s track back to the “definition” of service convergence: “access to a common service
through multiple access networks”.

Well, if that’s what it’s all about, maybe it’s not that new after all. I mean, you remember WAP? To
my understanding it was introduced to complement its big brother (WWW), by extending the
audience of web applications to mobile phone users, right? That’s access to a common service
through multiple access networks, I believe.

Additionally, VoiceXML. Many applications with user interfaces developed in VoiceXML have been
in operation since 2002, at least. For example, back in 2003, my friends and I wrote a VoiceXML
application that lets people listen to the emails in their inbox on the phone (including from a regular
PSTN phone). Can we call it a “converged service”? I don’t know for sure, but I know I’m not
content with that.

Search-search-search, and I found a whitepaper from Cisco that suggests the emphasis of service
convergence (nowadays) is on the continuity across access, for customer loyalty and stickiness.

So I guess, the really interesting aspect of converged service – that we need to explore more in our
search of killer services – is the continuity.

Let’s imagine the following scenario: you’re a model employee. Just like any other model
employee, you play a massively multiplayer online role-playing game in the office. You’re in the
middle of a long campaign. But it’s 5 PM, you have to go home. And you can’t
just scrap your adventure, it’s six hours’ worth of work! Oh, dilemma.

But, thanks to service convergence, you can switch on your clunky old cell phone and dial in to the
game. Once you’re in, you turn off your workstation and you can leave your desk. Nothing is lost;
everything you had in the session while you were playing on the PC is intact. You continue playing, in
that session, on your way driving home. Isn’t that sweet?

Yeah, I know, the example sounds rather stupid, with no economic value whatsoever (except for
the game publisher). But I hope you got the gist of it. It’s the continuity.

Now let’s try to draw a line between what I just described and SIP Servlet technology. You see,
that continuity of service – especially real-time service – requires a mechanism for sharing or
transferring the data stored in one session to another session (for example, from the http session
to the call session, sip session in this case). This mechanism exists in SIP Servlet technology. That’s
what we’re going to learn in this third, the last, example.

Remember the SipApplicationSession that we learned about in the second example? It binds multiple –
inter-related – SipSessions together. Now, an additional fact for you: a SipApplicationSession can
actually hold a mix of SipSessions, HttpSessions, and some others. That is the characteristic
we’re going to learn how to exploit.

So here’s the general overview of the application: it is composed of two parts, the
web part and the SIP servlet part. The user starts by accessing the web part, which displays a web
form containing two text fields and a button, like in this picture.

Pressing the make call button causes the data in the fields to be sent to a plain HTTP Servlet, which
initializes the HTTP session for that user. Additionally, the HTTP Servlet triggers the creation of a
SipSession by initiating a call to the “destination sip-uri” specified in one of those text fields. Now
we have two sessions, of different protocols, and we bind them together in the same
SipApplicationSession.

What will happen next is that, in the web browser, the page shows the progress of the call. The page
refreshes every second. It should be something like this:

[show page]

It’s easy to imagine that from the JSP page / HTTP servlet that generates the page, you need to
access the SipSessions whose progress you’d like to display. That can be achieved by navigating
from the HttpSession to the SipApplicationSession, from which you finally arrive at the SipSessions you’re
interested in.
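
That navigation can be pictured with the toy model below. It is only a sketch under assumptions: the two classes are stand-ins invented for this illustration, the “state” string is whatever the SIP side chooses to record, and how a real HttpSession reaches its SipApplicationSession depends on the container and on the SIP Servlet version.

```java
import java.util.ArrayList;
import java.util.List;

final class ConvergedSessionSketch {
    /** Stand-in for a SipApplicationSession holding sessions of several protocols. */
    static final class ToyApplicationSession {
        private final List<ToyProtocolSession> sessions = new ArrayList<>();

        /** Returns only the sessions of the given protocol, e.g. "SIP" or "HTTP". */
        List<ToyProtocolSession> getSessions(String protocol) {
            List<ToyProtocolSession> matching = new ArrayList<>();
            for (ToyProtocolSession s : sessions) {
                if (s.protocol.equals(protocol)) {
                    matching.add(s);
                }
            }
            return matching;
        }
    }

    /** Stand-in for either an HttpSession or a SipSession. */
    static final class ToyProtocolSession {
        final String protocol;
        final ToyApplicationSession applicationSession;
        String state;

        ToyProtocolSession(String protocol, ToyApplicationSession app, String state) {
            this.protocol = protocol;
            this.applicationSession = app;
            this.state = state;
            app.sessions.add(this); // binding: the session joins the application session
        }
    }

    /** What the progress page does: HTTP session -> application session -> SIP sessions. */
    static List<String> callProgress(ToyProtocolSession httpSession) {
        List<String> states = new ArrayList<>();
        for (ToyProtocolSession sip : httpSession.applicationSession.getSessions("SIP")) {
            states.add(sip.state);
        }
        return states;
    }
}
```

Each page refresh would simply call callProgress(…) again and render whatever the SIP side has recorded since the last time.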

Now, let’s turn our attention a little bit to the SIP servlet. This SIP servlet will play the B2BUA
role, bridging two call-legs: one to the “user sip-uri”, and the other to the “destination
sip-uri”. Once those two legs are bridged, the user will be presented with a web page that
contains a “Drop” button, like this.

[show page]

Clicking the “drop” button will cause the call to be dropped.

That is all. Conceptually it’s fairly simple, but this is very useful in IP call-center applications, for
example. Now let’s just code it:

[Todo: coding]

As you could see, a converged application is still a bit messy in SIP Servlet 1.0. Fortunately this has been
improved in SIP Servlet 1.1. For example, in a converged container you can cast a plain HttpSession
to an instance of ConvergedHttpSession (only available in SIP Servlet 1.1), which gives you a cleaner,
direct path to the SipApplicationSession. Newer concepts in J2EE such as injection by annotation are
also supported, so you can have easy access to the SipFactory from your J2EE components such as
EJBs.

That’s all for the introduction to SIP Servlet programming. Please feel free to post your doubts /
questions / corrections / suggestions here. The next section will be the introduction to VoiceXML
programming. Zaijian.
