These days, the World Wide Web has become so popular that many people think it is the
Internet.
If you aren't on the Web, you aren't anybody. Unfortunately, although the Web is based
primarily on a single protocol (HTTP), web sites often use a wide variety of protocols,
downloadable code, and plug-ins, which have a wide variety of security implications.
It has become impossible to reliably configure a browser so that you can always read
everything on every web site; it has always been insecure to do so.
Many people confuse the functions and origins of the Web, Netscape, Microsoft Internet
Explorer, HTTP, and HTML, and the terminology used to refer to these distinct entities has
become muddy.
Some of the muddiness was introduced intentionally; web browsers attempt to provide a
seamless interface to a wide variety of information through a wide variety of mechanisms, and
blurring the distinctions makes it easier to use, if more difficult to comprehend. Here is a quick
summary of what the individual entities are about:
The Web
The collection of HTTP servers (see the description of HTTP that follows) on the
Internet.
The Web is responsible, in large part, for the recent explosion in Internet activity. It is
based on concepts developed at the European Particle Physics Laboratory (CERN) in
Geneva, Switzerland, by Tim Berners-Lee and others.
Much of the ground-breaking work on web clients was done at the National Center for
Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.
Many organizations and individuals are developing web client and server software these
days, and many more are using these technologies for a huge range of purposes.
The Internet Engineering Task Force (IETF) is currently responsible for maintaining the
HTTP standard, and the World Wide Web Consortium (W3C) is developing successors
to HTML.
Nobody "controls" the Web, however, much as nobody "controls" the Internet.
HTTP
The primary application protocol that underlies the Web: it provides users access to the
files that make up the Web.
These files might be in many different formats (text, graphics, audio, video, etc.), but
the format used to provide the links between files on the Web is the HyperText Markup
Language (HTML).
HTML
A standardized page description language for creating web pages. It provides basic
document-formatting capabilities (including the ability to include graphics) and allows
you to specify hypertext links to other servers and files.
Netscape Navigator and Microsoft Internet Explorer
Commonly known as "Netscape" and "Explorer", these commercial products are web
browsers (they let you read documents via HTTP and other protocols).
There are hundreds of other web browsers, including Lynx, Opera, Slurp, Go!Zilla, and
perlWWW, but most estimates show that the vast majority of web users are using
Netscape or Explorer.
HTTP is only one protocol used by web browsers; web browsers typically also can use
at least the FTP, NNTP, SMTP, and POP protocols. Some of them also can use other
protocols like WAIS, Gopher, and IMAP.
Thus, when users say "we want Explorer" or "we want Netscape", what they really
mean, from a protocol level, is that they want access to the HTTP servers that make up
the Web, and probably to associated servers running other protocols that the web
browsers can use (for instance, FTP, SMTP, and/or NNTP).
For instance, a Java program running inside the sandbox cannot write or read files
without notification.
Unfortunately, there have been implementation problems with Java, and various ways
have been found to do operations that are supposed to be impossible. In any case, a
program that can't do anything dangerous has difficulty doing anything interesting.
Children get tired of playing in a sandbox relatively young, and so do programmers.
ActiveX, instead of trying to limit a program's abilities, tries to make sure that you
know where the program comes from and can simply avoid running programs you don't
trust.
This is done via digital signatures; before an ActiveX program runs, a browser will
display signature information that identifies the provider of the program, and you can
decide whether or not you trust that provider.
Unfortunately, it is difficult to make good decisions about whether or not to trust a
program with nothing more than the name of the program's source.
Is "Jeff's Software Hut" trustworthy? Can you be sure that the program you got from
them doesn't send them all the data on your hard disk?
As time goes by, people are providing newer, more flexible models of security that
allow you to indicate different levels of trust for different sources.
New versions of Java are introducing digital signatures and allowing you to decide that
programs with specific signatures can do specific unsafe operations.
Similarly, new versions of ActiveX are allowing you to limit which ActiveX operations
are available to programs.
There is a long way to go before the two models come together, and there will be real
problems even then.
Even if you don't have to decide to trust Jeff's Software Hut completely or not at all,
you still have to make a decision about what level of trust to give them, and you still
won't have much data to make it with.
What if Jeff's Software Hut is a vendor you've worked with for years, and suddenly
something comes around from Jeff's Software House? Is that the same people, upgrading
their image, or is it somebody else using their reputation?
Because programs in extension systems are generally embedded inside HTML
documents, it is difficult for firewalls to filter them out without introducing other
problems.
Because an HTML document can easily link to documents on other servers, it's easy for
people to become confused about exactly who is responsible for a given document.
"Frames" (where the external web page takes up only part of the display) are particularly
bad in this respect.
New users may not notice when they go from internal documents at your site to external
ones.
This has two unfortunate consequences. First, they may trust external documents
inappropriately (because they think they're internal documents).
Second, they may blame the internal web maintainers for the sins of the world.
People who understand the Web tend to find this hard to believe, but it's a common
misconception: it's the dark side of having a very smooth transition between sites.
Take care to educate users, and attempt to make clear what data is internal and what
data is external.
Web Server Security Issues
When you run a web server, you are allowing anybody who can reach your machine to
send commands to it.
If the web server is configured to provide only HTML files, the commands it will obey
are quite limited.
However, they may still be more than you'd expect; for instance, many people assume
that people can't see files unless there are explicit links to them, which is generally false.
You should assume that if the web server program is capable of reading a file, it is
capable of providing that file to a remote user.
Files that should not be public should at least be protected by file permissions, and
should, if possible, be placed outside of the web server's accessible area (preferably by
moving them off the machine altogether).
Most web servers, however, provide services beyond merely handing out HTML files.
For instance, many of them come with administrative servers, allowing you to
reconfigure the server itself from a web browser.
If you can configure the server from a web browser, so can anybody else who can reach
it; be sure to do the initial configuration in a trusted environment.
If you are building or installing a web server, be sure to read the installation
instructions.
Web servers can also call external programs in a variety of ways.
You can get external programs from vendors, either as programs that will run separately
or as plug-ins that will run as part of the web server, and you can write your own
programs in a variety of different languages and using a variety of different tools.
These programs are relatively easy to write but very difficult to secure, because they
can receive arbitrary commands from external people.
You should treat all programs run from the web server, no matter who wrote them or
what they're called, with the same caution you would treat a new server of any kind. The
web server does not provide any significant protection to these programs.
A large number of third-party server extensions originally ship with security flaws,
generally caused by the assumption that input to them is always going to come from well-behaved forms.
This is not a safe assumption; there is no guarantee that people are going to use your
forms and your web pages to access your web server.
They can send any data they like to it.
A number of software (and hardware) products are now appearing with embedded web
servers that provide a convenient graphical configuration interface.
These products should be carefully configured if they are running on systems that can
be accessed by outsiders. In general, their default configurations are insecure.
Forms Authentication
An alternative approach for authentication is to use HTML forms. Forms are a generic
mechanism for users to enter data into a website; they are supported by almost every web
browser.
Forms include support for a password box, which obscures the password as it is typed.
The real advantage of this technique is flexibility; web developers can make the form and
surrounding HTML appear however they like.
The disadvantage is that the application must take care of the whole authentication system.
Alternative Approach: HTTP Authentication
HTTP authentication takes an alternative approach to solving this problem.
Traceability
In software development, the term traceability (or Requirements Traceability) refers to the
ability to link product documentation requirements back to stakeholders' rationales and forward
to corresponding design artifacts, code, and test cases.
Traceability supports numerous software engineering activities such as change impact analysis,
compliance verification or trace back of code, regression test selection, and requirements
validation.
It is usually accomplished in the form of a matrix created for the verification and validation of
the project.
In transaction processing software, traceability implies use of a unique piece of data (e.g., order
date/time or a serialized sequence number) which can be traced through the entire software flow
of all relevant application programs.
Messages and files at any point in the system can then be audited for correctness and
completeness, using the traceability key to find the particular transaction.
This is also sometimes referred to as the transaction footprint.
Importance of traceability
Ensures that requirements are met
Clarifies the relationship between requirements and the delivered system
Lowers and minimizes risk
Provides consistency
Helps provide control over the entire system
Gives the freedom to change anything in the system smoothly
Makes development easier
Problems with Traceability
It is a manual process
Viewed by developers as a low priority
Often misunderstood
No single modeling method
Poor documentation
How is Tracing Performed?
Client gives developers rough requirements
Developers create system, hardware, and software requirements
Each element is given a unique identifier
o An element can be a requirement, design attribute, test, etc.
Linkages done manually and managed by a CASE tool
Traceability tables are made using a matrix
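The matrix can be as simple as a table linking each requirement's unique identifier to its artifacts. A minimal sketch in Python follows; all requirement IDs, artifact names, and filenames below are hypothetical:

```python
# Hypothetical requirement IDs mapped to the design, code, and test
# artifacts that realize and verify them.
matrix = {
    "REQ-001": {"design": ["DD-3.1"], "code": ["auth.py"], "tests": ["TC-12"]},
    "REQ-002": {"design": ["DD-4.2"], "code": ["report.py"], "tests": ["TC-15", "TC-16"]},
}

def untested_requirements(matrix):
    # A simple coverage check: requirements with no linked test case.
    return [req for req, links in matrix.items() if not links["tests"]]

print(untested_requirements(matrix))   # -> []
```

Queries such as this one are what make change impact analysis and regression test selection cheap once the links have been recorded.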
Encryption
Encryption is the process of transforming information (referred to as plaintext) using an
algorithm (called a cipher) to make it unreadable to anyone except those possessing special
knowledge, usually referred to as a key.
The result of the process is information (in cryptography, referred to as cipher text). The
reverse process, i.e., to make the encrypted information readable again, is referred to as
decryption (i.e., to make it unencrypted).
In many contexts, the word encryption may also implicitly refer to the reverse process,
decryption; e.g., software for encryption can typically also perform decryption.
The messages to be encrypted, known as the plaintext, are transformed by a function that is
parameterized by a key.
The output of the encryption process, known as the cipher text, is then transmitted, often by
messenger or radio.
We assume that the enemy, or intruder, hears and accurately copies down the complete cipher
text.
However, unlike the intended recipient, he does not know what the decryption key is and so
cannot decrypt the cipher text easily.
Sometimes the intruder can not only listen to the communication channel (passive intruder) but
can also record messages and play them back later, inject his own messages, or modify
legitimate messages before they get to the receiver (active intruder).
The art of breaking ciphers, called cryptanalysis, and the art of devising them
(cryptography), are collectively known as cryptology.
Encryption methods have historically been divided into two categories:
Substitution ciphers
Transposition ciphers.
We will now deal with each of these briefly as background information for modern
cryptography.
Substitution Ciphers
In a substitution cipher each letter or group of letters is replaced by another letter or group of
letters to disguise it. One of the oldest known ciphers is the Caesar cipher, attributed to Julius
Caesar.
In this method, a becomes D, b becomes E, c becomes F, ... , and z becomes C. For example,
attack becomes DWWDFN. In examples, plaintext will be given in lower case letters, and cipher
text in upper case letters.
A slight generalization of the Caesar cipher allows the cipher text alphabet to be shifted by k
letters, instead of always 3.
In this case k becomes a key to the general method of circularly shifted alphabets. The Caesar
cipher may have fooled Pompey, but it has not fooled anyone since.
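The shifted-alphabet scheme can be sketched in a few lines of Python, with the key k as the shift amount:

```python
def caesar(plaintext: str, k: int) -> str:
    # Shift each letter k places, wrapping around the 26-letter alphabet;
    # ciphertext is conventionally written in upper case.
    return "".join(chr((ord(c) - ord("a") + k) % 26 + ord("a")).upper()
                   for c in plaintext.lower() if c.isalpha())

print(caesar("attack", 3))   # -> DWWDFN
```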
The next improvement is to have each of the symbols in the plaintext, say, the 26 letters for
simplicity, map onto some other letter.
For example,
Plaintext: a b c d e f g h i j k l m n o p q r s t u v w x y z
ciphertext: Q W E R T Y U I O P A S D F G H J K L Z X C V B N M
The general system of symbol-for-symbol substitution is called a monoalphabetic substitution,
with the key being the 26-letter string corresponding to the full alphabet.
For the key above, the plaintext attack would be transformed into the ciphertext QZZQEA.
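Using the 26-letter key shown above, the same transformation can be expressed with a translation table:

```python
import string

# The 26-letter key from the example above: a -> Q, b -> W, c -> E, ...
key = "QWERTYUIOPASDFGHJKLZXCVBNM"
table = str.maketrans(string.ascii_lowercase, key)

print("attack".translate(table))   # -> QZZQEA
```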
Nevertheless, given a surprisingly small amount of ciphertext, the cipher can be broken easily.
The basic attack takes advantage of the statistical properties of natural languages. In
English, for example, e is the most common letter, followed by t, o, a, n, i, etc.
The most common two-letter combinations, or digrams, are th, in, er, re, and an.
The most common three-letter combinations, or trigrams, are the, ing, and, and ion.
A cryptanalyst trying to break a monoalphabetic cipher would start out by counting the relative
frequencies of all letters in the ciphertext.
Then he might tentatively assign the most common one to e and the next most common one to
t.
He would then look at trigrams to find a common one of the form tXe, which strongly suggests
that X is h.
Similarly, if the pattern thYt occurs frequently, the Y probably stands for a.
With this information, he can look for a frequently occurring trigram of the form aZW, which is
most likely and. By making guesses at common letters, digrams, and trigrams and knowing about
likely patterns of vowels and consonants, the cryptanalyst builds up a tentative plaintext, letter by
letter.
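The first step of that attack, counting the relative frequencies, is straightforward to automate:

```python
from collections import Counter

def letter_frequencies(ciphertext: str):
    # Count only letters, most common first; the top entries are the
    # candidates for e, t, o, a, n, i, ...
    letters = [c for c in ciphertext.upper() if c.isalpha()]
    return Counter(letters).most_common()

# A tiny hypothetical fragment; a real attack needs much more ciphertext.
print(letter_frequencies("QZZQEA QZ"))
```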
Another approach is to guess a probable word or phrase. For example, consider the following
ciphertext from an accounting firm (blocked into groups of five characters):
CTBMN BYCTC BTJDS QXBNS GSTJC BTSWX CTQTZ CQVUJ
QJSGS TJQZZ MNQJS VLNSX VSZJU JDSTS JQUUS JUBXJ
DSKSU JSNTK BGAQJ ZBGYQ TLCTZ BNYBN QJSW
Using our knowledge that financial has a repeated letter (i), with four other letters between their
occurrences, we look for repeated letters in the ciphertext at this spacing.
We find 12 hits, at positions 6, 15, 27, 31, 42, 48, 56, 66, 70, 71, 76, and 82.
However, only two of these, 31 and 42, have the next letter (corresponding to n in the plaintext)
repeated in the proper place.
Of these two, only 31 also has the a correctly positioned, so we know that financial begins at
position 30.
From this point on, deducing the key is easy by using the frequency statistics for English text.
Transposition Ciphers
A transposition cipher is a rearrangement of the letters in the plaintext according to some
specific system & key (i.e. a permutation of the plaintext).
They are generally insecure. As a simple example, we arrange the plaintext in a geometrical
figure, then copy it out following a different route.
E.g., plaintext: Now is the time for all good men
We arrange this in a rectangle of K columns and extract the ciphertext by the columns:
NOWIS
THETI
MEFOR
ALLGO
ODMEN
Cipher text: NTMAO OHELD WEFLM ITOGE SIRON
How do we detect this? Well, the character frequencies should be the same as in English. More
generally, we deal with:
Rectangular Columnar Transposition:
1) Arrange horizontally in a rectangle.
2) Use a key to generate a permutation of the columns
3) Read vertically
Key: SCHMID
613542
Plaintext: sell all stock on Monday
6 1 3 5 4 2
s e l l a l
l s t o c k
o n m o n d
a y
Ciphertext: ESNYL KDLTM ACNLO OSLOA
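The rectangular columnar transposition above can be sketched as follows; the key letters are ranked alphabetically to obtain the column read-out order (SCHMID -> 613542):

```python
def columnar_encrypt(plaintext: str, key: str) -> str:
    text = [c for c in plaintext.lower() if c.isalpha()]
    cols = len(key)
    # Rank the key letters alphabetically: the column whose key letter
    # sorts first is read out first (C=1, D=2, H=3, I=4, M=5, S=6).
    order = sorted(range(cols), key=lambda i: key[i])
    return "".join(
        "".join(text[i] for i in range(col, len(text), cols))
        for col in order
    ).upper()

print(columnar_encrypt("sell all stock on Monday", "SCHMID"))
# -> ESNYLKDLTMACNLOOSLOA
```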
Privacy
Internet privacy involves the right or mandate of personal privacy concerning the storing,
repurposing, providing to third-parties, and displaying of information pertaining to oneself via
the Internet.
Privacy can entail both Personally Identifiable Information (PII) and non-PII information such
as a site visitor's behavior on a website.
PII refers to any information that can be used to identify an individual.
For example, age and physical address alone could identify who an individual is without
explicitly disclosing their name, as these two factors are unique enough to typically identify a
specific person.
People with only a casual concern for Internet privacy need not achieve total anonymity.
Internet users may protect their privacy through controlled disclosure of personal information.
The revelation of IP addresses, non-personally-identifiable profiling, and similar information
might become acceptable trade-offs for the convenience that users could otherwise lose using the
workarounds needed to suppress such details rigorously.
On the other hand, some people desire much stronger privacy.
In that case, they may try to achieve Internet anonymity to ensure privacy: use of the Internet
without giving any third parties the ability to link the Internet activities to personally
identifiable information (PII) of the Internet user.
In order to keep their information private, people need to be careful about what they submit
and look at online.
When filling out forms and buying merchandise, that activity becomes tracked, and because the
information was not private, companies now send those Internet users spam and advertising for
similar products.
There are many ways to protect individuals and their finances over the Internet, especially in
dealing with investments.
To ensure safety, first meet with a broker in person or over the phone, so you know a real
person is dealing with the individual's money.
Second, ask questions. If the person on the phone seems uninterested, this is a red flag and
should tell the individual that this person is not to be trusted.
Thirdly, protect all personal information. Refrain from giving out your full name, address, or
any other personal information that could be used to easily access your finances.
Only give out the information once it is shown that the company and the individual are genuine.
Do not ask for an e-mail with a financial statement; a written copy shows that individuals are
not dealing with hackers.
Lastly, investigate about the company individuals are investing with.
Digital Certificates
In cryptography, a public key certificate (also known as a digital certificate or identity
certificate) is an electronic document that uses a digital signature to bind a public key with
identity information such as the name of a person or an organization, their address, and so
forth. The certificate can be used to verify that a public key belongs to an individual.
In a typical public key infrastructure (PKI) scheme, the signature will be of a certificate
authority (CA).
In a web of trust scheme, the signature is of either the user (a self-signed certificate) or other
users ("endorsements").
In either case, the signatures on a certificate are attestations by the certificate signer that the
identity information and the public key belong together.
For provable security, this reliance on something external to the system has the consequence
that any public key certification scheme has to rely on some special setup assumption, such as
the existence of a certificate authority.
What are Digital Certificates?
Digital Certificates are part of a technology called Public Key Infrastructure or PKI.
Digital certificates have been described as virtual ID cards. This is a useful analogy.
There are many ways that digital certificates and ID cards really are the same.
Both ID cards and client digital certificates contain information about you, such as your name,
and information about the organization that issued the certificate or card to you.
Universities generally issue institutional ID cards only after ensuring or validating that you are
a bona fide student, faculty, or staff member.
In PKI terms, this is called the registration process: verifying that you are eligible to receive a
certificate and verifying the information in it.
Similar to an important ID card, once a digital certificate is issued, it should be managed with
care.
Just as you would not lend someone else your ID card allowing entry into a secure facility, you
should never lend someone your digital certificate.
If your certificate or ID card is lost or stolen, it should be reported to the issuing office so that it
can be invalidated and a new one issued.
How is a digital certificate created? In creating digital certificates, a unique cryptographic key
pair is generated.
One of these keys is referred to as a public key and the other as a private key. Then the
certification authority (generally on your campus) creates a digital certificate by combining
information about you and the issuing organization with the public key and digitally signing the
whole thing.
This is very much like an organization's ID office filling out an ID card for you and then
signing it to make it official.
In PKI terms, the public key for an individual is put into a digital document, along with
information about that individual, and then the digital document is signed by the organization's
certification authority.
This signed document can be transmitted to anyone and used to identify the subject of the
certificate.
However, the private key of the original key pair must be securely managed and never given to
anyone else.
As the private key is a very large number, it is not something an individual memorizes;
rather, the private key must be stored on some device, such as a laptop computer, PDA, or USB
key ring.
If you send a copy of your certificate to another computer to authenticate yourself, what keeps
someone with access to that computer from reusing it later to pretend to be you?
Unlike an ID card which is valuable by itself, the digital certificate is useless without the
associated private key.
That is why protecting the private key is so important. The private key must never be given to
anyone else nor left somewhere outside of control by the owner.
An added value of digital certificates is that they provide a higher level of security than what
we currently have with PIN and password combinations.
Users still use passwords, but only on their local computer to protect their digital certificates.
If one loses the device on which a digital certificate is stored, a person holding the certificate
would still need the password to unlock the certificate.
What is a Digital Signature?
Above we stated that the digital certificate was digitally signed.
The holder of a digital certificate can also use it to digitally sign other digital documents, for
example, purchase orders, grant applications, financial reports or student transcripts.
A digital signature is not an image of your pen-and-ink signature; it is an attachment to a
document that contains an encrypted version of the document created using the signer's private
key.
Once a document is signed, no part of that document can be changed without invalidating the
signature.
Thus if someone obtained a copy of your digital certificate and changed the name in it to be
their own name, any application receiving that modified certificate would see immediately that
the signature on it was not valid.
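The sign-then-verify cycle can be illustrated with textbook RSA. The parameters below are toy values chosen for readability (n = 61 * 53 = 3233, and e*d = 1 mod 3120); they are far too small for real use, and real signatures also involve padding schemes omitted here:

```python
import hashlib

# Toy RSA parameters (illustration only): private key d, public key (n, e).
n, e, d = 3233, 17, 2753

def sign(message: bytes) -> int:
    # Hash the document, then "encrypt" the hash with the private key d.
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(digest, d, n)

def verify(message: bytes, signature: int) -> bool:
    # Recover the hash with the public key e and compare it to a fresh hash.
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == digest

sig = sign(b"purchase order")
print(verify(b"purchase order", sig))        # True
print(verify(b"purchase order", (sig + 1) % n))   # False: a forged signature fails
```

Because any change to the message changes its hash, a valid signature over a modified document (or a modified certificate) will no longer verify, which is exactly the property described above.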
In this sense, a digital credential is much better than a traditional ID card to prove that the
holder is really the person to whom it was issued.
In fact, digital signatures in general are much more useful than pen and ink signatures since
anyone checking the signature also can find out something about the signer in order to know
whether the signature is meaningful.
Public Key Infrastructures and Certificate Authorities
Digital certificates are one part of a set of components that make up a public key infrastructure
(PKI).
A PKI includes organizations called certification authorities (CAs) that issue, manage, and
revoke digital certificates; organizations called relying parties, who use the certificates as
indicators of authentication; and clients, who request, manage, and use certificates.
A CA might create a separate registration authority (RA) to handle the task of identifying
individuals who apply for certificates.
Examples of certification authorities include VeriSign, a well-known commercial provider, and
the CREN Certificate Authority that is available for higher education institutions.
In addition to the organizational roles, there must be an associated database or directory,
generally accessed using the Lightweight Directory Access Protocol (LDAP), that will store
information about certificate holders and their certificates. There also must be a way to make
available information about revoked certificates.
An application that makes use of PKI digital credentials may consult the revocation database
before relying on the validity of a certificate.
It may wish to consult the Subject directory as well in order to retrieve further information
about the certificate Subject.
Types of Certificates
There are different types of certificates, each with different functions and this can be confusing.
It helps to differentiate between at least four types of certificates. You can see samples of some
of these different types of certificates in your browser.
Root or authority certificates. These are certificates that create the base (or root) of a
certification authority hierarchy, such as Thawte or CREN.
These certificates are not signed by another CA; they are self-signed by the CA that
created them. When a certificate is self-signed, it means that the name in the Issuer field
is the same as the name in the Subject field.
Session
In computer science, in particular networking, a session is a semi-permanent interactive
information interchange, also known as a dialogue, a conversation or a meeting, between two or
more communicating devices, or between a computer and user (see Login session).
A session is set up or established at a certain point in time, and torn down at a later point in
time.
An established communication session may involve more than one message in each direction.
A session is typically, but not always, stateful, meaning that at least one of the communicating
parts needs to save information about the session history in order to be able to communicate, as
opposed to stateless communication, where the communication consists of independent requests
with responses.
An established session is the basic requirement to perform a connection-oriented
communication.
A session also is the basic step to transmit in connectionless communication modes. However,
any unidirectional transmission does not define a session.
Communication sessions may be implemented as part of protocols and services at the
application layer, at the session layer or at the transport layer in the OSI model.
Application layer examples:
o HTTP sessions, which allow associating information with individual visitors
o A telnet remote login session
Session layer example:
o A Session Initiation Protocol (SIP) based Internet phone call
Transport layer example:
o A TCP session, which is synonymous to a TCP virtual circuit, a TCP connection,
or an established TCP socket.
In the case of transport protocols that do not implement a formal session layer (e.g., UDP) or
where sessions at the session layer are generally very short-lived (e.g., HTTP), sessions are
maintained by a higher level program using a method defined in the data being exchanged.
For example, an HTTP exchange between a browser and a remote host may include an HTTP
cookie which identifies state, such as a unique session ID, information about the user's
preferences or authorization level.
HTTP/1.0 was thought to allow only a single request and response during one Web/HTTP
session.
However, a workaround was created by David Hostettler Wain in 1996 such that it was possible
to use session IDs to allow multiple-phase Web Transaction Processing (TP) Systems (in ICL
nomenclature), with the first implementation being called Deity. Protocol version HTTP/1.1
improved further by completing the Common Gateway Interface (CGI), making it easier to
maintain the Web session and supporting cookies and file uploads.
Most client-server sessions are maintained by the transport layer - a single connection for a
single session.
However each transaction phase of a Web/HTTP session creates a separate connection.
Maintaining session continuity between phases required a session ID.
The session ID is embedded within the <A HREF> or <FORM> links of dynamic web pages so
that it is passed back to the CGI.
CGI then uses the session ID to ensure session continuity between transaction phases.
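A server-side session table of this kind can be sketched in a few lines; the user field and function names are hypothetical:

```python
import secrets

sessions = {}   # server-side state, keyed by session ID

def create_session(user: str) -> str:
    # 128 random bits make the ID unguessable by clients.
    sid = secrets.token_hex(16)
    sessions[sid] = {"user": user}
    return sid

def lookup(sid: str):
    # Returns None for unknown (or expired) IDs sent back by a client.
    return sessions.get(sid)

sid = create_session("alice")
print(lookup(sid))   # -> {'user': 'alice'}
```

The ID would be passed back and forth in the <A HREF> or <FORM> links (or a cookie) exactly as described above; the server, not the client, holds the actual state.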
One advantage of one connection-per-phase is that it works well over low bandwidth (modem)
connections.
Deity used a sessionID, screenID and actionID to simplify the design of multiple phase
sessions.
Software implementation
TCP sessions are typically implemented in software using child processes and/or
multithreading, where a new process or thread is created when the computer establishes or joins
a session.
HTTP sessions are typically not implemented using one thread per session, but by means of a
database with information about the state of each session.
The advantage with multiple processes or threads is relaxed complexity of the software, since
each thread is an instance with its own history and encapsulated variables.
The disadvantage is large overhead in terms of system resources, and that the session may be
interrupted if the system is restarted.
When a client may connect to any server in a cluster of servers, a special problem is
encountered in maintaining consistency when the servers must maintain session state.
The client must either be directed to the same server for the duration of the session, or the
servers must transmit server-side session information via a shared file system or database.
Otherwise, the client may reconnect to a different server than the one it started the session with,
which will cause problems when the new server does not have access to the stored state of the
old one.
Server side web sessions
Server-side sessions are handy and efficient, but can become difficult to handle in conjunction
with load-balancing/high-availability systems and are not usable at all in some embedded
systems with no storage.
The load-balancing problem can be solved by using shared storage or by applying forced
peering between each client and a single server in the cluster, although this can compromise
system efficiency and load distribution.
A method of using server-side sessions in systems without mass storage is to reserve a portion of RAM for storage of session data. This method is applicable to servers with a limited number of clients (e.g., a router or access point where access by more than one client at a time is infrequent or disallowed).
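One way to sketch such a RAM-only store, assuming a small fixed capacity and least-recently-used eviction; both choices are illustrative rather than mandated by any particular embedded platform:

```python
from collections import OrderedDict

class RamSessionStore:
    """Tiny in-RAM session store for servers without mass storage.

    Holds at most `capacity` sessions; the least recently used session
    is evicted when a new one would exceed the limit.
    """
    def __init__(self, capacity=4):
        self.capacity = capacity
        self._sessions = OrderedDict()  # session_id -> session data

    def get(self, session_id):
        if session_id in self._sessions:
            self._sessions.move_to_end(session_id)  # mark as recently used
            return self._sessions[session_id]
        return None

    def put(self, session_id, data):
        self._sessions[session_id] = data
        self._sessions.move_to_end(session_id)
        if len(self._sessions) > self.capacity:
            self._sessions.popitem(last=False)  # evict the oldest session
```

With only a handful of concurrent clients, as on a router or access point, a few kilobytes of reserved RAM suffice.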
Client-side web sessions
Client-side sessions use cookies and cryptographic techniques to maintain state without storing
as much data on the server.
When presenting a dynamic web page, the server sends the current state data to the client (web
browser) in the form of a cookie.
The client saves the cookie in memory or on disk. With each successive request, the client
sends the cookie back to the server, and the server uses the data to "remember" the state of the
application for that specific client and generate an appropriate response.
This mechanism may work well in some contexts; however, data stored on the client is
vulnerable to tampering by the user or by software that has access to the client computer.
To use client-side sessions where confidentiality and integrity are required, the following must
be guaranteed:
1. Confidentiality: Nothing apart from the server should be able to interpret session data.
2. Data integrity: Nothing apart from the server should manipulate session data
(accidentally or maliciously).
3. Authenticity: Nothing apart from the server should be able to initiate valid sessions.
To accomplish this, the server needs to encrypt the session data before sending it to the client,
and modification of such information by any other party should be prevented via cryptographic
means.
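A minimal sketch of the integrity and authenticity half of this scheme, using an HMAC tag over the serialized session data. Confidentiality would additionally require encrypting the payload, which is omitted here because the Python standard library provides no authenticated cipher; the key and function names are illustrative:

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"server-only-secret"  # illustrative; must never reach the client

def seal(session):
    """Serialize session data and append an HMAC-SHA256 tag.

    This gives integrity and authenticity (goals 2 and 3); it does NOT
    hide the data, so goal 1 additionally needs encryption.
    """
    payload = base64.urlsafe_b64encode(json.dumps(session).encode()).decode()
    tag = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + tag

def unseal(cookie):
    """Return the session data, or None if the cookie was forged or altered."""
    payload, _, tag = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return None  # tampered or forged cookie
    return json.loads(base64.urlsafe_b64decode(payload))
```

Because only the server knows the key, only the server can produce a valid tag, so a client that edits the cookie can only invalidate it, not alter its contents undetected.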
Transmitting state back and forth with every request is only practical when the size of the
cookie is small.
In essence, client-side sessions trade server disk space for the extra bandwidth that each web
request will require.
Moreover, web browsers limit the number and size of cookies that may be stored by a web site.
To improve efficiency and allow for more session data, the server may compress the data
before creating the cookie, decompressing it later when the cookie is returned by the client.
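The compression step can be sketched as follows, assuming zlib for the compression and base64 so the result is safe to use as a cookie value:

```python
import base64
import json
import zlib

def compress_cookie(session):
    """Shrink the serialized session before it becomes a cookie value."""
    raw = json.dumps(session).encode()
    return base64.urlsafe_b64encode(zlib.compress(raw)).decode()

def decompress_cookie(value):
    """Reverse the encoding when the client returns the cookie."""
    raw = zlib.decompress(base64.urlsafe_b64decode(value))
    return json.loads(raw)
```

The saving is largest for repetitive session data; for very small sessions the compression header can make the cookie slightly bigger, so some servers compress only above a size threshold.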
HTTP session token
A session token is a unique identifier that is generated and sent from a server to a client to
identify the current interaction session.
The client usually stores and sends the token as an HTTP cookie and/or sends it as a parameter
in GET or POST queries.
The reason to use session tokens is that the client only has to handle the identifier; all session data is stored on the server (usually in a database to which the client does not have direct access), linked to that identifier.
Examples of the names that some programming environments use for this HTTP cookie include JSESSIONID (JSP), PHPSESSID (PHP), CGISESSID (CGI), and ASPSESSIONID (ASP).
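A minimal sketch of issuing and resolving such a token, assuming an in-memory dictionary as the server-side store (real deployments typically use a database, as noted above); the names are illustrative:

```python
import secrets

SESSIONS = {}  # server-side store: token -> session data

def create_session(user):
    """Issue a cryptographically unguessable token; all data stays server-side."""
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = {"user": user}
    return token

def lookup_session(token):
    """Resolve a token back to its session data, or None if unknown."""
    return SESSIONS.get(token)
```

The token must be generated from a cryptographic random source (here `secrets`), since a guessable token would let an attacker hijack another user's session.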
Session management
In human-computer interaction, session management is the process of keeping track of a user's activity across sessions of interaction with the computer system.
Typical session management tasks in a desktop environment include keeping track of which
applications are open and which documents each application has opened, so that the same state
can be restored when the user logs out and logs in later.
For a website, session management might involve requiring the user to re-login if the session
has expired (i.e., a certain time limit has passed without user activity).
It is also used to store information on the server-side between HTTP requests.
Desktop session management
A desktop session manager is a program that can save and restore desktop sessions.
A desktop session consists of all the windows currently running and their current content. Session management on Linux-based systems is provided by the X session manager.
On Microsoft Windows systems, no session manager is included with the system, but session management can be provided by third-party applications like twinsplay.
Browser session management
Session management is particularly useful in a web browser where a user can save all open
pages and settings and restore them at a later date.
To help recover from a system or application crash, pages and settings can also be restored on
next run.
Google Chrome, Mozilla Firefox, Internet Explorer, OmniWeb and Opera are examples of web
browsers that support session management. Session management is often managed through the
application of cookies.
Web server session management
Hypertext Transfer Protocol (HTTP) is stateless: a client computer running a web browser traditionally had to establish a new Transmission Control Protocol (TCP) network connection to the web server with each new HTTP GET or POST request (HTTP/1.1 persistent connections may reuse a TCP connection, but each request is still handled independently).
The web server, therefore, cannot rely on an established TCP network connection for longer than a single HTTP GET or POST operation.
Session management is the technique used by the web developer to make the stateless HTTP
protocol support session state.
For example, once a user has been authenticated to the web server, the user's next HTTP
request (GET or POST) should not cause the web server to ask for the user's account and
password again.
For a discussion of the methods used to accomplish this see HTTP cookie and Session ID.
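A sketch of the check that lets an already-authenticated user skip the login prompt, using the standard library's cookie parser; the cookie name SESSIONID and the lookup table are illustrative:

```python
from http.cookies import SimpleCookie

# Server-side record of authenticated sessions (illustrative data).
AUTHENTICATED = {"abc123": "alice"}

def user_for_request(cookie_header):
    """Return the authenticated user for this request, or None to force a login."""
    cookie = SimpleCookie(cookie_header)
    morsel = cookie.get("SESSIONID")
    if morsel is None:
        return None  # no session cookie: ask for credentials
    return AUTHENTICATED.get(morsel.value)  # None if the ID is unknown/expired
```

Only when this returns None does the server need to challenge the user for an account and password again.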
The session information is stored on the web server using a session identifier (Session ID) generated as a result of the first (sometimes the first authenticated) request from the end user running a web browser.
The "storage" of Session IDs and the associated session data (user name, account number, etc.)
on the web server is accomplished using a variety of techniques including, but not limited to,
local memory, flat files, and databases.
In situations where multiple web servers must share knowledge of session state (as is typical in
a cluster environment) session information must be shared between the cluster nodes that are
running web server software.
Methods for sharing session state between nodes in a cluster include:
1. multicasting session information to member nodes (see JGroups for one example of this technique);
2. sharing session information with a partner node using distributed shared memory or memory virtualization;
3. sharing session information between nodes using network sockets;
4. storing session information on a shared file system, such as the network file system or the global file system; or
5. storing the session information outside the cluster in a database.
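As one concrete sketch of the last option, several server processes can share session state through a common database; SQLite is used here purely for illustration, since production clusters would normally use a networked database server:

```python
import sqlite3

def init_store(path):
    """Create the shared session table; every cluster node opens the same database."""
    with sqlite3.connect(path) as db:
        db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, data TEXT)"
        )

def save_session(path, session_id, data):
    """Any node can write the session state..."""
    with sqlite3.connect(path) as db:
        db.execute(
            "INSERT OR REPLACE INTO sessions VALUES (?, ?)", (session_id, data)
        )

def load_session(path, session_id):
    """...and any other node can read it back on the next request."""
    with sqlite3.connect(path) as db:
        row = db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
    return row[0] if row else None
```

Because every node reads and writes the same store, the client no longer needs to be pinned to the server that created the session.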
If session information is considered transient, volatile data that is not required for non-repudiation of transactions and does not contain data that is subject to compliance auditing (in the U.S., for example, see the Health Insurance Portability and Accountability Act and the Sarbanes-Oxley Act, two laws that necessitate compliance auditing), then any method of storing session information can be used.
However, if session information is subject to audit compliance, consideration should be given
to the method used for session storage, replication, and clustering.
In a service-oriented architecture, Simple Object Access Protocol (SOAP) messages constructed with Extensible Markup Language (XML) can be used by consumer applications to cause web servers to create sessions.
Session Management over SMS
Just as HTTP is a stateless protocol, so is SMS. When SMS became interoperable across rival networks in 1999 and text messaging began its ascent toward becoming a ubiquitous global form of communication, various enterprises became interested in using the SMS channel for commercial purposes.
Initial services did not require session management since they were only one-way
communications (for example, in 2000, the first mobile news service was delivered via SMS in
Finland). Today, these applications are referred to as application-to-peer (A2P) messaging as
distinct from peer-to-peer (P2P) messaging.
The development of interactive enterprise applications required session management, but
because SMS is a stateless protocol as defined by the GSM standards, early implementations
were controlled client-side by having the end-users enter commands and service identifiers
manually.
In 2001, a Finnish inventor, Jukka Salonen, introduced a means of maintaining the state of
asynchronous sessions from an operator-independent server using what was termed the Dynamic
Dialogue Matrix (DDM).
Managing sessions from a remote server removes complexity for the end user and enables solutions to scale more easily in a manner that is backward compatible with existing mobile phones.
Managing sessions from the server side also enabled improved user authentication and eliminated the need to transmit sensitive data over insecure wireless networks. Finnair became the first airline to use the DDM system and method for authenticated mobile check-in to flights.