UNIT THREE
CLASSIFICATION

CLASSIFICATION SCHEME: In metadata, a classification scheme is the descriptive information for an arrangement or division of objects into groups based on characteristics which the objects have in common. The ISO/IEC 11179 metadata registry standard uses classification schemes as a way to classify administered items, such as data elements, in a metadata registry.

Benefits of using classification schemes: Setting up one or more classification schemes for a collection of objects has many benefits. Some of these include:
allows a user to quickly find an object in a large object collection
makes it easier to detect duplicate objects
conveys semantics (meaning) of an object that is not conveyed by the object name or definition

Examples of classification schemes: The following are examples of different classification schemes. This list is in approximate order from informal to more formal:
keyword - a list of uncategorized words or phrases associated with an object
thesaurus - a list of categorized words or phrases associated with an object
taxonomy - a formal list of controlled words arranged from abstract to specific
data model - an arrangement of words or phrases that have complex many-to-many relationships
network (mathematics) - an arrangement of objects in a random graph
ontology - an arrangement of objects in a directed acyclic graph with multiple inheritance
One example of a classification scheme for data elements is a representation term.

Classified information is sensitive information to which access is restricted by law or regulation to particular classes of persons.
A formal security clearance is required to handle classified documents or access classified data. The clearance process requires a satisfactory background investigation. There are typically several levels of sensitivity, with differing clearance requirements. This sort of hierarchical system of secrecy is used by virtually every national government. The act of assigning the level of sensitivity to data is called data classification. Some corporations and non-government organizations also assign sensitive information to multiple levels of protection, either from a desire to protect trade secrets, or because of laws and regulations governing various matters such as personal privacy, sealed legal proceedings and the timing of financial information releases.

GOVERNMENT CLASSIFICATION: The purpose of classification is ostensibly to protect information from being used to damage or endanger national security. Classification formalises what constitutes a "state secret" and accords different levels of protection based on the expected damage the information might cause in the wrong hands.

Classification levels
Although the classification systems vary from country to country, most have levels corresponding to the following British definitions (from the highest level to lowest):
Top Secret (TS): The highest level of classification of material on a national level. Such material would cause "exceptionally grave damage" to national security if publicly available.
Secret: Such material would cause "grave damage" to national security if publicly available.
Confidential: Such material would cause "damage" or be "prejudicial" to national security if publicly available.
Restricted: Such material would cause "undesirable effects" if publicly available. Some countries do not have such a classification.
Unclassified: Technically not a classification level, but used for government documents that do not have a classification listed above. Such documents can sometimes be viewed by those without security clearance.

Clearance
Depending on the level of classification there are different rules controlling the level of clearance needed to view such information, and how it must be stored, transmitted, and destroyed. Additionally, access is restricted on a "need to know" basis. Simply possessing a clearance does not automatically authorize the individual to view all material classified at that level or below that level. The individual must present a legitimate "need to know" in addition to the proper level of clearance.

COMPRESSION

In computer science and information theory, data compression or source coding is the process of encoding information using fewer bits (or other information-bearing units) than an unencoded representation would use, through use of specific encoding schemes. As with any communication, compressed data communication only works when both the sender and receiver of the information understand the encoding scheme. For example, this text makes sense only if the receiver understands that it is intended to be interpreted as characters representing the English language. Similarly, compressed data can only be understood if the decoding method is known by the receiver.

Compression is useful because it helps reduce the consumption of expensive resources, such as hard disk space or transmission bandwidth. On the downside, compressed data must be decompressed to be used, and this extra processing may be detrimental to some applications. For instance, a compression scheme for video may require expensive hardware for the video to be decompressed fast enough to be viewed as it's being decompressed (the option of decompressing the video in full before watching it may be inconvenient, and requires storage space for the decompressed video). The design of data compression schemes therefore involves trade-offs among various factors, including the degree of compression, the amount of distortion introduced (if using a lossy compression scheme), and the computational resources required to compress and uncompress the data.

LOSSLESS VERSUS LOSSY COMPRESSION

Lossless compression algorithms usually exploit statistical redundancy in such a way as to represent the sender's data more concisely without error. Lossless compression is possible because most real-world data has statistical redundancy. For example, in English text, the letter 'e' is much more common than the letter 'z', and the probability that the letter 'q' will be followed by the letter 'z' is very small.

Another kind of compression, called lossy data compression or perceptual coding, is possible if some loss of fidelity is acceptable. Generally, a lossy data compression will be guided by research on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than it is to variations in color. JPEG image compression works in part by "rounding off" some of this less-important information. Lossy data compression provides a way to obtain the best fidelity for a given amount of compression. In some cases, transparent (unnoticeable) compression is desired; in other cases, fidelity is sacrificed to reduce the amount of data as much as possible.

Lossless compression schemes are reversible so that the original data can be reconstructed, while lossy schemes accept some loss of data in order to achieve higher compression. However, lossless data compression algorithms will always fail to compress some files; indeed, any compression algorithm will necessarily fail to compress any data containing no discernible patterns. Attempts to compress data that has been compressed already will therefore usually result in an expansion, as will attempts to compress all but the most trivially encrypted data. In practice, lossy data compression will also come to a point where compressing again does not work, although an extremely lossy algorithm, like for example always removing the last byte of a file, will always compress a file up to the point where it is empty.

An example of lossless vs. lossy compression is the following string:
25.888888888
This string can be compressed as:
25.[9]8
Interpreted as "twenty five point 9 eights", the original string is perfectly recreated, just written in a smaller form. In a lossy system, using 26 instead, the original data is lost, at the benefit of a smaller file size.

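A minimal Python sketch of this example, assuming the "25.[9]8" notation means "the prefix 25. followed by nine 8s"; the function names are illustrative:

    # Lossless: record how many times the trailing digit repeats, so the
    # original string can be rebuilt exactly.
    def compress_lossless(s: str) -> str:
        prefix = s.rstrip("8")              # "25."
        run = len(s) - len(prefix)          # nine repeated '8' digits
        return f"{prefix}[{run}]8"          # "25.[9]8"

    def decompress_lossless(code: str) -> str:
        prefix, rest = code.split("[")
        count, digit = rest.split("]")
        return prefix + digit * int(count)

    # Lossy: round the value; the result is even shorter but irreversible.
    def compress_lossy(s: str) -> str:
        return str(round(float(s)))         # "26"

    original = "25.888888888"
    assert decompress_lossless(compress_lossless(original)) == original
    print(compress_lossless(original), compress_lossy(original))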

APPLICATIONS:

The above is a very simple example of run-length encoding, wherein large runs of consecutive identical data values are replaced by a simple code with the data value and length of the run. This is an example of lossless data compression. It is often used to optimize disk space on office computers, or to make better use of the connection bandwidth in a computer network. For symbolic data such as spreadsheets, text, executable programs, etc., losslessness is essential because changing even a single bit cannot be tolerated (except in some limited cases).

For visual and audio data, some loss of quality can be tolerated without losing the essential nature of the data. By taking advantage of the limitations of the human sensory system, a great deal of space can be saved while producing an output which is nearly indistinguishable from the original. These lossy data compression methods typically offer a three-way tradeoff between compression speed, compressed data size and quality loss. Lossy image compression is used in digital cameras, to increase storage capacities with minimal degradation of picture quality. Similarly, DVDs use the lossy MPEG-2 codec for video compression.

In lossy audio compression, methods of psychoacoustics are used to remove non-audible (or less audible) components of the signal. Compression of human speech is often performed with even more specialized techniques, so that "speech compression" or "voice coding" is sometimes distinguished as a separate discipline from "audio compression". Different audio and speech compression standards are listed under audio codecs. Voice compression is used in Internet telephony, for example, while audio compression is used for CD ripping and is decoded by audio players.

THEORY:

The theoretical background of compression is provided by information theory (which is closely related to algorithmic information theory) and by rate-distortion theory. These fields of study were essentially created by Claude Shannon, who published fundamental papers on the topic in the late 1940s and early 1950s. Cryptography and coding theory are also closely related. The idea of data compression is deeply connected with statistical inference. Many lossless data compression systems can be viewed in terms of a four-stage model. Lossy data compression systems typically include even more stages, including, for example, prediction, frequency transformation, and quantization.

The Lempel-Ziv (LZ) compression methods are among the most popular algorithms for lossless storage. DEFLATE is a variation on LZ which is optimized for decompression speed and compression ratio; therefore compression can be slow. DEFLATE is used in PKZIP, gzip and PNG. LZW (Lempel-Ziv-Welch) is used in GIF images. Also noteworthy are the LZR (LZ-Renau) methods, which serve as the basis of the Zip method.
LZ methods utilize a table-based compression model where table entries are substituted for repeated strings of data. For most LZ methods, this table is generated dynamically from earlier data in the input. The table itself is often Huffman encoded (e.g. SHRI, LZX). A current LZ-based coding scheme that performs well is LZX, used in Microsoft's CAB format.

The very best compressors use probabilistic models whose predictions are coupled to an algorithm called arithmetic coding. Arithmetic coding, invented by Jorma Rissanen, and turned into a practical method by Witten, Neal, and Cleary, achieves superior compression to the better-known Huffman algorithm, and lends itself especially well to adaptive data compression tasks where the predictions are strongly context-dependent. Arithmetic coding is used in the bilevel image-compression standard JBIG, and the document-compression standard DjVu. The text entry system Dasher is an inverse arithmetic coder.

There is a close connection between machine learning and compression: a system that predicts the posterior probabilities of a sequence given its entire history can be used for optimal data compression (by using arithmetic coding on the output distribution), while an optimal compressor can be used for prediction (by finding the symbol that compresses best, given the previous history). This equivalence has been used as justification for data compression as a benchmark for "general intelligence" [1].

LOSSLESS DATA COMPRESSION

Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange for better compression rates.

Lossless data compression is used in many applications. For example, it is used in the popular ZIP file format and in the Unix tool gzip. It is also often used as a component within lossy data compression technologies. Lossless compression is used when it is important that the original and the decompressed data be identical, or when no assumption can be made on whether a certain deviation is uncritical. Typical examples are executable programs and source code. Some image file formats, like PNG or GIF, use only lossless compression, while others like TIFF and MNG may use either lossless or lossy methods.

LOSSLESS COMPRESSION TECHNIQUES

Most lossless compression programs do two things in sequence: the first step generates a statistical model for the input data, and the second step uses this model to map input data to bit sequences in such a way that "probable" (e.g. frequently encountered) data will produce shorter output than "improbable" data. The primary encoding algorithms used to produce bit sequences are Huffman coding (also used by DEFLATE) and arithmetic coding. Arithmetic coding achieves compression rates close to the best possible for a particular statistical model, which is given by the information entropy, whereas Huffman compression is simpler and faster but produces poor results for models that deal with symbol probabilities close to 1.

There are two primary ways of constructing statistical models. In a static model, the data is analyzed and a model is constructed, then this model is stored with the compressed data. This approach is simple and modular, but has the disadvantage that the model itself can be expensive to store, and also that it forces a single model to be used for all data being compressed, and so performs poorly on files containing heterogeneous data. Adaptive models dynamically update the model as the data is compressed. Both the encoder and decoder begin with a trivial model, yielding poor compression of initial data, but as they learn more about the data, performance improves. Most popular types of compression used in practice now use adaptive coders.

Lossless compression methods may be categorized according to the type of data they are designed to compress. While, in principle, any general-purpose lossless compression algorithm (general-purpose meaning that it can compress any bitstring) can be used on any type of data, many are unable to achieve significant compression on data that is not of the form for which they were designed. Many of the lossless compression techniques used for text also work reasonably well for indexed images.

Text: Statistical modeling algorithms for text (or text-like binary data such as executables) include:
Context Tree Weighting method (CTW)
Burrows-Wheeler transform (block sorting preprocessing that makes compression more efficient)
LZ77 (used by DEFLATE)
LZW

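As a concrete illustration of the two-step scheme described above (build a statistical model, then assign shorter bit sequences to more probable symbols), here is a minimal Huffman-coding sketch in Python; the function and variable names are illustrative, not taken from any particular library:

    import heapq
    from collections import Counter

    def huffman_code(text: str) -> dict:
        freq = Counter(text)                 # step 1: the statistical model
        if len(freq) == 1:                   # degenerate single-symbol input
            return {next(iter(freq)): "0"}
        # step 2: repeatedly merge the two least frequent subtrees; the entry
        # index breaks ties so the heap never has to compare dictionaries
        heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        next_id = len(heap)
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)
            f2, _, right = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in left.items()}
            merged.update({s: "1" + c for s, c in right.items()})
            heapq.heappush(heap, (f1 + f2, next_id, merged))
            next_id += 1
        return heap[0][2]                    # symbol -> bit string

    message = "this is an example of a huffman tree"
    codes = huffman_code(message)
    encoded = "".join(codes[ch] for ch in message)
    print(len(encoded), "bits")              # frequent symbols get shorter codes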

Multimedia:

Techniques that take advantage of the specific characteristics of images, such as the common phenomenon of contiguous 2-D areas of similar tones. Every pixel but the first is replaced by the difference to its left neighbor. This leads to small values having a much higher probability than large values. This is often also applied to sound files and can compress files which contain mostly low frequencies and low volumes. For images this step can be repeated by taking the difference to the top pixel, and then in videos the difference to the pixel in the next frame can be taken.

A hierarchical version of this technique takes neighboring pairs of data points, stores their difference and sum, and on a higher level with lower resolution continues with the sums. This is called discrete wavelet transform. JPEG2000 additionally uses data points from other pairs and multiplication factors to mix them into the difference. These factors have to be integers so that the result is an integer under all circumstances. So the values are increased, increasing file size, but hopefully the distribution of values is more peaked. The adaptive encoding uses the probabilities from the previous sample in sound encoding, from the left and upper pixel in image encoding, and additionally from the previous frame in video encoding. In the wavelet transformation the probabilities are also passed through the hierarchy.

Historical legal issues:

Many of these methods are implemented in open-source and proprietary tools, particularly LZW and its variants. Some algorithms are patented in the USA and other countries and their legal usage requires licensing by the patent holder. Because of patents on certain kinds of LZW compression, and in particular licensing practices by patent holder Unisys that many developers considered abusive, some open source proponents encouraged people to avoid using the Graphics Interchange Format (GIF) for compressing image files in favor of Portable Network Graphics (PNG), which combines the LZ77-based deflate algorithm with a selection of domain-specific prediction filters. However, the patents on LZW have now expired.

Many of the lossless compression techniques used for text also work reasonably well for indexed images, but there are other techniques that do not work for typical text that are useful for some images (particularly simple bitmaps), and other techniques that take advantage of the specific characteristics of images (such as the common phenomenon of contiguous 2-D areas of similar tones, and the fact that color images usually favor a limited range of colors out of those representable in the color space).

As mentioned previously, lossless sound compression is a somewhat specialised area. Lossless sound compression algorithms can take advantage of the repeating patterns shown by the wave-like nature of the data, essentially using models to predict the "next" value and encoding the (hopefully small) difference between the expected value and the actual data. If the difference between the predicted and the actual data (called the "error") tends to be small, then certain difference values (like 0, +1, -1 etc. on sample values) become very frequent, which can be exploited by encoding them in few output bits.

It is sometimes beneficial to compress only the differences between two versions of a file (or, in video compression, of an image). This is called delta compression (from the Greek letter delta, which is commonly used in mathematics to denote a difference), but the term is typically only used if both versions are meaningful outside compression and decompression. For example, while the process of compressing the error in the above-mentioned lossless audio compression scheme could be described as delta compression from the approximated sound wave to the original sound wave, the approximated version of the sound wave is not meaningful in any other context.

LOSSLESS COMPRESSION METHODS

By operation of the pigeonhole principle, no lossless compression algorithm can efficiently compress all possible data, and completely random data streams cannot be compressed. For this reason, many different algorithms exist that are designed either with a specific type of input data in mind or with specific assumptions about what kinds of redundancy the uncompressed data are likely to contain. Some of the most common lossless compression algorithms are listed below.

General purpose:
Run-length encoding - a simple scheme that provides good compression of data containing lots of runs of the same value (a minimal sketch follows this list)
LZW - used by GIF and compress, among others
Deflate - used by gzip, modern versions of zip and as part of the compression process of PNG, PPP, HTTP, SSH

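A minimal run-length encoding sketch in Python, as referenced in the list above; the function names and sample string are illustrative:

    from itertools import groupby

    def rle_encode(data: str) -> list:
        # Replace each run of identical values by a (value, run length) pair.
        return [(value, len(list(group))) for value, group in groupby(data)]

    def rle_decode(pairs: list) -> str:
        return "".join(value * count for value, count in pairs)

    sample = "W" * 10 + "B" + "W" * 12 + "B" * 3
    encoded = rle_encode(sample)           # [('W', 10), ('B', 1), ('W', 12), ('B', 3)]
    assert rle_decode(encoded) == sample   # lossless: the original is recovered exactly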

Audio

Apple Lossless - ALAC (Apple Lossless Audio Codec)
ATRAC Advanced Lossless
Audio Lossless Coding - also known as MPEG-4 ALS
MPEG-4 SLS - also known as HD-AAC
Direct Stream Transfer - DST
Dolby TrueHD
DTS-HD Master Audio
Free Lossless Audio Codec - FLAC
Meridian Lossless Packing - MLP
Monkey's Audio - Monkey's Audio APE
OptimFROG
RealPlayer - RealAudio Lossless
Shorten - SHN
TTA - True Audio Lossless
WavPack - WavPack lossless
WMA Lossless - Windows Media Lossless

Graphics

ABO - Adaptive Binary Optimization
GIF (lossless, but contains a very limited color range)
JBIG2 (lossless or lossy compression of B&W images)
JPEG-LS (lossless/near-lossless compression standard)
JPEG 2000 (includes a lossless compression method, as proven by Sunil Kumar, Prof., San Diego State University)
JPEG XR - formerly WMPhoto and HD Photo, includes a lossless compression method
PGF - Progressive Graphics File (lossless or lossy compression)
PNG - Portable Network Graphics
TIFF - Tagged Image File Format

Video

Animation codec
CorePNG
FFV1
JPEG 2000
Huffyuv
Lagarith
MSU Lossless Video Codec
SheerVideo

CODING THEORY
A code is just a set of elements. Coding theory is an approach to various science disciplines -- such as information theory, electrical engineering, digital communication, mathematics, and computer science -- which helps design efficient and reliable data transmission methods so that redundancy can be removed and errors corrected. It also deals with the properties of codes and with their fitness for a specific application. There are three classes of codes:
1. Source coding (Data compression)
2. Channel coding (Forward error correction)
3. Joint source and channel coding

The first, source encoding, attempts to compress the data from a source in order to transmit it more efficiently. This practice is found every day on the Internet, where the common "Zip" data compression is used to reduce the network load and make files smaller.

The second, channel encoding, adds extra data bits to make the transmission of data more robust to disturbances present on the transmission channel. The ordinary user may not be aware of many applications using channel coding. A typical music CD uses the Reed-Solomon code to correct for scratches and dust. In this application the transmission channel is the CD itself. Cell phones also use coding techniques to correct for the fading and noise of high frequency radio transmission. Data modems, telephone transmissions, and NASA all employ channel coding techniques to get the bits through, for example the turbo code and LDPC codes.

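To make the source-coding side concrete, the following sketch uses Python's standard zlib module (an implementation of the DEFLATE algorithm used by zip and gzip) to shrink a redundant message; the sample text is only an illustration:

    import zlib

    message = b"the quick brown fox jumps over the lazy dog " * 100   # redundant source
    packed = zlib.compress(message)        # source coding: squeeze out redundancy
    restored = zlib.decompress(packed)     # exact reconstruction (lossless)

    assert restored == message
    print(len(message), "bytes ->", len(packed), "bytes")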


Source coding:
The aim of source coding is to take the source data and make it smaller (data compression is discussed in more depth above).

Principle
Entropy of a source is the measure of information. Basically, source codes try to reduce the redundancy present in the source, and represent the source with fewer bits that carry more information. Data compression which explicitly tries to minimise the average length of messages according to a particular assumed probability model is called entropy encoding. Various techniques used by source coding schemes try to achieve the limit of the entropy of the source: C(x) ≥ H(x), where H(x) is the entropy of the source (bitrate), and C(x) is the bitrate after compression. In particular, no source coding scheme can be better than the entropy of the source.

Example
Facsimile transmission uses a simple run length code.

Channel coding:
The aim of channel coding theory is to find codes which transmit quickly, contain many valid code words and can correct or at least detect many errors. While not mutually exclusive, performance in these areas is a trade-off, so different codes are optimal for different applications. The needed properties of this code mainly depend on the probability of errors happening during transmission. In a typical CD, the impairment is mainly dust or scratches, so codes are used in an interleaved manner and the data is spread out over the disk.

Although not a very good code, a simple repeat code can serve as an understandable example. Suppose we take a block of data bits (representing sound) and send it three times. At the receiver we will examine the three repetitions bit by bit and take a majority vote. The twist on this is that we don't merely send the bits in order. We interleave them. The block of data bits is first divided into 4 smaller blocks. Then we cycle through the block and send one bit from the first, then the second, etc. This is done three times to spread the data out over the surface of the disk. In the context of the simple repeat code, this may not appear effective. However, there are more powerful codes known which are very effective at correcting the "burst" error of a scratch or a dust spot when this interleaving technique is used.

Other codes are more appropriate for different applications. Deep space communications are limited by the thermal noise of the receiver, which is more of a continuous nature than a bursty nature. Likewise, narrowband modems are limited by the noise present in the telephone network, which is also better modeled as a continuous disturbance. Cell phones are subject to rapid fading: the high frequencies used can cause rapid fading of the signal even if the receiver is moved a few inches. Again, there is a class of channel codes designed to combat fading.

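A small Python sketch of the repeat-and-majority-vote idea described above, simplified so that the three copies of each bit are spaced a full block apart (rather than interleaved exactly as in the CD example); the burst position and names are illustrative:

    def encode(bits):
        # Send the whole block three times; the copies of any one bit end up
        # far apart in the transmitted stream.
        return bits * 3

    def decode(received, block_len):
        # Majority vote over the three copies of each bit.
        recovered = []
        for i in range(block_len):
            votes = received[i] + received[i + block_len] + received[i + 2 * block_len]
            recovered.append(1 if votes >= 2 else 0)
        return recovered

    data = [1, 0, 1, 1, 0, 0, 1, 0]
    sent = encode(data)

    # A burst error (a scratch) flips several consecutive bits, but it can
    # damage at most one of the three copies of any given data bit.
    corrupted = sent[:]
    for i in range(3, 7):
        corrupted[i] ^= 1

    assert decode(corrupted, len(data)) == data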

ALGEBRAIC CODING

The term algebraic coding theory denotes the sub-field of coding theory where the properties of codes are expressed in algebraic terms and then further researched. Algebraic coding theory is basically divided into two major types of codes:
1. Linear block codes
2. Convolutional codes
It mainly analyzes the following three properties of a code:
the code word length
the total number of valid code words
the minimum distance between two valid code words, using mainly the Hamming distance, sometimes also other distances like the Lee distance (a short sketch of Hamming distance appears after the list of code types below)

Linear block codes
Linear block codes have the property of linearity, i.e. the sum of any two codewords is also a code word, and they are applied to the source bits in blocks, hence the name linear block codes. There are block codes that are not linear, but it is difficult to prove that a code is a good one without this property.[1] Linear block codes are summarized by their symbol alphabets (e.g. binary or ternary) and parameters (n, m, d_min)[2], where
1. n is the length of the codeword, in symbols,
2. m is the number of source symbols that will be used for encoding at once,
3. d_min is the minimum Hamming distance for the code.
There are many types of linear block codes, such as
1. Cyclic codes (Hamming code is a subset of cyclic codes)
2. Repetition codes
3. Parity codes
4. Polynomial codes (BCH codes are a subset of the polynomial codes)
5. Reed-Solomon codes
6. Algebraic geometric codes
7. Reed-Muller codes
8. Perfect codes

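A small Python sketch of the Hamming distance and of the minimum distance of a code, using the length-3 binary repetition code (parameters (3, 1, 3)) as the example; the helper names are illustrative:

    from itertools import combinations

    def hamming_distance(a: str, b: str) -> int:
        # Number of positions in which two equal-length code words differ.
        return sum(x != y for x, y in zip(a, b))

    def minimum_distance(code) -> int:
        return min(hamming_distance(a, b) for a, b in combinations(code, 2))

    repetition_code = ["000", "111"]            # the (3, 1, 3) repetition code
    print(hamming_distance("000", "111"))       # 3
    print(minimum_distance(repetition_code))    # 3, so any single bit error is correctable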

Block codes are tied to the sphere packing problem, which has received some attention over the years. In two dimensions, it is easy to visualize. Take a bunch of pennies flat on the table and push them together. The result is a hexagon pattern like a bee's nest. But block codes rely on more dimensions which cannot easily be visualized. The powerful Golay code used in deep space communications uses 24 dimensions. If used as a binary code (which it usually is) the dimensions refer to the length of the codeword as defined above.

The theory of coding uses the N-dimensional sphere model. For example, how many pennies can be packed into a circle on a tabletop, or in 3 dimensions, how many marbles can be packed into a globe. Other considerations enter the choice of a code. For example, hexagon packing into the constraint of a rectangular box will leave empty space at the corners. As the dimensions get larger, the percentage of empty space grows smaller. But at certain dimensions, the packing uses all the space and these codes are the so-called "perfect" codes. The only nontrivial and useful perfect codes are the distance-3 Hamming codes with parameters satisfying (2^r - 1, 2^r - 1 - r, 3), and the [23,12,7] binary and [11,6,5] ternary Golay codes.[2][1]

Another code property is the number of neighbors a single codeword may have. Again, let's use pennies as an example. First we pack the pennies in a rectangular grid. Each penny will have 4 near neighbors (and 4 at the corners which are farther away). In a hexagon, each penny will have 6 near neighbors. When we increase the dimensions, the number of near neighbors increases very rapidly. The result is that the number of ways for noise to make the receiver choose a neighbor (hence an error) grows as well. This is a fundamental limitation of block codes, and indeed all codes. It may be harder to cause an error to a single neighbor, but the number of neighbors can be large enough so the total error probability actually suffers.

Convolution codes
Convolutional codes are used in voiceband modems (V.32, V.17, V.34) and in GSM mobile phones, as well as satellite and military communication devices. Here the idea is to make every codeword symbol be the weighted sum of the various input message symbols. This is like convolution used in LTI systems to find the output of a system, when you know the input and impulse response. So the output of the convolutional encoder is generally found by convolving the input bits against the states of the encoder's registers. Fundamentally, convolutional codes do not offer more protection against noise than an equivalent block code. In many cases, they generally offer greater simplicity of implementation over a block code of equal power. The encoder is usually a simple circuit which has state memory and some feedback logic, normally XOR gates.
The decoder can be implemented in software or firmware. The Viterbi algorithm is the optimum algorithm used to decode convolutional codes. There are simplifications to reduce the computational load; they rely on searching only the most likely paths. Although not optimum, they have generally been found to give good results in lower noise environments.

Other applications of coding theory
Another concern of coding theory is designing codes that help synchronization. A code may be designed so that a phase shift can be easily detected and corrected and that multiple signals can be sent on the same channel.

Another application of codes, used in some mobile phone systems, is code-division multiple access (CDMA). Each phone is assigned a code sequence that is approximately uncorrelated with the codes of other phones. When transmitting, the code word is used to modulate the data bits representing the voice message. At the receiver, a demodulation process is performed to recover the data. The properties of this class of codes allow many users (with different codes) to use the same radio channel at the same time. To the receiver, the signals of other users will appear to the demodulator only as low-level noise.

Another general class of codes are the automatic repeat-request (ARQ) codes. In these codes the sender adds redundancy to each message for error checking, usually by adding check bits. If the check bits are not consistent with the rest of the message when it arrives, the receiver will ask the sender to retransmit the message. All but the simplest wide area network protocols use ARQ. Common protocols include SDLC (IBM), TCP (Internet), X.25 (International) and many others. There is an extensive field of research on this topic because of the problem of matching a rejected packet against a new packet. Is it a new one or is it a retransmission? Typically numbering schemes are used, as in TCP (RFC 793, IETF, September 1981, http://tools.ietf.org/html/rfc793).

Properties of linear block codes are used in many applications. For example, the syndrome-coset uniqueness property of linear block codes is used in trellis shaping [3], one of the best known shaping codes. The same property is used in sensor networks for distributed source coding.

Decision Making

Some Definitions
A good place to start is with some standard definitions of decision making.
1. Decision making is the study of identifying and choosing alternatives based on the values and preferences of the decision maker.
Making a decision implies that there are alternative choices to be considered, and in such a case we want not only to identify as many of these alternatives as possible but to choose the one that (1) has the highest probability of success or effectiveness and (2) best fits with our goals, desires, lifestyle, values, and so on.

2. Decision making is the process of sufficiently reducing uncertainty and doubt about alternatives to allow a reasonable choice to be made from among them. This definition stresses the information-gathering function of decision making. It should be noted here that uncertainty is reduced rather than eliminated. Very few decisions are made with absolute certainty because complete knowledge about all the alternatives is seldom possible. Thus, every decision involves a certain amount of risk.

Kinds of Decisions
There are several basic kinds of decisions.

1. Decisions whether. This is the yes/no, either/or decision that must be made before we proceed with the selection of an alternative. Should I buy a new TV? Should I travel this summer? Decisions whether are made by weighing reasons pro and con. The PMI technique discussed in the next chapter is ideal for this kind of decision.

2. Decisions which. These decisions involve a choice of one or more alternatives from among a set of possibilities, the choice being based on how well each alternative measures up to a set of predefined criteria.

3. Contingent decisions. These are decisions that have been made but put on hold until some condition is met. Most people carry around a set of already made, contingent decisions, just waiting for the right conditions or opportunity to arise. Time, energy, price, availability, opportunity, encouragement--all these factors can figure into the necessary conditions that need to be met before we can act on our decision.

Decision Making is a Recursive Process
A critical factor that decision theorists sometimes neglect to emphasize is that in spite of the way the process is presented on paper, decision making is a nonlinear, recursive process. That is, most decisions are made by moving back and forth between the choice of criteria (the characteristics we want our choice to meet) and the identification of alternatives (the possibilities we can choose from among). The alternatives available influence the criteria we apply to them, and similarly the criteria we establish influence the alternatives we will consider.
Let's look at an example to clarify this. Suppose someone wants to decide, "Should I get married?" Notice that this is a decision whether. A linear approach to decision making would be to decide this question by weighing the reasons pro and con (what are the benefits and drawbacks of getting married) and then to move to the next part of the process, the identification of criteria (supportive, easy going, competent, affectionate, etc.). Next, we would identify alternatives likely to have these criteria (Kathy, Jennifer, Michelle, Julie, etc.). Finally we would evaluate each alternative according to the criteria and choose the one that best meets the criteria. We would thus have a scheme like this:

decision whether ... select criteria ... identify alternatives ... make choice

However, the fact is that our decision whether to get married may really be a contingent decision. "I'll get married if I can find the right person." It will thus be influenced by the identification of alternatives, which we usually think of as a later step in the process. Similarly, suppose we have arrived at the "identify alternatives" stage of the process when we discover that Jennifer (one of the girls identified as an alternative) has a wonderful personality characteristic that we had not even thought of before, but that we now really want to have in a wife. We immediately add that characteristic to our criteria. Thus, the decision making process continues to move back and forth, around and around as it progresses in what will eventually be a linear direction but which in its actual workings is highly recursive. The key point, then, is that the characteristics of the alternatives we discover will often revise the criteria we have previously identified.

The Components of Decision Making

The Decision Environment
Every decision is made within a decision environment, which is defined as the collection of information, alternatives, values, and preferences available at the time of the decision. An ideal decision environment would include all possible information, all of it accurate, and every possible alternative. However, both information and alternatives are constrained because the time and effort to gain information or identify alternatives are limited. The time constraint simply means that a decision must be made by a certain time. The effort constraint reflects the limits of manpower, money, and priorities. (You wouldn't want to spend three hours and half a tank of gas trying to find the very best parking place at the mall.) Since decisions must be made within this constrained environment, we can say that the major challenge of decision making is uncertainty, and a major goal of decision analysis is to reduce uncertainty.
We can almost never have all information needed to make a decision with certainty, so most decisions involve an undeniable amount of risk.

The fact that decisions must be made within a limiting decision environment suggests two things. First, it explains why hindsight is so much more accurate and better at making decisions than foresight. As time passes, the decision environment continues to grow and expand. New information and new alternatives appear--even after the decision must be made. Armed with new information after the fact, the hindsighters can many times look back and make a much better decision than the original maker, because the decision environment has continued to expand.

The second thing suggested by the decision-within-an-environment idea follows from the above point. Since the decision environment continues to expand as time passes, it is often advisable to put off making a decision until close to the deadline. Information and alternatives continue to grow as time passes, so to have access to the most information and to the best alternatives, do not make the decision too soon. Now, since we are dealing with real life, it is obvious that some alternatives might no longer be available if too much time passes; that is a tension we have to work with, a tension that helps to shape the cutoff date for the decision.

Delaying a decision as long as reasonably possible, then, provides three benefits:
1. The decision environment will be larger, providing more information. There is also time for more thoughtful and extended analysis.
2. New alternatives might be recognized or created. Version 2.0 might be released.
3. The decision maker's preferences might change. With further thought, wisdom, and maturity, you may decide not to buy car X and instead to buy car Y.

The Effects of Quantity on Decision Making
Many decision makers have a tendency to seek more information than required to make a good decision. When too much information is sought and obtained, one or more of several problems can arise. (1) A delay in the decision occurs because of the time required to obtain and process the extra information. This delay could impair the effectiveness of the decision or solution. (2) Information overload will occur. In this state, so much information is available that decision-making ability actually declines because the information in its entirety can no longer be managed or assessed appropriately. A major problem caused by information overload is forgetfulness. When too much information is taken into memory, especially in a short period of time, some of the information (often that received early on) will be pushed out.
The example is sometimes given of the man who spent the day at an information-heavy seminar. At the end of the day, he was not only unable to remember the first half of the seminar but he had also forgotten where he parked his car that morning. (3) Selective use of the information will occur. That is, the decision maker will choose from among all the information available only those facts which support a preconceived solution or position. (4) Mental fatigue occurs, which results in slower work or poor quality work. (5) Decision fatigue occurs, where the decision maker tires of making decisions. Often the result is fast, careless decisions or even decision paralysis--no decisions are made at all.

The quantity of information that can be processed by the human mind is limited. Unless information is consciously selected, processing will be biased toward the first part of the information received. After that, the mind tires and begins to ignore subsequent information or forget earlier information.

Decision Streams
A common misconception about decision making is that decisions are made in isolation from each other: you gather information, explore alternatives, and make a choice, without regard to anything that has gone before. The fact is, decisions are made in a context of other decisions. The typical metaphor used to explain this is that of a stream. There is a stream of decisions surrounding a given decision: many decisions made earlier have led up to this decision and made it both possible and limited, and many other decisions will follow from it.

Another way to describe this situation is to say that most decisions involve a choice from a group of preselected alternatives, made available to us from the universe of alternatives by the previous decisions we have made. Previous decisions have "activated" or "made operable" certain alternatives and "deactivated" or "made inoperable" others. For example, when you decide to go to the park, your decision has been enabled by many previous decisions. You had to decide to live near the park; you had to decide to buy a car or learn about bus routes, and so on. And your previous decisions have constrained your subsequent ones: you can't decide to go to a park this afternoon if it is three states away. By deciding to live where you do, you have both enabled and disabled a whole series of other decisions.

As another example, when you enter a store to buy a VCR or TV, you are faced with the preselected alternatives stocked by the store. There may be 200 models available in the universe of models, but you will be choosing from, say, only a dozen.
In this case, your decision has been constrained by the decisions made by others about which models to carry.

We might say, then, that every decision (1) follows from previous decisions, (2) enables many future decisions, and (3) prevents other future decisions. People who have trouble making decisions are sometimes trapped by the constraining nature of decision making. Every decision you make precludes other decisions, and therefore might be said to cause a loss of freedom. If you decide to marry Terry, you no longer can decide to marry Shawn. However, just as making a decision causes a loss of freedom, it also creates new freedom, new choices and new possibilities. So making a decision is liberating as well as constraining. And a decision left unmade will often result in a decision by default or a decision being made for you.

It is important to realize that every decision you make affects the decision stream and the collections of alternatives available to you both immediately and in the future. In other words, decisions have far-reaching consequences.

Concepts and Definitions

1. Information. This is knowledge about the decision, the effects of its alternatives, the probability of each alternative, and so forth. A major point to make here is that while substantial information is desirable, the statement that "the more information, the better" is not true. Too much information can actually reduce the quality of a decision. See the discussion on The Effects of Quantity on Decision Making above.

2. Alternatives. These are the possibilities one has to choose from. Alternatives can be identified (that is, searched for and located) or even developed (created where they did not previously exist). Merely searching for preexisting alternatives will result in less effective decision making.

3. Criteria. These are the characteristics or requirements that each alternative must possess to a greater or lesser extent. Usually the alternatives are rated on how well they possess each criterion. For example, alternative Toyota ranks an 8 on the criterion of economy, while alternative Buick ranks a 6 on the same criterion.

4. Goals. What is it you want to accomplish? Strangely enough, many decision makers collect a bunch of alternatives (say cars to buy or people to marry) and then ask, "Which should I choose?" without thinking first of what their goals are, what overall objective they want to achieve. Next time you find yourself asking, "What should I do? What should I choose?" ask yourself first, "What are my goals?"
"What should I do? What should I choose?" ask yourself first, "What are my goals?" A component of goal identification should be included in every instance of decision analysis. 5. Value. Value refers to how desirable a particular outcome is, the value of the alternative, whether in dollars, satisfaction, or other benefit. 6. Preferences. These reflect the philosophy and moral hierarchy of the decision maker. We could say that they are the decision maker's "values," but that might be confusing with the other use of the word, above. If we could use that word here, we would say that personal values dictate preferences. Some people prefer excitement to calmness, certainty to risk, efficiency to esthetics, quality to quantity, and so on. Thus, when one person chooses to ride the wildest roller coaster in the park and another chooses a mild ride, both may be making good decisions, if based on their individual preferences. 7. Decision Quality. This is a rating of whether a decision is good or bad. A good decision is a logical one based on the available information and reflecting the preferences of the decision maker. The important concept to grasp here is that the quality of a decision is not related to its outcome: a good decision can have either a good or a bad outcome. Similarly, a bad decision (one not based on adequate information or not reflecting the decision maker's preferences) can still have a good outcome. For example, if you do extensive analysis and carefully decide on a certain investment based on what you know about its risks and your preferences, then your decision is a good one, even though you may lose money on the investment. Similarly, if you throw a dart at a listing of stocks and buy the one the dart hits, your decision is a bad one, even though the stock may go up in value. Good decisions that result in bad outcomes should thus not be cause for guilt or recrimination. If you decide to take the scenic route based on what you know of the road (reasonably safe, not heavily traveled) and your preferences (minimal risk, prefer scenery over early arrival), then your decision is a good one, even though you might happen to get in an accident, or have a flat tire in the middle of nowhere. It is not justified to say, "Well, this was a bad decision." In judging the quality of a decision, in addition to the concerns of logic, use of information and alternatives, three other considerations come into play:

22

A. The decision must meet the stated objectives most thoroughly and completely. How well does the alternative chosen meet the goals identified?
B. The decision must meet the stated objectives most efficiently, with concern over cost, energy, side effects. Are there negative consequences to the alternative that make that choice less desirable? We sometimes overlook this consideration in our search for thrills.
C. The decision must take into account valuable byproducts or indirect advantages. A new employee candidate may also have extra abilities not directly related to the job but valuable to the company nonetheless. These should be taken into account.

8. Acceptance. Those who must implement the decision or who will be affected by it must accept it both intellectually and emotionally. Acceptance is a critical factor because it occasionally conflicts with one of the quality criteria. In such cases, the best thing to do may be to choose a lesser quality solution that has greater acceptance. Thus, the inferior method may produce greater results if the inferior one has greater support. One of the most important considerations in decision making, then, is the people factor. Always consider a decision in light of the people implementation. A decision that may be technologically brilliant but that is sociologically stupid will not work. Only decisions that are implemented, and implemented with thoroughness (and preferably enthusiasm) will work the way they are intended to.

Approaches to Decision Making
There are two major approaches to decision making in an organization, the authoritarian method in which an executive figure makes a decision for the group and the group method in which the group decides what to do.

1. Authoritarian. The manager makes the decision based on the knowledge he can gather. He then must explain the decision to the group and gain their acceptance of it. In some studies, the time breakdown for a typical operating decision is something like this: make decision, 5 min.; explain decision, 30 min.; gain acceptance, 30 min.

2. Group. The group shares ideas and analyses, and agrees upon a decision to implement. Studies show that the group often has values, feelings, and reactions quite different from those the manager supposes they have. No one knows the group and its tastes and preferences as well as the group itself. And, interestingly, the time breakdown is something like this: group makes decision, 30 min.; explain decision, 0 min.; gain acceptance, 0 min.

Clearly, just from an efficiency standpoint, group decision making is better. More than this, it has been shown many times that people prefer to implement the ideas they themselves think of. They will work harder and more energetically to implement their own idea than they would to implement an idea imposed on them by others. We all have a love for our own ideas and solutions, and we will always work harder on a solution supported by our own vision and our own ego than we will on a solution we have little creative involvement with.

There are two types of group decision making sessions. First is free discussion in which the problem is simply put on the table for the group to talk about. For example, Joe has been offered a job change from shift supervisor to maintenance foreman. Should he take the job? The other kind of group decision making is developmental discussion or structured discussion. Here the problem is broken down into steps, smaller parts with specific goals. For example, instead of asking generally whether Joe should take the job, the group works on sub questions: What are Joe's skills? What skills does the new job require? How does Joe rate on each of the skills required? Notice that these questions seek specific information rather than more general impressionistic opinions. Developmental discussion (1) insures systematic coverage of a topic and (2) insures that all members of the group are talking about the same aspect of the problem at the same time.

Some Decision Making Strategies
As you know, there are often many solutions to a given problem, and the decision maker's task is to choose one of them. The task of choosing can be as simple or as complex as the importance of the decision warrants, and the number and quality of alternatives can also be adjusted according to importance, time, resources and so on. There are several strategies used for choosing. Among them are the following:

1. Optimizing. This is the strategy of choosing the best possible solution to the problem, discovering as many alternatives as possible and choosing the very best. How thoroughly optimizing can be done is dependent on
A. importance of the problem
B. time available for solving it
C. cost involved with alternative solutions
D. availability of resources, knowledge
E. personal psychology, values
Note that the collection of complete information and the consideration of all alternatives is seldom possible for most major decisions, so that limitations must be placed on alternatives.

2. Satisficing. In this strategy, the first satisfactory alternative is chosen rather than the best alternative. If you are very hungry, you might choose to stop at the first decent looking restaurant in the next town rather than attempting to choose the best restaurant from among all (the optimizing strategy). The word satisficing was coined by combining satisfactory and sufficient. For many small decisions, such as where to park, what to drink, which pen to use, which tie to wear, and so on, the satisficing strategy is perfect.

3. Maximax. This stands for "maximize the maximums." This strategy focuses on evaluating and then choosing the alternatives based on their maximum possible payoff. This is sometimes described as the strategy of the optimist, because favorable outcomes and high potentials are the areas of concern. It is a good strategy for use when risk taking is most acceptable, when the go-for-broke philosophy is reigning freely.

4. Maximin. This stands for "maximize the minimums." In this strategy, that of the pessimist, the worst possible outcome of each decision is considered and the decision with the highest minimum is chosen. The Maximin orientation is good when the consequences of a failed decision are particularly harmful or undesirable. Maximin concentrates on the salvage value of a decision, or of the guaranteed return of the decision. It's the philosophy behind the saying, "A bird in the hand is worth two in the bush."

Decision Making Procedure
As you read this procedure, remember our discussion earlier about the recursive nature of decision making. In a typical decision making situation, as you move from step to step here, you will probably find yourself moving back and forth also.

1. Identify the decision to be made together with the goals it should achieve. Determine the scope and limitations of the decision. Is the new job to be permanent or temporary or is that not yet known (thus requiring another decision later)? Is the new package for the product to be put into all markets or just into a test market? How might the scope of the decision be changed--that is, what are its possible parameters? When thinking about the decision, be sure to include a clarification of goals: We must decide whom to hire for our new secretary, one who will be able to create an efficient and organized office. Or, we must decide where to go on vacation, where we can relax and get some rest from the fast pace of society.

where to go on vacation, where we can relax and get some rest from the fast pace of society. 2. Get the facts. But remember that you cannot get all the facts. Get as many facts as possible about a decision within the limits of time imposed on you and your ability to process them, but remember that virtually every decision must be made in partial ignorance. Lack of complete information must not be allowed to paralyze your decision. A decision based on partial knowledge is usually better than not making the decision when a decision is really needed. The proverb that "any decision is better than no decision," while perhaps extreme, shows the importance of choosing. When you are racing toward a bridge support, you must decide to turn away to the right or to the left. Which way you turn is less important than the fact that you do indeed turn. As part of your collection of facts, list your feelings, hunches, and intuitive urges. Many decisions must ultimately rely on or be influenced by intuition because of the remaining degree of uncertainty involved in the situation. Also as part of your collection of facts, consult those who will be affected by and who will have to implement your decision. Input from these people not only helps supply you with information and help in making the decision but it begins to produce the acceptance necessary in the implementers because they feel that they are part of the decision making process. As Russell Ackoff noted in The Art of Problem Solving, not consulting people involved in a decision is often perceived as an act of aggression. 3. Develop alternatives. Make a list of all the possible choices you have, including the choice of doing nothing. Not choosing one of the candidates or one of the building sites is in itself a decision. Often a non decision is harmful as we mentioned above--not choosing to turn either right or left is to choose to drive into the bridge. But sometimes the decision to do nothing is useful or at least better than the alternatives, so it should always be consciously included in the decision making process. Also be sure to think about not just identifying available alternatives but creating alternatives that don't yet exist. For example, if you want to choose which major to pursue in college, think not only of the available ones in the catalog, but of designing your own course of study. 4. Rate each alternative. This is the evaluation of the value of each alternative. Consider the negative of each alternative (cost, consequences, problems created, time needed, etc.) and the positive of each (money saved, time saved, added creativity or happiness to company or employees, etc.). Remember here that the alternative that you might like best or that would in the best of all possible worlds

be an obvious choice will, however, not be functional in the real world because of too much cost, time, or lack of acceptance by others. Also don't forget to include indirect factors in the rating. If you are deciding between machines X, Y, and Z and you already have an employee who knows how to operate machine Z, that fact should be considered. If you are choosing an investigative team to send to Japan to look at plant sites and you have very qualified candidates A, B, and C, the fact that B is a very fast typist, a superior photographer or has some other side benefit in addition to being a qualified team member, should be considered. In fact, what you put on your hobbies and interests line on your resume can be quite important when you apply for a job just because employers are interested in getting people with a good collection of additional abilities. 5. Rate the risk of each alternative. In problem solving, you hunt around for a solution that best solves a particular problem, and by such a hunt you are pretty sure that the solution will work. In decision making, however, there is always some degree of uncertainty in any choice. Will Bill really work out as the new supervisor? If we decide to expand into Canada, will our sales and profits really increase? If we let Jane date Fred at age fifteen, will the experience be good? If you decide to marry person X or buy car Y or go to school Z, will that be the best or at least a successful choice? Risks can be rated as percentages, ratios, rankings, grades or in any other form that allows them to be compared. See the section on risk evaluation for more details on risking. 6. Make the decision. If you are making an individual decision, apply your preferences (which may take into account the preferences of others). Choose the path to follow, whether it includes one of the alternatives, more than one of them (a multiple decision) or the decision to choose none. And of course, don't forget to implement the decision and then evaluate the implementation, just as you would in a problem solving experience. One important item often overlooked in implementation is that when explaining the decision to those involved in carrying it out or those who will be affected by it, don't just list the projected benefits: frankly explain the risks and the drawbacks involved and tell why you believe the proposed benefits outweigh the negatives. Implementers are much more willing to support decisions when they (1) understand the risks and (2) believe that they are being treated with honesty and like adults. Remember also that very few decisions are irrevocable. Don't cancel a decision prematurely because many new plans require time to work--it may take years for your new branch office in Paris to get profitable--but don't hesitate to change directions if a particular decision clearly is not working out or is being somehow harmful. You can always make another decision to do something else.
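As an illustration of how the optimizing, maximax, and maximin strategies can point to different choices, here is a minimal sketch. The payoff table, the alternative names, and the numbers are hypothetical examples, not drawn from the text above.

# Hypothetical payoff table: for each alternative, the estimated payoff
# under a bad, a typical, and a good outcome (units are arbitrary).
payoffs = {
    "expand cautiously": [-10, 40, 80],
    "go for broke":      [-60, 30, 200],
    "do nothing":        [0, 0, 0],
}

def maximax(table):
    # Optimist's rule: pick the alternative with the best best-case payoff.
    return max(table, key=lambda alt: max(table[alt]))

def maximin(table):
    # Pessimist's rule: pick the alternative with the best worst-case payoff.
    return max(table, key=lambda alt: min(table[alt]))

print("maximax choice:", maximax(payoffs))   # -> go for broke
print("maximin choice:", maximin(payoffs))   # -> do nothing

The same table, scored under two different rules, yields two different decisions, which is exactly why the choice of strategy should match how acceptable risk is in the situation.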

Non-functional requirement
In systems engineering and requirements engineering, a non-functional requirement is a requirement that specifies criteria that can be used to judge the operation of a system, rather than specific behaviors. This should be contrasted with functional requirements that define specific behavior or functions. In general, functional requirements define what a system is supposed to do whereas non-functional requirements define how a system is supposed to be. Non-functional requirements are often called qualities of a system. Other terms for non-functional requirements are "constraints", "quality attributes", "quality goals" and "quality of service requirements". Qualities, that is, non-functional requirements, can be divided into two main categories:
Execution qualities, such as security and usability, which are observable at run time.
Evolution qualities, such as testability, maintainability, extensibility and scalability, which are embodied in the static structure of the software system.
Examples
A system may be required to present the user with a display of the number of records in a database. This is a functional requirement. How up-to-date this number needs to be is a non-functional requirement. If the number needs to be updated in real time, the system architects must ensure that the system is capable of updating the displayed record count within an acceptably short interval of the number of records changing. Other examples:
Accessibility
Audit and control
Availability (see service level agreement)
Certification
Dependency on other parties
Documentation
Efficiency (resource consumption for given load)
Effectiveness (resulting performance in relation to effort)
Extensibility (adding features, and carry-forward of customizations at next major version upgrade)
Legal and licensing issues
Interoperability
Maintainability
Open Source
Performance / Response time (see Performance Engineering)
Platform compatibility
Price
Portability
Quality (e.g. Faults Discovered, Faults Delivered, Fault Removal Efficacy)
Reliability (e.g. Mean Time Between Failures - MTBF)
Resource constraints (processor speed, memory, disk space, network bandwidth etc.)
Response time
Robustness
Scalability (horizontal, vertical)
Security
Software, tools, standards etc. compatibility
Stability
Safety
Supportability
Testability
Usability by target user community
Functional and non-functional requirements: In general, requirements are partitioned into functional requirements and non-functional requirements. Functional requirements are associated with specific functions, tasks or behaviours the system must support, while non-functional requirements are constraints on various attributes of these functions or tasks. In terms of the ISO quality characteristics for evaluation, the functional requirements address the quality characteristic of functionality while the other quality characteristics are concerned with various kinds of non-functional requirements. Because non-functional requirements tend to be stated in terms of constraints on the results of tasks which are given as functional requirements (e.g., constraints on the speed or efficiency of a given task), a task-based functional requirements statement is a useful skeleton upon which to construct a complete requirements statement. That is the approach taken in this work. It can be helpful to think of non-functional requirements as adverbially related to tasks or functional requirements: how fast, how efficiently, how safely, etc., is a particular task carried out by a particular system?
Functional Requirement (Function): A Functional Requirement is a requirement that, when satisfied, will allow the user to perform some kind of function.
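To make the distinction concrete before turning to the non-functional side, the sketch below revisits the record-count example above: displaying the number of records is the functional requirement, while the freshness of that number is a non-functional one. The one-second threshold, the in-memory list, and the function names are illustrative assumptions, not part of the original text.

import time

DATABASE = ["record-1", "record-2", "record-3"]   # stand-in for a real data store

def record_count() -> int:
    # Functional requirement: the system shall report the number of records.
    return len(DATABASE)

def check_freshness_requirement(max_staleness_s: float = 1.0) -> bool:
    # Non-functional requirement (illustrative): after a record is added, the
    # reported count must reflect the change within max_staleness_s seconds.
    before = record_count()
    DATABASE.append("record-4")
    start = time.monotonic()
    while record_count() == before:
        if time.monotonic() - start > max_staleness_s:
            return False          # requirement violated: the count is stale
    return True                   # count updated within the allowed interval

print(record_count())                    # what the system does
print(check_freshness_requirement())     # how well it does it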

Non-Functional Requirement: A Non-Functional Requirement is usually some form of constraint or restriction that must be considered when designing the solution.
Functional and non-functional requirements have the following characteristics:
Uses simple language
Not ambiguous
Contains only one point
Specific to one type of user
Is qualified
Describes what and not how
Non-functional requirements tend to identify user constraints and system constraints. A system constraint is a constraint imposed by the system and not dictated by a Business Need. Since system constraints are part of a solution, they should be documented in the System Specifications document.
Functional Requirements
Functional requirements capture the intended behavior of the system. This behavior may be expressed as services, tasks or functions the system is required to perform. In product development, it is useful to distinguish between the baseline functionality necessary for any system to compete in that product domain, and features that differentiate the system from competitors' products, and from variants in your company's own product line/family. Features may be additional functionality, or differ from the basic functionality along some quality attribute (such as performance or memory utilization). One strategy for quickly penetrating a market is to produce the core, or stripped-down, basic product, and to add features to variants of the product to be released shortly thereafter. This release strategy is obviously also beneficial in information systems development, staging core functionality for early releases and adding features over the course of several subsequent releases. In many industries, companies produce product lines with different cost/feature variations per product in the line, and product families that include a number of product lines targeted at somewhat different markets or usage situations. What makes these product lines part of a family are some common elements of functionality and identity. A platform-based development approach leverages this commonality, utilizing a set of reusable assets across the family. These strategies have important implications for software architecture. In particular, it is not just the functional requirements of the first product or release that must be supported by the architecture. The functional requirements of early (nearly concurrent) releases need to be explicitly taken into account. Later
releases are accommodated through architectural qualities such as extensibility, flexibility, etc. The latter are expressed as non-functional requirements. Functional Requirements Template Functional Requirement defines a function of a software system and how the system must behave when presented with specific inputs or conditions. These may include calculations, data manipulation and processing and other specific functionality. A typical functional requirement has a unique name, number, summary, and a rationale. Use this template to: Specify particular behaviors of a system. Use the requirements to generate use cases. Each use case describes one or more functional requirement. Help the reader understand why a requirement is needed. Track requirements through the development of the system. Capture the scope, business objectives, and functional and non-functional requirements of the current/proposed system. HUMAN FACTORS Human factors is a term that covers: The science of understanding the properties of human capability (Human Factors Science). The application of this understanding to the design, development and deployment of systems and services (Human Factors Engineering). The art of ensuring successful application of Human Factors Engineering to a programme (sometimes referred to as Human Factors Integration).It can also be called ergonomics. The term "human factors science/research/technologies" is to a large extent synonymous with the term "ergonomics", having separate origins on either side of the Atlantic Ocean but covering the same technical areas. In general, a human factor is a physical or cognitive property of an individual or social behavior which is specific to humans and influences functioning of technological systems as well as human-environment equilibriums. In social interactions, the use of the term human factor stresses the social properties unique to or characteristic of humans. Human factors involves the study of all aspects of the way humans relate to the world around them, with the aim of improving operational performance, safety, through life costs and/or adoption through improvement in the experience

of the end user. The terms human factors and ergonomics have only been widely used in recent times; the field's origin is in the design and use of aircraft during World War II to improve aviation safety. It was in reference to the psychologists and physiologists working at that time and the work that they were doing that the terms "applied psychology" and ergonomics were first coined. Work by Elias Porter, Ph.D. and others within the RAND Corporation after WWII extended these concepts. "As the thinking progressed, a new concept developed - that it was possible to view an organization such as an air-defense, man-machine system as a single organism and that it was possible to study the behavior of such an organism. It was the climate for a breakthrough."Specializations within this field include cognitive ergonomics, usability, human computer/human machine interaction, and user experience engineering. New terms are being generated all the time. For instance, user trial engineer may refer to a human factors professional who specializes in user trials. Although the names change, human factors professionals share an underlying vision that through application of an understanding of human factors the design of equipment, systems and working methods will be improved, directly affecting peoples lives for the better. THE FORMAL HISTORY OF AMERICAN HUMAN FACTORS ENGINEERING The formal history describes activities in known chronological order. This can be divided into 5 markers: - Developments prior to World War I - Developments during World War I - Developments between World War I and World War II - Developments during World War II - Developments after World War II Developments prior to World War I: Prior to WWI the only test of human to machine compatibility was that of trial and error. If the human functioned with the machine he was accepted, if not he was rejected. There was a significant change in the concern for humans during the American civil war. The US patent office was concerned whether the mass produced uniforms and new weapons could be used by the infantry men. The next development was when the American inventor Simon Lake tested submarine operators for psychological factors. The next development was the scientific study of the worker. This was an effort dedicated to improve the efficiency of humans in the work place. These studies were designed by F W Taylor. The next step was the derivation of formal time and motion study from the studies of Frank and Lillian Gilbreth.

Developments during World War I: With the onset of WWI, more sophisticated equipment was developed. The inability of the personnel to use such systems led to an increase in interest in human capability. Earlier the focus of aviation psychology was on the aviator himself, but as time progressed the focus shifted onto the aircraft, in particular the design of controls and displays and the effects of altitude and environmental factors on the pilot. The war saw the emergence of aeromedical research and the need for testing and measurement methods. Still, the war did not create a Human Factors Engineering (HFE) discipline as such; the reasons attributed to this are that technology was not very advanced at the time and that America's involvement in the war lasted only 18 months. Developments between World War I and World War II: This period saw relatively slow development in HFE. However, studies on driver behavior started gaining momentum during this period, as Henry Ford started providing millions of Americans with automobiles. Another major development during this period was the continuation of aero-medical research. By the end of WWI, two aeronautical labs were established, one at Brooks Air Force Base, Texas and the other at Wright Field outside of Dayton, Ohio. Many tests were conducted to determine which characteristics differentiated the successful pilots from the unsuccessful ones. During the early 1930s, Edwin Link developed the first flight simulator. The trend continued and more sophisticated simulators and test equipment were developed. Another significant development was in the civilian sector, where the effects of illumination on worker productivity were examined. This led to the coinage of the 'Hawthorne Effect', which suggested that motivational factors could significantly influence human performance. Developments during World War II: With the onset of WWII, it was no longer possible to adopt the Tayloristic principle of matching individuals to preexisting jobs. Now the design of equipment had to take into account human limitations and take advantage of human capabilities. This change took time to come into place, and a great deal of research had to be conducted to determine those human capabilities and limitations. Much of this research took off where the aeromedical research between the wars had left off. An example of this is the study done by Fitts and Jones (1947), who studied the most effective configuration of control knobs to be used in aircraft cockpits. A lot of this research extended to other equipment with the aim of making the controls and displays easier for the operators to use. After the war, the Army Air Force published 19 volumes summarizing what had been established from research during the war.

Developments after World War II: In the initial 20 years after WWII, most activities were done by the founding fathers: Chapanis, Fitts, and Small. The beginning of the Cold War led to a major expansion of defense-supported research laboratories. Also, many labs established during the war started expanding. Most of the research following the war was military sponsored. Large sums of money were granted to universities to conduct research. The scope of the research also broadened from small equipment to entire workstations and systems. Concurrently, a lot of opportunities started opening up in the civilian industry. The focus shifted from research to participation, advising engineers on the design of equipment. After 1965, the period saw a maturation of the discipline. The field has expanded with the development of the computer and computer applications.
THE CYCLE OF HUMAN FACTORS
Human Factors involves the study of factors and the development of tools that facilitate the achievement of these goals. In the most general sense, the three goals of human factors are accomplished through several procedures in the human factors cycle, which depicts the human operator (brain and body) and the system with which he or she is interacting. The first step is to diagnose or identify the problems and deficiencies in the human-system interaction of an existing system. After defining the problems, there are five different approaches that can be taken to implement the solution. They are as follows:
Equipment Design: changes the nature of the physical equipment with which humans work.
Task Design: focuses more on changing what operators do than on changing the devices they use. This may involve assigning part or all of tasks to other workers or to automated components.
Environmental Design: implements changes, such as improved lighting, temperature control and reduced noise, in the physical environment where the task is carried out.
Training the individuals: better preparing the worker for the conditions that he or she will encounter in the job environment by teaching and practicing the necessary physical or mental skills.
Selection of individuals: a technique that recognizes the individual differences across humans in every physical and mental dimension that is relevant for good system performance. Such performance can be optimized by selecting operators who possess the best profile of characteristics for the job.
HUMAN FACTORS SCIENCE
Human factors are sets of human-specific physical, cognitive, or social properties which either may interact in a critical or dangerous manner with technological systems, the human natural environment, or human organizations, or they can be taken under consideration in the design of ergonomic, human-user oriented equipment. The choice/identification of human factors usually depends on their possible negative or positive impact on the functioning of human-organization and human-machine systems.
The human-machine model: The simple human-machine model is of a person interacting with a machine in some kind of environment. The person and machine are both modeled as information-processing devices, each with inputs, central processing, and outputs. The inputs of a person are the senses (e.g., eyes, ears) and the outputs are effectors (e.g., hands, voice). The inputs of a machine are input control devices (e.g., keyboard, mouse) and the outputs are output display devices (e.g., screen, auditory alerts). The environment can be characterized physically (e.g., vibration, noise, zero-gravity), cognitively (e.g., time pressure, uncertainty, risk), and/or organizationally (e.g., organizational structure, job design). This provides a convenient way for organizing some of the major concerns of human engineering: the selection and design of machine displays and controls; the layout and design of workplaces; design for maintainability; and the design of the work environment.
Example: Driving an automobile is a familiar example of a simple man-machine system. In driving, the operator receives inputs from outside the vehicle (sounds and visual cues from traffic, obstructions, and signals) and from displays inside the vehicle (such as the speedometer, fuel indicator, and temperature gauge). The driver continually evaluates this information, decides on courses of action, and translates those decisions into actions upon the vehicle's controls, principally the accelerator, steering wheel, and brake. Finally, the driver is influenced by such environmental factors as noise, fumes, and temperature. No matter how important it may be to match an individual operator to a machine, some of the most challenging and complex human problems arise in the design of large man-machine systems and in the integration of human operators into these systems. Examples of such large systems are a modern jet airliner, an automated post office, an industrial plant, a nuclear submarine, and a space vehicle launch and recovery system. In the design of such systems, human-factors engineers study, in addition to all the considerations previously mentioned, three factors: personnel, training, and operating procedures. Personnel are trained; that is, they are given appropriate information and skills required to operate and maintain the system. System design includes the
development of training techniques and programs and often extends to the design of training devices and training aids. Instructions, operating procedures, and rules set forth the duties of each operator in a system and specify how the system is to function. Tailoring operating rules to the requirements of the system and the people in it contributes greatly to safe, orderly, and efficient operations HUMAN FACTORS ENGINEERING Human Factors Engineering (HFE) is the discipline of applying what is known about human capabilities and limitations to the design of products, processes, systems, and work environments. It can be applied to the design of all systems having a human interface, including hardware and software. Its application to system design improves ease of use, system performance and reliability, and user satisfaction, while reducing operational errors, operator stress, training requirements, user fatigue, and product liability. HFE is distinctive in being the only discipline that relates humans to technology. Human factors engineering focuses on how people interact with tasks, machines (or computers), and the environment with the consideration that humans have limitations and capabilities. Human factors engineers evaluate "Human to Human," "Human to Group," "Human to Organizational," and "Human to Machine (Computers)" interactions to better understand these interactions and to develop a framework for evaluation. Human Factors engineering activities include: 1. Usability assurance 2. Determination of desired user profiles 3. Development of user documentation 4. Development of training programs. Usability assurance Usability assurance is an interdisciplinary concept, integrating system engineering with Human Factors engineering methodologies. Usability assurance is achieved through the system or service design, development, evaluation and deployment. user interface design comprises physical (ergonomic) design, interaction design and layout design. Usability development comprises integration of human factors in project planning and management, including system specification documents: requirements, design and testing. Usability evaluation is a continuous process, starting with the operational requirements specification, through prototypes of the user interfaces,

through usability alpha and beta testing, and through manual and automated feedback after the system has been deployed. User interface design Human-computer interaction is a discipline concerned with the design, evaluation and implementation of interactive computing systems for human use and with the study of major phenomena surrounding them. This is a well known subject of Human Factors within the Engineering field. There are many different ways to determine human computer interaction by usability testing. Problems with Human Factors Methods Problems in how usability measures are employed include: (1) measures of learning and retention of how to use an interface are rarely employed during methods and (2) some studies treat measures of how users interact with interfaces as synonymous with quality-in-use, despite an unclear relation. Weakness of Usability Lab Testing Although usability lab testing is believed to be the most influential evaluation method, it does have some limitations. These limitations include: (1) Additional resources and time than other methods (2) Usually only examines a fraction of the entire market segment (3) Test scope is limited to the sample tasks chosen (4) Long term ears-of-use problems are difficult to identify (5) May reveal only a fraction of total problems (6) Laboratory setting excludes factors that the operational environment places on the products usability Weakness of Inspection Methods Inspection methods (expert reviews and walkthroughs) can be accomplished quickly, without resources from outside the development team, and does not require the research expertise that usability tests need. However, inspection methods do have limitations, which include: (1) Do not usually directly involve users (2) Often do not involve developers (3) Set up to determine problems and not solutions (4) Do not foster innovation or creative solutions (5) Not good at persuading developers to make product improvements

Weakness of Surveys, Interviews, and Focus Groups These traditional human factors methods have been adapted, in many cases, to assess product usability. Even though there are several surveys that are tailored for usability and that have established validity in the field, these methods do have some limitations, which include: (1) Reliability of all surveys is low with small sample sizes (10 or less) (2) Interview lengths restricts use to a small sample size (3) Use of focus groups for usability assessment has highly debated value (4) All of these methods are highly dependent on the respondents Weakness of Field Methods Although field methods can be extremely useful because they are conducted in the users natural environment, they have some major limitations to consider. The limitations include: (1) Usually take more time and resources than other methods (2) Very high effort in planning, recruiting, and executing than other methods (3) Much longer study periods and therefore requires much goodwill among the participants (4) Studies are longitudinal in nature, therefore, attrition can become a problem. APPLICATION OF HUMAN FACTORS ENGINEERING An Example: Human Factors Engineering Applied to the Military Before World War II, HFE had no significance in the design of machines. Consequently, many fatal human errors during the war were directly or indirectly related to the absence of comprehensive HFE analyses in the design and manufacturing process. One of the reasons for so many costly errors was the fact that the capabilities of the human were not clearly differentiated from those of the machine. Furthermore, human performance capabilities, skill limitation, and response tendencies were not adequately considered in the designs of the new systems that were being produced so rapidly during the war. For example, pilots were often trained on one generation of aircraft, but by the time they got to the war zone, they were required to fly a newer model. The newer model was usually more complex than the older one and, even more detrimental, the controls may have had opposing functions assigned to them. Some aircraft required that the control stick be pulled back toward the pilot in order to pull the nose up. In other aircraft the exact opposite was required; namely, in order to ascend you would push the stick away from you. Needless to say, in an emergency situation many

pilots became confused and performed the incorrect maneuver, with disastrous results. Along the same line, pilots were subject to substitution errors due mostly to lack of uniformity of control design, inadequate separation of controls, or the lack of a coding system to help the pilot identify controls by the sense of touch alone. For example, in the early days of retractable landing gear, pilots often grabbed the wrong lever and mistakenly raised the landing gear instead of the flaps. Sensory overload also became a problem, especially in cockpit design. The 1950s brought a strong program of standardizing control shapes, locations and overload management. The growth of human factors engineering during the midto late-forties was evidenced by the establishment of several organizations to conduct psychological research on equipment design. Toward the end of 1945, Paul Fitts established what came to be known as the Behavioral Sciences Laboratory at the Army Corps Aeromedical Laboratory in Dayton, Ohio. Around the same time, the U.S. Navy established the Naval Research Laboratory at Anacostia, Maryland (headed by Frank V. Taylor), and the Navy Special Devices Center at Port Washington, New York (headed by Leonard C. Mead). The Navy Electronics Laboratory in San Diego, California, was established about a year later with Arnold M. Small as head. In addition to the establishment of these military organizations, the human factors discipline expanded wihtin several civilian activities. Contract support was provided by the U.S. Navy and the U.S. Air Force for research at several noted universities, specifically Johns Hopkins, Tufts, Harvard, Maryland, Holyoke, and California (Berkeley). Paralleling this growth was the establishment of several private corporate ventures. Thus, as a direct result of the efforts of World War II, a new industry known as engineering psychology or human factors engineering was born. Why is HFE important to the military? Until this day, many project managers and designers are still slow to consider Human Factors Engineering (HFE) as an essential and integral part of the design process. This is mostly due to their lack of education on the purpose of HFE. Nevertheless, progress is being made as HFE is becoming more and more accepted and is now implemented in a wide variety of applications and processes. The U.S. military is particularly concerned with the implementation of HFE in every phase of the acquisition process of its systems and equipment. Just about every piece of gear, from a multi-billion dollar aircraft carrier to the boots that Servicemembers wear, goes at least in part through some HFE analyses before procurement and throughout its lifecycle.

Lessons learned in the aftermath of World War II prompted the U.S. War Department (now U.S. Department of Defense) to take some steps in improving safety in military operations. U.S. Department of Defense regulations require a comprehensive management and technical strategy for human systems integration (HSI)[15] be initiated early in the acquisition process to ensure that human performance is considered throughout the system design and development process.[16] HFE applications in the U.S. Army In the U.S. Army, the term MANPRINT is used as the program designed to implement HSI.[17][18] The program was established in 1984 with a primary objective to place the human element (functioning as individual, crew/team, unit and organization) on equal footing with other design criteria such as hardware and software. The entry point of MANPRINT in the acquisition process is through requirements documents and studies. What is MANPRINT? MANPRINT (Manpower and Personnel Integration) is a comprehensive management and technical program that focuses attention on human capabilities and limitations throughout the systems life cycle: concept development, test and evaluation, documentation, design, development, fielding, post-fielding, operation and modernization of systems. It was initiated in recognition of the fact that the human is an integral part of the total system. If the human part of the system can't perform efficiently, the entire system will function sub-optimally. MANPRINT's goal is to optimize total system performance at acceptable cost and within human constraints. This is achieved by the continuous integration of seven human-related considerations (known as MANPRINT domains) with the hardware and software components of the total system and with each other, as appropriate. The seven MANPRINT domains are: Manpower (M), Personnel (P), Training (T), Human Factors Engineering (HFE), System Safety (SS), Health Hazards (HH), Soldier Survivability (SSv). They are each expounded on below: Manpower (M) Manpower addresses the number of military and civilian personnel required and potentially available to operate, maintain, sustain, and provide training for systems.[19] It is the number of personnel spaces (required or authorized positions) and available people (operating strength). It considers these requirements for peacetime, conflict, and low intensity operations. Current and projected

constraints on the total size of the Army/organization/unit are also considered. The MANPRINT practitioner evaluates the manpower required and/or available to support a new system and subsequently considers these constraints to ensure that the human resource demands of the system do not exceed the projected supply. Personnel (P) Manpower and personnel are closely related. While manpower looks at numbers of spaces and people, the domain of personnel addresses the cognitive and physical characteristics and capabilities required to be able to train for, operate, maintain, and sustain materiel and information systems. Personnel capabilities are normally reflected as knowledge, skills, abilities, and other characteristics (KSAOs). The availability of personnel and their KSAOs should be identified early in the acquisition process and may result in specific thresholds. On most systems, emphasis is placed on enlisted personnel as the primary operators, maintainers, and supporters of the system. Personnel characteristics of enlisted personnel are easier to quantify since the Armed Services Vocational Aptitude Battery (ASVAB) is administered to potential enlistees. While normally enlisted personnel are operators and maintainers; that is not always the case, especially in aviation systems. Early in the requirements determination process, identification of the target audience should be accomplished and used as a baseline for assessment. Cognitive and physical demands of the system should be assessed and compared to the projected supply. MANPRINT also takes into consideration personnel factors such as availability, recruitment, skill identifiers, promotion, and assignment. Training (T) Training is defined as the instruction or education, on-the-job, or self development training required to provide all personnel and units with their essential job skills, and knowledge. Training is required to bridge the gap between the target audiences' existing level of knowledge and that required to effectively operate, deploy/employ, maintain and support the system. The MANPRINT goal is to acquire systems that meet the Army's training thresholds for operation and maintenance. Key considerations include developing an affordable, effective and efficient training strategy (which addresses new equipment, training devices, institutional, sustainment, and unit collective tactical training); determining the resources required to implement it in support of fielding and the most efficient method for dissemination (contractor, distance learning, exportable packages, etc.); and evaluating the effectiveness of the

training. Training is particularly crucial in the acquisition and employment of a new system. New tasks may be introduced into a duty position; current processes may be significantly changed; existing job responsibilities may be redefined, shifted, or eliminated; and/or entirely new positions may be required. It is vital to consider the total training impact of the system on both the individuals and the organization as a whole. Human Factors Engineering (HFE) The goal of HFE is to maximize the ability of an individual or crew to operate and maintain a system at required levels by eliminating design-induced difficulty and error. Human factors engineers work with systems engineers to design and evaluate human-system interfaces to ensure they are compatible with the capabilities and limitations of the potential user population. HFE is conducted during all phases of system development, to include requirements specification, design and testing and evaluation. HFE activities during requirements specification include: evaluating predecessor systems and operator tasks; analyzing user needs; analyzing and allocating functions; and analyzing tasks and associated workload. During the design phase, HFE activities include: evaluating alternative designs through the use of equipment mockups and software prototypes; evaluating software by performing usability testing; refining analysis of tasks and workload; and using modeling tools such as human figure models to evaluate crew station and workplace design and operator procedures. During the testing and evaluation phase, HFE activities include: confirming the design meets HFE specification requirements; measuring operator task performance; and identifying any undesirable design or procedural features. System Safety (SS) System Safety is the design features and operating characteristics of a system that serve to minimize the potential for human or machine errors or failures that cause injurious accidents. Safety considerations should be applied in system acquisition to minimize the potential for accidental injury of personnel and mission failure. Health Hazards (HH) Health Hazards addresses the design features and operating characteristics of a system that create significant risks of bodily injury or death. Along with safety hazards, an assessment of health hazards is necessary to determine risk reduction or mitigation. The goal of the Health Hazard Assessment (HHA) is to incorporate biomedical knowledge and principles early in the design of a system to eliminate or control health hazards. Early application will eliminate costly system retrofits

and training restrictions resulting in enhanced soldier-system performance, readiness and cost savings. HHA is closely related to occupational health and preventive medicine but gets its distinctive character from its emphasis on soldier-system interactions of military unique systems and operations. Health Hazard categories include acoustic energy, biological substances, chemical substances, oxygen deficiency, radiation energy, shock, temperature extremes and humidity, trauma, vibration, and other hazards. Health hazards include those areas that could cause death, injury, illness, disability, or a reduction in job performance. Organizational and Social The seventh domain addresses the human factors issues associated with the sociotechnical systems necessary for modern warfare. This domain has been recently added to investigate issues specific to Network Enabled Capability (NEC) also known as Network Centric Warfare (NCW). Elements such as dynamic command and control structures, data assimilation across mulitple platforms and its fusion into information easily understood by distributed operators are some of the issues investigated. A soldier survivability domain was also proposed but this was never fully integrated into the MANPRINT model. Domain Integration Although each of the MANPRINT domains has been introduced separately, in practice they are often interrelated and tend to impact on one another. Changes in system design to correct a deficiency in one MANPRINT domain nearly always impact another domain. HUMAN FACTORS INTEGRATION Areas of interest for human factors practitioners may include: training, staffing evaluation, communication, task analyses, functional requirements analyses and allocation, job descriptions and functions, procedures and procedure use, knowledge, skills, and abilities; organizational culture, human-machine interaction, workload on the human, fatigue, situational awareness, usability, user interface, learnability, attention, vigilance, human performance, human reliability, human-computer interaction, control and display design, stress, visualization of data, individual differences, aging, accessibility, safety, shift work, work in extreme environments including virtual environments, human error, and decision making.

REAL WORLD APPLICATIONS OF HUMAN FACTORS - MULTIMODAL INTERFACES
Multi-Modal Interfaces
In many real world domains, ineffective communication occurs partially because of inappropriate and ineffective presentation of information. Many real world interfaces both allow user input and provide user output in a single modality (most often being either visual or auditory). This single modality presentation can often lead to data overload in that modality, causing the user to become overwhelmed by information and to overlook something. One way to address this issue is to use multi-modal interfaces.
Reasons to Use Multimodal Interfaces
Time Sharing - helps avoid overloading one single modality
Redundancy - providing the same information in two different modalities helps assure that the user will see the information
Allows for more diversity in users (blind users can use tactile input; hearing impaired users can use visual input and output)
Error Prevention - having multiple modalities allows the user to choose the most appropriate modality for each task (for example, spatial tasks are best done in a visual modality and would be much harder in an olfactory modality)
Examples of Well Known Multi-Modality Interfaces
Cell Phone - the average cell phone uses auditory, visual, and tactile output through use of a phone ringing, vibrating, and a visual display of caller ID.
ATM - both auditory and visual outputs
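A minimal sketch of the redundancy idea described above: the same alert is pushed to more than one output modality, so a user who misses one channel can still catch another. The channel classes and the print statements are hypothetical placeholders, not an API from the text.

class VisualChannel:
    def present(self, message: str) -> None:
        print(f"[screen]  {message}")        # e.g. a banner or caller-ID display

class AuditoryChannel:
    def present(self, message: str) -> None:
        print(f"[speaker] {message}")        # e.g. a ringtone or spoken prompt

class TactileChannel:
    def present(self, message: str) -> None:
        print(f"[haptics] {message}")        # e.g. a vibration pattern

def alert(message: str, channels) -> None:
    # Redundant presentation: send the same information to every modality,
    # so overload or impairment in one channel does not hide the message.
    for channel in channels:
        channel.present(message)

alert("Incoming call from +1-555-0100",
      [VisualChannel(), AuditoryChannel(), TactileChannel()])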

Early Multi-Modal Interfaces by the Experts
Bolt's "Put That There" (1980) - used speech and manual pointing
Cohen and Oviatt's Quickset - multi-user speech and gesture input
Information Theory
Information theory is a theory that deals primarily with the structural aspect of communication. This is directly due to the fact that Claude Shannon, a Bell Telephone Company research scientist, created it. He was unconcerned with the
human side of communication. The semantic meaning of a message or the pragmatic effect on the listener mattered little to him. Warren Weaver, an executive with the Rockefeller Foundation and the Sloan-Kettering Institute on Cancer Research, expressed in an essay that reducing the loss of information was the solution to any communication problem. The basis of information theory is to reduce any interference in the communication of a message. A linear model of communication is used in A First Look at Communication Theory to illustrate the thought of communication in a technical sense.

Communication theory deals with systems for transmitting information from one point to another. Information theory was born with the discovery of the fundamental laws of data compression and transmission.

Introduction: Information theory answers two fundamental questions: What is the ultimate data compression? Answer: the entropy H. What is the ultimate transmission rate? Answer: the channel capacity C. But its reach is beyond communication theory. In the early days it was thought that increasing the transmission rate over a channel increases the error rate. Shannon showed that this is not true as long as the rate is below the channel capacity. Shannon further showed that random processes have an irreducible complexity below which they cannot be compressed.
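The claim that entropy is the limit of data compression can be checked informally in a few lines: estimate the per-character entropy of a text and compare it with what a general-purpose compressor achieves. This is a rough illustration only; zlib is not an optimal source code, the sample text is an arbitrary example, and the entropy estimate ignores dependence between characters.

import math
import zlib
from collections import Counter

text = ("this is a small sample of english text used to compare the "
        "empirical entropy of its characters with a real compressor") * 20
data = text.encode("ascii")

# Empirical (zeroth-order) entropy in bits per character: H = -sum p log2 p
counts = Counter(data)
n = len(data)
entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())

compressed_bits_per_char = 8 * len(zlib.compress(data, 9)) / n

print(f"zeroth-order entropy : {entropy:.2f} bits/char")
print(f"zlib output          : {compressed_bits_per_char:.2f} bits/char")
# The compressor can beat the zeroth-order figure because it also exploits
# the repetition in this sample, but no code can beat the true source entropy.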

Information Theory (IT) relates to other fields:
Computer Science: the shortest binary program for computing a string.
Probability Theory: fundamental quantities of IT are used to estimate probabilities.
Inference: an approach to predicting digits of pi; inferring the behavior of the stock market.
Computation vs. communication: computation is communication limited and vice-versa.
The field had its beginnings at the start of the century, but it really took off after WW II.
Wiener: extracting signals of a known ensemble from noise of a predictable nature.
Shannon: encoding messages chosen from a known ensemble so that they can be transmitted accurately and rapidly even in the presence of noise.
IT: the study of efficient encoding and its consequences in the form of speed of transmission and probability of error.
Historical Perspective: Follows S. Verdu, "Fifty Years of Shannon Theory," IT-44, Oct. 1998, pp. 2057-2058. Shannon published "A Mathematical Theory of Communication" in 1948. It lays down the fundamental laws of data compression and transmission.
Nyquist (1924): the transmission rate is proportional to the log of the number of levels in a unit duration. Can the transmission rate be improved by replacing Morse by an optimum code?
Whittaker (1929): lossless interpolation of bandlimited functions.
Gabor (1946): time-frequency uncertainty principle.
Hartley (1928): muses on the physical possibilities of transmission rates. Introduces a quantitative measure for the amount of information associated with n selections of states: H = n log s, where s = number of symbols available in each selection and n = number of selections.
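A quick numerical check of Hartley's measure, under the usual convention that the logarithm is taken to base 2 so that H comes out in bits (an assumption; the base is left open above). The particular values of s and n are arbitrary examples.

import math

def hartley_information(s: int, n: int) -> float:
    # H = n log s : n independent selections from s equally likely symbols.
    return n * math.log2(s)

print(hartley_information(s=32, n=10))   # 10 selections from 32 symbols -> 50.0 bits
print(hartley_information(s=2, n=8))     # 8 binary selections -> 8.0 bits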

Information = the outcome of a selection among a finite number of possibilities.
Data Compression
Shannon uses the definition of entropy as a measure of information. Rationale: (1) it is continuous in the probabilities; (2) it is increasing with n for equiprobable random variables; (3) it is additive - the entropy of a collection of independent random variables is equal to the sum of the entropies of the individual random variables. For a memoryless source the entropy is H = - Σ p_i log p_i.
Shannon's Theorem 3: Given any ε > 0 and δ > 0, we can find N0 such that sequences of any length N ≥ N0 fall into two classes: (1) a set whose total probability is less than ε; (2) the remainder set, all of whose members have probabilities p satisfying |log(1/p)/N - H| < δ.
Reliable Communication
Shannon: "... redundancy must be introduced to combat the particular noise structure involved ... a delay is generally required to approach the ideal encoding." Defines the channel capacity C: it is possible to send information at the rate C through the channel with as small a frequency of errors or equivocation as desired by proper encoding; this statement is not true for any rate greater than C. Defines the differential entropy of a continuous random variable as a formal analog to the entropy of a discrete random variable. Shannon obtains the formula for the capacity of a power-constrained white Gaussian channel with a flat transfer function: C = W log2(1 + P/(N0 W)). The minimum energy necessary to transmit one bit of information is 1.6 dB below the noise power spectral density. Some interesting points about the capacity relation: since any strictly bandlimited signal has infinite duration, the rate of information of any finite codebook of bandlimited waveforms is equal to zero; transmitted signals must approximate the statistical properties of white noise.
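A small sketch of the capacity formula quoted above for the power-constrained white Gaussian channel, C = W log2(1 + P/(N0 W)); the bandwidth and signal-to-noise figures plugged in are arbitrary examples.

import math

def awgn_capacity(bandwidth_hz: float, signal_power_w: float, noise_psd_w_per_hz: float) -> float:
    # C = W log2(1 + P / (N0 * W)) bits per second.
    snr = signal_power_w / (noise_psd_w_per_hz * bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + snr)

# Example: a 3 kHz channel with a 30 dB signal-to-noise ratio.
W = 3000.0
P = 1.0
N0 = P / (1000.0 * W)          # chosen so that P / (N0 * W) = 1000 (30 dB)
print(f"{awgn_capacity(W, P, N0):.0f} bit/s")   # about 29,900 bit/s

# Shannon limit on energy per bit: Eb/N0 >= ln 2, i.e. about -1.59 dB,
# matching the "1.6 dB below the noise psd" figure quoted above.
print(f"{10 * math.log10(math.log(2)):.2f} dB")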

Generalization to dispersive/nonwhite Gaussian channels is given by Shannon's water-filling formula. Constraints other than power constraints are of interest: amplitude constraints, quantized constraints, specific modulations.
Zero-Error Channel Capacity: Example of typing a text: a non-zero probability of making an error on each keystroke means that the probability of at least one error approaches 1 as the length increases. By designing a code that takes into account the statistics of the typist's mistakes, the probability of error can be made 0. Example: consider mistakes made by mistyping neighboring letters. The alphabet {b, i, t, s} has no neighboring letters, hence it will have zero probability of error. Zero-error capacity: the rate at which information can be encoded with zero probability of error.
Error Exponent: Rather than focus on the channel capacity, study the error probability (EP) as a function of block length. The EP decreases exponentially as a function of block length in the Gaussian and discrete memoryless channels. The exponent of the minimum achievable EP is a function of the rate, referred to as the reliability function. An important rate that serves as a lower bound to the reliability function is the cutoff rate, which was long thought to be the practical limit to the transmission rate; turbo codes refuted that notion.
ERROR CONTROL MECHANISMS
Error Control Strategies
The goal of error control is to reduce the effect of noise in order to reduce or eliminate transmission errors. Error-control coding refers to adding redundancy to the data. The redundant symbols are subsequently used to detect or correct erroneous data. The error control strategy depends on the channel and on the specific application. Error control for one-way channels is referred to as forward error control (FEC). It can be accomplished by:
Error detection and correction - hard detection.
Reducing the probability of an error - soft detection.
For two-way channels, error detection is a simpler task than error correction. Retransmit the data only when an error is detected: automatic repeat request (ARQ). In the course, we focus on wireless data communications, hence we will not delve into error concealment techniques such as interpolation, used in audio and video recording. Error schemes may be priority based, i.e., providing more protection to certain types of data than others. For example, in wireless cellular standards, the transmitted bits are divided into three classes: bits that get double code protection, bits that get single code protection, and bits that are not protected.
Block and Convolutional Codes
Error control codes can be divided into two large classes: block codes and convolutional codes. Information bits are encoded with an alphabet Q of q distinct symbols.
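The simplest possible block code makes the FEC idea concrete: a rate-1/3 repetition code sends each information bit three times, detects an error whenever the copies disagree, and corrects a single error per block by majority vote. This is a toy sketch for illustration only; practical systems use far stronger codes.

from collections import Counter

def encode(bits):
    # (3,1) repetition code: each information bit becomes a 3-bit codeword.
    return [b for bit in bits for b in (bit, bit, bit)]

def decode(received):
    # Majority vote over each 3-bit block: corrects any single bit error
    # per block (and detects that an error occurred when the copies differ).
    decoded = []
    for i in range(0, len(received), 3):
        block = received[i:i + 3]
        decoded.append(Counter(block).most_common(1)[0][0])
    return decoded

message = [1, 0, 1, 1]
codeword = encode(message)              # [1,1,1, 0,0,0, 1,1,1, 1,1,1]
noisy = codeword[:]
noisy[1] ^= 1                           # flip one bit in the first block
noisy[6] ^= 1                           # and one bit in the third block

assert decode(noisy) == message         # both single-bit errors are corrected
print(decode(noisy))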

Designers of early digital communication systems tried to improve reliability by increasing power or bandwidth. Shannon taught us how to buy performance with a less expensive resource: complexity.
Formal definition of a code C: a set of 2^k n-tuples. Encoder: the set of 2^k pairs (m, c), where m is the data word and c is the code word. Linear code: the set of codewords is closed under modulo-2 addition.
Error detection and correction correspond to terms in the Fano inequality H(X|Y) ≤ H(Pe) + Pe log(M - 1), where M is the number of possible messages:
o Error detection reduces the H(Pe) term.
o Error correction reduces the Pe log(M - 1) term.
BASIC DEFINITIONS
Define Entropy, Relative Entropy, Mutual Information
Entropy, Mutual Information
A measure of uncertainty of a random variable. Let X be a discrete random variable (r.v.) with alphabet A and probability mass function p(x) = Pr{X = x}.
(D1) The entropy H(X) of a discrete r.v. X is defined as H(X) = - Σ_{x in A} p(x) log p(x) bits, where the log is to the base 2.
Comments: (1) Simplest example: the entropy of a fair coin = 1 bit. (2) Adding terms of zero probability does not change the entropy (0 log 0 = 0). (3) Entropy depends on the probabilities of X, not on the actual values. (4) Entropy is H(X) = E[log 1/p(X)].
Properties of Entropy
(P1) H(X) ≥ 0, since 0 ≤ p(x) ≤ 1 implies log[1/p(x)] ≥ 0.
[E] X = 1 with probability p and X = 0 with probability 1 - p: H(X) = -p log p - (1-p) log(1-p) = H(p).
[E] X takes the values a, b, c, d with probabilities 1/2, 1/4, 1/8, 1/8: H(X) = -(1/2) log(1/2) - (1/4) log(1/4) - (1/8) log(1/8) - (1/8) log(1/8) = 1.75 bits.
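The two worked examples above can be verified with a few lines of code, a minimal sketch assuming base-2 logarithms as in the definition:

import math

def entropy(probs):
    # H(X) = -sum p(x) log2 p(x); terms with p = 0 contribute nothing.
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                 # fair coin -> 1.0 bit
print(entropy([1/2, 1/4, 1/8, 1/8]))       # -> 1.75 bits
print(entropy([0.9, 0.1]))                 # biased coin -> about 0.469 bits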

INFORMATION CONTENT
Information as a concept has a diversity of meanings, from everyday usage to technical settings. Generally speaking, the concept of information is closely related to notions of constraint, communication, control, data, form, instruction, knowledge, meaning, mental stimulus, pattern, perception, and representation. Many people speak about the Information Age as the advent of the Knowledge Age or knowledge society, the information society, the Information revolution, and information technologies, and even though informatics, information science and computer science are often in the spotlight, the word "information" is often used without careful consideration of the various meanings it has acquired.
REDUNDANCY (INFORMATION THEORY)
In engineering, redundancy is the duplication of critical components of a system with the intention of increasing the reliability of the system, usually in the case of a backup or fail-safe. In many safety-critical systems, such as fly-by-wire and hydraulic systems in aircraft, some parts of the control system may be triplicated. An error in one component may then be out-voted by the other two. In a triply redundant system, the system has three subcomponents, all three of which must fail before the system fails. Since each one rarely fails, and the subcomponents are expected to fail independently, the probability of all three failing is calculated to be extremely small. Redundancy may also be known by the terms "majority voting systems" or "voting logic". Redundancy in information theory is the number of bits used to transmit a message minus the number of bits of actual information in the message. Informally, it is the amount of wasted "space" used to transmit certain data. Data compression is a way to reduce or eliminate unwanted redundancy, while checksums are a way of adding desired redundancy for purposes of error detection when communicating over a noisy channel of limited capacity.
FORMS OF REDUNDANCY
There are four major forms of redundancy:

Hardware redundancy, such as DMR and TMR
Information redundancy, such as error detection and correction methods
Time redundancy, including transient fault detection methods such as Alternate Logic
Software redundancy, such as N-version programming
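A brief sketch of the hardware-redundancy case mentioned above (TMR, triple modular redundancy): three independent copies of a component are polled and the majority value wins, so the voted system only fails when at least two copies fail. The per-copy failure probability used here is an arbitrary example.

def tmr_vote(a: int, b: int, c: int) -> int:
    # Majority vote of three replicated outputs: a single faulty copy is out-voted.
    return (a & b) | (a & c) | (b & c)

print(tmr_vote(1, 1, 0))   # one faulty module -> correct output 1

# Probability that the voted system fails, assuming each copy fails
# independently with probability p (at least two of the three must fail).
p = 1e-3
p_system = 3 * p**2 * (1 - p) + p**3
print(p_system)            # about 3e-6, far smaller than p itself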

DATA REDUNDANCY
In computer data storage, data redundancy is a property of some disk arrays (most commonly in RAID systems) which provides fault tolerance, so that all or part of the data stored in the array can be recovered in the case of disk failure. The cost typically associated with providing this feature is a reduction of the disk capacity available to the user, since the implementations require
either a duplication of the entire data set, or an error-correcting code to be stored on the array. Redundancy is attained when the same data values are stored more than once in a table, or when the same values are stored in more than one table. To prevent redundancy in Database Tables, database normalization should be done to prevent redundancy and any other problems that might affect the performance of the database. One of the biggest disadvantages of data redundancy is that it increases the size of the database unnecessarily. Also data redundancy might cause the same result to be returned as multiple search results when searching the database causing confusion and clutter in results INFORMATION THEORY Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. Historically, information theory was developed by Claude E. Shannon to find fundamental limits on compressing and reliably storing and communicating data. Since its inception it has broadened to find applications in many other areas, including statistical inference, natural language processing, cryptography generally, networks other than communication networks as in neurobiology, the evolution and function of molecular codes, model selection in ecology, thermal physics, quantum computing, plagiarism detection and other forms of data analysis. A key measure of information in the theory is known as entropy, which is usually expressed by the average number of bits needed for storage or communication. Intuitively, entropy quantifies the uncertainty involved when encountering a random variable. For example, a fair coin flip (2 equally likely outcomes) will have less entropy than a roll of a dice (6 equally likely outcomes). Applications of fundamental topics of information theory include lossless data compression (e.g. ZIP files), lossy data compression (e.g. MP3s), and channel coding (e.g. for DSL lines). The field is at the intersection of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering. Its impact has been crucial to the success of the Voyager missions to deep space, the invention of the compact disc, the feasibility of mobile phones, the development of the Internet, the study of linguistics and of human perception, the understanding of black holes, and numerous other fields. Important sub-fields of information theory are source coding, channel coding, algorithmic complexity theory, algorithmic information theory, and measures of information.


OVERVIEW

The main concepts of information theory can be grasped by considering the most widespread means of human communication: language. Two important aspects of a good language are as follows. First, the most common words (e.g., "a", "the", "I") should be shorter than less common words (e.g., "benefit", "generation", "mediocre"), so that sentences will not be too long. Such a trade-off in word length is analogous to data compression and is the essential aspect of source coding. Second, if part of a sentence is unheard or misheard due to noise (e.g., a passing car), the listener should still be able to glean the meaning of the underlying message. Such robustness is as essential for an electronic communication system as it is for a language; properly building such robustness into communications is done by channel coding. Source coding and channel coding are the fundamental concerns of information theory.

Note that these concerns have nothing to do with the importance of messages. For example, a platitude such as "Thank you; come again" takes about as long to say or write as the urgent plea "Call an ambulance!", while clearly the latter is more important and more meaningful. Information theory, however, does not consider message importance or meaning, as these are matters of the quality of data rather than the quantity and readability of data, the latter of which is determined solely by probabilities.

Information theory is generally considered to have been founded in 1948 by Claude Shannon in his seminal work, "A Mathematical Theory of Communication". The central paradigm of classical information theory is the engineering problem of the transmission of information over a noisy channel. The most fundamental results of this theory are Shannon's source coding theorem, which establishes that, on average, the number of bits needed to represent the result of an uncertain event is given by its entropy; and Shannon's noisy-channel coding theorem, which states that reliable communication is possible over noisy channels provided that the rate of communication is below a certain threshold called the channel capacity. The channel capacity can be approached in practice by using appropriate encoding and decoding systems.

Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of rubrics throughout the world over the past half century or more: adaptive systems, anticipatory systems, artificial intelligence, complex systems, complexity science, cybernetics, informatics, machine learning, along with systems sciences of many descriptions. Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of coding theory.

Coding theory is concerned with finding explicit methods, called codes, of increasing the efficiency and reducing the net error rate of data communication
over a noisy channel to near the limit that Shannon proved is the maximum possible for that channel. These codes can be roughly subdivided into data compression (source coding) and error-correction (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible. A third class of information theory codes are cryptographic algorithms (both codes and ciphers). Concepts, methods and results from coding theory and information theory are widely used in cryptography and cryptanalysis. See the article ban (information) for a historical application. Information theory is also used in information retrieval, intelligence gathering, gambling, statistics, and even in musical composition.

HISTORICAL BACKGROUND

The landmark event that established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the Bell System Technical Journal in July and October of 1948. Prior to this paper, limited information-theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability.

Harry Nyquist's 1924 paper, Certain Factors Affecting Telegraph Speed, contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation W = K log m, where W is the speed of transmission of intelligence, m is the number of different voltage levels to choose from at each time step, and K is a constant. Ralph Hartley's 1928 paper, Transmission of Information, uses the word information as a measurable quantity, reflecting the receiver's ability to distinguish one sequence of symbols from any other, thus quantifying information as H = log S^n = n log S, where S is the number of possible symbols and n the number of symbols in a transmission. The natural unit of information was therefore the decimal digit, much later renamed the hartley in his honour as a unit of information. Alan Turing in 1940 used similar ideas as part of the statistical analysis of the breaking of the German Second World War Enigma ciphers.

Much of the mathematics behind information theory with events of different probabilities was developed for the field of thermodynamics by Ludwig Boltzmann and J. Willard Gibbs. Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by Rolf Landauer in the 1960s, are explored in Entropy in thermodynamics and information theory.

In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first
time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that "The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point." With it came the ideas of

the information entropy and redundancy of a source, and its relevance through the source coding theorem;
the mutual information, and the channel capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem;
the practical result of the Shannon-Hartley law for the channel capacity of a Gaussian channel (see the sketch after this list); and
of course the bit, a new way of seeing the most fundamental unit of information.
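As a small illustration of the Shannon-Hartley law mentioned above, the sketch below evaluates C = B * log2(1 + S/N) for an assumed telephone-line-like channel. The bandwidth and signal-to-noise figures are illustrative assumptions, not measurements.

```python
import math

def channel_capacity(bandwidth_hz, snr_linear):
    """Shannon-Hartley capacity in bits per second: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# Assumed example figures: 3 kHz of bandwidth and a 30 dB signal-to-noise ratio.
bandwidth = 3000.0                  # Hz
snr_db = 30.0
snr = 10 ** (snr_db / 10)           # convert dB to a linear ratio (1000x)

print(round(channel_capacity(bandwidth, snr)))   # roughly 29,900 bits per second
```

No coding scheme can push reliable communication above this rate for such a channel, which is exactly the sense in which the channel capacity is a limit.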

INTERFERENCE

In physics, interference is the addition (superposition) of two or more waves that results in a new wave pattern. Interference usually refers to the interaction of waves which are correlated or coherent with each other, either because they come from the same source or because they have the same or nearly the same frequency. Two non-monochromatic waves are only fully coherent with each other if they both have exactly the same range of wavelengths and the same phase differences at each of the constituent wavelengths. The total phase difference is derived from the sum of the path difference and the initial phase difference (if the waves are generated from two or more different sources). From this it can be concluded whether the waves reaching a point are in phase (constructive interference) or out of phase (destructive interference).
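As a rough sketch of the reasoning just described, the snippet below converts a path difference and an initial phase offset into a total phase difference and classifies the result. The function name and the tolerance used for "nearly in phase" are arbitrary illustrative choices.

```python
import math

def classify_interference(path_difference, wavelength, initial_phase=0.0, tol=0.1):
    """Classify interference at a point from the total phase difference.

    Total phase difference = 2*pi*(path difference)/wavelength + initial phase.
    Values near a whole number of cycles are constructive, near a half cycle
    destructive; anything else is treated as partial interference.
    """
    total_phase = 2 * math.pi * path_difference / wavelength + initial_phase
    frac = (total_phase / (2 * math.pi)) % 1.0   # fraction of a full cycle
    if min(frac, 1 - frac) < tol:
        return "constructive"
    if abs(frac - 0.5) < tol:
        return "destructive"
    return "partial"

print(classify_interference(path_difference=500e-9, wavelength=500e-9))   # constructive
print(classify_interference(path_difference=250e-9, wavelength=500e-9))   # destructive
```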


UNCERTAINTY

Uncertainty is a term used in subtly different ways in a number of fields, including philosophy, physics, statistics, economics, finance, insurance, psychology, sociology, engineering, and information science. It applies to predictions of future events, to physical measurements already made, or to the unknown.

CONCEPTS

In his seminal work Risk, Uncertainty, and Profit, University of Chicago economist Frank Knight (1921) established the important distinction between risk and uncertainty: "Uncertainty must be taken in a sense radically distinct from the familiar notion of risk, from which it has never been properly separated.... The essential fact is that 'risk' means in some cases a quantity susceptible of measurement, while at other times it is something distinctly not of this character; and there are far-reaching and crucial differences in the bearings of the phenomena depending on which of the two is really present and operating.... It will appear that a measurable uncertainty, or 'risk' proper, as we shall use the term, is so far different from an unmeasurable one that it is not in effect an uncertainty at all." Although the terms are used in various ways among the general public, many specialists in decision theory, statistics and other quantitative fields have defined uncertainty and risk more specifically. Doug Hubbard defines uncertainty and risk as:
1. Uncertainty: The lack of certainty; a state of having limited knowledge where it is impossible to exactly describe the existing state or a future outcome, or where more than one outcome is possible.
2. Measurement of Uncertainty: A set of possible states or outcomes where probabilities are assigned to each possible state or outcome; this also includes the application of a probability density function to continuous variables.
3. Risk: A state of uncertainty where some possible outcomes have an undesired effect or significant loss.
4. Measurement of Risk: A set of measured uncertainties where some possible outcomes are losses, together with the magnitudes of those losses; this also includes loss functions over continuous variables.

There are other taxonomies of uncertainty and decision-making that take a broader view of uncertainty and how it should be approached from an ethical perspective.

For example, if you do not know whether it will rain tomorrow, then you have a state of uncertainty. If you apply probabilities to the possible outcomes using weather forecasts or even just a calibrated probability assessment, you have quantified the uncertainty. Suppose you quantify your uncertainty as a 90% chance of sunshine. If you are planning a major, costly, outdoor event for tomorrow, then you have risk, since there is a 10% chance of rain and rain would be undesirable. Furthermore, if this is a business event and you would lose $100,000 if it rains, then you have quantified the risk (a 10% chance of losing
$100,000). These situations can be made even more realistic by quantifying light rain vs. heavy rain, the cost of delays vs. outright cancellation, etc. Some may represent the risk in this example as the "expected opportunity loss" (EOL), or the chance of the loss multiplied by the amount of the loss (10% x $100,000 = $10,000). That is useful if the organizer of the event is "risk neutral", which most people are not. Most would be willing to pay a premium to avoid the loss. An insurance company, for example, would compute an EOL as a minimum for any insurance coverage, then add to that its other operating costs and profit. Since many people are willing to buy insurance for many reasons, clearly the EOL alone is not the perceived value of avoiding the risk.

Quantitative uses of the terms uncertainty and risk are fairly consistent across fields such as probability theory, actuarial science, and information theory. Some also create new terms without substantially changing the definitions of uncertainty or risk. For example, surprisal is a variation on uncertainty sometimes used in information theory. But outside of the more mathematical uses of the term, usage may vary widely. In cognitive psychology, uncertainty can be real, or just a matter of perception, such as expectations, threats, etc.

Vagueness or ambiguity is sometimes described as "second-order uncertainty", where there is uncertainty even about the definitions of the uncertain states or outcomes. The difference here is that this uncertainty is about human definitions and concepts, not an objective fact of nature. It has been argued that ambiguity, however, is always avoidable, while uncertainty (of the "first-order" kind) is not necessarily avoidable.[4]

Uncertainty may be purely a consequence of a lack of knowledge of obtainable facts. That is, you may be uncertain about whether a new rocket design will work, but this uncertainty can be removed with further analysis and experimentation. At the subatomic level, however, uncertainty may be a fundamental and unavoidable property of the universe. In quantum mechanics, the Heisenberg uncertainty principle puts limits on how much an observer can ever know about the position and velocity of a particle. This may not just be ignorance of potentially obtainable facts: there may be no fact to be found. There is some controversy in physics as to whether such uncertainty is an irreducible property of nature or whether there are "hidden variables" that would describe the state of a particle even more exactly than Heisenberg's uncertainty principle allows.

MEASUREMENTS

In metrology, physics, and engineering, the uncertainty or margin of error of a measurement is stated by giving a range of values which are likely to enclose the true value. This may be denoted by error bars on a graph, or by the following notations:


measured value ± uncertainty
measured value(uncertainty)

The latter "concise notation" is used for example by IUPAC in stating the atomic mass of elements. There, the uncertainty applies only to the least significant figure of x. For instance, 1.00794(7) stands for 1.00794 0.00007. Often, the uncertainty of a measurement is found by repeating the measurement enough times to get a good estimate of the standard deviation of the values. Then, any single value has an uncertainty equal to the standard deviation. However, if the values are averaged, then the mean measurement value has a much smaller uncertainty, equal to the standard error of the mean, which is the standard deviation divided by the square root of the number of measurements. When the uncertainty represents the standard error of the measurement, then about 68.2% of the time, the true value of the measured quantity falls within the stated uncertainty range. For example, it is likely that for 31.8% of the atomic mass values given on the list of elements by atomic mass, the true value lies outside of the stated range. If the width of the interval is doubled, then probably only 4.6% of the true values lie outside the doubled interval, and if the width is tripled, probably only 0.3% lie outside. These values follow from the properties of the normal distribution, and they apply only if the measurement process produces normally distributed errors. In that case, the quoted standard errors are easily converted to 68.3% ("one sigma"), 95.4% ("two sigma"), or 99.7% ("three sigma") confidence intervals. In this context, uncertainty depends on both the accuracy and precision of the measurement instrument. The least the accuracy and precision of an instrument are, the larger the measurement uncertainty is. Notice that precision is often determined as the standard deviation of the repeated measures of a given value, namely using the same method described above to assess measurement uncertainty. However, this method is correct only when the instrument is accurate. When it is inaccurate, the uncertainty is larger than the standard deviation of the repeated measures, and it appears evident that the uncertainty does not depend only on instrumental precision. APPLICATIONS

Investing in financial markets such as the stock market.
Uncertainty is used in engineering notation when talking about significant figures, or the possible error involved in measuring things such as distance.
Uncertainty is designed into games, most notably in gambling, where chance is central to play.
In scientific modelling, in which the prediction of future events should be understood to have a range of expected values.
In physics, the Heisenberg uncertainty principle forms the basis of modern quantum mechanics.
In weather forecasting, where it is now commonplace to include data on the degree of uncertainty in a weather forecast.
Uncertainty is often an important factor in economics. According to economist Frank Knight, it is different from risk, where there is a specific probability assigned to each outcome (as when flipping a fair coin). Uncertainty involves a situation that has unknown probabilities, while the estimated probabilities of possible outcomes need not add to unity.
In risk assessment and risk management.
In metrology, measurement uncertainty is a central concept quantifying the dispersion one may reasonably attribute to a measurement result. Such an uncertainty can also be referred to as a measurement error. In daily life, measurement uncertainty is often implicit ("He is 6 feet tall", give or take a few inches), while for any serious use an explicit statement of the measurement uncertainty is necessary. The expected measurement uncertainty of many measuring instruments (scales, oscilloscopes, force gauges, rulers, thermometers, etc.) is often stated in the manufacturer's specification.

The most commonly used procedure for calculating measurement uncertainty is described in the Guide to the Expression of Uncertainty in Measurement (often referred to as "the GUM") published by ISO. Derived works include, for example, the National Institute of Standards and Technology (NIST) publication NIST Technical Note 1297, "Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results", and the Eurachem/CITAC publication "Uncertainty in measurements" (available at the Eurachem homepage). The uncertainty of the result of a measurement generally consists of several components. The components are regarded as random variables, and may be grouped into two categories according to the method used to estimate their numerical values:

Type A: those which are evaluated by statistical methods (a brief sketch follows below);
Type B: those which are evaluated by other means, e.g. by assigning a probability distribution.
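As a minimal sketch of a Type A evaluation, the snippet below takes a handful of repeated readings (the values are invented for illustration) and reports the mean together with its standard uncertainty, i.e. the standard deviation divided by the square root of the number of readings.

```python
import math
import statistics

# Hypothetical repeated readings of the same quantity (units arbitrary).
readings = [10.03, 10.01, 10.04, 9.99, 10.02, 10.00]

mean = statistics.mean(readings)
stdev = statistics.stdev(readings)                        # sample standard deviation
standard_uncertainty = stdev / math.sqrt(len(readings))   # standard error of the mean

print(f"result = {mean:.3f} ± {standard_uncertainty:.3f}")
```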

By propagating the variances of the components through a function relating the components to the measurement result, the combined measurement uncertainty is
given as the square root of the resulting variance. The simplest form is the standard deviation of a repeated observation.
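A very simple case of this propagation is a result formed by adding independent components, where the variances simply add and the combined uncertainty is the root-sum-square of the component uncertainties. The sketch below combines two illustrative component values; the numbers are assumptions for illustration only.

```python
import math

def combined_standard_uncertainty(component_uncertainties):
    """Root-sum-square of independent component uncertainties (additive model)."""
    return math.sqrt(sum(u ** 2 for u in component_uncertainties))

# Assumed components: a Type A uncertainty from repeated readings and a
# Type B uncertainty taken from a calibration certificate.
u_type_a = 0.008
u_type_b = 0.005

print(combined_standard_uncertainty([u_type_a, u_type_b]))   # about 0.0094
```

For a general function of the inputs, the GUM additionally weights each component variance by the squared sensitivity coefficient (the partial derivative of the function with respect to that input) before summing.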

Uncertainty has been a common theme in art, both as a thematic device (see, for example, the indecision of Hamlet), and as a quandary for the artist (such as Martin Creed's difficulty with deciding what artworks to make).
