
A Cautionary Note On Image Downgrading

Charles Kurak, Department of Computer Science, The University of North Carolina, Chapel Hill, NC 27599-3715
John McHugh, Department of Computer Science, The University of North Carolina, Chapel Hill, NC 27599-3715

Abstract
The results of an experiment showing that it is very simple to contaminate digital images with information that can later be extracted are presented. The contamination cannot be detected when the image is displayed on a good quality graphics workstation. Based on these results, it is recommended that image downgrading based on visual display of the image not be performed if there is any threat of contamination by Trojan horse programs; potential Trojan horses include untrusted image processing software.

1 Introduction
For as long as there has been classified information, there has been a need to de-classify or downgrade information that has been classified above its currently appropriate level. On some occasions, wholesale reclassification is permitted, as in the case of automatic downgrading of an entire document after some predetermined time period has elapsed. In other cases, piecemeal downgrading is done, using a combination of cutting and pasting from the source document combined with the obliteration or marking out of individual words or phrases. The preparation of an unclassified summary of a classified document is a good example of the latter process. In the pen and paper world, this process may be followed by the copying or transcription of the resulting document to ensure that the obliterated material is truly unreadable by the recipient of the downgraded document. In this world, it is reasonable to assume that the individual performing the downgrading is trustworthy and that no additional secrets have been hidden in the source document encoded by means of word order, spacing, etc. Under these assumptions, the resulting document can be assumed to be properly classified at a level lower than that of the source from which it was derived.

For information stored in electronic form, the situation is not so simple. It is possible to encode substantial amounts of information in a text file in ways that are not immediately obvious when the information is printed or viewed on a display. In the presence of untrusted software that may contain Trojan horse code, care must be taken to ensure that the downgrading is done in such a way that the resulting product is free of contamination even though its sources may be compromised. The issues surrounding high assurance downgrading of text files are discussed in detail in [5]. To the authors' knowledge, no high assurance downgraders for text files exist.

For the past several years, one of the authors (McHugh) has been part of a team developing a high assurance windowing system based on the MIT X Window System. Targeted at B3 evaluation, this system [3] has attracted substantial interest in the security community. The developers have been told by a number of potential users that an image downgrading capability is a requirement for many MLS windowing systems. The purpose of this note is to relate the results of some simple experiments, performed by the authors at the University of North Carolina, that cast doubt on the notion of trustworthy image downgrading.

This paper continues with a brief description of the steps that are believed to be necessary for high assurance downgrading of text information in an environment in which contamination by Trojan horse programs is considered a credible threat. We then describe our experiments, in which we contaminated images with other images. We note that our approach, though simplistic, is easily extended to more sophisticated forms that would be, we believe, extremely difficult to detect. We conclude with an outline of areas in which we think research should be performed before operational downgrading of images is undertaken in an environment in which contamination by Trojan horse programs is a threat. Because the techniques and terminology used in digital image processing are unfamiliar to many in the computer security community, the appendix contains a brief introduction to the subject and defines some of the terms used in the body of the paper. Readers unfamiliar with digital image processing may want to read the appendix first.
To appear in the proceedings of the 1992 Computer Security Applications Conference, San Antonio, TX, December 1992.

2 Downgrading Text
Like many actions involving classified information, downgrading first and foremost involves accountability. Some human is required to accept responsibility for the downgrading process; in the event that classified information is compromised, this person will be held accountable. When software is used to aid an individual in downgrading, say, a text file, the individual needs to know that the software will not, on its own, pass through information that is not supposed to be downgraded, and that it will operate in a way that helps prevent the user from making careless mistakes. The downgrader proposed in [5] does this in several ways.

1. The downgrader is trusted software, formally specified and verified to have exactly and only the functionality needed to do its job. A major portion of the effort in building such a downgrader would be devoted to achieving assured functionality in such areas as information display and user interaction.

2. The downgrader converts its input into a canonical form that excludes invisible and unprintable characters as well as information that might be encoded via spacing or formatting. The user sees only this form, and the downgrader output is provably derived from its input using a series of subtractive operations. (A minimal sketch of such a canonicalization pass appears after this list.)

3. The downgrader interacts with its user in a way that forces the review of downgraded material as small segments (typically sentences) viewed in context, preventing the deliberate or accidental downgrading of large quantities of information without appropriate review.

The output of such a downgrader resembles the results of the cut, paste, and obliterate operation described earlier. It is claimed that this style of downgrading offers acceptable performance coupled with a high degree of assurance and immunity to Trojan horse attacks.
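The canonical form step in item 2 can be illustrated with a short sketch. This is our own illustration, assuming plain ASCII input; it is not the downgrader described in [5]:

import string

# Characters allowed to survive canonicalization: printable ASCII only.
PRINTABLE = set(string.ascii_letters + string.digits + string.punctuation + " ")

def canonicalize(text):
    # Replace control, invisible, and non-ASCII characters with spaces, then
    # collapse all runs of whitespace, so that nothing can be encoded in
    # spacing or layout. Line breaks are deliberately not preserved.
    visible = "".join(ch if ch in PRINTABLE else " " for ch in text)
    return " ".join(visible.split())

A real downgrader would surround such a pass with verified display and review machinery; the point of the sketch is only that the canonical form is a purely subtractive transformation of its input.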

3 Downgrading Images

Visual information presents more difficult problems. In the computer, an image is an array of numbers that represent light intensities at various points in the image. In displayable form, the image typically has 8 to 24 bits per pixel. Display screens are typically 1024 x 768 pixels (Super VGA on a PC class platform) or 1280 x 1024 pixels (many workstations). An 8 bit per pixel image occupying a quarter of the screen might be 500 by 600 pixels and contain 300 kilobytes of data. Examining each byte of data in a small local context, as is done with textual data, does not seem to be fruitful. Displaying the entire image on the computer screen does not eliminate the possibility of contamination, as will be seen. While the form of contamination that we have chosen is extremely simple and would be easy to detect if it were suspected, we can postulate contamination mechanisms that would be much more difficult to detect.

3.1 The Experiment


A simple program was written to encode images within images by replacing the low order bits of one image with the high order bits of another. The program displays the two original images, the contaminated image, and the extracted contaminant. Figures 2, 3, and 5 are PostScript1 screen dumps of three samples of the program's operation. All images are 256 by 256 by 8 bit per pixel grey scale images. Three images have been used:

1. A section of a page from a statistics text, shown in Figure 2. Although this is an 8 bit per pixel image, it can easily be represented at one bit per pixel.

2. An old LandSAT photograph of the Raleigh-Durham, North Carolina airport showing the original runways, shown in Figure 3. The surrounding forest makes for a highly textured image.

3. A photograph of a portion of an F-16 aircraft, shown in Figure 5. This might be considered typical of the sort of visual information that one would want to conceal during downgrading.

The tool that we developed allowed on screen comparison of the base, contaminating, contaminated, and extracted images by presenting four 256 by 256 image areas along with appropriate labels and controls. On the tool's display, the upper left image was the base image, prior to contamination; the upper right image was the contaminating image; the lower left image was the contaminated (or combined) image; and the lower right image was the version of the contaminating image that has been extracted from the contaminated image. The organization of Figures 3 through 6 is similar to the tool's display. In the paper, Figures 2, 3, and 5 are variously used as base and contaminating images. For the case studies that follow, we show only the contaminated and extracted images.

Figure 4 illustrates the contamination of the airport photograph with the F-16 photograph. In this case, four bits of the airport photo were used. The extracted version of the F-16 is shown in Figure 6. Figure 7 illustrates the contamination of the airport photograph with the scanned text. Only one bit was "stolen" from the airport, as the text really only needs a single bit. The extracted text is shown in Figure 9. Figure 8 may come as a surprise. In this case, the text is contaminated with the airport, and three bits are used. The extracted image of the airport is shown in Figure 10.

Why is this approach so successful? Part of the reason is that even good computer displays simply don't support the degree of grey scale resolution needed to distinguish 2^8, or 256, distinct shades. In reality, about 100 levels is all that we can distinguish under ideal circumstances. In a noisy picture, such as that of the airport, it is all but impossible to detect tampering with the low order 4 bits. Even in the case of a picture having large flat areas (such as the text), tampering with the low order 3 bits is undetectable.

1 PostScript is a registered trademark of Adobe Systems Corporation.



3.2 The Approach

The programs whose results are shown here were written using the IGLOO [1] library developed by James Coggins. Below are the two algorithms at the heart of this particular image data hiding scheme, Combine and Extract. Although the algorithm is quite simple, it is very effective in that the human eye is unaware of its use. Knowledge that this algorithm might have been used would allow one to determine easily whether a second image is hidden in the first; with a simple image processing program, the second image can be revealed within seconds. Other, more complicated algorithms are discussed later. The algorithms are expressed in terms of the base, contaminating, contaminated, and extracted images as defined in section 3.1 above.

The Combine Algorithm: Scale both images to 8 bits per pixel.2

n: integer
n := the number of low order bits to be used to hide the contaminating image beneath the base image
for each pixel in the base, contaminating, and contaminated images do
    base image: set the n low order bits to 0
    contaminating image: shift right by 8 - n
    contaminated image: add the values from the base and contaminating images
end do

The Extract Algorithm:

for each pixel in the contaminated and extracted images do
    contaminated image: shift left by 8 - n
    extracted image: set to the shifted value of the contaminated image
end do

2 This value can be modified depending on the dynamic range of the original images and the desired output image. Other issues may also affect its selection; they are discussed later. For this sample program the value 8 bits per pixel was selected.
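To make the bit manipulation concrete, here is a minimal sketch of the two algorithms in Python with NumPy. It is our own illustration rather than the IGLOO-based tool, and the image names and uint8 array representation are assumptions:

import numpy as np

def combine(base, contaminating, n):
    # Hide the n high order bits of the contaminating image in the n low
    # order bits of the base image. Both are 8 bit grey scale images held
    # as uint8 arrays of the same shape.
    cleared = base & ((0xFF << n) & 0xFF)       # set the n low order bits to 0
    hidden = contaminating >> (8 - n)           # keep only the n high order bits
    return (cleared + hidden).astype(np.uint8)  # disjoint bit fields, so add == or

def extract(contaminated, n):
    # Recover an approximation of the contaminating image by shifting the
    # stolen bits back to the high order positions.
    shifted = contaminated.astype(np.uint16) << (8 - n)
    return (shifted & 0xFF).astype(np.uint8)

# Hypothetical usage with 256 by 256 images `airport` and `f16`:
#   stego = combine(airport, f16, n=4)
#   recovered = extract(stego, n=4)

With n = 4, each pixel of the contaminated image differs from the base by at most 15 grey levels, a change which the experiments described above suggest is not noticeable in a textured image.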

3.3 The Results

The program was executed on a DECStation 5000/120 with a high resolution, 8 bit per pixel color display. The base images are shown in Figures 2 and 3. The images of Figures 2 and 5 are used as contaminants. The remaining figures are paired, left and right, with a contaminated image on the left and the extracted contaminant on the right. In each set, note the difference, or lack thereof, between the original base image and the contaminated image. The extracted image is taken from the contaminated image only; its values have been adjusted so that it is properly viewable. With this particular algorithm, there may be some lossy compression (see the appendix) of both images. Since the contaminated image may be subject to downgrading at this point, the reduced quality may not be a concern. However, one may want to keep the quality of the contaminating image as high as possible; this can be done with lossless compression.

Selecting the number of bits to use for embedding the contaminating image is somewhat subjective at this point. Typically one would like to choose the highest possible number without affecting the visual quality of the contaminated image. The actual selection may depend on the texture of the base image; visual testing to develop some heuristics may be in order.

Lossless compression may be desirable for inclusion of the contaminating image. If the image needs to be extracted intact, and the number of bits available to hide the contaminating image is smaller than the dynamic range of the contaminating image requires, then some form of compression will be required. This mandates a more complicated pair of combination and extraction algorithms. There are a number of lossless data compression algorithms that could be used for this purpose. The contaminating image may also need to be encrypted to prevent accidental extraction by an unintended recipient. Standard data encryption methods could be applied before the combination process.
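As a rough sketch of the capacity question raised above (again our own illustration, not part of the original tool; zlib stands in for any lossless coder):

import zlib
import numpy as np

def fits_in_low_bits(contaminating, base_shape, n):
    # True if a losslessly compressed copy of the contaminating image would
    # fit in the n low order bits of a base image with the given shape.
    capacity_bits = base_shape[0] * base_shape[1] * n
    compressed = zlib.compress(contaminating.tobytes(), 9)
    return 8 * len(compressed) <= capacity_bits

Encrypting the compressed stream before embedding it, as suggested above, would also make the stolen bits look like random noise rather than image structure.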

4 Concerns, Conclusions, and Future Directions


Detection of hidden images is expected to be a hard problem. Statistical methods may need to be employed, but there is no assurance that they will succeed. Of course, electronic comparison with the original image would reveal any tampering. In many cases, however, it is not possible to apply cryptographic or other seals to the image data, because the data must be processed by complex and necessarily untrusted software in order to be made useful. These programs transform the data and are possible sources of contamination.

Hiding images within images is an obvious form of contamination, but it is clear that any form of information can be hidden. For example, if only one bit per pixel can be appropriated, a 200 by 200 image allows 40 kilobits, or 5 kilobytes, for information hiding. This is the equivalent of about a page and a half of text. Suitably encrypted, such information would appear as random noise in the low order bit of the image. We suspect that it would be possible to hide small amounts of information in almost any image, and to do so in ways that would be both difficult to detect and largely immune to attempts to remove them.
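To make the arithmetic concrete, the following sketch (ours, with illustrative names only) packs an arbitrary byte string, such as an encrypted message, into the least significant bit of each pixel:

import numpy as np

def embed_bytes(image, payload):
    # Hide `payload` in the least significant bit of each pixel of an 8 bit
    # grey scale image. A 200 by 200 image holds 40,000 bits, i.e. 5,000
    # bytes, as noted above.
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = image.flatten()                      # flatten() returns a copy
    if bits.size > flat.size:
        raise ValueError("payload too large for this image")
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(image.shape)

def recover_bytes(stego, n_bytes):
    # Read the payload back out of the low order bit plane.
    bits = stego.flatten()[:n_bytes * 8] & 1
    return np.packbits(bits).tobytes()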

One area of particular interest is the transmission of the contaminated image in hardcopy form. The question arises whether a contaminating image can be successfully extracted from a photo-quality paper print. In other words, can an image be hidden, printed on photo paper, re-digitized, and extracted? We hope to address these questions in the near future.

For the present, we wish to issue a cautionary note. Downgrading is a risky process at best and not something to be undertaken lightly. Most computer displays simply do not present enough of the information contained in even low quality digital images for visual inspection of the displayed image to serve as a basis for assuming that the image can be safely downgraded. We know of no work that would lead to the necessary assurance, and we recommend that images not be downgraded on the basis of visual display if there is any possibility that the image has been contaminated by a Trojan horse program at any time in its history.

Figure 1: Digital image processing (scene -> scanner or sensor -> raw digital image -> process -> processed digital image -> display -> visual image -> observer)

Appendix: Image Processing


Computerized digital image processing [2] is the manipulation of images by a computer. The basic process of digital image processing is shown in Figure 1. We assume that the starting point is a physical scene, or perhaps a photographic representation of the scene. Energy, optical or otherwise, from the image in question is converted into a digital representation. The image is divided into small square regions called picture elements, or pixels, in effect superimposing a grid of cells over the original image. The scanner or sensor produces an output that is proportional to the energy emitted by each pixel. Without loss of generality, we can refer to the numeric representation of the energy as a "grey-level".3 A numeric representation of the energy in the cell, i.e. the average grey-level value, is stored in a data structure. A two-dimensional array is typically used, with the two indices corresponding to the row and column positions.

3 Color images are typically represented as three grey-levels, one each for the energies of the red, green, and blue portions of the spectrum. Energies ranging from audio frequencies through the X-ray portion of the spectrum can be sensed and digitized.

The number of different grey-levels in the digital representation of the image is a function of the energy distribution of the original image, as well as the capabilities of the digitizing device. Currently, digitizing devices, sometimes referred to as scanners, are readily available for 4 or 8 bit digitizing. That is, with a 4-bit scanner, 2^4, or 16, grey-levels are discernible; with an 8-bit scanner, 2^8, or 256, grey-levels are discernible. Color scanners capable of 24 bit resolution are also available; these use 8 bits for each of the red, green, and blue energies. Images using 12 bits (2^12, or 4096, grey-levels) are commonly used by the medical industry for radiological images. The images used in this work use only 8 bits; after reading this paper, one can extrapolate the consequences associated with higher precision acquisition devices.

The spatial resolution of scanners used for digitizing photographs is typically 200 to 900 pixels per inch in both the X and Y dimensions. Capacities up to 8.5 by 11 inches are common, with larger sizes available. Direct image sensing is usually done with some sort of digital camera. Charge coupled devices with spatial resolutions from about 100 by 100 pixels up to 4096 by 4096 pixels are available for use in these cameras; they are combined with analog to digital converters and appropriate scanning circuitry to provide a digital data stream representing the image. Devices that digitize directly from analog video signals are also available.

Once a digital representation has been acquired, it can be processed or transformed in a variety of ways. It is possible to duplicate most of the photographer's darkroom tricks on the computer. The contrast, brightness, color balance, etc. of the image can all be altered. The image can be magnified4, reduced, cropped, etc. False colors can be used to emphasize energy distributions. In addition, the image can be processed to compensate for defects in the sensor, i.e. out of focus images can be sharpened, etc. Finally, the digital image can be interpreted, i.e. features such as edges and structures can be identified. The result of digital image processing can be either another digital image or a data structure that contains information about the image. We are primarily interested in the former case. Image processing programs are often large and complex.

Once an image has been processed, it is typically viewed, either directly on the screen of a workstation or similar display device, or via some form of photographic reproduction. Many workstation display devices can display 8-bit images (256 grey scales or 256 different colors at a time), although higher resolution (24-bit) display devices are readily available. On a color display, the 256 grey-levels are produced by setting the red, green, and blue values to equal intensities, thereby mixing the three colors equally to achieve grey. The problem is that the human visual system (HVS) is not as good at distinguishing between levels as the display is at producing them. The appropriate metric for the HVS is the just noticeable difference, or JND. Under ideal circumstances the eye can distinguish only on the order of 100 JNDs [4, 6] in grey scale. It is thus apparent that the capabilities of currently available display devices are far greater than those of the HVS, and the HVS is not capable of discerning all of the information presented to it.

The inability of the HVS to discriminate precisely among small differences in image intensity makes possible a number of strategies for reducing the space required to store digital images. When we consider that a typical weather satellite image is 1200 pixels by 600 pixels by 8 bits and requires nearly 3/4 of a megabyte of disk space, it is clear that compression is useful if not essential. There are two kinds of compression, lossless and lossy.

4 Pixel size has the same limiting role as film grain.

Lossless compression algorithms are invoked to save storage space. It is always possible to reconstruct the image exactly with this method of compression. This method is preferred when there is a requirement that the original information not be modified, as might be the case when subsequent processing is needed. Lossless compression algorithms are widely used to compress text and data files and are usually based on the statistical characteristics of the data to be compressed.

Lossy compression algorithms also save space, and the reconstructed image is very close (or possibly identical) to the original, but exact reconstruction is not always possible. This may be adequate for many applications, especially when the user only wants to use or display the image in a fashion that does not require an exact data representation. Lossy compression algorithms have a greater potential for reducing the size of the data than do lossless ones. They are typically based on a combination of the allowable tolerances for image reconstruction and the statistical characteristics of the data to be compressed.

References
[1] James M. Coggins. Designing C++ libraries. The C++ Journal, 1(1):25-32, June 1990.
[2] Benjamin M. Dawson. Introduction to image processing algorithms. BYTE, pages 169-186, March 1987.
[3] Jeremy Epstein, et al. A prototype B3 trusted X Window System. In 1991 Computer Security Applications Conference, December 1991.
[4] James Foley, et al. Computer Graphics. Addison-Wesley, 2nd edition, 1990.
[5] John McHugh. An EMACS based downgrader for the SAT. In Marshall D. Abrams and Harold J. Podell, editors, Computer and Network Security, pages 228-237. IEEE Computer Society Press, 1986.
[6] Stephen M. Pizer, et al. Evaluation of the number of discernible levels produced by a display. In Information Processing in Medical Imaging. INSERM, Paris, 1980.


Figure 2: A scanned page of text

Figure 3: Landsat image of RDU

Figure 5: An F-16 aircraft

Figure 4: RDU contaminated with F-16

Figure 6: F-16 extracted from RDU

Figure 7: RDU contaminated with text

Figure 9: Text extracted from RDU

Figure 8: Text contaminated with RDU

Figure 10: RDU extracted from text
