
CMPE264: Image Analysis and Computer Vision

Final Project Report: UAV Video Stabilization


Mariano I. Lizárraga (mlizarra@ucsc.edu) and Sonia Arteaga (sarteaga@soe.ucsc.edu)

August 15, 2007


1 Introduction

Unmanned Aerial Vehicles (UAVs) have slowly started to permeate into civilian and law enforcement applications. Once thought to be exclusively employed by the military, UAVs are used today in border surveillance, whale and other sea mammal tracking, and search and rescue missions in disaster areas. The usefulness of the imagery that these flying robots relay to the controlling ground stations in military applications is directly related to how much information the operator can extract from the frames being watched in real time. Image quality, RF noise rejection, and image stabilization have come to play an important role in the overall performance measurement of the mission. While many of the mainstream UAVs carry sophisticated equipment on board to take care of the problems mentioned above, their price tag is usually on the order of millions of dollars, making them unaffordable in virtually any situation other than military applications. Recent advances in solid-state sensors and the overall miniaturization of electronics have made it possible to noticeably improve the capabilities of smaller UAVs. Nevertheless, these smaller UAVs are more sensitive to the natural oscillations of the airplane and to wind turbulence, which degrades the stability of the imagery served to the end user.

The Naval Postgraduate School (NPS) has been performing experimental flights on tactical UAVs since 2001 in order to develop technology that supports U.S. troops in different peace-keeping scenarios around the world. These small UAVs carry visual and IR cameras used to relay video down to a ground station, providing vital information for the deployed team. Even though these UAVs have autopilots and robust control schemes implemented on board, it is practically impossible to completely eliminate vibration and oscillations due to external disturbances and the natural behavior of the plane. These oscillations are mechanically transmitted to the camera, and as a consequence the relayed video is difficult to watch and exhausting for the operator to evaluate.

To address the issue of oscillations and low-frequency vibrations in the recorded imagery, an image stabilization algorithm is required to improve visual quality. Furthermore, the stabilization algorithm needs to be robust and computationally inexpensive enough to run in real time on the PC104 computer available at the ground station.


2 Implementation

Image stabilization for moving platforms is usually concerned with compensating unwanted high-frequency motion. Many of the widely known algorithms, like the ones presented in [1], are very sensitive to panning and rotation, rendering them useless for applications where intentional panning and rotation are part of the task. The image stabilization algorithm and the Simulink implementation presented herein follow directly the work presented in [2], which shows very stable behavior under intentional panning and rotation. This algorithm offered promising results in stabilizing the UAV footage provided by the Unmanned Systems Lab at the Naval Postgraduate School. The implemented frame motion compensation follows the one proposed in [3]. Simulink, a model-based engineering tool developed by The MathWorks (makers of Matlab), was picked as the development platform for this project due to its block-oriented design paradigm, offering great ease of use and a better understanding of each functional block of the algorithm. The presented algorithm consists of five main functional blocks, shown in Figure 1:

1. Video reading and grayscale conversion,
2. Gray-Code calculation,
3. Sub-frame correlation measure calculation,
4. Global motion calculation, and
5. Motion compensation.

Figure 1: Top Level Simulink Diagram


2.1 Video Reading and Grayscale Conversion

The first step is to read the video frame by frame and convert each frame into a grayscale 8-bit image, which is the actual input to the algorithm. This is performed in Simulink with the block layout shown in Figure 2. Note that before the output the video stream is down-sampled by two, completely ignoring every other frame. This was done to improve the throughput and increase the frame rate of the output.

Figure 2: Grayscale Frame Conversion and Downsampling
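Outside Simulink, the same front end can be sketched in a few lines of Python with OpenCV. This is only an illustrative equivalent of the block diagram above; the function name read_frames and the use of OpenCV are our own choices, not part of the original model.

    import cv2

    def read_frames(path):
        """Yield every other frame of a video as an 8-bit grayscale image."""
        cap = cv2.VideoCapture(path)
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # Down-sample by two: keep only even-indexed frames.
            if index % 2 == 0:
                yield cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            index += 1
        cap.release()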

2.2 Gray-Code Calculation

This functional block decomposes the frame into eight binary images $a_k$, called bit plane images, such that the frame $f$ at time $t$ is given by [2]:

$$f_t(x, y) = a_{K-1} 2^{K-1} + a_{K-2} 2^{K-2} + \dots + a_1 2^1 + a_0 2^0 \qquad (1)$$

Figure 3 shows the 8 bit plane decomposition of a given frame. The next part of this functional block calculates the Gray-Code of two successive bit plane images. The Gray-Code, named after Bell Labs researcher Frank Gray, is a binary numeral system in which two successive numbers differ in only one digit [4]. The Gray-Coded Bit Plane image is given by:

$$g_k = a_k \oplus a_{k+1}, \qquad 0 \le k \le 6. \qquad (2)$$

It is this Gray-coded image $g_k$ that is passed on to the next functional block.
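A minimal NumPy sketch of this decomposition, assuming the frame is already an 8-bit NumPy array, could look as follows (the function name is hypothetical):

    import numpy as np

    def gray_coded_bit_planes(frame):
        """Split an 8-bit frame into bit planes a_0..a_7 and return the
        Gray-coded planes g_k = a_k XOR a_(k+1) for 0 <= k <= 6."""
        planes = [(frame >> k) & 1 for k in range(8)]          # a_k
        return [planes[k] ^ planes[k + 1] for k in range(7)]   # g_k

The report passes a single Gray-coded plane $g_k$ on to the matching stage; the sketch returns all seven so the caller can pick one.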


Figure 3: 8 Bit Plane Frame Decomposition

2.3 Sub-frame Correlation Measure

This functional block divides the Gray-coded image $g_k$ into four regions of size $M \times N$ and defines a search window of size $(M + 2p) \times (N + 2p)$, which is explored in turn to calculate the following correlation measure [2]:

$$C_j(m, n) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} g_k^{t-1}(x, y) \oplus g_k^t(x + m, y + n). \qquad (3)$$

Therefore $C_j$ acts as an accumulator counting the number of non-correspondences between $g_k^{t-1}$ and $g_k^t$; thus the smaller the value, the better the match. Figure 4 shows the Simulink implementation of this functional block.


Figure 4: Correlation Measure Calculation
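A direct, unoptimized NumPy sketch of Equation (3) might look like the following. The signature and the requirement that the caller keep the search window inside the image are our assumptions; the Simulink model of Figure 4 expresses the same computation with blocks.

    import numpy as np

    def correlation_measure(g_prev, g_curr, x0, y0, M, N, p):
        """C(m, n) of Equation (3) for one M x N region with top-left
        corner (x0, y0); displacements m, n span -p..p. The caller must
        ensure the (M + 2p) x (N + 2p) search window fits in the image."""
        ref = g_prev[y0:y0 + N, x0:x0 + M]
        C = np.empty((2 * p + 1, 2 * p + 1))
        for m in range(-p, p + 1):
            for n in range(-p, p + 1):
                cand = g_curr[y0 + n:y0 + n + N, x0 + m:x0 + m + M]
                # XOR is 1 exactly where the binary planes disagree, so the
                # mean is the per-pixel count of non-correspondences.
                C[m + p, n + p] = np.mean(ref ^ cand)
        return C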

2.4 Global Motion Calculation

This functional block chooses the minimum correlation measure $C_j$ of each region (thus the best match); the coordinates of that minimum inside the matrix correspond to the local motion vector $V_j$:

$$V_j = \arg\min_{(m,n)} \{C_j(m, n)\}. \qquad (4)$$

These motion vectors $V_j$ are stacked together with the global motion vector $V_g^{t-1}$ from the previous frame and passed through a median filter to obtain the current global motion vector $V_g^t$:

$$V_g^t = \operatorname{median}\{V_1^t, V_2^t, V_3^t, V_4^t, V_g^{t-1}\}. \qquad (5)$$

Figure 5 shows the Simulink implementation of this functional block.

Figure 5: Global Motion Calculation
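A possible NumPy rendering of Equations (4) and (5) is sketched below. Taking the median component-wise over the stacked vectors is our reading of the median filter, and the function names are illustrative.

    import numpy as np

    def local_motion(C, p):
        """Equation (4): the displacement (m, n) at which the
        correlation measure C is smallest."""
        i, j = np.unravel_index(np.argmin(C), C.shape)
        return i - p, j - p

    def global_motion(local_vectors, v_global_prev):
        """Equation (5): component-wise median of the four local
        vectors and the previous global motion vector."""
        stacked = np.array(list(local_vectors) + [v_global_prev])
        return np.median(stacked, axis=0)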

2.5 Motion Compensation

Since motion could originate from intentional panning, the global motion vector $V_g$ needs to be damped to allow for smooth panning:

$$V_a^t = D \, V_a^{t-1} + V_g^t \qquad (6)$$

where $D$ is a damping coefficient.

With this motion vector, the original frame image is relocated to remove the unwanted motion while still keeping intentional panning. Figure 6 shows the Simulink implementation of this functional block. Note that before the output a frame-rate transition block is included to keep the frame rate constant, taking into account the down-sampling mentioned in Subsection 2.1.

Figure 6: Motion Compensation
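The damping and relocation steps could be sketched as follows. The damping value of 0.95 and the sign convention of the shift are assumptions, since the report gives neither a numeric value for $D$ nor the direction of the correction.

    import numpy as np

    def compensate(frame, v_global, v_accum_prev, damping=0.95):
        """Equation (6): damped accumulation of the global motion,
        followed by relocating the frame by the accumulated amount."""
        v_accum = (damping * np.asarray(v_accum_prev, dtype=float)
                   + np.asarray(v_global, dtype=float))
        dm, dn = np.round(v_accum).astype(int)
        # np.roll wraps pixels around the borders; a production
        # implementation would crop or pad instead of wrapping.
        return np.roll(frame, shift=(-dn, -dm), axis=(0, 1)), v_accum

A damping coefficient below one makes the accumulated correction decay toward zero when the global motion is steady, which is what lets deliberate panning through while still absorbing jitter.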

3 Results

Several tests were run using different region sizes in order to quantify the variability in the values of the motion vectors, since these values affect the visual quality of the motion-compensated video footage. The block sizes ranged from a minimum of 6 × 6 pixels, in increments of 25 pixels, up to a maximum of 106 × 106 pixels. A fixed value of p = 8 was used in Equation (3). The motion vectors of the first 16 frames were then plotted in a bar graph to show the variability from frame to frame for each block size. Figure 7 shows that the results for block sizes from 56 to 106 contain practically identical values, allowing us to reduce the size of the scanned region and noticeably improve the output frame rate to 7 frames per second for the analyzed footage.

Figure 7: Motion Vector Components for Different Values of N

Using 56 × 56 regions, we still needed to verify that the stabilization was working correctly. Therefore a small Simulink model, shown in Figure 8, was set up to generate difference frames such that

$$F_d = F_t - F_{t-1}, \qquad (7)$$

for both the original footage and the compensated footage. Figure 9 shows three difference frames for a given sequence, showing that the compensated video is indeed much more stable than the original. Figure 10 shows the mean of each difference frame $F_d$ for a segment of video footage. It is clear from that figure that the compensated video does better in most cases than the original video.

Figure 8: Simulink Model of the Frame Difference Comparison

Figure 9: Difference Frames at Different Time Intervals

Figure 10: Mean of Difference Frames for a Segment of Video Footage
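The same evaluation can be sketched in NumPy as shown below; taking the absolute difference before averaging is our choice, so that positive and negative pixel differences do not cancel in the mean.

    import numpy as np

    def difference_means(frames):
        """Mean absolute difference between consecutive frames
        (Equation 7); lower values indicate a more stable sequence."""
        frames = [f.astype(np.int16) for f in frames]
        return [float(np.mean(np.abs(b - a)))
                for a, b in zip(frames, frames[1:])]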



4 Conclusion

From the implementation of the previously described image stabilization algorithm one can conclude the following:

- The GCBP method described in [2] and [3] shows good performance in stabilizing video footage that contains significant intentional rotation and panning, as was the case with the UAV footage.

- The significance of the motion vector results in Section 3 is that decreasing the block size also increases the speed of the motion compensation implementation. The results above show that we can run the model with a block size of roughly 56 × 56 pixels and still attain the same level of quality as with a larger block size, but at a faster speed and with reduced computational cost.

- The use of Simulink to implement the algorithm offered great insight into each step of the algorithm and allowed us to test and debug each functional block independently.

Acknowledgments

The authors of this Final Project would like to thank Dr. Vladimir Dobrokhodov from the Naval Postgraduate School Unmanned Systems Lab for providing us with several hours of UAV footage and invaluable support in the Simulink implementation of this algorithm.

References

[1] J. Bergen, P. Anandan, K. Hanna, and R. Hingorani, "Hierarchical Model-Based Motion Estimation," David Sarnoff Research Center, Princeton, NJ, 1992.

[2] S. Ko, S. Lee, S. Jeon, and E. Kang, "Fast Digital Image Stabilizer Based on Gray-Coded Bit-Plane Matching," IEEE Transactions on Consumer Electronics, Vol. 45, No. 3, August 1999.

[3] A. Brooks, "Real-Time Digital Image Stabilization," Image Processing Report, Department of Electrical Engineering, Northwestern University, Evanston, IL, 2003.

[4] "Gray Code," Wikipedia, the free encyclopedia, http://en.wikipedia.org/wiki/Gray_code, November 2006.



