
This article has been accepted for inclusion in a future issue of this journal.

Content is final as presented, with the exception of pagination.


3-D Surround View for Advanced Driver

Assistance Systems
Yi Gao, Chunyu Lin, Yao Zhao, Senior Member, IEEE, Xin Wang, Shikui Wei, and Qi Huang

Abstract— As the primary means of transportation in modern society, the automobile is developing toward intelligence, automation, and comfort. In this paper, we propose a more immersive 3-D surround view of the area around the automobile for advanced driver assistance systems. The 3-D surround view helps drivers become aware of the driving environment and eliminates visual blind spots. The system first uses four fish-eye lenses mounted around a vehicle to capture images. Then, through the pipeline of image acquisition, camera calibration, image stitching, and scene generation, the 3-D surround driving environment is created. To achieve real-time and easy-to-handle performance, we use only one image to complete the camera calibration through a specially designed checkerboard. Furthermore, in the image stitching process, a 3-D ship model is built as the carrier, where texture mapping and image fusion algorithms are utilized to preserve the real texture information. The algorithms used in this system reduce the computational complexity and improve the stitching efficiency. The fidelity of the surround view is also improved, thereby optimizing the immersive experience of the system while preserving the information of the surroundings.

Index Terms— Fish-eye lens, camera calibration, 3D surround view, image stitching, driver assistance systems.

Fig. 1. Output result of the existing system. (a) 2D surround view. (b) Stacked image.

Fig. 2. Screenshot of the system.
Manuscript received February 21, 2017; revised June 2, 2017 and July 8, 2017; accepted July 30, 2017. This work was supported in part by the National Training Program of Innovation and Entrepreneurship for Undergraduates, in part by the National Natural Science Foundation of China under Grant 61402034, Grant 61210006, and Grant 61202240, and in part by the National Key Research and Development Program of China under Grant 2016YFB0800404. The Associate Editor for this paper was Q. Wang. (Corresponding author: Chunyu Lin.)
Y. Gao, C. Lin, Y. Zhao, X. Wang, and S. Wei are with the Beijing Key Laboratory of Advanced Information Science and Network, Institute of Information Science, Beijing Jiaotong University, Beijing 100044, China (e-mail: cylin@bjtu.edu.cn).
Q. Huang is with Beijing Xinyangquan Electronic Technology Co., Ltd., Beijing 100038, China.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TITS.2017.2750087
1524-9050 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

PRESENTLY, autonomous vehicles are a very hot topic in academia and industry. However, making autonomous vehicles practical is not only a technical problem but also involves safety, legal and social acceptance aspects, among others [1]. Conversely, advanced driver assistance systems (ADAS) that involve human interaction are more practical for applications. In [2], anomaly detection in traffic scenes is implemented through spatial-aware motion reconstruction to reduce traffic accidents resulting from drivers' unawareness or blind spots. However, the authors in [2] also note that it is almost impossible to design a system that can faultlessly detect all types of abnormal events. A surround view camera system provides a top/bird's-eye view that allows the driver to watch the 360-degree surroundings of the vehicle [3], [4]. On the one hand, the existing surround view assistance systems cannot generate integrated and natural surround images because of the calibration algorithm. On the other hand, such algorithms, as claimed in [3], are not designed to achieve real-time performance or have not been tested on embedded platforms. Most importantly, a bird's-eye view system can only provide a single perspective from above the vehicle, such as that shown in Fig. 1(a), or the images are simply stacked, as shown in Fig. 1(b). Both of these images will probably mislead the driver.

A 3D surround view for ADAS could solve this problem by providing a considerably better sense of immersion and awareness of the surroundings. A screenshot of our 3D surround view is presented in Fig. 2, from which more information about the vehicle's surroundings can be observed. Although Fujitsu has declared that its chips will support 3D surround views [5], the details of this technology must still be completed. In this paper, we introduce a low-cost 3D surround view system that includes our special fish-eye calibration algorithm, perspective transformation, 3D ship model building, texture mapping and linear fusion. This type of system can help drivers be aware of the driving environment


around the vehicle, eliminate visual blind spots, and prevent all types of hidden dangers.

The remainder of this paper is organized as follows. In Sec. II, we describe the architecture of our 3D surround view for driver assistance systems. In Sec. III, we introduce the algorithm details. Finally, the performance and the results are presented in Sec. IV.

II. SYSTEM OVERVIEW

The system consists of four fish-eye lenses mounted around the vehicle and a display screen inside the control panel. A miniature vehicle model is shown in Fig. 3. The fish-eye lenses are distributed at the front bumper, the rear bumper and under the two rear-view mirrors. To reduce costs, HK8067A fish-eye lenses that cost no more than 3 dollars are utilized. These lenses provide a 180-degree wide-angle view to ensure sufficient overlap between each view. Through a series of image stitching processes running on an embedded system with a Freescale processor, a 3D surround view can be formed on the vehicle's central control panel. In the next section, we will introduce each step in detail.

Fig. 3. The specific position of the lenses and the output of the screen.

III. 3D SURROUND VIEW SYSTEM

A flow chart of our 3D surround view system is presented in Fig. 4, which includes four main steps: camera calibration, coordinate transformation, texture mapping and image fusion.

Fig. 4. Flow chart of the key steps.

A. 3D Ship Model

Before introducing the details of the algorithm, the proposed 3D model, which is the carrier of the surround image, will be presented because it is directly related to the effects of the texture mapping, the image fusion and the quality of the surround image. In this algorithm, we construct a ship model. Compared with the commonly used cylindrical models, such as those mentioned in [6], the horizontal bottom and the arc-shaped wall are more consistent with the driver's visual habits and help drivers obtain a broader view. The model is shown in Fig. 5.

Fig. 5. Ship model.

The construction of the 3D model consists of connecting the points in the 3D space into a line, then a plane, and finally a body. After we store the points in a certain order, we can draw the 3D model as needed.

Considering the effect of texture fusion, our ship model selects the function Z = R^4 as the ramp function, where R is the distance between the projection point and the edge of the bottom, and Z is the arc point's height. The sectional view of our ship model is presented in Fig. 6. As indicated by the top view, the model is actually composed of a circle of ellipses. The points of the model are the intersections of lines extending from the bottom frame with those ellipses. According to experiments, the slope surface is smoother when the density of points on the slope is 15; in other words, 15 intersection points on each slope curve are selected to build the model. However, in the uppermost ellipse, where the distance between each point is large, we add a number of straight lines passing through the corners of the bottom frame to form more intersections for the model. Then, we project these points onto the bottom of the model and calculate the distance between each projection point and the edge of the bottom surface. With the ramp function, the coordinates of those points on the slope can be calculated. The resulting slope is smooth, and the images are more natural.

Fig. 6. Sectional view of 3D ship model.
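As a concrete illustration, the slope-point generation described above can be sketched as follows. This is a minimal sketch using a quartic ramp for the wall height; the elliptical rim dimensions, point counts and scale factor are illustrative assumptions, not the exact parameters of the paper's model:

```python
import math

def ship_slope_points(rim_a=4.0, rim_b=2.0, n_dirs=72, n_slope=15,
                      max_r=1.5, k=1.0):
    """Generate the slope points of a bowl-shaped ("ship") surround model.

    rim_a, rim_b -- semi-axes of the elliptical rim of the flat bottom
    n_dirs       -- number of radial directions around the rim
    n_slope      -- points per slope curve (15, as in the paper)
    max_r        -- how far the slope extends beyond the rim
    k            -- ramp scale: the wall height follows Z = k * R**4
    """
    points = []
    for i in range(n_dirs):
        theta = 2.0 * math.pi * i / n_dirs
        # Point on the rim of the flat bottom in this direction.
        ex, ey = rim_a * math.cos(theta), rim_b * math.sin(theta)
        for j in range(n_slope):
            r = max_r * j / (n_slope - 1)   # distance R beyond the rim
            x = ex + r * math.cos(theta)
            y = ey + r * math.sin(theta)
            z = k * r ** 4                  # ramp function: height vs. distance
            points.append((x, y, z))
    return points

pts = ship_slope_points()
```

Because the quartic ramp has zero slope at the rim (R = 0), the wall meets the flat floor tangentially, which is what makes the floor-to-wall transition look smooth to the driver.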


It not only helps drivers obtain a broader view but also speeds up the image stitching process. The time required from generating the model points to completing the model construction is 17 ms. In addition, the construction of the model can be completed in 3 ms when the model points are already known.

Two adjacent curves on the slope constitute a parallel region. We choose one of these regions to illustrate the process and show it in Fig. 7. In this figure, the letter K denotes a point and L denotes a line; the label in the brackets is the number of the point or the line. First, mark the points on the left curve with odd numbers and those on the right curve with even numbers. Then, suppose that there is a point with label K(n). If this point is on the left curve, connect the points K(n-1), K(n-2), and K(n) in order. Otherwise, connect the points K(n-2), K(n-1) and K(n) in order. By addressing the points on all lines using this method, all triangles will be drawn in the same direction. After all the points in a certain area have been addressed, the model construction is complete. Having finished the introduction of our 3D model, we will now introduce the other steps in the following subsections.

Fig. 7. Model construction.

B. Calibration of Fish-Eye Lenses Using a Collinear Constraint and Edge Corner Points

Since the employed fish-eye lens can capture a scene with a wide angle of 180 degrees, the four fish-eye lenses around the vehicle can ensure that there are no blind spots if a good surround view can be stitched. However, the wide angle of the fish-eye lens sacrifices the quality of the captured image. A fish-eye lens introduces barrel distortion, particularly for light passing through positions located far away from the optical center [7], [8]. The distortion makes it difficult to convey the captured image information [9]. Therefore, it is necessary to calibrate the camera, correct the distortion and then rectify the image to meet human visual requirements. Traditional camera calibration methods typically use approximately 20 images captured at different positions [10]–[12], which requires substantial human intervention and time [13]. In addition, the traditional calibration methods assume a pinhole model, which differs from fish-eye lenses.

In fact, the lens supplier provides a list of field curvature data that indicates the relation between the real height and the reference height. Using this list, we can rectify the distorted image captured by the fish-eye lens. However, due to imperfections in the lens manufacturing process, such as an offset CCD center and misalignment between the sensor plane and the lens plane, the provided list may not be accurate. Moreover, the fish-eye lenses installed in our system are very inexpensive and of low quality, which may also lead to inaccuracy. Through many experiments, we found that the optical center is the most important factor affecting the rectified image. If a good estimation of the optical center can be obtained, then the list of field curvature data can still be used. Most importantly, we can jointly rectify the image and estimate the optical center using a certain rule, thereby obtaining the final rectified image.

Considering that the system is going to be set up in vehicles during manufacturing or in a 4S shop, less human intervention, simple and easy operation and low time cost are preferred. Therefore, we propose a fish-eye lens calibration algorithm that uses a collinear constraint and edge corner points. The proposed algorithm relies on two important constraints. The first constraint is that collinear points should be rectified to be collinear. The second constraint is that light through the optical center has less distortion, whereas light through the edge of the lens has the largest distortion.

First, we use a special checkerboard that can be printed and easily set up, as shown in Fig. 8. Using this checkerboard, we calibrate the camera without moving the checkerboard or the vehicle.

Fig. 8. Special calibration board.

In the distorted image, the corner points close to the lens can still be accurately detected, since these points have relatively little distortion. In the global coordinates, these points are in the same row or the same column. As shown in Fig. 8, the corner points in the rectified image should still be collinear. Suppose that d(i) is the distance between point i and the fitted line and that μ(i) is the weight of each point. Then, the weighted summation is

    L = \sum_{i=1}^{M} \mu(i) d(i),    (1)

where M is the total number of corner points.

Considering the special structure of the fish-eye lens, the farther the points are from the optical center, the larger the distortion will be. Therefore, the weight of each point should be different, and these weights are set according to the physical distance of the points to the lens.
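The collinearity cost of Eq. (1) and the optical-center search built on it can be sketched as follows. This is a minimal sketch: the fish-eye rectification step is abstracted behind a caller-supplied `rectify` function, and the function names and weighting interface are our illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def collinearity_cost(points, weights):
    """Weighted sum of distances from points to their best-fit line (Eq. 1)."""
    pts = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Fit a line through the weighted centroid along the principal direction.
    centroid = np.average(pts, axis=0, weights=w)
    centered = pts - centroid
    _, _, vt = np.linalg.svd(centered * np.sqrt(w)[:, None])
    direction = vt[0]
    # Perpendicular distance of each point to the fitted line.
    proj = centered @ direction
    d = np.linalg.norm(centered - np.outer(proj, direction), axis=1)
    return float(np.sum(w * d))

def search_optical_center(rectify, corner_lines, centers):
    """Traverse candidate optical centers; keep the one whose rectified
    corner rows/columns are most nearly collinear.

    rectify(corners, center) -> rectified 2D corner positions (caller-supplied)
    corner_lines             -> list of (corners, weights), one per physical line
    centers                  -> iterable of candidate (cx, cy) optical centers
    """
    best, best_cost = None, float("inf")
    for c in centers:
        cost = sum(collinearity_cost(rectify(pts, w_c[1] if False else c), w_c[1])
                   if False else collinearity_cost(rectify(pts, c), w)
                   for pts, w in corner_lines for w_c in [(pts, w)])
        if cost < best_cost:
            best, best_cost = c, cost
    return best, best_cost
```

A usage sketch: points that rectify to a straight line give a near-zero cost, so the grid search settles on the candidate center under which all corner rows and columns straighten out.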


Given an optical center value, we can always obtain a rectified image using the list of field curvature data. Then, lines can be fitted through the detected corner points. The optical center coordinate is traversed within a given range, and the smallest L over the fitted lines is found. The corresponding optical center is the one we require.

Due to the special structure of the fish-eye lens, distortion near the edge of the image is greater. As shown in Fig. 9, the distortion of the rectangle at the sides of the checkerboard is larger than that in the middle of the checkerboard. In addition, the fish-eye lenses are mounted above the ground, so there is an inclination angle between the horizontal ground and the lens, which makes the rectangle appear as a trapezoid or another irregular shape. These facts mean that the side corner points are not easy to detect. If we perform the calibration using only the middle corner points on the (smaller) checkerboard, the estimated optical center will not be accurate, thus affecting the final rectified image.

Fig. 9. The process of the perspective transformation.

To detect the corner points far away from the fish-eye lens, we place a large square at the edge and perform a perspective transformation twice. The process of corner detection can be divided into several steps. First, a binarized image is obtained through threshold segmentation; we employ adaptive thresholding considering the non-uniform brightness of the images. The adaptive threshold of a pixel is determined by the pixel value distribution of its neighboring points. Second, image dilation is used to separate the connected black squares on the calibration board. This method requires a structuring element, which can be a square or a circle with a center point. The pixel value of this center point is compared with the value of each pixel of the image, and the larger one is set as the new pixel value. After this process, the white pixels expand, which shrinks the black quadrilaterals and cuts the connections between the squares. Moreover, the vertex count and the shape outline are used to identify the squares. Finally, some restrictive conditions, such as aspect ratio, perimeter and area, are used to eliminate interfering figures. After the above steps, the corner points can be detected. By combining the detected large-square corner points and the small-square corner points, the camera calibration can be accomplished. Fig. 9 shows the process.

Some corner points are difficult to detect because they are far away from the optical center and have large distortion. Hence, we perform a perspective projection on the original image. Since the checkerboard is placed on the ground, if we look from above the checkerboard, the rectangle takes its normal shape. Consequently, we can obtain a bird's-eye view image in which we can easily detect the corner points.

The perspective transformation matrix contains 9 parameters a_{11}, ..., a_{33}. However, the map plane is parallel to the original plane and the formula is homogeneous; therefore, a_{33} = 1, and the transformation is

    (x, y, 1) = (u, v, 1) \times
    \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & 1 \end{pmatrix}.    (2)

Here, (u, v) are the pixel coordinates in the distorted image, and (x, y) are the pixel coordinates in the transformed image. The transformation matrix carries the information about scaling, shearing, rotation and translation. With 4 reliable pairs of detected corner points, which are the 4 points closest to the lens, we can calculate the transformation matrix parameters. Thus, we can obtain each point in the transformed image as follows:

    x = \frac{a_{11}u + a_{21}v + a_{31}}{a_{13}u + a_{23}v + 1},    (3)

    y = \frac{a_{12}u + a_{22}v + a_{32}}{a_{13}u + a_{23}v + 1}.    (4)

Subsequently, we perform a second perspective transformation on the bird's-eye view image. Thus, we can calculate the pixel coordinates of the large rectangle's corner points in the original image. With all the detected corner points, the line fitting algorithm is used to search for the optical center as in (1) and generate the rectified image. Compared with the traditional calibration algorithm, the proposed algorithm captures only one image to complete the calibration; thus, it is more suitable for driver assistance systems. Note that in the traditional algorithm, only the close pixels can be rectified, whereas our algorithm can rectify all the points.

C. Coordinate Transformation Using a Virtual Imaging Surface

After obtaining the optical center and the rectified images, we need to transform the 2D image onto our 3D ship model. However, the direct relationship between the 2D image and the 3D model is difficult to obtain. Hence, a virtual image plane is established between the 2D image and the 3D imaging plane through perspective transformation and affine transformation. The lens is mounted at the top of the viewing cone. According to the fish-eye lens' position, visual angle, and orientation, the cone matrix can be determined:

    \begin{pmatrix}
    \frac{2N}{r-l} & 0 & 0 & 0 \\
    0 & \frac{2N}{t-b} & 0 & 0 \\
    0 & 0 & a & b \\
    0 & 0 & -1 & 0
    \end{pmatrix},
    \quad a = -\frac{F+N}{F-N}, \quad b = -\frac{2NF}{F-N}.    (5)
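Under the row-vector convention of Eq. (2), the eight unknown matrix parameters can be solved from 4 point pairs and then applied as in Eqs. (3)-(4). A minimal NumPy sketch (function names are ours, not from the paper):

```python
import numpy as np

def solve_perspective(src, dst):
    """Solve the 8 unknowns of Eq. (2) (with a33 = 1) from 4 point pairs.

    src: 4 (u, v) points in the source image
    dst: 4 (x, y) points in the transformed image
    Returns the 3x3 matrix laid out as in Eq. (2) (row-vector convention).
    """
    A, rhs = [], []
    for (u, v), (x, y) in zip(src, dst):
        # x*(a13*u + a23*v + 1) = a11*u + a21*v + a31  (Eq. 3, denominator cleared)
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v]); rhs.append(x)
        # y*(a13*u + a23*v + 1) = a12*u + a22*v + a32  (Eq. 4, likewise)
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v]); rhs.append(y)
    a11, a21, a31, a12, a22, a32, a13, a23 = np.linalg.solve(A, rhs)
    return np.array([[a11, a12, a13],
                     [a21, a22, a23],
                     [a31, a32, 1.0]])

def warp_point(M, u, v):
    """Apply Eqs. (3)-(4): map a source pixel into the transformed image."""
    den = M[0, 2] * u + M[1, 2] * v + 1.0
    x = (M[0, 0] * u + M[1, 0] * v + M[2, 0]) / den
    y = (M[0, 1] * u + M[1, 1] * v + M[2, 1]) / den
    return x, y
```

With the 4 corner points closest to the lens as `src` and their known ground-plane positions as `dst`, `warp_point` produces the bird's-eye view in which the far corner points become detectable.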
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
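The cone matrix described above and the projection of the following subsection can be sketched as follows, assuming the standard OpenGL-style symmetric-frustum convention (a sketch under that assumption; variable names are ours):

```python
import numpy as np

def frustum_matrix(n, f, l, r, b, t):
    """Perspective (viewing-cone) projection matrix of Eq. (5).

    n, f       -- distances to the near (front) and far (back) clipping planes
    l, r, b, t -- left/right/bottom/top boundaries of the projection plane
    """
    a = -(f + n) / (f - n)
    bb = -2.0 * f * n / (f - n)
    return np.array([
        [2.0 * n / (r - l), 0.0,               0.0,  0.0],
        [0.0,               2.0 * n / (t - b), 0.0,  0.0],
        [0.0,               0.0,               a,    bb],
        [0.0,               0.0,              -1.0,  0.0],
    ])

def project(M, p):
    """Apply Eq. (6): map a 3D model point onto the virtual imaging plane."""
    x, y, z, w = M @ np.array([p[0], p[1], p[2], 1.0])
    return x / w, y / w   # perspective divide by w = -z
```

For example, a point on the right edge of the near plane projects to the right edge of the normalized virtual image.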


N is the distance from the eye to the front clipping plane, and F is the distance from the eye to the back clipping plane. r, l, t, and b are the right, left, upper and lower boundary values of the projection plane, respectively.

The pixel coordinates of the points on the virtual imaging plane can be obtained by multiplying the above matrix with the coordinates of the corner points on the 3D model:

    \begin{pmatrix} \frac{2Nx}{r-l} \\ \frac{2Ny}{t-b} \\ az + b \\ -z \end{pmatrix}
    =
    \begin{pmatrix}
    \frac{2N}{r-l} & 0 & 0 & 0 \\
    0 & \frac{2N}{t-b} & 0 & 0 \\
    0 & 0 & a & b \\
    0 & 0 & -1 & 0
    \end{pmatrix}
    \times
    \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}.    (6)

Then, the perspective transformation matrix can be solved by combining the pixel coordinates of the corresponding corner points on the 2D plane. After obtaining the matrices of the two projection transformations, every point on the 2D image can be converted to the 3D model using the virtual imaging plane. A schematic diagram of the entire process is presented in Fig. 10.

Fig. 10. The process of coordinate transformation.

D. 3D Texture Mapping

After the above steps, the coordinates of each pixel in the 3D ship model are obtained. If only these points are mapped, the resulting surround image is not natural and loses image information. Therefore, texture mapping is employed here. Compared with the traditional mapping method, texture mapping is a sophisticated graphics technique, most commonly used to express geometric detail via texture and lighting detail. The generated surround image will be more vivid and more natural.

Texture mapping is the process that maps the pixels of a 2D texture plane onto a 3D surface. This process is similar to placing an image onto the surface of a 3D object to enhance the sense of reality. The core of this method is the introduction of an intermediate 3D surface as a mapping medium. The basic process can be accomplished through the following two steps [14], [15]:

    (u, v) \to (x', y', z') \to (x, y, z),    (7)

where (u, v) are the coordinates on the 2D plane, (x', y', z') are the coordinates on the simple 3D object surface, and (x, y, z) are the coordinates on the 3D model.

First, the 2D texture is mapped to a simple 3D object surface, such as a sphere, cube or cylinder. The following mapping is then established: T(u, v) → T'(x', y', z'). This is called the S mapping, which maps the 2D texture onto a sphere with a radius of R. The mapping process is shown in Fig. 11.

Fig. 11. Two-step mapping.

    \begin{cases}
    x = R \cos\alpha \sin\beta, \\
    y = R \sin\alpha \sin\beta, \\
    z = R \cos\beta.
    \end{cases}    (8)

P is a point on the sphere, as shown in Fig. 11. Projecting the line OP onto the XOY plane, α is the angle between the projection line and the X axis. The angle between the line OP and the Z axis is β, where 0 ≤ α ≤ 2π and 0 ≤ β ≤ π.

Subsequently, the texture on the surface of the intermediate object is mapped to the surface of the final object, as shown in Fig. 12. The intersection of the intermediate object and the ray that links point O and (x, y, z) is (x', y', z'); this point is regarded as the mapping point. The process described above is the O mapping: T'(x', y', z') → O(x, y, z).

Fig. 12. O mapping.

Through the above two steps, the texture on the 2D plane can be mapped to the surface of the 3D ship model through a spherical surface. The image distortion can also be reduced, and the image information can be preserved to the greatest extent [14], [15].

E. Image Fusion

Through the calculation of the camera parameters and of the texture mapping rules, the relative relationship between adjacent images has been determined. By projecting the 3D ship model onto the first quadrant, we can observe that the regions of adjacent lenses overlap. To reduce the visible stitching in the surround image and make the image more natural, it is necessary to fuse the overlapping regions of the image [16]–[18].

For simplicity, we use alpha fusion here. As shown in Fig. 13, there are two boundary lines: the front region segmentation line l and the right visual region segmentation


line m. For the front region, the weight of the pixel value varies from 1 to 0 as the position moves from l to m. For the right visual region, the weight of the pixel value varies from 1 to 0 as the position moves from m to l. For point A in the model, the pixel value can be calculated using the following formula:

    \begin{cases}
    \alpha = \arctan\dfrac{|y| - w/2}{|x| - l/2}, \\
    P_{front} = \dfrac{\alpha - \theta_r}{\theta_o}, \\
    P_{right} = 1 - P_{front}, \\
    C_A = P_{front} \times C_{front} + P_{right} \times C_{right}.
    \end{cases}    (9)

Fig. 13. The projection of the model in the first quadrant.

In the above formula, w is the vehicle's width, and l is the vehicle's length. θ_r and θ_o are shown in the figure. For the fish-eye lenses that we used, θ_o of our system is 15°, θ_f is 41.5°, and θ_r is 33.5°. C_front and C_right represent the pixel values of point A in the front region and the right visual region, respectively. P_front and P_right are the weights of C_front and C_right, respectively, with 0 ≤ P_front ≤ 1 and 0 ≤ P_right ≤ 1. This algorithm has a good fusion effect. Moreover, the image output by the system exhibits a naturally connected and smooth transition. It also eliminates the brightness differences between images, and the visual effects are greatly improved [19].

After the above steps, the 3D surround view is completed. In the next section, experimental results will be presented to demonstrate the effectiveness of the proposed algorithm.

IV. EXPERIMENTAL RESULTS

In this section, we present some results from our experiments with the aforementioned algorithm and the entire system.

A. Comparing the Proposed Calibration Algorithm With the Traditional Calibration Algorithm

Before cars leave the manufacturing process, some required checks and tests must be completed very quickly, and less human intervention is preferred. However, the traditional calibration algorithm [10] needs more than one image to complete calibration, thus requiring the car or the calibration board to be shifted several times. Hence, traditional algorithms are not well suited for this application case. In contrast, the calibration algorithm proposed in this paper adopts a special calibration board. It can use only one image to rectify the distortion of images and obtain a more accurate value of the optical center. This strength suits our system and other systems with high real-time requirements well. The following part compares the traditional algorithm mentioned in [10] with the proposed algorithm and presents the results.

We implemented the camera calibration in an indoor environment where the light is mild, which helps to detect corners more easily. The rectified image processed using our algorithm is shown in Fig. 14(b), and it is compared with the traditional algorithm in Fig. 14(a). From the images, we can clearly observe that the proposed algorithm rectifies the distortion well. It not only detects the corners of the middle checkerboard but also detects the corners of the rectangular calibration board at the edge of the image. However, the traditional algorithm cannot detect the corners of the rectangular calibration board. Therefore, the rectified image in Fig. 14(a) still contains distortion at the edge of the image; only the central region is well calibrated.

Fig. 14. Comparison of the two algorithms. (a) Traditional algorithm. (b) Proposed algorithm.

The second advantage of our calibration algorithm is that it obtains more accurate values of the optical center, which is the most important parameter in camera calibration. A more accurate optical center value provides a better camera calibration result. The optical center value and some other parameters of the fish-eye lens were already presented in the table of camera parameters, and the values in that table can be used as a criterion for evaluating the calibration performance. The results of the proposed algorithm and the traditional algorithm are presented in Table I and Table II. There is a reference value of the optical center in the parameter table given by the manufacturer. We use four images to test the performance of the proposed algorithm and the traditional algorithm. By carefully comparing and analyzing the data in Table I and Table II, we observe that the results of the two calibration algorithms are quite close to the reference values. However, the traditional algorithm [20] always has a relatively larger deviation compared with the proposed scheme. For the


average deviation, both the x coordinate and the y coordinate of the presented algorithm are smaller than those of the traditional algorithm. Thus, the values of the optical center obtained from the proposed algorithm are more accurate. Employing this algorithm in the system makes the camera calibration results more precise. Most importantly, the proposed algorithm requires less human intervention.

TABLE II
ERROR ANALYSIS

B. Testing the System While the Car Is Static or in Motion

First, a brief introduction to the implementation and contents of the experiments will be given. The embedded platform of our system is the Freescale i.MX6Q, a quad-core processor with 1 MB of L2 cache. In addition, the processor has four shader-core 3D graphics acceleration engines and two 2D graphics acceleration engines. Four fish-eye lenses are mounted on the front bumper, the rear bumper, and on each side under the mirrors. To improve the imaging result, the output resolution of the fish-eye lenses is 720×576. If larger sizes are required, other types of cameras or image resizing algorithms should be adopted [21]. The checkerboard of the experiment is 5×7, and the size of each square is 20 cm × 20 cm. The size of the large rectangle is 100 cm × 100 cm, and the distance between the rectangle and the checkerboard is 40 cm. We use a Honda CRV as the test vehicle in the experiment. Both static and moving cases are adopted to test the performance of our system.

We parked the car in an underground garage where the light is dim. Fig. 15 shows the images that were acquired by the fish-eye lenses. Using the proposed algorithm to process the images, the final surround view is presented as follows.

Fig. 15. Images captured by the fish-eye lenses. (a) Front side of the car. (b) Back side of the car. (c) Left side of the car. (d) Right side of the car.

The images captured by the fish-eye lenses all have distortions: the farther from the optical center, the larger the distortion. Meanwhile, there are overlapping regions between adjacent views. If we simply piece those images together, the information in the image will mislead drivers. If the algorithm in this paper is employed, the results are considerably different. As shown in Fig. 16, the distortion has been corrected, and the texture has been mapped to our 3D model appropriately. The overlapped area has been greatly improved in terms of color, brightness and so on after the fusion process. Overall, the 3D surround view system can help drivers obtain knowledge of their surroundings naturally.

Fig. 16. Surround view in an underground garage. (a) Front view. (b) Rear view. (c) Right view. (d) Left view.

Fig. 17. Surround view from different angles on a highway. (a) Rear view. (b) Left-rear view. (c) Right view. (d) Front view.

The output of the 3D surround view is shown on the display screen. We tested the system on a spacious highway and on a crowded city road. Pictures from different visual angles in these environments are shown in Fig. 17 and Fig. 18. We also


Fig. 18. Surround view from different angles on a city road. (a) Rear view. (b) Left-rear view. (c) Right view. (d) Front view.

A test video is also provided on the website,1 which better illustrates the test results. The results are still satisfactory.

1 https://pan.baidu.com/s/1dELPgrv

C. Testing the System With Other Systems of the Same Type

There have been some reports of 360° surround view systems in the literature and on websites. A 360-degree wrap-around video imaging technology was proposed by Fujitsu in [5]; the resulting image is shown in Fig. 19(a). Another 360-degree surround view system, described in [4], is shown in Fig. 19(b). These two figures are used here for comparison with ours.

Fig. 19. Comparison with other systems. (a) Fujitsu. (b) Delphi Automotive. (c) Our system.

The technology shown in Fig. 19(a) can generate a panoramic image around the vehicle, but the image is not natural, and there is a large difference in brightness and color between the generated image and reality. The system shown in Fig. 19(b) generates a more natural image, but the scene that the driver can see above the ground is very limited compared with our system, shown in Fig. 19(c). Our algorithm can generate a more natural image, and drivers can obtain a broader view. It helps drivers obtain more accurate information about their driving environment and allows them to respond in advance.

We implemented the camera calibration in an indoor environment where the light is well balanced. This condition helps to detect corners easily, such that we can perform a highly accurate calibration. However, vehicles will inevitably encounter shaking while they are in motion. An important and necessary task is therefore to eliminate the effects of vehicle vibration on the fish-eye lenses. For convenience, we provide a mobile application for users. If there are only trivial position changes of the fish-eye lenses, users can use the mobile application connected with the system to adjust the captured images slightly without a second calibration. However, when the positions of the lenses change considerably, users cannot complete a good 3D display process on their own. Under these circumstances, they have to recalibrate the lenses through the manufacturer or by themselves. Note that the mounting of the lenses will meet vehicle-level requirements in production; hence, the setup is much more robust to vibration in practice.

Real-time efficiency is one of the most important evaluation indicators of the system; thus, the time costs are also provided here. The multi-threaded camera calibration takes approximately 10 seconds to calibrate all the cameras, which can be finished off-line. Another 8 seconds is required to read the data and load the model. Including other time costs, it takes approximately 20 seconds altogether to finish the stitching. This meets the low time-cost requirements of the system. Considering all the results, we find that, both in the static environment and in the moving vehicle, this algorithm can restore the scene around the vehicle well; preserve the light, shade and concave-convex information well; and obtain a natural surround view with almost no traces of splicing.

V. CONCLUSION

This paper presents an advanced driver assistance system based on 3D surround view technology. With four fish-eye lenses mounted on a vehicle, we implement the entire framework, which includes the special calibration, 3D ship model construction, texture mapping and image fusion processes. The proposed algorithm is very efficient, and it can be applied in embedded systems. The entire system can adapt well to changes in the environment and switch to any desired view angle. The experimental results show that the calibration algorithm presented in this paper obtains more accurate results than the traditional algorithm. Moreover, calibration based on a single image enables this system to be used not only in advanced driver assistance systems but also in video surveillance and other applications where real-time performance is required.

REFERENCES

[1] P. Koopman and M. Wagner, "Autonomous vehicle safety: An interdisciplinary challenge," IEEE Intell. Transp. Syst. Mag., vol. 9, no. 1, pp. 90-96, Jan. 2017.
[2] Y. Yuan, D. Wang, and Q. Wang, "Anomaly detection in traffic scenes via spatial-aware motion reconstruction," IEEE Trans. Intell. Transp. Syst., vol. 18, no. 5, pp. 1198-1209, Mar. 2017.
[3] B. Zhang et al., "A surround view camera solution for embedded systems," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, Jun. 2014, pp. 676-681.
[4] M. Yu and G. Ma, "360° surround view system with parking guidance," Driver Assist. Syst., vol. 7, no. 1, pp. 19-24, 2014.
[5] "360° wrap-around video imaging technology ready for integration with Fujitsu graphics SoCs," Fujitsu Microelectron. America, Inc., Sunnyvale, CA, USA, Tech. Rep., Feb. 2011. [Online]. Available:
[6] M. Lin, G. Xu, X. Ren, and K. Xu, "Cylindrical panoramic image stitching method based on multi-cameras," in Proc. IEEE Int. Conf. Cyber Technol. Autom., Control, Intell. Syst., Jun. 2015, pp. 1091-1096.
[7] Z. Hu, Y. Li, and Y. Wu, "Radial distortion invariants and lens evaluation under a single-optical-axis omnidirectional camera," Comput. Vis. Image Understand., vol. 126, no. 2, pp. 11-27, 2014.
[8] M. Schönbein, T. Strauß, and A. Geiger, "Calibrating and centering quasi-central catadioptric cameras," in Proc. Int. Conf. Robot. Autom. (ICRA), May 2014, pp. 4443-4450.
[9] C. S. Fraser, "Automatic camera calibration in close range photogrammetry," Photogramm. Eng. Remote Sens., vol. 79, no. 4, pp. 381-388, 2013.
[10] Z. Zhang, "A flexible new technique for camera calibration," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 11, pp. 1330-1334, Nov. 2000.
[11] Z. Zhang, "Camera calibration with one-dimensional objects," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 7, pp. 892-899, Jul. 2004.
[12] Q. Wang, C. Zou, Y. Yuan, H. Lu, and P. Yan, "Image registration by normalized mapping," Neurocomputing, vol. 101, pp. 181-189, Feb. 2013.
[13] H.-T. Chen, "Geometry-based camera calibration using five point correspondences from a single image," IEEE Trans. Circuits Syst. Video Technol., to be published.
[14] E. A. Bier and K. R. Sloan, "Two-part texture mappings," IEEE Comput. Graph. Appl., vol. 6, no. 9, pp. 40-53, Sep. 1986.
[15] P. J. Besl, "Geometric modeling and computer vision," Proc. IEEE, vol. 76, no. 8, pp. 936-958, Aug. 1988.
[16] M. A. Ruzon and C. Tomasi, "Alpha estimation in natural images," in Proc. CVPR, vol. 1, 2000, pp. 18-25.
[17] A. Levin, D. Lischinski, and Y. Weiss, "A closed-form solution to natural image matting," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 2, pp. 228-242, Feb. 2008.
[18] M. Salvi and K. Vaidyanathan, "Multi-layer alpha blending," in Proc. Meet. ACM SIGGRAPH Symp. Interact. 3D Graph. Games, 2014, pp. 151-158.
[19] T. Stathaki, Image Fusion: Algorithms and Applications. San Francisco, CA, USA: Academic, 2008.
[20] Z. Zhang, "Flexible camera calibration by viewing a plane from unknown orientations," in Proc. 7th IEEE Int. Conf. Comput. Vis., vol. 1, Sep. 1999, pp. 666-673.
[21] Q. Wang and Y. Yuan, "High quality image resizing," Neurocomputing, vol. 131, pp. 348-356, Jan. 2014. [Online]. Available:

Yi Gao was born in Yichang, China, in 1996. She is currently pursuing the bachelor's degree in computer science with Beijing Jiaotong University, China. She is currently involved in multimedia information processing with the Institute of Information Science, Beijing Jiaotong University. Her interests include image processing and data analysis.

Chunyu Lin was born in Liaoning, China. He received the Ph.D. degree from Beijing Jiaotong University, Beijing, China, in 2011. From 2009 to 2010, he was a Visiting Researcher with the ICT Group, Delft University of Technology, Delft, The Netherlands. From 2011 to 2012, he was a Post-Doctoral Researcher with the Multimedia Laboratory, Ghent University, Ghent, Belgium. His current research interests include image/video compression and robust transmission, 3-D video coding, panorama, and VR video processing.

Yao Zhao (M'06-SM'12) received the B.S. degree from the Radio Engineering Department, Fuzhou University, Fuzhou, China, in 1989, the M.E. degree from the Radio Engineering Department, Southeast University, Nanjing, China, in 1992, and the Ph.D. degree from the Institute of Information Science, Beijing Jiaotong University (BJTU), Beijing, China, in 1996. He became an Associate Professor with BJTU in 1998, where he became a Professor in 2001. From 2001 to 2002, he was a Senior Research Fellow with the Information and Communication Theory Group, Faculty of Information Technology and Systems, Delft University of Technology, Delft, The Netherlands. He is currently the Director of the Institute of Information Science, BJTU. His current research interests include image/video coding, digital watermarking and forensics, and video analysis and understanding. He is also leading several national research projects under the 973 Program, the 863 Program, and the National Science Foundation of China. He serves on the editorial boards of several international journals, including as an Associate Editor of the IEEE TRANSACTIONS ON CYBERNETICS, an Associate Editor of the IEEE SIGNAL PROCESSING LETTERS, an Area Editor of Signal Processing: Image Communication (Elsevier), and an Associate Editor of Circuits, Systems, and Signal Processing (Springer). He was named a Distinguished Young Scholar by the National Science Foundation of China in 2010 and was elected a Chang Jiang Scholar of the Ministry of Education of China in 2013.

Xin Wang was born in Xianghe, China, in 1995. He is currently pursuing the bachelor's degree in computer science with Beijing Jiaotong University, China. He is currently involved in multimedia information processing with the Institute of Information Science, Beijing Jiaotong University. His interests include image processing, deep learning, and computer vision.

Shikui Wei received the Ph.D. degree in signal and information processing from Beijing Jiaotong University (BJTU), Beijing, China, in 2010. From 2010 to 2011, he was a Research Fellow with the School of Computer Engineering, Nanyang Technological University, Singapore. He is currently a Professor with the Institute of Information Science, BJTU. His research interests include computer vision, image/video analysis and retrieval, and copy detection.

Qi Huang received the master's degree in communication and information systems from Beijing Jiaotong University in 2008. He was with China Mobile (Beijing) Ltd. from 2008 to 2016. Since 2016, he has been serving as the CEO of Beijing Xinyangquan Electronic Technology Co.,