
The 2017 ISEE


Dang Thanh Tin(1), Le Tuan Vu(1), Van Dinh Trieu(1)
Bach Khoa University, Viet Nam
Email: dttin@hcmut.edu.vn

Nowadays, the worldwide expansion of the Internet brings about a higher demand for subtitled video. A subtitle is a textual version of a video's dialogue or explanatory text that appears on screen. Because of that, developing software that helps video makers add and edit their subtitles is imperative. A study has been conducted to build the SmartSubtitling software. It aims at adding and processing subtitles smartly through the ability to evaluate the average color value of the subtitle area; it then defines the subtitle color so that the text is the most readable.

KEYWORDS: subtitling, smart subtitling, opencv, subtitle process.

1. INTRODUCTION

Viewers sometimes get annoyed at classic subtitles, which become virtually unreadable when their color coincides with the background. This degrades video quality and interrupts viewer comprehension.

Some subtitling tools are already available, such as MKVToolnix or the Captions and Subtitles function on the Youtube website. Each has its own method for better readability: commonly, the default font is white text with a black edge; Youtube, in particular, sets its white text in black cells. Sharing the same aim, SmartSubtitling is introduced as subtitling software with an advanced function.

Smart subtitling is a complex function that involves creating, editing or adjusting subtitles so that they are always readable. In other words, its task is to keep the subtitle and background colors distinguishable even when the background changes continuously in color and brightness. The initial requirement is to accurately define the subtitle area in the background, then segment it into dark and bright regions so as to properly process its colors. The applied methods are discussed briefly in section 2, result analysis in section 3, and the conclusion in section 4.

2. METHOD

2.1. Tool

OpenCV is a library of programming functions mainly aimed at real-time computer vision. Its vast and progressively expanding library aids us effectively in image processing; it includes various functions such as drawing, video input/output, shape detection, etc. As an open-source tool, OpenCV is suitable for study. In this study, OpenCV version 3.2.0 is utilized.

The software is programmed in C++, an object-oriented language. Test runs and algorithm testing are carried out in Visual Studio 2010. To build the graphical user interface (GUI), Qt 5.6.2 is chosen: a C++-based framework of libraries and tools that enables the development of powerful, interactive and cross-platform applications and devices.

2.2. Theoretical basis and idea

As regards subtitling, every single video frame must be observed. Each frame is encoded into a matrix whose coordinates stand for an image element, a pixel (Fig. 1).

Figure 1. Illustrating the pixel of an image [1]
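The frame-as-matrix idea can be sketched without any library: a grayscale frame is simply a 2D array of intensity values, and writing text amounts to setting pixels at given coordinates. The following is a minimal stand-in for OpenCV's cv::Mat; the type and function names here are illustrative, not the paper's code:

```cpp
#include <vector>
#include <cstdint>

// A minimal grayscale frame: a rows x cols matrix of 8-bit intensities,
// stored row-major. (A stand-in for cv::Mat; names are illustrative.)
struct Frame {
    int rows, cols;
    std::vector<std::uint8_t> data;
    Frame(int r, int c) : rows(r), cols(c), data(r * c, 0) {}
    std::uint8_t& at(int row, int col) { return data[row * cols + col]; }
};

// Set one pixel; drawing a character would set many such pixels.
inline void setPixel(Frame& f, int row, int col, std::uint8_t value) {
    f.at(row, col) = value;
}
```

In this representation, inserting text at a specified pixel reduces to repeated setPixel calls over the glyph's coordinates.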


Thence, text can be inserted into the frame at a specified pixel.

A character is composed of line segments and curves. Indeed, drawing a word is conducted by drawing and then connecting those lines and curves [3][6].

Figure 2. Each character is composed of line segments and curves

There are many available algorithms to draw lines and curves, such as the DDA algorithm, Bresenham's line generation and the mid-point algorithm. Hence, a reference was consulted to select the right one instead of designing such an algorithm from scratch. Conveniently, the available library offers font-drawing functions, eliminating the line- and curve-connecting step.

Figure 3. Available word fonts in OpenCV library

As to the preparation process: primarily, the background area containing the subtitle has to be defined on the frame, then subdivided into cells. Each cell has its own certain brightness level that helps define its subtitle color. The details on how the cells are divided are discussed below. Given input consisting of the text and the subtitle position on the frame, the background area containing the subtitle can be defined easily: it is a rectangle surrounding the text. Next, the rectangle is divided into cells in such a way that each cell's length equals the length of each character. To do so, converting the line into a character array is compulsory, with each element of the array holding one character, including spaces.

Figure 4. Detaching a word into separated characters [1].

After detaching the word, the following step is to calculate the length of each character and divide the rectangle into cells.

Figure 5. Separating the subtitle area into cells [1].

Accordingly, a calculation of the average color value of the cells containing the characters is carried out, which helps define the essential color in each cell. Normally the color is chosen in contrast with the average color. The color average value formula is

avg = \frac{1}{n} \sum_{i=0}^{n-1} C(i)

where C(i) is the color value of the pixel at place i and n is the total number of pixels in one cell.

Beside the content, the time at which the subtitle is rendered is another important factor to be considered. This is handled by defining the frame positions of the start point and end point; the subtitle must be displayed in this interval. The frame position is defined by the formula

Position = fps * [Time point]

where fps is the frames per second (the frame rate) and [Time point] is the point of time observed.

The following diagram depicts the process overview: input the video and text, grab a video frame, divide the text background into cells, calculate the color average value of each cell, draw each character of the text, then move to the next frame until the end of file, finally outputting the subtitled video.

Figure 6. The process overview
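The two formulas above can be sketched in plain C++ without OpenCV: the average over a cell's pixel values, and the frame position of a time point. The function names are illustrative, not the paper's actual code:

```cpp
#include <vector>
#include <cstdint>
#include <cmath>

// Color average value of one cell: avg = (1/n) * sum of C(i), i = 0..n-1,
// where the vector holds the greyscale value C(i) of each pixel.
double cellAverage(const std::vector<std::uint8_t>& cell) {
    if (cell.empty()) return 0.0;
    double sum = 0.0;
    for (std::uint8_t c : cell) sum += c;
    return sum / static_cast<double>(cell.size());
}

// Frame position of a time point: Position = fps * [Time point].
int framePosition(double fps, double timePointSec) {
    return static_cast<int>(std::lround(fps * timePointSec));
}
```

For example, at 25 fps a subtitle starting at second 10 first appears at frame position 250.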


2.3. Algorithm

2.3.1. Displaying video in GUI

The player is designed entirely around the frame-reading function in OpenCV instead of the media-player facilities available in the Qt library. The signal and slot mechanism is a central feature of Qt: a signal is emitted when a particular event occurs, and a slot is a function of a class that is called in response to a particular signal. Signals and slots are used for communication between objects. When there is a connection, one object is an emitter and another plays the role of a receiver. By default, for every emission, the corresponding signal receiver executes its slot. To connect a signal and a slot, the connect command is used:

connect(object1, signal, object2, slot)

On the emitter side, a loop is executed to read every video frame. Each time, it converts the frame into an image and saves it to a variable called img, then emits a signal carrying the img:

void sendProcessedImaged(QImage img)

On the receiver side, a corresponding slot is set up to receive the image img:

void updateUIframe(QImage img)

Then the img is displayed on a label in the GUI; for every receipt, the label displays the received image. The time interval between two signal emissions is given by the formula T = 1/fps (sec). To reach higher precision, the following formula is used instead: T = (1/fps) * 10^6 (microseconds).

2.3.2. Subtitle storage

A class is used to hold the subtitle characteristics. Currently, the three characteristics under consideration are: start point, end point and content. Vectors are used as the container. The following table is an example:

Table 1. Subtitle characteristics

    #   Start Point (sec)   End Point (sec)   Content
    0   10                  13                Hello!
    1   15                  18                Hi !
    2   20                  23                How are you today?
    3   25                  27                Good. And you?

2.3.3. Subtitling

Thanks to the rich OpenCV and C++ libraries, the process becomes much more convenient. To open the given video, the VideoCapture constructor is used with the following syntax [1]:

VideoCapture::VideoCapture(const string& filename)

To read each frame of the video:

VideoCapture::read(Mat& image)

The subtitle is entered as a string. The background area on the frame containing the subtitle has to be defined. In this case, the alignment position is chosen to be horizontally centered and placed at about nine tenths of the frame height. For any subtitle with a certain font, it is possible to define the subtitle string's format: a rectangle surrounding the text, with the same length as the string and the same height as the text font. Supposing x and y are the coordinates of the lower corner of the rectangle (the text origin), and frame_width and frame_height are the width and height of a video frame, x and y are computable. The algorithm in Fig. 7 enters the subtitle, measures its length subSize, then computes

x = (frame_width - subSize) / 2
y = (9/10) * frame_height

which yields the set of pixels in the subtitle area.

Figure 7. The algorithm used to define the subtitle area

As to character splitting, in principle the string is a contiguous sequence of characters. Every element of the array must be visited, and each character is assigned to a variable of type char [5].
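The subtitle storage of section 2.3.2 and the area computation of Fig. 7 can be sketched together as follows. This is an assumed reconstruction: the struct and helper names are ours, and the centered x formula is inferred from the figure, not taken from the paper's code:

```cpp
#include <string>
#include <vector>

// One subtitle record as in Table 1: start point, end point, content.
struct Subtitle {
    double startSec;
    double endSec;
    std::string content;
};

// Is this subtitle visible at a given frame position, using
// Position = fps * [Time point] for the start and end points?
bool isVisible(const Subtitle& s, double fps, int framePos) {
    return framePos >= static_cast<int>(fps * s.startSec) &&
           framePos <= static_cast<int>(fps * s.endSec);
}

// Origin of a horizontally centered subtitle rectangle (Fig. 7):
// x = (frame_width - subSize) / 2, y = (9/10) * frame_height.
int subtitleX(int frameWidth, int subSize) { return (frameWidth - subSize) / 2; }
int subtitleY(int frameHeight) { return frameHeight * 9 / 10; }
```

A std::vector<Subtitle> then serves as the container, scanned once per frame to decide which record to render.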


However, the OpenCV library does not provide a function for drawing a single char, so characters cannot be drawn to the frame directly. Therefore, an intermediate step must be executed to convert the characters from char type to string. The algorithm (Fig. 8) enters a string S and, for each index i with i < S.length, takes the character a = S[i], converts a to string type and returns it, then advances with i = i + 1.

Figure 8. The algorithm used to split characters

Each split word is considered a string. An algorithm that determines the background color of this string is performed, and its color average value is calculated. First, the defined area must be trimmed into a separate image so that its color average value can be computed. Let avg be the average value under determination and Mat be the matrix of the trimmed image; the rule is to calculate each pixel's color value and then the average. Because the calculation is executed on greyscale [4][7], the average color falls in the interval from 0 to 255. The algorithm (Fig. 9) defines the object area, trims the area to a new matrix, accumulates the pixel values over all rows j < Mat.rows and columns i < Mat.columns, and finally divides avg by Mat.rows * Mat.columns.

Figure 9. Calculating the color average value of a defined area

After calculating the average value, the condition for the character color in each cell is restricted: averages in the interval from 0 to 176 return the white color, and the remaining interval, between 176 and 255, returns the black color.

A seemingly friendlier approach is to put the text and background colors in direct contrast (255 - [the average color value]). However, this leads to unreadability through color coincidence in some cases; for example, when the color average value is 125, the calculated color is 130, and the two are nearly the same.
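The character-splitting conversion (Fig. 8) and the black/white restriction applied after the greyscale average (Fig. 9) can be sketched together in plain C++. The 176 threshold is the paper's; the function names are illustrative:

```cpp
#include <string>
#include <vector>
#include <cstdint>

// Fig. 8: split a string into one-character strings, because the
// drawing function accepts strings rather than single chars.
std::vector<std::string> splitChars(const std::string& s) {
    std::vector<std::string> out;
    for (char a : s) out.push_back(std::string(1, a));  // char -> string
    return out;
}

// Fig. 9 plus the restriction: average the greyscale values of a
// trimmed cell; averages in [0, 176] yield white text (255),
// averages in (176, 255] yield black text (0).
std::uint8_t charColorForCell(const std::vector<std::uint8_t>& cell) {
    double sum = 0.0;
    for (std::uint8_t v : cell) sum += v;
    double avg = cell.empty() ? 0.0 : sum / cell.size();
    return (avg <= 176.0) ? 255 : 0;
}
```

The fixed threshold avoids the near-coincidence problem of the pure 255 - avg contrast rule described above.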


3. RESULT ANALYSIS

A test run is made that includes opening a video and inserting the subtitle "Hello World !!!!".

Figure 10. A subtitled video.

According to practical tests and user feedback, in most cases the algorithm performs properly. The GUI is regarded as nice and friendly and fulfills the initial idea. The software accepts input video in various formats such as .mpg, .mp4 and .avi. In this version, video output is only saved in the .avi format.

Figure 11. An unreadable moment

However, there are particular cases in which the algorithm shows drawbacks, such as the unreadable subtitle in Figure 11. For such a background, the color average value is not representative, which leads to unreadability.

The two subtitle features start point and end point are measured in seconds. This causes inaccuracy in the display time, which affects subtitle continuity.

4. CONCLUSION

The assignment on Smart Subtitling has been completed in all respects, and the result is as good as expected. However, there are some drawbacks in the process, such as subtitle quality, a time-consuming process, and an output video format limited to AVI only.

In the coming version, the software will be designed and developed to obtain more advanced functions:

- A higher-accuracy algorithm.
- Enabling users to edit their subtitles after storing.
- Previewing.
- Adding three more subtitle features: font, size and customized coordinates.
- Automatically splitting the subtitle into rows if its width exceeds the frame width.
- Time accuracy reaching milliseconds.
- A wider range of output video formats.
- Adding sound to the rendered output video.
- A nicer and more friendly GUI design.

ACKNOWLEDGEMENT

The author is grateful to Assoc. Prof. Dr. Dang Thanh Tin for his valuable criticism of the draft of this paper.

REFERENCES

[1] G. Bradski and A. Kaehler, Learning OpenCV, 1st Edition, O'Reilly Media, Inc., 2008.
[2] J. Blanchette and M. Summerfield, C++ GUI Programming with Qt 4, 2nd Edition, Prentice Hall, 2008.
[3] P. Shirley, Fundamentals of Computer Graphics, 2nd Edition, A K Peters, 2005.
[4] S. Johnson, Stephen Johnson on Digital Photography, 1st Edition, O'Reilly Media, Inc., 2006.
[5] Splitting a string by a character. Retrieved from:
[6] Line Generation Algorithm. Retrieved from: ine_generation_algorithm.htm
[7] John, Three algorithms for converting color to grayscale. Retrieved from: hms-convert-color-grayscale/
[8] Efficient way to display OpenCV image into Qt. Retrieved from: http://www.qtcentre.org/threads/56482-efficient-way-to-display-opencv-image-into-Qt