
Timestamp Extracting From CCTV Footage

Kazi Fozle Azim Rabi


Sakib Shahriyar Pathan
Syed Intiser Ahsan
Monirul Hasan

Dept. of Computer Science and Engineering


Southeast University
7th September 2016

1 Abstract
Timestamp extraction from video footage is a computer vision task that
enables a system to automatically read the hard-coded time and date from a
video or from a single frame. This paper proposes a method for this task.
The idea consists of the following steps: pre-processing the image; detecting
the timestamp region in the whole image using an edge detection technique;
post-processing to reduce noise and redistribute the intensity across the newly
cut regions; dividing each region into character regions using the flood-fill
technique and normalising them to the known size of the pre-stored images;
and finally supplying the normalised, binarized image to the OCR stage,
which runs the input image against all the pre-stored images to find the best
match.

2 Introduction
The timestamp of a recording is the real recording time, which may include
but is not limited to the year, month, date, hour, minute, and second. It is
intrinsically saved with the video data by camcorders/CCTV. [1]
In digital CCTV, the timestamp is stored separately in the video stream
using some specific bytes, so it can be accessed easily. For analogue CCTV,
however, the timestamp is superimposed on the video frames and mixed into
the video data, so it cannot be extracted directly. In this situation, the only
way to automatically obtain the recording time is to recognise the superim-
posed digits in the video frames.
The main reason for extracting timestamps from CCTV footage is to check
automatically whether any footage/recording is missing. So far this has been
done manually: a person sits in front of the monitor and keeps track of
whether the system is recording continuously. With the proposed approach,
any time gap in the footage can be found by continuously checking the ex-
tracted recording time.
This paper analyses an approach to extract timestamps from CCTV footage.
Section 3 describes the pre-processing operations applied to the image before
detecting the timestamp region. Section 4 discusses detecting the region of
the frame where the time and date are imposed. Section 5 describes how
the timestamp region (Section 4) is prepared for segmentation. Section 6
discusses how the characters are extracted from the post-processed times-
tamp region (Sections 4 and 5) and normalised to a fixed size. Finally,
Section 7 gives a solution for recognising the optical characters using a tem-
plate matching technique.

Figure 1: CCTV Footage with Timestamp

3 Pre-processing
Images are pre-processed before the timestamp region is located. CCTV may
capture a colour feed, so the frame/image can be a colour image. As the
resolution of CCTV is low, the images contain noise. The image is first
converted into a grayscale image, and then the noise is reduced by blurring
the image.
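These two pre-processing steps can be sketched as follows in NumPy. This is a minimal illustration, not the paper's implementation: the luminance weights and the 3 × 3 mean filter are common defaults assumed here.

```python
import numpy as np

def to_grayscale(frame):
    """Convert an H x W x 3 colour frame to grayscale using the
    standard luminance weights (an assumed, common choice)."""
    r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
    return 0.299 * r + 0.587 * g + 0.114 * b

def mean_blur(gray, k=3):
    """Reduce noise with a simple k x k box (mean) filter,
    padding the borders by edge replication."""
    h, w = gray.shape
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")
    out = np.zeros_like(gray, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)
```

A Gaussian blur would serve equally well here; the box filter is used only to keep the sketch self-contained.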

4 Date And Time Region Detection
To read the date and time from a video frame, the first and most important
task is to detect the location of the date and time text in the image. Video
text extraction systems usually face challenges such as the same text chang-
ing position in the next frame, the current frame's text disappearing, or new
text appearing in a new frame. These problems have been studied, and var-
ious techniques have been developed, e.g. the layered method, Bayesian
techniques, etc. [2] [3] This research deals with CCTV footage, so there is
no need to solve these problems: each recording carries one continuous
timestamp, in one fixed position across all of its frames, with the same font
size. Detecting the timestamp position in one frame and reusing that posi-
tion for all subsequent frames of the same footage will suffice. Researchers
have proposed various methods to locate text in a frame, e.g. edge-based
methods, colour-based methods, layered methods, machine-learning-based
methods, etc. [4] For this purpose, one popular edge-based method is used. [5]

Figure 2: Time Region Detection Steps

The procedure followed in this method is listed below:

a) First, two edge maps (Fig-2: b and c) of the frame are obtained using
3 × 3 horizontal and vertical Sobel masks.

b) From the vertical edge map, the candidate text areas are found using
the edge density (Fig-2: d).

c) As there are many horizontal edges along the upper and bottom bound-
aries of a text area, false candidates that do not have enough horizontal
edges along their top and bottom boundaries are removed (Fig-2: e).

d) In the final step, the remaining shapes are suppressed using a VQ-based
Bayesian classifier with feature selection and training, to remove false can-
didates and to fix inaccurate text area boundaries (Fig-2: f). [5]

An alternative approach is to let the user select the region of interest where
the timestamp is located in the frame and use that region directly.
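Steps (a) and (b) above can be sketched as follows. The kernel definitions are the standard Sobel masks; the window size and density threshold in `edge_density` are illustrative assumptions, since [5] does not fix them here.

```python
import numpy as np

# Standard 3x3 Sobel masks: SOBEL_X responds to vertical edges,
# its transpose to horizontal edges.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def convolve3(gray, kernel):
    """Valid-mode 3x3 correlation; returns an (H-2) x (W-2) edge map."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * gray[dy:dy + h - 2, dx:dx + w - 2]
    return out

def edge_density(edge_map, win=8, thresh=50.0):
    """Mark win x win windows whose mean |edge| response exceeds thresh;
    the marked windows form the candidate text mask."""
    h, w = edge_map.shape
    mag = np.abs(edge_map)
    mask = np.zeros((h, w), dtype=bool)
    for y in range(0, h - win + 1, win):
        for x in range(0, w - win + 1, win):
            if mag[y:y + win, x:x + win].mean() > thresh:
                mask[y:y + win, x:x + win] = True
    return mask
```

The horizontal-edge check of step (c) and the Bayesian classifier of step (d) are omitted; they operate on the candidate mask produced above.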

5 Post-processing
The previous stage provides one or two text regions, depending on the cam-
era: some superimpose the timestamp in one line while others use two lines.
Either way, the selected regions need to be cut out, their contrast enhanced
using histogram equalization, and the result blurred to reduce noise. This
could have been done before detecting the text region, but doing it here
gives a better result, as the region is small and the output is based only on
relevant pixels rather than on the whole image. Finally, thresholding is
applied to binarize the image.
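The equalization and binarization described above might look as follows. This is a sketch of the classic histogram-equalization mapping; the fixed threshold of 128 is an assumption, as the paper does not name a binarization method.

```python
import numpy as np

def equalize(region):
    """Contrast-enhance an 8-bit grayscale region via histogram
    equalization: map each intensity through the scaled CDF."""
    hist = np.bincount(region.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    # Classic mapping: stretch the CDF over the full 0-255 range.
    lut = np.round((cdf - cdf_min) / max(cdf[-1] - cdf_min, 1) * 255)
    return lut.astype(np.uint8)[region]

def binarize(region, thresh=128):
    """Fixed-threshold binarization: foreground text = 1, background = 0."""
    return (region >= thresh).astype(np.uint8)
```

An adaptive method such as Otsu's threshold would be a natural substitute for the fixed value when lighting varies between cameras.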

6 Segmentation And Normalisation

To extract each number and the separator signs (dash, dot, slash), dilation
is first applied to the noise-reduced binary image to recover the character
pixels that were lost in the previous processes. After that, all the connected
components are searched. During the search, the indices of the rightmost,
leftmost, topmost and bottommost pixels are stored to mark each charac-
ter's region. This yields one region per character, with no pixels beyond the
minimum rectangle required to hold the character. However, the timestamp
font size may vary depending on the camcorder model or manufacturer, or
even within the timestamp of the same frame (the date may have one font
size and the time another). Normalisation is used to tackle this problem:
every character region is normalised into a character of size 20 × 35 (here
the height is 35 and the width is 20, because digits are usually taller than
they are wide) to match the size of the character images pre-stored in the
database, as the comparison is done against these pre-stored images.
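The flood-fill segmentation and normalisation can be sketched in plain Python. The breadth-first flood fill and nearest-neighbour resize are assumed implementation details; the paper specifies only the technique and the 20 × 35 target size.

```python
from collections import deque

def character_boxes(binary):
    """Flood-fill each 4-connected foreground component and return its
    tight bounding box (top, left, bottom, right), ordered left to right."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                top, left, bottom, right = y, x, y, x
                queue = deque([(y, x)])
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    # Track the extreme pixel indices of this component.
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and \
                                binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return sorted(boxes, key=lambda b: b[1])

def normalise(binary, box, width=20, height=35):
    """Nearest-neighbour resize of one character box to the 20 x 35
    size of the pre-stored character images."""
    top, left, bottom, right = box
    bh, bw = bottom - top + 1, right - left + 1
    return [[binary[top + (r * bh) // height][left + (c * bw) // width]
             for c in range(width)] for r in range(height)]
```

Sorting the boxes by their left edge preserves the reading order of the timestamp digits.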

7 Optical Character Recognition
There are many techniques for recognising optical characters. [6] [7] To solve
this particular problem, a slightly costly but easy-to-understand-and-implement
template matching technique is used. Every separated character image, i,
is passed to a function md(i), which calculates and returns the character
that matches best. The function uses the pre-stored images, d, to return
the best-matched character. Fig-3 shows the database of the system.

Figure 3: Character Database

It subtracts each of the pre-stored images one by one from the normalised
input image of the character and considers the character whose sum of
absolute differences (D) is closest to 0 without exceeding some predefined
threshold value, T. If the minimum value is greater than T, then that par-
ticular character can either be processed again (erosion / dilation / blurring /
a different threshold value or method during binarization, etc.) or the closest
match can be accepted as the return value.
The above-mentioned function is given below:

md(i) : D(n) = \sum_{r=0}^{x-1} \sum_{c=0}^{y-1} |i(r, c) - d(n)(r, c)|

Here, r and c represent the row and column of the images, and x = 35 and
y = 20 are the height and width of the images, respectively; n = 0, 1, 2, ..., 11,
where indices 0 to 9 stand for the digits 0 to 9, 10 for ':' and 11 for '/'.
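The matching function described above can be sketched directly from that definition. The tuple return value and the default threshold are assumptions for illustration; the paper defines only D and T.

```python
def md(i, templates, T=100):
    """Template matching by sum of absolute differences (SAD).
    i          : normalised character image (list of rows).
    templates  : 12 pre-stored images d(n), indexed as below.
    Returns (best character, its D, whether D <= T); a failing
    match signals the caller to re-process the character."""
    CHARS = "0123456789:/"  # n = 0..9 digits, 10 = ':', 11 = '/'
    best_char, best_d = None, None
    for n, d in enumerate(templates):
        # D(n): pixel-wise sum of absolute differences against template n.
        D = sum(abs(i[r][c] - d[r][c])
                for r in range(len(i)) for c in range(len(i[0])))
        if best_d is None or D < best_d:
            best_char, best_d = CHARS[n], D
    return best_char, best_d, best_d <= T
```

With binary 0/1 images, D simply counts the mismatching pixels, so T can be read as the maximum number of pixels allowed to differ.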

8 Conclusion
Time and date OCR in CCTV video provides useful data that cannot be
obtained by other means. Integrating this system into a larger application
will offer the ability to perform time-based queries. The recognition result
could be improved further by integrating a commercial OCR engine, if time
consumption is not a concern.
Here a template matching technique is used, subtracting pre-stored images
from the normalised input image, but many other methods could be applied.
This solution also proposes nothing about the date and time format, as it
varies in almost every situation; this will be given the highest priority so
that an efficient solution to the problem can emerge.
Other future work will be: (1) extracting other text, including text that
appears within the scene; (2) enhancing the OCR module by implementing
an online version to collect user feedback.

References
[1] L. Zhao, W. Qi, Y.-J. Wang, S.-Q. Yang, and H. Zhang, "Video shot
grouping using best-first model merging," in Photonics West 2001 -
Electronic Imaging, pp. 262-269, International Society for Optics and
Photonics, 2001.

[2] H. Li, D. Doermann, and O. Kia, "Automatic text detection and tracking
in digital video," IEEE Transactions on Image Processing, vol. 9, no. 1,
pp. 147-156, 2000.

[3] Y. Zhong, H. Zhang, and A. K. Jain, "Automatic caption localization in
compressed video," IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 22, no. 4, pp. 385-392, 2000.

[4] R. C. Jose and D. Davis, "Extraction of text from videos based on lay-
ered method," in Engineering and Technology (ICETECH), 2015 IEEE
International Conference on, pp. 1-3, IEEE, 2015.

[5] X. Chen and H. Zhang, "Text area detection from video frames," in
Pacific-Rim Conference on Multimedia, pp. 222-228, Springer, 2001.

[6] H.-L. Tan, "Hybrid feature-based and template matching optical charac-
ter recognition system," Dec. 31, 1991. US Patent 5,077,805.

[7] M. Ganis, C. L. Wilson, and J. L. Blue, "Neural network-based systems
for handprint OCR applications," IEEE Transactions on Image Processing,
vol. 7, no. 8, pp. 1097-1112, 1998.

