
Cephalometric Landmark Detection in Dental X-ray Images

Using Convolutional Neural Networks

Hansang Lee, Minseok Park, and Junmo Kim
School of Electrical Engineering, Korea Advanced Institute of Science and Technology,
291 Daehakro, Yuseonggu, Daejeon 34141, Republic of Korea

In dental X-ray images, accurate detection of cephalometric landmarks plays an important role in clinical
diagnosis, treatment, and surgical decisions for dental problems. In this work, we propose an end-to-end deep
learning system for cephalometric landmark detection in dental X-ray images, using convolutional neural
networks (CNNs). To detect 19 cephalometric landmarks, we treat the x- and y-coordinates of all landmarks as
38 independent variables and construct one CNN-based regression system per variable to predict it from the
input X-ray image. First, each coordinate variable is normalized by the length of the corresponding image axis
(width or height). For each normalized coordinate variable, a CNN-based regression system is trained on the
training images with that variable as the regression target. All 38 regression systems share the same CNN
structure. Finally, we compute the 38 coordinate variables for unseen images with the trained systems and
obtain the 19 landmarks by pairing the regressed coordinates. In experiments on the public database from the
Grand Challenges in Dental X-ray Image Analysis at ISBI 2015, the proposed system showed promising
performance, locating the cephalometric landmarks within a margin of the ground truths and without heavy
outliers.
Keywords: Computer-aided detection (CADe), landmark detection, dental X-ray, deep learning, convolutional
neural networks

Over the past decade, deep learning has become one of the most powerful and reliable machine learning
approaches in various fields, such as computer vision [1–3], language processing [4, 5], and gaming [6]. In
medical imaging, deep learning has also shown superior performance in applications ranging from pre-processing
techniques to semantic analysis of patient images [7, 8]. It has achieved state-of-the-art performance in tasks
such as abnormality detection [9, 10], disease classification [11, 12], and organ segmentation [13, 14]. Owing
to this power and effectiveness, deep learning has recently come to play an important role in computer-aided
detection (CADe) and diagnosis (CADx) in medical imaging.
In dental X-ray images, accurate detection of the cephalometric landmarks shown in Fig. 1 plays an important
role in clinical diagnosis, treatment, and surgical decisions for dental problems. Manual landmark detection,
however, is time-consuming and subject to inter- and intra-observer variability, so automating the landmark
detection process is desirable. To this end, two recent public grand challenges [15, 16] have been held, and
several approaches have been proposed for automatic landmark detection in dental X-ray images. Vandaele et
al. [17] used ensemble learning with extremely randomized trees (ERT) to learn the locations of cephalometric
landmarks. Mirzaalian et al. [18] proposed a pictorial structure algorithm with random forest-based likelihoods
computed from several hand-crafted features, e.g. local binary patterns, spatial coordinates, blobness,
tubularness, and Zernike features. Chen et al. [19] formulated a convex optimization problem to estimate the
displacements from randomly sampled image patches to the landmark locations. Chu et al. [20] combined random
Further author information: (Send correspondence to Junmo Kim)
Hansang Lee: (E-mail) hansanglee@kaist.ac.kr
Minseok Park: (E-mail) pms0209@kaist.ac.kr
Junmo Kim: (E-mail) junmo.kim@kaist.ac.kr

Medical Imaging 2017: Computer-Aided Diagnosis, edited by Samuel G. Armato III, Nicholas A. Petrick,
Proc. of SPIE Vol. 10134, 101341W · © 2017 SPIE · CCC code: 1605-7422/17/$18 · doi: 10.1117/12.2255870

Proc. of SPIE Vol. 10134 101341W-1

Downloaded From: http://proceedings.spiedigitallibrary.org/pdfaccess.ashx?url=/data/conferences/spiep/91848/ on 03/09/2017 Terms of Use: http://spiedigitallibrary.org/ss/term

No.  Description      No.  Description
1    Sella turcica    11   Lower incisal incision
2    Nasion           12   Upper incisal incision
3    Orbitale         13   Upper lip
4    Porion           14   Lower lip
5    Subspinale       15   Subnasale
6    Supramentale     16   Soft tissue pogonion
7    Pogonion         17   Posterior nasal spine
8    Menton           18   Anterior nasal spine
9    Gnathion         19   Articulare
10   Gonion

(a) Dental X-ray image (b) Description of cephalometric landmarks

Figure 1: Cephalometric landmarks in dental X-ray image (left) and their description (right).

forest regression and shape models to further correct the landmark locations. Lindner et al. [21, 22] used
random forest regression-voting to detect the landmarks automatically. In addition, Ibragimov et al. [23, 24]
applied game-theoretic concepts to a random forest detector with Haar-like features to determine the optimal
landmark locations.
In this work, we propose an end-to-end deep learning system for cephalometric landmark detection in dental
X-ray images, using convolutional neural networks (CNNs). CNNs have recently been applied to dental X-ray
images for teeth classification [25] with high accuracy. We apply a CNN to landmark detection in dental X-ray
images to examine whether it remains competitive on this problem, as it has been on other detection,
segmentation, and classification problems. To this end, we treat the x- and y-coordinates of all landmarks as
38 independent variables and construct multiple CNN-based regression systems that predict these coordinate
variables from input X-ray images. We train 38 regression systems with the same CNN structure, one for each
coordinate variable to be regressed. In experiments, the proposed system showed promising performance,
locating the cephalometric landmarks within a margin of the ground truths. To the best of our knowledge, this
is the first attempt to apply deep learning to the problem of cephalometric landmark detection in dental X-ray
images.

To detect 19 cephalometric landmarks in dental X-ray images, we propose a CNN-based landmark detection
system. The proposed approach is summarized in Fig. 2. We treat the x- and y-coordinates of the 19 landmarks
as 38 independent variables and reformulate landmark detection as a set of regression problems, one per
coordinate variable. To solve these problems, we construct multiple CNN-based regression systems, each of
which predicts a single coordinate variable from the input X-ray image. As a pre-processing step, we normalize
each coordinate variable by the length of its coordinate axis, i.e. the width or the height of the input image.
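The normalization step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `normalize_landmarks`, the interleaved ordering (x1, y1, x2, y2, ...), and the reading of 1935 as the image width and 2400 as the height are assumptions made for this example.

```python
def normalize_landmarks(landmarks, width, height):
    """Flatten (x, y) pixel landmarks into regression targets in [0, 1].

    x-coordinates are divided by the image width, y-coordinates by the
    image height. The interleaved output order is an assumption.
    """
    targets = []
    for x, y in landmarks:
        targets.append(x / width)
        targets.append(y / height)
    return targets


# Two hypothetical landmarks in a 1935 x 2400 image (assumed width x height).
landmarks = [(967.5, 1200.0), (1935.0, 0.0)]
targets = normalize_landmarks(landmarks, width=1935, height=2400)
print(targets)  # [0.5, 0.5, 1.0, 0.0]
```

With 19 landmarks per image, the same function would yield the 38 target variables, one for each regression system.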
For each normalized coordinate variable, a CNN-based regression system is designed as shown in Fig. 2. The
CNN has two convolutional layers, two max pooling layers, and one fully-connected layer. The input images are
of size 64 × 64, scaled down from the original images. The first convolutional layer consists of six feature
maps with a 5 × 5 kernel and is followed by 2 × 2 max pooling to reduce the size of the feature maps. The
second convolutional layer consists of twelve feature maps with a 5 × 5 kernel and is also followed by 2 × 2
max pooling to subsample the feature maps. Finally, a fully-connected layer computes a two-element output
vector of probabilities, in which the first value corresponds to the normalized coordinate variable of the
landmark. Each CNN regression system is trained on the training images and the corresponding coordinate
variable.
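To make the layer sizes concrete, the sketch below traces the feature-map dimensions through the described network. The text does not state the padding or stride, so the example assumes unpadded ("valid") convolutions with stride 1 and non-overlapping 2 × 2 pooling; the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (assumed: no padding, stride 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window):
    """Spatial output size of non-overlapping max pooling."""
    return size // window


def trace_shapes(input_size=64):
    """Return (layer kind, feature maps, height, width) after each layer."""
    layers = [("conv", 6, 5), ("pool", 6, 2), ("conv", 12, 5), ("pool", 12, 2)]
    size = input_size
    shapes = []
    for kind, channels, k in layers:
        size = conv_out(size, k) if kind == "conv" else pool_out(size, k)
        shapes.append((kind, channels, size, size))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)

# Number of values fed into the fully-connected layer under these assumptions.
_, channels, h, w = shapes[-1]
fc_inputs = channels * h * w
print(fc_inputs)  # 2028
```

Under these assumptions the 64 × 64 input shrinks to 60, 30, 26, and finally 13 pixels per side, so the fully-connected layer would receive 12 × 13 × 13 = 2028 values.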



[Figure 2 diagram; input label: training images & landmark coordinates]

Figure 2: An overview of the proposed detection system.

We construct the above CNN regression system for each of the 38 coordinate variables individually, with the
same structure and specification. After training each CNN regression system on its corresponding coordinate
variable, we apply it to the test images to obtain the output probability value for that coordinate variable.
We then multiply the output value by the test image size along the corresponding axis to compute the actual
coordinate value in the test image. Finally, we combine the 38 coordinate values obtained from the trained CNN
regression systems to generate the 19 landmarks by pairing the regressed coordinates.
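The rescaling and pairing of the regressed values can be sketched as below. The interleaved ordering of the 38 outputs (x1, y1, x2, y2, ...), the function name `pair_coordinates`, and the reading of 1935 × 2400 as width × height are assumptions made for illustration; the text only states that the outputs are multiplied by the image size and paired.

```python
def pair_coordinates(outputs, width, height):
    """Turn normalized regression outputs into (x, y) pixel landmarks.

    `outputs` is assumed to be ordered as x1, y1, x2, y2, ...
    """
    if len(outputs) % 2 != 0:
        raise ValueError("expected an even number of coordinate outputs")
    landmarks = []
    for i in range(0, len(outputs), 2):
        x = outputs[i] * width        # undo the normalization by width
        y = outputs[i + 1] * height   # undo the normalization by height
        landmarks.append((x, y))
    return landmarks


# Hypothetical outputs of the 38 regression systems for one test image.
outputs = [0.5, 0.25] * 19
landmarks = pair_coordinates(outputs, width=1935, height=2400)
print(len(landmarks), landmarks[0])  # 19 (967.5, 600.0)
```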


To validate the proposed approach for cephalometric landmark detection, we used the public database from
the Grand Challenges in Dental X-ray Image Analysis at ISBI 2015 [16]∗. In this public database, 300 dental
X-ray images were acquired with a Soredex CRANEX® Excel Ceph machine (Tuusula, Finland) and Soredex SorCom
software (3.1.3, version 2.0) [15, 16]. The image resolution was 1935 × 2400 pixels with a pixel size of
0.1 mm × 0.1 mm. For each image, 19 landmarks were manually marked by two experts, and the ground truths were
generated by averaging their landmark points. The database was then randomly partitioned into 150 training
images and 150 testing images.
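A random 150/150 partition of this kind can be reproduced in a few lines. The sketch below is only an illustration: the image identifiers 1 to 300, the seed value, and the function name `split_dataset` are hypothetical, and the actual split used in the experiments is not specified beyond being random.

```python
import random


def split_dataset(image_ids, n_train, seed=0):
    """Randomly split image identifiers into training and testing subsets."""
    ids = list(image_ids)
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    rng.shuffle(ids)
    return sorted(ids[:n_train]), sorted(ids[n_train:])


train_ids, test_ids = split_dataset(range(1, 301), n_train=150)
print(len(train_ids), len(test_ids))  # 150 150
```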
In experiments, the proposed detection system was implemented with LightNet [26]† and MatConvNet‡. We
trained the system on the 150 training images and tested it on the 150 testing images. Fig. 3 shows examples
of landmark detection results on the testing data. The detected landmarks do not match the ground truths
perfectly, but overall they are located within certain margins of the ground truths, without heavy outliers.
These qualitative results suggest that the proposed approach has promising potential given the simplicity of
the method. We also evaluated the average Euclidean distances between the predicted landmarks and the ground
truths. Fig. 4 shows a box plot of these distances for each landmark.
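The distance evaluation can be sketched as follows. This is an illustrative computation on made-up coordinates, not the evaluation code behind Fig. 4; converting pixel distances to millimetres with the 0.1 mm pixel size is an assumption about the reporting unit, and the function names are hypothetical.

```python
import math

PIXEL_SIZE_MM = 0.1  # pixel size stated for the database


def radial_errors_mm(predicted, ground_truth, pixel_size_mm=PIXEL_SIZE_MM):
    """Euclidean distance (in mm) between matching landmarks of one image."""
    if len(predicted) != len(ground_truth):
        raise ValueError("landmark lists must have the same length")
    return [
        math.hypot(px - gx, py - gy) * pixel_size_mm
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]


def mean_error_per_landmark(predictions, ground_truths):
    """Average the per-image distances separately for each landmark index."""
    per_image = [radial_errors_mm(p, g) for p, g in zip(predictions, ground_truths)]
    n_images = len(per_image)
    n_landmarks = len(per_image[0])
    return [sum(img[k] for img in per_image) / n_images for k in range(n_landmarks)]


# Toy data: two images with two landmarks each (pixel coordinates).
ground_truths = [[(100.0, 100.0), (500.0, 800.0)], [(120.0, 90.0), (510.0, 790.0)]]
predictions = [[(130.0, 140.0), (500.0, 800.0)], [(120.0, 100.0), (516.0, 798.0)]]
mean_errors = mean_error_per_landmark(predictions, ground_truths)
print([round(e, 2) for e in mean_errors])  # [3.0, 0.5]
```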





Figure 3: Examples of 19 landmarks on test X-ray images: landmarks predicted by the proposed system (red)
and ground truths (yellow).

Despite these promising results, the proposed approach is limited by relatively low detection accuracy, i.e.
noticeable localization margins from the ground truths. Two factors may explain this: (1) the input images
were scaled from 1935 × 2400 pixels to 64 × 64 pixels, so that small errors in the scaled images are magnified
when mapped back to the original size, and (2) the regression systems were trained without deep learning
techniques such as data augmentation, so that the trained systems were not fully robust to data variability.
To overcome these limitations, it is necessary (1) to extend the proposed regression system to a coarse-to-fine
framework with an additional step that corrects the locations of the initially detected landmarks, and (2) to
use appropriate techniques, including data augmentation, to make the regression system robust to the
variability of input images.
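The effect of the downscaling in point (1) can be quantified with simple arithmetic. The sketch below assumes that 1935 is the image width and 2400 the height, and uses the stated 0.1 mm pixel size; it shows how far a one-pixel error in the 64 × 64 input moves a landmark in the original image.

```python
ORIGINAL_WIDTH, ORIGINAL_HEIGHT = 1935, 2400  # assumed width x height
INPUT_SIZE = 64                               # side of the scaled CNN input
PIXEL_SIZE_MM = 0.1                           # stated pixel size

# How many original pixels one pixel of the scaled image covers per axis.
scale_x = ORIGINAL_WIDTH / INPUT_SIZE
scale_y = ORIGINAL_HEIGHT / INPUT_SIZE

# Physical size of a one-pixel error made at the 64 x 64 scale.
error_x_mm = scale_x * PIXEL_SIZE_MM
error_y_mm = scale_y * PIXEL_SIZE_MM

print(scale_x, scale_y)                            # 30.234375 37.5
print(round(error_x_mm, 2), round(error_y_mm, 2))  # 3.02 3.75
```

Under these assumptions, an error of a single pixel at the network's input resolution already corresponds to roughly 3 to 4 mm in the original image, which is consistent with the motivation for a coarse-to-fine refinement step.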


Figure 4: Boxplot of Euclidean distances between predicted landmarks and ground truths.

In this research, we proposed a landmark detection system for dental image analysis that constructs multiple
CNN-based regression systems, each independently predicting an individual coordinate value of the landmarks.
In experiments, the proposed system showed promising performance, locating the landmarks within a margin of
the ground truths. Deeper networks with larger input images would likely enhance the power of the proposed
system. To the best of our knowledge, this is the first attempt to construct an end-to-end learning framework
for the cephalometric landmark detection task, whereas conventional models usually combine random forest-based
regression methods with hand-crafted features and shape-based modeling. Future work will include further
improvement using deeper network structures and extension of our framework to other landmark detection

This work was supported in part by Samsung Advanced Institute of Technology (SAIT).

[1] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla,
A., Bernstein, M., Berg, A. C., and Fei-Fei, L., “ImageNet Large Scale Visual Recognition Challenge,”
International Journal of Computer Vision (IJCV) 115(3), 211–252 (2015).
[2] Fukui, A., Park, D. H., Yang, D., Rohrbach, A., Darrell, T., and Rohrbach, M., “Multimodal compact
bilinear pooling for visual question answering and visual grounding,” CoRR abs/1606.01847 (2016).
[3] Lee, H., Park, M., and Kim, J., “Plankton classification on imbalanced large scale database via
convolutional neural networks with transfer learning,” in [2016 IEEE International Conference on Image
Processing (ICIP)], 3713–3717 (Sept 2016).
[4] Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., Thorat, N., Viégas, F. B., Wattenberg,
M., Corrado, G., Hughes, M., and Dean, J., “Google’s multilingual neural machine translation system:
Enabling zero-shot translation,” CoRR abs/1611.04558 (2016).
[5] Park, M., Li, H., and Kim, J., “HARRISON: A benchmark on hashtag recommendation for real-world images
in social networks,” CoRR abs/1605.05054 (2016).
[6] Silver, D., Huang, A., Maddison, C. J., et al., “Mastering the game of Go with deep neural networks
and tree search,” Nature 529, 484–489 (Jan 2016).
[7] Yao, J., Wang, S., Zhu, X., and Huang, J., “Imaging biomarker discovery for lung cancer survival prediction,”
in [Proc. 19th International Conference on Medical Image Computing and Computer-Assisted Intervention
(MICCAI 2016)], MICCAI II, 649–657 (2016).
[8] Shin, H., Roberts, K., Lu, L., Demner-Fushman, D., Yao, J., and Summers, R. M., “Learning to read chest
x-rays: Recurrent neural cascade model for automated image annotation,” CoRR abs/1603.08486 (2016).

[9] Mao, Y. and Yin, Z., “A hierarchical convolutional neural network for mitosis detection in phase-contrast
microscopy images,” in [Proc. 19th International Conference on Medical Image Computing and Computer-
Assisted Intervention (MICCAI 2016)], MICCAI II, 685–692 (2016).
[10] Gulshan, V., Peng, L., Coram, M., et al., “Development and validation of a deep learning algorithm for
detection of diabetic retinopathy in retinal fundus photographs,” JAMA 316(22), 2402–2410 (2016).
[11] Dhungel, N., Carneiro, G., and Bradley, A. P., “The automated learning of deep features for breast mass
classification from mammograms,” in [Proc. 19th International Conference on Medical Image Computing
and Computer-Assisted Intervention (MICCAI 2016)], MICCAI II, 106–114 (2016).
[12] Zhang, Q., Bhalerao, A., Parsons, C., Helm, E., and Hutchinson, C., “Wavelet appearance pyramids for
landmark detection and pathology classification: Application to lumbar spinal stenosis,” in [Proc. 19th
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2016)],
MICCAI II, 274–282 (2016).
[13] Dou, Q., Chen, H., Jin, Y., Yu, L., Qin, J., and Heng, P., “3D deeply supervised network for automatic liver
segmentation from CT volumes,” CoRR abs/1607.00582 (2016).
[14] Fu, H., Xu, Y., Lin, S., Kee Wong, D. W., and Liu, J., “DeepVessel: Retinal vessel segmentation via deep
learning and conditional random field,” in [Proc. 19th International Conference on Medical Image Computing
and Computer-Assisted Intervention (MICCAI 2016)], MICCAI II, 132–139 (2016).
[15] Wang, C. W., Huang, C. T., Hsieh, M. C., et al., “Evaluation and comparison of anatomical landmark
detection methods for cephalometric x-ray images: A grand challenge,” IEEE Transactions on Medical
Imaging 34, 1890–1900 (Sept 2015).
[16] Wang, C.-W., Huang, C.-T., Lee, J.-H., et al., “A benchmark for comparison of dental radiography
analysis algorithms,” Medical Image Analysis 31, 63–76 (2016).
[17] Vandaele, R., Maree, R., Jodogne, S., and Geurts, P., “Automatic cephalometric x-ray landmark detection
challenge 2014: A tree-based approach,” in [Proc. ISBI Int. Symp. Biomed. Imag. 2014, Automat.
Cephalometric X-Ray Landmark Detection Challenge], 37–44 (2014).
[18] Mirzaalian, H. and Hamarneh, G., “Automatic globally-optimal pictorial structures with random decision
forest based likelihoods for cephalometric x-ray landmark detection,” in [Proc. ISBI Int. Symp. Biomed.
Imag. 2014, Automat. Cephalometric X-Ray Landmark Detection Challenge], 25–36 (2014).
[19] Chen, C. and Zheng, G., “Fully automatic landmark detection in cephalometric x-ray images by data-driven
image displacement estimation,” in [Proc. ISBI Int. Symp. Biomed. Imag. 2014, Automat. Cephalometric
X-Ray Landmark Detection Challenge], 17–24 (2014).
[20] Chu, C., Chen, C., Nolte, L.-P., and Zheng, G., “Fully automatic cephalometric x-ray landmark detection
using random forest regression and sparse shape composition,” in [Proc. ISBI Int. Symp. Biomed. Imag.
2014, Automat. Cephalometric X-Ray Landmark Detection Challenge], 9–16 (2014).
[21] Lindner, C. and Cootes, T., “Fully automatic cephalometric evaluation using random forest regression-
voting,” in [Proc. ISBI Int. Symp. Biomed. Imag. 2015, Automat. Cephalometric X-Ray Landmark Detection
Challenge], 1–8 (2015).
[22] Lindner, C., Wang, C.-W., Huang, C.-T., Li, C.-H., Chang, S.-W., and Cootes, T. F., “Fully automatic
system for accurate localisation and analysis of cephalometric landmarks in lateral cephalograms,” Scientific
Reports 6, 33581 (Sep 2016).
[23] Ibragimov, B., Likar, B., Pernus, F., and Vrtovec, T., “Automatic cephalometric x-ray landmark detection
by applying game theory and random forests,” in [Proc. ISBI Int. Symp. Biomed. Imag. 2014, Automat.
Cephalometric X-Ray Landmark Detection Challenge], 1–8 (2014).
[24] Ibragimov, B., Likar, B., Pernus, F., and Vrtovec, T., “Computerized cephalometry by game theory with
shape- and appearance-based landmark refinement,” in [Proc. ISBI Int. Symp. Biomed. Imag. 2015,
Automat. Cephalometric X-Ray Landmark Detection Challenge], 1–8 (2015).
[25] Miki, Y., Muramatsu, C., Hayashi, T., Zhou, X., Hara, T., Katsumata, A., and Fujita, H., “Classification of
teeth in cone-beam CT using deep convolutional neural network,” Computers in Biology and Medicine 80,
24–29 (2017).
[26] Ye, C., Zhao, C., Yang, Y., Fermüller, C., and Aloimonos, Y., “Lightnet: A versatile, standalone
MATLAB-based environment for deep learning,” in [Proceedings of the 2016 ACM on Multimedia Conference],
MM ’16, 1156–1159, ACM, New York, NY, USA (2016).
