Endometrium segmentation on transvaginal ultrasound image using key-point discriminator
Hyenok Park*, Hong Joo Lee*, Hak Gu Kim, and Yong Man Ro^a)
School of Electrical Engineering, KAIST, Daejeon 34141, Republic of Korea
Dongkuk Shin
Medical Image Development Group, R&D Center, Samsung Medison, Seongnam 13530, Republic of Korea
Sa Ra Lee
Department of Obstetrics and Gynecology, Ewha Womans University School of Medicine, Seoul 07985, Republic of Korea
Sung Hoon Kim
Department of Obstetrics and Gynecology, University of Ulsan College of Medicine, Asan Medical Center, Seoul 05505, Republic of Korea
Mikyung Kong
Department of Obstetrics and Gynecology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
(Received 25 November 2018; revised 6 June 2019; accepted for publication 6 June 2019;
published 31 July 2019)
Purpose: Transvaginal ultrasound imaging provides useful information for diagnosing endometrial
pathologies and reproductive health. Endometrium segmentation in transvaginal ultrasound (TVUS)
images is very challenging due to ambiguous boundaries and heterogeneous textures. In this study,
we developed a new segmentation framework which provides robust segmentation against ambiguous
boundaries and heterogeneous textures of TVUS images.
Methods: To achieve endometrium segmentation from TVUS images, we propose a new segmenta-
tion framework with a discriminator guided by four key points of the endometrium (namely, the
endometrium cavity tip, the internal os of the cervix, and the two thickest points between the two
basal layers on the anterior and posterior uterine walls). The key points of the endometrium are
defined as meaningful points that are related to the characteristics of the endometrial morphology,
namely the length and thickness of the endometrium. In the proposed segmentation framework, the
key-point discriminator distinguishes a predicted segmentation map from a ground-truth segmenta-
tion map according to the key-point maps. Meanwhile, the endometrium segmentation network pre-
dicts accurate segmentation results that the key-point discriminator cannot discriminate. In this
adversarial way, the key-point information containing endometrial morphology characteristics is
effectively incorporated in the segmentation network. The segmentation network can accurately find
the segmentation boundary while the key-point discriminator learns the shape distribution of the
endometrium. Moreover, the endometrium segmentation can be robust to the heterogeneous texture
of the endometrium. We conducted an experiment on a TVUS dataset that contained 3,372 sagittal
TVUS images and the corresponding key points. The dataset was collected from three hospitals (Ewha
Womans University School of Medicine, Asan Medical Center, and Yonsei University College of
Medicine) with the approval of each hospital's Institutional Review Board. For verification, five-
fold cross-validation was performed.
Results: The proposed key-point discriminator improved the performance of the endometrium seg-
mentation, achieving 82.67% for the Dice coefficient and 70.46% for the Jaccard coefficient. In
comparison, UNet showed 58.69% for the Dice coefficient and 41.59% for the Jaccard coefficient
on the TVUS images. The qualitative performance of the endometrium segmentation was also
improved over the conventional deep learning segmentation networks. Our experimental results indi-
cated robust segmentation by the proposed method on TVUS images with heterogeneous texture and
unclear boundary. In addition, the effect of the key-point discriminator was verified by an ablation
study.
Conclusion: We proposed a key-point discriminator to train a segmentation network for robust seg-
mentation of the endometrium with TVUS images. By utilizing the key-point information, the pro-
posed method showed more reliable and accurate segmentation performance and outperformed the
conventional segmentation networks both in qualitative and quantitative comparisons. © 2019 Ameri-
can Association of Physicists in Medicine [https://doi.org/10.1002/mp.13677]
Key words: adversarial learning, endometrial region, key-point-guided discriminator, medical image
segmentation, transvaginal ultrasound (TVUS) image
FIG. 1. TVUS images of the uterus. Note that the endometrium region is highlighted on the corresponding image for easier understanding. The key points define the length and thickness of the endometrium. For simple visualization, the key points are shown in a single channel.
FIG. 2. Overview of the proposed network for endometrium segmentation with the proposed discriminator guided by key-point maps. [Color figure can be viewed
at wileyonlinelibrary.com]
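The adversarial scheme outlined in Fig. 2 amounts to alternating updates of the discriminator and the segmentation network. Below is a minimal PyTorch sketch of one such training step, not the authors' implementation: `seg_net`, `disc`, the optimizers, and the adversarial loss weight are placeholders, and the spatial encoded feature fed to the discriminator is omitted for brevity.

```python
# Minimal sketch of one adversarial training step (PyTorch assumed).
# seg_net: segmentation network; disc: key-point discriminator.
# The 0.01 adversarial weight and the optimizers are assumptions.
import torch
import torch.nn.functional as F

def train_step(seg_net, disc, opt_seg, opt_disc, image, gt_mask, kp_maps):
    pred = seg_net(image)  # probability map of the endometrium, cf. Eq. (1)

    # Discriminator step: tell the ground-truth mask (real) from the
    # prediction (fake), each concatenated with the key-point maps.
    real = disc(torch.cat([gt_mask, kp_maps], dim=1))
    fake = disc(torch.cat([pred.detach(), kp_maps], dim=1))
    d_loss = (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
              + F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # Segmentation step: segmentation loss plus an adversarial term that
    # rewards predictions the discriminator cannot distinguish from real.
    fake = disc(torch.cat([pred, kp_maps], dim=1))
    g_loss = (F.binary_cross_entropy(pred, gt_mask)
              + 0.01 * F.binary_cross_entropy_with_logits(fake, torch.ones_like(fake)))
    opt_seg.zero_grad(); g_loss.backward(); opt_seg.step()
    return d_loss.item(), g_loss.item()
```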
−20° and 20°. In sum, data augmentation generates about 380,000 augmented training images in a fold.

2.B. Endometrium segmentation network

The endometrium segmentation network consists of an encoder, an ASPP module, and a decoder; its detailed structure is shown in Fig. 3.

The encoder is composed of five convolutional blocks. Max-pooling is applied after each convolutional block except for the last two convolutional blocks. Atrous convolution is applied with a rate of 2 at the fourth convolutional block to maintain the resolution of the feature for the thin, small endometrial region. Therefore, the size of the feature extracted from the encoder is one-eighth of that of the input image.

To deal with the various sizes of the endometrium, an ASPP module is incorporated on top of the encoded feature to consider multiscale information effectively.20,21 For the decoder part, the multiscale features from the ASPP module are concatenated and fed into a 1 × 1 convolutional layer. The multiscale information is adaptively aggregated through the 1 × 1 convolution layer. The aggregated feature is then upsampled two times and concatenated with the feature from the third convolutional block to recover the details. Finally, two convolutional layers are applied, and the feature is upsampled four times to obtain the predicted endometrium map, Ŷ. The input image I ∈ ℝ^{256×320×1} is fed to the endometrium segmentation network. The network predicts the probability map of the endometrium, Ŷ ∈ ℝ^{256×320×1}. It is written as

Ŷ = sigmoid(f(I | θ)),  (1)

where f and θ are the function of the endometrium segmentation network and the corresponding parameters, respectively. Table I summarizes the detailed structure of the endometrium segmentation network.

TABLE I. The detailed structure of the endometrium segmentation network.

Layer name             | Filter/stride/dilation rate                                       | Output shape
Input image            | Input size: 256×320×1                                             | -
Encoder Conv Block 1   | conv [3, 3]/1; conv [3, 3]/1; max pool [2, 2]/2                   | 128×160×64
Encoder Conv Block 2   | conv [3, 3]/1; conv [3, 3]/1; max pool [2, 2]/2                   | 64×80×128
Encoder Conv Block 3   | conv [3, 3]/1; conv [3, 3]/1; max pool [2, 2]/2                   | 32×40×256
Skip connection        | conv [1, 1]/1                                                     | 64×80×256
Encoder Conv Block 4   | conv [3, 3]/1/rate = 2; conv [3, 3]/1/rate = 2                    | 32×40×512
Encoder Conv Block 5   | conv [3, 3]/1; conv [3, 3]/1                                      | 32×40×512
ASPP module            | Concat[Conv [1, 1], Conv [3, 3]/1/rate = 8, Conv [3, 3]/1/rate = 12, Conv [3, 3]/1/rate = 24, Image pooling]; Conv [1, 1]/1 | 32×40×256
Decoder Conv Block 1   | Upsample by 2                                                     | 64×80×256
Decoder Conv Block 2_1 | Concat[Decoder Conv Block 1, Skip connection]; Conv [3, 3]/1; Conv [3, 3]/1 | 64×80×256
Decoder Conv Block 2_2 | Conv [3, 3]/1                                                     | 64×80×1
Decoder Conv Block 2_3 | Upsample by 4                                                     | 256×320×1
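As a concrete illustration of the ASPP module described above (a 1 × 1 branch, three atrous 3 × 3 branches with rates 8/12/24, an image-pooling branch, and a 1 × 1 aggregation convolution), the following is a minimal PyTorch sketch under our own naming, not the authors' code; channel sizes follow Table I.

```python
# Minimal PyTorch sketch of the ASPP module above: a 1x1 branch, three 3x3
# atrous branches (rates 8, 12, 24), an image-pooling branch, and a 1x1
# aggregation conv. Names are ours; channels follow Table I (512 -> 256).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    def __init__(self, in_ch=512, out_ch=256, rates=(8, 12, 24)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(in_ch, out_ch, 1)]
            + [nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates]
        )
        # Image pooling: global average pool -> 1x1 conv -> upsample back
        self.image_pool = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                        nn.Conv2d(in_ch, out_ch, 1))
        # 1x1 conv that adaptively aggregates the concatenated branches
        self.project = nn.Conv2d(out_ch * (len(rates) + 2), out_ch, 1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [F.relu(b(x)) for b in self.branches]
        pooled = F.interpolate(F.relu(self.image_pool(x)), size=(h, w),
                               mode="bilinear", align_corners=False)
        return F.relu(self.project(torch.cat(feats + [pooled], dim=1)))

# Example on a 32x40 encoder feature, as in Table I:
out = ASPP()(torch.randn(1, 512, 32, 40))  # -> (1, 256, 32, 40)
```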
FIG. 3. The structure of the proposed endometrium segmentation network. [Color figure can be viewed at wileyonlinelibrary.com]
FIG. 4. The structure of the proposed key-point discriminator. [Color figure can be viewed at wileyonlinelibrary.com]
The discriminator is guided by two constraints: the ground-truth endometrium key-point maps P and the spatial encoded feature f_conv3 from the encoder of the endometrium segmentation network. The ground-truth endometrium key-point maps P ∈ ℝ^{256×320×4} have four channels, as shown in Fig. 4. Each channel corresponds to one key point of the endometrium, which lies on the boundary of the endometrium region and defines the length and thickness of the endometrium. As shown in Fig. 4, we put a Gaussian blob on each point. The Gaussian blob is defined as follows. Let (x_i, y_i) denote the coordinates of each key point, where i ∈ {1, 2, 3, 4}. Let σ denote the standard deviation of the Gaussian blob in pixel units. The intensity of the Gaussian blob at a certain point (x, y) is written as

f(x, y) = exp(−((x − x_i)² + (y − y_i)²) / (2σ²)).  (2)

The intensity indicates the probability of each key point. It is more effective to provide localization information of the key points using Gaussian blobs rather than fixed coordinates, as shown in Fig. 5, because this allows for boundary ambiguity in the training phase. The ground-truth key-point maps have four channels, which is the number of key points of the endometrium.

For guiding the discriminator with the key points, the ground-truth key-point maps are concatenated with the input endometrium segmented map, which can be either the predicted or the ground-truth region of the endometrium. Furthermore, the spatial encoded feature from the third convolutional layer, f_conv3, whose spatial resolution is one-fourth of the original input resolution, is fed into the discriminator. The spatial encoded feature helps the discriminator determine whether the predicted endometrium region corresponds to the input ultrasound image. Table II shows the detailed structure of the key-point discriminator.

TABLE II. The detailed structure of the key-point discriminator.

Layer name                 | Filter/stride                                                      | Output shape
Input                      | Input size: 256×320×1                                              |
Discriminator Conv block 1 | Concat[Endometrium segmented map, GT endometrium key-point map]; conv [3, 3]/1; conv [3, 3]/1; conv [3, 3]/2 | 128×160×64
Discriminator Conv block 2 | conv [3, 3]/1; conv [3, 3]/1; conv [3, 3]/2                        | 64×80×128
Discriminator Conv block 3 | Concat[Discriminator Conv block 2, Spatial encoded feature]; conv [3, 3]/1; conv [3, 3]/1; conv [3, 3]/2 | 32×40×256
Discriminator Conv block 4 | conv [3, 3]/1; conv [3, 3]/1; conv [3, 3]/2                        | 16×20×512
Discriminator Conv block 5 | conv [3, 3]/1; conv [3, 3]/1                                       | 16×20×512
FC                         | Global Average Pooling; Fully Connected Layer                      | 1

FIG. 5. Example of a key point (left) and the corresponding Gaussian blob (right) (σ = 2).
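A short sketch of how the four-channel ground-truth key-point maps of Eq. (2) can be generated, assuming NumPy; the function name and the example coordinates are ours, not the paper's.

```python
# Sketch of building the four-channel ground-truth key-point maps of Eq. (2):
# one Gaussian blob per key point. Names and example coordinates are ours.
import numpy as np

def keypoint_maps(points, height=256, width=320, sigma=2.0):
    """points: four (x, y) key-point coordinates -> (H, W, 4) map."""
    ys, xs = np.mgrid[0:height, 0:width]
    maps = np.zeros((height, width, len(points)), dtype=np.float32)
    for i, (xi, yi) in enumerate(points):
        # Eq. (2): exp(-((x - xi)^2 + (y - yi)^2) / (2 * sigma^2))
        maps[..., i] = np.exp(-((xs - xi) ** 2 + (ys - yi) ** 2)
                              / (2.0 * sigma ** 2))
    return maps

# Example: cavity tip, internal os, and the two thickest points
# (coordinates are placeholders, not annotated values):
P = keypoint_maps([(60, 120), (250, 140), (150, 90), (150, 170)], sigma=2.0)
```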
Training the endometrium segmentation network with the key-point discriminator improved the performance to 82.67% for the Dice coefficient and 70.46% for the Jaccard coefficient. This shows that key-point information is useful for detailed segmentation of the endometrium.

TABLE IV. Effect of the key-point map in the discriminator.

Training way for endometrium segmentation network             | Dice coefficient | Jaccard coefficient
Without discriminator                                          | 82.38            | 70.04
With discriminator excluding key-point map                     | 82.41            | 70.08
With discriminator including key-point maps (proposed method)  | 82.67            | 70.46

Bold values indicate the effect of the key-point map in the discriminator.

3.C. Statistical analysis

To estimate the statistical significance of the difference in the reported results, we conducted a paired t-test. Table V shows the results of the paired t-test, which estimates the statistical significance of using key-point maps in the discriminator. The mean difference and 95% confidence interval (95% CI) are reported for the Dice coefficient in percentage units. In the experiment, we calculated the Dice coefficient for each image on all five test folds. Therefore, the sample size was 3356 images, which is the same as the size of the entire dataset. We used this per-image Dice coefficient to estimate the statistical significance of the difference between the results.

First, we estimated the statistical significance of using the proposed discriminator with key-point maps compared to the method without the discriminator. As shown in the second row, fifth column of Table V, the calculated P-value is less than 0.05 (P-value = 0.0132), which means that the arithmetic mean of the differences is statistically significant. Therefore, the proposed discriminator including key-point maps improves the performance by a statistically significant difference compared to the method that does not use the discriminator.

TABLE V. Statistical significance of using the key-point maps in the discriminator, estimated with a paired t-test. The mean difference, its standard error (SE), the 95% confidence interval (CI) of the differences, and the corresponding P-value are reported.

Comparison                                                     | Baseline              | Mean difference | SE      | 95% CI            | P-value
With discriminator including key-point maps (proposed method)  | Without discriminator | 0.2278          | 0.09184 | [0.0477, 0.4078]  | 0.0132
With discriminator excluding key-point maps                    | Without discriminator | 0.1176          | 0.08916 | [−0.0572, 0.2925] | 0.1871
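For reproducing this analysis, the paired t-test on per-image Dice coefficients can be run with SciPy as sketched below; the score arrays here are random placeholders standing in for the 3356 per-image results, not the paper's data.

```python
# Sketch of the paired t-test on per-image Dice coefficients (SciPy assumed).
# The arrays below are random placeholders for the 3356 per-image scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
dice_with_kp = rng.normal(82.67, 10.0, size=3356)   # placeholder scores [%]
dice_without = rng.normal(82.38, 10.0, size=3356)   # placeholder scores [%]

t_stat, p_value = stats.ttest_rel(dice_with_kp, dice_without)
diff = dice_with_kp - dice_without
se = stats.sem(diff)                                # SE of the mean difference
ci = stats.t.interval(0.95, len(diff) - 1, loc=diff.mean(), scale=se)
print(f"mean diff = {diff.mean():.4f}, 95% CI = {ci}, P = {p_value:.4f}")
```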
FIG. 6. Qualitative performance evaluation of the endometrium segmentation. Note that the highlighted region on the image is the endometrial region. (a)-(c) are cases in which UNet and FCN8s fail. (d) and (e) are cases in which the TVUS images have heterogeneous texture and an unclear boundary.
FIG. 7. Qualitative effect of the proposed discriminator. (a)-(c) are cases in which the TVUS images have heterogeneous texture. (d)-(f) are cases in which the key-point discriminator helps to predict the endometrium region. (g)-(i) are cases in which the TVUS images have an unclear boundary.
FIG. 9. Examples of key-point maps with various standard deviations (σ). Note that the key-point maps are shown in one channel for simple visualization.
The key points are supposed to be placed on the endometrium boundary in the training TVUS dataset. However, to allow tolerance in the position of these points in the training phase, we used a Gaussian blob on each key-point map. Fig. 9 shows examples of key-point maps with various standard deviations. As shown in Fig. 9, a small Gaussian blob provides more exact localization to the discriminator, whereas a bigger Gaussian blob allows more tolerance in the position of the points.

To investigate the effect of the size of the Gaussian blob on the performance of the proposed method, we performed experiments with different sizes of the blobs (i.e., different standard deviations of the Gaussian blobs, σ). Table VI shows the performance of the proposed method with different values of σ on a single fold. As seen in Table VI, the segmentation performance with σ = 2 is better than that with other sizes of Gaussian blobs. Compared to a method without a key-point map (Dice coefficient = 82.35%), the method with a key-point map improved the performance of the endometrium segmentation network. Moreover, allowing tolerance in the position of the points further improved the performance.

TABLE VI. The performance of the proposed method according to the standard deviation of the Gaussian blob.

Method          | Standard deviation of Gaussian blob σ [pixel] | Dice coefficient [%]
Proposed method | 1                                             | 83.51
                | 2                                             | 83.64
                | 3                                             | 83.45
                | 4                                             | 83.36

To verify the effectiveness of the proposed method, we compared it with other methods. As described in Section 3.B., the proposed network achieved 82.67% in Dice coefficient and 70.46% in Jaccard coefficient. The results show that it outperforms FCN8s by 4.28% and 5.99% in Dice and Jaccard coefficients, respectively. Also, our network outperformed UNet by 23.98% and 28.87% in Dice and Jaccard coefficients, respectively.

To assess whether the proposed method is statistically superior, we conducted a paired t-test. As described in Section 3.C., we compared the performance with the key-point discriminator and the performance without it. The experimental results showed a significant performance difference between the two methods, with a P-value of 0.0132 < 0.05. Therefore, the proposed discriminator that includes key-point maps improves the performance by a statistically significant margin compared to the method without the discriminator.

For inference, we used only the endometrium segmentation network. Inferring the endometrium segmentation on one TVUS image took 0.0114 seconds with an NVIDIA GeForce GTX 1080 Ti.

The strength of our proposed method is that the endometrium segmentation network precisely predicts the endometrium region thanks to the key-point discriminator used in the training phase. The proposed network is useful even on TVUS images that have an ambiguous boundary or a heterogeneous texture. The discriminator is required only to train the endometrium segmentation network.
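For reference, the Dice and Jaccard coefficients reported throughout can be computed from binary masks as in the following sketch, assuming NumPy; the helper name is ours, not the paper's.

```python
# Sketch of the evaluation metrics used above (NumPy assumed): per-image Dice
# and Jaccard coefficients, in percent, for binary masks of equal shape.
import numpy as np

def dice_jaccard(pred, gt, eps=1e-8):
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = 2.0 * inter / (pred.sum() + gt.sum() + eps)
    jaccard = inter / (np.logical_or(pred, gt).sum() + eps)
    return 100.0 * dice, 100.0 * jaccard

# Example with a thresholded probability map (cf. Eq. 1) and a ground truth:
y_hat = np.random.rand(256, 320) > 0.5   # placeholder prediction
y_gt = np.random.rand(256, 320) > 0.5    # placeholder ground truth
print(dice_jaccard(y_hat, y_gt))
```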
5. CONCLUSIONS

We propose a key-point discriminator to train the segmentation network for more accurate and reliable segmentation of the endometrium on sagittal TVUS images of the uterus. We design the endometrium segmentation network with atrous convolution and an ASPP module to deal with the various sizes of the endometrium. To utilize key-point information efficiently, we train the endometrium segmentation network with the proposed key-point discriminator in an adversarial way. The key-point discriminator is guided by the key-point maps and the spatial encoded feature so that it strictly distinguishes predicted maps from the ground truth. By utilizing key-point information, the proposed method shows more reliable and accurate segmentation performance and outperforms the conventional segmentation networks in both qualitative and quantitative comparisons.

ACKNOWLEDGMENT

This work was supported partly by the Industrial R&D Project (G01170441), and partly by the ICT R&D Program of MSIT/IITP (2017-0-01779, A Machine Learning and Statistical Inference Framework for Explainable Artificial Intelligence).

*Both authors contributed equally to this paper.
^a)Author to whom correspondence should be addressed. Electronic mail: ymro@ee.kaist.ac.kr; Telephone: (+82) 42-350-8094; Fax: (+82) 42-350-7619.

REFERENCES

1. Koyama T, Tamai K, Togashi K. Staging of carcinoma of the uterine cervix and endometrium. Eur Radiol. 2007;17:2009–2019.
2. Widrich T. Role of ultrasonography in infertility. In: Office-Based Infertility Practice. Springer: Berlin, Germany; 2002:39–48.
3. Gong X-H, Lu J, Liu J, et al. Segmentation of uterus using laparoscopic ultrasound by an image-based active contour approach for guiding gynecological diagnosis and surgery. PLoS ONE. 2015;10:e0141046.
4. Machtinger R, Korach J, Padoa A, et al. Transvaginal ultrasound and diagnostic hysteroscopy as a predictor of endometrial polyps: risk factors for premalignancy and malignancy. Int J Gynecol Cancer. 2005;15:325–328.
5. Gonen Y, Casper RF, Jacobson W, Blankier J. Endometrial thickness and growth during ovarian stimulation: a possible predictor of implantation in in vitro fertilization. Fertil Steril. 1989;52:446–450.
6. Alam V, Bernardini L, Gonzales J, Asch RH, Balmaceda JP. A prospective study of echographic endometrial characteristics and pregnancy rates during hormonal replacement cycles. J Assist Reprod Genet. 1993;10:215–219.
7. Israel R, Isaacs JD, Wells CS, et al. Endometrial thickness is a valid monitoring parameter in cycles of ovulation induction with menotropins alone. Fertil Steril. 1996;65:262–266.
8. Weissman A, Gotlieb L, Casper RF. The detrimental effect of increased endometrial thickness on implantation and pregnancy rates and outcome in an in vitro fertilization program. Fertil Steril. 1999;71:147–149.
9. Levine DJ, Berman JM, Harris M, Chudnoff SG, Whaley FS, Palmer SL. Sensitivity of myoma imaging using laparoscopic ultrasound compared with magnetic resonance imaging and transvaginal ultrasound. J Minim Invasive Gynecol. 2013;20:770–774.
10. Kim ST, Lee J-H, Lee H, Ro YM. Visually interpretable deep network for diagnosis of breast masses on mammograms. Phys Med Biol. 2018;63:235025.
11. Kim DH, Kim ST, Chang JM, Ro YM. Latent feature representation with depth directional long-term recurrent learning for breast masses in digital breast tomosynthesis. Phys Med Biol. 2017;62:1009.
12. Kim ST, Kim DH, Ro YM. Detection of masses in digital breast tomosynthesis using complementary information of simulated projection. Med Phys. 2015;42:7043–7058.
13. Kim DH, Kim ST, Ro YM. Improving mass detection using combined feature representations from projection views and reconstructed volume of DBT and boosting based classification with feature selection. Phys Med Biol. 2015;60:8809.
14. Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. Paper presented at: IEEE Conference on Computer Vision and Pattern Recognition; 2015.
15. Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. Paper presented at: International Conference on Medical Image Computing and Computer-Assisted Intervention; 2015.
16. Drozdzal M, Vorontsov E, Chartrand G, Kadoury S, Pal C. The importance of skip connections in biomedical image segmentation. In: Deep Learning and Data Labeling for Medical Applications. Springer: Berlin, Germany; 2016:179–187.
17. Badrinarayanan V, Kendall A, Cipolla R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. 2017;39:2481–2495.
18. Noh H, Hong S, Han B. Learning deconvolution network for semantic segmentation. Paper presented at: IEEE International Conference on Computer Vision; 2015.
19. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans Pattern Anal Mach Intell. 2018;40:834–848.
20. Chen L-C, Papandreou G, Schroff F, Adam H. Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587; 2017.
21. Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H. Encoder-decoder with atrous separable convolution for semantic image segmentation. Paper presented at: European Conference on Computer Vision; 2018.
22. Sahiner B, Pezeshk A, Hadjiiski LM, et al. Deep learning in medical imaging and radiation therapy. Med Phys. 2019;46:e1–e36.
23. Dalmış MU, Litjens G, Holland K, et al. Using deep learning to segment breast and fibroglandular tissue in MRI volumes. Med Phys. 2017;44:533–546.
24. Jin C, Shi F, Xiang D, Zhang L, Chen X. Fast segmentation of kidney components using random forests and ferns. Med Phys. 2017;44:6353–6363.
25. Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. Paper presented at: Neural Information Processing Systems; 2014.
26. Pfau D, Vinyals O. Connecting generative adversarial networks and actor-critic methods. Paper presented at: Neural Information Processing Systems Workshop; 2016.
27. Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions. Paper presented at: International Conference on Learning Representations; 2016.
28. Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. Paper presented at: International Conference on Learning Representations; 2015.
29. Milletari F, Navab N, Ahmadi S-A. V-Net: fully convolutional neural networks for volumetric medical image segmentation. Paper presented at: International Conference on 3D Vision; 2016.
30. Kingma DP, Ba J. Adam: a method for stochastic optimization. Paper presented at: International Conference on Learning Representations; 2015.
31. Pezeshk A, Hamidian S, Petrick N, Sahiner B. 3D convolutional neural networks for automatic detection of pulmonary nodules in chest CT. IEEE J Biomed Health Inform. 2018;1–1.