PII: S0169-2607(17)31236-1
DOI: 10.1016/j.cmpb.2018.01.009
Reference: COMM 4596
Please cite this article as: Alvaro David Orjuela-Cañón, Jorge Eliécer Camargo Mendoza,
Carlos Enrique Awad García, Erika Paola Vergara Vela, Tuberculosis diagnosis support analysis for
precarious health information systems, Computer Methods and Programs in Biomedicine (2018), doi:
10.1016/j.cmpb.2018.01.009
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Highlights
Title Page
Jorge Eliécer Camargo Mendoza
Systems Engineering Faculty,
Universidad Antonio Nariño,
Bogota D.C., Colombia
*Correspondence Details:
Address: Carrera 3 Este No. 47A – 15 Bloque 4 Piso 1, Bogota D.C., Colombia
Abstract:
Background and objective: The World Health Organization considers pulmonary tuberculosis a global
emergency. New techniques and diagnostic tools are important in fighting this bacterial infection.
There have been many advances in these fields, but in developing countries such as Colombia,
where resources and infrastructure are limited, fast and less expensive strategies are
increasingly needed. Artificial neural networks are computational intelligence techniques that can
be applied to this kind of problem and offer additional support in the tuberculosis diagnosis process,
giving medical staff a tool for deciding how to manage subjects suspected of having tuberculosis.
Materials and Methods: This study used a database of 105 subjects suspected of having pulmonary
tuberculosis, for whom only precarious information was available. The data comprised sex, age,
diabetes, homelessness and AIDS status, plus a variable encoding the clinical knowledge of the
medical personnel. Models based on artificial neural networks were used: supervised learning to
detect the disease, and unsupervised learning to create three risk groups from the available
information.
Results: The results are comparable with those of traditional techniques for detecting tuberculosis,
with the advantages of speed and low implementation cost. A sensitivity of 97% and a specificity
of 71% were achieved.
Conclusions: The techniques yielded valuable information that can be useful in decision-making for
physicians who treat the disease, especially under limited
1. Introduction
According to the World Health Organization (WHO), tuberculosis (TB) is a major global health
problem. About 10.4 million new cases were reported in 2015, with 1.4 million associated deaths [1].
In the same year, Colombia had a TB incidence of 24.2 cases per 100,000 inhabitants: 12,978 reported
cases, of which 90.2% were new. The regions with the highest incidence were Amazonas and Chocó,
with 72.1 and 45.4 cases per 100,000 inhabitants, respectively [2].
It is known that in both regions, as in other developing countries where the incidence of TB is
high [1], detection remains difficult. There, the hospital and medical infrastructure used to diagnose
TB is limited, and sophisticated laboratories that meet the minimal requirements for diagnosing TB
are not available. According to [2], 73.9% of the cases were confirmed by laboratory, which makes
better resources necessary to improve the detection and treatment of TB in primary health care.
Under these challenging conditions, new alternatives for TB diagnosis are needed that reduce cost
and time, especially in locations with difficulties to
Computational intelligence techniques are based on models inspired by biology. For instance,
fuzzy logic techniques bring human semantics into computing processes, genetic algorithms use
evolution theory to solve optimization problems, and artificial neural networks (ANN) use
connectionist models that emulate the behavior of the brain [3].
ANN have demonstrated their ability to find solutions to health problems, supporting
physicians in medical diagnosis tasks. Studies include ANN with supervised learning, where the
objective is to detect the disease by establishing a nonlinear mapping between the input information
and the output (TB positive or negative) for which the network was trained. There is also an
unsupervised learning approach, in which the network recognizes similarities in the input data and
arranges them into groups, which can be seen as risk
Considering just the TB diagnosis field, many different applications have been reported
around the world. One of the first models was proposed by El-Solh in the USA [6]. For that study, the
authors collected extensive information from each subject, including clinical variables, laboratory
tests and X-ray images of the thorax. The results reached 100% sensitivity and 69% specificity. In
Latin America, Brazil leads the use of these approaches, and different efforts there have contributed
to diminishing pulmonary TB incidence. Proposals with different input information have been
presented, with sensitivity values above 80% and reaching 100% in some cases. For specificity, results
have been less satisfactory, with values dropping to 40% in the worst case [7–11]. All those results
differ according to the information available in each study. Some studies used just a couple of
variables, and very few used all the medical data of the patients. It also depends on the
Around the world there are some examples of the use of computational intelligence models,
especially in developing countries [12–16]. Unsupervised methods have been used to train models that
support the diagnosis process; the most representative works are reported in [10, 17–19], of which the
last two address pleural TB. Very few studies have been conducted with low-quality data or with small
datasets. This is a common problem in Colombia, where remote places lack the infrastructure or
information systems needed to collect enough information of the expected quality.
ANN models can be trained in a supervised way, where an input-output relation is learned to
extract complex nonlinear patterns from the dataset. Multilayer perceptrons (MLP) are models composed
of units known as neurons, arranged in a neural network with feedforward connections. Unsupervised
ANN models can be implemented using Self-Organizing Maps (SOM), which were introduced by
Kohonen and Somervuo, taking as inspiration the functionality of the visual, aural and sensory
areas of the brain [5]. A SOM is widely used to visualize relations between variables in large volumes
of data: the architecture represents the data in a new, nonlinear and reduced space, typically of two
dimensions (such as a map). This new space preserves characteristics of the input space while
highlighting patterns that cannot be easily seen there.
This paper presents a computational intelligence method, based on ANN, as a tool to support the
TB diagnosis process where resources for medical systems and specialized infrastructure are precarious
or nonexistent, and hostile environments do not allow enough data to be collected. In the present
case, information on sex, age, diabetes status, homelessness and immunodeficiency, combined with
clinical knowledge, was used to train the computational intelligence models. Two approaches were used:
one based on supervised learning, which helps to detect the disease, and the other one based on
a) Database
The database was obtained from the TB Program at Hospital Santa Clara (HSC) in Bogotá D.C., Colombia.
The HSC is a public institution whose patients have a low socioeconomic status. This population is
vulnerable and belongs to an area of the city with severe overcrowding and a risk of sexually
transmitted diseases. Commonly, these patients seek medical consultation late and most of them
Information from people suspected of having pulmonary tuberculosis in the period from January 2008
to March 2011 was considered. The Ethics and Research Committee of the HSC approved this study.
Informed consent was not needed because all data were obtained retrospectively and anonymously.
Only data from subjects with a confirmed diagnosis were considered (using culture and individuals who
finished the anti-TB therapy). In the end, information from 105 subjects was used: 83 subjects (79%)
with confirmed TB and 22 subjects (21%) who were determined not to have the disease by diagnosis of
exclusion. An empirical treatment was started with these subjects. Individuals with confirmed TB
Confirmation of the TB cases was achieved using a culture test. For TB negative cases, the culture
test was not positive, another disease was found during the treatment, and as mentioned, a
The information considered was extracted from subjects with suspicion of TB. Medical personnel
performed a first examination of signs and symptoms and determined a clinical suspicion diagnosis.
This was represented in an input variable named “Clinical information”, which takes the value “1”
when only the medical report was considered, and “0” when another test result or additional
information led the subject to start the treatment.
The other variables were sex, age, homelessness, diabetes status, and HIV
(human immunodeficiency virus)/AIDS (acquired immunodeficiency syndrome) status. The last was
determined from clinical suggestion and confirmed with exams, but without complementary
information such as CD4 cell count or viral load. All variables were coded with zeros and ones
for negative and positive, respectively (Table 1). The age variable was kept numeric, with
its original information, and normalized by its maximum value. This procedure
was applied to avoid saturation of values in the synaptic weights of the network and to avoid a wrong
Variable    Value     Quantity
Sex         Male      72 (69%)
            Female    33 (31%)
HIV/AIDS    Yes       28 (27%)
            No        77 (73%)
Homeless    Yes       33 (27%)
            No        72 (72%)
Diabetes    Yes       2 (2%)
            No        103 (98%)
To estimate the statistical error and the generalization of the models on this dataset,
a cross-validation technique was employed [4]. The dataset was divided into three sets,
ensuring that data from people with and without the disease were equitably distributed. This
assesses the generalization of the trained model by preserving a portion of the data that was not
used in training.
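The three-way split described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB used in the study; the function name `stratified_folds`, the random seed and the round-robin assignment are assumptions, since the source does not say how subjects were assigned to the sets.

```python
import random

def stratified_folds(labels, n_folds=3, seed=0):
    """Split sample indices into n_folds sets with a similar class ratio in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the indices of this class round-robin across the sets.
        for k, i in enumerate(idx):
            folds[k % n_folds].append(i)
    return folds

# Class sizes taken from the text: 83 TB-positive (+1) and 22 TB-negative (-1) subjects.
labels = [1] * 83 + [-1] * 22
folds = stratified_folds(labels)

# Each set serves once as the held-out set; the other two are used for training.
for k, held_out in enumerate(folds):
    train = [i for j, f in enumerate(folds) if j != k for i in f]
    n_pos = sum(1 for i in held_out if labels[i] == 1)
    print(f"set {k + 1}: {len(held_out)} held out ({n_pos} TB+), {len(train)} for training")
```

Dealing each class separately is what keeps the proportion of TB-negative subjects roughly constant across the sets, which matters here because only 22 of the 105 subjects are negative.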
b) Multilayer Perceptrons
Commonly, one hidden layer and one output layer are enough to solve classification problems [4]. The
ANN used in this work had an input layer composed of seven units, one for each variable, and an
output layer composed of just one neuron. Target values of +1 and -1 represented whether the input
data corresponded to a patient with TB or not, respectively. The number of neurons in the hidden
layer was established experimentally, testing from two to ten neurons. All neurons used the
hyperbolic tangent as activation function.
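The architecture can be illustrated with a forward pass in plain Python. This is only a sketch under stated assumptions: the weights are random rather than trained, six hidden neurons is just one value from the tested range, and the input ordering, the example vector and the decision threshold at 0 are hypothetical.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Small random weights; the last weight of each neuron acts as its bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def layer_forward(weights, inputs):
    """Weighted sum plus bias, followed by the hyperbolic tangent activation."""
    return [math.tanh(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]

def mlp_forward(hidden_w, output_w, x):
    """Forward pass: 7 inputs -> hidden tanh layer -> single tanh output in (-1, 1)."""
    return layer_forward(output_w, layer_forward(hidden_w, x))[0]

rng = random.Random(0)
n_inputs, n_hidden = 7, 6            # the text tests from two to ten hidden neurons
hidden_w = init_layer(n_inputs, n_hidden, rng)
output_w = init_layer(n_hidden, 1, rng)

# Hypothetical coded patient; the ordering of the seven inputs is an assumption.
x = [1, 0, 0.45, 1, 0, 0, 0]
y = mlp_forward(hidden_w, output_w, x)
label = "TB" if y >= 0 else "no TB"  # assumed threshold between the +1 and -1 targets
print(round(y, 3), label)
```

Because the output neuron also uses the hyperbolic tangent, its value always lies between the two target values, so a threshold midway between them is a natural (though assumed) decision rule.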
Among the different algorithms for training the ANN, resilient backpropagation was used because of
its speed and low computational cost [20, 21]. A cross-validation strategy was also applied in
training. In each case, training was performed with two sets (see Table 2) and validation results were
computed with the left-out set, maximizing the classification rate between TB and no disease. To
avoid overfitting, an early stopping procedure was implemented. The performance of the obtained
models was evaluated using sensitivity, specificity, classification rate, positive prediction, and
negative prediction measures. The different trainings were performed employing MATLAB 2016a
(The MathWorks, Inc, Natick, MA)
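The five performance measures can be computed from the confusion counts as in the sketch below. The function name and the example labels are illustrative assumptions and are not the study's data; the sketch also assumes that every true class and every predicted class occurs at least once, so no denominator is zero.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute the five measures from +1 (TB) / -1 (no TB) labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)    # true positives
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)  # true negatives
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)   # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "classification_rate": (tp + tn) / len(pairs),
        "positive_prediction": tp / (tp + fp),
        "negative_prediction": tn / (tn + fn),
    }

# Made-up labels for illustration only; these are not the study's results.
y_true = [1, 1, 1, 1, -1, -1, -1, 1]
y_pred = [1, 1, 1, -1, -1, 1, -1, 1]
m = diagnostic_metrics(y_true, y_pred)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```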
As an additional contribution, a relevance analysis was conducted on the model with the best
result [23]. The relevance was computed by replacing the original value of an input with its mean
value. The procedure was repeated for all the input variables, making it possible to assess the
importance of each variable for
Figure 1. MLP model used. The number of neurons in the hidden layer is obtained experimentally.
Source: Authors.
The SOM was trained in an unsupervised manner in a process of three steps: a competitive step,
then a cooperative one, and finally an adaptive one. In the competitive step, an input vector with
the information of each patient, coded as explained before, was presented to the map and compared
with the synaptic weights of each neuron to choose the best matching unit (BMU). In the cooperative
step, a radius of neurons around the BMU was selected, and in the adaptive step the synaptic weights
of the selected neurons were changed.
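The three steps can be sketched for a single input vector as follows. This is a simplified illustration, not the MATLAB implementation used in the study: the Gaussian neighborhood, the learning rate, the rectangular lattice and the example vector are all assumptions made to keep the sketch short.

```python
import math
import random

def train_step(weights, grid, x, lr, radius):
    """One SOM update: competitive, cooperative and adaptive steps for input x."""
    # Competitive step: the best matching unit (BMU) has the closest weight vector.
    dists = [sum((w - v) ** 2 for w, v in zip(wv, x)) for wv in weights]
    bmu = dists.index(min(dists))
    for j, wv in enumerate(weights):
        # Cooperative step: Gaussian neighborhood around the BMU, measured on the map.
        d2 = sum((a - b) ** 2 for a, b in zip(grid[j], grid[bmu]))
        h = math.exp(-d2 / (2 * radius ** 2))
        # Adaptive step: move the weights toward the input, scaled by the neighborhood.
        weights[j] = [w + lr * h * (v - w) for w, v in zip(wv, x)]
    return bmu

rng = random.Random(0)
rows, cols, dim = 3, 4, 7
grid = [(r, c) for r in range(rows) for c in range(cols)]  # rectangular lattice for brevity
weights = [[rng.random() for _ in range(dim)] for _ in grid]

x = [1, 0, 0.45, 1, 0, 0, 0]  # hypothetical coded patient vector
before = min(sum((w - v) ** 2 for w, v in zip(wv, x)) for wv in weights)
bmu = train_step(weights, grid, x, lr=0.5, radius=1.0)
after = sum((w - v) ** 2 for w, v in zip(weights[bmu], x))
print(bmu, before, after)
```

In a full training run this step would be repeated over all patients for several epochs, usually while shrinking both the learning rate and the radius.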
Before training, the SOM architecture must be fixed. For the cooperative step, it is necessary to
provide parameters such as the number of neurons, the size of the map, the type of lattice, and the
neighborhood function. All these parameters follow experimental rules with some initial information.
The length and width of the map were derived from the analysis of the inertias in a Multiple
Correspondence Analysis (MCA), where the ratio of the first two inertias equals the ratio of the width
and height of the map. The initial value of these dimensions is decreased until most of the neurons
are activated by the data. Finally, a hexagonal lattice topology was chosen because the distance
between the centers of neighboring neurons is the same for all of them.
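The equal-distance property of the hexagonal lattice can be checked numerically. The sketch below assumes unit spacing and the common layout in which odd rows are shifted by half a cell; the helper name `hex_positions` is hypothetical.

```python
import math

def hex_positions(rows, cols):
    """Neuron centers of a hexagonal lattice: odd rows are shifted by half a cell."""
    return [(c + 0.5 * (r % 2), r * math.sqrt(3) / 2)
            for r in range(rows) for c in range(cols)]

pos = hex_positions(5, 5)
center = pos[2 * 5 + 2]  # an interior neuron (row 2, column 2)
neighbors = [p for p in pos if p != center and math.dist(p, center) < 1.01]
print(len(neighbors), sorted(round(math.dist(p, center), 6) for p in neighbors))
```

On a rectangular lattice the diagonal neighbors lie farther away than the horizontal and vertical ones, which is the asymmetry the hexagonal choice avoids.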
According to medical experience and based on previous works [17, 19], three groups were
proposed that correspond to the risk of having the disease: high, medium and low. The groups were
formed after SOM training by applying the k-means algorithm to the neurons. This algorithm works
on the distances between neurons given by their synaptic weights, joining the closest neurons into
three groups [24]. To label each group, the trained map was used to compute the number of
activations for every group of neurons, using data from patients with TB and without TB from the
training data. The risk was then assigned according to the number of activations for these two
states (high and low risk). A group was labeled medium risk when its number of activations was
similar for the two states. The software used to implement the described methodology was MATLAB
2016a (The
3. Results
Table 3 shows the mean and standard deviation of the results over the three sets used in
cross-validation, together with the 95% confidence interval (CI) computed from them. For the MLP
models, the chosen network had six neurons in the hidden layer, which gave the best results.
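As a rough illustration of how such a summary can be produced, the sketch below computes the mean, the sample standard deviation and a 95% CI of the mean. The normal-approximation factor 1.96 is an assumption, because the source does not state how the interval was computed, and the input values are made up rather than taken from Table 3.

```python
import math
import statistics

def mean_ci95(values):
    """Mean, sample standard deviation and normal-approximation 95% CI of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    half = 1.96 * sd / math.sqrt(len(values))
    return mean, sd, (mean - half, mean + half)

# Made-up sensitivities for three held-out sets; not the values of Table 3.
mean, sd, ci = mean_ci95([0.95, 0.97, 0.99])
print(f"mean={mean:.3f} sd={sd:.3f} CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

With only three values, an interval based on Student's t distribution would be noticeably wider than this approximation.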
Using the chosen MLP model, the aforementioned relevance study was carried out, obtaining the
results shown in Figure 2. The variable with the highest impact is the clinical information. When
this input was replaced by its mean value, the classification rate (TB positive/negative) of the MLP
model dropped below 75%. The same procedure applied to the other input variables had no major
influence on the results, with classification rates higher than 95%.
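The mean-replacement procedure can be sketched as follows. The toy classifier, the data and the function names are hypothetical stand-ins for the trained MLP and the study's dataset; the sketch only shows how the drop in classification rate is measured for each input.

```python
def classification_rate(model, X, y):
    """Fraction of samples whose predicted label matches the target."""
    return sum(1 for x, t in zip(X, y) if model(x) == t) / len(y)

def relevance(model, X, y):
    """Drop in classification rate when each input is replaced by its mean value."""
    base = classification_rate(model, X, y)
    drops = []
    for j in range(len(X[0])):
        mean_j = sum(x[j] for x in X) / len(X)
        X_mod = [x[:j] + [mean_j] + x[j + 1:] for x in X]
        drops.append(base - classification_rate(model, X_mod, y))
    return drops

# Toy stand-in for the trained MLP: it decides from input 0 alone (hypothetical).
def toy_model(x):
    return 1 if x[0] > 0.5 else -1

X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0]]
y = [1, 1, -1, -1, 1]
drops = relevance(toy_model, X, y)
print(drops)
```

An input whose replacement barely changes the classification rate carries little information for the model, while a large drop marks a relevant input.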
For SOM models, the map with the best performance in the three sets was chosen. Figure 3
shows the U matrix, which visualizes the distances between neurons in the resulting map of 4 x 3
neurons. Bluish colors represent neurons that are close together and reddish colors represent the
largest distances. At first sight, three regions can be seen in the trained map. One of them in the
top area, another in the left-bottom
Figure 2. Relevance analysis using MLP models. Clinical input was the most relevant for the MLP model.
(M = Male, F = Female, Clinical = Clinical Information, HIV = HIV/AIDS status)
Figure 3. U matrix for the best SOM model. The colorbar on the right side indicates the closeness
level between neurons.
The k-means algorithm was applied with three regions (k=3), and these regions were labeled.
The map with the regions colored according to risk is displayed in Figure 4. On the left side
(Figure 4a), the labeled map was activated only with data from patients with confirmed TB. The
number of activations is shown in each neuron. The same information is available for subjects
considered TB negative (Figure 4b). Table 4 summarizes the results for this set. Sensitivity
and specificity were computed taking the high and medium risk groups as a detection of the disease
(see Table 3).
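The grouping and labeling of neurons can be sketched as below. The k-means routine is a plain textbook version, and the labeling rule (a `margin` around an even share of TB activations for the medium group), the initial centers, the neuron weights and the activation counts are all illustrative assumptions; the source does not give the exact rule used to decide when activations are similar.

```python
def kmeans(points, centers, n_iter=20):
    """Plain k-means on neuron weight vectors, starting from the given centers."""
    centers = [list(c) for c in centers]

    def nearest(p):
        return min(range(len(centers)),
                   key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))

    assign = [nearest(p) for p in points]
    for _ in range(n_iter):
        for k in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # keep the old center if a group becomes empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
        assign = [nearest(p) for p in points]
    return assign

def label_groups(assign, tb_hits, neg_hits, margin=0.2):
    """Label each group of neurons from the share of TB activations it receives."""
    labels = {}
    for k in set(assign):
        tb = sum(h for h, a in zip(tb_hits, assign) if a == k)
        neg = sum(h for h, a in zip(neg_hits, assign) if a == k)
        share = tb / (tb + neg) if tb + neg else 0.5
        if abs(share - 0.5) <= margin:
            labels[k] = "medium"  # similar activations for both states
        else:
            labels[k] = "high" if share > 0.5 else "low"
    return labels

# Toy map with six neurons and 2-D weights; all numbers are made up.
weights = [[0.0, 0.0], [0.1, 0.0], [0.5, 0.5], [0.6, 0.5], [1.0, 1.0], [0.9, 1.0]]
assign = kmeans(weights, centers=[weights[0], weights[2], weights[4]])
tb_hits = [1, 0, 4, 3, 10, 12]   # activations by TB-confirmed subjects per neuron
neg_hits = [5, 6, 3, 4, 1, 0]    # activations by TB-negative subjects per neuron
risk = label_groups(assign, tb_hits, neg_hits)
print(assign, risk)
```

Once the groups are labeled, counting a subject as detected when the activated neuron belongs to the high or medium group gives the sensitivity and specificity in the way the text describes.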
a) Activations with information of TB confirmed subjects. b) Activations with information of negative TB subjects.
Figure 4. Labeled map. Three regions based on risk were obtained: red (high risk), yellow (medium risk)
and green (low risk).
Other important information provided by the SOM models is the set of maps for each input (see
Figure 5). This visual tool is useful for relating the labeled map to each variable. From Figure 4, it
was established that the bottom-right region of the map represents the high risk. When this
information is compared with the maps of the variables, the age and diabetes status variables show
elevated values in that region of the map. The HIV/AIDS status variable is related to the low risk
group, because its maximum
4. Discussion
The ANN models gave comparable results, with a sensitivity of 97% for detecting the disease using
MLP models. This is better than other methods such as sputum smear (20-80%) and Xpert MTB/RIF® (88%)
[26, 27]. An advantage of ANN models over these methods is their lower demand in time and general
cost. It is also possible to implement this support tool using paper or software resources, so that
its development and replication can happen quickly. This is especially relevant when medical
infrastructure is insufficient and there are no laboratories able to grow a bacterial culture or no
Xpert® machines. The presented method represents an
a) Male b) Female g) Diabetes Status
Figure 5. Maps for each input variable. The colorbar on the right side indicates the closeness level
between neurons.
As a support system, these models can contribute to decisions about the treatment of patients.
Making medical decisions from clinical suspicion alone can be seen as a system with 36% sensitivity.
The evaluated models can improve this detection rate without relying on culture results, taking less
time to reach a final decision about anti-TB treatment.
Differences in the CI estimation were observed between supervised and unsupervised models.
This is due to the use of cross-validation. For MLP models, the CI was obtained from one hundred
initializations of the synaptic weights for each model, while for SOM models it was obtained from the
three sets used in cross-validation, which shows a higher dispersion [5]. This kind of validation
technique is not common for SOM models. However, it was employed here to evaluate the clustering of
the data while avoiding the bias caused by relying on just one model. With respect to results found
in other studies [17, 28], the obtained performance is comparable, with differences in the validation
technique.
For MLP models it was possible to measure the relevance of the input variables, and clinical
information was the most relevant. When the information of this variable was suppressed, results
dropped by around 23% (see Figure 2). This result was expected, since the system is designed as a
complement to the decision-making process, in which the medical staff has additional information
for sending the patient to treatment. It is known that TB diagnosis is a complex task that is not
limited to the information of a single variable [6]. In this case, ANN can provide a solution through
nonlinear processing that establishes relations between the input data and the output for which the
network has been trained.
It is difficult to compare our models exactly with previous results [7–9, 14,
15, 17]. The cited studies use at least ten input variables, some of them from specific radiologic
tests, and additional information about patients. As mentioned, data were collected retrospectively,
so they may contain bias. Incomplete information and false data could modify the results; however,
these challenging conditions were considered from the beginning. Currently, new studies including a
better way
Regarding the variables used, experiments with a different coding for age were also carried out,
dividing the whole range into age intervals (0-15, 15-60, >60 years old) and assigning one-zero
vectors. Those results are not shown in this paper because they were similar to the presented ones.
It was preferred to keep the original age information, with only the normalization process mentioned
above.
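The two age codings can be compared with a short sketch. The function names and the example ages are hypothetical, and the handling of the boundary ages 15 and 60 is an assumption, since the intervals given in the text overlap at those values.

```python
def normalize_by_max(ages):
    """Scale ages to the interval (0, 1] by dividing by the largest age in the data."""
    top = max(ages)
    return [a / top for a in ages]

def age_interval_code(age):
    """One-zero vector for the intervals 0-15, 15-60 and >60 years.

    Assumption: age 15 falls in the first interval and age 60 in the second.
    """
    if age <= 15:
        return [1, 0, 0]
    if age <= 60:
        return [0, 1, 0]
    return [0, 0, 1]

ages = [12, 34, 60, 80]  # hypothetical ages
scaled = normalize_by_max(ages)
codes = [age_interval_code(a) for a in ages]
print(scaled)
print(codes)
```

The interval coding replaces one numeric input with three binary ones, so it changes the size of the input layer, whereas the normalized age keeps a single input.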
The maps of the age, clinical information, homelessness, HIV/AIDS and diabetes status variables
showed a vertical division, whereas the sex variable showed a horizontal division. The variable most
related to the high-risk group is diabetes status, but the distance between its values in the map is
short. The behavior of the clinical and HIV/AIDS information should be the topic of a deeper study,
because both maps show an inverse relation between elevated values of the variables and the high-risk
group, in
Finally, the three risk groups in the labeled map can be useful in the management of treatment.
Information from patients who activated the medium risk group can be reviewed in more detail.
Activations of the high and low risk groups indicate starting therapy and sending the patient home,
respectively. In the presented case, and for the shown results, 13 of 35 suspected patients needed
further examination, saving around a third of the time.
5. Conclusions
Models based on ANN are useful to support the diagnosis of pulmonary tuberculosis under limited
resources. In a database with only basic information, MLP can detect the disease with a
sensitivity of 97%. Clustering studies using SOM networks revealed relevant relations between the
input variables and three previously established risk groups. In this last case, detection reached a
sensitivity of 89%. The tool identified 13 of 35 patients without a conclusive diagnosis, needing more
Acknowledgements
This work was supported by the Universidad Antonio Nariño under project number 2016207. The authors
thank the Hospital Santa Clara for its support in the development of this work.
Conflict of Interest:
Nariño. Carlos Awad and Erika Vergara declare that they have no conflict of interests.
6. References
1. World Health Organization (2015) Global tuberculosis report 2015. World Health Organization
30
6. El-Solh AA, Hsiao C-B, Goodnough S, Serghani J, Grant BJB (1999) Predicting active
pulmonary tuberculosis using an artificial neural network. Chest J 116:968–973
7. dos Santos Alves E, Souza Filho JBO, Galliez RM, Kritski A (2013) Specialized MLP classifiers
to support the isolation of patients suspected of pulmonary tuberculosis. In: Comput. Intell. 11th
Brazilian Congr. Comput. Intell. (BRICS-CCI CBIC), 2013 BRICS Congr. pp 40–45
8. Santos AM dos, Pereira B de B, Seixas JM de, Mello FCQ, Kritski AL (2007) Neural networks:
an application for predicting smear negative pulmonary tuberculosis. In: Adv. Stat. methods Heal.
Sci. Springer, pp 275–287
9. e Souza JB de O, Vieira APP, de Seixas JM, Aguiar FS, de Queiroz Mello FC, Kritski AL, others
(2012) An intelligent system for managing the isolation of patients suspected of pulmonary
tuberculosis. In: Int. Conf. Intell. Data Eng. Autom. Learn. pp 818–825
10. Maidantchik C, Kritski A, Gomes AS, et al (2011) A decision support system based on artificial
neural networks for pulmonary tuberculosis diagnosis. INTECH Open Access Publisher
11. Seixas JM, Faria J, Souza F, Vieira AFM, Kritski A, Trajman A, others (2013) Artificial neural
network models to support the diagnosis of pleural tuberculosis in adult patients. Int J Tuberc
12. Asha T, Natarajan S, Murthy KNB (2011) Effective Classification Algorithms to Predict the
Accuracy of Tuberculosis-A Machine Learning Approach. Int J Comput Sci Inf Secur 9:89
13. Benfu Y, Hongmei S, Ye S, Xiuhui L, Bin Z (2009) Study on the artificial neural network in the
diagnosis of smear negative pulmonary tuberculosis. In: Comput. Sci. Inf. Eng. 2009 WRI World
14. Elveren E, Yumuşak N (2011) Tuberculosis disease diagnosis using artificial neural network
15. Er O, Temurtas F, Tanrikulu AÇ (2010) Tuberculosis disease diagnosis using artificial neural
16. Uçar T, Karahoca A, Karahoca D (2013) Tuberculosis disease diagnosis by using adaptive neuro
fuzzy inference system and rough sets. Neural Comput Appl 23:471–483
17. Aguiar FS, Torres RC, Pinto JVF, Kritski AL, Seixas JM, Mello FCQ (2016) Development of
two artificial neural network models to support the diagnosis of pulmonary tuberculosis in
hospitalized patients in Rio de Janeiro, Brazil. Med Biol Eng Comput 54:1751–1759
18. Orjuela-Cañón AD, de Seixas JM, Trajman A (2013) SOM Neural Networks as a Tool in Pleural
Tuberculosis Diagnostic. In: Braga A de P, Bastos Filho CJA (eds) Ann. 11th Brazilian Congr.
Comput. Intell. SBIC, Porto de Galinhas, PE, pp 1–5
19. Orjuela-Cañón AD, de Seixas J (2013) Fuzzy-ART neural networks for triage in pleural
tuberculosis. In: Heal. Care Exch. (PAHCE), 2013 Pan Am. pp 1–4
20. Riedmiller M, Braun H (1993) A direct adaptive method for faster backpropagation learning: The
RPROP algorithm. In: Neural Networks, 1993., IEEE Int. Conf. pp 586–591
21. Naoum RS, Abid NA, Al-Sultani ZN (2012) An enhanced resilient backpropagation artificial
neural network for intrusion detection system. Int J Comput Sci Netw Secur 12:11
22. Beale MH, Hagan MT, Demuth HB (2011) Neural network toolbox getting started guide R2011b.
23. Seixas JM, Calôba LP, Delpino I (1996) Relevance criteria for variable selection in classifier
(2015) Self-organization and missing values in SOM and GTM. Neurocomputing 147:60–70
26. Sester M, Giehl C, McNerney R, et al (2010) Challenges and perspectives for improved
27. Steingart KR, Schiller I, Horne DJ, Pai M, Boehme CC, Dendukuri N (2014) Xpert®
MTB/RIF assay for pulmonary tuberculosis and rifampicin resistance in adults. Cochrane Libr.
28. Orjuela-Cañón AD, de Seixas JM, Trajman A (2013) SOM Neural Networks as a Tool in Pleural
Tuberculosis Diagnostic. In: Braga A de P, Bastos Filho CJA (eds) An. do 11 Congr. Bras.
Inteligência Comput. SBIC, Porto de Galinhas, PE, pp 1–5