Estimating HIV incidence from grouped cross-sectional data in setting...

http://www.labome.org/research/Estimating-HIV-incidence-from-grou...

Estimating HIV incidence from grouped cross-sectional data in settings where anti-retroviral therapy is provided

Topics
hiv infections
malawi

Humphrey Misiri (hmisiri at gmail dot com)


Public Health Department, College of Medicine, University of Malawi, Blantyre, Malawi
DOI http://dx.doi.org/10.13070/rs.en.2.1324
Date 2015-03-02
Cite as Research 2015;2:1324
License CC-BY

Abstract

Prevalence and incidence are measures that are used for monitoring the occurrence of a disease.

Prevalence can be computed from readily available cross-sectional data, but incidence is traditionally computed
from data collected in longitudinal studies. Longitudinal studies are beset by financial and logistical
problems, whereas cross-sectional studies are comparatively easy to conduct. This paper introduces a new method for
estimating HIV incidence from grouped cross-sectional sero-prevalence data from settings where antiretroviral
therapy is provided to those who are eligible according to recommended criteria for the administration of such
drugs.

Introduction
Antiretroviral therapy (ART) has helped to alleviate the suffering of AIDS patients in the world. In many countries,
patients have access to ART. In Malawi, ART is also available for free but not all HIV positive persons have access to
ART. By 2011, over 30% of HIV positive persons were on ART [1].
Incidence is a very important measure of disease occurrence. If the incidence of HIV is known, it is easy to monitor
its spread. On the other hand, prevalence alone does not give complete information about the magnitude of the
spread of HIV or any disease in general.
Consider a virulent disease like Ebola which kills after just a few days from infection. Individuals who are infected
with the Ebola virus die after a very short illness if no meaningful therapeutic intervention is available. In that case,
prevalence can never give a true picture of the extent of an Ebola epidemic since those who die from the disease are
never counted. As a result, a low prevalence of Ebola does not mean Ebola is about to be non-existent or is almost
eradicated from a community. On the other hand, the incidence of Ebola is the best measure which can be used to
monitor the disease since Ebola deaths are included in its computation. Consequently, incidence gives a true picture
of an Ebola epidemic. In the same vein, HIV incidence gives a true picture of the spread of HIV in a community.
Traditionally, incidence is computed from data from longitudinal studies. Unfortunately, there are many financial and
logistical problems associated with conducting longitudinal studies. To avoid these drawbacks, a viable alternative is
to estimate incidence from data from cross-sectional studies. Two good examples of methods for achieving this are
the models by Podgor and Leske (1986) and Misiri et al. (2012) [3, 2]. These models produce estimates of incidence
which are adjusted for differential mortality. Both approaches were developed for settings where ART has not been
widely rolled out in the community. It is also possible to estimate the incidence of HIV from cross-sectional data
from a population where ART is provided.
The aim of this paper is to introduce a new method of estimating HIV incidence in settings where ART is provided to
HIV positive people who need it regardless of the extent of coverage of such services. This method also adjusts for
differential mortality.
Materials and Methods
Motivation

Podgor and Leske (1986) proposed a method for estimating incidence from grouped cross-sectional data [3]. In the
spirit of Podgor and Leske (1986), we proceed to motivate our approach. Let $\lambda_1$ be the rate of natural
mortality, $\lambda_2$ be the HIV incidence, $\lambda_3$ be the rate of HIV mortality in the absence of ART,
$\lambda_4$ be the rate of recruitment to ARV therapy and $\lambda_5$ be the rate of mortality among ART recipients.

Let X1, X2, X3, X4 and X5 be independent random variables where X1 is the time to death from natural causes, X2 is
the time to HIV infection, X3 is the time to death whilst HIV positive, X4 is the time to ART registration and X5 is the
time to death whilst on ART. We assume that X1, X2, ... , X5 have exponential distributions
with parameters $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$ and $\lambda_5$ respectively.
We will proceed by dividing the population into three strata, namely HIV negative persons, HIV positives on ART and
HIV positives who are not on ART. Denote the total proportion of HIV positives by P0, the proportion of positives who
are not on ART by P01 and the proportion of positives who are on ART by P02. Both P01 and P02 are proportions of
the population.
Consider an interval [x, x+t]. Let $N_0$ and $N_1$ denote the numbers of persons alive at the beginning and at the end
of the interval, and let $P_1$ denote the proportion of HIV positives at the end of the interval. The number of HIV
positives at the end of the interval is
(1)

$N_1 P_1 = N_0 P_0 S_1 + N_0 (1-P_0) S_2$

where $S_1$ is the probability of surviving the interval given that one entered the interval already infected, and
$S_2$ is the probability of being infected in the interval given that one was HIV negative at the beginning of the interval.
Furthermore, the number of HIV negatives at the end of the interval is
(2)

$N_1 (1-P_1) = N_0 (1-P_0) S_3$

where $S_3$ is the probability of surviving the interval without contracting HIV.


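According to the relationship among these exponential random variables [3, 4],
(3)

$S_1 = e^{-(\lambda_3+\lambda_4)t}$,

$S_2 = \frac{\lambda_2}{\lambda_1+\lambda_2-\lambda_3}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right)$

and

$S_3 = e^{-(\lambda_1+\lambda_2)t}$.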
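As a numerical sanity check on the competing-risks reasoning, the following sketch compares a closed form for S2 with direct numerical integration. It assumes the standard result for independent exponential waiting times, S2 = λ2/(λ1+λ2-λ3) · (exp(-λ3 t) - exp(-(λ1+λ2) t)). The function names and the illustrative rates are hypothetical and are not taken from the paper.

```python
import math

def s2_closed_form(lam1, lam2, lam3, t):
    """P(infected during [0, t] and alive at t) for independent exponential times."""
    a = lam1 + lam2
    return lam2 / (a - lam3) * (math.exp(-lam3 * t) - math.exp(-a * t))

def s2_numeric(lam1, lam2, lam3, t, steps=10_000):
    """Midpoint-rule integral over the infection time u in [0, t].

    Integrand: density of being infected at u while still alive,
    times the probability of surviving HIV mortality from u to t.
    """
    a = lam1 + lam2
    h = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += lam2 * math.exp(-a * u) * math.exp(-lam3 * (t - u))
    return total * h

# Illustrative (hypothetical) yearly rates over a 5-year interval
closed = s2_closed_form(0.004, 0.06, 0.05, 5.0)
numeric = s2_numeric(0.004, 0.06, 0.05, 5.0)
print(round(closed, 6), round(numeric, 6))
```

The two values should agree to many decimal places, which supports the closed form under the stated independence assumption.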

In the interval [x, x+t], some people may have just been registered to receive ART but some were already registered
prior to entering the interval. Therefore the formula in (1) above does not capture the number of infected people in [x,
x+t] in a setting where ART is provided. If ART is provided, at the end of the interval there are two groups of HIV
positive individuals namely those who are not on ART and those who are on ART.
Not every infected person is eligible for ART. For example, an individual who gets infected with HIV in a 5-year
interval can never be eligible for ART as the therapy is for HIV positives who are in a reasonably advanced stage of
infection. Therefore, the number of HIV positive individuals who are on ART at the end of the interval is the sum of
old HIV positives who entered the interval already on ART and those HIV positives who are newly registered to
receive ART. This can be denoted by
(4)

$N_0 P_{02} S_4 + N_0 P_{01} S_5$

where $S_4$ is the probability of surviving to the end of the interval whilst on ART given that one was already on ART
at the beginning of the interval, and $S_5$ is the probability of surviving to the end of the interval having been newly
recruited to receive ART given that one was not on ART at the beginning of the interval.
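Using the relationship between independent exponential random variables as described in Lagakos (1976) on pages
553 through 555 [4], these probabilities are defined as follows:

$S_4 = e^{-\lambda_5 t}$

and

$S_5 = \frac{\lambda_4}{\lambda_3+\lambda_4-\lambda_5}\left(e^{-\lambda_5 t} - e^{-(\lambda_3+\lambda_4)t}\right)$.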
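The count of ART recipients at the end of the interval can be sketched in code. The block below assumes the usual exponential forms S4 = exp(-λ5 t) and S5 = λ4/(λ3+λ4-λ5) · (exp(-λ5 t) - exp(-(λ3+λ4) t)). The function names and all numerical inputs are hypothetical illustrations, not values from the paper.

```python
import math

def s4(lam5, t):
    """Survive the interval on ART, given already on ART at the start."""
    return math.exp(-lam5 * t)

def s5(lam3, lam4, lam5, t):
    """Start ART during the interval and survive to its end,
    given HIV positive and not on ART at the start."""
    b = lam3 + lam4
    return lam4 / (b - lam5) * (math.exp(-lam5 * t) - math.exp(-b * t))

def on_art_at_end(n0, p01, p02, lam3, lam4, lam5, t):
    """Expression (4): old ART recipients who survive plus new recruits who survive."""
    return n0 * p02 * s4(lam5, t) + n0 * p01 * s5(lam3, lam4, lam5, t)

# Hypothetical inputs: 1000 people, 4% positive off ART, 1% positive on ART
count = on_art_at_end(1000, 0.04, 0.01, lam3=0.047, lam4=0.02, lam5=0.0423, t=1.0)
print(round(count, 2))
```

Note that the expression requires λ3 + λ4 to differ from λ5, otherwise the denominator in S5 vanishes.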

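Therefore (4) becomes

(5)

$N_0 P_{02} e^{-\lambda_5 t} + N_0 P_{01} \frac{\lambda_4}{\lambda_3+\lambda_4-\lambda_5}\left(e^{-\lambda_5 t} - e^{-(\lambda_3+\lambda_4)t}\right)$

The number of HIV positives at the end of the interval is therefore

(6)

$N_1 P_1 = N_0 P_{01} e^{-(\lambda_3+\lambda_4)t} + N_0 (1-P_0) \frac{\lambda_2}{\lambda_1+\lambda_2-\lambda_3}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right) + N_0 P_{02} e^{-\lambda_5 t} + N_0 P_{01} \frac{\lambda_4}{\lambda_3+\lambda_4-\lambda_5}\left(e^{-\lambda_5 t} - e^{-(\lambda_3+\lambda_4)t}\right)$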

The number of HIV negative persons at the end of the interval is

(7)

$N_1 (1-P_1) = N_0 (1-P_0) S_6$

where $S_6$ is the probability of remaining HIV negative having survived the interval. Now,
$S_6 = e^{-(\lambda_1+\lambda_2)t}$. Therefore the equation in (7) becomes

(8)

$N_1 (1-P_1) = N_0 (1-P_0) e^{-(\lambda_1+\lambda_2)t}$

From (8) we have that

$N_1 = \frac{N_0 (1-P_0) e^{-(\lambda_1+\lambda_2)t}}{1-P_1}$.

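Therefore the left hand side of equation (6) becomes

$N_1 P_1 = \frac{N_0 P_1 (1-P_0) e^{-(\lambda_1+\lambda_2)t}}{1-P_1}$

Consequently, after dividing through by $N_0$, equation (6) becomes

$\frac{P_1 (1-P_0) e^{-(\lambda_1+\lambda_2)t}}{1-P_1} = P_{01} e^{-(\lambda_3+\lambda_4)t} + (1-P_0) \frac{\lambda_2}{\lambda_1+\lambda_2-\lambda_3}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right) + P_{02} e^{-\lambda_5 t} + P_{01} \frac{\lambda_4}{\lambda_3+\lambda_4-\lambda_5}\left(e^{-\lambda_5 t} - e^{-(\lambda_3+\lambda_4)t}\right)$

where $1-P_1 > 0$, $\lambda_3+\lambda_4-\lambda_5 > 0$ and $\lambda_1+\lambda_2-\lambda_3 > 0$.

From this expression we define the function $f(\lambda_2)$ as follows:

$f(\lambda_2) = \frac{P_1 (1-P_0) e^{-(\lambda_1+\lambda_2)t}}{1-P_1} - P_{01} e^{-(\lambda_3+\lambda_4)t} - (1-P_0) \frac{\lambda_2}{\lambda_1+\lambda_2-\lambda_3}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right) - P_{02} e^{-\lambda_5 t} - P_{01} \frac{\lambda_4}{\lambda_3+\lambda_4-\lambda_5}\left(e^{-\lambda_5 t} - e^{-(\lambda_3+\lambda_4)t}\right)$

Using the Newton-Raphson method, the value of $\lambda_2$ can be estimated given appropriate data. The derivative of $f(\lambda_2)$ is

$f'(\lambda_2) = -\frac{t P_1 (1-P_0) e^{-(\lambda_1+\lambda_2)t}}{1-P_1} + \frac{(1-P_0)(\lambda_3-\lambda_1)}{(\lambda_1+\lambda_2-\lambda_3)^2}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right) - \frac{t \lambda_2 (1-P_0) e^{-(\lambda_1+\lambda_2)t}}{\lambda_1+\lambda_2-\lambda_3}$

According to the Newton-Raphson method:

$\lambda_2^{(k+1)} = \lambda_2^{(k)} - \frac{f(\lambda_2^{(k)})}{f'(\lambda_2^{(k)})}$

Note that the graph of $f(\lambda_2)$ has an asymptote at $\lambda_2 = \lambda_3 - \lambda_1$. Because of this, it is
possible for $f(\lambda_2)$ to have more than one root, with roots on either side of the asymptote. Nevertheless, we will
retain the roots of $f(\lambda_2)$ which are to the right of the asymptote because these are the only values which
satisfy the condition that $f(\lambda_2) = 0$ given $\lambda_1+\lambda_2-\lambda_3 > 0$.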
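The root-finding step can be illustrated with a minimal sketch. It assumes the standard exponential competing-risks forms for each term, an interval of unit length by default, and a central-difference approximation to the derivative in place of an analytic one. The function names, the parameter values and the synthetic prevalence are all hypothetical, chosen so that the true root is known in advance.

```python
import math

def end_of_interval(lam2, p0, p01, p02, lam1, lam3, lam4, lam5, t=1.0):
    """Return (positives, negatives) at the end of the interval, per person at the start."""
    a, b = lam1 + lam2, lam3 + lam4
    off_art = p01 * math.exp(-b * t)
    newly_infected = (1 - p0) * lam2 / (a - lam3) * (math.exp(-lam3 * t) - math.exp(-a * t))
    old_art = p02 * math.exp(-lam5 * t)
    new_art = p01 * lam4 / (b - lam5) * (math.exp(-lam5 * t) - math.exp(-b * t))
    negatives = (1 - p0) * math.exp(-a * t)
    return off_art + newly_infected + old_art + new_art, negatives

def f(lam2, p1, *args):
    """Left side minus right side of the prevalence balance equation."""
    pos, neg = end_of_interval(lam2, *args)
    return p1 / (1 - p1) * neg - pos

def newton(func, x0, tol=1e-10, max_iter=100, h=1e-7):
    """Newton-Raphson iteration with a central-difference derivative (an assumption)."""
    x = x0
    for _ in range(max_iter):
        slope = (func(x + h) - func(x - h)) / (2 * h)
        step = func(x) / slope
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical inputs: p0, p01, p02, lam1, lam3, lam4, lam5
args = (0.05, 0.04, 0.01, 0.004, 0.047, 0.02, 0.9 * 0.047)
true_lam2 = 0.06
pos, neg = end_of_interval(true_lam2, *args)
p1 = pos / (pos + neg)  # prevalence implied by the model at the end of the interval
estimate = newton(lambda x: f(x, p1, *args), x0=0.1)
print(round(estimate, 6))
```

Because the end-of-interval prevalence is generated from the model itself, the iteration should recover the incidence used to generate it. With real data the starting value matters, since a poor start can lead to a root on the wrong side of the asymptote.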

The standard error of $\lambda_2$ was estimated using the delta method. An explanation of how the formula for the
standard error was derived is given in the Annex.


Application of the method to population-based data from the Malawi Demographic and Health Survey 2010
Description of the data

The estimated population of Malawi in 2011 was 14,388,550 [5]. The national prevalence of HIV was 10% in 2010 [6].
The provision of ARV therapy in Malawi is overseen by the HIV Unit in the Ministry of Health and Population. By
2011, 382,953 people were on ARV therapy [1]. The remaining 1,055,902 HIV positive persons were not on ARV therapy.
In the same year, the number of deaths due to HIV was 43,000 [1].
According to the ARV Supervision database for 2004-2009, which was maintained by the HIV Unit, there were 3,262
ART registrations in 2004 [7]. By the end of 2008, a total of 20,393 HIV positive persons had been recruited to receive ARV
therapy. This gives a recruitment rate ($\lambda_4$) of 3,426 people per year on average. Studies conducted in Malawi
[8, 9] found that ART reduces mortality by 10% [8]. Therefore, given the HIV mortality rates, the rate of mortality among
those on ART is $\lambda_5 = 0.9 \lambda_3$.
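The arithmetic behind these two quantities can be written out as a short sketch. The paper does not show how the average of 3,426 was obtained; dividing the increase in registrations between 2004 and the end of 2008 by five years is one reading that reproduces the reported figure, and it is an assumption here. The variable names are hypothetical, and the HIV mortality rate used is the 15-19 value from Table 2.

```python
registrations_2004 = 3262    # ART registrations in 2004
cumulative_2008 = 20393      # persons recruited by the end of 2008
years = 5                    # assumed averaging window, 2004 through 2008

mean_recruits_per_year = (cumulative_2008 - registrations_2004) / years
print(round(mean_recruits_per_year))

# Mortality on ART as 90% of HIV mortality without ART (a 10% reduction)
lam3 = 0.0471                # AIDS mortality rate for ages 15-19 in Table 2
lam5 = 0.9 * lam3
print(round(lam5, 5))
```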

The age-specific HIV sero-prevalence data analysed for this paper are extracted from the database of the Malawi
Demographic and Health Survey (MDHS2010) which was conducted in 2010. The data are in Table 1 below.

Age group   HIV-     Proportion   HIV+     Proportion   Not on ARV   Proportion   On ARV   Proportion
            Number   (p03)        Number                Number       (p01)        Number   (p02)
15-19       3208     0.022        71       0.022        63           0.020        -        0.002
20-24       2370     0.051        122      0.051        114          0.048        -        0.003
25-29       2141     0.108        232      0.108        197          0.092        35       0.016
30-34       1560     0.181        283      0.181        227          0.146        56       0.036
35-39       1224     0.246        301      0.246        232          0.190        69       0.056
40-44       870      0.247        215      0.247        155          0.178        60       0.069
45-49       817      0.193        158      0.193        95           0.116        63       0.077
50-54       295      0.129        38       0.129        25           0.085        13       0.044

Table 1. Nationally representative HIV sero-prevalence data for Malawi, 2010.
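The proportions in Table 1 can be checked against the counts. In the rows examined below, each tabulated proportion matches the corresponding count divided by the first count column; this is an observation from the numbers rather than something the paper states, and the function name is hypothetical.

```python
def proportions(base, positives, off_art, on_art):
    """Return (p0, p01, p02) rounded to 3 decimals, using `base` as denominator."""
    return (round(positives / base, 3),
            round(off_art / base, 3),
            round(on_art / base, 3))

# Counts copied from Table 1: first count column, HIV+, not on ARV, on ARV
rows = {
    "25-29": (2141, 232, 197, 35),
    "30-34": (1560, 283, 227, 56),
}
results = {age: proportions(*counts) for age, counts in rows.items()}
for age, props in results.items():
    print(age, props)
```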

In 1992, HIV was not as widespread in Malawi as it is today, and mortality was mainly due to causes other than HIV. As HIV
spread throughout Malawi, it became the leading cause of mortality. The provision of ART to HIV positive persons has
reversed this trend in mortality. Therefore, the mortality estimates for 1992 are taken to represent natural mortality
rates for Malawi which are not contaminated by HIV mortality. The source of the HIV mortality rates is a study by
Crampin et al. (2002), which reports mortality rates for HIV positive persons not on ARV therapy in a typical rural
setting representative of an average rural area in Malawi [10]. These estimates represent HIV mortality rates in rural
Malawi in the absence of ARV therapy. Table 2 below contains the natural and HIV mortality rates.
Age group   Natural mortality rates ($\lambda_1$)   AIDS mortality rates ($\lambda_3$)
            Men        Women
15-19       0.0038     0.0053                       0.0471
20-24       0.0041     0.0036                       0.0593
25-29       0.0068     0.0068                       0.0675
30-34       0.0084     0.0072                       0.1354
35-39       0.0076     0.009                        0.1354
40-44       0.0101     0.0089                       0.1427
45-49       0.0097     0.0096                       0.1427
50+         0.0097     0.0096                       0.2339

Table 2. Natural and AIDS mortality rates for Malawi.

Results
HIV incidence estimates for the 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49 and 50-54 age groups are in Table 3.
The 95% confidence interval for each estimate is also presented. The incidence estimates were obtained by using the
Newton-Raphson method. The initial values of incidence plugged into the Newton-Raphson algorithm were obtained by a
combination of methods which include inspection, use of the R function uniroot and numerical search procedures.
The age group with the highest incidence is the 30-34 year age group. The smallest incidence is for the 15-19 year
age group. Although the 40-44 and 45-49 age groups have the same incidence estimate, the two estimates differ when
computed correct to 6 decimal places. All the standard errors of the FOI estimates are very small. Furthermore, the
95% confidence intervals for the 15-19 through 45-49 age groups are very narrow.

Age group   FOI      SE         Incidence per   95% CI for incidence
                                5 years         Lower limit   Upper limit
15-19       0.0607   0.000358   61              60            61
20-24       0.0858   0.000779   86              84            87
25-29       0.1171   0.001806   117             114           121
30-34       0.1628   0.00157    163             160           166
35-39       0.1428   0.00108    143             141           145
40-44       0.1446   0.000897   145             143           146
45-49       0.1447   0.000755   145             143           146
50-54       -        -          -               -             -

Table 3. Nationally representative HIV incidence estimates for Malawi.

Discussion

This method makes it possible to estimate incidence from cross-sectional data. It relies heavily on the existence of
the roots of $f(\lambda_2)$. It is impossible to estimate the HIV incidence for the 50-54 year age group because the
structure of the model does not permit it.
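Since the method depends on locating roots, and the initial values were found partly by numerical search and a bracketing root finder, a small search routine may help to make the idea concrete. The sketch below scans a grid for sign changes strictly to the right of an asymptote and refines each bracket by bisection. Bisection is used here as a simpler stand-in for the R function uniroot, and the toy function with an asymptote at 0.5 is hypothetical, not the paper's function.

```python
def bracket_roots(func, lo, hi, n=200):
    """Return sub-intervals of [lo, hi] on which func changes sign."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    ys = [func(x) for x in xs]
    return [(xs[i], xs[i + 1]) for i in range(n) if ys[i] * ys[i + 1] < 0]

def bisect(func, lo, hi, tol=1e-10):
    """Refine a sign-changing bracket [lo, hi] by repeated halving."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Toy function with a vertical asymptote at x = 0.5 and a root at x = 0.75
def g(x):
    return 1.0 / (x - 0.5) - 4.0

asymptote = 0.5
# Search only to the right of the asymptote, so the sign flip across it is not mistaken for a root
brackets = bracket_roots(g, asymptote + 1e-6, 2.0)
roots = [bisect(g, lo, hi) for lo, hi in brackets]
print([round(r, 6) for r in roots])
```

A root found this way can then serve as the starting value for the Newton-Raphson iteration. If the scan returns no bracket, no root exists in the searched range and no estimate can be produced for that age group.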


We tested the sensitivity of the method to the size of P01 and P02. According to our findings, large values of P01 and
P02 resulted in functions $f(\lambda_2)$ whose roots were hard to estimate. To keep P01 and P02 reasonably small so that
the Newton-Raphson method converges efficiently, both parameters must be defined as proportions of
the sample for each age group. In any case, the number of people on ART is bound to be small; as a
fraction of the sample for each age group, it produces proportions which make it easy to achieve convergence
when using the Newton-Raphson algorithm.
The objective of the method is to produce incidence estimates. Therefore, defining P01 and P02 as proposed above
does not make the results of the current method unusable. The reader who wants the proportions P01 and P02 to be
defined otherwise can compute them from data according to his or her own definitions [3].
The fact that all the confidence intervals were narrow can be explained by the size of the samples for each age
group. All sample sizes were very large. In such cases, standard errors are very small, which reduces the
margin of error, so confidence intervals computed from such standard errors are likewise narrow. The narrow
confidence intervals also indicate high precision in the estimation of the FOI.
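The link between the standard errors and the width of the intervals can be shown with a short calculation. The limits in Table 3 are consistent with a normal-approximation interval, FOI ± 1.96 × SE, scaled by 1000 and rounded; this reading is inferred from the tabulated numbers and is an assumption, as is the function name.

```python
def wald_ci(estimate, se, z=1.96, scale=1000):
    """Normal-approximation 95% interval, scaled and rounded to whole numbers."""
    lower = (estimate - z * se) * scale
    upper = (estimate + z * se) * scale
    return round(lower), round(upper)

# FOI and SE for the 15-19 and 25-29 age groups, copied from Table 3
print(wald_ci(0.0607, 0.000358))
print(wald_ci(0.1171, 0.001806))
```

Under this reading, a standard error of a few ten-thousandths moves each limit by only one or two units on the scaled incidence, which is why the intervals are so narrow.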
Conclusion
The novel method introduced in this paper is a useful approach for estimating HIV incidence from aggregated
data collected from settings where ART is provided to HIV infected individuals. This method is timely, as the
provision of ART is now widespread in many countries of the world.
ANNEX
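Derivation of the standard error of $\lambda_2$

The force of infection (FOI) $\lambda_2$ is a function of both $P_0$ and $P_1$, that is to say
$\lambda_2 = f(P_0, P_1)$. Therefore, to find the variance of $\lambda_2$ we use the delta method:

$\mathrm{Var}(\lambda_2) = \left(\frac{\partial \lambda_2}{\partial P_0}\right)^2 \mathrm{Var}(P_0) + \left(\frac{\partial \lambda_2}{\partial P_1}\right)^2 \mathrm{Var}(P_1)$.

We will define a function y in this way:

(1)

$y = \frac{P_1}{1-P_1} e^{-(\lambda_1+\lambda_2)t} - \frac{1}{1-P_0}\left(P_{01} e^{-(\lambda_3+\lambda_4)t} + P_{02} e^{-\lambda_5 t}\right) - \frac{\lambda_4 P_{01}}{1-P_0} \cdot \frac{e^{-\lambda_5 t} - e^{-(\lambda_3+\lambda_4)t}}{\lambda_3+\lambda_4-\lambda_5} - \frac{\lambda_2}{\lambda_1+\lambda_2-\lambda_3}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right)$

Therefore

(2)

$\frac{\partial y}{\partial P_0} = -\frac{1}{(1-P_0)^2}\left(P_{01} e^{-(\lambda_3+\lambda_4)t} + P_{02} e^{-\lambda_5 t}\right) - \frac{\lambda_4 P_{01}}{(1-P_0)^2} \cdot \frac{e^{-\lambda_5 t} - e^{-(\lambda_3+\lambda_4)t}}{\lambda_3+\lambda_4-\lambda_5}$

Similarly,

(3)

$\frac{\partial y}{\partial \lambda_2} = -\frac{t P_1}{1-P_1} e^{-(\lambda_1+\lambda_2)t} + \frac{\lambda_3-\lambda_1}{(\lambda_1+\lambda_2-\lambda_3)^2}\left(e^{-\lambda_3 t} - e^{-(\lambda_1+\lambda_2)t}\right) - \frac{t \lambda_2 e^{-(\lambda_1+\lambda_2)t}}{\lambda_1+\lambda_2-\lambda_3}$

It is also true that

(4)

$\frac{\partial y}{\partial P_1} = \frac{e^{-(\lambda_1+\lambda_2)t}}{(1-P_1)^2}$

Since $\lambda_2$ is defined implicitly by $y = 0$,

$\frac{\partial \lambda_2}{\partial P_0} = -\frac{\partial y / \partial P_0}{\partial y / \partial \lambda_2}$ and $\frac{\partial \lambda_2}{\partial P_1} = -\frac{\partial y / \partial P_1}{\partial y / \partial \lambda_2}$.

That is, apart from its sign, the partial derivative $\partial \lambda_2 / \partial P_0$ is the quotient when the result in
(2) is divided by the result in (3) above. Similarly, the partial derivative $\partial \lambda_2 / \partial P_1$ is, apart
from its sign, the quotient when the result in (4) is divided by the result in (3) above. The sign does not affect the
variance because each derivative enters the delta method formula squared.
The variance of $P_0$ is $\mathrm{Var}(P_0) = \frac{P_0 (1-P_0)}{n_0}$. Similarly, the variance of $P_1$ is
$\mathrm{Var}(P_1) = \frac{P_1 (1-P_1)}{n_1}$, where $n_0$ and $n_1$ denote the corresponding sample sizes.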
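The delta-method calculation can be sketched generically for any incidence defined implicitly by an equation y(λ2, P0, P1) = 0. The block below uses finite differences in place of the analytic partial derivatives and binomial variances P(1-P)/n for the two prevalences. The function name, the toy relation λ2 = P1 - P0 and the sample sizes are hypothetical, chosen only because the exact answer is then easy to verify by hand.

```python
import math

def delta_se(y, lam, p0, p1, n0, n1, h=1e-6):
    """Standard error of lam where y(lam, p0, p1) = 0, by the delta method.

    Partial derivatives are approximated by central differences, and
    d(lam)/d(p) follows from implicit differentiation: -(dy/dp) / (dy/dlam).
    """
    def diff(g, x):
        return (g(x + h) - g(x - h)) / (2 * h)

    dy_dlam = diff(lambda v: y(v, p0, p1), lam)
    dlam_dp0 = -diff(lambda v: y(lam, v, p1), p0) / dy_dlam
    dlam_dp1 = -diff(lambda v: y(lam, p0, v), p1) / dy_dlam
    var = (dlam_dp0 ** 2 * p0 * (1 - p0) / n0
           + dlam_dp1 ** 2 * p1 * (1 - p1) / n1)
    return math.sqrt(var)

# Toy implicit relation: lam = p1 - p0, so both sensitivities have absolute value 1
def toy(lam, p0, p1):
    return lam - (p1 - p0)

p0, p1, n0, n1 = 0.022, 0.051, 3208, 2370   # illustrative values only
se = delta_se(toy, p1 - p0, p0, p1, n0, n1)
expected = math.sqrt(p0 * (1 - p0) / n0 + p1 * (1 - p1) / n1)
print(round(se, 6), round(expected, 6))
```

For the paper's model the same routine would take the function y defined in this Annex, at the cost of replacing the analytic derivatives with numerical ones.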

Declarations
Competing interests

There are no competing interests.


Acknowledgments

I am very grateful to ORC Macro International for allowing me to analyse the MDHS2010 data.
Authors' contributions

HM conceived the study, conceived the method, obtained the data, analyzed the data, drafted the manuscript and
revised it.

References
1. HIV Unit. 2012 Global AIDS Response Progress Report: Malawi Country Report for 2010 and 2011. Lilongwe, Malawi: Ministry of
Health, Malawi Government; 2012.
2. Misiri H, Edriss A, Aalen O, Dahl F. Estimation of HIV incidence in Malawi from cross-sectional population-based sero-prevalence
data. J Int AIDS Soc. 2012;15:14.
3. Podgor M, Leske M. Estimating incidence from age-specific prevalence for irreversible diseases with differential mortality. Stat Med.
1986;5:573-8.
4. Lagakos S. A stochastic model for censored-survival data in the presence of an auxiliary variable. Biometrics. 1976;32:551-9.
5. "Population projections for Malawi." [http://www.nso.malawi.net/index.php?option=com_content&view=article&id=134%3Apopulation-projections-for-malawi&catid=8&Itemid=3].
6. National Statistical Office (NSO), ORC Macro. Malawi Demographic and Health Survey 2010. Zomba: National Statistical Office
(NSO) and ORC Macro; 2010.
7. Murray C, Ortblad K, Guinovart C, Lim S, Wolock T, Roberts D, et al. Global, regional, and national incidence and mortality for HIV,
tuberculosis, and malaria during 1990-2013: a systematic analysis for the Global Burden of Disease Study 2013. Lancet.
2014;384:1005-70.
8. Jahn A, Floyd S, Crampin A, Mwaungulu F, Mvula H, Munthali F, et al. Population-level effect of HIV on adult mortality and early
evidence of reversal after introduction of antiretroviral therapy in Malawi. Lancet. 2008;371:1603-11.
9. Floyd S, Molesworth A, Dube A, Banda E, Jahn A, Mwafulirwa C, et al. Population-level reduction in adult mortality after extension of
free anti-retroviral therapy provision into rural areas in northern Malawi. PLoS ONE. 2010;5:e13499.
10. Crampin A, Floyd S, Glynn J, Sibande F, Mulawa D, Nyondo A, et al. Long-term follow-up of HIV-positive and HIV-negative individuals
in rural Malawi. AIDS. 2002;16:1545-50.
ISSN : 2334-1009

2015-03-27 09:27 PM