
HIV drug resistance analysis tool based on process algebra

Luciano Vieira de Araújo
Dept. of Bioinformatics, University of São Paulo, Rua do Matão, 1010, 05508-090, +55-11-30916172

Ester C. Sabino
Fundação Pró-Sangue, University of São Paulo, Brazil, Av. Dr. Enéas de Carvalho Aguiar, 155, +55-11-30615544-ext221

João Eduardo Ferreira
Dept. of Computer Science, University of São Paulo, Rua do Matão, 1010, 05508-090, +55-11-30916172

luciano@ime.usp.br, sabinoec@gmail.com, jef@ime.usp.br

ABSTRACT

The increasing number of drugs used in HIV patient treatment and the mutations associated with drug resistance make the inference of drug resistance a complex task that demands computational systems. Furthermore, software development and updating can add an extra level of complexity to the drug resistance analysis process. An alternative for handling the complexity of both drug resistance and software development is to use a formal representation of the processes involved, such as process algebra. This allows mathematical reasoning about the analysis process, a precise description of system behavior, more advanced computational approaches such as concurrent/parallel execution, and (semi-)automatic software development. The first contribution of this research is a mapping of the rules of drug resistance algorithms into process algebra expressions, which facilitates the computational manipulation of these rules. The second contribution is the HIVdag (HIV Drug Analysis Generator) system. This software supports the definition, generation and analysis of genotypic drug resistance tests based on process algebra expressions. Users can therefore easily create and update their own drug resistance algorithms at any time, independently of software development.

Categories and Subject Descriptors
J.3 [Computer Applications]: LIFE AND MEDICAL SCIENCES, Biology and genetics, Medical information systems

General Terms
Algorithms, Management, Design, Experimentation, Languages.

Keywords
Process Algebra, Drug Resistance, HIV, NPDL, Mutation Analysis, Genotypic Drug Resistance Testing.

1. INTRODUCTION

Drug resistance can be understood as a decrease in the susceptibility of the virus to a drug used in the patient's treatment. It plays an important role in the treatment of HIV patients. There are different initiatives to maintain up-to-date sources of information about drug resistance mutations, such as the HIV Sequence Database at Los Alamos National Laboratory (www.hiv.lanl.gov) and the Drug Resistance Summary section maintained by Stanford University (hivdb.stanford.edu). These repositories have been used as a reference by several groups to develop their own drug resistance interpretation algorithms. Despite using the same sources of information, each group interprets drug resistance mutations differently. As a result, some papers have reported discordances between the most widely used drug resistance interpretation algorithms [7, 8, 11]. As long as there is no consensus on genotypic drug resistance testing, we need resources to easily develop, update and compare the algorithms produced by different groups. In this paper, we present the HIVdag software, which offers quick, precise and flexible generation of drug resistance test software. This is achieved by mapping drug resistance rules into process algebra expressions and by managing the execution of these expressions with a process definition language called NPDL.

This paper is organized as follows. Section 2 summarizes related work. Section 3 describes the drug resistance interpretation rules. Section 4 presents a summary of process algebra and NPDL. Section 5 describes the mapping of drug resistance interpretation rules into process algebra expressions. Section 6 presents our software, HIVdag. Section 7 presents the conclusion.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SAC'08, March 16-20, 2008, Fortaleza, Ceará, Brazil. Copyright 2008 ACM 978-1-59593-753-7/08/0003 $5.00.

2. RELATED WORK
The most widely used and publicly available HIV genotypic drug resistance tests are: HIVdb [8], created by Stanford University; ANRS [7], developed by the Agence Nationale de Recherches sur le Sida; Rega [11], developed by the Rega Institute; and the HIV Genotyping Test Brazilian Interpretation Algorithm [1], maintained by the RENAGENO Expert Committee of the Brazilian Ministry of Health. The last of these has been widely used in Brazil as part of the treatment offered by the Brazilian Ministry of Health, and it is also used to describe the drug resistance profile of Brazilian HIV patients. All of them are rule-based systems that report a level of drug resistance for each analyzed drug. More details about drug resistance interpretation rules are presented in Section 3.

The ANRS, Rega and Brazilian Interpretation Algorithm rules consist of Boolean expressions that analyze the presence or absence of sets of specific mutations and report three levels of drug resistance: susceptible, intermediate, and resistant. HIVdb is also a rule-based system. However, each HIVdb rule adds a drug penalty score to a total drug score, and drug resistance is reported according to the total drug score on five levels: susceptible, potential low-level resistance, low-level resistance, intermediate resistance, and high-level resistance. The algorithms also differ in how they interpret mutations and drug resistance. Consequently, these programs generate different results, which motivates comparisons of their behavior, as in [8, 10].

All these algorithms were developed manually, without special computational techniques, in programming languages such as Perl. Each one is therefore subject to limitations such as coding errors, lack of tests, and the time needed for development and updates. The challenge is to avoid manual development and to be able not only to compare the results of the algorithms but also to understand and explore their differences.

One initiative to automate code generation for genotypic drug resistance systems was ASI [4], the Algorithm Specification Interface, developed at Stanford University. ASI is a specific interface and compiler for generating three drug resistance algorithms (HIVdb, ANRS and Rega) and variations of these algorithms. In ASI, the rules are defined in XML format, and the compiler then generates the algorithm code from the XML file. This approach is an informal way of generating code: it describes the rules used, but it does not describe the system behavior. Furthermore, extending the algorithms to include new parameters, such as virus subtype and drugs used, can demand a major restructuring of the system.
Figure 1 shows a rule of the Rega algorithm in XML format, which is representative of other algorithms such as ANRS and HIVdb. The XML file contains the resistance levels, the set of rules for each drug and, when representing HIVdb algorithms, the classification table for the total penalty score. Despite the flexibility provided by the XML approach, it is not a clear formal model that supports a correct representation and controlled execution of drug resistance rules.

<ALGORITHM>
  <ALGNAME>Rega v7.1.1</ALGNAME>
  <ALGVERSION>7.1.1</ALGVERSION>
  <DEFINITIONS>
    <LEVEL_DEFINITION>
      <ORDER>1</ORDER>
      <ORIGINAL>Susceptible GSS 1</ORIGINAL>
      <SIR>S</SIR>
    </LEVEL_DEFINITION>
    ...
    <LEVEL_DEFINITION>
      <ORDER>6</ORDER>
      <ORIGINAL>Resistant GSS 0</ORIGINAL>
      <SIR>R</SIR>
    </LEVEL_DEFINITION>
    <RULE> ...
  </DEFINITIONS>
  ...
  <DRUG>
    <NAME>AZT</NAME>
    <RULE>
      <CONDITION> SELECT ATLEAST 1 FROM (151M,69i) </CONDITION>
      <ACTIONS>
        <LEVEL>6</LEVEL>
      </ACTIONS>
      ...
    </RULE>
    ...
</ALGORITHM>

Figure 1. Sample of a Rega drug resistance rule in XML format [4]

Some works [10, 11, 13] have indicated that HIV drug resistance research can be improved by using information about both the virus and the patient's treatment. In addition, integrated HIV environments such as DBCollHIV [1] provide integrated data on laboratory exams, drug treatment history and epidemiology, together with bioinformatics tools to analyze them, such as the Brazilian Interpretation Algorithm and a subtyping tool. This scenario demands software that supports the improvement of drug resistance research.

3. DRUG RESISTANCE INTERPRETATION RULES


Current drug resistance rules are based on mutation analysis, which checks whether or not the virus carries mutations associated with drug resistance. Mutations are therefore an essential part of drug resistance rules. In the rules, mutations are represented using numbers and letters: the number gives the genome position of the mutation and the letter indicates the mutated amino acid found at that position. For instance, 41L indicates a mutated amino acid L at genome position 41. Some positions have more than one amino acid associated with resistance, which is indicated by a list of letters separated by slashes (/), as in 181C/I/L.

In order to understand the structure of drug resistance rules, we present one representative rule from each of the Brazilian Algorithm, ANRS and HIVdb, as follows.

Brazilian Algorithm sample rule for the drug DLV:
- Presence of 1 or more from (100I, 181C/I/L, 188L, 230L, 236L) and absence of 190A indicates Resistant
This rule means that if the list of virus mutations contains one or more mutations from the set (100I, 181C/I/L, 188L, 230L, 236L) and does not contain mutation 190A, the virus is reported as resistant to DLV.

The next example is an ANRS sample rule for the drug DDI:
- Exclude 70R AND Exclude 184VI AND Select Atleast 2 From (41L,69D,74V,215FY,219QE)
This rule indicates that the presence of at least two mutations from the set (41L, 69D, 74V, 215F/Y, 219Q/E), in the absence of mutations 70R and 184V/I, reports the virus as resistant to DDI.

The last example is part of an HIVdb rule for the drug AZT:
- 116Y => 10,
  151L => 20,


The first line of this rule indicates that mutation 116Y adds a penalty score of 10 to the AZT score, and the second line shows that the presence of mutation 151L adds 20 points to the AZT score. The special feature of HIVdb is the classification of the final score into a resistance level, as follows:
- -infinity to 10 => susceptible, 10 to 15 => potential low-level resistance, 15 to 30 => low-level resistance, 30 to 60 => intermediate resistance, 60 to +infinity => high-level resistance.
The rules of HIVdb, the Brazilian Algorithm, ANRS and Rega share the same structure. Each algorithm contains a set of rules for each drug, and the rules evaluate the association between mutations and resistance levels or penalty scores.
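To make the mutation notation and the score classification concrete, the following Python sketch expands a mutation token such as 181C/I/L, sums HIVdb-style penalty scores and classifies the total. It is only an illustration under stated assumptions, not the HIVdb implementation: the function names are hypothetical, only the two AZT penalties quoted above are included, and a score that falls exactly on a boundary is assumed to belong to the higher level, since the rule text does not say which side of each range is inclusive.

```python
import re

# Upper bounds of the score ranges quoted in the text.
# Assumption: each upper bound is exclusive.
SCORE_LEVELS = [
    (10, "susceptible"),
    (15, "potential low-level resistance"),
    (30, "low-level resistance"),
    (60, "intermediate resistance"),
]

# Only the two AZT penalties quoted in the text; the real rule is longer.
AZT_PENALTIES = {"116Y": 10, "151L": 20}


def expand_mutation(token):
    """Expand '181C/I/L' into {'181C', '181I', '181L'}."""
    match = re.fullmatch(r"(\d+)([A-Za-z/]+)", token.strip())
    if match is None:
        raise ValueError(f"not a mutation token: {token!r}")
    position, residues = match.groups()
    return {position + residue for residue in residues.replace("/", "")}


def total_score(virus_mutations, penalties):
    """Add the penalty of every rule entry matched by the virus mutation set."""
    return sum(
        penalty
        for token, penalty in penalties.items()
        if expand_mutation(token) & virus_mutations
    )


def classify_score(score):
    """Map a total penalty score to one of the five resistance levels."""
    for upper_bound, level in SCORE_LEVELS:
        if score < upper_bound:
            return level
    return "high-level resistance"


virus = {"41L", "116Y", "151L"}
score = total_score(virus, AZT_PENALTIES)
print(score, classify_score(score))
```

With the two penalties above, a virus carrying both 116Y and 151L reaches a total of 30, which this sketch reports as intermediate resistance because of the boundary assumption.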

4. SUMMARY OF PROCESS ALGEBRA AND NPDL

Process Algebra (PA) is a framework for formal reasoning about processes. It is useful for detecting undesirable features and for formally deriving desirable features of a system specification [5]. PA uses algebraic expressions to represent system behavior. These algebraic expressions are formed by atomic actions and binary operators, where an atomic action represents an indivisible task to be executed and a binary operator indicates the execution order of the actions. The Navigation Plan Definition Language (NPDL) [3] is a process definition language based on process algebra. NPDL implements not only PA operators but also some extended operators that facilitate process definition. The complete lists of operators of PA and NPDL can be found in [5] and [3], respectively. In this section, we present only the operators used to represent drug resistance rules, namely:
(1) Alternative composition "+", which, in the term t1 + t2, defines the process that executes either t1 or t2, but only one of them;
(2) Sequential composition ".", which, in the term t1 . t2, defines the process that executes t2 after the execution of t1 has finished;
(3) Parallel composition "||", which, in the term t1 || t2, defines the process that executes t1 and t2 at the same time;
(4) Conditional execution "%r", which, in the term %r t1, defines the process that executes t1 only if the rule r returns a true value. The complementary behavior is represented by "%!r": in the term %!r t1, it defines the process that executes t1 only if the rule r returns a false value.
In some cases, the operator % can be associated with a silent action, which is an action that does not perform a task; it only indicates that the expression evaluation can continue. As an example of an NPDL expression, consider P = %r (A . B) + %!r (C || D). In this expression, P represents the process that evaluates the result of rule r to decide which actions must be executed. If r returns true, action B is executed after the execution of A has finished. If r returns false, actions C and D can be executed simultaneously.

5. MAPPING OF DRUG RESISTANCE RULES INTO PROCESS ALGEBRA EXPRESSIONS

In summary, the rules of the analyzed algorithms share the same structure. That is, each algorithm is composed of several rules, and each rule represents a set of drug resistance mutations associated with a drug resistance level or a penalty score. In order to map drug resistance rules into PA expressions, we must represent them as atomic actions and operators that indicate the execution order of these actions. An atomic action can be understood as a program or method that carries out a specific, indivisible task. For drug resistance, we define the following rule and atomic actions:
- MS (MutationSearch) is a rule (i.e., a Boolean function) that receives as parameters a list of mutations to be compared with the virus mutation list and a pair of integers representing the minimum and maximum number of mutation matches to be considered. The execution of MS yields a Boolean value. True indicates that the number of mutation matches lies between the minimum and maximum values received as parameters; otherwise, MS returns false. Moreover, a maximum of 0 (zero) indicates that the mutations must be absent from the virus for MS to return true, and a null maximum indicates an unlimited number of matches.
- SS (SetScore) is an atomic action that increases the total drug penalty score by the integer received as parameter.
- SC (ScoreClassification) is an atomic action that receives as parameter an integer representing the total penalty score and returns the corresponding drug resistance level.
- RS (ResultsSynchronization) is an atomic action that synchronizes the results of different rules. It retrieves the rule results and returns the most significant result or the resistance level associated with the final drug score.
- SRL (SetResistanceLevel) is an atomic action that sets the drug resistance level.
- RE (ResultsEquivalence) is an atomic action that receives as parameter the equivalence relation between the results of different algorithms.
- ARC (AlgorithmsResultComparison) is an atomic action that retrieves and compares the results of each executed algorithm.
- GO (Go On) is an atomic action that simulates the behavior of the silent action of PA.

Considering the list of NPDL operators { . , + , || , %r } and the atomic actions {MS, SS, SC, RS, SRL, RE, ARC, GO}, we map samples of drug resistance rules of the Brazilian Algorithm, namely: the set of rules of one drug, the algorithms, and the comparison between algorithms. The first mapping example is the set of rules that analyzes the resistance level for the drug DLV. This analysis is carried out using four rules, namely:
DLV1 = Presence of 1 or more from (100I, 181C/I/L, 188L, 230L, 236L) AND absence of 190A reports Resistant level.
DLV2 = Presence of 1 or more from (225H, 227L) OR Presence of 1 or more from (106A, 103N) reports Intermediate level.
DLV3 = Presence of 1 or more from (106A/M, 103N/H/T/S/V) AND absence of (190A, 225H, 227L) reports Resistant level.
DLV4 = Presence of 1 or more from (190E) reports Resistant level.

In order to represent the evaluation of the presence or absence of virus mutations, we use the rule MS. The set of mutations to be sought and the minimum and maximum number of mutation matches to be considered are passed as parameters of MS. To check the result of an MS execution, we use the operator %, e.g., %MS. However, the operator % releases the execution of an associated action. In particular, an MS execution that returns true indicates that the rule verification must continue. Thus, the action associated with the % operator is the silent action GO, which only indicates that the verification of the rule must continue. The connectives OR and AND are represented by the alternative composition "+" and the sequential composition ".", respectively. Finally, the assignment of a resistance level to the drug is represented by the atomic action SRL, with the resistance level to be assigned given as its parameter. The first rule of DLV is thus mapped as:
DLV1 = %MS(100I,181C/I/L,188L,230L,236L;1;null) GO . %MS(190A;0;0) GO . SRL(Resistant)

This expression indicates two sequential executions of MS followed by the execution of SRL. The first execution of MS searches for the presence of mutations 100I, 181C/I/L, 188L, 230L and 236L. If at least one mutation is found, with no maximum limit, MS returns true; otherwise, it returns false. The operator % then evaluates the MS result. If the returned value is true, the silent action GO indicates that the verification must continue and the operator releases the second execution of MS. If it is false, the rule execution finishes, because the first rule condition was not satisfied. If the second MS execution is released, it searches for mutation 190A; if zero matches are found (minimum and maximum of zero), MS returns true. At the end of the second MS execution, the operator % evaluates the MS result. If the returned value is true, the execution of SRL is released; otherwise, the rule execution finishes, since the second rule condition was not satisfied. Once SRL is released, the resistance level of the analysis process represented by DLV1 is set to Resistant and the process finishes. Following the same procedure, the other three DLV rules are mapped as:
DLV2 = (%MS(225H,227L;1;null) GO + %MS(106A,103N;1;null) GO) . SRL(Intermediate)
DLV3 = %MS(106A/M,103N/H/T/S/V;1;null) GO . %MS(190A,225H,227L;0;0) GO . SRL(Resistant)
DLV4 = %MS(190E;1;null) GO . SRL(Resistant)

After mapping the four DLV rules, we must define the process that represents the complete analysis of DLV: DLV = (DLV1 || DLV2 || DLV3 || DLV4) . RS. This expression indicates that the rules can be executed simultaneously, followed by a synchronization of the results. Version 4 of the Brazilian Interpretation Algorithm analyzes twenty-one sets of drug rules and can be represented in process algebra as:
BAI_V4 = (ABC || DDI || 3TC || D4T || TDF || DDC || AZT || AZT_3TC || DLV || EFV || NVP || APV || IDV || LPV/r || NFV || RTV || SQV || ATV || APV_r || SQV_r || IDV_r)

Thus, the rules of each algorithm can be executed along parallel pathways, where each pathway analyzes one drug with its set of rules, and the actions of each drug rule are executed according to the operator definitions. The mapping of the ANRS and Rega algorithms follows the same steps used to map the Brazilian Algorithm rules. However, mapping HIVdb rules requires the actions SS (SetScore) and SC (ScoreClassification) to represent, respectively, the score assignment and the classification of drug resistance according to the drug score. Below, we present part of the XML that defines the HIVdb rules for the drug DLV.
<DRUG>
  <NAME>DLV</NAME>
  <RULE>
    <CONDITION> SCORE FROM (98G => 5, 100I => 40, 100V => 10, ... 318F => 50 ) </CONDITION>
  </RULE>
</DRUG>

Using the actions SS and SC, this rule is mapped as:
DLV = (%MS(98G;1;null) GO . SS(5) + %MS(100I;1;null) GO . SS(40) + %MS(100V;1;null) GO . SS(10) + %MS(101E/P;1;null) GO . SS(5) + ... + %MS(318F;1;null) GO . SS(50)) . SC
After the successful execution of each MS action, the action SS adds the indicated score to the total drug score. Then, at the end of the expression, the action SC assigns the drug resistance level based on the total drug score. As presented, atomic actions can be composed using PA operators to form a process that represents one drug resistance rule. Several drug resistance rules can then be combined into a process representing the drug resistance analysis of a drug. Finally, the processes of each drug are joined using the PA operators into a main process that defines a complete drug resistance algorithm. The comparison between algorithms comprises the concurrent execution of the chosen algorithms, synchronized by the action ARC, which compares the algorithm results, as in:
P = (HIVdb || ANRS || Brazilian_Algorithms) . ARC
Not only mutations can be analyzed in this process but also other parameters, such as treatment adherence, drugs used in treatment, duration of drug treatment and laboratory exam results. Each of these parameters can be encapsulated in an atomic action or in a process and introduced into the analysis process of a drug and/or an algorithm.
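The mapping of the four Brazilian Algorithm rules for DLV can be illustrated with a small Python sketch. It is not the NPDL interpreter: the function names are hypothetical, each conditional term "%MS(...) GO" is reduced to a Boolean test, sequential composition becomes "and", alternative composition becomes "or", and the four rules run one after another rather than in parallel. Two further assumptions are labeled in the code: mutation lists such as 181C/I/L are written out by hand, and RS treats Resistant as more significant than Intermediate and reports Susceptible when no rule sets a level. DLV3 follows the rule text, which reports the Resistant level.

```python
# Assumed ordering used by RS to pick the "most significant" result.
RANK = {"Susceptible": 0, "Intermediate": 1, "Resistant": 2}


def ms(virus, mutations, minimum, maximum):
    """MutationSearch: True when the number of matches is within the bounds.

    maximum=None plays the role of NPDL's null (no upper limit).
    """
    matches = len(virus & set(mutations))
    if maximum is None:
        return matches >= minimum
    return minimum <= matches <= maximum


def dlv1(virus):
    # Lists such as 181C/I/L are expanded by hand (assumption of this sketch).
    found = ms(virus, ["100I", "181C", "181I", "181L", "188L", "230L", "236L"], 1, None)
    return "Resistant" if found and ms(virus, ["190A"], 0, 0) else None


def dlv2(virus):
    found = ms(virus, ["225H", "227L"], 1, None) or ms(virus, ["106A", "103N"], 1, None)
    return "Intermediate" if found else None


def dlv3(virus):
    found = ms(virus, ["106A", "106M", "103N", "103H", "103T", "103S", "103V"], 1, None)
    return "Resistant" if found and ms(virus, ["190A", "225H", "227L"], 0, 0) else None


def dlv4(virus):
    return "Resistant" if ms(virus, ["190E"], 1, None) else None


def rs(results):
    """ResultsSynchronization: keep the most significant level that was set."""
    levels = [level for level in results if level is not None]
    # Assumption: Susceptible is reported when no rule sets a level.
    return max(levels, key=RANK.__getitem__, default="Susceptible")


def analyse_dlv(virus):
    """DLV = (DLV1 || DLV2 || DLV3 || DLV4) . RS, evaluated sequentially here."""
    return rs(rule(virus) for rule in (dlv1, dlv2, dlv3, dlv4))


for sample in ({"100I"}, {"100I", "190A"}, {"225H"}, {"103N"}):
    print(sorted(sample), analyse_dlv(sample))
```

Because the rules only read the virus mutation set and RS merges their results afterwards, the outcome does not depend on the order in which the rules run, which is what makes the parallel composition in the DLV process safe.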

6. HIVdag software
We have developed HIVdag, a software tool for the creation, execution and comparison of HIV genotypic drug resistance tests. HIVdag is based on concepts of process algebra, which formally describes system behavior, and uses NPDL to manage the execution of PA expressions. It was developed using Ruby on Rails (www.rubyonrails.org), the PostgreSQL database (www.postgresql.org) and a web service integrated with the NPDL interpreter, called NavigationPlanTool [3].

HIVdag offers user-friendly interfaces to define the algorithm rules (Figure 2). In the first part of the interface, the user specifies the reference algorithm, the reference drug, the result type (Resistance Level or Score) and the rule result according to the selected result type. In the second part, the user defines the mutations to be analyzed and their associations, using the following options: occurrence type (presence of or absence of), Min Mutation and Max Mutation, representing the minimum and maximum number of mutation matches to be considered, and the set of mutations to be searched. The parts of a rule can be combined using parentheses and the logical operators AND and OR. With this interface, the user does not need to know process algebra or NPDL, because rules are defined in the way the user already understands them. After all rules have been defined, the algorithm is automatically generated and released for use.

Figure 2. HIVdag interface to drug resistance rules definition

The rule shown in Figure 2 is a DLV rule from the Brazilian Algorithm version 4, which is automatically mapped to the following NPDL expression:
DLV1 = ((%MS(106A/M,1,null) GO + %MS(103N/H/T/S/V,1,null) GO) . %MS(190A,225H,227L,0,0) GO) . SRL(R)
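How an expression of this kind could be produced from the fields of a rule definition can be sketched as follows. This is a hypothetical Python illustration, not HIVdag's own code (HIVdag was developed with Ruby on Rails) and not part of NPDL: all helper names are assumptions, and sequential composition is written explicitly as "." between terms.

```python
def ms_term(mutations, minimum, maximum):
    """Render one conditional term; maximum=None is written as null."""
    upper = "null" if maximum is None else str(maximum)
    return f"%MS({','.join(mutations)},{minimum},{upper}) GO"


def presence_of(mutations, minimum=1, maximum=None):
    """Occurrence type 'presence of' with Min Mutation and Max Mutation."""
    return ms_term(mutations, minimum, maximum)


def absence_of(mutations):
    """Occurrence type 'absence of': zero matches allowed."""
    return ms_term(mutations, 0, 0)


def any_of(*terms):
    """Logical OR, rendered as alternative composition."""
    return "(" + " + ".join(terms) + ")"


def all_of(*terms):
    """Logical AND, rendered as sequential composition."""
    return "(" + " . ".join(terms) + ")"


def rule(name, condition, level):
    """Attach the SRL action that sets the resistance level."""
    return f"{name} = {condition} . SRL({level})"


condition = all_of(
    any_of(presence_of(["106A/M"]), presence_of(["103N/H/T/S/V"])),
    absence_of(["190A", "225H", "227L"]),
)
dlv_rule = rule("DLV1", condition, "R")
print(dlv_rule)
```

Keeping each interface option as a small function that returns text means nested parentheses in the rule definition map directly onto nested calls, so the user-facing AND/OR structure and the generated expression stay in step.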

The analysis result can be viewed on a web page, as in http://clinmaldb.usp.br:8083/hiv/resistencia/resistencia.html, exported to a file in XML or CSV format, or integrated into DBCollHIV, as at the bottom of Figure 3, where each drug and its respective resistance level are reported. HIVdag also supports the analysis of a large set of patients and generates a drug resistance profile for them, as in Figure 4. This type of report is useful for measuring the impact of new algorithm versions on a group of patients under treatment and for supporting decisions on drug acquisition, which is important for government HIV drug distribution programs, as in Brazil.

Figure 3. Sample of HIVdag integrated into DBCollHIV

Figure 4. HIVdag drug resistance profile report, resulting from [2]

7. CONCLUSION
In this paper, we presented an approach for mapping the rules of genotypic drug resistance algorithms into process algebra expressions. The algebraic representation of drug resistance rules allows formal reasoning about the rules and supports correctness in the automatic generation of software for drug resistance analysis. To complete this approach, we developed the HIVdag software, which uses process algebra expressions to define, generate, execute and compare genotypic drug resistance test software. This work offers researchers a simple, fast and precise way to explore different interpretations of mutations associated with drug resistance without worrying about software development. Our ongoing research includes a new interface that allows advanced users to add and analyze new parameters in the drug resistance rules, for instance virus subtype, drugs used by patients and laboratory exam results.

8. REFERENCES
[1] Araújo, L. V.; Soares, M. A.; Tanuri, A.; Oliveira, S. M.; Chequer, P.; Sabino, E. C.; Ferreira, J. E. DBCollHIV: A Database System for Collaborative HIV analysis in Brazil. Genetics and Molecular Research, v.5, p.203-215, 2006.
[2] Barreto, C.C., Nishyia, A., Araujo, L.V., Ferreira, J.E., Busch, M.P. and Sabino, E.C. Trends in antiretroviral drug resistance and clade distributions among HIV-1-infected blood donors in Sao Paulo, Brazil. J Acquir Immune Defic Syndr, 41 (2006), 338-41.
[3] Braghetto, K. R.; Ferreira, J. E.; Pu, C. Using Control-Flow Patterns for Specifying Business Processes in Cooperative Environments. In: The 22nd Annual ACM Symposium on Applied Computing, Seoul, 2007, v. 2, p. 1234-1241.
[4] Betts, B.J., and R.W. Shafer. Algorithm specification interface for human immunodeficiency virus type 1 genotypic interpretation. J. Clin. Microbiol. 41:2792-2794, 2003.
[5] Fokkink, W.J. Introduction to Process Algebra (Texts in Theoretical Computer Science). Springer-Verlag, Berlin, 2000.
[6] Liu TF, Shafer RW. Web Resources for HIV type 1 Genotypic-Resistance Test Interpretation. Clin Infect Dis 42(11):1608-18, 2006.
[7] Meynard, J. L., M. Vray, L. Morand-Joubert, et al. Phenotypic or genotypic resistance testing for choosing antiretroviral therapy after treatment failure: a randomized trial. AIDS 16:727-736, 2002.
[8] Ravela J, Betts BJ, Brun-Vezinet F, et al. HIV-1 protease and reverse transcriptase mutation patterns responsible for discordances between genotypic drug resistance interpretation algorithms. J Acquir Immune Defic Syndr 2003;33:8-14.
[9] Shafer RW, Stevenson D, Chan B. Human immunodeficiency virus reverse transcriptase and protease sequence database. Nucleic Acids Res. 1999;27:348-352.
[10] Shafer RW. Rationale and Uses of a Public HIV Drug-Resistance Database. Journal of Infectious Diseases 194 Suppl 1:S51-8, 2006.
[11] Snoeck, Joke; Kantor, Rami; Shafer, Robert W; et al. Discordances between interpretation algorithms for genotypic resistance to protease and reverse transcriptase inhibitors of the Human Immunodeficiency Virus are subtype dependent. Antimicrobial Agents and Chemotherapy, Washington, v.50, n.2, p.694-701, 2006.
[12] Van Laethem, K., A. De Luca, A. Antinori, A. Cingolani, C. F. Perno, and A.-M. Vandamme. A genotypic drug resistance algorithm that significantly predicts therapy response in HIV-1 infected patients. Antivir. Ther. 7:123-129, 2002.
[13] Zazzi, M.; Romano, L.; Venturi, G.; Shafer, R.; Reid, C.; Dal Bello, F.; Parolin, C.; Palù, G.; Valensin, P. Comparative evaluation of three computerized algorithms for prediction of antiretroviral susceptibility from HIV type 1 genotype. Journal of Antimicrobial Chemotherapy (2004) 53, 356-360. DOI: 10.1093/jac/dkh021
