Tackling the Debugging Challenge of Rule Based Systems

Valentin Zacharias
FZI Research Center for Information Technology, Karlsruhe 76131, Germany zach@fzi.de http://www.fzi.de

Abstract. Rule based systems are often presented as a simpler and more natural way to build computer systems - compared to both imperative programming and other logic formalisms. However, with respect to finding and preventing faults this promise of simplicity remains elusive. Based on the experience from three rule base creation projects and a survey of developers, this paper identifies reasons for this gap between promise and reality and proposes steps that can be taken to close it. An architecture for a complete support of rule base development is presented. Keywords: Rule based systems, F-logic, Knowledge acquisition, Rule based modeling, Knowledge modeling, Verification, Anomaly detection.

1 Introduction
Rule based systems are often presented as a simpler and more natural way to build computer systems - compared to both imperative programming and other logic formalisms such as full first order logic or desciption logics. It seems natural to assume that this simplicity also applies to finding and preventing faults; to software quality assurrance in general. However, with respect to debugging of rule bases this promise of simplicity remains elusive. Indeed a survey found [1] that most developers of rule based systems think that the debugging of rule based systems is more difficult than the debugging of ’conventional’ - object oriented and procedural programs. An observation that the authors also found corroborated in three rule base development projects they participated in. 1.1 The Debugging Challenge The assumption of rule based systems as simpler than ’conventional’ programs rests on three observations: – Rule languages are (mostly) declarative languages that free the developer from worrying about how something is computed. This should decrease the complexity for the developer. – The If-Then structure of rules resembles the way humans naturally communicate a large part of knowledge. – The basic structure of a rule is very simple and easy to understand.
J. Filipe and J. Cordeiro (Eds.): ICEIS 2008, LNBIP 19, pp. 144–154, 2009. c Springer-Verlag Berlin Heidelberg 2009

Tackling the Debugging Challenge of Rule Based Systems 145 To see whether this simplicity assumption holds in practice. Towards this end an existing reporting application in a human resource department was re-created as rule based system. at the same time a majority of rule base developers sees rule base development as inferior with respect to debugging. It models (mostly heuristic) knowledge about anomalies in rule bases. chemistry and biology. The next two sections then discuss overall principles to better support the creation of rule bases and an architecture of tool support for the development process. The analysis presented in this paper is based on the experience from three projects. The results showed. Section three discusses what kinds of problems were encountered. However.com . a survey of developers of rule based systems [1] included a question that asked participants to compare rule base development to the development of procedural and object oriented programs. The authors found this problem of fault detection corroborated in three different projects he participated in. And. Approximately 5 person months went into this application. rank it as the issue most hindering the development of rule based systems. – Project Halo1 is a multistage project to develop systems and methods that enable domain experts to model scientific knowledge. – Online Forecast was a project to explore the potential of knowledge based systems with respect to maintainability and understandability. – The goal of the project F-verify was to create a rule base as support for verification activities.projecthalo. This project was done by a developer with experience in creating rule bases and took 3 month to complete. 1 http://www. It consists of anomaly detection rules that work on the reified version of a rule base. reuse and understandability that were expected from rule based (and declarative) knowledge and program specification. 1. This paper’s structure is as follows: section two gives a short overview of the three projects that form the input for this analysis.2 Overview This paper is an attempt to reconcile the promise of simplicity of rule based systems with reality. that these developers indeed see the advantages in ease of change. participants of the survey saw debugging of rule based systems as actually more difficult than the debugging of conventional software. This project was performed by a junior developer who had little prior experience with rule based system. 2 Experiences In addition to the survey in [1] and literature research. As part of the second phase of Project Halo six domain experts were employed for 6 weeks each to create rule bases representing domain knowledge in physics. It identifies why rule bases are not currently simple to build and proposes steps to address this. in a different question.

Zacharias All of these projects used F-logic [2] as representation language and the Ontobroker inference engine [3]. Developers with prior experience in imperative programming languages said that developing rule bases was more difficult then developing imperative applications. methods and tools presented in this paper are geared towards this kind of small. The editing tools differed between the projects: Project Halo used the high level. F-logic is very similar to some of the rule languages under discussion for the Semantic Web [4] and it is based on normal logic programs probably the prototypical rule language. graphical tools developed in the project. However.in the creation of rules in a way that they can work together . because they are true in broader contexts than program statements can be used [5]. In all cases the developers complained that creating correct rule bases takes too long and in particular that debugging takes up too much time.that most errors get made. While part of these observations can be explained by longer training with imperative programming.as is to be expected for such relatively small projects. We identified six reasons for this observed discrepancy. – The One Rule Fallacy and the Problem of Terminology. The authors examined literature and tools to ensure that tool support identified as missing is indeed missing from rule based systems and not only from systems supporting F-logic.ontoprise. the inference engine obviously combines the rules only based on how the user has specified the rules. All of these projects have been created using relatively lightweight methods that focus more on the actual implementation of rules than on high level models . while another understands it socially. – That one rule understands a parent relation biologically. Declarative statements such as rules are often assumed to be easier reusable and more modular.de . Based on these sentiments the one rule fallacy goes as follows: One rule is simple to create and the combination of declarative rules is handled automatically by the inference engine hence entire rule bases are simply to create. and it is here .146 V. – That one rule processes a number as inch. However. in particular when performed by domain experts or junior programmers. while another processes it as millimeter. it still stands in a marked contrast to the often repeated assertion that creating rule bases is simpler. In particular for rules in a rule set to work together. all rules need to be consistent in the domain formalization and the use of terminology. F-verify used Ontostudio together with prototypical debugging and editing tools. The analysis. domain expert/end user programmer driven projects. 3 Analyis In the three projects sketched above the authors found that the creation of working rule bases was an error prone and tiresome process. The use of only one rule formalism obviously restricts the generality of any statements that can be made. For example it must not happen: – That one rule uses a relation part while another one uses part of . 2 http://www. Online Forecast used a simple version of Ontoprise’s2 OntoStudio.

that is opaque to the user. however. . part of (Chile. – The Problem of Opacity..is absolutely unrelated. – The Problem of Interconnection. This simple error then introduces a connection to a part of the rule base that deals with countries and that . In the authors experience failed interactions between rules are the most important source of errors during rule base creation. The Problem of Opacity also affects the developer’s ability to identify faults in rule bases. everything is potentially relevant to everything else3 . shown in the source code and the subject of visualizations. because it becomes hard. As a very simple example consider the following rule base: hot(X) ← part of (Chile. another that it leads to a rule being tried in unexpected circumstances that then lead to errors due to the wrong usage of built-ins. this depends on ensuring the consistent formalization and use of terminology across the rule base.Tackling the Debugging Challenge of Rule Based Systems 147 Rule based systems hold the promise to allow the automatic recombination of rules to tackle even problems not directly envisioned during the creation of the rule base. makes it harder to identify. More typical than this example. because errors appear in seemingly unrelated parts of the rule base. Because rule interactions are managed by the inference engine. are cases that do not lead to wrong conclusions but rather merely lead the inference engine to try one path that ends in a rule cycle or in some kind of error.from the user’s point of view . avoid and correct mistakes based on only a local view of the rule base. user managed modularization. for example. to see the consequences of changes or to gain confidence in the completeness of a rule base.should be part of(Chili. At the same time this points to the Problem of Terminology as one explanation for the difficulty in debugging rule bases. SouthAmerica) In this example a rule that deduces that something is hot based on it having chili as part.X) . This complicates error localization for the user. Not having an overall view on the rule base mainly affects the ability of developers to correctly edit the rule base. The only common way to explicitly show these interactions between the rules is in a prooftree of a successful query to the rule base. The problem of terminology makes the rule base harder to understand. Hence a part of the gap between the expected simplicity of rule base creation and the reality can be explained by naive assumptions about rule base creation. At the same time. One example would be that a rule causes a whole rule base to become unstratisfiable. is changed by a typo to deduce that something is hot if the country Chile is part of it. This stands in contrast to imperative programming. however. 3 At least in the absence of robust. However. it is this interaction between rules that is commonly not shown. X) # fault . where the relations between the entities and the overall structure of the program are explicitly created by the developer.

The development of rule based systems cannot take full advantage of the declarative nature of rules.it forces the developer to learn about the inner structure of the inference engine. Both imperative and rule based systems sometimes show bugs by behaving erratically. This stands in marked contrast to the idea that rule bases free the developer from worrying about how something is computed. however. A high prevalence of agile and iterative methods was also found in the developer survey [1]. This is unlike many imperative languages that often produce a partial output and a stack trace. Doing these design and architecture decisions ’on the fly’ reduces the risk of costly requirements. Iterative and Lightweight Methods.hence the order of debugging is based on the evaluation strategy of the inference engine. However a rule base does not define an order of execution . The first corollary of this observation is that the verification support for these systems cannot rely on the availability of formal specifications of any kind. design and (up-front) architecture mistakes.the no-result case. A second likely consequence is a higher prevalence of implementation mistakes. – The Problem of Procedural Debugging. these design and architecture decisions are often repeated and changed. All deployed debugging tools for rule based programs known to the authors are based on the procedural or imperative debugging paradigm. as a declarative representation a rule base does not define an order of execution . A further 23% of the projects worked without following any specific methodology.they are now widely believed to be superior to waterfall-like models [7]. The execution steps of a procedural debugger are then for instance: the inference engine tries to prove a goal A or it tries to find more results for another goal B. because agile methodologies encourage developers to make these decisions based on the scope of the current iteration.hence the order of debugging is based on the evaluation strategy of the inference engine.148 V. This debugging paradigm is well known from the world of imperative programming and characterized by the concepts of breakpoints and stepping. but only rule based systems show the overwhelming majority of bugs by terminating without giving any help on error localization. By far the most common symptom of an error in the rule base was a query that (unexpectedly) did not return any result . In this way procedural debugging of rule bases breaks the declarative paradigm . not on the expected scope of the final application. Zacharias – The No-Result Case and the Problem of Error Reporting. Recent years also saw the increasing adoption and acceptance of iterative and evolutionary development methodologies [6] . . – Agile. A procedural debugger offers a way to indicate a rule/rule part where the program execution is to stop and has to wait for commands from the users. The user can then look at the current state of the program and give commands to execute a number of the successive instructions. an impediment for many static analysis methods. With many of these methods architecture and design decisions are taken on the fly during the implementation of software. when debugging is done on the procedural nature of the inference engine. In addition. However. In such a case most inference engines give no further information that could aid the user in error localization. Agile and iterative methods were used in more than 50% of the projects with more than 10 person months of effort. at the same time it means that many of these mistakes are then made during implementation.

Compared to tools available as support for the development with imperative languages.7 domain experts that were involved in verification and validation. Slightly more than half of the projects included at least one domain expert that created rules. text editors that automatically load their data into the inference engine or simple schema based verification during rule formulation where the most successful tools employed. End user programmers usually can’t justify making an investment in programming training comparable to that of professional programmers. – Less refined tools support. 4 Core Principles The authors have identified four core principles to guide the building of better tool support for the creation of rule bases. this was reflected immediately in erroneous rules and unmotivated developers.Tackling the Debugging Challenge of Rule Based Systems 149 – End User Programmers and Domain Experts. This discrepancy is a direct consequence from the fact that in recent years the percentage of applications built with imperative programming languages was much larger than those built with rule languages. In cases where quick feedback during knowledge formulation could not be given4. To create tools in a way that they give feedback at the earliest moment possible. The survey of developers [1] also found that the average rule base development projects includes 1. The creation of software artifacts used to be the very specialized profession of a few thousand experts worldwide. those for the development of rule bases often lack refinement. because web developers are known to have an even higher percentage of end user programmers [10. but has now become a set of skills possessed to some degree by tens of millions of people. try-and-error process of rule base creation by allowing trying out a rule as it is created. . – Interactivity. These principles where conceived either by generalizing from tools that worked well or as direct antidote to problems encountered repeatedly. Tools that embodied interactivity proved to be very popular and successful in the three projects under discussion. that the average developer is not trained as well as the knowledge engineers that created the first expert systems.5 domain experts that create rules themselves and 1. The amount of software in society increases continually. and more and more people are involved in its creation.people that are usually trained for a non-programming area and just need a program. 4 This was mostly due to very long reasoning times or due to technical problems with the inference engine. To support an incremental.11]. Within this group an increasing role is played by end user programmers . End user programmers are particularly important for web related development. Tools such as fast graphical editors for test queries. For the US it is estimated that there are at least four times as many end user programmers as professional programmers [8]. This rise in the number of end user programmers means. with estimates for the number of end user programmers ranging from 11 million [8] up to 55 million [9]. script or spreadsheet as a tool for some task.

anomaly detection and visualization.16. – Modularization. Visibility is included as a direct counteragent to the problems of opacity and interconnection.1 Test. The declarititivity principle is a direct response to the problem of procedural debugging described in the previous section. At its core it shows test. A high level view of this architecture is shown in figure 1. This has been done because test data can inform the rule creation process . Debug and Rule Creation as Integrated Activity In order to truly support the interactivity principle the test. Based on the test data an editor can give feedback on the inferences made possible by a rule while it is created. [14. – Visibility.as mandated by the interactive principle. An overview of this integrated activity is shown in figure 2.20]) we have identified (and largely implemented) an architecture of tool support for the development of rule bases.g. debug and rule creation as an integrated activity . To show the hidden structure of (potential) rule interactions at every opportunity. Immediate feedback after small changes also helps to deal with the problem of interconnection and the problem of error reporting. the contents of the rule base and the test data) the current rule allows to infer such and such.19. modularization of rule bases has been extensively researched and implemented (e. However. – Declarativity. 5 Supporting Rule Base Development The principles identified in the previous section need to be embodied in concrete tools and development processes in order to be effective. This principle is a direct consequence of the problem of interconnection.150 V.15. Zacharias Interactivity is known to be an important success factor for development tools. This allows the developer to get instant feedback on whether her intuition of what a rule is supposed to infer matches reality. These activities are supported by test coverage. 5. create test and debug if necessary.17] and will not be further discussed in this paper. To create tools in a way that the user never has to worry about the how. Declaritivity of all development tools is a prerequisite to realize the potential of reduced complexity offered by declarative programming languages. debug and rule creation activities need to be closely integrated. Based on our experience from the three projects described earlier and on best practices as described in literature (eg [18. Testing has been broken up into two separate activities: the creation/identification of test data and the creation and evaluation of test queries.even in the absence of actual test queries. Each of the main parts will be described in more detail in the following paragraphs. This stands in contrast to the usual sequence of: create program part. An editor can automatically display that (given the state of the rule that is being created.13]. Interactivity as a principle addresses many of the problems identified in the previous section by supporting faster learning during knowledge formulation. When the . To support the structuring of a rule base in modules in order to give the user the possibility to isolate parts of the rule base and prevent unintended rule interactions. about the procedural nature of the computation. in particular for those geared towards end user programmers [12.

debug and rule creation as integrated activities inferences a rule enables do not match the expectations of the user she must be able to directly switch to the debugger to investigate this. The overall architecture of tool support for the rule base development Fig. 1. Anomaly detection heuristics focus on errors across rules that would otherwise be very hard to detects.Tackling the Debugging Challenge of Rule Based Systems 151 Fig. even without an actual test query. debug and rule creation because unlike in traditional development: – test data is used throughout editing to give immediate feedback – debugging is permanently available to support rule creation and editing. Test. Examples for anomalies are rules that use concepts that are not in the ontology or rules deducing relations that are never used anywhere. In this way testing and debugging become integrated. Its goal is to identify errors early that would be expensive to diagnose later. even without a test query being present. anomalies detection partly addresses the problem of opacity. Hence we speak of integrated test.2 Anomalies Anomalies are symptoms of probably errors in the knowledge base [21]. interconnection and error reporting by finding some of the errors related to rule interactions based on static analysis. . 2. Anomalies give feedback to the rule creation process by pointing to errors and can guide the user to perform extra tests on parts of the rule base that seem problematic. Of the problems identified in section 3. 5. The developer can debug rules in the absence of test queries. Anomaly detection is a well established and often used static verification technique for the development of rule bases.

the more complex heuristics. detailed descriptions can be found in [22]. development. Not being able to get an overview of the entire rule base and being lost in the navigation of the rule base was a frequent complaint in the three projects described in the beginning.3 Visualization The visualization of rule bases here means the use of graphic design techniques to display the overall structure of the entire rule base. It was established that debuggers needs to be declarative in order to realize the full potential of declarative programming languages. Zacharias We deployed anomaly detection heuristics within Project Halo.4 Debugging In section 3 the current procedural debugging support was identified as unsuited to support the simple creation of rule bases. – Visualizes the logical/semantic connections between rules and how rules work together to arrive at a result. An Explorative Debugger is not only a tool to identify the cause of an identified problem but also a tool to try out a program and learn about its working. how it interacts with other parts of the rule base and what role the different rule parts play. The overview visualization of the entire rule base is a response to the problem of opacity. Explorative Debugging puts the focus on rules. To address this shortcoming we have developed and implemented the Explorative Debugging paradigm for rule-based systems [23]. To. however. An explorative debugger is a rule browser that: – Shows the inferences a rule enables. It enables the user to use all her knowledge to quickly find the problem and to learn about the program at the same time. independent of the answer to any particular query. 5. To address this we created a scalable overview representation of the static and dynamic structure of a rule base. The goals for such visualizations are the same as for UML class diagrams and other overview representations of programs and models: to aid teaching. Other logical connections are formed by the prooftree that represents the logical structure of a proof that lead to a particular result. debugging. 5 Logical connections between rules are static dependencies formed for example by the possibility to unify a body atom of one rule with the head atom of another. . show which other parts are affected by a change. thereby violated the interactivity paradigm. It supports the navigation along these connections5. Unlike in procedural debuggers the debugging process is not determined by the procedural nature of the inference engine but by the user who can use the logical/semantic relations between the rules to navigate. 5. both as very simple heuristics integrated into the rule editor and more complex heuristics as a separate tool. and weren’t accepted by the users. validation and maintenance by facilitating a better understanding of a computer program/model by the developers. for example. The anomaly heuristics integrated into the editor performed well and were well received by the developers. Explorative Debugging works on the declarative semantics of the program and lets the user navigate and explore the inference process.152 V. took a long time to be calculated. It enables the user to check which inferences a rule enables.

– Is integrated into the development environment and can be started quickly to try out a rule as it is formulated. M. Governatori.. vol.. Kifer. N. 47–56 (June 2003) 7. Shaw. S. M.. Paschke. 6 Conclusions Current tools support for the development of rule based knowledge based systems fails to address many common problems encountered during this process. Studer. Scaffidi.a survey of developers. A.) RuleML 2005. Such a complete support for the development of rule bases is an important prerequisite for Semantic Web rules to become a reality and for business rule systems to reach their full potential. J...) Rule ML 2008. Larman. LNCS. Her Majesty’s Stationary Office. McCarthy. R. In: Adi. In: L/HCC 2005: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing. M. Journal of the ACM 42. The principles of interactivity. J. 351–369 (1999) 4. Basili. References 1. B.: Product-development practices that work. Springer. Stoutenburg. An architecture that combines integrated development. V.: Programs with common sense. pp.. de Bruijn. 5321. A.. debugging and testing supported by test coverage metrics. V. LNCS.. pp. Decker. visualization and anomaly detection heuristics can help tackle this challenge. Springer. The interested reader finds a complete description of the explorative debugging paradigm and one of its implementation in [23]. London.: Estimating the numbers of end users and end user programmers. 6–16. MacCormack. Fensel.Tackling the Debugging Challenge of Rule Based Systems 153 – It allows to further explore the inference a rule enables by digging down into the rule parts. pp. Myers. MIT Sloan Management Review. C. vol. G.: Software Cost Estimation with COCOMO II. G.. Wu. In: Proceedings of the Teddington Conference on the Mechanization of Thought Processes. Lausen. The novel paradigm of explorative debugging together with techniques from the automatic debugging community can form the robust basis for debugging in this context. C. Erdmann. visibility and modularization can guide the instantiation of this architecture in concrete tools. 741–843 (1995) 3. (eds. IEEE Computer. In: Database Semantics: Semantic Issues in Multimedia Systems.: Iterative and incremental development: A brief history.R. D. Englewood Cliffs (2000) . 207–214 (2005) 9. J..: Ontobroker: Ontology-based access to distributed and semi-structured unformation.: A realistic architecture for the semantic web. S. 17–29. (eds.. declarativity. Boehm. Boley.: Logical foundations of object-oriented and frame-based languages. pp. Kifer.. Heidelberg (2005) 5. M. 3791. Tabet. H. Fensel. Zacharias. Heidelberg (2008) 2. 75–91 (1959) 6.. 75–84 (2001) 8.: Development and verification of rule based systems . S. pp. D. B... Prentice Hall PTR. In: Bassiliades. A.

-Comput. B. Nash. pp. J. 5–7 (July/August 2004) 12. J. Preece. J. R..T. Balling. Zhang.. Baroff. Tsai. In: 1st Workshop on End-User Software Engineering (WEUSE 2005) at ICSE 2005 (2005) 13.. C.. W. A. L. R. In: Proceedings of the 7th International Conference on Knowledge Management . In: Proceedings of the 3rd Workshop on Scripting for the Semantic Web at the ESWC 2007 (2007) .. Simon.the overall structure. Grossner..: Rewarding good behavior: End-user debugging and rewards. 173–189 (1990) 15. R. J. A. T.: Six challenges in supporting end-user debugging. (1991) 18. 629–658 (1997) 19. S.. Zacharias.: Foundation and application of knowledge base verification. Gokulchander.. D. IEEE Software... Vishnuvajjala.. International Journal of Expert Systems (1994) 17.: A software engineering methodology for rule-based systems. NASA Contract NAS1-18585. Hum. Zacharias.154 V. Ruthruff. Harrison. 47. A. Vignollet. In: VL/HCC 2004: IEEE Symposium on Visual Languages and Human-Centric Computing (2004) 14.. M. F. 202–212 (1999) 21.Special Track on Knowledge Visualization and Knowledge Discovery (2007) 23. Shinghal. Talbot.: Revealing the structure of rule based systems. In: Proceedings of the 2004 IEEE Symposium on Visual Languages and Human Centric Computing (2004) 11. Zacharias 10. IEEE Transactions on Knowledge and Data Engineering 11. Burnett. Gilman.: Everyday programming: Challenges and opportunities for informal web development. 343–363 (1993) 20. H. Preece. Jacob. Phalgune.. M. IEEE Transactions on Knowledge and Data Engineering 2. Int. M. V.D. J.: Verification and validation of knowledge-based systems. Final Rep. Preece.: Validation and verification of knowledge-based systems: a survey. Burnett. 683–701 (1994) 22. U. R. A.B. Froscher. Gupta. P. Stud.: Visualization of rule bases . Radhakrishnan. L.: Evaluation of verification tools for knowledge-based systems. Mehrotra.. Ruthruff. Rosson..: Explorative debugging for rapid rule base development.. International Journal of Intelligent Systems 9. Shneiderman. 99–125 (1987) 16. Technical report. Abecker. W.: The dangers of end-user programming. J. M.. A. Beckwith.: Direct manipulation user interfaces for expert systems. Journal of Applied Intelligence.: Rule groupings: a software engineering approach towards verification of expert systems. V...

Sign up to vote on this title
UsefulNot useful