Tackling the Debugging Challenge of Rule Based Systems

Valentin Zacharias
FZI Research Center for Information Technology, Karlsruhe 76131, Germany zach@fzi.de http://www.fzi.de

Abstract. Rule based systems are often presented as a simpler and more natural way to build computer systems - compared to both imperative programming and other logic formalisms. However, with respect to finding and preventing faults this promise of simplicity remains elusive. Based on the experience from three rule base creation projects and a survey of developers, this paper identifies reasons for this gap between promise and reality and proposes steps that can be taken to close it. An architecture for a complete support of rule base development is presented. Keywords: Rule based systems, F-logic, Knowledge acquisition, Rule based modeling, Knowledge modeling, Verification, Anomaly detection.

1 Introduction
Rule based systems are often presented as a simpler and more natural way to build computer systems - compared to both imperative programming and other logic formalisms such as full first order logic or desciption logics. It seems natural to assume that this simplicity also applies to finding and preventing faults; to software quality assurrance in general. However, with respect to debugging of rule bases this promise of simplicity remains elusive. Indeed a survey found [1] that most developers of rule based systems think that the debugging of rule based systems is more difficult than the debugging of ’conventional’ - object oriented and procedural programs. An observation that the authors also found corroborated in three rule base development projects they participated in. 1.1 The Debugging Challenge The assumption of rule based systems as simpler than ’conventional’ programs rests on three observations: – Rule languages are (mostly) declarative languages that free the developer from worrying about how something is computed. This should decrease the complexity for the developer. – The If-Then structure of rules resembles the way humans naturally communicate a large part of knowledge. – The basic structure of a rule is very simple and easy to understand.
J. Filipe and J. Cordeiro (Eds.): ICEIS 2008, LNBIP 19, pp. 144–154, 2009. c Springer-Verlag Berlin Heidelberg 2009

participants of the survey saw debugging of rule based systems as actually more difficult than the debugging of conventional software. rank it as the issue most hindering the development of rule based systems.projecthalo. – Project Halo1 is a multistage project to develop systems and methods that enable domain experts to model scientific knowledge. The analysis presented in this paper is based on the experience from three projects. Section three discusses what kinds of problems were encountered. Approximately 5 person months went into this application.com .2 Overview This paper is an attempt to reconcile the promise of simplicity of rule based systems with reality. at the same time a majority of rule base developers sees rule base development as inferior with respect to debugging. chemistry and biology.Tackling the Debugging Challenge of Rule Based Systems 145 To see whether this simplicity assumption holds in practice. This project was done by a developer with experience in creating rule bases and took 3 month to complete. The next two sections then discuss overall principles to better support the creation of rule bases and an architecture of tool support for the development process. 1. – The goal of the project F-verify was to create a rule base as support for verification activities. And. – Online Forecast was a project to explore the potential of knowledge based systems with respect to maintainability and understandability. It identifies why rule bases are not currently simple to build and proposes steps to address this. As part of the second phase of Project Halo six domain experts were employed for 6 weeks each to create rule bases representing domain knowledge in physics. It models (mostly heuristic) knowledge about anomalies in rule bases. reuse and understandability that were expected from rule based (and declarative) knowledge and program specification. Towards this end an existing reporting application in a human resource department was re-created as rule based system. This paper’s structure is as follows: section two gives a short overview of the three projects that form the input for this analysis. in a different question. The results showed. 2 Experiences In addition to the survey in [1] and literature research. It consists of anomaly detection rules that work on the reified version of a rule base. a survey of developers of rule based systems [1] included a question that asked participants to compare rule base development to the development of procedural and object oriented programs. However. that these developers indeed see the advantages in ease of change. The authors found this problem of fault detection corroborated in three different projects he participated in. 1 http://www. This project was performed by a junior developer who had little prior experience with rule based system.

In particular for rules in a rule set to work together. while another processes it as millimeter. domain expert/end user programmer driven projects. 2 http://www. F-logic is very similar to some of the rule languages under discussion for the Semantic Web [4] and it is based on normal logic programs probably the prototypical rule language. while another understands it socially. Online Forecast used a simple version of Ontoprise’s2 OntoStudio. it still stands in a marked contrast to the often repeated assertion that creating rule bases is simpler. Zacharias All of these projects used F-logic [2] as representation language and the Ontobroker inference engine [3]. because they are true in broader contexts than program statements can be used [5]. graphical tools developed in the project. all rules need to be consistent in the domain formalization and the use of terminology.that most errors get made. and it is here . The authors examined literature and tools to ensure that tool support identified as missing is indeed missing from rule based systems and not only from systems supporting F-logic. All of these projects have been created using relatively lightweight methods that focus more on the actual implementation of rules than on high level models . In all cases the developers complained that creating correct rule bases takes too long and in particular that debugging takes up too much time.in the creation of rules in a way that they can work together . Developers with prior experience in imperative programming languages said that developing rule bases was more difficult then developing imperative applications. – That one rule understands a parent relation biologically. the inference engine obviously combines the rules only based on how the user has specified the rules.de . The use of only one rule formalism obviously restricts the generality of any statements that can be made. The analysis. We identified six reasons for this observed discrepancy. However. The editing tools differed between the projects: Project Halo used the high level. methods and tools presented in this paper are geared towards this kind of small. – The One Rule Fallacy and the Problem of Terminology. Declarative statements such as rules are often assumed to be easier reusable and more modular. Based on these sentiments the one rule fallacy goes as follows: One rule is simple to create and the combination of declarative rules is handled automatically by the inference engine hence entire rule bases are simply to create. – That one rule processes a number as inch.ontoprise. in particular when performed by domain experts or junior programmers. 3 Analyis In the three projects sketched above the authors found that the creation of working rule bases was an error prone and tiresome process. F-verify used Ontostudio together with prototypical debugging and editing tools. While part of these observations can be explained by longer training with imperative programming. For example it must not happen: – That one rule uses a relation part while another one uses part of .146 V.as is to be expected for such relatively small projects. However.

X) . X) # fault . 3 At least in the absence of robust. Not having an overall view on the rule base mainly affects the ability of developers to correctly edit the rule base. it is this interaction between rules that is commonly not shown.. As a very simple example consider the following rule base: hot(X) ← part of (Chile. for example. At the same time this points to the Problem of Terminology as one explanation for the difficulty in debugging rule bases. this depends on ensuring the consistent formalization and use of terminology across the rule base. – The Problem of Interconnection. everything is potentially relevant to everything else3 . One example would be that a rule causes a whole rule base to become unstratisfiable. is changed by a typo to deduce that something is hot if the country Chile is part of it. Because rule interactions are managed by the inference engine. however. shown in the source code and the subject of visualizations. another that it leads to a rule being tried in unexpected circumstances that then lead to errors due to the wrong usage of built-ins. because it becomes hard. This complicates error localization for the user. The only common way to explicitly show these interactions between the rules is in a prooftree of a successful query to the rule base. because errors appear in seemingly unrelated parts of the rule base. The problem of terminology makes the rule base harder to understand. In the authors experience failed interactions between rules are the most important source of errors during rule base creation. where the relations between the entities and the overall structure of the program are explicitly created by the developer. user managed modularization. – The Problem of Opacity. however.should be part of(Chili.Tackling the Debugging Challenge of Rule Based Systems 147 Rule based systems hold the promise to allow the automatic recombination of rules to tackle even problems not directly envisioned during the creation of the rule base. Hence a part of the gap between the expected simplicity of rule base creation and the reality can be explained by naive assumptions about rule base creation. However. More typical than this example. avoid and correct mistakes based on only a local view of the rule base.from the user’s point of view . SouthAmerica) In this example a rule that deduces that something is hot based on it having chili as part. part of (Chile. that is opaque to the user. At the same time. to see the consequences of changes or to gain confidence in the completeness of a rule base.is absolutely unrelated. are cases that do not lead to wrong conclusions but rather merely lead the inference engine to try one path that ends in a rule cycle or in some kind of error. This simple error then introduces a connection to a part of the rule base that deals with countries and that . This stands in contrast to imperative programming. The Problem of Opacity also affects the developer’s ability to identify faults in rule bases. . makes it harder to identify.

an impediment for many static analysis methods. Agile and iterative methods were used in more than 50% of the projects with more than 10 person months of effort. The execution steps of a procedural debugger are then for instance: the inference engine tries to prove a goal A or it tries to find more results for another goal B. The user can then look at the current state of the program and give commands to execute a number of the successive instructions. not on the expected scope of the final application. because agile methodologies encourage developers to make these decisions based on the scope of the current iteration.148 V.the no-result case. A further 23% of the projects worked without following any specific methodology. All deployed debugging tools for rule based programs known to the authors are based on the procedural or imperative debugging paradigm. This debugging paradigm is well known from the world of imperative programming and characterized by the concepts of breakpoints and stepping. when debugging is done on the procedural nature of the inference engine. Zacharias – The No-Result Case and the Problem of Error Reporting. but only rule based systems show the overwhelming majority of bugs by terminating without giving any help on error localization. A high prevalence of agile and iterative methods was also found in the developer survey [1]. A second likely consequence is a higher prevalence of implementation mistakes. In this way procedural debugging of rule bases breaks the declarative paradigm .they are now widely believed to be superior to waterfall-like models [7].hence the order of debugging is based on the evaluation strategy of the inference engine.it forces the developer to learn about the inner structure of the inference engine. However. In addition. Recent years also saw the increasing adoption and acceptance of iterative and evolutionary development methodologies [6] . at the same time it means that many of these mistakes are then made during implementation. With many of these methods architecture and design decisions are taken on the fly during the implementation of software. – The Problem of Procedural Debugging. Both imperative and rule based systems sometimes show bugs by behaving erratically. The first corollary of this observation is that the verification support for these systems cannot rely on the availability of formal specifications of any kind. however. – Agile. Iterative and Lightweight Methods.hence the order of debugging is based on the evaluation strategy of the inference engine. By far the most common symptom of an error in the rule base was a query that (unexpectedly) did not return any result . design and (up-front) architecture mistakes. Doing these design and architecture decisions ’on the fly’ reduces the risk of costly requirements. as a declarative representation a rule base does not define an order of execution . However a rule base does not define an order of execution . The development of rule based systems cannot take full advantage of the declarative nature of rules. A procedural debugger offers a way to indicate a rule/rule part where the program execution is to stop and has to wait for commands from the users. . these design and architecture decisions are often repeated and changed. This is unlike many imperative languages that often produce a partial output and a stack trace. In such a case most inference engines give no further information that could aid the user in error localization. This stands in marked contrast to the idea that rule bases free the developer from worrying about how something is computed.

text editors that automatically load their data into the inference engine or simple schema based verification during rule formulation where the most successful tools employed. These principles where conceived either by generalizing from tools that worked well or as direct antidote to problems encountered repeatedly. Tools that embodied interactivity proved to be very popular and successful in the three projects under discussion. To create tools in a way that they give feedback at the earliest moment possible. 4 This was mostly due to very long reasoning times or due to technical problems with the inference engine. The creation of software artifacts used to be the very specialized profession of a few thousand experts worldwide. and more and more people are involved in its creation. but has now become a set of skills possessed to some degree by tens of millions of people. To support an incremental.11]. Compared to tools available as support for the development with imperative languages. 4 Core Principles The authors have identified four core principles to guide the building of better tool support for the creation of rule bases. In cases where quick feedback during knowledge formulation could not be given4. those for the development of rule bases often lack refinement. End user programmers usually can’t justify making an investment in programming training comparable to that of professional programmers. Tools such as fast graphical editors for test queries. Slightly more than half of the projects included at least one domain expert that created rules.7 domain experts that were involved in verification and validation. – Less refined tools support. – Interactivity. The amount of software in society increases continually. This discrepancy is a direct consequence from the fact that in recent years the percentage of applications built with imperative programming languages was much larger than those built with rule languages.Tackling the Debugging Challenge of Rule Based Systems 149 – End User Programmers and Domain Experts.people that are usually trained for a non-programming area and just need a program.5 domain experts that create rules themselves and 1. that the average developer is not trained as well as the knowledge engineers that created the first expert systems. This rise in the number of end user programmers means. End user programmers are particularly important for web related development. The survey of developers [1] also found that the average rule base development projects includes 1. For the US it is estimated that there are at least four times as many end user programmers as professional programmers [8]. this was reflected immediately in erroneous rules and unmotivated developers. script or spreadsheet as a tool for some task. try-and-error process of rule base creation by allowing trying out a rule as it is created. Within this group an increasing role is played by end user programmers . with estimates for the number of end user programmers ranging from 11 million [8] up to 55 million [9]. . because web developers are known to have an even higher percentage of end user programmers [10.

This principle is a direct consequence of the problem of interconnection.even in the absence of actual test queries. This stands in contrast to the usual sequence of: create program part. The declarititivity principle is a direct response to the problem of procedural debugging described in the previous section. the contents of the rule base and the test data) the current rule allows to infer such and such. Based on the test data an editor can give feedback on the inferences made possible by a rule while it is created. To show the hidden structure of (potential) rule interactions at every opportunity.17] and will not be further discussed in this paper.16. This allows the developer to get instant feedback on whether her intuition of what a rule is supposed to infer matches reality. debug and rule creation activities need to be closely integrated.150 V. Testing has been broken up into two separate activities: the creation/identification of test data and the creation and evaluation of test queries. 5 Supporting Rule Base Development The principles identified in the previous section need to be embodied in concrete tools and development processes in order to be effective. At its core it shows test.g.as mandated by the interactive principle. Zacharias Interactivity is known to be an important success factor for development tools.13]. – Modularization. Immediate feedback after small changes also helps to deal with the problem of interconnection and the problem of error reporting. create test and debug if necessary. To support the structuring of a rule base in modules in order to give the user the possibility to isolate parts of the rule base and prevent unintended rule interactions. A high level view of this architecture is shown in figure 1. Visibility is included as a direct counteragent to the problems of opacity and interconnection. anomaly detection and visualization. debug and rule creation as an integrated activity . modularization of rule bases has been extensively researched and implemented (e. This has been done because test data can inform the rule creation process . However. An overview of this integrated activity is shown in figure 2. Each of the main parts will be described in more detail in the following paragraphs.20]) we have identified (and largely implemented) an architecture of tool support for the development of rule bases.15.1 Test. – Declarativity. Interactivity as a principle addresses many of the problems identified in the previous section by supporting faster learning during knowledge formulation. Debug and Rule Creation as Integrated Activity In order to truly support the interactivity principle the test. – Visibility.19. about the procedural nature of the computation. When the . Declaritivity of all development tools is a prerequisite to realize the potential of reduced complexity offered by declarative programming languages. An editor can automatically display that (given the state of the rule that is being created. To create tools in a way that the user never has to worry about the how. Based on our experience from the three projects described earlier and on best practices as described in literature (eg [18. [14. in particular for those geared towards end user programmers [12. These activities are supported by test coverage. 5.

The overall architecture of tool support for the rule base development Fig. Anomaly detection is a well established and often used static verification technique for the development of rule bases. .Tackling the Debugging Challenge of Rule Based Systems 151 Fig. 5. even without an actual test query. Hence we speak of integrated test. 1. Anomaly detection heuristics focus on errors across rules that would otherwise be very hard to detects. In this way testing and debugging become integrated. interconnection and error reporting by finding some of the errors related to rule interactions based on static analysis. Of the problems identified in section 3. anomalies detection partly addresses the problem of opacity. 2. Anomalies give feedback to the rule creation process by pointing to errors and can guide the user to perform extra tests on parts of the rule base that seem problematic. debug and rule creation because unlike in traditional development: – test data is used throughout editing to give immediate feedback – debugging is permanently available to support rule creation and editing. even without a test query being present. Its goal is to identify errors early that would be expensive to diagnose later. Test. The developer can debug rules in the absence of test queries.2 Anomalies Anomalies are symptoms of probably errors in the knowledge base [21]. Examples for anomalies are rules that use concepts that are not in the ontology or rules deducing relations that are never used anywhere. debug and rule creation as integrated activities inferences a rule enables do not match the expectations of the user she must be able to directly switch to the debugger to investigate this.

for example. Explorative Debugging works on the declarative semantics of the program and lets the user navigate and explore the inference process. 5. Not being able to get an overview of the entire rule base and being lost in the navigation of the rule base was a frequent complaint in the three projects described in the beginning. It was established that debuggers needs to be declarative in order to realize the full potential of declarative programming languages. detailed descriptions can be found in [22]. Zacharias We deployed anomaly detection heuristics within Project Halo. The overview visualization of the entire rule base is a response to the problem of opacity. independent of the answer to any particular query. To address this we created a scalable overview representation of the static and dynamic structure of a rule base. An Explorative Debugger is not only a tool to identify the cause of an identified problem but also a tool to try out a program and learn about its working. Other logical connections are formed by the prooftree that represents the logical structure of a proof that lead to a particular result. It supports the navigation along these connections5. thereby violated the interactivity paradigm. . An explorative debugger is a rule browser that: – Shows the inferences a rule enables. – Visualizes the logical/semantic connections between rules and how rules work together to arrive at a result.152 V. and weren’t accepted by the users. 5. It enables the user to check which inferences a rule enables. development. how it interacts with other parts of the rule base and what role the different rule parts play. To address this shortcoming we have developed and implemented the Explorative Debugging paradigm for rule-based systems [23]. validation and maintenance by facilitating a better understanding of a computer program/model by the developers. both as very simple heuristics integrated into the rule editor and more complex heuristics as a separate tool. the more complex heuristics. The goals for such visualizations are the same as for UML class diagrams and other overview representations of programs and models: to aid teaching. It enables the user to use all her knowledge to quickly find the problem and to learn about the program at the same time. took a long time to be calculated. The anomaly heuristics integrated into the editor performed well and were well received by the developers.4 Debugging In section 3 the current procedural debugging support was identified as unsuited to support the simple creation of rule bases. 5 Logical connections between rules are static dependencies formed for example by the possibility to unify a body atom of one rule with the head atom of another. Explorative Debugging puts the focus on rules. Unlike in procedural debuggers the debugging process is not determined by the procedural nature of the inference engine but by the user who can use the logical/semantic relations between the rules to navigate. show which other parts are affected by a change. debugging. however.3 Visualization The visualization of rule bases here means the use of graphic design techniques to display the overall structure of the entire rule base. To.

47–56 (June 2003) 7. (eds. Paschke. Springer. M. R.: Development and verification of rule based systems .: Logical foundations of object-oriented and frame-based languages.. pp.: A realistic architecture for the semantic web. pp. Basili. declarativity. M. Decker. Governatori.: Estimating the numbers of end users and end user programmers. B. 6 Conclusions Current tools support for the development of rule based knowledge based systems fails to address many common problems encountered during this process.: Product-development practices that work. 741–843 (1995) 3. V. Lausen. Boley. Zacharias. 75–84 (2001) 8. IEEE Computer. LNCS. H. Scaffidi. J. Her Majesty’s Stationary Office. LNCS. London. In: Bassiliades. vol.. The novel paradigm of explorative debugging together with techniques from the automatic debugging community can form the robust basis for debugging in this context. D. D. The interested reader finds a complete description of the explorative debugging paradigm and one of its implementation in [23]. S. 207–214 (2005) 9. McCarthy. – Is integrated into the development environment and can be started quickly to try out a rule as it is formulated. Englewood Cliffs (2000) . Such a complete support for the development of rule bases is an important prerequisite for Semantic Web rules to become a reality and for business rule systems to reach their full potential. Springer. 6–16.: Software Cost Estimation with COCOMO II. S.Tackling the Debugging Challenge of Rule Based Systems 153 – It allows to further explore the inference a rule enables by digging down into the rule parts. A.a survey of developers. In: Adi. References 1. Myers. Larman. A... Kifer.. 75–91 (1959) 6. M. pp. The principles of interactivity. Boehm. C. Fensel. 5321. vol. N. (eds. Journal of the ACM 42.: Programs with common sense. A. Shaw. visibility and modularization can guide the instantiation of this architecture in concrete tools. visualization and anomaly detection heuristics can help tackle this challenge. 351–369 (1999) 4.R. G. 17–29. In: L/HCC 2005: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing. In: Database Semantics: Semantic Issues in Multimedia Systems... J. pp. V. MIT Sloan Management Review. Wu. 3791.. de Bruijn. Erdmann.: Ontobroker: Ontology-based access to distributed and semi-structured unformation. Heidelberg (2005) 5... J. M. debugging and testing supported by test coverage metrics.) RuleML 2005.. Studer. Kifer. G. Prentice Hall PTR. B. MacCormack.) Rule ML 2008.. Tabet. Fensel. Stoutenburg. S. An architecture that combines integrated development. pp. C.: Iterative and incremental development: A brief history.. In: Proceedings of the Teddington Conference on the Mechanization of Thought Processes.. Heidelberg (2008) 2..

Tsai. Jacob.: Direct manipulation user interfaces for expert systems..B. H.. M.. V. IEEE Software. C. R. Simon.T. Talbot. L. Ruthruff.... A.: Rewarding good behavior: End-user debugging and rewards.: The dangers of end-user programming. Final Rep. U. A. Stud. 202–212 (1999) 21. In: 1st Workshop on End-User Software Engineering (WEUSE 2005) at ICSE 2005 (2005) 13... M. In: Proceedings of the 2004 IEEE Symposium on Visual Languages and Human Centric Computing (2004) 11. Int.: Six challenges in supporting end-user debugging.D.: Visualization of rule bases . P.. M. J. T. Mehrotra.-Comput. Radhakrishnan. Rosson. V.. 343–363 (1993) 20. J. Zhang. In: Proceedings of the 3rd Workshop on Scripting for the Semantic Web at the ESWC 2007 (2007) . Balling. J. Preece.: Revealing the structure of rule based systems. Vishnuvajjala. Nash. Gokulchander. Abecker.154 V. 47. Preece. 629–658 (1997) 19. A.. Shneiderman. In: Proceedings of the 7th International Conference on Knowledge Management . R. W. A. B.. Phalgune. Technical report. International Journal of Intelligent Systems 9.. Preece.: A software engineering methodology for rule-based systems.. NASA Contract NAS1-18585. W. D. Hum.the overall structure.. Zacharias. Journal of Applied Intelligence. 99–125 (1987) 16. In: VL/HCC 2004: IEEE Symposium on Visual Languages and Human-Centric Computing (2004) 14. Gilman.: Rule groupings: a software engineering approach towards verification of expert systems. pp. Froscher. F. 683–701 (1994) 22. Burnett. Burnett. Vignollet. A. M. IEEE Transactions on Knowledge and Data Engineering 11.: Validation and verification of knowledge-based systems: a survey. Baroff. Grossner.. Zacharias.Special Track on Knowledge Visualization and Knowledge Discovery (2007) 23. J. Shinghal.: Explorative debugging for rapid rule base development.: Foundation and application of knowledge base verification. Gupta. 173–189 (1990) 15. Zacharias 10.: Everyday programming: Challenges and opportunities for informal web development. Harrison. Beckwith.. International Journal of Expert Systems (1994) 17.. J.: Verification and validation of knowledge-based systems.: Evaluation of verification tools for knowledge-based systems. S. L.. IEEE Transactions on Knowledge and Data Engineering 2. R. 5–7 (July/August 2004) 12. J. R. Ruthruff. (1991) 18.

Sign up to vote on this title
UsefulNot useful