Tackling the Debugging Challenge of Rule Based Systems

Valentin Zacharias
FZI Research Center for Information Technology, Karlsruhe 76131, Germany zach@fzi.de http://www.fzi.de

Abstract. Rule based systems are often presented as a simpler and more natural way to build computer systems - compared to both imperative programming and other logic formalisms. However, with respect to finding and preventing faults this promise of simplicity remains elusive. Based on the experience from three rule base creation projects and a survey of developers, this paper identifies reasons for this gap between promise and reality and proposes steps that can be taken to close it. An architecture for a complete support of rule base development is presented.

Keywords: Rule based systems, F-logic, Knowledge acquisition, Rule based modeling, Knowledge modeling, Verification, Anomaly detection.

1 Introduction
Rule based systems are often presented as a simpler and more natural way to build computer systems - compared to both imperative programming and other logic formalisms such as full first order logic or description logics. It seems natural to assume that this simplicity also applies to finding and preventing faults, and to software quality assurance in general. However, with respect to the debugging of rule bases this promise of simplicity remains elusive. Indeed, a survey [1] found that most developers of rule based systems think that debugging rule based systems is more difficult than debugging 'conventional' object oriented and procedural programs, an observation that the authors also found corroborated in three rule base development projects they participated in.

1.1 The Debugging Challenge
The assumption that rule based systems are simpler than 'conventional' programs rests on three observations:
– Rule languages are (mostly) declarative languages that free the developer from worrying about how something is computed. This should decrease the complexity for the developer.
– The If-Then structure of rules resembles the way humans naturally communicate a large part of knowledge.
– The basic structure of a rule is very simple and easy to understand.
J. Filipe and J. Cordeiro (Eds.): ICEIS 2008, LNBIP 19, pp. 144–154, 2009. © Springer-Verlag Berlin Heidelberg 2009

To see whether this simplicity assumption holds in practice, a survey of developers of rule based systems [1] included a question that asked participants to compare rule base development to the development of procedural and object oriented programs. The results showed that these developers indeed see the advantages in ease of change, reuse and understandability that were expected from rule based (and declarative) knowledge and program specification. However, at the same time a majority of rule base developers see rule base development as inferior with respect to debugging: participants of the survey saw debugging of rule based systems as actually more difficult than the debugging of conventional software and, in a different question, ranked it as the issue most hindering the development of rule based systems. The authors found this problem of fault detection corroborated in three different projects they participated in.

1.2 Overview
This paper is an attempt to reconcile the promise of simplicity of rule based systems with reality. It identifies why rule bases are not currently simple to build and proposes steps to address this. This paper's structure is as follows: section two gives a short overview of the three projects that form the input for this analysis. Section three discusses what kinds of problems were encountered. The next two sections then discuss overall principles to better support the creation of rule bases and an architecture of tool support for the development process.

2 Experiences
In addition to the survey in [1] and literature research, the analysis presented in this paper is based on the experience from three projects.
– Project Halo¹ is a multistage project to develop systems and methods that enable domain experts to model scientific knowledge. As part of the second phase of Project Halo six domain experts were employed for 6 weeks each to create rule bases representing domain knowledge in physics, chemistry and biology.
– Online Forecast was a project to explore the potential of knowledge based systems with respect to maintainability and understandability. Towards this end an existing reporting application in a human resource department was re-created as a rule based system. This project was performed by a junior developer who had little prior experience with rule based systems. Approximately 5 person months went into this application.
– The goal of the project F-verify was to create a rule base as support for verification activities. It models (mostly heuristic) knowledge about anomalies in rule bases. It consists of anomaly detection rules that work on the reified version of a rule base. This project was done by a developer with experience in creating rule bases and took 3 months to complete.

¹ http://www.projecthalo.com

All of these projects used F-logic [2] as representation language and the Ontobroker inference engine [3]. The editing tools differed between the projects: Project Halo used the high level, graphical tools developed in the project. Online Forecast used a simple version of Ontoprise's² OntoStudio. F-verify used OntoStudio together with prototypical debugging and editing tools. All of these projects were created using relatively lightweight methods that focus more on the actual implementation of rules than on high level models - as is to be expected for such relatively small projects. The analysis, methods and tools presented in this paper are geared towards this kind of small, domain expert/end user programmer driven project.

The use of only one rule formalism obviously restricts the generality of any statements that can be made. However, F-logic is very similar to some of the rule languages under discussion for the Semantic Web [4] and it is based on normal logic programs, probably the prototypical rule language. The authors examined literature and tools to ensure that tool support identified as missing is indeed missing from rule based systems and not only from systems supporting F-logic.

3 Analysis
In the three projects sketched above the authors found that the creation of working rule bases was an error prone and tiresome process, in particular when performed by domain experts or junior programmers. In all cases the developers complained that creating correct rule bases takes too long and in particular that debugging takes up too much time. Developers with prior experience in imperative programming languages said that developing rule bases was more difficult than developing imperative applications. While part of these observations can be explained by longer training with imperative programming, it still stands in marked contrast to the often repeated assertion that creating rule bases is simpler. We identified six reasons for this observed discrepancy.

– The One Rule Fallacy and the Problem of Terminology. Declarative statements such as rules are often assumed to be more easily reusable and more modular, because they are true in broader contexts than program statements can be used [5]. Based on these sentiments the one rule fallacy goes as follows: one rule is simple to create and the combination of declarative rules is handled automatically by the inference engine, hence entire rule bases are simple to create. However, the inference engine obviously combines the rules only based on how the user has specified the rules, and it is here - in the creation of rules in a way that they can work together - that most errors get made. In particular, for rules in a rule set to work together, all rules need to be consistent in the domain formalization and the use of terminology. For example it must not happen:
  – That one rule uses a relation part while another one uses part_of.
  – That one rule understands a parent relation biologically, while another understands it socially.
  – That one rule processes a number as inch, while another processes it as millimeter.

² http://www.ontoprise.de

Rule based systems hold the promise to allow the automatic recombination of rules to tackle even problems not directly envisioned during the creation of the rule base. However, this depends on ensuring the consistent formalization and use of terminology across the rule base. Hence a part of the gap between the expected simplicity of rule base creation and the reality can be explained by naive assumptions about rule base creation. At the same time this points to the Problem of Terminology as one explanation for the difficulty in debugging rule bases. The problem of terminology makes the rule base harder to understand.

– The Problem of Opacity. In the authors' experience failed interactions between rules are the most important source of errors during rule base creation. At the same time, however, it is this interaction between rules that is commonly not shown. The only common way to explicitly show these interactions between the rules is in a prooftree of a successful query to the rule base. This stands in contrast to imperative programming, where the relations between the entities and the overall structure of the program are explicitly created by the developer, shown in the source code and the subject of visualizations. Not having an overall view on the rule base mainly affects the ability of developers to correctly edit the rule base, because it becomes hard, for example, to see the consequences of changes or to gain confidence in the completeness of a rule base. The Problem of Opacity also affects the developer's ability to identify faults in rule bases.

– The Problem of Interconnection. Because rule interactions are managed by the inference engine, which is opaque to the user, everything is potentially relevant to everything else³. This complicates error localization for the user, because errors appear in seemingly unrelated parts of the rule base, and makes it harder to identify, avoid and correct mistakes based on only a local view of the rule base. One example would be that a rule causes a whole rule base to become unstratifiable, another that it leads to a rule being tried in unexpected circumstances that then lead to errors due to the wrong usage of built-ins. As a very simple example consider the following rule base:

    hot(X) ← part_of(Chile, X)    # fault - should be part_of(Chili, X)
    part_of(Chile, SouthAmerica)

In this example a rule that deduces that something is hot based on it having chili as part is changed by a typo to deduce that something is hot if the country Chile is part of it. This simple error then introduces a connection to a part of the rule base that deals with countries and that - from the user's point of view - is absolutely unrelated. More typical than this example, however, are cases that do not lead to wrong conclusions but rather merely lead the inference engine to try one path that ends in a rule cycle or in some kind of error.

³ At least in the absence of robust, user managed modularization.
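As an illustration only, the Chile/Chili rule base from the Problem of Interconnection can be replayed in a few lines of Python. This is a minimal sketch and not the F-logic or Ontobroker setup used in the projects: the function name hot, the tuple encoding of part_of facts and the fact part_of(Chili, Curry) are assumptions introduced here so that the intended rule has something to derive.

```python
# Facts of the form part_of(part, whole). The Curry fact is a hypothetical
# addition for illustration; the paper's rule base lists only the Chile fact.
PART_OF = {
    ("Chili", "Curry"),
    ("Chile", "SouthAmerica"),
}


def hot(part_of_facts, ingredient):
    """Evaluate hot(X) <- part_of(ingredient, X) against the given facts."""
    return {whole for (part, whole) in part_of_facts if part == ingredient}


intended = hot(PART_OF, "Chili")  # the rule as the developer meant it
faulty = hot(PART_OF, "Chile")    # the rule after the one-letter typo

print("intended:", sorted(intended))
print("faulty:  ", sorted(faulty))
```

With the typo the rule silently starts matching the geography facts and derives that SouthAmerica is hot, while the conclusion the developer expected disappears. Nothing in the rule itself signals the fault, which is the point the example above makes about errors surfacing in seemingly unrelated parts of a rule base.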

– The No-Result Case and the Problem of Error Reporting. By far the most common symptom of an error in the rule base was a query that (unexpectedly) did not return any result - the no-result case. In such a case most inference engines give no further information that could aid the user in error localization. This is unlike many imperative languages that often produce a partial output and a stack trace. Both imperative and rule based systems sometimes show bugs by behaving erratically, but only rule based systems show the overwhelming majority of bugs by terminating without giving any help on error localization.

– The Problem of Procedural Debugging. All deployed debugging tools for rule based programs known to the authors are based on the procedural or imperative debugging paradigm. This debugging paradigm is well known from the world of imperative programming and characterized by the concepts of breakpoints and stepping. A procedural debugger offers a way to indicate a rule/rule part where the program execution is to stop and wait for commands from the user. The user can then look at the current state of the program and give commands to execute a number of the successive instructions. However, as a declarative representation a rule base does not define an order of execution - hence the order of debugging is based on the evaluation strategy of the inference engine. The execution steps of a procedural debugger are then for instance: the inference engine tries to prove a goal A or it tries to find more results for another goal B. In this way procedural debugging of rule bases breaks the declarative paradigm - it forces the developer to learn about the inner structure of the inference engine. This stands in marked contrast to the idea that rule bases free the developer from worrying about how something is computed. The development of rule based systems cannot take full advantage of the declarative nature of rules when debugging is done on the procedural nature of the inference engine.

– Agile, Iterative and Lightweight Methods. Recent years also saw the increasing adoption and acceptance of iterative and evolutionary development methodologies [6] - they are now widely believed to be superior to waterfall-like models [7]. A high prevalence of agile and iterative methods was also found in the developer survey [1]. Agile and iterative methods were used in more than 50% of the projects with more than 10 person months of effort. A further 23% of the projects worked without following any specific methodology. The first corollary of this observation is that the verification support for these systems cannot rely on the availability of formal specifications of any kind. A second likely consequence is a higher prevalence of implementation mistakes. With many of these methods architecture and design decisions are taken on the fly during the implementation of software. Making these design and architecture decisions 'on the fly' reduces the risk of costly requirements, design and (up-front) architecture mistakes; at the same time, however, it means that many of these mistakes are then made during implementation. In addition, because agile methodologies encourage developers to make these decisions based on the scope of the current iteration, not on the expected scope of the final application, these design and architecture decisions are often repeated and changed, an impediment for many static analysis methods.

– End User Programmers and Domain Experts. The amount of software in society increases continually, and more and more people are involved in its creation. The creation of software artifacts used to be the very specialized profession of a few thousand experts worldwide, but has now become a set of skills possessed to some degree by tens of millions of people. Within this group an increasing role is played by end user programmers - people that are usually trained for a non-programming area and just need a program, script or spreadsheet as a tool for some task. For the US it is estimated that there are at least four times as many end user programmers as professional programmers [8], with estimates for the number of end user programmers ranging from 11 million [8] up to 55 million [9]. End user programmers are particularly important for web related development, because web developers are known to have an even higher percentage of end user programmers [10, 11]. The survey of developers [1] also found that the average rule base development project includes 1.5 domain experts that create rules themselves and 1.7 domain experts that were involved in verification and validation. Slightly more than half of the projects included at least one domain expert that created rules. End user programmers usually can't justify making an investment in programming training comparable to that of professional programmers. This rise in the number of end user programmers means that the average developer is not trained as well as the knowledge engineers that created the first expert systems.

– Less Refined Tool Support. Compared to tools available as support for the development with imperative languages, those for the development of rule bases often lack refinement. This discrepancy is a direct consequence of the fact that in recent years the percentage of applications built with imperative programming languages was much larger than that of applications built with rule languages.

4 Core Principles
The authors have identified four core principles to guide the building of better tool support for the creation of rule bases. These principles were conceived either by generalizing from tools that worked well or as a direct antidote to problems encountered repeatedly.

– Interactivity. To create tools in a way that they give feedback at the earliest moment possible. To support an incremental, trial-and-error process of rule base creation by allowing trying out a rule as it is created. Tools that embodied interactivity proved to be very popular and successful in the three projects under discussion. Tools such as fast graphical editors for test queries, text editors that automatically load their data into the inference engine or simple schema based verification during rule formulation were the most successful tools employed. In cases where quick feedback during knowledge formulation could not be given⁴, this was reflected immediately in erroneous rules and unmotivated developers.

⁴ This was mostly due to very long reasoning times or due to technical problems with the inference engine.

Interactivity is known to be an important success factor for development tools, in particular for those geared towards end user programmers [12,13]. Interactivity as a principle addresses many of the problems identified in the previous section by supporting faster learning during knowledge formulation. Immediate feedback after small changes also helps to deal with the problem of interconnection and the problem of error reporting.

– Visibility. To show the hidden structure of (potential) rule interactions at every opportunity. Visibility is included as a direct counteragent to the problems of opacity and interconnection.

– Declarativity. To create tools in a way that the user never has to worry about the how, about the procedural nature of the computation. The declarativity principle is a direct response to the problem of procedural debugging described in the previous section. Declarativity of all development tools is a prerequisite to realize the potential of reduced complexity offered by declarative programming languages.

– Modularization. To support the structuring of a rule base in modules in order to give the user the possibility to isolate parts of the rule base and prevent unintended rule interactions. This principle is a direct consequence of the problem of interconnection. However, modularization of rule bases has been extensively researched and implemented (e.g. [14,15,16,17]) and will not be further discussed in this paper.

5 Supporting Rule Base Development
The principles identified in the previous section need to be embodied in concrete tools and development processes in order to be effective. Based on our experience from the three projects described earlier and on best practices as described in literature (e.g. [18,19,20]) we have identified (and largely implemented) an architecture of tool support for the development of rule bases. A high level view of this architecture is shown in figure 1. At its core it shows test, debug and rule creation as an integrated activity - as mandated by the interactivity principle. These activities are supported by test coverage, anomaly detection and visualization. Each of the main parts will be described in more detail in the following paragraphs.

Fig. 1. The overall architecture of tool support for the rule base development

5.1 Test, Debug and Rule Creation as Integrated Activity
In order to truly support the interactivity principle the test, debug and rule creation activities need to be closely integrated. This stands in contrast to the usual sequence of: create program part, create test and debug if necessary. An overview of this integrated activity is shown in figure 2. Testing has been broken up into two separate activities: the creation/identification of test data and the creation and evaluation of test queries. This has been done because test data can inform the rule creation process - even in the absence of actual test queries. Based on the test data an editor can give feedback on the inferences made possible by a rule while it is created. An editor can automatically display that (given the state of the rule that is being created, the contents of the rule base and the test data) the current rule allows to infer such and such. This allows the developer to get instant feedback on whether her intuition of what a rule is supposed to infer matches reality. When the inferences a rule enables do not match the expectations of the user she must be able to switch directly to the debugger to investigate this, even without a test query being present. In this way testing and debugging become integrated: the developer can debug rules in the absence of test queries. Hence we speak of integrated test, debug and rule creation because unlike in traditional development:
– test data is used throughout editing to give immediate feedback
– debugging is permanently available to support rule creation and editing.

Fig. 2. Test, debug and rule creation as integrated activities

5.2 Anomalies
Anomalies are symptoms of probable errors in the knowledge base [21]. Anomaly detection is a well established and often used static verification technique for the development of rule bases. Its goal is to identify errors early that would be expensive to diagnose later. Anomalies give feedback to the rule creation process by pointing to errors and can guide the user to perform extra tests on parts of the rule base that seem problematic. Anomaly detection heuristics focus on errors across rules that would otherwise be very hard to detect. Examples for anomalies are rules that use concepts that are not in the ontology or rules deducing relations that are never used anywhere. Of the problems identified in section 3, anomaly detection partly addresses the problems of opacity, interconnection and error reporting by finding some of the errors related to rule interactions based on static analysis.
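The two example anomalies named above can be checked with a very small static analysis. The following Python sketch is an assumption-laden illustration and not the F-verify rule base or any Ontobroker API: the Rule class, the find_anomalies function, the ontology given as a plain set of relation names and the queried parameter are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified rule: one head relation, several body relations."""
    name: str
    head: str
    body: tuple


def find_anomalies(rules, ontology, queried=frozenset()):
    """Return (rule name, message) pairs for two simple anomaly heuristics."""
    used_in_bodies = {rel for rule in rules for rel in rule.body}
    anomalies = []
    for rule in rules:
        # Heuristic 1: the rule mentions a relation that is not in the ontology.
        for rel in (rule.head, *rule.body):
            if rel not in ontology:
                anomalies.append((rule.name, f"unknown relation '{rel}'"))
        # Heuristic 2: the rule deduces a relation that no rule body and no
        # known query ever uses.
        if rule.head not in used_in_bodies and rule.head not in queried:
            anomalies.append(
                (rule.name, f"deduced relation '{rule.head}' is never used")
            )
    return anomalies


ONTOLOGY = {"part_of", "hot", "spicy_dish"}
RULES = [
    Rule("r1", "hot", ("part_of",)),
    Rule("r2", "spicy_dish", ("hot", "partof")),  # 'partof' is a terminology slip
]

for rule_name, message in find_anomalies(RULES, ONTOLOGY, queried={"spicy_dish"}):
    print(rule_name, "-", message)
```

Both checks look only at relation names, so they run quickly enough to be repeated after every edit; heuristics that need reasoning over the rule base would be slower and, as the experience reported in this section suggests, risk violating the interactivity principle.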

We deployed anomaly detection heuristics within Project Halo, both as very simple heuristics integrated into the rule editor and more complex heuristics as a separate tool. The anomaly heuristics integrated into the editor performed well and were well received by the developers. The more complex heuristics, however, took a long time to be calculated, thereby violated the interactivity principle, and weren't accepted by the users.

5.3 Visualization
The visualization of rule bases here means the use of graphic design techniques to display the overall structure of the entire rule base, independent of the answer to any particular query. The goals for such visualizations are the same as for UML class diagrams and other overview representations of programs and models: to aid teaching, development, debugging, validation and maintenance by facilitating a better understanding of a computer program/model by the developers, for example by showing which other parts are affected by a change. The overview visualization of the entire rule base is a response to the problem of opacity. Not being able to get an overview of the entire rule base and being lost in the navigation of the rule base was a frequent complaint in the three projects described in the beginning. To address this we created a scalable overview representation of the static and dynamic structure of a rule base; detailed descriptions can be found in [22].

5.4 Debugging
In section 3 the current procedural debugging support was identified as unsuited to support the simple creation of rule bases. It was established that debuggers need to be declarative in order to realize the full potential of declarative programming languages. To address this shortcoming we have developed and implemented the Explorative Debugging paradigm for rule-based systems [23]. Explorative Debugging works on the declarative semantics of the program and lets the user navigate and explore the inference process. It enables the user to use all her knowledge to quickly find the problem and to learn about the program at the same time. An Explorative Debugger is not only a tool to identify the cause of an identified problem but also a tool to try out a program and learn about its working.

Explorative Debugging puts the focus on rules. It enables the user to check which inferences a rule enables, how it interacts with other parts of the rule base and what role the different rule parts play. Unlike in procedural debuggers the debugging process is not determined by the procedural nature of the inference engine but by the user, who can use the logical/semantic relations between the rules to navigate. An explorative debugger is a rule browser that:
– Shows the inferences a rule enables.
– Visualizes the logical/semantic connections between rules and how rules work together to arrive at a result. It supports the navigation along these connections⁵.

⁵ Logical connections between rules are static dependencies formed for example by the possibility to unify a body atom of one rule with the head atom of another. Other logical connections are formed by the prooftree that represents the logical structure of a proof that led to a particular result.

– It allows to further explore the inference a rule enables by digging down into the rule parts.
– Is integrated into the development environment and can be started quickly to try out a rule as it is formulated.

The interested reader finds a complete description of the explorative debugging paradigm and one of its implementations in [23].

6 Conclusions
Current tool support for the development of rule based knowledge based systems fails to address many common problems encountered during this process. An architecture that combines integrated development, debugging and testing supported by test coverage metrics, visualization and anomaly detection heuristics can help tackle this challenge. The principles of interactivity, declarativity, visibility and modularization can guide the instantiation of this architecture in concrete tools. The novel paradigm of explorative debugging together with techniques from the automatic debugging community can form the robust basis for debugging in this context. Such a complete support for the development of rule bases is an important prerequisite for Semantic Web rules to become a reality and for business rule systems to reach their full potential.

References
1. Zacharias, V.: Development and verification of rule based systems - a survey of developers. In: Bassiliades, N., Governatori, G., Paschke, A. (eds.) RuleML 2008. LNCS, vol. 5321, pp. 6–16. Springer, Heidelberg (2008)
2. Kifer, M., Lausen, G., Wu, J.: Logical foundations of object-oriented and frame-based languages. Journal of the ACM 42, 741–843 (1995)
3. Decker, S., Erdmann, M., Fensel, D., Studer, R.: Ontobroker: Ontology-based access to distributed and semi-structured information. In: Database Semantics: Semantic Issues in Multimedia Systems, pp. 351–369 (1999)
4. Kifer, M., de Bruijn, J., Boley, H., Fensel, D.: A realistic architecture for the semantic web. In: Adi, A., Stoutenburg, S., Tabet, S. (eds.) RuleML 2005. LNCS, vol. 3791, pp. 17–29. Springer, Heidelberg (2005)
5. McCarthy, J.: Programs with common sense. In: Proceedings of the Teddington Conference on the Mechanization of Thought Processes, pp. 75–91. Her Majesty's Stationery Office, London (1959)
6. Larman, C., Basili, V.R.: Iterative and incremental development: A brief history. IEEE Computer, 47–56 (June 2003)
7. MacCormack, A.: Product-development practices that work. MIT Sloan Management Review, 75–84 (2001)
8. Scaffidi, C., Shaw, M., Myers, B.: Estimating the numbers of end users and end user programmers. In: VL/HCC 2005: Proceedings of the 2005 IEEE Symposium on Visual Languages and Human-Centric Computing, pp. 207–214 (2005)
9. Boehm, B.: Software Cost Estimation with COCOMO II. Prentice Hall PTR, Englewood Cliffs (2000)

10. Rosson, M.B., Balling, J., Nash, H.: Everyday programming: Challenges and opportunities for informal web development. In: Proceedings of the 2004 IEEE Symposium on Visual Languages and Human Centric Computing (2004)
11. Harrison, W.: The dangers of end-user programming. IEEE Software, 5–7 (July/August 2004)
12. Ruthruff, J., Burnett, M.: Six challenges in supporting end-user debugging. In: 1st Workshop on End-User Software Engineering (WEUSE 2005) at ICSE 2005 (2005)
13. Ruthruff, J., Phalgune, A., Beckwith, L., Burnett, M.: Rewarding good behavior: End-user debugging and rewards. In: VL/HCC 2004: IEEE Symposium on Visual Languages and Human-Centric Computing (2004)
14. Jacob, R., Froscher, J.: A software engineering methodology for rule-based systems. IEEE Transactions on Knowledge and Data Engineering 2, 173–189 (1990)
15. Baroff, J., Simon, R., Gilman, F., Shneiderman, B.: Direct manipulation user interfaces for expert systems, pp. 99–125 (1987)
16. Grossner, C., Gokulchander, P., Preece, A., Radhakrishnan, T.: Revealing the structure of rule based systems. International Journal of Expert Systems (1994)
17. Mehrotra, M.: Rule groupings: a software engineering approach towards verification of expert systems. Technical report, NASA Contract NAS1-18585, Final Rep. (1991)
18. Preece, A., Talbot, S., Vignollet, L.: Evaluation of verification tools for knowledge-based systems. Int. J. Hum.-Comput. Stud. 47, 629–658 (1997)
19. Gupta, U.: Validation and verification of knowledge-based systems: a survey. Journal of Applied Intelligence, 343–363 (1993)
20. Tsai, W.T., Vishnuvajjala, R., Zhang, D.: Verification and validation of knowledge-based systems. IEEE Transactions on Knowledge and Data Engineering 11, 202–212 (1999)
21. Preece, A.D., Shinghal, R.: Foundation and application of knowledge base verification. International Journal of Intelligent Systems 9, 683–701 (1994)
22. Zacharias, V.: Visualization of rule bases - the overall structure. In: Proceedings of the 7th International Conference on Knowledge Management - Special Track on Knowledge Visualization and Knowledge Discovery (2007)
23. Zacharias, V., Abecker, A.: Explorative debugging for rapid rule base development. In: Proceedings of the 3rd Workshop on Scripting for the Semantic Web at the ESWC 2007 (2007)
