Declarative Logic Programming. Michael Kifer
Чтение книги онлайн.
Читать онлайн книгу Declarative Logic Programming - Michael Kifer страница 8
Artificial Intelligence
The use of logic and deduction as a basis for question-answering and reasoning in expert systems dates to at least the late 1960s. Rule-based systems were also a common approach to AI problem solving. Logic languages such as Prolog were attractive to this community because they encompassed both basic information and rule-based “intelligence” to work with that knowledge in a uniform model, while providing a formal foundation for rule-based reasoning. There were other attractions, such as the natural use of meta-programming features to manipulate programs and implement alternative evaluation systems. Prolog rules seemed an accessible means for domain experts (who were assumed not to be sophisticated programmers) to directly capture their reasoning strategies. Furthermore, the resolution-based deduction methods used with Prolog were at once a close analog of human reasoning and an efficient evaluation mechanism. By the early 1980s, there was interest in working with large fact bases in expert systems, and imbuing database systems with intelligence, manifested in the closely related areas of Knowledge-Based Systems (KBS) and Expert Database Systems (EDS). Some saw the function-free subset of Prolog as a happy medium between databases and general rule-based reasoning. Datalog became a common basis for work in the KBS and especially the EDS communities.
Background. As mentioned, one of the earliest proposals to use theorem proving as a mechanism for query answering was that of Green [1969]. The 1970s saw a proliferation of expert systems that tried to replicate human expertise in computational form [Puppe 1993]. These early systems tended to be ad hoc, with much of their knowledge encoded procedurally. Toward 1980, KBS emerged as an architectural approach to make expert systems (and other reasoning applications) easier to construct and maintain, especially when involving large collections of information [Davis 1986]. In a KBS, there is a separation of knowledge structures and the computational mechanism to apply that knowledge. These two parts are often called the knowledge base and the inference engine. (The inference engine did not necessarily use logical inference. It could, for example, make use of probabilistic reasoning.) The knowledge base contains both concrete information (facts) and more abstract forms of knowledge, such as rules, templates, or classification hierarchies. Logic was the representation for the knowledge base in some KBS, although there were competing approaches, such as frame-based systems [Minsky 1975] and semantic networks [Findler 1979]. (It is interesting to note that there was some debate at the time as to the suitability of logic for this role [Hayes 1977, Hayes 1980, Winograd 1975].) The logic-based KBS often structured the knowledge base in the form of facts and rules, similar to a logic program. For example, the DLOG system [Goebel 1985] was a KBS using logic, and its implemented subset consisted of facts and Horn clauses that were passed to a Prolog-based interpreter. Another example of a logic-based framework for KBS was the Syllog system [Fellenstein et al. 1985], which had a database and Prolog-style rules, but presented a structured natural-language interface to them. In a similar vein to KBS, Expert Database Systems (EDS) were an effort to imbue database systems with richer representation and reasoning capabilities. Logic (usually in the guise of logic programming) was often the choice for providing capabilities such as classification hierarchies [Dahl 1982] and incorporating constraints into query answering [Dahl 1986, Kifer and Li 1988]. Kerschberg [1990] provides an overview of EDS.
1.2.1 Uptake
While Datalog had precursors from Logic Programming, Database Systems, and Artificial Intelligence, the bulk of the early work on it took place in the database theory and query-language communities. Deductive-database researchers showed some interest, but their focus was more on removing non-declarative features—such as cut—from Prolog while retaining top-down, SLD-resolution approaches to evaluation [Minker et al. 2014]. On the database-query side, however, there was strong interest in adding rules and recursion to existing query frameworks, which mainly employed bottom-up techniques based on relational algebra for evaluation. Since Datalog did not have function symbols, a safe program implied a finite number IDB facts from a finite EDB, meaning that bottom-up techniques would converge. Much early work on implementation techniques for Datalog and similar languages was based on bottom-up approaches, and researchers found ways to adapt these approaches to support additional features, such as complex objects, aggregation, and negation. These extensions motivated new semantic constructs, such as stratification and stable models for negation and other non-monotone extensions. Therefore, it should be no surprise that the majority of sources cited in this chapter appeared in database venues, through there is a significant body of references from logic programming and AI publications.
1.3 Coining “Datalog”
To the best of our knowledge, the name “Datalog” as applied to function-free Prolog was coined while two of the authors of this chapter, David Maier and David S. Warren, were working on the book Computing with Logic [Maier and Warren 1988]. The book used simplified logics and languages to introduce concepts such as substitution and resolution. The simplest logic was propositional Horn clauses, and the obvious name for the corresponding language was “Proplog.” We wanted to introduce unification and specialization of clauses using predicate Horn logic without function symbols, to avoid some of more complicated bits of Prolog initially, such as the occurs check and structure sharing. Maier and Warren kicked around ideas for a name for the corresponding language, but came up with nothing exciting at first. The next day, Maier came back with “Datalog,” presumably because predicates without function symbols looked a lot like database relations.
At the time, several researchers had already proposed function-free, definiteclause logic as a basis for deductive databases. While Maier and Warren were not thinking about “Datalog” as the name for a database language, it quickly spread to that use, and the first appearances in print of “Datalog” depict it as a database language. Computing with Logic only came out in January 1988, but Maier introduced the name to students at Oregon Graduate Institute (then Oregon Graduate Center) and to collaborators at Stanford University while he was working on the book. The first uses of the name Datalog in the general literature, as far was we can tell, were in the proceedings of the PODS conference in March 1986, where it appears in Afrati et al. [1986] and Bancilhon et al. [1986]. The first mention of Datalog in the gray literature was by Harry Porter, who had taken a class from Maier on the magic of Logic Programming based on the notes for Computing with Logic. The earliest use we have found is a manuscript by Porter from October 1985 [Porter 1985]. We note that the name “Datalog” was also applied about the same time to a natural-language system for database query (written in Lisp) [Hafner and Godden 1985]. That use came from “database dialog,” and does not seem to have a connection to Prolog.
In any case, by mid-1986, Datalog as the name for a database language had been established. Viewed from the programming language side, it was a restricted version of Prolog, with no function symbols and no extra-logical predicates, such as cut
and var
. From a logical perspective, it was based on definite Horn clauses over predicates. (A definite Horn clause is a disjunction of literals with exactly one positive literal.) As a database query language, it resembled domain relational calculus. There is some variation across authors about whether or not “pure Datalog” includes negation; here we will assume not. We do assume Datalog programs are safe; safety conditions need to be modified appropriately for some of the extensions below.
1.4 Extensions to Datalog
While pure Datalog has a simple syntax and a clean theory, it does lack direct support for features commonly found in database languages: negation (and set difference), arithmetic, sets and multi-sets, aggregation, updates, and typing and