Ontology Engineering. Elisa F. Kendall
8 Jess, the Java Expert System Shell and scripting language, see https://herzberg.ca.sandia.gov/docs/52/.
9 FLORA-2: Knowledge Representation and Reasoning with Objects, Actions, and Defaults, see http://flora.sourceforge.net/.
10 For more information on general first-order logics and their use in ontology development, see Sowa (1999) and ISO/IEC 24707:2018 (2018).
11 For more information on description logics, KR and reasoning, see Baader et al. (2003) and Brachman and Levesque (2004).
CHAPTER 2
Before You Begin
In this chapter we provide an introduction to domain analysis and conceptual modeling, discuss some of the methods used to evaluate ontologies for reusability and fit for purpose, identify some common patterns, and give some high-level analysis considerations for language selection when starting a knowledge representation project.
2.1 DOMAIN ANALYSIS
Domain analysis involves the systematic development of a model of some area of interest for a particular purpose. The analysis process, including the specific methodology and level of effort, depends on the context of the work, including the requirements and use cases relevant to the project, as well as the target set of deliverables. Typical approaches range from brainstorming and high-level diagramming, such as mind mapping, to detailed, collaborative knowledge and information modeling supported by extensive testing for more formal knowledge engineering projects. The tools that people use for this purpose are equally diverse, from free or inexpensive brainstorming tools to sophisticated ontology and software model development environments. The most common capabilities of these kinds of tools include:
• “drawing a picture” that includes concepts and relationships between them, and
• producing shareable artifacts that vary depending on the tool, often including web-shareable drawings.
Analysis for ontology development leverages domain analysis approaches from several related fields. In a software or data engineering context, domain analysis may involve a review of existing software, repositories, and services to find commonality and to develop a higher-level model for use in re-engineering or to facilitate integration (de Champeaux, Lea, and Faure, 1993; Kang et al., 1990). In an artificial intelligence and knowledge representation context, the focus is on defining structural concepts, organizing them into taxonomies, developing individual instances of these concepts, and determining key inferences, for example for subsumption and classification, as in Brachman et al. (1991b) and Borgida and Brachman (2003). From a business architecture perspective, domain analysis may result in a model that provides wider context for process re-engineering, including the identification of core competencies, value streams, and critical challenges of an organization, resulting in a common view of its capabilities for various purposes (Ulrich and McWhorter, 2011). In library and information science (LIS), domain analysis involves studying a broad range of information related to the knowledge domain, with the aim of organizing that knowledge as appropriate for the discourse community (Hjørland and Albrechtsen, 1995). Domain analysis to support ontology development takes inspiration from all of the above, starting from the knowledge representation community viewpoint and leveraging aspects of each of the others, as well as of the terminology community (ISO 704:2009, 2009).
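The knowledge representation view described above (concepts, a taxonomy, instances, and inferences for subsumption and classification) can be illustrated with a minimal Python sketch. The concept and instance names are hypothetical, and the code only follows asserted is-a links; it is not a description logic reasoner, which would also derive subsumption from concept definitions.

```python
# Hypothetical toy taxonomy: each concept maps to its direct parents (is-a links).
TAXONOMY = {
    "Thing": set(),
    "Agent": {"Thing"},
    "Organization": {"Agent"},
    "Bank": {"Organization"},
    "Person": {"Agent"},
}

# Hypothetical instances, each asserted to belong to exactly one concept.
INSTANCES = {
    "acme_bank": "Bank",
    "jane_doe": "Person",
}


def ancestors(concept):
    """Return every concept that subsumes `concept`, excluding itself."""
    found = set()
    stack = list(TAXONOMY[concept])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(TAXONOMY[parent])
    return found


def subsumes(general, specific):
    """True if `general` is `specific` or one of its ancestors."""
    return general == specific or general in ancestors(specific)


def classify(instance):
    """Return all concepts the instance belongs to, asserted or inferred."""
    asserted = INSTANCES[instance]
    return {asserted} | ancestors(asserted)
```

Calling `classify("acme_bank")` returns the asserted concept together with everything reachable through is-a links, which is the kind of inference a domain analyst would want to check against expectations.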
Because the techniques we use are cross-disciplinary, people from any of these communities can recognize aspects of the work and dive in. At the same time, this cross-disciplinary nature may make the work more difficult to understand and master, because it involves methods that are unfamiliar and sometimes counterintuitive for practitioners coming from a specific perspective and experience base. Some of the most common disconnects occur when software or data engineers make assumptions about the representation of relationships between concepts, which are first-class citizens in ontologies but not in some other modeling paradigms, such as entity relationship diagramming (Chen, 1976) or the Unified Modeling Language, Version 2.5.1 (2017). Java programmers, for example, sometimes have difficulty understanding inheritance: some take shortcuts, collecting attributes into a class and “inheriting from it” or reusing it when those attributes are needed, which may not result in a true is-a hierarchy. Subject matter experts and analysts who are not fluent in logic or the behavior of inference engines may make other mistakes when they first encode their knowledge. Typically, they discover that something isn’t quite right because the results obtained after querying or reasoning over some set of model constructs are not what they expected. Although there may be many reasons for this, at the end of the day, the reasoners and query engines only act as instructed. Often the remedy involves modeling concepts and relationships more carefully from the domain or business perspective, rather than from a technical view that reflects a given set of technologies, databases, tagging systems, or software language.
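The inheritance shortcut mentioned above can be made concrete with a small sketch. It is written in Python rather than Java for brevity, and all class names are hypothetical. The first version inherits from a class that merely collects attributes; the second models the domain, where a customer is a kind of party and is related to an address.

```python
from dataclasses import dataclass


# Shortcut: a class that exists only to collect reusable attributes.
@dataclass
class Addressable:
    street: str
    city: str


# Inheriting from the attribute bag asserts "a customer is an Addressable",
# which is a statement about code reuse, not about the domain.
@dataclass
class CustomerShortcut(Addressable):
    name: str


# Domain-oriented alternative: an address is a concept in its own right.
@dataclass
class Address:
    street: str
    city: str


@dataclass
class Party:
    name: str


# A customer is a party (true is-a) and is related to an address (has-a).
@dataclass
class Customer(Party):
    address: Address


customer = Customer(name="Jane Doe", address=Address("1 Main St", "Springfield"))
```

In an ontology the link between a customer and an address would itself be a named, first-class relationship; the plain field used here is only the closest analogue in an object model.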
2.2 MODELING AND LEVELS OF ABSTRACTION
People who are new to knowledge representation often benefit from a high-level view of where ontologies typically “play” in a more traditional modeling strategy. Figure 2.1, below, provides a notional view of a layered modeling architecture, from the most abstract at the highest level to very concrete at the lowest. This sort of layering is common to many modeling paradigms.
An ontology can be designed at any level of abstraction, but with reference to Figure 2.1, typically specifies knowledge at the context, conceptual, and/or logical layers. By comparison, entity-relationship models can be designed at a conceptual, logical, or physical layer, and software models at the physical and definition layers. Knowledge bases, content management systems, databases, and other repositories are implemented at the instance layer. In terms of the Zachman Framework™,12 which is well known in the data engineering community, an ontology is typically specified with respect to at least one of the elements (what, how, where, who, when, why) across the top three rows of the matrix—executive perspective, business perspective, and architect’s perspective. A comprehensive ontology architecture might include some or all of these perspectives, with increasing granularity corresponding to increasingly specific levels in the framework.
Figure 2.1: Abstraction layers.
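The correspondence between kinds of models and abstraction layers described above can be written down as a small lookup table. This is a sketch only: the ordering of the six layers is an assumption inferred from the order in which the text names them, and the mapping records typical placement, since an ontology can in principle be designed at any level.

```python
from enum import IntEnum


class Layer(IntEnum):
    """Abstraction layers, most abstract first (ordering is an assumption)."""
    CONTEXT = 1
    CONCEPTUAL = 2
    LOGICAL = 3
    PHYSICAL = 4
    DEFINITION = 5
    INSTANCE = 6


# Typical layers per kind of artifact, as stated in the surrounding text.
TYPICAL_LAYERS = {
    "ontology": {Layer.CONTEXT, Layer.CONCEPTUAL, Layer.LOGICAL},
    "entity-relationship model": {Layer.CONCEPTUAL, Layer.LOGICAL, Layer.PHYSICAL},
    "software model": {Layer.PHYSICAL, Layer.DEFINITION},
    "repository": {Layer.INSTANCE},
}


def shared_layers(first, second):
    """Return the layers two kinds of artifact typically have in common."""
    return sorted(TYPICAL_LAYERS[first] & TYPICAL_LAYERS[second])
```

For example, `shared_layers("ontology", "entity-relationship model")` yields the conceptual and logical layers, which is where the two modeling approaches are most likely to be compared or confused.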
An ontology architecture developed to address requirements of a finance application might include:
1. a foundational layer of reusable and customized ontologies for metadata and provenance description (e.g., based on Dublin Core,13 SKOS (Simple Knowledge Organization System),14 and the PROV Ontology (PROV-O),15 potentially with extensions that are context-specific);
2. a domain-independent layer of reusable ontologies covering standard concepts for dates and times, geopolitical entities, languages, and other commonly used concepts, building on the metadata layer (both formal and de facto standards may be used for this purpose);