Data Management: a gentle introduction. Bas van Gils

concepts and their relationships are transformed into a logical structure of data elements, which can be either entities or attributes of these entities. As with business concepts, entities can also be connected through relationships (hence the name Entity Relationship Diagram (ERD) that is frequently used). Example 13 explains this further.

       Example 13. Data elements

      The diagram shows four entities, each with several attributes. Moreover, the entities are related, and a verbalization is attached to each relationship. Compare this diagram, which lists data elements, to the diagram in example 12, which lists business concepts. The diagram with business concepts lists the things that the business talks about. Apparently, order line is not something business stakeholders talk about, or else it would have shown up as a business concept. However, to store data in the system effectively, the order line is needed: it stores the combination of products and required quantity for a specific order.

Illustration
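      To make the role of the order line more concrete, the sketch below expresses three of the entities from the example as Python data classes. It is only an illustration under assumptions: the attribute names (order_id, product_id, name, quantity) and the helper methods are hypothetical and are not taken from the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    # Hypothetical attributes; the diagram may list different ones.
    product_id: int
    name: str


@dataclass
class OrderLine:
    # Combines one product with the required quantity for a specific order.
    product: Product
    quantity: int


@dataclass
class Order:
    order_id: int
    lines: list[OrderLine] = field(default_factory=list)

    def add_line(self, product: Product, quantity: int) -> None:
        """Attach a product and its required quantity to this order."""
        self.lines.append(OrderLine(product, quantity))

    def total_quantity(self) -> int:
        """Sum the quantities over all order lines."""
        return sum(line.quantity for line in self.lines)


# Usage: one order with two order lines.
order = Order(order_id=1)
order.add_line(Product(product_id=10, name="pen"), quantity=3)
order.add_line(Product(product_id=11, name="notebook"), quantity=2)
print(order.total_quantity())
```

      Note that the order line carries no meaning for business stakeholders on its own. It exists so that one order can refer to several products, each with its own quantity.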

      The goal of this chapter was to discuss basic terminology in the field of data management. Important terms are business concept, data element, entity, attribute, table, column, field, and record. In addition to introducing important terminology, this chapter expanded on definitions with examples and created links to other chapters. By doing so, this chapter provides a basis for a consistent and complete framework for data management that can be used in practice.

Illustration

      1 The classic works on information theory, such as [Sha48], provide more insight into the use of the word "codifies".

       Illustration

       Synopsis - This book is about data management (DM). Roughly defined, DM is about managing data. In this chapter, I will introduce a definition of data management that is based on the standard reference for DM, the Data Management Body of Knowledge (DMBOK) [Hen17]: it is the capability that organizations have in order to manage data as an asset. In this chapter, I will also discuss the topics (sub-capabilities) that are part of the field of DM.

      In chapter 2, I discussed how many organizations see data as one of their most important assets. A loose definition of DM therefore is: the capability that organizations have in order to manage data as an asset. While this gives a good idea of the purpose of DM, it doesn't say much about what doing DM entails. The definition from the DMBOK gives a bit more insight [Hen17]:

       Data management is the development, execution, and supervision of plans, policies, programs and practices that deliver, control, protect and enhance the value of data and information throughout their lifecycles.

      In a recent article about data strategy, already mentioned in section 3.2, this was compared to the world of sports such as soccer or ice hockey [DD17]. The purpose of DM is twofold:

      • Grip on data - This is what the first part of the DMBOK definition talks about. It gives an overview of the types of activities that are involved in DM: the idea is to determine what we want to do with data (plans) and to set up policies and practices (guard rails) that steer the organization in the right direction. This direction entails, on the one hand, delivering data so that it can be turned into value and, on the other hand, controlling, protecting, and enhancing data assets to make that happen. A big task indeed.

      • Value creation through the use of data - The latter part of the DMBOK definition suggests that the purpose of DM is to turn data into value throughout its lifecycle. After its creation, data can be used and reused in processes until it is eventually archived or destroyed.

      The analogy from example 14 clarifies these two perspectives further.

       Example 14. The data river

      In this example, I will compare water that flows through a river to data that flows through an organization. The example is illustrated below:

Illustration

      Consider a river that starts in the mountains. Assuming that high up in the mountains there is little or no pollution, the water is expected to be clean. This is the equivalent of data that gets created in a process and stored in a system. As a rule, data tends to be correct, or of high quality, here too.

      When the water starts flowing down the mountain it passes a few villages where people use it for
