Data Management: a gentle introduction. Bas van Gils
Example 13. Data elements
The diagram shows four entities, each with several attributes. Moreover, the entities are related, and a verbalization is attached to each relationship. Compare this diagram, which lists data elements, to the diagram in example 12, which lists business concepts. The diagram with business concepts lists the things that the business talks about. Apparently, order line is not something business stakeholders talk about, or else it would have shown up as a business concept. However, in order to store data in the system effectively, the order line is needed, as it stores the combination of products and required quantity for a specific order.
This small example, of course, doesn’t show all the intricacies of going from the level of business concepts to the level of data elements. Its purpose is only to show that the relationship between business concepts and data elements is complicated at best (see footnote 5). Mapping business concepts to data elements is only one part of the analysis, though. The second part consists of mapping the data elements to tables and columns. This is a far more straightforward process: typically, entities map to tables and attributes map to columns (see footnote 6).
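The mapping from entities to tables and from attributes to columns can be sketched in a few lines of Python with the built-in sqlite3 module. The diagram of example 13 is not reproduced here, so the entity names (customer, product, customer_order, order_line), their attributes, the sample rows, and the helper functions build_example_db, table_names, and order_total are all assumptions made for illustration. Only the order line, holding the combination of product and required quantity for a specific order, comes from the text itself.

```python
import sqlite3

# Hypothetical schema: each entity becomes a table, each attribute a column.
# Names and attributes are illustrative assumptions, not taken from the book's diagram.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price REAL NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date  TEXT NOT NULL
);
-- The order line is a data element only: it links one order to one product
-- and records the required quantity.
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL REFERENCES customer_order(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, product_id)
);
"""


def build_example_db():
    """Create an in-memory database and load a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [(10, "Pen", 1.5), (11, "Notebook", 4.0)],
    )
    conn.execute("INSERT INTO customer_order VALUES (100, 1, '2020-01-15')")
    conn.executemany(
        "INSERT INTO order_line VALUES (?, ?, ?)",
        [(100, 10, 3), (100, 11, 2)],
    )
    conn.commit()
    return conn


def table_names(conn):
    """Return the names of all tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def order_total(conn, order_id):
    """Sum quantity * unit price over the order lines of one order."""
    row = conn.execute(
        """
        SELECT SUM(ol.quantity * p.unit_price)
        FROM order_line AS ol
        JOIN product AS p ON p.product_id = ol.product_id
        WHERE ol.order_id = ?
        """,
        (order_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_example_db()
    print(table_names(conn))
    print(order_total(conn, 100))
```

Note how the order line never appears as something the business "talks about", yet the total of an order cannot be computed without it. This is one way to read the claim that data elements are not a one-to-one copy of business concepts.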
■ 6.6 OUTLOOK
The goal of this chapter was to discuss base terminology in the field of data management. Important terms are business concept, data element, entity, attribute, table, column, field, and record. In addition to introducing important terminology, this chapter expanded on definitions with examples and created links to other chapters. By doing so, this chapter provides a basis for a consistent and complete framework for data management that can be used in practice.
■ 6.7 VISUAL SUMMARY
1 The classic works on information theory such as [Sha48] provide more insight into the use of the word codifies.
2 For the tech-savvy readers: in this chapter, I will mainly focus on data that is stored in relational databases. The terminology mostly fits with other structures (e.g. NoSQL [RW12]) as well.
3 A more extensive discussion of data security can be found in chapter 17.
4 Many good terms are used in the literature, such as “business term”, “business object” and “business concept”. I went with the last of these because it makes it easy to align with the notion of conceptual data models that is introduced in chapter 11.
5 If you are interested in this process, look up a good reference work on normalization in database systems such as [Dat04].
6 There are exceptions to the rule and the underlying database technology should be taken into account. This is, however, beyond the scope of this discussion.
Synopsis - This book is about data management (DM). Roughly defined, DM is about managing data. In this chapter, I will introduce a definition of data management which is based on the standard reference for DM, the Data Management Body of Knowledge (DMBOK) [Hen17]: it is the capability that organizations have in order to manage data as an asset. In this chapter, I will also discuss the topics (sub-capabilities) that are part of the field of DM.
■ 7.1 INTRODUCTION
In chapter 2, I discussed how many organizations see data as one of their most important assets. A loose definition of DM therefore is: the capability that organizations have in order to manage data as an asset. While this gives a good idea of the purpose of DM, it doesn’t say much about what doing DM actually entails. The definition from the DMBOK gives a bit more insight [Hen17]:
Data management is the development, execution, and supervision of plans, policies, programs and practices that deliver, control, protect and enhance the value of data and information throughout their lifecycles.
In a recent article about data strategy, already mentioned in section 3.2, this was compared to the world of sports such as soccer or ice hockey [DD17]. The purpose of DM is twofold:
• Grip on data - This is what the first part of the DMBOK definition talks about. It gives an overview of the types of activities that are involved in DM: the idea is to determine what we want to do with data (plans) and to set up policies and practices (guard rails) that steer the organization in the right direction. This direction entails, on the one hand, delivering data so that it can be turned into value and, on the other hand, controlling, protecting, and enhancing data assets to make that possible. A big task indeed.
• Value creation through the use of data - The latter part of the DMBOK definition suggests that the purpose of DM is to turn data into value throughout its lifecycle. After its creation, data can be used and reused in processes until it is eventually archived or destroyed.
The analogy from example 14 clarifies these two perspectives further.
Example 14. The data river
In this example, I will compare water that flows through a river to data that flows through an organization. The example is illustrated below:
Consider a river that starts in the mountains. Assuming that high up in the mountains there is little or no pollution, the water is expected to be clean. This is the equivalent of data that gets created in a process and stored in a system. As a rule, data tends to be correct, or of high quality, here too.
When the water starts flowing down the mountain it passes a few villages where people use it for