Data Management: a gentle introduction. Bas van Gils


of the overall architecture of the enterprise.

13 Integration: Integration deals with the movement of data from process to process, from system to system. The main contribution is a set of techniques and approaches to ensure that data flows through the organization so that it can be used where needed.

14 Reference data: Reference data is about “understanding data through data”. This is the realm of code lists and hierarchies of codes. An example would be codes for geographical areas where the company does business, or codes that define the types of products the company offers.

15 Master data: Master data is concerned with creating a “golden copy” of data about key business concepts for the organization, by creating a single version of the truth. There are many ways to achieve this. This area ties in closely with Integration.

16 Quality: Data quality is about data that is fit for purpose. It is about setting requirements (a norm) and taking corrective action when data doesn’t meet them. This may entail different quality attributes, such as correctness and completeness.

17 Security: Security is about a risk-based approach to protecting data assets. It is concerned with defining a data security policy, data classification (confidentiality, integrity, availability) and implementing measures to keep data safe according to this policy.

18 Business intelligence: Business intelligence (BI) is concerned with reporting what happened in the past and with data-driven predictions about the future (analytics).

19 Big data: Big data is still a major trend. It is about data sets with many data points (volume) that change rapidly (velocity) and are often of various types (variability) [ZEDR+11]. Dealing with this type of data is completely different from traditional (little) data from a technical perspective, and opens the door to a whole range of new insights from a business perspective.

20 Technology: This area is not listed in the DMBOK wheel. I’ve included a chapter on this topic to give a focused, high-level overview of relevant developments in the area of data/DM technology.


       Synopsis - In this chapter, I will give a high-level overview of the distinction between five different types of data: transaction data, master data, business intelligence data, reference data, and metadata. For each, I will also provide links to other chapters.

      Most organizations have large amounts of data. This is a well-known fact and one of the reasons why DM is such an important topic. What’s more, they typically also have many different types of data. Classifying data can be useful for different purposes. For example, it may help to decide on the approach to DM, or to decide what type of media it should be stored on. Many different classification schemes have been proposed. This is illustrated in example 16.

       Example 16. Data classification

      Data can be classified to indicate the type of use: descriptive data (describes a state of affairs in the real world), diagnostic data (shows how well something, e.g. a process, is functioning), predictive data (makes predictions about a future state of affairs), or prescriptive data (defines parameters to ensure that a certain process or system performs as desired).

      Another way to classify data is to consider what it describes, e.g. geographic data (what a specific area looks like), weather data (past/present/future weather for a specific area), and people data (such as names, addresses, and relationships to other people).

      While useful, these types of classifications are not the main topic of this chapter. Instead, I will look a level deeper and consider five related types of data. I already hinted at these in table 7.1 where I gave an overview of the DM topics that I will discuss in this book.

      In this section, I will give a high-level overview of five fundamentally different data types and indicate in which chapter I will discuss these further. The point is not so much to give an extensive discussion here but to make the reader aware that there are different types of data before launching into detailed discussions about governance, architecture, etc. in future chapters. Figure 8.1 outlines the five types of data.

      Figure 8.1 Five types of data

      The first type of data is transaction data. This type of data usually describes some event that took place in the real world, such as a purchase or the payment of an invoice. Assuming business goes well, many records of this type are created every day: one every time someone makes a purchase or payment, for example. These records also tend to be highly structured, and you want to keep track of all of them so that you can later analyze how business is really going.

      The second type of data is master data. To understand what this is about, consider a situation where you have half a dozen systems in which you store data about your customers. One of your customers calls with a complaint. In which system are you going to look to find out what is going on? What is more, how are you going to deal with the situation where systems disagree (one system says this customer has his office in Amsterdam, whereas another claims it is in Rotterdam)?

      To tackle challenges of this type, organizations typically want to establish a “golden record” or “single version of the truth”, which shows what the organization believes to be true. There are many ways to implement this, as we will see in chapter 15. Doing so is both complex and costly, so organizations typically only do this for their most important business concepts, such as Party/Customer and Product. Typically, this type of data does not change all that often (ask yourself this: how often do people move or change their name? How often do you introduce new products or retire old ones?). Example 17 shows that transaction data may also contain (references to) master data objects.
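      To make the “golden record” idea a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record fields, the system names, the dates, and the rule that “the most recently updated record wins” are all hypothetical and are not taken from this book. Real master data solutions use far richer matching and survivorship rules.

```python
from dataclasses import dataclass
from datetime import date
from typing import Sequence


@dataclass(frozen=True)
class CustomerRecord:
    """One system's view of a customer (all fields are hypothetical)."""
    source_system: str
    customer_id: str
    city: str
    last_updated: date


def golden_record(records: Sequence[CustomerRecord]) -> CustomerRecord:
    """Pick the record the organization will treat as true.

    Assumed rule for this sketch: the most recently updated record wins.
    """
    if not records:
        raise ValueError("need at least one record")
    return max(records, key=lambda r: r.last_updated)


# Two systems disagree about where the same customer has his office.
records = [
    CustomerRecord("crm", "C-001", "Amsterdam", date(2019, 3, 1)),
    CustomerRecord("billing", "C-001", "Rotterdam", date(2020, 6, 15)),
]

best = golden_record(records)
print(best.source_system, best.city)
```

      The point of the sketch is that the disagreement is resolved once, by an explicit rule, rather than by whoever happens to look in a particular system.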

       Example 17. Master data & transaction data

      Suppose that you have just sold a product called Cool8 to a customer whose name is John Doe. The record of this transaction will show such things as a time stamp, the actual store where the purchase was made, which employee was involved and so on.

      From a master data perspective, two business concepts are of interest: the customer and the product. This customer may have made previous purchases at this store, or perhaps at other stores. If this customer purchases a lot of our Cool8 product then this may be useful to know. If this customer used to purchase Cool7 and has now switched to Cool8 then it may also be useful to find out why and what that implies for future sales.
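      The relationship described so far in this example can be sketched in Python as follows. This is a rough illustration, not a prescribed design: the identifiers (C-001, P-7, P-8), store names, employee codes, and dates are invented for the sketch. Each transaction record carries its own details (time stamp, store, employee) and refers to the customer and product master data by key.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Transaction:
    timestamp: datetime
    store: str
    employee: str
    customer_id: str  # reference to customer master data
    product_id: str   # reference to product master data


# Master data: one agreed-upon entry per customer and per product
# (hypothetical keys, chosen only for this sketch).
customers = {"C-001": "John Doe"}
products = {"P-7": "Cool7", "P-8": "Cool8"}

# Transaction data: one record per purchase.
transactions = [
    Transaction(datetime(2020, 1, 10, 9, 30), "Store A", "E-17", "C-001", "P-7"),
    Transaction(datetime(2020, 2, 14, 16, 5), "Store B", "E-04", "C-001", "P-8"),
    Transaction(datetime(2020, 3, 2, 11, 45), "Store A", "E-17", "C-001", "P-8"),
]


def purchases_by_product(customer_id, txns):
    """Count purchases per product name for one customer."""
    return Counter(
        products[t.product_id] for t in txns if t.customer_id == customer_id
    )


counts = purchases_by_product("C-001", transactions)
print(customers["C-001"], dict(counts))
```

      Because every transaction points to the same customer key, purchases made at different stores can be counted together, which is what makes a switch from Cool7 to Cool8 visible.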

      Now, suppose that John Doe did, in fact, make purchases at various stores but
