Data Management: a gentle introduction. Bas van Gils

that will only grow in relevance. Data is not a trend that’s going to flame out in a few years, so just like financial literacy and human capital management, it is now obvious that data literacy is going to be a critical knowledge requirement for all managers and executives in the future. As such, we should be thinking about data education in the same way we think about financial and HR education, building the foundations in schools and universities, then continuing to apply those foundations to practical experience through employee onboarding programs, and broader corporate training.

      This book serves these objectives well. All the important enterprise-level data management topics are included. It serves as a valuable curriculum for someone just starting out in a professional data career, or indeed for someone who, like me, picked up bits and pieces without much structure to their learning. Bas’s explanations are clear, and build upon each other systematically. I personally appreciate the research that has gone into identifying the clearest definitions available, even when that means quoting other sources. Bas has effectively curated the “best of” from existing industry literature, and tied everything together into a consistent whole, through his own lucid insight, analysis and explanations.

      I wish you, the reader, well, whether this is the start of your data management journey or, like me, you are finding structure for your fragmented knowledge. You have found an excellent resource to help you fulfill your objectives.

      Tony Shaw, CEO & Founder of Dataversity

      October 2019

      “Language (die Sprache) is always a mediator”, the famous Von Humboldt wrote 200 years ago. “It is between the finite and the infinite”, he continues, “and at the same time between one individual and the other”. In traditional philosophical categories: as a subject-object relator and a subject-subject relator. That Von Humboldt spoke using the terms finite and infinite says something about his view of the human subject (its finiteness, in several respects). It is important to note that when Von Humboldt calls language a mediator, he explicitly wants to say that the two things that get mediated do not exist independently of each other, but that in a way they come into existence through the mediation. The mediator is more than a formal relationship. That is why for him language is not a coding system where an (arbitrary) sign is determined for something that already exists for us. Such a coding system does not make language, it presupposes language.

      To some extent, Von Humboldt’s characterization of language can also be applied to data, the subject of this book. Yes, the formal data structures in a computer have been designed, so as such they are not language in the Von Humboldt sense. Still, they draw on language, and so take over some of its characteristics. Data also mediates between subjects. This is one reason why data needs to be protected, as identified in chapters 17 and 21 of this book, and why “shared understanding” is a fundamental goal. It also mediates with an infinite world around us. To use a phrase of Bas, “data codifies what we know about the world”. At another place, data is defined as the combination of fact and meaning. If this is true (and who am I to question Bas?), it means that managing data has two rather different faces, because managing facts, as stored in files on a disk, is quite different from managing such an intangible thing as “meaning”. I don’t want to push this point too much, but I think here is one reason why data management is not simple and not comparable to the management of physical assets such as vehicles or library books, in spite of some similarities.

      When data is a mediator, it also runs the risk of suffering the fate of the mediator: always to fall in between. So that neither the IT department nor the business unit cares for it; that there is no budget for it. That it is seen as instrumental only, and so is not a genuine concern in its own right. In the short history of IT so far we have learned that this would be a big mistake. Data needs to be recognized as an asset, and needs to be managed. Not as a goal in its own right, of course – a point that is stressed by Bas several times in this book. It remains a mediator, but still, it needs to be managed properly. Therefore I am glad to see this book, which takes data management seriously. A book that tries to integrate insights on data management from theory and practice. A book that can not only serve practitioners and companies that struggle with data management but that can also be a good reference text for academic courses in the field of Information Management or Data Science. I wish it all the best!

      Dr. Hans Weigand, Associate Professor Information Systems, Tilburg University

      October 2019

      When I started my studies at Tilburg University in 1998, one of the first things that I learned was an appreciation for the ‘golden triangle’ of processes, data, and systems. Only through careful alignment of these three can organizations function well. It was interesting to see that so many people – academics and professionals alike – worried mostly about either systems or processes, while data appeared to take the back seat.

      After my studies, I started working on my dissertation at Nijmegen University. The focus of my research was Web information retrieval. The main idea behind my research was based on economic principles: if you have demand and supply of data, then all you have to do is “match” the two. How hard can that be? After all, the topic of information retrieval had been studied for decades. Let’s just say that I learned a lot in those days, not just about the information needs of people surfing the Internet, but also about semantics, data modeling, data structures, etc.

      Since then, I have worked in many different roles, from IT professional to strategy consultant and pretty much every role in between. Over the years, I noticed that data was becoming an increasingly important topic. People started to recognize that mishandling data was costing the organization in missed opportunities, rework, reputational damage, etc. and that products and services could be greatly enhanced when enriched with data. Around this time, people started talking about data as “the new oil” and recognized it for the valuable asset that it really was. This was further strengthened by the apparent rise of topics such as artificial intelligence, data science, and big data.

      I started studying data management in earnest around 2008. A few years later, Tanja Glisin suggested I study the DAMA DMBOK [MBEH09], which really opened my eyes to the depth and breadth of the field. I found that the DMBOK was the reference within our field at the time, especially when complemented with other – more in-depth – publications. The second version of the DMBOK was published in 2017 and showed how significantly our knowledge of the field had improved [Hen17]. I have used both versions of the DMBOK over the years, as a reference both during consultancy assignments and in teaching.

      The DMBOK is a great reference, but many practitioners find it too theoretical to be of practical use. A more pragmatic book that combines theory with practical recommendations is missing. After much debate and discussion with friends, many of whom I have interviewed for this book, I decided to attempt to fill this gap.

      The decision to actually move forward with the writing project was made in March of 2019, while visiting the Enterprise Data World conference in Boston, Massachusetts. I wrote the first version of the book during the summer months of 2019 and am forever grateful for all the support and help I received. There are so many people to thank and I sincerely hope I am not forgetting anyone. First of all, I would like to thank my colleagues at Strategy Alliance for their patience and help in preparing the manuscript. I
