Linked Data Visualization. Laura Po

Synthesis Lectures on the Semantic Web: Theory and Technology

pictures associated with the URIs of these links. What is displayed is an overview of images and information related to London from different data sources.

image

      Figure 1.3: LodView visualization of London.

      Tim Berners-Lee had a grand vision for the Internet when he began developing the World Wide Web in 1989 [Gillmor, 2004, Chapter 2]. He envisioned a read/write Web. What emerged in the 1990s, however, was an essentially read-only Web, the so-called Web 1.0. Users’ interactions with the Web were limited to searching for and reading information. The lack of active interaction between users and the Web led, in 1999, to the birth of the Web 2.0. For the first time, ordinary users were able to write and share information with everyone. This era introduced concepts such as blogs, social media, and video-streaming platforms, exemplified by Twitter, Facebook, and YouTube.

      Over time, users started to upload textual and multimedia content at an incredibly high rate and, as a consequence, more and more people started to use the Web for many different purposes. The high volume of web pages and the growing number of requests forced Web applications to find new ways of handling documents. Machines needed to understand what data they were handling. The main idea was to provide a context for documents in a machine-readable format. This new revolution, the Web 3.0, is called the Semantic Web or the Web of Data.

      With the advent of the Semantic Web, users started to publish content together with metadata, i.e., additional data that provide context about the main data in a machine-understandable way. These machine-readable descriptions enable content managers to add meaning to the content. In this way, a machine can process knowledge itself, instead of plain text, using processes similar to human deductive reasoning and inference, thereby obtaining more meaningful results and helping computers perform automated information gathering and research. Making data understandable to machines, however, requires sharing a common data structure. To solve this issue, the W3C proposed RDF (the Resource Description Framework) as the language for achieving that common data structure.
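      The common data structure RDF provides can be sketched with a toy example (this is an illustration in plain Python, not a real RDF library, and the example.org URIs are invented for the sketch): every statement is a subject–predicate–object triple, and subjects and predicates are URIs so that machines can interpret them unambiguously.

```python
# Toy sketch of RDF's core idea: data as (subject, predicate, object)
# triples, with URIs as globally unambiguous names. The URIs below are
# illustrative, not real vocabulary terms.

triples = [
    ("http://example.org/London",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://example.org/City"),
    ("http://example.org/London",
     "http://example.org/population",
     "8900000"),
    ("http://example.org/London",
     "http://example.org/capitalOf",
     "http://example.org/England"),
]

def objects_of(subject, predicate):
    """Return every object asserted for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects_of("http://example.org/London",
                 "http://example.org/population"))
# → ['8900000']
```

A real RDF store works on the same triple pattern, only with proper serializations (Turtle, RDF/XML, JSON-LD) and a query language (SPARQL) on top.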

      The Semantic Web also allows creating links among data on the Web, so that a person or a machine can explore the Web of data: with Linked Data, when you have some of it, you can find other, related, data. Like the Web of hypertext, the Web of data is constructed with documents on the Web, but here the links between arbitrary things are described by RDF, and URIs identify any kind of object or concept.

      Connecting your own data to other information already present on the Web has at least two important consequences. The first is the possibility to add even more information and provide a broader context; the second is the creation of a global network of LD, the Giant Global Graph.
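      The effect of such connections can be sketched as follows (again a toy illustration with invented URIs, not a real Linked Data toolkit): two independently published datasets describe the same city under different URIs, and a single equivalence link, in the spirit of owl:sameAs, lets a consumer merge what both say about it.

```python
# Toy sketch: linking two datasets through an equivalence link so that
# a lookup on one URI also retrieves facts published under the other.
# All URIs and predicate names here are illustrative.

dataset_a = [("http://example.org/London", "population", "8900000")]
dataset_b = [("http://dbpedia.org/resource/London", "country", "England")]

# One "sameAs"-style link connecting the two identifiers.
same_as = {("http://example.org/London",
            "http://dbpedia.org/resource/London")}

def describe(uri):
    """Collect facts about a resource from both datasets, following links."""
    aliases = {uri}
    for a, b in same_as:
        if a in aliases or b in aliases:
            aliases |= {a, b}
    return [(p, o) for data in (dataset_a, dataset_b)
            for s, p, o in data if s in aliases]

print(describe("http://example.org/London"))
# → [('population', '8900000'), ('country', 'England')]
```

Without the link, the consumer would see only one dataset's facts; with it, the two descriptions become a single, richer context — the mechanism that, repeated across the Web, yields the Giant Global Graph.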

      Alongside the rise of the Semantic Web, the Web shifted from a page-oriented Web to a data-oriented Web (Figure 1.4). Users started to publish data online, and governments saw in open data a way to engage citizens in the governance of their cities.

      The volume of data is growing exponentially everywhere. Each minute, 149,513 emails are sent, 3.3 million Facebook posts are created, 65,972 Instagram photos are uploaded, 448,800 tweets are posted, and 500 hours of YouTube videos are uploaded. The tremendous increase of data through the Internet of Things (the continuous growth in connected devices, sensors, and smartphones) has contributed to the rise of a “data-driven” era. Moreover, predictions argue that by 2020 every person will generate 1.7 megabytes of data every second.

      Every sector is affected by this dizzying increase in available data, which means that Big Data analysis techniques must be implemented for mining data. Big Data consists of large, diverse, complex, longitudinal, and distributed data sets generated from various instruments, sensors, Internet transactions, email, video, click streams, and other sources, whereas open-linked data focuses on the opening and combining of data. The data can be released by public organizations as well as by private organizations or individuals. Big Data analytics can be used to promote better utilization of resources and improved personalization. Naturally, there are no barriers between Big Data, Linked Data, and Open Data: when a dataset is at the same time open, structured in a node-edge fashion, and tremendously big, it can be referred to as a BOLD (Big, Open, and Linked Data) source.

image

      Figure 1.4: Transition from the Web of Documents to the Web of Data.

      As a consequence, the rise of the Web of Data gave birth to new specialized figures that can boost the value of those data. Data Analysts, who are able to analyze and discover patterns in the data, Data Scientists, who try to predict the future based on past data, and Chief Data Officers (CDOs), who have the duty of defining and governing the data-improvement strategy that supports the achievement of corporate objectives, are only a few of the figures that emerged to handle the Web of Data.

      Now, in 2019, we are already entering the fourth-generation Internet, the Internet of Things, or the Web of intelligent connections. It is said to be the Web of augmented reality, in which users interact at the same time with the real world and the online world. Domotic houses, smart domestic appliances, and voice assistants are only a few of the applications that will emerge in the coming years. Although interesting, the innovations of the Web 4.0 are out of the scope of this book and will not be addressed.

      The term Linked Data was coined in 2006 by one of the creators of the Web, Sir Tim Berners-Lee. At the same time, he published a note3 listing four rules for publishing LD.

      1. Use URIs as names for things. This first rule is the milestone for creating a system where all resources can be univocally identified. The term resource refers both to real-world objects and to web pages.

      2. Use HTTP URIs so that people can look up those names. The second rule adopts the HTTP protocol as the means of reaching resources and their information. Thanks to it, users are able to look up a specific object and get all the information they need as a result. Moreover, since the resources should also be machine-readable, it is possible to exploit the content-negotiation mechanism to obtain different representations of the requested resource.

      3. When someone looks up a URI, provide useful information using the standards. This means the resource’s information should be returned to the requester in an RDF-compliant format.

      4. Include links to other URIs so that they can discover more things. The last rule emphasizes that resources should be connected to other resources in order to create what can be considered the successor of the WWW, the Giant Global Graph. This rule is the enabler of the great connectivity of Linked Data: starting from a resource, users of the Web can jump from one object to another as they desire.
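      The content negotiation mentioned in the second rule can be sketched in a few lines (a simplified, illustrative server-side sketch; the representations and the selection policy are invented for the example): the same HTTP URI answers with HTML when a browser asks for it and with an RDF serialization when a machine client sets the Accept header accordingly.

```python
# Simplified sketch of HTTP content negotiation for a Linked Data URI.
# The representations below are placeholders; real servers would also
# honor quality values (q=) and redirect to format-specific documents.

AVAILABLE = {
    "text/html": "<html>...human-readable page about London...</html>",
    "text/turtle": "<http://example.org/London> "
                   "a <http://example.org/City> .",
}

def negotiate(accept_header):
    """Pick the first media type the client accepts; default to HTML."""
    for token in accept_header.split(","):
        media_type = token.split(";")[0].strip()  # drop parameters like q=0.9
        if media_type in AVAILABLE:
            return media_type
    return "text/html"

print(negotiate("text/turtle"))                      # → text/turtle
print(negotiate("text/html,application/xhtml+xml"))  # → text/html
```

A machine client asking for text/turtle thus gets triples it can feed to an RDF parser, while a browser dereferencing the very same URI gets a readable page — one name, two representations.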

      Some time later, more precisely at the TED4 conference in 2009, Tim Berners-Lee himself restated the principles defined in 2006 as three “extremely simple” rules.5

      1. All kinds of conceptual things, they have names now that start with HTTP.

      2. If I take one of those HTTP names and I look it up … I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about the event, …

      3. When I get back that information it’s not
