The Googlization of Everything. Siva Vaidhyanathan

The network effect for most of Google’s services is not the same exponential effect we saw with the proliferation of the telephone or fax machine. If only one person in the world used Gmail, it would still be valuable to her, because it can work well with every other standard e-mail interface. But if only a few people used Google for Web searching, Google would not have the data it needs to improve the search experience. Google is better because it’s bigger, and it’s bigger because it’s better. This is an arithmetic, rather than geometric, network effect, but it matters nonetheless. Opting out or switching away from Google services degrades one’s ability to use the Web.
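
      One illustrative way to formalize that contrast (a rough sketch, not a claim about the actual economics of either service):

```latex
% Illustrative only: a "geometric" (telephone-style) network effect, where
% value grows with the number of possible connections among n users, versus
% an "arithmetic" effect, where each additional user contributes a roughly
% constant increment of data that improves the service.
\[
V_{\mathrm{geometric}}(n) \;\propto\; \frac{n(n-1)}{2},
\qquad
V_{\mathrm{arithmetic}}(n) \;\propto\; c\,n .
\]
```

      On this reading, the millionth telephone subscriber adds far more value than the hundredth did, whereas the millionth searcher’s queries add roughly the same increment of data as the hundredth’s; the advantage comes from the accumulated total, which is why scale still matters.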

      It may seem as if I’m arguing that Google is a monopoly and needs to be treated as such, broken up using the antimonopoly legislation and regulations developed over the late nineteenth and early twentieth centuries. But because Google is sui generis, business competition and regulation demand fresh thinking. It’s such a new phenomenon that old metaphors and precedents don’t fit the challenges the company presents to competitors and users. So far, Google manages us much better than we manage Google. Just because Wagner’s defense of Google is shallow does not necessarily mean that we would be better off severing the company into various parts or restricting its ambitions in some markets. But the very fact that Google is nothing like anything we have seen before both demands vigilance and warrants concern. That fact also means that there is no general answer to how competing firms or regulators should approach Google’s ventures. Everything must be considered case by case and with an eye on particulars. “Is Google a monopoly?” is the wrong question to ask. Instead, we should begin by examining what Google actually does and how that compares to what competitors do or might do in the future. That approach will give us a better sense of what the Googlization of everything means and what has already been done about it.

      THE SEARCH FOR A BETTER SEARCH

      There is a broad consensus that Web search is still in a very pedestrian phase. Both Yahoo and Google generally work the same way, and neither offers consistently superior search results. People tend to choose one or the other platform based on other factors—habit, the default search service embedded in a browser, their choice of e-mail client, appearance, or speed.16 At most search-engine companies, the computers tend to take the string of text that users type into a box and scour their vast indexes of copies of Web pages for matches. Among the matches, each page is ranked instantly by a system that judges “relevance.” Google calls its ranking system PageRank: pages rise to the top of the list of search results by attracting a large number of incoming links from other pages. The more significant or highly ranked a recommending page is, the more weight a link from it carries within the PageRank scoring system.17 Each website copied into Google’s servers thus carries with it a set of relative scores instantly calculated to place it at a particular position on a results page, and this ranking is presumed to reflect its relevance to the search query. Relevance thus tends to mean something akin to value, but it is a relative and contingent value, because relevance is also calculated in a way that is specific not just to the search itself but also to the search history of the user. For this reason, most Web search companies retain records of previous searches and note the geographic location of the user.
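
      A minimal sketch of that scoring idea follows, as a textbook-style power iteration in Python. It is illustrative only: the four-page link graph is invented, the damping factor and tolerance are conventional teaching values, and none of it reflects Google’s actual implementation.

```python
# Minimal, illustrative PageRank by power iteration.
# Not Google's production system: the damping factor and tolerance are
# conventional textbook choices, and the tiny link graph below is made up.

def pagerank(links, damping=0.85, tol=1e-8, max_iter=100):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start from a uniform score
    for _ in range(max_iter):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:                     # dangling page: spread its score evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:                                # pass score along each outgoing link
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        if sum(abs(new_rank[p] - rank[p]) for p in pages) < tol:
            rank = new_rank
            break
        rank = new_rank
    return rank

# An invented four-page Web: the page that attracts links from other
# well-linked pages ends up with the highest score.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
print(sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]))
```

      The point of the toy is visible in the output: the page that collects links from other well-linked pages rises to the top, which is the behavior the text describes.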

      While this approach is standard, and works fairly well in most situations for most users, a number of search-engine companies have been working furiously to deepen the “thinking” that computers do when queried. Since 2008, we have seen the debut of a number of new search engines that offer a different way of searching and depend heavily on the ability to understand the context and purpose of the search query. And Google, understandably, refines and alters its search principles with regularity.

      Cuil, which debuted ignominiously in 2008, was founded by a group of former Google employees. Its launch was marred by too much publicity and attention. The first users found the system terribly slow and fragile. Cuil boasts of searching a larger index of sources than either Google or Microsoft’s search engine, Bing. It also claims to be able to conduct rudimentary semantic analyses of the potential results pages to assess relevance better than the popularity method of PageRank. By the summer of 2009, Cuil delivered consistently good results to basic queries, but no one seemed to notice. Most importantly, Cuil pledged not to collect user data via logs or cookies, the small files with identifying information that Google and other search engines leave in every user’s Web browser, because it is more interested in what the potential results pages mean than what the user might think about. Cuil is a clever and innovative search service that has suffered from terrible business and public-relations decisions.18

      In early 2009, the eccentric entrepreneur and scientist Stephen Wolfram released what he called a “computational knowledge engine,” Wolfram Alpha. By staging a series of small-scale demonstrations for the most elite Web thinkers in the United States, Wolfram was able to seed curiosity and attract attention for his service. Unlike a commercial search engine, Alpha is not so much designed to find pages and videos on the Web as to answer research questions by mining publicly available data sets. It does not even attempt to index Web sites. Its utility to users and advertisers, therefore, is narrow. But as a concept in knowledge management and discovery, it is potentially revolutionary. If you ask Alpha, “How many atoms are in a molecule of ammonia?” it will tell you the answer. It finds facts. It even generates facts, in a sense, by computing new information from different, distinct data sets. Wolfram Alpha is not intended to compete with Google in any way or in any market (although Google’s Web search can answer the same question by directing users to the top link: a page from Yahoo Answers!). However, if it succeeds, Alpha will remove a small set of scientific queries from the mass of Google searches. Google will hardly notice—unless it decides to adopt elements of Alpha technology for its own services. Wolfram Alpha is certain to serve as a useful experiment in machine-based knowledge discovery. But it’s not for shopping.19 It won’t have anything like Google’s effect on people worldwide, and it, too, is designed to remain a clever resource but never to become a major player in general information or Web searching.
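
      To make the distinction concrete, the toy sketch below “computes” the ammonia answer from a small, invented data table rather than retrieving pages that mention the question. It is purely illustrative and has nothing to do with Wolfram Alpha’s actual technology.

```python
# Toy illustration (not Wolfram Alpha's technology) of answering a factual
# question by computing over a small structured data set rather than
# matching the query text against Web pages.

import re

# An invented "curated data set": chemical formulas for a few molecules.
FORMULAS = {"ammonia": "NH3", "water": "H2O", "methane": "CH4"}

def atoms_in(molecule):
    """Answer by computing over the formula, not by retrieving a page."""
    formula = FORMULAS[molecule]
    # Each match is (element symbol, optional count); a missing count means 1.
    return sum(int(count or 1)
               for _element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula))

print(atoms_in("ammonia"))   # -> 4, computed from the data, not found on a page
```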

      Currently, the major search engines do not “read” the query for meaning. They are purely navigational: they point. However, all the big search companies (and most of the small ones, as well) are working on what is known in the industry as “semantic search,” searches that take account of the contextual meaning of the search terms. For example, in 2001, if a user typed “What is the capital of Norway?” into Google, the results would have been a set of pages that included the string of text “What is the capital of Norway?” By contrast, a semantic search engine that reads what computer scientists and linguists call “natural language” can understand the patterns of human diction well enough to predict that a user expects the result of this search to be the answer to the question, not a set of pages asking the same question. To accomplish the goal of generating a natural-language or semantic search system, search companies need two things: brilliant thinkers in the areas of linguistics, logic, and computer science, and massive collections of human-produced language on which computers can conduct complex statistical analysis. Many companies have the former. Only Google, Yahoo, and Microsoft have the latter. Of those, Google leads the pack.
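
      The contrast can be caricatured in a few lines of Python: literal matching returns pages that contain the query string, while a crude, hand-written question pattern returns the answer itself. A real semantic system depends on statistical models built from massive collections of human language; this sketch, with its made-up documents and two-entry fact table, only illustrates the difference in behavior.

```python
# Toy contrast (not any real search engine) between literal keyword matching
# and a crude question-form handler that returns an answer instead of pages.

import re

DOCUMENTS = {
    "forum-post": "What is the capital of Norway? I need it for homework.",
    "travel-page": "Oslo, the capital of Norway, sits at the head of a fjord.",
}

CAPITALS = {"norway": "Oslo", "france": "Paris"}   # tiny invented fact table

def keyword_search(query):
    """2001-style behavior: return pages whose text contains the query string."""
    needle = query.lower().rstrip("?")
    return [name for name, text in DOCUMENTS.items() if needle in text.lower()]

def semantic_search(query):
    """Crude 'semantic' behavior: recognize the question form and answer it."""
    match = re.match(r"what is the capital of (\w+)\??$", query.strip().lower())
    if match and match.group(1) in CAPITALS:
        return CAPITALS[match.group(1)]            # an answer, not a list of pages
    return keyword_search(query)                   # otherwise fall back to matching

print(keyword_search("What is the capital of Norway?"))   # ['forum-post']
print(semantic_search("What is the capital of Norway?"))  # 'Oslo'
```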

      It’s no accident that Google has enthusiastically scanned and “read” millions of books from some of the world’s largest libraries. It wants to collect enough examples of grammar and diction in enough languages from enough places to generate the algorithms that can conduct natural-language searches. Google already deploys some elements of semantic analysis in its search process. PageRank is no longer flat and democratic. When I typed “What is the capital of Norway?” into Google in August 2010, the top result was “Oslo” from the Web Definitions site hosted by Princeton University. The second result was “Oslo” from Wikipedia.

      One search company is trying to combine the two approaches, blending semantic search with community-based assessment of the quality of sources. By those standards, Hakia should be the best search engine in the world. Hakia specializes in medical information, and it invited medical professionals to help assess the value and validity of potential result sites. The results, however, are not clearly superior to Google’s. Hakia does place medical journal
