Search-Based Applications. Gregory Grefenstette
Business applications were built on top of databases, which defined the universe of information available to the end user, and search engines were used for IR on the Web and in the enterprise.
Figure 1.2: Databases have traditionally been concerned with the world of structured data; search engines with that of unstructured data (some of these data types, like HTML pages and email messages, contain a certain level of exploitable structure, and are consequently sometimes referred to as "semi-structured").
Such neat distinctions are now falling away as the core architectures, functionality and roles of search engines and databases have begun to evolve and converge. A new generation of non-relational databases, which shares conceptual models and structures with search engines, has emerged from the world of the Web (see Chapter 4), and a new breed of search engine has arisen which provides native functionality akin to both relational and non-relational databases (described in Chapters 3-9 and listed in Chapter 10).
It is this new generation engine that supports Search Based Applications, which offer precise, multi-axial information access and analysis that is virtually indistinguishable at a surface level from database applications, yet are endowed with the usability and massive scalability of Web search.
1.1.1 WHAT IS A SEARCH BASED APPLICATION?
We define a Search Based Application (SBA) as any software application built on a search engine backbone rather than a database infrastructure, and whose purpose is not classic IR, but rather mission-oriented information access, analysis or discovery.1
Definition: Search Based Application
A software application that uses a search engine as the primary information access backbone, and whose main purpose is performing a domain-oriented task rather than locating a document. Examples:
Customer service and support
Logistical track and trace
Contextual advertising
Decision intelligence
e-Discovery
SBAs may be used to provide more intuitive, meaningful and scalable access to the content in a single database, hiding away the complexity of the database structure as data is extracted and re-purposed by search engine techniques. They may also be used to autonomously and intelligently gather together massive volumes of unstructured and structured data from an unlimited number of sources (internal or external) and to make this aggregate data available in real time to a wide base of users for a broad range of purposes.
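The idea of extracting records from a database and gathering them with unstructured content behind a single search index can be illustrated with a toy sketch. The code below is not taken from the book or from any particular search engine: the class `TinySearchIndex`, its methods, and the sample records are hypothetical names invented for illustration, and a real engine adds ranking, linguistic processing, persistence and distribution. It shows only that a database row and an email message can sit in the same inverted index, and that results can be filtered and counted by field in a way that resembles a database query.

```python
import re
from collections import defaultdict


class TinySearchIndex:
    """Toy in-memory inverted index (hypothetical, for illustration only)."""

    def __init__(self):
        self.docs = {}                    # doc_id -> stored record
        self.postings = defaultdict(set)  # token -> ids of records containing it

    @staticmethod
    def _tokens(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text, **fields):
        """Index free text together with structured fields under one id."""
        self.docs[doc_id] = {"text": text, **fields}
        for token in self._tokens(text):
            self.postings[token].add(doc_id)
        for value in fields.values():
            for token in self._tokens(str(value)):
                self.postings[token].add(doc_id)

    def search(self, query, **filters):
        """Return sorted ids matching every query token and every field filter."""
        tokens = self._tokens(query)
        if not tokens:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted(
            d for d in hits
            if all(self.docs[d].get(k) == v for k, v in filters.items())
        )

    def facet(self, doc_ids, field):
        """Count the values of one field across a result set."""
        counts = defaultdict(int)
        for d in doc_ids:
            value = self.docs[d].get(field)
            if value is not None:
                counts[value] += 1
        return dict(counts)


# Invented sample content: two database rows and one email message.
index = TinySearchIndex()
index.add("db:1", "Late delivery of order 1042", source="database", status="open")
index.add("mail:7", "Customer asks about order 1042 delivery date",
          source="email", status="open")
index.add("db:2", "Order 2001 delivered", source="database", status="closed")

hits = index.search("order 1042")        # matches both the row and the email
by_source = index.facet(hits, "source")  # one count per originating source
closed = index.search("order", status="closed")
```

In this sketch the user never sees the schema of the originating database; the structured fields simply become filters and facets on top of full-text search.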
While search engines in the SBA context complement rather than replace databases, which remain ideal tools for many types of transaction processing, this "re-purposing" of search engines nonetheless represents a major rupture with a 30-year tradition of database-centered software application development. In spite of the significance of this shift, the SBA trend has been unfolding largely under the radar of researchers, systems architects and software developers. However, SBAs have begun to capture the focused attention of business.2
"The elements that make search powerful are not necessarily the search box, but the ability to bring together multiple types of information quickly and understandably, in real time, and at massive scale. Databases have been the underpinning for most of the current generation of enterprise applications; search technologies may well be the software backbone of the future."
—Susan Feldman, IDC LINK, June 9, 2010
1.2 HIGH IMPACT, LOW RISK SOLUTION FOR BUSINESSES
SBAs offer businesses a rapid, low risk way to eliminate some of the peskiest and most common information systems (IS) problems: siloed data, poor application usability, shifting user requirements, systemic rigidity and limited scalability.
Figure 1.3: Search engine-based Sourcier makes vast volumes of structured water quality data accessible via map-based search and visualization, and ad hoc, point-and-click analysis.
Even though SBAs allow businesses to clear these hurdles and bring together large volumes of real-time information in an immediately actionable form, thereby improving productivity, decision making and innovation, too many in the business community are still unaware that search engines can serve as an information integration, discovery and analysis platform. This is the reason we have written this book.
1.3 FERTILE GROUND FOR INTERDISCIPLINARY RESEARCH
We have also undertaken this project to introduce SBAs to a wider segment of the data management research community. Though the convergence of search and database technologies is gradually being recognized by this community3, many researchers are still unaware of the pragmatic benefits of SBAs and the mutually beneficial evolutions underway in both search and database disciplines.
However, as a group of prominent database and search scientists recently noted, exploding data volumes and usage scenarios along with major shifts in computing hardware and platforms have resulted in an "urgent, widespread need for new data management technologies," innovations that will only come about through interdisciplinary research.4
Figure 1.4: This Akerys portal generates personalized, real-time real estate market intelligence based on unstructured online classifieds and in-house databases.
1.4 A VALUABLE TOOL FOR DATABASE ADMINISTRATORS
Like their research counterparts, many Database Administrators (DBAs) are also unfamiliar with SBAs. We hope this book will raise awareness of SBAs among DBAs as well, because SBAs offer these professionals a fast and non-intrusive way to offload overtaxed systems5 and to reveal the full richness of the data those systems contain, opening database content up for free-wheeling discovery and analysis, and enabling it to be contextualized with external Web, database and enterprise content.
1.5 NEW OPPORTUNITIES FOR SEARCH SPECIALISTS
For search specialists who are not yet familiar with SBAs, we hope to introduce them to this significant new way