Search-Based Applications. Gregory Grefenstette
16.4 And Continuing Database/Search Convergence
Acknowledgments
We would like to thank Gary Marchionini and Diane Cerra for inviting us to participate in this timely and important lecture series, with a special thank you to Diane for her assistance and patience in guiding us through the publication process. We would also like to thank Morgan & Claypool’s reviewers, including Susan Feldman, Stephen Arnold and John Tait, for their thoughtful suggestions and comments on our manuscript. Ms. Feldman and Mr. Arnold are constant sources of insight for all of us working in search and information access-related disciplines, and we welcome Mr. Tait’s remarks based on his long IR research experience at the University of Sunderland and his more recent efforts at advancing research in IR for patents and other large-scale collections at the Information Retrieval Facility.
In addition, we are grateful to our colleagues and managers at Exalead for allowing us time to work on this lecture, and for providing valuable feedback on our draft manuscript, especially Olivier Astier, Stéphane Donzé and David Thoumas. We would also like to thank our partners and customers. They are the source of the examples provided in this book, and they have played a pioneering role in expanding the boundaries of applied search technologies, in general, and search-based applications, in particular.
Finally, we would like to thank our families. Their love sustains us in all we do, and we dedicate this book to them.
Gregory Grefenstette and Laura Wilber
December 2010
Glossary
ACID | Constraints on a database for achieving Atomicity, Consistency, Isolation and Durability |
Agility | The ease with which a computer application can be altered, improved, or extended |
API | Application Programming Interface, specifies how to call a computer program, what arguments to use, and what you can expect as output |
Application layer | The layer of the Open Systems Interconnection (OSI) model in which an application interacts with a human user or another application |
Atomicity | The idea that a database transaction either succeeds or fails in its entirety |
Availability | The percentage of time that data can be read or used |
Batch | A computer task that is programmed to run at a certain time (usually at night) with no human intervention |
B2C | Business to Customer; B2C websites offer goods or services directly to users |
B+ tree | A block-oriented data structure for efficient insertion and removal of data nodes |
BI | Business Intelligence, views on data that aid users with business planning and decision making |
BigTable | An internal data storage system used by Google, handles multidimensional key-value pairs |
BSON | Binary JSON |
Business application | Any information processing application used in running a business |
Cache | A rapid computer memory where frequently or recently used data is temporarily stored |
CAP theorem | One cannot achieve Consistency, Availability, and Partition tolerance at the same time |
Category | A flat or hierarchic semantic dimension added to a document, or part of a document |
Categorization | Assigning, usually through statistical means, one or more categories to text |
CDM | Customer Data Management |
Cloud services | Computer applications that are executed on computers outside the enterprise rather than in-house. Examples are SalesForce, Google Apps, Yahoo mail, etc. |
Clustering | Grouping documents according to content similarity |
CMS | Content Management System |
Consistency | A quality of an information system in which only valid data is recorded; that is, there are not two conflicting versions of the same data |
Connector | A program that extracts information from a certain file format, or from a database |
Consolidation | Making all the data concerning one entity available in one output |
COTS | Commercial off-the-shelf software |
Crawl | Fetching web pages for indexing by following URLs found in each page |
CRM | Customer Relationship Management, applications used by businesses to interact with customers |
CSIS | Customer Service Information System |
Data integration | Merging data from different data sources or different information systems |
Data mart | A subset of data found in an enterprise information system, relevant for a specific group or purpose |
Data warehouse | A database which is used to consolidate data from disparate sources |
DBA | Database administrator, the person who is responsible for maintaining (and often designing) an organization’s database(s) |
Deep Web | Web pages that are dynamically generated as a result of form input and/or database querying |
Directory | A listing of the files or websites in a particular storage system |
DIS | Decision Intelligence System, a computer-based system for helping decision making |
Document model | A model of seeing a database entity as a single persistent document, composed of typed fields and categories corresponding to the entity’s attributes |
Dublin Core Metadata | A standard for metadata associated with documents, |