Search Analytics for Your Site. Louis Rosenfeld

on a development server, the launch was scheduled, and it wouldn’t be long before Vanguard’s 12,000 employees were enjoying a far better search experience.

      And yet, something didn’t seem quite right.

      The project manager wanted to ensure the quality of the search results and asked John to do a review of the build on the development server. So he poked around and kicked the new engine’s tires, trying out a few common search queries to see what happened.

      What happened wasn’t pretty. The search engine seemed to be retrieving results that made no sense; the results were far worse, in fact, than those of its predecessor. How on earth could all that time, money, and effort lead to an even worse search experience?

      The launch deadline loomed just a few weeks out.

      The Brake Gets Stuck

      So John pulled the chain to halt the process. With his project manager’s support, John described the problem to the IT staff who owned the project. They nodded their heads and listened patiently. And then they told John that they couldn’t see the problem. After all, the search engine was up and running, and had been set up as the vendor suggested. The vendor was experienced and clearly knew what it was doing, likely far more than anyone at Vanguard (John included) could possibly know about how a search engine should work. Anecdotal findings from one person’s poor search experience weren’t going to trump that knowledge. With the launch date just around the corner, the staff weren’t about to halt the project.

      Now, this may seem to be an unreasonable response. But most IT people would react in the same way, and with good reasons: technically, the search engine really was working quite well. And while Vanguard’s IT staff were uncommonly sensitive to user experience issues, it wasn’t clear that the problem John was intuiting actually existed. After all, he had no compelling proof to present that the search was broken. Combine these reasons with the pressures IT faced to get the project completed on schedule, and you could argue that the IT people were actually being very reasonable.

      But as an information architect, John was concerned about the user experience of search. That’s why he’d been brought in to the search engine selection process in the first place—to make sure that the search engine actually served the end user, rather than just conforming to a set of technical requirements. But the new search engine seemed all too likely to fail miserably. John could already envision the hate mail coming in from users demanding that the old search engine be reinstated. And he could already hear the words from managers’ mouths: “What the hell happened here?” John had raised a red flag, but he’d failed to make a convincing argument.

      So John wasn’t satisfied. He’d tried to put the brakes on the search engine’s launch to avert a disaster and had failed.

      Measuring the Unmeasurable

      Of course, John wasn’t going to give up. Otherwise, this story would be a very boring way to kick off a book! Besides, a large IT investment—and people’s jobs—were at stake.

      When John first started working on the project, his goal was to introduce user-centered thinking to the search engine selection process to complement the technical tests that IT would be using. To do so in an environment that was both technical and, as a corporation, driven by the bottom line, he had to wade into some treacherous waters—he’d have to come up with some metrics to quantify the experience of using the current search engine.

      Now you might wonder what the big deal was. Either the search engine found the damned thing, or it didn’t—should be pretty easy to measure, right? Well, not quite.... There certainly are searches that work that way, for example, looking up a colleague’s phone number in the Vanguard staff directory. But many—probably most—searches don’t have a single “right” answer. “Parking,” “benefits,” and “experts” are all common queries on the Vanguard intranet. They are also questions that have many answers—some more right than others, but none that are ideal or perfect. From the perspective of users, relevance is very often relative.

      Most designers know that it’s difficult to measure search performance and, well, just about any aspect of the user experience. In fact, being asked to do so causes droplets of sweat to form on many a designer’s brow. It just doesn’t feel right. Experience is difficult to boil down to a few simple, measurable actions. Considering that most of those in the field don’t have advanced degrees in statistics—and probably experienced similarly sweaty moments during high school algebra—it’s not surprising.

      Yet, here was John Ferrara, with a bachelor’s degree in communications, sallying forth to measure the user experience of Vanguard’s search system.

      The Before-and-After Test

      John focused on analyzing a few really common search queries to see how well they were performing—queries that represented needs that huge numbers of Vanguard’s intranet searchers wanted addressed. If you’re familiar with the “long tail,”[1] these would be considered the “short head.” (If you’re not, don’t worry—you’ll learn the basics in Chapter 2.) John wanted to compare how well these queries performed before and after—with the original search system and now with the new one.

      Next, John needed some metrics for these common queries so he could compare them. He knew that there wasn’t a single metric that would be perfect, so he hedged his bets and came up with two sets of metrics: relevancy and precision.[2] Relevancy measured how well the search engine returned a query’s best match at the top of all results. Precision measured how relevant the top results were. (To be fair, John didn’t invent precision; he borrowed it from information retrieval researchers, who have been using it for years.) Let’s take a closer look at these two sets of metrics and how John used them.
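
      To make this concrete, here is a minimal sketch of how two such metrics might be computed. It is not John’s actual method or code; the function names, data shapes, and example results are assumptions made purely for illustration.

      # Illustrative sketch only; names and data shapes are assumptions,
      # not Vanguard's implementation.

      def relevancy_rank(results, best_match):
          """Position (1-based) of the hand-picked best match, or None if absent."""
          for position, result in enumerate(results, start=1):
              if result == best_match:
                  return position
          return None

      def precision_at_k(results, judged_relevant, k=5):
          """Fraction of the top-k results that a reviewer judged relevant."""
          top_k = results[:k]
          if not top_k:
              return 0.0
          return sum(1 for r in top_k if r in judged_relevant) / len(top_k)

      # Example: the best match shows up third, and two of the top five
      # results were judged relevant.
      results = ["faq.html", "news.html", "parking-policy.html", "map.html", "blog.html"]
      print(relevancy_rank(results, "parking-policy.html"))                # 3
      print(precision_at_k(results, {"parking-policy.html", "map.html"}))  # 0.4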

      So What’s Relevant?

      John went through his list of common search queries. To test how relevant each query’s results would be, he had to make an informed judgment (also known as a guess) about what a reasonable searcher would want to find. Reasonable, as in the results don’t seem like they were selected by a crazy person.

      We’ve already seen one good example of such a situation: finding a colleague’s phone number in the staff directory. There’s a clear, obvious, and correct answer to this question. But in many cases where the answer wasn’t so obvious, John got out his red pen and deleted those queries from his relevancy test. He was now working with a cleaned-up set of queries that he was confident had “right answers”—ones like “company address.”

      John determined the best matches for each remaining query. He then tested each query by recording where the best match ranked among the search results. Then he measured performance a few different ways. Was it the first result? If not, did it make the top five “critical” results? Each of these measurements had something to say about how well queries were performing. They helped in two ways: they revealed outliers that were problematic, and they helped track overall search system performance over time. Figure 1-1 shows the former: queries with high numbers, such as “job descriptions,” stand out from their peers and deserve some attention.
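
      As a rough illustration of that bookkeeping, the sketch below scores a handful of made-up queries and ranks (not Vanguard’s data) against the same questions: was the best match the first result, did it make the top five, or is it an outlier?

      # Hypothetical before-and-after ranks for a few common queries;
      # None means the best match never appeared in the results.
      queries = {
          "company address": (1, 1),
          "benefits": (2, 4),
          "job descriptions": (3, 27),
      }

      def summarize(rank):
          if rank is None:
              return "best match missing"
          if rank == 1:
              return "first result"
          if rank <= 5:
              return "in top five"
          return f"rank {rank} (outlier)"

      for query, (before, after) in queries.items():
          print(f"{query!r}: old engine={summarize(before)}, new engine={summarize(after)}")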
