Search Analytics for Your Site. Louis Rosenfeld


#8 (non-standard but widely used) hit count 111 number of matches found[b]

[a] The GMT offset is important because you must have accurate timestamps to look for patterns of usage, such as spikes of traffic at lunchtime. Tracking the time relative to GMT lets analytics systems merge search logs from multiple time zones, which is especially important when adjusting for Daylight Saving Time.

[b] Some search engines return an approximate number of hits rather than a definitive number. This is usually because they are reserving the option to check whether the user has security access to additional documents. If you don’t have confidential documents, you may be able to disable the access check and get a real number.
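As a rough illustration of note [a], the sketch below normalizes timestamps that carry a GMT offset to a single UTC timeline so that entries from different time zones can be compared. It assumes the common Apache-style timestamp layout (day/month/year:time offset). The function name `to_utc` and the two sample timestamps are hypothetical, not taken from a real log.

```python
from datetime import datetime, timezone

# Assumed timestamp layout (Apache/NCSA style), e.g. "12/Mar/2010:12:15:00 +0200".
LOG_TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def to_utc(raw_timestamp: str) -> datetime:
    """Parse a log timestamp that includes a GMT offset and convert it to UTC."""
    local_time = datetime.strptime(raw_timestamp, LOG_TIME_FORMAT)
    return local_time.astimezone(timezone.utc)


# Two hypothetical entries logged by servers in different time zones.
helsinki = to_utc("12/Mar/2010:12:15:00 +0200")
new_york = to_utc("12/Mar/2010:05:15:00 -0500")

print(helsinki.isoformat())   # 2010-03-12T10:15:00+00:00
print(helsinki == new_york)   # True: both searches happened at the same instant
```

Once every entry is on the same timeline, logs from several servers can be merged and sorted. To look for local patterns such as a lunchtime spike, keep the original offset alongside the normalized value.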

       WHAT EXTENDED LOG ENTRIES LOOK LIKE

      Optional fields can be quite helpful as well. These include the “referer” field (it should be “referrer,” but the spec spelled it wrong, so now we’re stuck with this misspelling), which can offer insights into site navigation problems; the user-agent for recognizing various platforms using the search; and an optional cookie, which is better than IP address for tracking searchers. To conform to other Web log formats, these fields might come before the hit count and time taken fields.

      An extended log entry could look like this (detailed below in Table 2-4):

      Table 2-4.

http://www.flickr.com/photos/rosenfeldmedia/5826101254/

Extended Fields
Position | Field | Example | Meaning
#9 | referer URL | http://search.example.com/search?q=sound | The page that the user was on when he searched: in this case, a search results page for the query “sound”.
#10 | user-agent | “Mozilla/5.0 (iPhone; U; CPU iPhone OS 2_2 like Mac OS... | The browser or app that sent the query. These are most useful for getting client metrics (especially mobile) and recognizing robot crawlers.
#11 | cookie | “USERID=CustomerA; IMPID=01234” | Cookie for server session (rare).
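To make the three extended fields more concrete, here is a minimal Python sketch that reads each one using the example values from Table 2-4. The helper names (`previous_query`, `cookie_values`, `looks_mobile`) are hypothetical. The mobile check is a crude substring heuristic chosen for illustration, not a standard way to classify user agents.

```python
from http.cookies import SimpleCookie
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit


def previous_query(referer: str, param: str = "q") -> Optional[str]:
    """Return the search terms carried by a referring results page, if any."""
    values = parse_qs(urlsplit(referer).query).get(param)
    return values[0] if values else None


def cookie_values(raw_cookie: str) -> Dict[str, str]:
    """Split a logged cookie string into name/value pairs."""
    cookie = SimpleCookie()
    cookie.load(raw_cookie)
    return {name: morsel.value for name, morsel in cookie.items()}


def looks_mobile(user_agent: str) -> bool:
    """Crude heuristic (an assumption): look for a few platform tokens."""
    return any(token in user_agent for token in ("iPhone", "Android", "Mobile"))


# Example values from Table 2-4; the user-agent is truncated there and stays truncated here.
print(previous_query("http://search.example.com/search?q=sound"))        # sound
print(cookie_values("USERID=CustomerA; IMPID=01234"))
print(looks_mobile("Mozilla/5.0 (iPhone; U; CPU iPhone OS 2_2 like Mac OS"))  # True
```

A referer that is itself a results page, as in this example, suggests the searcher was refining an earlier query, which is one of the navigation clues this field can provide.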

       SEARCH PARAMETERS

      Most search engines stick to the common URL format for passing additional options and settings (such as language) in the search part of the request. The parameters start after the results page URL with a question mark. Each one is a code, followed by an equal sign, followed by a value, and successive parameters are delimited by an ampersand (or a comma or semicolon), like this:

      search.html?qq=noise&zone=all

      There’s no standard, so the query parameter might be q, qq, qt, qry, query, w, words, s, st, search, or something else entirely. This, and all the other codes, should be documented by the search vendor or open-source group. (We’ve provided an example below, as well as details in Table 2-5.) You’ll find this information useful if you need to “teach” your analytics application what to look for to identify—and parse out—actual queries from your logs. Here is an example of a query parameter:

      Table 2-5.

http://www.flickr.com/photos/rosenfeldmedia/5826101316/

Query Parameters
Code | Field | Example | Meaning
q | query | q=noise | The search terms, in this case “noise”
l | language | l=fi | The searcher’s language, here it’s Finnish
s | start | s=21 | Start the display at result number 21
p | per page | p=20 | Show 20 results per page
v | section | v=housewares | Limit the query to the housewares section
i | simple | i=1 | Show the simple search interface
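The parameter format described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's parser: the function names `parse_search_params` and `extract_query` are hypothetical, and the name of the query parameter is passed in because, as noted, there is no standard.

```python
import re
from typing import Dict, Optional
from urllib.parse import unquote_plus


def parse_search_params(url: str) -> Dict[str, str]:
    """Split everything after '?' into code/value pairs."""
    _, _, query_string = url.partition("?")
    params = {}
    # Accept '&', ';' or ',' as delimiters, as described in the text.
    for pair in re.split(r"[&;,]", query_string):
        if not pair:
            continue
        code, _, value = pair.partition("=")
        params[unquote_plus(code)] = unquote_plus(value)
    return params


def extract_query(url: str, query_param: str = "q") -> Optional[str]:
    """Return the search terms, given the parameter code this engine uses."""
    return parse_search_params(url).get(query_param)


params = parse_search_params("search.html?qq=noise&zone=all")
print(params)                                                     # {'qq': 'noise', 'zone': 'all'}
print(extract_query("search.html?qq=noise&zone=all", "qq"))       # noise
print(extract_query("search.html?q=noise;l=fi;s=21;p=20"))        # noise
```

Splitting on commas and semicolons as well as ampersands would break a value that contains one of those characters unencoded, so in practice you would restrict the delimiter set to whatever your search engine actually emits.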

      The contents of the log file enable site search analytics: the entries provide the evidence needed to deduce how your users are searching and how well the site search is helping them. Cherish the logs or at least keep an archive: you may need to go back to them someday.

      [7] The NCSA combined/extended log format is documented at http://publib.boulder.ibm.com/tividd/td/ITWSA/ITWSA_info45/en_US/HTML/guide/c-logs.html#combined and http://httpd.apache.org/docs/2.2/mod/mod_log_config.html#examples

      Summary

       SSA offers a unique treasure trove of data worth tapping because it’s the one place where users tell you in their own words what they want from your site.

       SSA tells you different things about users than the insights you normally get from SEM (Search Engine Marketing) and SEO (Search Engine Optimization). Think of people searching the Web as people you want to attract to your site, while people searching your site are customers you want to retain. SSA is concerned with the latter.

       Query data can be captured in search engine logs or by analytics applications that harvest information on users’ actions on your site.

       When users search your site, they typically will have more specific needs (and queries) than when they search for information on the Web.

       As the Zipf distribution shows, a little SSA goes a long way. Start by improving the performance of your
