Search Analytics for Your Site. Louis Rosenfeld


over when during the process (research, design, development, or maintenance) you tackle SSA. You’ll glean small but good things at each point, none of which will likely take you off on a radical tangent.

      Finally, if you’re one of those wearers of many hats, don’t fret: as mentioned earlier, SSA scales wonderfully. Even if you spend 15 minutes per month looking over the simplest reports—the most frequent queries list and the null results query list—you’ll get something useful out of your analysis. This month’s 15 minutes of tuning can gently grow to 30 minutes next month, and so on. The work is the same—it will fill whatever time you can make or justify for it.

       [6] http://tech.groups.yahoo.com/group/webanalytics/

      Your Secret Weapon

      Thank your lucky stars: SSA remains safely under the radar. No one owns it, and the people in most organizations who are closest to it—the IT folks who manage the search engine—aren’t likely to worry much about things like user intent. So if you can crack open the data, you (and your organization) will own the keys to a very powerful secret weapon. Read ahead.

      Anatomy of a Search Log Entry

       Avi Rappoport, Search Tools Consulting (http://searchtools.com/)

      Though most of us now use analytics applications that provide some SSA reporting functionality, you may be in a situation where you’ll have to create your own reports, either because the analytics application doesn’t support your specific needs or because you don’t have access to one. In either case, you’ll need to process the data yourself.

      Working with search engine transaction logs, you’ll find the search query, any search parameters (such as language or date), and the number of matches retrieved by the search engine. Most also contain the date and time, and some kind of searcher identifier. Understanding the format makes it easier to understand search analytics reports, recognize what they can and can’t tell you, and perform special processing for unusual questions.

      Many search engines conform to the NCSA extended Web server log format,[7] so that’s what we’ll cover here. These text files have a standard field order, with the fields separated by spaces. A field that contains internal spaces is enclosed in double quotes or square brackets.
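As a rough illustration of that field layout, here is a minimal Python sketch that splits a log line on spaces while keeping bracketed and quoted fields whole. The function name `split_fields` and the sample line are made up for this example, and the pattern assumes quoted fields contain no escaped quotes.

```python
import re

# One field is a [bracketed] run, a "quoted" run, or a run of non-space characters.
# Assumption: quoted fields never contain an escaped double quote.
FIELD_PATTERN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def split_fields(line: str) -> list[str]:
    """Split a log line into fields, keeping bracketed and quoted fields whole."""
    return FIELD_PATTERN.findall(line)


# Hypothetical line, invented for illustration only.
sample = 'host1 - - [01/Jan/2010:00:00:00 +0000] "GET /index.html HTTP/1.1" 200 512'
fields = split_fields(sample)
print(len(fields))  # 7
print(fields[3])    # [01/Jan/2010:00:00:00 +0000]
```

A plain `sample.split(" ")` would break the timestamp and the request into several pieces, which is why the delimiters matter.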

      However, there’s no place in the NCSA extended format for the hit count (the number of items matched in the search), so search engines tend to slide it in the middle or hang it off the end. If your search log format is not documented, you may need to do some sleuthing: enter several unique searches that you know will generate no matches, and then look in the search log for those terms. The field that reads 0 in those entries is most likely the hit count.
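The sleuthing step can be sketched in Python along these lines. This is only an illustration under several assumptions: the probe terms and the two probe log lines are invented, `candidate_hit_positions` is a hypothetical helper, and the engine is assumed to record a literal 0 for searches with no matches.

```python
import re

# Same tokenizing idea as the log format: brackets and quotes protect internal spaces.
FIELD_PATTERN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def candidate_hit_positions(log_lines, probe_terms):
    """Return 1-based field positions that read "0" in every entry containing a probe term."""
    candidates = None
    for line in log_lines:
        if not any(term in line for term in probe_terms):
            continue  # not one of our deliberate no-match searches
        fields = FIELD_PATTERN.findall(line)
        zeros = {pos for pos, value in enumerate(fields, start=1) if value == "0"}
        candidates = zeros if candidates is None else candidates & zeros
    return sorted(candidates or [])


# The probe terms and the last two lines are made up for illustration.
probes = ["zzqxprobe1", "zzqxprobe2"]
lines = [
    'XX.XX.XX.14 - - [10/Jul/2010:10:24:13 -0800] "GET /search?q=noise HTTP/1.1" 200 9429 111',
    'XX.XX.XX.14 - - [10/Jul/2010:10:25:02 -0800] "GET /search?q=zzqxprobe1 HTTP/1.1" 200 1830 0',
    'XX.XX.XX.14 - - [10/Jul/2010:10:25:40 -0800] "GET /search?q=zzqxprobe2 HTTP/1.1" 200 1830 0',
]
print(candidate_hit_positions(lines, probes))  # [8]
```

If more than one position comes back, running a few more probe searches, or one search with a known nonzero match count, should narrow it down.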

       BASIC FIELDS

      A simple query entry in this log format looks like this:

      XX.XX.XX.14 - - [10/Jul/2010:10:24:13 -0800] "GET /search?q=noise HTTP/1.1" 200 9429 111

      We can break that down into fields for better analysis, as shown in Table 2-2.

      Table 2-2.

http://www.flickr.com/photos/rosenfeldmedia/5826101122/

Fields By Position
position | #1 | #2 | #3 | #4 | #5 | #6 | #7 | #8
meaning | ip | - | - | date/timestamp | search request | response code | bytes | hits
example | XX.XX.XX.14 | - | - | [10/Jul/2010:10:24:13 -0800] | "GET /search?q=noise HTTP/1.1" | 200 | 9429 | 111
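For readers processing the data themselves, the breakdown by position could be sketched in Python roughly as follows. The pattern assumes the hit count is hung off the end as the eighth field, as in the example entry, and that the search terms travel in a `q` parameter; `parse_entry` and the group names are illustrative choices, not part of any log standard.

```python
import re
from datetime import datetime
from urllib.parse import parse_qs, urlsplit

# Assumption: eight fields in the order of the example entry, hit count last.
ENTRY_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<auth_user>\S+) (?P<user_name>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) (?P<hits>\d+)'
)


def parse_entry(line, query_param="q"):
    """Parse one search log line into a dict, or return None if it does not match."""
    match = ENTRY_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Date, time, and GMT offset share one bracketed field.
    entry["timestamp"] = datetime.strptime(entry["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    for key in ("status", "bytes", "hits"):
        entry[key] = int(entry[key])
    # Pull the search terms out of the URL parameters.
    params = parse_qs(urlsplit(entry["url"]).query)
    entry["query"] = params.get(query_param, [""])[0]
    return entry


line = 'XX.XX.XX.14 - - [10/Jul/2010:10:24:13 -0800] "GET /search?q=noise HTTP/1.1" 200 9429 111'
entry = parse_entry(line)
print(entry["query"], entry["hits"])  # noise 111
```

Returning None for lines that do not match keeps the sketch tolerant of non-search requests mixed into the same log. A different engine may need the `hits` group moved or the `query_param` changed.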

      Table 2-3 provides even more detail on each field.

      Table 2-3.

http://www.flickr.com/photos/rosenfeldmedia/5826101190/

Details About Fields
Position | Field | Example | Meaning
#1 | IP or host name | XX.XX.XX.14 | ID of the computer sending the search
#2 | auth. user | - | usually empty, RFC931 authentication
#3 | user name | - | usually empty
#4a | date | [10/Jul/2010 | date of the query in standard form
#4b | time | :10:24:13 | time of the query in standard form
#4c | offset | -0800] | offset time from GMT[a]
#5a | request | "GET | HTTP results (form action)
#5b | URL | /search.html | search results page URL
#5c | parameters | ?query=noise | search terms and other options
#5d | version | HTTP/1.1" | version (always the same)
#6 | response code | 200 | server response code (if it’s not 200, you are in trouble)
#7 | bytes | 9429 | bytes returned (the size
