Business Experiments with R. B. D. McCullough
Figure 1.6 Mobile landing page test for storage company.
Our last example shows a test to determine whether it is beneficial to include a video icon on the product listing to indicate that there is a video available for the product. The images in Figure 1.7 show a product listing without the icon (left) and with the icon (right). These images appear on the product listing page that shows all the products in a particular category (e.g. dresses, tops, shoes). In this test, users were assigned to either never see the video icons or to see the video icons for every product that had a video available. The two groups were compared based on the percentage of sessions that viewed a product detail page, which is the page the user sees when she clicks on one of the product listing images. The hypothesis was that the icons would encourage more people to click to see the product details where they can view the video. The retailer also measured the total sales ($) per session. Do you think the icons will encourage users to click through to the product page?
Figure 1.7 Video icon test.
Source: Elias de Carvalho/Pexels.
Here we have shown four examples of website tests, but the options for testing websites and other digital platforms like apps or kiosks are nearly limitless. The growth in website testing has been driven largely by software that manages the randomization of users into test conditions. Popular software options include Optimizely, Maxymiser, Adobe Test&Target, Visual Website Optimizer, and Google Experiments. These tools integrate with digital analytics software like Google Analytics or Adobe Analytics that tracks user behavior on websites, which provides the data to compare the two versions. Most major websites will have testing software installed and often have a testing manager whose job is to plan, conduct, analyze, and report the results of tests. Newer software tools also make it possible to run tests on email, mobile apps, and other digital interfaces like touch‐screen kiosks or smart TV user interfaces. These website tests represent the ideal business experiment: we typically have a large sample of actual users, users are randomly assigned to alternative treatments, user behavior is automatically measured by the web analytics solution, and the comparison between the two groups is a good estimate of the causal difference between the treatments. It is also relatively easy to implement the better version as soon as you get the results, and so these types of tests have a major impact on how websites are managed.
So, how good are you at guessing which version is better? Table 1.6 shows the winning treatment for each of the tests (in bold) along with the lift in performance. If you were able to guess the results to all four examples, then you are a gifted web designer. Even experienced website managers frequently guess incorrectly, and user behavior changes over time and from one website to another, which means that the only way to figure out which version is better is to run a test.
Table 1.6 Summary of web test results.
| Test | A treatment | B treatment | Response measures | Result |
|---|---|---|---|---|
| Email sign‐up | No incentive | **$10 incentive** | # of sign‐ups | 300% lift |
| Skirt images | Head‐to‐toe | **Cropped** | Skirt sales ($/session) | 7% lift |
| Location search | Zip search | **GPS search** | Sign‐ups | 40% lift |
| | | | Rentals | 23% lift |
| Video icon | No icon | Icon | % to product detail | No significant |
| | | | Sales ($/session) | difference |
Note: Winning treatment shown in boldface.
Notice that Table 1.6 shows a different response measure for each test. The response measure (KPI) in an experiment is simply the measure that is used to compare the performance of two treatments. This measure is usually directly related to the business goal of the treatment. For instance, the purpose of the Dell Landing Page is to get people to sign up to talk to a Dell representative, so the percentage of users who submit a request is a natural response measure. Dell could have selected a different response measure, such as the % of users who actually speak with a Dell representative or who sign up and pay for services. In some cases, the test will include several response measures; the video icon test used both the % of users that viewed a product detail page, which is closely related to the goal of the video icon, and the sales per session, which reflects the ultimate business goal of any retail website. We will discuss the selection of response measures later, but for now it is sufficient to recognize that choosing a response measure that relates to business goals is a critical (and sometimes overlooked) part of test design.
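A comparison of this kind is easy to sketch in R. The counts below are hypothetical, invented purely for illustration (they do not come from any test in the book): we compare a sign-up response measure across two treatments, compute the percentage lift, and run a two-sample test of proportions.

```r
# Illustrative only: these counts are invented, not from any test in the book.
# Suppose 10,000 sessions per treatment and the following sign-up counts.
signups  <- c(A = 200, B = 260)
sessions <- c(A = 10000, B = 10000)

# Conversion rate (the response measure) for each treatment
rate <- signups / sessions

# Percentage lift of B over A, in the style reported in Table 1.6
lift <- unname((rate["B"] - rate["A"]) / rate["A"] * 100)
lift  # 30

# Two-sample test of equal proportions: is the difference significant?
prop.test(signups, sessions)
```

`prop.test()` reports a p-value for the null hypothesis that the two treatments convert at the same rate; with counts this size the 30% lift is statistically significant, but the same lift on far fewer sessions would not be.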
Table 1.6 reports the test results in terms of a percentage lift. For example, in the Dell Landing Page test, the lift was 36%, which means that the hero image produced 36% more submissions. Table 1.6 only reports the lift numbers for test results that were found to be significant. Significance tests are used to determine whether there is enough data to say that there really is a difference between the two treatments. Imagine, for example, that we had test data on only five users: two who saw version A and looked at product details and three who saw version B and did not look at product details. Is this enough data to say that A is better than B? Your intuition probably tells you that it isn't, which is true, but when samples are a bit bigger,