
By the time we've reached the fourth or fifth number, we will have forgotten the first one we looked at.

Let's try a trend line, as shown in Figure 1.2.

Figure 1.2 Now can you see the trends?

Now we have much better insight into the trends. Office supplies has been the lowest-selling product category in all but two quarters. Furniture sales have been dropping slowly over the time period, except for a bump in 2015 Q4 and a rise in the last two quarters. Technology sales have mostly been the highest but were particularly volatile at the start of the time period.

      The table and the line chart each visualized the same 48 data points, but only the line chart lets us see the trends. The line chart turned 48 data points into three chunks of data, each containing 16 data points. Visualizing the data hacks our short-term memory; it allows us to interpret large volumes of data instantly.
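The chunking described above can be sketched in a few lines of Python. This is only an illustration: the row layout, the quarter index of 1 to 16, and the placeholder sales values of 0.0 are our assumptions, not the data behind Figure 1.2.

```python
from collections import defaultdict

# Assumed layout: one (category, quarter, sales) row per table cell.
# Quarter is a plain index 1..16 and every sales value is a placeholder.
categories = ["Furniture", "Office Supplies", "Technology"]
quarters = list(range(1, 17))
rows = [(c, q, 0.0) for q in quarters for c in categories]


def group_into_series(rows):
    """Group flat (category, quarter, value) rows into one series per category."""
    series = defaultdict(list)
    for category, quarter, value in rows:
        series[category].append((quarter, value))
    # Sort each series by quarter so it can be drawn left to right as a line.
    return {name: sorted(points) for name, points in series.items()}


series = group_into_series(rows)
print(len(rows), "table cells become", len(series), "lines")
for name, points in series.items():
    print(name, "has", len(points), "points")
```

A table asks the reader to hold every cell in mind; the grouped form asks for only one shape per category.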

      How Do We Visualize Data?

      We've just looked at some examples of the power of visualizing data. Now we need to move on to how we build the visualizations. To do that, we first need to look at two things: preattentive attributes and types of data.

      Preattentive Attributes

      Visualizing data requires us to turn data into marks on a canvas. What kind of marks make the most sense? One answer lies in what are called “preattentive attributes.” These are things that our brain processes in milliseconds, before we pay attention to everything else. There are many different types. Let's look at an example.

Look at the numbers in Figure 1.3. How many 9s are there?

Figure 1.3 How many 9s are there?

How did you do? It's easy to answer the question – you just look at all the values and count the 9s – but it takes a long time. We can make one change to the grid and make it very easy for you. Have a look at Figure 1.4.

Figure 1.4 Now it's easy to count the 9s.

      Now the task is easy. Why? Because we changed the color: 9s are red, and all the other numbers are light gray.
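To try the same trick in a terminal, here is a minimal sketch that prints a grid with one digit in red and the rest in gray. The tiny grid is hypothetical, and the ANSI escape codes assume a terminal that supports them.

```python
# ANSI escape codes; most modern terminals honor these.
RED = "\033[31m"
GRAY = "\033[90m"
RESET = "\033[0m"


def highlight_digit(grid, target):
    """Return the grid as text with `target` in red and every other digit in gray."""
    lines = []
    for row in grid:
        cells = [f"{RED if d == target else GRAY}{d}{RESET}" for d in row]
        lines.append(" ".join(cells))
    return "\n".join(lines)


# Hypothetical 3-by-4 grid, far smaller than the grid in the figure.
grid = [[4, 9, 1, 7], [9, 3, 8, 2], [5, 6, 9, 0]]
print(highlight_digit(grid, 9))
```

Only one attribute changes between the target and everything else, which is what lets it pop out.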

Color differences pop out. It's as easy to find one red 9 in a table of hundreds of digits as it is in a 10-by-10 grid. Think about that for a moment: Your brain registered the red 9s before you consciously addressed the grid to count them. Check out the grid of 2,500 numbers in Figure 1.5. Can you see the 9?

Figure 1.5 There is a single 9 in this grid of 2,500 numbers. We wager you saw it before you started reading any other numbers on this page.

      It's easy to spot the 9. Our eyes are amazing at spotting things like this.

Color (in this case, hue) is one of several preattentive attributes. When we look at a scene in front of us, or a chart, we process these attributes in under 250 milliseconds. Let's try out a couple more preattentive features with our table of 9s. In Figure 1.6, we've made the 9s a different size from the rest of the figures.

Figure 1.6 Differences in size are easy to see too.

Size and hue: Aren't they amazing? That's all very well when counting the 9s. What if our task is to count the frequency of each digit? That's a slightly more realistic task, but we can't just use a different color or size for each digit. That would defeat the preattentive nature of the single color. Look at the mess that is Figure 1.7.

Figure 1.7 Coloring every digit is nearly as bad as having no color.

It's not a complete disaster: If you're looking for the 6s, you just need to work out that they are red and then scan quickly for those. Using one color on a visualization is a highly effective way to make one category stand out. Using a few colors, as we did in Figure 1.2 to distinguish a small number of categories, is fine too. Once you're up to around eight to ten categories, however, there are too many colors to easily distinguish one from another.
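One way to turn this guideline into a rule of thumb in code is sketched below. The function name, the color names, and the hard cutoff of eight are all our assumptions for illustration; the guideline itself is looser than any fixed number.

```python
HIGHLIGHT = "red"
MUTED = "lightgray"
# Hypothetical palette of eight color names; any clearly distinct hues would do.
PALETTE = ["blue", "orange", "green", "red", "purple", "brown", "pink", "olive"]


def assign_colors(categories, focus=None):
    """Map each category to a color name.

    With a focus, only that category gets a color and the rest are muted.
    Without one, each category gets its own hue, but only if there are few
    enough categories to tell apart.
    """
    if focus is not None:
        return {c: (HIGHLIGHT if c == focus else MUTED) for c in categories}
    if len(categories) <= len(PALETTE):
        return dict(zip(categories, PALETTE))
    raise ValueError("too many categories to color distinctly; pick a focus or aggregate")


print(assign_colors(["Furniture", "Office Supplies", "Technology"]))
print(assign_colors(list(range(10)), focus=9))
```

Raising an error for ten unfocused categories is a deliberately blunt choice: it forces the decision between highlighting one category and aggregating.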

To count each digit, we need to aggregate. Visualization is, at its core, about encoding aggregations, such as frequency, in order to gain insight. We need to move away from the table entirely and encode the frequency of each digit. The most effective way is to use length, which we can do in a bar chart. Figure 1.8 shows the frequency of each digit. We've also colored the bar showing the number 9.
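As a rough sketch of this aggregation step, the snippet below counts each digit and encodes the count as the length of a text bar. The sample digits are made up, and the `<` marker stands in for the colored bar.

```python
from collections import Counter


def digit_frequencies(digits):
    """Count how often each digit 0-9 occurs, including digits that never appear."""
    counts = Counter(digits)
    return {d: counts.get(d, 0) for d in range(10)}


def text_bar_chart(freqs, highlight=None):
    """Encode each count as bar length; mark the highlighted digit with '<'."""
    lines = []
    for digit, count in freqs.items():
        marker = " <" if digit == highlight else ""
        lines.append(f"{digit} | {'#' * count} {count}{marker}")
    return "\n".join(lines)


# Made-up sample; the grid in the book is much larger.
sample = [9, 1, 9, 3, 1, 9, 0, 7]
freqs = digit_frequencies(sample)
print(text_bar_chart(freqs, highlight=9))
```

Once the counts exist, the original grid is no longer needed: the bars carry the answer.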

Figure 1.8 There are 13 9s.

Since the task is to count the 9s in the data source, the bar chart is one of the best ways to see the results. This is because length and position are best for quantitative comparisons. If we extend the example one final time and consider which numbers are most common, we could sort the bars, as shown in Figure 1.9.

Figure 1.9 Sorted bar chart using color and length to show how many 9s are in our table.

      This series of examples with the 9s reemphasizes the importance of visualizing data. As with Anscombe's Quartet, we went from a difficult-to-read table of numbers to an easy-to-read bar chart. In the sorted bar chart, not only can we count the 9s (the original task), but we also know that 9 was the third most common digit in the table. We can also see the frequency of every other digit.
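Sorting the counts is what makes a rank readable at a glance. Here is a minimal sketch: only the count of 13 for the digit 9 comes from the text, and every other count is a placeholder we invented so that the total fits a 10-by-10 grid.

```python
def rank_by_frequency(freqs):
    """Return (digit, count) pairs from most to least frequent; ties go to the smaller digit."""
    return sorted(freqs.items(), key=lambda item: (-item[1], item[0]))


def rank_of(digit, freqs):
    """Return the 1-based position of `digit` in the sorted order."""
    ordered = [d for d, _ in rank_by_frequency(freqs)]
    return ordered.index(digit) + 1


# Placeholder counts, except for the 13 nines mentioned in the text.
freqs = {0: 9, 1: 15, 2: 8, 3: 10, 4: 14, 5: 7, 6: 11, 7: 6, 8: 7, 9: 13}

for digit, count in rank_by_frequency(freqs):
    print(digit, "#" * count, count)
print("rank of 9:", rank_of(9, freqs))
```

With these placeholder counts the 9 lands third, matching the rank described above, but the other bars should not be read as the book's data.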

The series of examples we just presented used color, size, and length to highlight the 9s. These are three of many preattentive attributes. Figure 1.10 shows 12 that are commonly used in data visualization.

Figure 1.10 Preattentive features.

      Some of them will be familiar to you from charts you have already seen. Anscombe's Quartet (see Figure 1.1) used position and spatial grouping. The x- and y-coordinates are for position, while spatial grouping allows us to see the outliers and the patterns.

      Preattentive attributes provide us with ways to encode our data in charts. We'll look into that in more detail in a moment, but not before we've talked about data.

      To recap, we've seen how powerful
