Practical Data Analysis with JMP, Third Edition. Robert Carver

a Macintosh, it is an arrowhead ►). Disclosure buttons enable you to expand or contract the amount of information displayed on the screen. The disclosure button shown here lets you temporarily hide the three panels discussed above.

      4. Try it out! Click the disclosure button to hide and then reveal the panels. Also click the red triangles and the Header Graphs icon and notice what happens.

      The red triangles offer you menu alternatives that will not mean much at this point, but which we will discuss in the next section. The red triangle in the upper right corner (above the diagonal line) relates to the columns of the grid, and the one in the lower left corner to the rows.

      Below the right-hand red triangle is a small icon that looks like a bar chart. This opens thumbnail descriptive graphs for each column.

      The top row of the grid contains the column names, and the left-most column contains row numbers. The cells contain the data.

      Our main interest within this data table is how life expectancy varies around the world. Variation is so common as to be unremarkable, but the very fact that values vary is what leads us to analyze them. We can imagine many reasons that life expectancy varies around the world: there are differences in nutrition, wealth, access to health care and clean water, education, political stability, and so on. Are there systematic differences in different parts of the world?

      We have a table displaying all 215 countries, but it is difficult to detect patterns by scanning up and down a long list. As a first step in analysis, we will make some simple graphs to summarize the table information visually. Software affords us many options to visualize a set of data and can help us discover errors in the recording of the raw data, locate important patterns of variability, or identify possible connections between and among variables. JMP’s Graph Builder is an intuitive, interactive platform for visualization.

      1. From the Life Expectancy 2017 data table window, click Graph ► Graph Builder.

      Graph Builder gives us a Cartesian plane on which we can create a visualization representing multiple columns in a single display. There are numerous options available, but in this first example, we will look at just a few.

      In analyzing this set of data, our primary interest lies in the variation of life expectancy. Following one of JMP’s conventions, we will think of this column as our Y variable.

      2. To display life_exp on the Y axis, click the life_exp column in the panel of Variables, and drag it to the vertical Y drop zone in the Graph Builder window. When you do this, your screen should look like Figure 1.6.

      Figure 1.6: Using the Graph Builder

      In this graph, each dot represents the value for one country. If you move your cursor to any dot and hover, the name of the country and other data appear. So, for instance, we find that Hong Kong enjoyed the longest life expectancy. Notice that the reported life expectancies lie between approximately 53 years and 85 years, with a large number of countries enjoying life expectancies above 65 years.

      By default, JMP jitters the points in this graph (see the drop-down menu next to Jitter just below the list of variables). This spreads the points apart to the left and right, so that identical or similar values do not overlap in the graph.

      3. Click the Jitter menu and select None. You will see why jittering has its advantages. Explore the other options as well.
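
      The idea behind jittering can be sketched outside JMP in a few lines of Python. This is only an illustration under stated assumptions: the function name, the offset width, and the data values are all hypothetical, and the sketch does not claim to reproduce the method JMP uses to place its points. It shows that identical Y values plotted at the same X position collapse into one visible dot, while a small random horizontal offset keeps them apart without changing any Y value.

```python
import random


def jitter(values, width=0.4, seed=0):
    """Pair each Y value with a small random horizontal offset.

    Hypothetical helper for illustration; JMP's own jitter options
    are not reproduced here.
    """
    rng = random.Random(seed)
    return [(rng.uniform(-width / 2, width / 2), y) for y in values]


# Made-up values for illustration; not taken from the Life Expectancy 2017 table.
life_exp = [72.0, 72.0, 72.0, 65.5, 81.2]

# Without jitter every point sits at x = 0, so equal values overlap exactly.
no_jitter = [(0.0, y) for y in life_exp]

# With jitter each point gets its own x offset; the y values are untouched.
jittered = jitter(life_exp)

print("distinct plotted positions without jitter:", len(set(no_jitter)))
print("distinct plotted positions with jitter:", len(set(jittered)))
```

      Only the horizontal position is disturbed, so the vertical scale can still be read exactly as before.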

      Now let’s see how the values compare across different regions in the world. In the data table, we have already assigned a different color to each region, but have not provided a legend to explain the color coding.

      4. One way to produce a legend is to drag Region to the Color drop zone.

      Each global region is colored so that all the countries in East Asia and Pacific, for example, are red. This immediately reveals that nearly all the countries with short life expectancy are in Sub-Saharan Africa. This fact was not at all obvious from the initial data table; that is what visualization can do for us.

      5. Now move the cursor back to the list of columns and once again choose Region, and this time drag it to the Group X drop zone at the top of the tableau.

      When you do this, you will now have seven adjacent small graphs showing the values from each region. As you examine these graphs, you might notice that the values vary vertically within each region and that the patterns of variation are similar in some regions but dramatically different in others. The study of descriptive statistics largely revolves around common patterns of variation, comparisons of those patterns, and deviations from those patterns. Here again, it is very evident that the nations of Sub-Saharan Africa largely have the shortest life expectancies in the world. What other general patterns emerge?
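
      For readers who like to see the logic spelled out, the grouping that the Group X drop zone performs visually can be sketched in Python. This is a rough illustration, not a description of JMP's internals: the rows below are invented, the helper names are hypothetical, and only two of the region names from the data table are used.

```python
from collections import defaultdict
from statistics import median

# Invented (region, life expectancy) rows for illustration only;
# these are not values from the Life Expectancy 2017 table.
rows = [
    ("East Asia and Pacific", 76.1),
    ("East Asia and Pacific", 69.4),
    ("East Asia and Pacific", 83.0),
    ("Sub-Saharan Africa", 58.2),
    ("Sub-Saharan Africa", 61.7),
    ("Sub-Saharan Africa", 54.9),
]


def group_by_region(rows):
    """Collect the Y values of each region into its own list (one list per panel)."""
    groups = defaultdict(list)
    for region, value in rows:
        groups[region].append(value)
    return dict(groups)


def summarize(values):
    """Describe the vertical spread within one panel."""
    return {"min": min(values), "median": median(values), "max": max(values)}


summaries = {
    region: summarize(values)
    for region, values in group_by_region(rows).items()
}

for region, s in summaries.items():
    print(f"{region}: min={s['min']}, median={s['median']}, max={s['max']}")
```

      Comparing the minimum, median, and maximum across groups is a numerical counterpart to comparing the small graphs side by side.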

      Because the data are reported geographically, another useful way to examine the patterns is to overlay them on a map. Doing so magnifies a few key points.

      6. In Graph Builder, click the Start Over button in the upper left.

      7. Drag the Country Code column to the lower left of Graph Builder into the drop zone labeled Map Shape.

      8. Now drag life_exp over the map and release the mouse button. Alternatively, you might drag life_exp into the Color drop zone. Your map should now look like Figure 1.7. At this point, click the Done button.

      Figure 1.7: Map of the World Colored by Life Expectancy

      As the legend to the right indicates, the countries shaded dark red enjoy the longest life expectancies and those shaded dark blue have the shortest. This map is an alternative way to see how life expectancy varies around the world.

      Please note two limitations of this graph. You might have spotted a white “hole” in the center of Africa. The white area marks countries for which JMP found no data in our data table. Additionally, there is a notation at the bottom of the graph indicating that JMP did not recognize some of the country abbreviations, and hence did not display them on the map.
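
      Both limitations come down to a mismatch between two lists of codes: the codes in the data table and the codes attached to the map shapes. A minimal Python sketch of that comparison follows. The function name and the codes are hypothetical placeholders, and the sketch makes no claim about how JMP matches codes to shapes.

```python
def compare_codes(table_codes, map_codes):
    """Report the two kinds of mismatch between a data table and a map.

    Hypothetical helper for illustration only.
    """
    table_codes = set(table_codes)
    map_codes = set(map_codes)
    return {
        # Shapes on the map with no matching row: drawn without color.
        "no_data": sorted(map_codes - table_codes),
        # Rows whose code matches no shape: left off the map.
        "unrecognized": sorted(table_codes - map_codes),
    }


# Placeholder codes, not real country abbreviations.
map_codes = ["AAA", "BBB", "CCC", "DDD"]
table_codes = ["AAA", "BBB", "DDD", "ZZ9"]

result = compare_codes(table_codes, map_codes)
print("shapes with no data:", result["no_data"])
print("codes not recognized:", result["unrecognized"])
```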

      Of course, data analysis is not limited to graphing and mapping—there are numbers to be crunched, and JMP will do the heavy computational work. We have many pages ahead of us to learn how to request and to interpret many useful computations. With this set of data, we will summarize life expectancy in different parts of the world. Don’t worry about the details of these steps. The goal right now is just for you to see a typical JMP platform and its output.

      Windows users: the next instruction asks you to select an option from the Analyze menu, but there is no visible menu bar in the Graph Builder window. At the top of the window, just above Graph Builder, find the gray horizontal bar with three dots. (See Figure 1.7.) Hover over the bar and the menus will appear.

      1. Select Analyze ► Fit Y by X. This analysis platform lets us plot one variable (life expectancy) versus another (region).

      Why “fit” Y by X? Analysts often speak of fitting an abstract or theoretical model to a set of data. We can think of models as common or standard patterns of variation, and the process of model fitting begins with exploring how a Y column varies across categories or values of an X column.
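
      One very simple instance of fitting, when X is a set of categories, is to use each category's mean as the model's prediction for Y and to look at what is left over. The Python sketch below illustrates that idea only; the category labels, values, and function name are invented, and it is not presented as the computation the Fit Y by X platform performs.

```python
from statistics import mean

# Invented (category, Y) rows for illustration only.
rows = [("A", 70.0), ("A", 74.0), ("B", 58.0), ("B", 62.0), ("B", 60.0)]


def fit_group_means(rows):
    """Fit the simplest model of Y by a categorical X: one mean per category."""
    by_category = {}
    for x, y in rows:
        by_category.setdefault(x, []).append(y)
    return {x: mean(ys) for x, ys in by_category.items()}


model = fit_group_means(rows)

# A residual is the part of each observation that the model does not explain.
residuals = [(x, y - model[x]) for x, y in rows]

for x, fitted in model.items():
    print(f"category {x}: fitted mean = {fitted}")
for x, r in residuals:
    print(f"category {x}: residual = {r}")
```

      The fitted means capture the systematic difference between categories, and the residuals show the variation that remains within each one.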
