Using Stata for Quantitative Analysis. Kyle C. Longest


Next, in Stata, open the Data Editor window, highlight the upper left data cell, then right-click and choose Paste, or use the paste shortcut (Ctrl+V). Once you have pasted in the data, you should be presented with a window that asks whether you want to Treat First Row as Data or Treat First Row as Variable Names. At this point you will need to determine whether the Excel data are entered with the first row as data (Figure 1.6) or as variable names (Figure 1.5) and make the appropriate selection. After you have selected the option that fits the type of data file you have, close the Data Editor and follow the previously described steps to save the data from within Stata as a Stata data file.⁴

      ⁴ This “copy and paste” method may seem to be the easiest way to transfer data from Microsoft Excel into a Stata format, especially for novice users. But there are some disadvantages to this strategy. More practiced users should save Excel worksheets as .csv files and then use the -insheet- command. The specifics of this command are beyond the scope of this introductory text, but the Stata Help Files section of Chapter 8 explains how Stata’s Help files can be used to learn this command.
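      The .csv route mentioned in the footnote might look roughly like the sketch below. It is only an illustration under several assumptions: the file names survey_example.csv and survey_example.dta are hypothetical, the three cases are invented, and the .csv file is created from within Stata simply so that the example runs on its own (in practice the file would come from Excel). The sketch uses -import delimited-, which replaced -insheet- as the documented command in Stata 13 and later; the older -insheet- form is shown in a comment.

```stata
* Build a tiny stand-in data set (values are invented for illustration)
clear
input ids age
1 34
2 27
3 45
end

* Write it out as a .csv file, standing in for a worksheet saved from Excel
export delimited using "survey_example.csv", replace

* Read the .csv back in, treating the first row as variable names
import delimited using "survey_example.csv", varnames(1) clear
* Older equivalent: insheet using "survey_example.csv", comma names clear

* Inspect the result and save it as a Stata data file
describe
save "survey_example.dta", replace
```

      Because the import is a typed command rather than a copy-and-paste step, it can be stored in a do-file and rerun whenever the source file changes.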

      A CLOSER LOOK: COMBINING DATA

      Often when you are conducting analyses, you will need to expand your data set. Typically, this type of expansion involves one of two possibilities: (1) adding new cases to the data; or (2) adding new variables connected to existing cases. Combining data in either of these situations is not terribly complicated in Stata, but the details depend heavily on the particular data situation. Therefore, rather than covering the Stata commands for these tasks in detail, here I provide a general overview of when and why you might need to combine data in these ways, and I note the Stata command that allows you to do so. Because completing these types of data combinations is somewhat more advanced, I would encourage readers who realize they may need these commands to first work their way through the book so that they feel comfortable with Stata. Then return to this section and combine it with the information in the Stata Help Files section of Chapter 8 to guide yourself through the exact command you will need for your particular data combination goals.

      To fully understand the different types of data combinations that are possible, it may be helpful to use an example. Consider the data you have been working with so far in this chapter as your “original” data that may have come from a survey you conducted. Figure 1.7 presents the data as they are right now.

      FIGURE 1.7 • ORIGINAL DATA

      Appending Data

      As an example of the first combination scenario, consider that the current 10 cases were the first set of respondents who completed your survey. You may have begun your analyses assuming that these cases were the only ones who had decided to respond to the survey. But, as is often the case, after a few more weeks you notice that 10 new respondents submitted their survey information. You would of course want to include these tardy completers in your final analyses.

      In this scenario, you would have a second data set of these 10 new cases, and it would look like that presented in Figure 1.8.

      FIGURE 1.8 • NEW OBSERVATIONS WITH SIMILAR VARIABLES

      As you can see, these new data contain the exact same variables as the original data, but the set has 10 new respondents, which is indicated by their 10 unique ids values. The type of data combination you would want to conduct in this situation is referred to as “appending” in Stata because you are adding new cases to an existing data set. Therefore, the command to complete this data combination is -append-. Again, for the specifics of how to complete this command, see the Stata Help Files section of Chapter 8 to learn how to use the help file to teach yourself the full details of the -append- process. But to help you see what the end product is, Figure 1.9 displays what the data would look like if you used the -append- command to join the two example data sets.

      FIGURE 1.9 • NEW APPENDED DATA SET

      You can see that the new data set now contains 20 cases, which are the original 10 respondents plus the 10 more recent survey completers, and each has information on all of the same variables.
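      As a rough sketch of the appending scenario, the do-file below stacks a set of late respondents under the original cases. The values, the variable names other than ids, and the temporary file names are all invented for illustration, and only three cases per file are used to keep the example short. It is meant to be run as a single do-file, since it relies on temporary files.

```stata
* First set of respondents (hypothetical values; 3 cases for brevity)
clear
input ids age str6 gender
1 34 "Female"
2 27 "Male"
3 45 "Female"
end
tempfile original
save `original'

* Late respondents: same variables, new ids values
clear
input ids age str6 gender
11 52 "Male"
12 31 "Female"
13 39 "Male"
end
tempfile latecomers
save `latecomers'

* Stack the late respondents underneath the original cases
use `original', clear
append using `latecomers', generate(wave)

* wave is 0 for cases from the original file and 1 for appended cases
list, sepby(wave)
assert _N == 6
```

      The optional generate() variable is a convenient way to keep track of which file each case came from after the two sets have been combined.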

      Merging Data

      There are two typical variants of the second combination scenario. The first occurs when you conduct a follow-up survey on the same set of cases, such as in a pre-test/post-test design. Here you would want to add the new variables (i.e., post-test responses) to the initial data set from the pre-test. In this situation your new data would look like those presented in Figure 1.10.

      FIGURE 1.10 • NEW VARIABLES FOR ORIGINAL RESPONDENTS

      If you compare these new data to the original, you will notice that these are the same 10 cases, indicated by their matching ids values. Also, none of their gender identifications have changed. But all of their ages have increased by a year (as if this follow-up survey was conducted 1 year after the initial survey), and several of their employment status and religion responses have changed. In this case, you would be looking to attach this new information to the original data to potentially examine why some respondents shift employment categories or religions, for example.
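      A minimal sketch of attaching follow-up responses with Stata's -merge- command is shown below, using the 1:1 syntax available in Stata 11 and later. The values, the variable names other than ids, and the _t2 suffix are assumptions made for illustration. One point worth knowing is that, by default, -merge- keeps the values already in memory for any non-key variable that appears in both files, so the follow-up variables are renamed before merging to keep both sets of answers.

```stata
* Follow-up survey (hypothetical values), one row per respondent
clear
input ids age str10 employment
1 35 "Full-time"
2 28 "Part-time"
3 46 "Unemployed"
end
* Rename so the follow-up answers do not collide with the originals
rename age age_t2
rename employment employment_t2
tempfile followup
save `followup'

* Original survey (hypothetical values)
clear
input ids age str10 employment
1 34 "Full-time"
2 27 "Unemployed"
3 45 "Unemployed"
end

* Match each respondent to their own follow-up record using ids
merge 1:1 ids using `followup'

* _merge == 3 marks cases found in both files
tabulate _merge
assert _merge == 3
list ids employment employment_t2
```

      With both the original and the follow-up answers side by side, changes such as a shift in employment status can be examined case by case.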

      A second variant that involves the same data combination process would be where you would like to include new variables for existing cases that correspond to some other information about these cases. For example, perhaps you have a set of survey responses from adults who recently visited a hospital. You may want to bring in new variables that involve information about the particular hospital each case visited. Or following the example we have been using, you may want to bring in information about the religious denomination with which they affiliate. In this situation, your new data might look like those shown in Figure 1.11.

      FIGURE 1.11 • DENOMINATION SPECIFIC DATA

      In these data, you will notice that the information pertains to the particular religion, not the respondents. The variables therefore are information about how many total Baptists there are, or whether Mormonism would be considered an evangelical denomination. Of course, in a real situation, you could have a great deal more information about each denomination that may be useful in analyzing your survey data. Notice that the new data do not include every denomination that is present in your original data. This situation can occur with this type of combination and will not cause a problem for Stata.
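      The denomination scenario could be sketched with a many-to-one merge, in which several respondents share a single row of denomination-level information. Everything below is illustrative: the counts are placeholder numbers rather than real figures, and the variable names religion and total_members are assumptions. The fourth respondent deliberately has a religion that is absent from the denomination file, to show what happens to unmatched cases.

```stata
* Denomination-level file (placeholder values), one row per religion
clear
input str10 religion total_members
"Baptist" 1000
"Mormon" 500
end
tempfile denominations
save `denominations'

* Respondent-level survey data (hypothetical values)
clear
input ids str10 religion
1 "Baptist"
2 "Mormon"
3 "Baptist"
4 "Catholic"
end

* Many respondents can share one denomination row, hence m:1
merge m:1 religion using `denominations'

* Respondent 4 has no denomination row: the case is kept,
* with total_members set to missing and _merge equal to 1
tabulate _merge
list
assert _merge == 3 if religion != "Catholic"
assert _merge == 1 & missing(total_members) if religion == "Catholic"
```

      The unmatched respondent stays in the data rather than being dropped, and the _merge variable makes it easy to see which cases did and did not receive denomination information.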

      Both of these situations are referred to as “merging” in Stata because you are bringing in new information about the existing cases. As you may have guessed, then, the command to complete the combination is -merge-. One key difference in the two types of merges is what exactly you are merging on. Understanding this difference is the key to completing the merge correctly. In the first merge example, you
