Exercises and Projects for The Little SAS Book, Sixth Edition. Lora D. Delwiche


The American Kennel Club (AKC) ranks dog breeds each year by number of registrations. These data are in the raw data file AKCbreeds.dat. For each breed, the data include the breed name and its rank in each of four years. A missing rank means that the AKC did not recognize the breed in that year.

      a. Open the raw data file AKCbreeds.dat in a simple editor such as WordPad. In a comment in your program, state the number of variables and observations.

      b. Read the raw data file into SAS. View the log to verify that your data set has the same number of variables and observations as you stated in part a.

      c. Print the data set.

      47. The World Health Organization (WHO) monitors vaccine recommendations in countries around the world. The raw data file Vaccines.dat contains the recommended vaccines for a sample of 13 countries. The variables in this file are vaccine name, mode of disease transmission, worldwide incidence, worldwide deaths, and recommendations (stored in 13 individual columns for the respective countries of Chile, Cuba, United States, United Kingdom, Finland, Germany, Saudi Arabia, Ethiopia, Botswana, India, Australia, China, and Japan).

      a. Open the raw data file Vaccines.dat in a simple editor such as WordPad. In a comment in your program, state the number of variables and observations.

      b. Read the raw data file into SAS. View the log to verify that your data set has the same number of variables and observations as you stated in part a.

      c. Print the data set.

      48. Each year, Forbes magazine publishes a list of the world’s 100 biggest companies. Each company receives a score using four metrics: sales, profits, assets, and market value. The final overall ranking is based on a composite score of these metrics. The variables in the raw data file BigCompanies.dat are ranking, company name, country, sales (billions), profits (billions), assets (billions), and market value (billions).

      a. Open the raw data file BigCompanies.dat in a simple editor such as WordPad. In a comment in your program, state which variables must be read in as character and which variables should be read in as numeric.

      b. Read the raw data file into SAS.

      c. Print the data set.

      49. Crayola crayons were introduced in 1903, and since then numerous standard colors have been released. Each crayon has a unique name, which corresponds to a hexadecimal code and RGB triplet. The raw data file Crayons.dat contains information on these standard crayon colors with variables corresponding to crayon number, color name, hexadecimal code, RGB triplet, pack size, year issued, and year retired.

      a. Open the raw data file Crayons.dat in a simple editor such as WordPad. In a comment in your program, state which variables must be read in as character and which variables should be read in as numeric.

      b. Read the raw data file into a permanent SAS data set.

      c. Print the data set.
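A permanent SAS data set, as requested in part b, is one written to a library other than the temporary WORK library. The following is a minimal sketch of that idea, not a solution: the folder, the libref mylib, the data set name, the comma-delimited layout, and the two placeholder records are all assumptions and do not reflect the real contents of Crayons.dat.

```sas
/* Hypothetical folder: replace with a directory that exists on your system. */
libname mylib 'c:\MySASLib';

/* A two-level name (libref.member) writes the data set to that folder */
/* instead of the temporary WORK library.                               */
data mylib.crayons_sketch;
   infile datalines dsd truncover;
   /* Placeholder variables; the real file has more fields. */
   input Number Color :$20. Hex :$7.;
   datalines;
1,ColorA,#000000
2,ColorB,#FFFFFF
;
run;

proc print data=mylib.crayons_sketch;
run;
```

In an actual answer, an INFILE statement naming Crayons.dat would take the place of the in-stream DATALINES records.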

      50. The tallest mountains in the world are located in central and southern Asia. The raw data file Mountains.dat contains information on mountains over 7,200 meters (23,622 ft). Researchers measure the prominence of a mountain as the height above the highest saddle connecting it to a higher summit. The variables in this file are mountain name, height (m), height (ft), year of first ascent, and prominence (m).

      a. Open the raw data file Mountains.dat in a simple editor such as WordPad. In a comment in your program, state which variables must be read in as character and which variables should be read in as numeric.

      b. Read the raw data file into SAS.

      c. Print the data set.

      51. Information Technology Services (ITS) at Central State University has a computing service called “the Grid,” which is offered to faculty, staff, and students. This supercomputer is a cluster of 10 computers that, if programmed correctly in a grid environment, can process jobs much faster by distributing the work across the 10 machines. University users who would like to use the Grid computing environment must register with ITS. The raw data file CompUsers.dat contains the variables user ID, classification group (faculty, staff, or student), first name, last name, email address, campus phone number, and department.

      a. Examine the raw data file CompUsers.dat and read it into SAS.

      b. Print the data set.

      c. Write another DATA step to read the raw data file and remove the student records. Do this as efficiently as possible by testing the classification group as it is being read in with the INPUT statement.

      d. Print the data set.
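One common way to test a value while a record is still being read, as part c describes, is to end a first INPUT statement with a trailing @, which holds the line so that a later INPUT statement can continue reading it. The sketch below shows the technique under stated assumptions: the comma-delimited layout, the variable names, the lowercase value 'student', and the three placeholder records are hypothetical and are not taken from CompUsers.dat.

```sas
data nonstudents_sketch;
   infile datalines dsd truncover;
   /* Read only the leading fields, then hold the line with a trailing @. */
   input UserID :$8. Group :$8. @;
   /* Discard student records before reading the remaining fields.       */
   /* The spelling and case of the group value are assumptions.          */
   if Group = 'student' then delete;
   /* Continue reading the held line from where the pointer stopped. */
   input FirstName :$15. LastName :$15.;
   datalines;
u001,faculty,FirstA,LastA
u002,student,FirstB,LastB
u003,staff,FirstC,LastC
;
run;

proc print data=nonstudents_sketch;
run;
```

Because DELETE returns control to the top of the DATA step, the second INPUT statement never runs for a student record, so SAS does no work on fields that would be thrown away.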

      52. The World Health Organization (WHO) collected data in countries across the world regarding the outbreak of swine flu cases and deaths in 2009. The data in the file SwineFlu2009.dat include counts per country by month during the epidemic. There are many variables in the raw data file with the following descriptions:

      By date, ID for sorting by first case date

      By continent, ID (X.YY) for sorting by first case date within a continent, where X identifies the continent and YY is the country's position in order of first case date within that continent

      Country

      Date of first case reported

      Number of cumulative cases reported on the first day of the month for April, May, June, July, and August (across the columns, respectively)

      Last reported cumulative number of cases reported to WHO as of August 9, 2009

      By date, ID for sorting by first death date

      By continent, ID (X.YY) for sorting by first death date within a continent, where X identifies the continent and YY is the country's position in order of first death date within that continent

      Date of first death

      Number of cumulative deaths reported on the first day of the month for May, June, July, August, September, October, November, and December (across the columns, respectively)

      a. Examine the raw data file SwineFlu2009.dat and read it into SAS.

      b. Print a report that describes the contents of the data set including attributes of the variables.
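A report of a data set's contents and variable attributes, as in part b, is typically produced with PROC CONTENTS. The sketch below runs it on a tiny stand-in data set. The data set name, the three variables, and the single placeholder record are assumptions for illustration and do not describe the real SwineFlu2009.dat.

```sas
/* Stand-in data set with placeholder values, so the step runs on its own. */
data work.flu_sketch;
   input Country :$20. FirstCase :mmddyy10. AprCases;
   format FirstCase date9.;
   label FirstCase = 'Date of first case reported';
   datalines;
CountryA 04/01/2009 0
;
run;

/* VARNUM lists variables in creation order instead of alphabetically. */
proc contents data=work.flu_sketch varnum;
run;
```

The output shows each variable's type, length, format, and label, which makes it easy to confirm that dates and ID fields were read as intended.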

      53. The data in the file BenAndJerrys.dat represent various ice cream flavors and their nutritional information. The variables in the raw data file are flavor name, portion size (g), calories, calories from fat, fat (g), saturated fat (g), trans fat (g), cholesterol (mg), sodium (mg), total carbohydrate (g), dietary fiber (g), sugars (g), protein (g), year introduced, year retired, content description, and notes.

      a. Examine the raw data file BenAndJerrys.dat and read it into SAS using a DATA step.

      b. Read the raw data file using PROC IMPORT.

      c. Create reports that describe the contents for each data set.

      d. Note any differences between the two data sets as a comment in your program.
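Parts a through d contrast two ways of reading the same file. The sketch below illustrates that comparison on a small temporary file rather than on BenAndJerrys.dat. The comma delimiter, the header row, the three variables, and the placeholder values are all assumptions, and the real file may be laid out quite differently.

```sas
/* Write a tiny comma-delimited file with placeholder values. */
filename sketch temp;
data _null_;
   file sketch;
   put 'Flavor,Calories,Notes';
   put 'FlavorA,250,placeholder note';
   put 'FlavorB,300,';
run;

/* DATA step: the programmer chooses names, types, and lengths. */
data work.bystep;
   infile sketch dsd firstobs=2 truncover;
   input Flavor :$30. Calories Notes :$50.;
run;

/* PROC IMPORT: SAS scans the file and chooses names, types, and lengths. */
proc import datafile=sketch out=work.byimport dbms=csv replace;
   getnames=yes;
run;

proc contents data=work.bystep;
run;

proc contents data=work.byimport;
run;
```

Comparing the two PROC CONTENTS reports side by side is one way to spot differences in variable lengths, informats, and formats between the two approaches.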

      54. Data on previous winners of the Oscars are stored in a Microsoft Excel file named Oscars.xlsx. The variables in this file are ID, year, host, best picture, best actor, best actress, best director, and best screenplay.

      a. Examine
