Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP. Bhisham C. Gupta
After tallying the data, we find that of the 110 companies, 28 belong in the first category, 26 in the second category, 20 in the third category, 16 in the fourth category, and 20 in the last category. Thus, a frequency distribution table for the data in Table 2.3.1 is as shown in Table 2.3.2.
Table 2.3.2 Frequency distribution for the data in Table 2.3.1.
Categories | Tally | Frequency or count | Cumulative frequency | Percentage | Cumulative percentage |
1 | ///// ///// ///// ///// ///// /// | 28 | 28 | 25.45 | 25.45 |
2 | ///// ///// ///// ///// ///// / | 26 | 54 | 23.64 | 49.09 |
3 | ///// ///// ///// ///// | 20 | 74 | 18.18 | 67.27 |
4 | ///// ///// ///// / | 16 | 90 | 14.55 | 81.82 |
5 | ///// ///// ///// ///// | 20 | 110 | 18.18 | 100.00 |
Total | | 110 | | 100.00 | |
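The cumulative and percentage columns of Table 2.3.2 follow from the counts by simple arithmetic. As a short worked sketch, write f_i for the frequency of category i and n = 110 for the total number of companies; the symbols f_i, F_i, p_i, P_i, and n are notation assumed here for illustration and are not taken from the book.

```latex
% Notation (assumed for this sketch): f_i = frequency of category i, n = 110
\begin{align*}
F_i &= \sum_{j \le i} f_j              && \text{(cumulative frequency)} \\
p_i &= 100 \cdot \frac{f_i}{n}         && \text{(percentage)} \\
P_i &= 100 \cdot \frac{F_i}{n}         && \text{(cumulative percentage)}
\end{align*}
% Worked check for category 2, using the counts in Table 2.3.2:
\begin{align*}
F_2 &= 28 + 26 = 54, \\
p_2 &= 100 \cdot \frac{26}{110} \approx 23.64, \\
P_2 &= 100 \cdot \frac{54}{110} \approx 49.09 .
\end{align*}
```

These values agree with the second row of the table, and the last cumulative frequency equals the total, F_5 = 110.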
We can also use statistical software to produce Table 2.3.2 directly from the data in Table 2.3.1.
Example 2.3.2 (Industrial revenue) Using MINITAB and R, construct a frequency distribution table for the data in Table 2.3.1.
Solution:
MINITAB
1 Enter the data in column C1 of the Worksheet Window and name it Categories.
2 From the Menu bar, select Stat > Tables > Tally Individual Variables.
3 In this dialog box, enter C1 in the box under Variables.
4 Check all the boxes under Display and click OK.
5 The frequency distribution table as shown below appears in the Session window.
This frequency distribution table may also be obtained by using R as follows:
USING R
R has a built-in ‘table()’ function that can be used to get the basic frequency distribution of categorical data. To get the cumulative frequencies, we can apply the built-in ‘cumsum()’ function to the tabulated frequency data. Then, using the ‘cbind()’ function, we combine categories, frequencies, cumulative frequencies, and cumulative percentages to build the final distribution table. In addition, we can use the ‘colnames()’ function to name the columns of the final table as needed. The task can be completed by running the following R code in the R Console window.
#Assign given data to the variable data
data = c(4,3,5,3,4,1,2,3,4,3,1,5,3,4,2,1,1,4,5,3,2,5,2,5,2,1,2,3,3,2,
         1,5,3,2,1,1,2,1,2,4,5,3,5,1,3,1,2,1,4,1,4,5,4,1,1,2,4,1,4,1,2,4,3,4,1,
         4,1,4,1,2,1,5,3,1,5,2,1,2,3,1,2,2,1,1,2,1,5,3,2,5,5,2,5,3,5,2,3,2,3,5,
         2,3,5,5,2,3,2,5,1,4)
#To get frequencies
data.freq = table(data)
#To combine necessary columns
freq.dist = cbind(data.freq, cumsum(data.freq), 100*cumsum(data.freq)/sum(data.freq))
#To name the table columns
colnames(freq.dist)