Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP. Bhisham C. Gupta
Figure 2.4.7 Frequency polygon for survival time of parts under extraneous operating conditions.
Figure 2.4.8 Typical frequency distribution curve.
Figure 2.4.9 Three typical types of frequency distribution curves.
The shape of the frequency distribution curve of a data set depends on the shape of its histogram and choice of class or bin size. The shape of a frequency distribution curve can in fact be of any type, but in general, we encounter the three typical types of frequency distribution curves shown in Figure 2.4.9.
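To illustrate how the choice of bin size affects the shape, the following minimal R sketch tabulates the survival-time data of Example 2.4.5 with two different class widths. The width of 12 and the object names h_wide and h_narrow are assumptions made only for this illustration; they are not taken from the text.

```r
# Survival-time data from Example 2.4.5
SurvTime <- c(60,100,130,100,115,30,60,145,75,80,89,57,64,92,87,110,
              180,195,175,179,159,155,
              146,157,167,174,87,67,73,109,123,135,129,
              141,154,166,179,37,49,68,74,89,87,109,119,125,56,39,49,190)

# Tabulate the same data with two class widths (plot = FALSE returns the counts only)
h_wide   <- hist(SurvTime, breaks = seq(30, 198, by = 24), right = FALSE, plot = FALSE)
h_narrow <- hist(SurvTime, breaks = seq(30, 198, by = 12), right = FALSE, plot = FALSE)

# Class frequencies under each binning
print(h_wide$counts)
print(h_narrow$counts)

# Draw the two histograms side by side to compare their shapes
op <- par(mfrow = c(1, 2))
plot(h_wide,   main = "Class width 24", xlab = "Survival Time", col = "grey")
plot(h_narrow, main = "Class width 12", xlab = "Survival Time", col = "grey")
par(op)
```

Both histograms describe the same 50 observations; only the grouping differs, so any change in apparent shape is due to the binning alone.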
We now turn to outlining the various steps needed when using MINITAB and R.
MINITAB
1 Enter the data in column C1.
2 From the Menu bar, select Graph > Histogram. This prompts the following dialog box to appear on the screen.
3 From this dialog box, select an appropriate histogram and click OK. This will prompt another dialog box to appear.
4 In this dialog box, enter C1 in the box under the Graph variables and click OK. Then, a histogram graph will appear in the Session window.
5 After creating the histogram, if you want to customize the number of classes (cells or bins), double‐click on any bar of the histogram. This prompts another dialog box, Edit Bars, to appear. In the new dialog box, select Binning. This allows the user to select the desired number of classes and their midpoints or cutpoints.
To create a cumulative frequency histogram, follow all the steps previously described. Then, in the Histogram‐Simple dialog box, select Scale > Y‐Scale Type, select the Frequency option, and check the box next to Accumulate values across bins. Click OK. A customized cumulative frequency histogram obtained using MINITAB is shown in Figure 2.4.10. Note: To get the exact sample cumulative distribution, we used the manual cutpoints shown in the first column of Table 2.4.3 when Binning.
6 To obtain the frequency polygon, in the dialog box Histogram‐Simple, select Data View > Data Display, remove the check mark from Bars, and place a check mark on Symbols. Under the Smoother tab, select Lowess as the smoother and change Degree of smoothing to 0 and Number of steps to 1. Then, click OK twice. At this juncture, the polygon needs to be modified to get the necessary cutpoints. We did this by right‐clicking on the X‐axis and selecting Edit X Scale. Under the Binning tab, for Interval Type, select Cutpoint, and under Interval Definition, select Midpoint/Cutpoint positions. Now type in the manually calculated interval cutpoints. Note that one extra lower and one extra upper cutpoint should be included so that we can connect the polygon with the x‐axis, as shown in Figure 2.4.7.
Figure 2.4.10 The cumulative frequency histogram for the data in Example 2.4.5.
USING R
We can use the built‐in 'hist()' function in R to generate histograms. Extra arguments such as 'breaks', 'main', 'xlab', 'ylab', and 'col' can be used to define the break points, graph heading, x‐axis label, y‐axis label, and fill color, respectively.
    SurvTime = c(60,100,130,100,115,30,60,145,75,80,89,57,64,92,87,110,
                 180,195,175,179,159,155,
                 146,157,167,174,87,67,73,109,123,135,129,
                 141,154,166,179,37,49,68,74,89,87,109,119,125,56,39,49,190)

    #To plot the histogram (classes of width 24, closed on the left)
    hist(SurvTime, breaks=seq(30,198, by=24), main='Histogram of Survival Time',
         xlab='Survival Time', ylab='Frequency', col='grey', right = FALSE)

    #To obtain the cumulative histogram, we replace cell frequencies by their cumulative frequencies
    h = hist(SurvTime, breaks=seq(30,198, by=24), right = FALSE)
    h$counts = cumsum(h$counts)

    #To plot the cumulative histogram
    plot(h, main='Cumulative Histogram', xlab='Survival Time',
         ylab='Cumulative Frequency', col='grey')

Below, we show the histograms obtained by using the above R code.
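The frequency polygon described in MINITAB step 6 can also be approximated in R. The following is a minimal sketch rather than the book's own code: it assumes the same class boundaries (30 to 198 in steps of 24), and the names width, poly_x, and poly_y are introduced here only for illustration. One empty class is added below the first class and above the last so that the polygon connects with the x-axis.

```r
# Survival-time data from Example 2.4.5
SurvTime <- c(60,100,130,100,115,30,60,145,75,80,89,57,64,92,87,110,
              180,195,175,179,159,155,
              146,157,167,174,87,67,73,109,123,135,129,
              141,154,166,179,37,49,68,74,89,87,109,119,125,56,39,49,190)

# Class frequencies and midpoints, without drawing the histogram
width <- 24
h <- hist(SurvTime, breaks = seq(30, 198, by = width), right = FALSE, plot = FALSE)

# Add one extra midpoint with zero frequency at each end
# so that the polygon starts and ends on the x-axis
poly_x <- c(min(h$mids) - width, h$mids, max(h$mids) + width)
poly_y <- c(0, h$counts, 0)

# Join the (midpoint, frequency) points with straight line segments
plot(poly_x, poly_y, type = "b", pch = 16,
     main = "Frequency Polygon of Survival Time",
     xlab = "Survival Time", ylab = "Frequency")
```

Here the polygon is built directly from the class midpoints returned by 'hist()', so no smoother settings are needed.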
Another graph called the ogive curve, which represents the cumulative frequency distribution (c.d.f.), is obtained by joining the lower limit of the first bin to the upper limits of all the bins, including the last bin. Thus, the ogive curve for the data in Example 2.4.5 is as shown in Figure 2.4.11.
Figure 2.4.11 Ogive curve using MINITAB for the data in Example 2.4.5.
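An ogive of the same kind can be sketched in R by following the construction just described. This is an illustrative sketch under the assumption that the classes are those used for the histogram (30 to 198 in steps of 24); the names ogive_x and ogive_y are hypothetical and are not taken from the text.

```r
# Survival-time data from Example 2.4.5
SurvTime <- c(60,100,130,100,115,30,60,145,75,80,89,57,64,92,87,110,
              180,195,175,179,159,155,
              146,157,167,174,87,67,73,109,123,135,129,
              141,154,166,179,37,49,68,74,89,87,109,119,125,56,39,49,190)

# Class frequencies, without drawing the histogram
h <- hist(SurvTime, breaks = seq(30, 198, by = 24), right = FALSE, plot = FALSE)

# x: lower limit of the first bin followed by the upper limits of all bins
# y: zero at the first lower limit, then the cumulative frequencies
ogive_x <- h$breaks
ogive_y <- c(0, cumsum(h$counts))

# Join the points with straight line segments
plot(ogive_x, ogive_y, type = "o", pch = 16,
     main = "Ogive Curve of Survival Time",
     xlab = "Survival Time", ylab = "Cumulative Frequency")
```

The curve starts at zero at the lower limit of the first bin and rises to the total number of observations at the upper limit of the last bin.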
2.4.5 Line Graph
A line graph, also known as a time‐series graph, is commonly used to study any trends in the variable of interest that might occur over time. In a line graph, time is marked on the horizontal axis (the