The Art of Mathematics in Business. Dr Jae K Shim
∑X = 174   ∑Y = 225   ∑XY = 3414   ∑X² = 2792
Using the shortcut method for r²,
r² = (n∑XY − ∑X∑Y)² / ( [n∑X² − (∑X)²] [n∑Y² − (∑Y)²] ) = 0.6084 = 60.84%
This means that about 60.84 percent of the total variation in total sales is explained by advertising and the remaining 39.16 percent is still unexplained. A relatively low r² indicates that there is a lot of room for improvement in the forecasting equation (Y′ = $10.5836 + $0.5632X). Price, or a combination of price and advertising, might improve r².
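The shortcut calculation can be checked with a few lines of Python. This is a minimal sketch under two assumptions: n = 12 (inferred from the 10 degrees of freedom, n − 2, used later in the section), and ∑Y² = 4359, a value that does not appear in the text above and is used here only because it reproduces the reported r² of 0.6084.

```python
# Summary sums from the example.
n = 12                     # ASSUMPTION: inferred from 10 degrees of freedom (n - 2)
sum_x, sum_y = 174, 225
sum_xy, sum_x2 = 3414, 2792
sum_y2 = 4359              # ASSUMPTION: not listed in the text; reproduces r2 = 0.6084

# Building blocks shared by the slope and r-squared formulas.
sxy = n * sum_xy - sum_x * sum_y   # n*SUM(XY) - SUM(X)*SUM(Y)
sxx = n * sum_x2 - sum_x ** 2      # n*SUM(X^2) - (SUM(X))^2
syy = n * sum_y2 - sum_y ** 2      # n*SUM(Y^2) - (SUM(Y))^2

b = sxy / sxx                          # slope of the forecasting equation
a = sum_y / n - b * (sum_x / n)        # intercept: mean(Y) - b * mean(X)
r2 = sxy ** 2 / (sxx * syy)            # shortcut formula for r-squared

print(f"Y' = {a:.4f} + {b:.4f} X, r2 = {r2:.4f}")
```

The same building blocks give both the forecasting equation and r², which is why the shortcut method needs nothing beyond the five sums and n.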
Note: A low r² is an indication that the model is inadequate for explaining the y variable. The general causes for this problem are:
1. Use of a wrong functional form.
2. Poor choice of an x variable as the predictor.
3. Omission of some important variable or variables from the model.
2. Standard Error of the Estimate (Se)
The standard error of the estimate, designated Se, is defined as the standard deviation of the regression. It is computed as
Se = √( ∑(Y − Y′)² / (n − 2) ) = √( (∑Y² − a∑Y − b∑XY) / (n − 2) )
This statistic can be used to gain some idea of the accuracy of our predictions.
Since t = 3.94 > 2, we conclude that the b coefficient is statistically significant. As was indicated previously, the table’s critical value (cut-off value) for 10 degrees of freedom is 2.228 (from Table 8 in the Appendix).
Rule of thumb: Any t value greater than +2 or less than −2 is acceptable. The higher the absolute t value, the greater the confidence we have in the coefficient as a predictor. Low t values are indications of low reliability of the predictive power of that coefficient.
Example 3
Returning to our example data, with n = 12 observations and therefore n − 2 = 10 degrees of freedom, Se is calculated as
Se = 2.3436
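The arithmetic behind Se can be sketched in Python using the shortcut form Se = √((∑Y² − a∑Y − b∑XY)/(n − 2)). The value ∑Y² = 4359 is an assumption (it is not listed in the text above); it is used because it is consistent with the reported r² and Se. The coefficients a and b are the rounded values from the forecasting equation.

```python
import math

n = 12                      # ASSUMPTION: inferred from 10 degrees of freedom (n - 2)
sum_y, sum_xy = 225, 3414
sum_y2 = 4359               # ASSUMPTION: not listed in the text
a, b = 10.5836, 0.5632      # rounded coefficients of Y' = a + bX

# Unexplained (residual) sum of squares via the shortcut identity.
sse = sum_y2 - a * sum_y - b * sum_xy

# Standard error of the estimate: residual variation per degree of freedom.
se = math.sqrt(sse / (n - 2))

print(f"Se = {se:.4f}")
```

Because a and b are rounded to four decimals, the result agrees with the spreadsheet figure only to about four decimal places.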
Suppose you wish to make a prediction regarding an individual Y value, such as a prediction about sales when the advertising expense = $10. Usually, we would like to have some objective measure of the confidence we can place in our prediction, and one such measure is a confidence (or prediction) interval constructed for Y:
Y′ ± t Se √( 1 + 1/n + (Xp − X̄)² / (∑X² − (∑X)²/n) )
where Y′ is the predicted value of Y and Xp is the value of the independent variable used as the basis for the prediction.
Note: t is the critical value for the level of significance employed. For example, for a significance level of 0.025 in each tail (which is equivalent to a 95% confidence level in a two-tailed test), the critical value of t for 10 degrees of freedom is 2.228 (See Table A.2 in the Appendix). As can be seen, the confidence interval is the linear distance bounded by limits on either side of the prediction.
Example 4
If you want a 95 percent confidence interval for your prediction, the range for the prediction, given an advertising expense of $10, would be between $10.5951 and $21.8361, as determined below. Note that from Example 4.2, Y′ = $16.2156.
The confidence interval is therefore established as follows:
$16.2156 ± (2.228)(2.3436) √( 1 + 1/12 + (10 − 14.5)² / (2792 − (174)²/12) )
= $16.2156 ± (2.228)(2.3436)(1.0764)
= $16.2156 ± 5.6205
which means the range for the prediction, given an advertising expense of $10, would be between $10.5951 and $21.8361. Note that $10.5951 = $16.2156 − 5.6205 and $21.8361 = $16.2156 + 5.6205.
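The interval can be reproduced with a short Python sketch. It assumes the usual prediction-interval formula for an individual Y, whose half-width is t · Se · √(1 + 1/n + (Xp − X̄)²/(∑X² − (∑X)²/n)); that square-root factor is an assumption about the formula intended here, adopted because it reproduces the stated range of $10.5951 to $21.8361. The variable names are illustrative only.

```python
import math

n = 12                        # ASSUMPTION: inferred from 10 degrees of freedom (n - 2)
sum_x, sum_x2 = 174, 2792
a, b = 10.5836, 0.5632        # forecasting equation Y' = a + bX
se = 2.3436                   # standard error of the estimate
t_crit = 2.228                # table t value, 10 degrees of freedom, 95% two-tailed
x_p = 10                      # advertising expense used for the prediction

y_pred = a + b * x_p                      # point prediction
x_bar = sum_x / n                         # mean of X
sxx = sum_x2 - sum_x ** 2 / n             # SUM(X^2) - (SUM(X))^2 / n

# Extra uncertainty for predicting one new observation at x_p.
factor = math.sqrt(1 + 1 / n + (x_p - x_bar) ** 2 / sxx)

half_width = t_crit * se * factor
lower, upper = y_pred - half_width, y_pred + half_width

print(f"{y_pred:.4f} +/- {half_width:.4f} -> [{lower:.4f}, {upper:.4f}]")
```

The factor grows as Xp moves away from the mean of X, so predictions far from the centre of the data carry wider intervals.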
3. Standard Error of the Regression Coefficient (Sb) and the t Statistic
The standard error of the regression coefficient, designated Sb, and the t statistic are closely related. Sb is calculated as:
Sb = Se / √( ∑(X − X̄)² )
or, in short-cut form,
Sb = Se / √( ∑X² − X̄∑X )
Sb gives an estimate of the range where the true coefficient will “actually” fall.
The t statistic (or t value) is a measure of the statistical significance of an independent variable X in explaining the dependent variable Y. It is determined by dividing the estimated regression coefficient b by its standard error Sb. It is then compared with the table t value (see Table 7 in the appendix). Thus, the t statistic measures how many standard errors the coefficient is away from zero. Low t values are indicators of low reliability of that coefficient.
Example 5
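The Sb for our example is:
Sb = 2.3436 / √( 2792 − (14.5)(174) ) = 2.3436 / √269 = 0.1429
Thus, t = b / Sb = 0.5632 / 0.1429 = 3.94
Since t = 3.94 > 2, the conclusion is that the b coefficient is statistically significant.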
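As a check on Example 5, the short-cut form Sb = Se / √(∑X² − X̄∑X) and the ratio t = b / Sb can be sketched in Python. The sample size n = 12 is an assumption inferred from the 10 degrees of freedom; the other inputs are the figures quoted in this section.

```python
import math

n = 12                       # ASSUMPTION: inferred from 10 degrees of freedom (n - 2)
sum_x, sum_x2 = 174, 2792
b = 0.5632                   # estimated regression coefficient
se = 2.3436                  # standard error of the estimate

x_bar = sum_x / n                                  # mean of X = 14.5
sb = se / math.sqrt(sum_x2 - x_bar * sum_x)        # standard error of b
t_value = b / sb                                   # standard errors away from zero

# Rule-of-thumb check: |t| > 2 suggests the coefficient is significant.
significant = abs(t_value) > 2

print(f"Sb = {sb:.4f}, t = {t_value:.2f}, significant: {significant}")
```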
How is it used and applied?
The least-squares method is used to estimate both simple and multiple regressions, although in reality managers will confront multiple regression more often than simple regression. Computer software is used to estimate b’s. A spreadsheet program such as Excel can be used to develop a model and estimate most of the regression statistics discussed thus far. Table 20.1 shows the relevant statistics.
Regression analysis is a powerful statistical technique that is widely used by businesspersons and economists. In order to obtain a good fit and to achieve a high degree of accuracy, analysts must be familiar with statistics relating to regression, such as r² and the t value, and be able to make further tests that are unique to multiple regression.
See also Sec. 19, Regression Analysis; Sec. 21, Simple Regression.
Table 20.1: Excel regression output
(1) R-squared (r²) = 0.608373 = 60.84%
(2) Standard error of the estimate (Se) = 2.343622
(3) Standard error of the coefficient (Sb) = 0.142893
(4) t-value = 3.94
Note that all of the above are the same as the values obtained manually. Note also the following:
(1) The t-statistic is more relevant to multiple regressions, which have more than one b coefficient.
(2) r² tells you how good the forest (overall fit) is, while the t-statistic tells you how good an individual tree (an independent variable) is.
In summary, the table t value, based on the degrees of freedom and a level of significance, is used:
1.To