Introduction to Linear Regression Analysis - Douglas C. Montgomery

greatly improved both earlier editions and this fifth edition of the book. We particularly appreciate the many graduate students and professional practitioners who provided feedback, often in the form of penetrating questions, that led to rewriting or expansion of material in the book. We are also indebted to John Wiley & Sons, the American Statistical Association, and the Biometrika Trustees for permission to use copyrighted material.

DOUGLAS C. MONTGOMERY

ELIZABETH A. PECK

G. GEOFFREY VINING

ABOUT THE COMPANION WEBSITE

This book is accompanied by an instructor companion website and a student companion website:

www.wiley.com/go/montgomery/introlinearregression6e

The instructor site includes PowerPoint slides to facilitate instructional use of the book.

The student site includes data sets.

CHAPTER 1
INTRODUCTION

1.1 REGRESSION AND MODEL BUILDING

Regression analysis is a statistical technique for investigating and modeling the relationship between variables. Applications of regression are numerous and occur in almost every field, including engineering, the physical and chemical sciences, economics, management, life and biological sciences, and the social sciences. Regression analysis is used extensively in data mining and is a basic tool of data science and analytics. Because of its wide applicability to a range of problems, regression analysis may be the most widely used statistical technique.

As an example of a problem in which regression analysis may be helpful, suppose that an industrial engineer employed by a soft drink beverage bottler is analyzing the product delivery and service operations for vending machines. He suspects that the time required by a route deliveryman to load and service a machine is related to the number of cases of product delivered. The engineer visits 25 randomly chosen retail outlets having vending machines, and the in-outlet delivery time (in minutes) and the volume of product delivered (in cases) are observed for each. The 25 observations are plotted in Figure 1.1a. This graph is called a scatter diagram. This display clearly suggests a relationship between delivery time and delivery volume; in fact, the impression is that the data points generally, but not exactly, fall along a straight line. Figure 1.1b illustrates this straight-line relationship.

Figure 1.1 (a) Scatter diagram for delivery volume. (b) Straight-line relationship between delivery time and delivery volume.

If we let y represent delivery time and x represent delivery volume, then the equation of a straight line relating these two variables is

y = β₀ + β₁x    (1.1)

where β₀ is the intercept and β₁ is the slope. Now the data points do not fall exactly on a straight line, so Eq. (1.1) should be modified to account for this. Let the difference between the observed value of y and the straight line (β₀ + β₁x) be an error ε. It is convenient to think of ε as a statistical error; that is, it is a random variable that accounts for the failure of the model to fit the data exactly. The error may be made up of the effects of other variables on delivery time, measurement errors, and so forth. Thus, a more plausible model for the delivery time data is

y = β₀ + β₁x + ε    (1.2)

Equation (1.2) is called a linear regression model. Customarily x is called the independent variable and y is called the dependent variable. However, this often causes confusion with the concept of statistical independence, so we refer to x as the predictor or regressor variable and y as the response variable. Because Eq. (1.2) involves only one regressor variable, it is called a simple linear regression model.
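As a minimal sketch of how Eq. (1.2) produces data, the Python snippet below draws a few responses from a simple linear regression model and recovers the error ε as the gap between each observation and the line. The parameter values, the delivery volumes, the helper names, and the choice of a normal distribution for ε are all assumptions made for illustration; none of them come from the delivery time study.

```python
import random

def line_value(x, beta0, beta1):
    """Height of the straight line beta0 + beta1 * x at volume x."""
    return beta0 + beta1 * x

def simulate_response(x, beta0, beta1, sigma, rng):
    """Draw one response y = beta0 + beta1 * x + eps.

    eps is drawn from a normal distribution with mean 0 and standard
    deviation sigma; normality is an assumption of this sketch.
    """
    eps = rng.gauss(0.0, sigma)
    return line_value(x, beta0, beta1) + eps

# Hypothetical parameter values and delivery volumes (not from the study).
BETA0, BETA1, SIGMA = 3.0, 2.0, 1.5
volumes = [2, 5, 8, 11, 14]

rng = random.Random(1)  # fixed seed so the run is repeatable
observations = [(x, simulate_response(x, BETA0, BETA1, SIGMA, rng)) for x in volumes]

# The error is the difference between each observed y and the line.
errors = [y - line_value(x, BETA0, BETA1) for x, y in observations]

for (x, y), eps in zip(observations, errors):
    print(f"x = {x:2d}  line = {line_value(x, BETA0, BETA1):5.2f}  "
          f"y = {y:5.2f}  eps = {eps:+.2f}")
```

Each printed row shows the same decomposition as the model: the observed response equals the value on the line plus an error term.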

To gain some additional insight into the linear regression model, suppose that we can fix the value of the regressor variable x and observe the corresponding value of the response y. Now if x is fixed, the random component ε on the right-hand side of Eq. (1.2) determines the properties of y. Suppose that the mean and variance of ε are 0 and σ², respectively. Then the mean response at any value of the regressor variable is

E(y | x) = μ_y|x = E(β₀ + β₁x + ε) = β₀ + β₁x

Notice that this is the same relationship that we initially wrote down following inspection of the scatter diagram in Figure 1.1a. The variance of y given any value of x is

Var(y | x) = Var(β₀ + β₁x + ε) = σ²

Thus, the true regression model μ_y|x = β₀ + β₁x is a line of mean values; that is, the height of the regression line at any value of x is just the expected value of y for that x. The slope, β₁, can be interpreted as the change in the mean of y for a unit change in x. Furthermore, the variability of y at a particular value of x is determined by the variance of the error component of the model, σ². This implies that there is a distribution of y values at each x and that the variance of this distribution is the same at each x.
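The claim that y has mean β₀ + β₁x and the same variance σ² at every x can be checked with a small simulation. The sketch below uses β₀ = 3.5, β₁ = 2, and σ² = 2, the values of the example model μ_y|x = 3.5 + 2x discussed in this section; the normal distribution for ε, the two x values, the sample size, and the function names are assumptions chosen for illustration.

```python
import random
import statistics

# Example model from this section: mean response 3.5 + 2x, error variance 2.
BETA0, BETA1, SIGMA2 = 3.5, 2.0, 2.0

def sample_responses(x, n, rng):
    """Draw n responses at a fixed x, assuming normally distributed errors."""
    sigma = SIGMA2 ** 0.5
    return [BETA0 + BETA1 * x + rng.gauss(0.0, sigma) for _ in range(n)]

rng = random.Random(42)  # fixed seed so the run is repeatable
summary = {}
for x in (10, 20):  # two arbitrary delivery volumes
    ys = sample_responses(x, 20_000, rng)
    summary[x] = (statistics.fmean(ys), statistics.variance(ys))
    mean_y, var_y = summary[x]
    print(f"x = {x}: sample mean = {mean_y:.2f} "
          f"(model mean {BETA0 + BETA1 * x:.1f}), sample variance = {var_y:.2f}")
```

The sample mean moves with x by roughly β₁ per unit of x, while the sample variance stays near σ² at both volumes, which is the constant-variance property described above.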

Figure 1.2 How observations are generated in linear regression.

Figure 1.3 Linear regression approximation of a complex relationship.

For example, suppose that the true regression model relating delivery time to delivery volume is μ_y|x = 3.5 + 2x, and suppose that the variance is σ² = 2. Figure
