Handbook of Regression Analysis With Applications in R. Samprit Chatterjee


considerations and principles will have a place in the analyst's toolkit for a long time to come.

      Of course, part of that usefulness comes from the ability to generalize regression models to more complex situations, and that is the thrust of the changes in this new edition. One thing that hasn't changed is the philosophy behind the book and our recommendations on how it can best be used, and we encourage the reader to refer to the preface to the first edition for guidance on those points. There have been small changes to the original chapters, and broad descriptions of those chapters can also be found in the preface to the first edition. The five new chapters (Chapters 11, 13, 14, 15, and 16, with the former Chapter 11 on nonlinear regression moving to Chapter 12) expand greatly on the power and applicability of regression models beyond what was discussed in the first edition. For this reason many more references are provided in these chapters than in the earlier ones, since some of the material in those chapters is less established and less well known, with much of it still the subject of active research. In keeping with that, we do not spend much (or any) time on issues for which there still isn't necessarily a consensus in the statistical community, but point to books and monographs that can help the analyst get some perspective on that kind of material.

      Chapter 13 extends applications to data with multiple observations for each subject, consistent with some structure from the underlying process. Such data can take the form of nested or clustered data (such as students all in one classroom) or longitudinal data (where a variable is measured at multiple times for each subject). In this situation, ignoring that structure results in an induced correlation that reflects unmodeled differences between classrooms and subjects, respectively. Mixed effects models generalize analysis of variance (ANOVA) models and time series models to this more complicated situation. Models with linear effects based on Gaussian distributions can be generalized to nonlinear models, and can also be generalized to non‐Gaussian distributions through the use of generalized linear mixed effects models.
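
      As a rough illustration of the induced correlation mentioned above, consider the simplest mixed effects model, a random intercept model. The notation here (y_ij for observation j in classroom or subject i, the random effect u_i, and the variances sigma_u^2 and sigma_epsilon^2) is assumed for this sketch and is not taken from the preface; Chapter 13 may use different symbols.

```latex
% Random intercept model: observation j within group i (assumed notation)
\[
  y_{ij} = \beta_0 + \beta_1 x_{ij} + u_i + \varepsilon_{ij},
  \qquad u_i \sim N(0, \sigma_u^2),
  \quad \varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2),
\]
% with u_i and \varepsilon_{ij} mutually independent.
% Two observations from the same group share u_i, so for j \neq k:
\[
  \operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_u^2,
  \qquad
  \operatorname{Corr}(y_{ij}, y_{ik})
    = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}.
\]
```

      Under these assumptions, a model that omits u_i treats the observations as independent even though observations within a group are correlated whenever sigma_u^2 is positive.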

      Modern data applications can involve very large (even massive) numbers of predictors, which can cause major problems for standard regression methods. Best subsets regression (discussed in Chapter 2) does not scale well to very large numbers of predictors, and Chapter 14 discusses approaches that do. Forward stepwise regression, in which potential predictors are stepped in one at a time, is an alternative to best subsets that scales to massive data sets. A systematic approach to reducing the dimensionality of a chosen regression model is regularization, in which the usual estimation criterion is augmented with a penalty that encourages sparsity; the most commonly used version of this is the lasso estimator, and it and its generalizations are discussed further.
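
      To make the idea of a sparsity-encouraging penalty concrete, the following is a minimal sketch of the lasso fitted by cyclic coordinate descent. It is not the book's code (the book's analyses use R); it is written in dependency-free Python, and the function names, the 1/(2n) scaling of the least squares criterion, the fixed iteration count, and the toy data are all assumptions made for this illustration.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; anything within gamma of zero becomes 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Approximately minimize
        (1 / (2n)) * sum_i (y_i - b0 - x_i . b)^2 + lam * sum_j |b_j|
    by cycling through the coefficients. X is a list of rows; the intercept
    b0 is not penalized. Hypothetical helper for illustration only.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # covariance-like term of predictor j with partial residual
            norm = 0.0  # sum of squares of predictor j
            for i in range(n):
                fit_without_j = intercept + sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                norm += X[i][j] ** 2
            if norm == 0.0:
                beta[j] = 0.0  # an all-zero column carries no information
            else:
                beta[j] = soft_threshold(rho / n, lam) / (norm / n)
        # Re-estimate the unpenalized intercept given the current coefficients.
        intercept = sum(
            y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)
        ) / n
    return intercept, beta


# Toy data (made up): three orthogonal, centered predictors, with
# y = 10 + 3 * x1 + 0.5 * x2 + 0 * x3 and no noise.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [13.5, 12.5, 7.5, 6.5]

intercept, beta = lasso_coordinate_descent(X, y, lam=1.0)
print(intercept, beta)  # prints: 10.0 [2.0, 0.0, 0.0]
```

      In this toy example the penalty shrinks the large coefficient from 3 to 2 and sets the small coefficient exactly to zero, which is the sense in which the penalty encourages sparsity. In practice the predictors would typically be standardized first and the penalty weight chosen by a method such as cross-validation.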

      A final small change from the first edition to the second edition is in the title, as it now includes the phrase With Applications in R. This is not really a change, of course, as all of the analyses in the first edition were performed using the statistics package R. Code for the output and figures in the book can (still) be found at its associated web site at http://people.stern.nyu.edu/jsimonof/RegressionHandbook/. As was the case in the first edition, even though analyses are performed in R, we still refer to general issues relevant to a data analyst in the use of statistical software even if those issues don't specifically apply to R.

      We would like to once again thank our students and colleagues for their encouragement and support, and in particular students for the tough questions that have definitely affected our views on statistical modeling and by extension this book. We would like to thank Jon Gurstelle, and later Kathleen Santoloci and Mindy Okura‐Marszycki, for approaching us with encouragement to undertake a second edition. We would like to thank Sarah Keegan for her patient support in bringing the book to fruition in her role as Project Editor. We would like to thank Roni Chambers for computing assistance, and Glenn Heller and Marc Scott for looking at earlier drafts of chapters. Finally, we would like to thank our families for their continuing love and support.

      SAMPRIT CHATTERJEE

      Brooksville, Maine

      JEFFREY S. SIMONOFF

      New York, New York

      October, 2019

      How to Use This Book

      This book is designed to be a practical guide to regression modeling. There is little theory here, and methodology appears
