Practical Data Analysis with JMP, Third Edition. Robert Carver
Chapter 22: Quality Improvement
Overview
Processes and Variation
Control Charts
Variability Charts
Capability Analysis
Pareto Charts
Application
About This Book
What Does This Book Cover?
Purpose: Learning to Reason Statistically
We live in a world of uncertainty. Today more than ever before, we have vast resources of data available to shed light on crucial questions. But at the same time, the sheer volume and complexity of the “data deluge” can distract and overwhelm us. The goal of applied statistical analysis is to work with data to calibrate, cope with, and sometimes reduce uncertainty. Business decisions, public policies, scientific research, and news reporting are all shaped by statistical analysis and reasoning. Statistical thinking is an essential part of the boom in “big data analytics” in numerous professions. This book will help you use and discriminate among some fundamental techniques of analysis, and it will also help you engage in statistical thinking by analyzing real problems. You will come to see statistical investigations as an iterative process and will gain experience in the major phases of that process.
To be an effective analyst or consumer of other people’s analyses, you must know how to use these techniques, when to use them, and how to communicate their implications. Knowing how to use these techniques involves mastery of computer software like JMP. Knowing when to use these techniques requires an understanding of the theory underlying the techniques and practice with applications of the theory. Knowing how to effectively communicate with consumers of an analysis or with other analysts requires a clear understanding of the theory and techniques, as well as clarity of expression, directed toward one’s audience.
There was a time when a first course in statistics emphasized abstract theory, laborious computation, and small sets of artificial data—but not practical data analysis or interpretation. Those days are thankfully past, and now we can address all three of the skill sets just cited.
Scope and Structure of This Book
As a discipline, statistics is large and growing; the same is true of JMP. One paperback book must limit its scope, and the content boundaries of this book are set intentionally along several dimensions.
First, this book provides considerable training in the basic functions of JMP 15. JMP is a full-featured, highly interactive, visual, and comprehensive package. The book assumes that you have the software at your school or office. The software’s capabilities extend far beyond an introductory course, and this book makes no attempt to “cover” the entire program. The book introduces students to its major platforms and essential features and should leave students with sufficient background and confidence to continue exploring on their own. Fortunately, the Help system and accompanying manuals are quite extensive, as are the learning resources available online at http://www.jmp.com.
Second, the chapters largely follow a traditional sequence, making the book compatible with many current texts. As such, instructors and students will find it easy to use the book as a companion volume in an introductory course. Chapters are organized around core statistical concepts rather than software commands, menus, or features. Several chapters include topics that some instructors might view as “advanced”—typically when the output from JMP makes it a natural extension of a more elementary topic. This is one way in which software can redefine the boundaries of introductory statistics.
Third, nearly all the data sets in the book are real and are drawn from those disciplines whose practitioners are the primary users of JMP software. Inasmuch as most undergraduate programs now require coursework in statistics, the examples span major areas in which statistical analysis is an important path to knowledge. Those areas include engineering, life sciences, business, and economics.
Fourth, each chapter invites students to practice the habits of thought that are essential to statistical reasoning. Long after readers forget the details of a particular procedure or the options available in a specific JMP analysis platform, this book may continue to resonate with valuable lessons about variability, uncertainty, and the logic of inference.
Each chapter concludes with a set of “Application Scenarios,” which lay out a problem-solving or investigative context that is in turn supported by a data table. Each scenario includes a set of questions that implicitly require the application of the techniques and concepts presented in the chapter.
New in the Third Edition
This edition preserves much of the content and approach of the earlier editions, while updating examples and introducing new JMP features. As in the second edition, there are three review chapters (Chapters 5, 9, and 17) that pause to recap concepts and techniques. One of the perennial challenges in learning statistics is that it is easy to lose sight of major themes as a course progresses through a series of seemingly disconnected techniques and topics. Some readers should find the review chapters to be helpful in this respect. The review chapters share a single large data set of World Development Indicators, published by the World Bank.
The scope and sequence of chapters are essentially the same as in the prior edition. There is some new material about the importance of documenting one’s work with an eye toward reproducibility of analyses, as well as production of presentation-ready reporting. The second edition was based on JMP 11, and since that time, platforms have been added or modified, and some functionality has been relocated in the menu system. This edition captures those changes.
Some of the updated data tables are considerably larger than their counterparts in earlier editions. This creates the opportunity to demonstrate methods for producing meaningful graphs when data density and overplotting become issues. I also use some of the larger data tables to introduce machine learning practices like partitioning a data set into training and validation sets.
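For readers who have not yet met the idea of partitioning, here is a minimal sketch in plain Python rather than JMP. It only illustrates the general idea of randomly holding out some rows for validation. The function name `partition_rows`, the 30% validation share, and the seed value are assumptions chosen for illustration, not anything taken from the book or from JMP.

```python
import random


def partition_rows(rows, validation_fraction=0.3, seed=15):
    """Randomly split rows into (training, validation) lists.

    Hypothetical helper for illustration only; the default fraction
    and seed are arbitrary assumptions.
    """
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    rng = random.Random(seed)   # fixed seed makes the split reproducible
    shuffled = list(rows)       # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    # The first n_validation shuffled rows are held out for validation.
    return shuffled[n_validation:], shuffled[:n_validation]


rows = list(range(100))  # stand-in for the rows of a data table
training, validation = partition_rows(rows)
print(len(training), len(validation))
```

Fixing the seed means the same rows land in each set every time the script runs, which is in keeping with the emphasis on reproducible analyses.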
JMP Projects are introduced in Chapter 2 and used throughout the book. Projects are a way to organize, preserve, and document multiple analyses using multiple data tables. They naturally support a logical and reproducible workflow. Using projects is a way for newcomers to establish good habits and for JMP veterans to be more efficient.
Other additions and amendments include:
● Early introduction of more data types, Header Graphs, and JMP Public.
● Expanded use of Subset, Global and Local Data Filters, and Animate. In the prior editions, for example, the set of data tables included some subsets of larger tables. Because data preparation is such an important part of the analytical cycle, readers learn to perform filtering and subsetting functions on their own.
● The Recode command has evolved since JMP 11, as have the lessons using Recode. Readers will learn why and how to recode a column.
● In the Regression chapters, coverage