The Statistical Analysis of Doubly Truncated Data. Prof Carla Moreira
Another feature leading to left‐truncation is delayed entry into the study. This happens when individuals enter the study only at some random time after onset. For example, a certain disease may not be diagnosed until the first visit to the hospital. If the 'end‐of‐disease' event occurs before the potential date of visit, the time‐to‐event of such a patient will never be known, which makes relatively small event times difficult to observe. Beyersmann et al. (2012) provide an illustrative example of this issue in the investigation of abortion times.
1.2.2 Right‐truncation
In some particular settings, the target variable of ultimate interest $X^*$ is observed only for the individuals who experience the event before a certain calendar time $\tau$. A typical example of such a situation is the investigation of the incubation (or induction) times for AIDS; see for example Klein and Moeschberger (2003). The incubation time is defined as the time elapsed between the date of HIV infection, $T_0$ say, and the development of AIDS. If $X^*$ stands for the incubation time and $V^* = \tau - T_0$, then the incubation times of individuals developing AIDS prior to $\tau$ follow the distribution of $X^*$ conditionally on $X^* \le V^*$. Here, $V^*$ is called the right‐truncation time. An immediate effect of right‐truncation is that large values of $X^*$ are sampled with a relatively small probability.
1.2.3 Truncation vs. Censoring
At this point, the reader may be curious about the difference between truncation and censoring. Right‐censoring is a very well known phenomenon in survival analysis and reliability studies, among other fields. It happens when the follow‐up of a given individual stops before the event of interest has taken place. In such a case, the observer only knows that the target variable is larger than the registered value, which is referred to as the censoring time. A sample made up of real and censored values is typically analysed by the Kaplan–Meier estimator (Kaplan and Meier, 1958), which corrects for the fact that some of the recorded values for $X^*$ are smaller than the true ones. With truncated data, every value in the sample corresponds to a true observation of $X^*$; however, the distribution of the observed values may be shifted with respect to the true one due to the truncation event. This difference between truncation and censoring suggests that specific methods should be employed to estimate the target distribution under random truncation. Indeed, Woodroofe (1985) provides a deep analysis of one‐sided truncation, introducing the original idea of Lynden‐Bell (1971) as a nonparametric maximum likelihood estimator (NPMLE) of the probability distribution in that setting. The estimator in Woodroofe (1985) is a particular case of the estimator corresponding to doubly truncated data, on which this book is focused.
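The effect of truncation on the observed distribution, and the kind of correction that a product‐limit estimator of the Lynden‐Bell type provides, can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the text: the target variable is exponential with mean 1, the entry time is uniform on (-1, 3) and independent of the target (a negative entry time meaning that follow‐up started before onset), there are no ties, and the function names `lynden_bell_cdf` and `step_value` are hypothetical. It is written in Python with NumPy purely for illustration.

```python
import numpy as np


def lynden_bell_cdf(u, x):
    """Product-limit estimate of the target cdf from left-truncated pairs.

    Each pair satisfies u_i <= x_i. Returns the sorted x values and the
    estimated cdf at those values. Assumes no ties among the x values.
    """
    u = np.asarray(u, dtype=float)
    x = np.asarray(x, dtype=float)
    x_sorted = np.sort(x)
    u_sorted = np.sort(u)
    # Risk set size at x_i: number of j with u_j <= x_i <= x_j.
    # Since x_j < x_i implies u_j < x_i, this equals
    # #{j: u_j <= x_i} - #{j: x_j < x_i}.
    n_u_le = np.searchsorted(u_sorted, x_sorted, side="right")
    n_x_lt = np.searchsorted(x_sorted, x_sorted, side="left")
    risk = n_u_le - n_x_lt  # always >= 1, each point is in its own risk set
    surv = np.cumprod(1.0 - 1.0 / risk)
    return x_sorted, 1.0 - surv


def step_value(grid, values, t):
    """Evaluate a right-continuous step function at t (0 before the first jump)."""
    k = np.searchsorted(grid, t, side="right")
    return 0.0 if k == 0 else float(values[k - 1])


# Simulated population (all distributions below are illustrative assumptions).
rng = np.random.default_rng(0)
m = 20000
x_all = rng.exponential(1.0, m)    # target variable, true cdf 1 - exp(-t)
u_all = rng.uniform(-1.0, 3.0, m)  # entry time, independent of the target

# Left-truncation: a pair is observed only if entry precedes the event.
keep = u_all <= x_all
u_obs, x_obs = u_all[keep], x_all[keep]

true_at_1 = 1.0 - np.exp(-1.0)              # about 0.632
naive_at_1 = float(np.mean(x_obs <= 1.0))   # ordinary empirical cdf at t = 1
grid, cdf = lynden_bell_cdf(u_obs, x_obs)
lb_at_1 = step_value(grid, cdf, 1.0)        # truncation-corrected estimate

print(f"true {true_at_1:.3f}  naive {naive_at_1:.3f}  product-limit {lb_at_1:.3f}")
```

In this setup the ordinary empirical distribution of the observed values underestimates the true cdf at small arguments, because small event times are under‐sampled, whereas the product‐limit estimate uses the observed entry times to undo that sampling bias. Unlike the Kaplan–Meier setting, no recorded value is smaller than the true one; the correction concerns which individuals enter the sample at all.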
1.3 Double Truncation
A variable of interest $X^*$ is said to be doubly truncated by a couple of random variables $(U^*, V^*)$ if the observation of $X^*$ is possible only when $U^* \le X^* \le V^*$ occurs. In such a case, $U^*$ and $V^*$ are called left‐ and right‐truncation variables respectively. Double truncation reduces to left‐truncation when $V^*$ degenerates at $+\infty$, while it corresponds to right‐truncation when $U^* = -\infty$. This book is focused on the problem of estimating the distribution of $X^*$, and other related curves, from a set of iid triplets $(U_i, X_i, V_i)$, $1 \le i \le n$, with the distribution of $(U^*, X^*, V^*)$ given $U^* \le X^* \le V^*$.
There are several scenarios where double truncation appears in practice. One setting leading to double