The Statistical Analysis of Doubly Truncated Data. Prof Carla Moreira
The observed values of range from 0.5 to 89 (months), while ranges from to 45.5. This suggests that the lower limit of the support of is about , while the upper limit of the support of is about 99.5. As discussed in Chapter 2, in such a case the distribution of the incubation time is identifiable on the interval (months). The AIDS Blood Transfusion Data also includes information on the age of the individual at infection; see Table 1.2.

Table 1.2 Descriptive statistics for the AIDS Blood Transfusion Data: sample size and mean (and standard deviation, SD) for the incubation time (months) by age at infection.

Age group | Sample size | Mean (SD) |
---|---|---|
30 years | 56 | 27.09 (18.28) |
30–60 years | 104 | 33.80 (18.95) |
60 years | 135 | 32.46 (16.74) |
This dataset is used in Chapters 2, 3, 4 and 5 and is available as AIDS.DT in the DTDA package.
1.4.3 Equipment‐S Rounded Failure Time Data
Companies are often interested in estimating the time to failure of their devices after installation. To do this, maintenance departments may register the failure events that occur between two specific dates for the units installed in the field. The resulting field lifetime data are, however, doubly truncated because of the interval sampling: only units that fail within the observational window are recorded. The Equipment‐S data (Ye and Tang, 2016) concern failures of a certain device (details are not given due to confidentiality issues) recorded between 1996 and 2011, a 15‐year observational window. Information on the date of installation and the date of failure, rounded to years, was obtained by digitizing Figure 2 in that paper. This dataset is therefore a discrete version of the original data in Ye and Tang (2016). In this example the right‐truncation time is the number of years between installation and 2011, while the left‐truncation time is just . The Equipment‐S failure times are summarized in Table 1.3.
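The way interval sampling produces double truncation can be illustrated with a small simulation. The sketch below is not the Equipment‐S data: the installation dates and the lifetime distribution are hypothetical, time is kept continuous rather than rounded to years, and the left‐truncation time is assumed to be the right‐truncation time minus the 15‐year window length, which is the usual construction under interval sampling. The function name interval_sample and all numeric settings other than the window limits 1996 and 2011 are illustrative assumptions.

```python
import random

WINDOW_START = 1996.0  # first year of the observational window (from the text)
WINDOW_END = 2011.0    # last year of the observational window (from the text)


def interval_sample(n_units, seed=0):
    """Simulate units in the field and keep only those failing inside the window.

    Returns a list of (u, x, v) triplets: left-truncation time, lifetime and
    right-truncation time, all measured in years since installation.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(n_units):
        # Hypothetical installation date, up to 35 years before the window opens.
        install = rng.uniform(WINDOW_START - 35.0, WINDOW_END)
        # Hypothetical lifetime: exponential with a mean of 10 years.
        lifetime = rng.expovariate(1.0 / 10.0)
        failure = install + lifetime
        # Interval sampling: the unit is recorded only if it fails in the window.
        if WINDOW_START <= failure <= WINDOW_END:
            u = WINDOW_START - install  # assumed left-truncation time (v minus 15)
            v = WINDOW_END - install    # years between installation and 2011
            observed.append((u, lifetime, v))
    return observed


sample = interval_sample(20000)
print(f"recorded {len(sample)} of 20000 simulated units")
```

Every recorded triplet satisfies u ≤ x ≤ v, so a lifetime is seen only when it falls between its own pair of truncation limits. A negative u simply means that the unit was installed after the window opened, in which case the left limit does not restrict the lifetime.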
The observable range for the Equipment‐S failure times goes from zero to 34 years, which is the maximum observed value of the right‐truncating variable. The EqSRounded object in the DTDA package contains this dataset, which is used in Chapter 2.
1.4.4 Quasar Data
A classical motivating example of doubly truncated data, introduced by Efron and Petrosian (1999), is found in cosmology when registering the luminosity of quasars. Quasars are observed only if their luminosity lies within a certain interval, whose two ends are determined by the detection limits of the observation devices, so the data suffer from double truncation. The original dataset studied by Efron and Petrosian (1999) comprises triplets , where is the luminosity in the log‐scale, obtained from a transformation model based on the redshift and the apparent magnitude of the th quasar. See Efron and Petrosian (1999) for further details on the transformation model. Due to experimental constraints, the distribution of each luminosity in the log‐scale is truncated to a known interval . Specifically, quasars with apparent magnitude above were too dim to yield dependable redshifts, and hence they were excluded from the study. In addition, the lower limit was used to avoid confusion with non‐quasar stellar objects. Some descriptive statistics are provided in Table 1.4.

Table 1.3 Years to failure and number of failing units for the Equipment‐S Rounded Failure Time Data.
Years: | 0–4 | 5–9 | 10–14 | 15–19 | 20–24 |
---|---|---|---|---|---|