The Statistical Analysis of Doubly Truncated Data. Prof Carla Moreira


age at diagnosis which, by definition of childhood cancer, is supported on the
interval (time in years). The number of cases was 409. However, for three cases a required value
was not available, so we only consider the $n = 406$
children who report complete information.

      Because of the interval sampling, the age at diagnosis $X$ is doubly truncated by the pair $(U, V)$, where the right‐truncation variable $V$ is the time in years from birth (the date of onset) to 31 December 2003, and $U = V - 5$. The triplets of values observed for $X$, $U$ and $V$ were reported in Moreira and de Uña‐Álvarez (2010), while de Uña‐Álvarez (2020) included the cancer group in the statistical analysis. Ordinary descriptive statistics can be applied to the information gathered along this 5‐year window to compute, for instance, the average age at cancer diagnosis. However, if the goal is to describe the population of children eventually developing cancer, the double truncation must be acknowledged and properly corrected to avoid potential biases.
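
To make the effect of interval sampling concrete, the following Python sketch simulates a hypothetical population and keeps only the cases whose diagnosis falls inside a 5‐year window. The uniform age distribution, the triangular distribution for the time from birth to the end of the window, and all function names are illustrative assumptions; none of them come from the childhood cancer data. The correction shown is plain inverse‐probability weighting with a sampling probability that is known by construction, which is not necessarily the estimator used in the references above.

```python
import random
import statistics

random.seed(1)

WINDOW = 5.0  # width of the observation window in years, as in the text


def simulate(n_population=200_000):
    """Return the (x, u, v) triplets that survive interval sampling."""
    observed = []
    for _ in range(n_population):
        # Hypothetical age at diagnosis, uniform on (0, 15).
        x = random.uniform(0.0, 15.0)
        # Hypothetical time from birth to the end of the window,
        # with density proportional to v on (0, 20).
        v = random.triangular(0.0, 20.0, 20.0)
        u = v - WINDOW
        if u <= x <= v:  # the diagnosis falls inside the window
            observed.append((x, u, v))
    return observed


def sampling_probability(x):
    """P(U <= x <= V) under the assumed triangular V: (10x + 25) / 400."""
    return (10.0 * x + 25.0) / 400.0


sample = simulate()
ages = [x for x, _, _ in sample]

# Naive mean: ignores that older ages are over-represented here.
naive_mean = statistics.fmean(ages)

# Weighted mean: each case is weighted by the inverse of its
# (known, assumed) probability of being sampled.
weights = [1.0 / sampling_probability(x) for x in ages]
weighted_mean = sum(w * x for w, x in zip(weights, ages)) / sum(weights)

print(f"observed cases: {len(ages)}")
print(f"naive mean:     {naive_mean:.2f}")
print(f"weighted mean:  {weighted_mean:.2f}")
```

Under these assumptions the population mean is 7.5 years; the naive mean of the observed ages settles near 9.4, while the weighted mean returns to roughly 7.5. With real data the sampling probability is unknown and has to be estimated, which is what makes the problem non‐trivial.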

The observed values for $U$ range between $-4.5$ and 14.5 (years); equivalently, the observed values for $V$ range between 0.5 and 19.5. This means that the lower endpoint of $U$ does not exceed that of $X$, and the upper endpoint of $V$ is not smaller than that of $X$. Thus, in this case, the target variable $X$ is observable on its whole support, and there are no identification issues for the cdf of $X$. Information on $X$ is summarized in Table 1.1.
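
The support condition behind this conclusion can be checked mechanically. The sketch below uses the observed range quoted for the right‐truncation variable, derives the range of the left‐truncation variable by subtracting the 5‐year window width (so its smallest value is 0.5 − 5 = −4.5), and assumes, for illustration only, that ages at diagnosis lie between 0 and 15 years. The function name is hypothetical, and observed extremes are used as stand‐ins for the true support endpoints.

```python
def observable_on_whole_support(x_support, u_lower, v_upper):
    """Check that the truncation limits cover the target's support.

    The target is observable on its whole support when the lower limit of
    the left-truncation variable is not above the lower end of the support
    and the upper limit of the right-truncation variable is not below the
    upper end of the support.
    """
    x_lower, x_upper = x_support
    return u_lower <= x_lower and v_upper >= x_upper


WINDOW = 5.0                 # window width in years, as in the text
v_min, v_max = 0.5, 19.5     # observed range of the right-truncation variable
u_min, u_max = v_min - WINDOW, v_max - WINDOW  # left-truncation range

# Assumption for illustration: ages at diagnosis lie in (0, 15).
assumed_age_support = (0.0, 15.0)

covered = observable_on_whole_support(assumed_age_support, u_min, v_max)
print(u_min, u_max, covered)
```

If the smallest left‐truncation value were positive (say 1.0), ages below that value could never be sampled and the function would return `False`.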

Table 1.1 Number of cases ($n$) and mean (and standard deviation, SD) for the age at diagnosis (years).

Group                                 n     Mean (SD)
All                                   406   6.47 (4.50)
By gender        Female               178   6.43 (4.51)
                 Male                 228   6.51 (4.51)
By ICCC Group    Leukemia             107   6.30 (4.15)
                 Lymphoma              57   8.66 (4.39)
                 N. System Tumour      94   6.38 (4.29)
                 Neuroblastoma         38   3.16 (3.47)
                 Other                105   6.87 (4.70)
                 Missing                5   3.92 (5.18)
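
As a quick consistency check on the table, the group sizes and group means listed above can be pooled to see whether they reproduce the overall row. The sketch below is only arithmetic on the tabulated figures; the dictionary and function names are illustrative.

```python
# (number of cases, mean age at diagnosis) per group, copied from the table.
by_gender = {
    "Female": (178, 6.43),
    "Male": (228, 6.51),
}
by_iccc = {
    "Leukemia": (107, 6.30),
    "Lymphoma": (57, 8.66),
    "N. System Tumour": (94, 6.38),
    "Neuroblastoma": (38, 3.16),
    "Other": (105, 6.87),
    "Missing": (5, 3.92),
}


def pooled(groups):
    """Return the total count and the size-weighted mean of group means."""
    total = sum(n for n, _ in groups.values())
    mean = sum(n * m for n, m in groups.values()) / total
    return total, mean


for label, groups in (("gender", by_gender), ("ICCC group", by_iccc)):
    total, mean = pooled(groups)
    print(f"{label}: n = {total}, pooled mean = {mean:.2f}")
```

Both groupings add up to 406 cases and give a pooled mean of 6.47 years, matching the "All" row up to rounding.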

      

      1.4.2 AIDS Blood Transfusion Data

      Kalbfleisch and Lawless (1989) reported 494 cases of transfusion‐related AIDS, corresponding to individuals diagnosed prior to 1 July 1986. The variable of ultimate interest $X$ is the induction or incubation time, which is the time elapsed from HIV infection to AIDS. Importantly, HIV was unknown before 1982; this implies that cases developing AIDS prior to this date were not reported. Let $V$ denote the time from HIV infection to 1 July 1986 (in months), and introduce the left‐truncation variable $U$; then, due to the interval sampling, only triplets satisfying $U \le \ldots$