Frailty modelling for multivariate current status data with applications in epidemiology
Final Report Abstract
Current status data arise in the life sciences when the exact time of an event is unobserved and it is known only whether or not the event has occurred by a given monitoring time. A prominent example arises in infectious disease epidemiology, when one is interested in the times to occurrence of multiple non-lethal infectious diseases within the same individual. Usually, the time of onset of each infection can only be determined to lie below or above the monitoring time at which the serological sample was collected, giving rise to multivariate current status data. In this setting, each individual can be thought of as forming a cluster within which event times are likely to be correlated. As human beings are dissimilar, there is also likely to be heterogeneity across individuals (clusters) due to characteristics that may be difficult or impossible to measure. Random-effects models for time-to-event data, also known as frailty models, provide a conceptually simple and appealing way to quantify these associations within clusters and to model unobserved heterogeneity across clusters. In the vast majority of the literature, the random effect is assumed to follow a continuous probability distribution. However, in some areas of application, a discrete frailty distribution may be more appropriate. One such area is infectious diseases transmitted by sexual contact, in which heterogeneity could be represented by the number of sexual partners. In this project, we investigated and compared various existing families of discrete univariate and multivariate (shared) frailty models by taking as our focus the variance of the relative frailty distribution in survivors. The relative frailty variance (RFV) among survivors provides a readily interpretable measure of how the heterogeneity of a population, as represented by a frailty model, evolves over time.
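To make the RFV concrete, the following minimal Python sketch computes it for a discrete, time-invariant frailty Z acting multiplicatively on the hazard. It uses the usual definition RFV = Var(Z | survival) / E(Z | survival)^2, evaluated at a given value of the cumulative baseline hazard. The two frailty distributions (a Poisson frailty and the same Poisson shifted by one), the truncation point and all function names are illustrative assumptions, not the models or code used in the project.

```python
import math

def survivor_frailty_moments(support, probs, cum_hazard):
    """Mean and variance of the frailty Z among survivors.

    Given frailty z, survival is exp(-z * cum_hazard), so the frailty
    distribution among survivors is the original pmf reweighted by
    exp(-z * cum_hazard) and renormalised.
    """
    weights = [p * math.exp(-z * cum_hazard) for z, p in zip(support, probs)]
    total = sum(weights)
    mean = sum(z * w for z, w in zip(support, weights)) / total
    second = sum(z * z * w for z, w in zip(support, weights)) / total
    return mean, second - mean ** 2

def rfv(support, probs, cum_hazard):
    """Relative frailty variance among survivors: Var(Z | S) / E(Z | S)^2."""
    mean, var = survivor_frailty_moments(support, probs, cum_hazard)
    return var / mean ** 2

def truncated_poisson_pmf(lam, z_max):
    """Poisson(lam) probabilities on 0..z_max, renormalised (illustrative)."""
    probs = [math.exp(-lam) * lam ** z / math.factorial(z) for z in range(z_max + 1)]
    total = sum(probs)
    return [p / total for p in probs]

if __name__ == "__main__":
    z_max = 60                                   # assumed truncation point
    probs = truncated_poisson_pmf(2.0, z_max)    # assumed Poisson mean of 2
    with_zero = list(range(z_max + 1))           # support 0, 1, 2, ...
    shifted = [z + 1 for z in with_zero]         # support 1, 2, 3, ...
    for h in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
        print(h, rfv(with_zero, probs, h), rfv(shifted, probs, h))
```

In this toy example the frailty with positive mass at zero leaves an ever more heterogeneous survivor population, because the non-susceptible individuals never experience the event, while the shifted frailty yields an RFV that first rises and then decays towards zero as survivors concentrate on the smallest frailty value.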
We explored the shape of the RFV for the purpose of model selection and reviewed available discrete random effect distributions in this context. For discrete univariate and shared frailty models, we found non-monotone trajectories of the RFV with multiple changes in slope over time, a property that seems to be absent from the continuous frailty models discussed in the literature. We also showed that in the long run, in contrast to continuous frailty models, the heterogeneity among survivors under a discrete time-invariant frailty either diverges to infinity or vanishes, so that the surviving population becomes homogeneous. Through the one-to-one relationship of the RFV with the cross-ratio function in shared frailty models, our results also apply to patterns of association within a cluster. Extensions and contrasts to discrete correlated and time-varying frailty models were discussed. In particular, we focused on the implementation and interpretation of the so-called Addams family of discrete frailty distributions. The Addams family is homogeneous in the sense that the gamma distribution is the only continuous density within this family. We proposed methods of estimation for this family of densities in the context of shared frailty models for the hazard rates for current status data. We highlighted interpretational advantages of the Addams family of discrete frailty distributions, as compared to other distributions. Our methods were illustrated with applications to multivariate serological data, made available by the Dutch National Institute for Public Health and the Environment (RIVM) through the PIENTER-2 study.
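As a rough illustration of how a shared frailty model enters the likelihood for bivariate current status data, the sketch below writes the four possible outcomes at a single monitoring time in terms of the Laplace transform of the frailty distribution. It is a sketch under stated assumptions, not the estimation method developed in the project: the gamma frailty (with mean 1 and variance theta) stands in for the Addams family, the baseline hazards are assumed constant, and the function names and the three records are hypothetical.

```python
import math

def gamma_laplace(s, theta):
    """Laplace transform of a gamma frailty with mean 1 and variance theta."""
    return (1.0 + theta * s) ** (-1.0 / theta)

def cell_probabilities(age, rate1, rate2, laplace):
    """Joint probabilities of the two infection indicators at monitoring time `age`.

    Assumes constant baseline hazards, so the cumulative hazards are rate * age.
    Under a shared frailty, the joint survival function is the Laplace transform
    evaluated at the sum of the cumulative hazards.
    """
    h1, h2 = rate1 * age, rate2 * age
    s1, s2, s12 = laplace(h1), laplace(h2), laplace(h1 + h2)
    return {
        (0, 0): s12,                   # neither infection has occurred yet
        (1, 0): s2 - s12,              # only infection 1 has occurred
        (0, 1): s1 - s12,              # only infection 2 has occurred
        (1, 1): 1.0 - s1 - s2 + s12,   # both infections have occurred
    }

def log_likelihood(data, rate1, rate2, theta):
    """Log-likelihood of records (age, status1, status2) under the sketch model."""
    def laplace(s):
        return gamma_laplace(s, theta)
    total = 0.0
    for age, d1, d2 in data:
        total += math.log(cell_probabilities(age, rate1, rate2, laplace)[(d1, d2)])
    return total

if __name__ == "__main__":
    # Hypothetical records: (age at sampling, seropositive for 1, seropositive for 2)
    toy_data = [(5.0, 0, 0), (20.0, 1, 0), (40.0, 1, 1)]
    print(log_likelihood(toy_data, rate1=0.05, rate2=0.02, theta=0.5))
```

In practice such a function would be handed to a numerical optimiser to estimate the hazard and frailty parameters; replacing `gamma_laplace` with the Laplace transform of a discrete frailty distribution leaves the rest of the construction unchanged.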
Publications
- On the Addams family of discrete frailty distributions for modeling multivariate case I interval-censored data. Biostatistics, 26(1).
  Bardo, Maximilian; Hens, Niel & Unkel, Steffen
