Multilevel random effect and marginal models for longitudinal data

In many clinical trials, multiple measurements are taken over time in order to characterize the safety profile of subjects receiving a given treatment. Measurements taken from the same subject are usually not independent. Thus, when the dependent variable is categorical, logistic regression models that assume independence between observations from the same subject are not appropriate. In this paper, marginal and random effect models that take the correlation among measurements of the same subject into account are fitted, and extensions of the existing models are proposed. The models are applied to data obtained from a phase-III clinical trial on a new meningococcal vaccine. The goal is to investigate whether children injected with the candidate vaccine have a lower or higher risk of specific adverse events than children injected with the licensed vaccine and, if so, to quantify the difference. Moreover, the random intercept partial proportional odds model and the generalized ordered logit model, which assume identical variability for different category levels, are extended by introducing category-specific random terms. This extension makes it possible to study the association between different category levels. Instead of classical logistic regression, generalized estimating equations (GEEs) and random effect models are appropriate when measurements taken from the same subject are not independent. The results reveal that, in both the marginal and the random effects models, significant differences between the two vaccines were found for the adverse events pain and redness.


INTRODUCTION
Pharmaceutical companies develop vaccines that contain an agent resembling a disease-causing microorganism in order to improve immunity to a particular disease. When a company aims to bring a new vaccine product to market, the safety profile of the vaccine is assessed in different ways to ensure that it is safe. In most cases, the clinical safety evaluation of the vaccine is performed regarding two specific aspects (Bergsma et al., 2013). Generalized estimating equations (Liang and Zeger, 1986) for marginal models are usually preferred to evaluate the overall adverse events as a function of treatment group and visiting day. In a random effects approach (Breslow and Clayton, 1993), by contrast, the response rates are modeled as a function of covariates and parameters specific to a subject.
The two model families differ not only in the questions they address, but also in the way they deal with the dependencies between the observations. This different way of handling the within-child association means that the two model families serve different purposes, as noted by several authors (Laird and Ware, 1982; Agresti, 2002; Fitzmaurice et al., 2004).
As a result, the interpretations of the regression model parameters are different. For the partial proportional odds random intercept model, which assumes identical baseline variability within a subject across the different categories of the outcome, extensions that allow a different random effect variability at each category are proposed.

Case study: Phase-III clinical trial
The data used in this paper come from a phase III [...] (Aitchison and Silvey, 1957; Genter and Farewell, 1985; Agresti, 2002) [...] where $A_i$ is a diagonal matrix with the marginal variances $v(\mu_i) = \mathrm{Var}(w_i)$ on the main diagonal, and the working correlation matrix depends on the unknown parameter vector $\alpha$. Using the method of moments, Liang and Zeger (1986) showed that, when the marginal mean has been correctly specified and mild regularity conditions hold, the estimator obtained by solving the score equation (2) is consistent and asymptotically normally distributed with mean $\beta$ and asymptotic variance-covariance matrix $\mathrm{Var}(\hat{\beta})$.
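To make the roles of $A_i$ and the working correlation matrix concrete, the following minimal sketch builds a working covariance of the usual form $A_i^{1/2} R(\alpha) A_i^{1/2}$ for one child. It is a simplification: the responses are treated as binary indicators at each visiting day, the working correlation is assumed to be exchangeable, and the dispersion parameter is taken to be 1. The function names, the value of $\alpha$, and the probabilities are illustrative assumptions, not quantities from the trial analysis.

```python
import math

def exchangeable_corr(n, alpha):
    """Working correlation: 1 on the diagonal, alpha elsewhere (assumed structure)."""
    return [[1.0 if r == c else alpha for c in range(n)] for r in range(n)]

def working_covariance(mu, alpha):
    """V = A^{1/2} R(alpha) A^{1/2}, where A = diag(mu_j * (1 - mu_j))."""
    # Square roots of the Bernoulli variances on the diagonal of A.
    sd = [math.sqrt(m * (1.0 - m)) for m in mu]
    R = exchangeable_corr(len(mu), alpha)
    n = len(mu)
    return [[sd[r] * R[r][c] * sd[c] for c in range(n)] for r in range(n)]

# Hypothetical marginal probabilities of an adverse event at three visiting days.
mu = [0.30, 0.20, 0.10]
V = working_covariance(mu, alpha=0.4)
for row in V:
    print([round(v, 4) for v in row])
```

The diagonal of the result holds the marginal variances, while the off-diagonal entries depend on the nuisance parameter $\alpha$ only through the assumed working structure.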

Statistical methodology
In summary, marginal models for longitudinal data separately model the mean response and the within-child association among the repeated responses. The aim is to make inferences about the mean response, whereas the association is regarded as a nuisance characteristic of the data that must be accounted for in order to make valid inferences about changes in the population mean response. This separate specification of the mean and the within-child association has an important implication for parameter interpretation. The GEE approach does not completely specify the joint distribution of the responses. Models that use cumulative probabilities, such as proportional odds models, as well as adjacent-category logits and continuation-ratio logits (McCullagh, 1980; Ananth and Kleinbaum, 1997; Agresti, 2002), are possible choices for modeling ordinal data. The continuation-ratio model, which is suited to situations where the underlying outcome is irreversible, and the adjacent-category model, which is designed for situations in which the subject must 'pass through' one category to reach the next category (Liu and Agresti, 2005), are not used in this analysis. As a result, this paper focuses on ordinal logistic regression models under the GEE modeling framework.

Proportional odds model (POM)
The unique feature of the proportional odds model (POM) is that the odds ratio for each predictor is taken to be constant across all possible collapsings of the response variable. When the assumption is met, odds ratios in a POM are interpreted as the odds of being lower or higher on the outcome variable across the entire range of the outcome (Scott et al., 1997).
In a POM, reversing the direction of the response levels changes the direction of the effects but not their magnitude or significance (McCullagh, 1980; Hosmer and Lemeshow, 2000).
Let $\mu_{ijk}$ be the probability that the $i$th subject at the $j$th visiting day is in response category $k$, $\mu_{ijk} = P(y_{ij} = k)$. Further, consider the cumulative probability of the response being in category $k$ or above, $P(y_{ij} \geq k)$. The lowest outcome category corresponds to a baseline level. In the POM, model (1), $k$ is the level of the ordered category. The parameters $\beta_{0k}$ are the intercepts for category $k$, usually considered nuisance parameters of little interest (Agresti, 2002). $\mathrm{Trt}_i$ takes the value 1 when [...]. Under the GEE approach, likelihood-based methods to compare models and to conduct inference about the parameters are not available. To draw inference in a quasi-likelihood approach, Boos (1992) and Rotnitzky and Jewell (1990) illustrate a generalization of score tests for different models, including models based on GEE.
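As an illustration of the cumulative logit formulation, the sketch below computes $P(y \geq k)$ and the category probabilities under a proportional odds model, and checks that the cumulative odds ratio is the same at every cutpoint. The outcome is assumed to have four ordered levels (three cutpoints), the linear predictor contains only a treatment indicator, and the intercepts and the treatment coefficient are hypothetical values; none of these are taken from the fitted model (1).

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def cumulative_probs(intercepts, beta_trt, trt):
    """P(y >= k) for k = 1..K under logit P(y >= k) = beta_0k + beta_trt * trt."""
    return [expit(b0k + beta_trt * trt) for b0k in intercepts]

def category_probs(cum):
    """Convert P(y >= k), k = 1..K, into P(y = k), k = 0..K."""
    upper = [1.0] + list(cum)   # P(y >= 0) = 1
    lower = list(cum) + [0.0]   # P(y >= K + 1) = 0
    return [u - l for u, l in zip(upper, lower)]

def odds(p):
    return p / (1.0 - p)

# Hypothetical values: decreasing intercepts keep P(y >= k) decreasing in k.
intercepts = [0.5, -1.0, -2.5]
beta_trt = -0.5

cum_ctrl = cumulative_probs(intercepts, beta_trt, trt=0)
cum_trt = cumulative_probs(intercepts, beta_trt, trt=1)

print("category probabilities (trt=1):",
      [round(p, 3) for p in category_probs(cum_trt)])

# Under proportional odds, the cumulative odds ratio equals exp(beta_trt)
# at every cutpoint.
odds_ratios = [odds(p1) / odds(p0) for p1, p0 in zip(cum_trt, cum_ctrl)]
for k, ratio in enumerate(odds_ratios, start=1):
    print(f"cutpoint {k}: odds ratio = {ratio:.4f}")
```

Relaxing the proportional odds assumption amounts to letting the treatment coefficient carry a cutpoint index, in which case the three odds ratios printed at the end would no longer coincide.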

Generalized linear mixed models

Random effect models for ordinal outcomes
As in the marginal setting, the POM assumes that the odds ratio for each predictor is constant across all possible collapsings of the response variable (Scott et al., 1997). When the proportional odds assumption is met and child-specific parameter estimates are of interest, the proportional odds model (POM) can be easily fitted in the random effects modeling framework by introducing random effect terms ($b_i$) specific to child $i$ in model (1). In this model, the ordinal nature of the response is taken into account by considering the cumulative probabilities. In the random effect POM, model (3), $\Pi_{ijk}$ is the cumulative probability of the outcome ($Y_{ij} \geq k$) conditional on the covariates, $k$ is the level of the ordered category, $\beta_{0k}$ is the intercept for category $k$, the slope parameters represent conditional log-odds ratios of the grouped categories above the cutoff ($k$) compared to the categories below $k$, and $b_i$ is the child-specific parameter for the $i$th child. To relax the strong assumption in the POM that the log-odds ratio for the association between the outcome and a covariate is identical across cutoffs, the partial proportional odds model (PPOM) and the generalized ordered logit model (GOLM) have been considered; both can be easily fitted using the NLMIXED procedure in SAS.

Partial proportional odds model (PPOM)
When the proportional odds assumption applies to some but not all of the covariates, the partial proportional odds model (4) [...]

Relationship between marginal and random effect model parameters
Zeger et al. (1988) derived an approximate relationship between the population-averaged parameters (from GEE) and the subject-specific parameters of a model with a random effect in the linear predictor. Since the procedure PROC GENMOD in the current [...]

[...] The question is why we assume only one, common random effect for the different categories.
To overcome such problems, we extend models (3) and (4) [...] where $b_{ik}$ is the random intercept for category $k$ of the model, and the vector of these random effects is assumed to follow a multivariate normal distribution with mean vector zero and (co)variance matrix $D$ (i.e. $b_{ik} \sim N(0, D)$).
Here $D$ is a $k \times k$ general covariance matrix with elements $d_{rs}$. The element $d_{rs}$ represents the (co)variance between $b_{ir}$ and $b_{is}$ ($r = 1, 2, 3$; $s = 1, 2, 3$).
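The elements $d_{rs}$ can be rescaled into correlations between the category-specific random intercepts, which is often easier to read than the raw covariances. The following is a minimal sketch with a made-up $3 \times 3$ matrix standing in for an estimated $D$; the values and the function name are assumptions for illustration only.

```python
import math

def cov_to_corr(D):
    """Return the correlation matrix with entries d_rs / sqrt(d_rr * d_ss)."""
    n = len(D)
    sd = [math.sqrt(D[r][r]) for r in range(n)]
    return [[D[r][s] / (sd[r] * sd[s]) for s in range(n)] for r in range(n)]

# Hypothetical covariance matrix of (b_i1, b_i2, b_i3); not estimates from the trial.
D = [
    [2.0, 1.2, 0.6],
    [1.2, 1.5, 0.9],
    [0.6, 0.9, 1.0],
]
corr = cov_to_corr(D)
for row in corr:
    print([round(x, 3) for x in row])
```

A correlation close to one between two category levels would suggest that a single shared random intercept is adequate for them, whereas clearly lower correlations point towards category-specific terms.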
The advantage of the extended model over models (3) and (4) is that it enables us to study the association between different category levels using the covariance matrix of the $b_{ik}$. All the considered random effects models are fitted by maximizing the marginal likelihood, obtained by integrating out the random effects. Since the likelihood function does not have a closed form in this case, model fitting is not an easy task. Numerical approximations (Molenberghs and Verbeke, 2005) are used to maximize the marginal likelihood.
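To illustrate what integrating out the random effects involves, the sketch below approximates the marginal likelihood contribution of one child in a random intercept logistic model for a single cutpoint. It uses a plain trapezoidal rule over the normal density, which is a simplified stand-in for the quadrature methods typically used by mixed-model software. The responses, the linear predictors, the random intercept standard deviation, and the function names are all hypothetical.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def normal_pdf(b, sigma):
    return math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def conditional_likelihood(y, eta, b):
    """Likelihood of the binary responses y given the random intercept b."""
    lik = 1.0
    for y_j, eta_j in zip(y, eta):
        p = expit(eta_j + b)
        lik *= p if y_j == 1 else 1.0 - p
    return lik

def marginal_likelihood(y, eta, sigma, n_points=2001, width=8.0):
    """Trapezoidal approximation of the integral over b ~ N(0, sigma^2)."""
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        b = lo + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0  # half weight at the ends
        total += weight * conditional_likelihood(y, eta, b) * normal_pdf(b, sigma)
    return total * step

# Hypothetical child: indicators of "at least moderate pain" on three visiting days,
# with made-up fixed-effect linear predictors eta_j.
y = [1, 0, 0]
eta = [-0.5, -1.0, -1.5]
print(round(marginal_likelihood(y, eta, sigma=1.5), 5))
```

With category-specific random intercepts the integral becomes multidimensional, which is why the extended model is considerably more expensive to fit than the single random intercept versions.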
In GLMMs, although in practice one is usually primarily interested in estimating the parameters in the marginal distribution for $Y_{ij}$, it is often useful to calculate estimates for the random effects $b_i$ as well. They reflect between-child variability, which makes them helpful for detecting special profiles [...]

Figure 1. Percentage of solicited symptoms by treatment group at each visiting day.

Generalized random effect models
In this section, results based on an alternative approach using child (cluster) level terms in the model are discussed. Given the discrete nature of the time-varying covariate (day), a random intercept model, which adjusts only the intercept and does not modify the fixed effects, was considered.

Random effect models for ordinal outcomes
The assumption of proportional odds across different categories was tested using the likelihood ratio test, by comparing model (3) [...]. The results for model (3), model (4) and model (5) are summarized in Table 3. The results showed that, when the extended model (7) was fitted in the case of the PPOM using category-specific random effect terms, instead of model (4), which assumes the same baseline variability at each category, the difference between the two treatment groups increases for high cutoff points (Table 3).
For instance, based on model (4), the odds ratio comparing a child injected with the candidate vaccine with another child injected with the licensed vaccine, both having identical covariate and random intercept values, at the third category of the outcome is 0.329, while based on model (7) it is 0.065. The results for the fixed effect parameters from the three models generally agree in indicating that children injected with the candidate vaccine are less likely to show at least moderate and severe intensity levels of pain, as indicated by odds ratios below one (Table 3). Since the 95% CI for the effect of treatment at categories one and three does not include one, these differences between [...]
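The likelihood ratio comparison of two nested models can be sketched as follows. The log-likelihood values and the number of additional parameters are invented for illustration and are not results from Table 3. The chi-square tail probability is computed with a closed-form series that is valid for integer degrees of freedom, so that the example needs nothing beyond the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in x^(r-1) / (1*3*...*(2r-1)).
    total, term = 0.0, 1.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)

def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return the LR statistic 2 * (l_full - l_reduced) and its p-value."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)

# Hypothetical log-likelihoods: proportional odds model (reduced) versus a model
# that lets the treatment effect differ across the three cutpoints (full).
stat, p_value = likelihood_ratio_test(loglik_reduced=-1250.4, loglik_full=-1245.1, df=2)
print(f"LR statistic = {stat:.2f}, p-value = {p_value:.4f}")
```

The chi-square reference distribution used here applies to fixed-effect constraints such as proportionality; it should not be carried over unchanged to tests on variance components, which lie on the boundary of the parameter space.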

Extensions for random effect models with ordinal outcomes
Although it is computationally very intensive because of the increased number of random effects, the extended model (7), which allows category-specific random terms and makes it possible to study the association between category levels, [...]

To fix ideas, let us reconsider the estimated effect of the candidate vaccine under the different random effect model formulations (Table 4).

DISCUSSION
The estimated treatment effects 0.449 and 0.162 from the random effect model describe how the odds of observing at least moderate and severe levels of pain, respectively, decrease for any child treated with the candidate vaccine. Note that the same odds ratio is significant in both the random effects model and the marginal models, but the magnitude of the effect can differ.
Therefore, the answer to the question "how beneficial is the candidate vaccine?" will depend on whether the interest is in its impact on the study population or on an individual drawn from that population.
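The difference in magnitude between subject-specific and population-averaged effects can be illustrated with the approximation of Zeger et al. (1988) for a logistic model with a normally distributed random intercept, $\beta_M \approx \beta_{RE} / \sqrt{1 + c^2 \sigma^2}$ with $c = 16\sqrt{3}/(15\pi)$. The sketch below applies it to made-up values; the coefficient, the random intercept variance, and the function name are assumptions and not estimates from this paper.

```python
import math

# Constant of the logistic-normal approximation, roughly 0.588.
C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)

def marginal_from_conditional(beta_re, sigma2):
    """Approximate population-averaged coefficient from a subject-specific one."""
    return beta_re / math.sqrt(1.0 + C ** 2 * sigma2)

beta_re = -1.2   # hypothetical subject-specific log-odds ratio
sigma2 = 2.0     # hypothetical random intercept variance
beta_m = marginal_from_conditional(beta_re, sigma2)

print(f"subject-specific OR:    {math.exp(beta_re):.3f}")
print(f"population-averaged OR: {math.exp(beta_m):.3f}")
```

The population-averaged coefficient is always closer to zero than the subject-specific one, and the attenuation grows with the between-child variance, which is consistent with the two model families agreeing on significance while differing in magnitude.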
If the interest is in modeling the heterogeneity among children and in drawing likelihood-based inferences, we prefer to fit random effect models over GEE. In a random effects model, each child is assumed to have its own level of adverse event. It is well known that fixed effects parameters do not maintain their interpretation when random effects are introduced in the model. Therefore, the fixed effects odds ratio is no longer an odds ratio between any two children, as mentioned by Zeger et al. (1988).
Among the considered models that account for the ordinal nature of the data, the extended model (7) with category-specific random terms fits the data better (Table 5). [...] Table 4 for model a (7) and model b (7) [...] implies that the difference between the two treatment groups can be considered constant over the follow-up period. In addition, the extended models fit the data better than the existing methods.