Cox’s semiparametric proportional hazard model: larynx
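Cox (1972) proposed the following model: $h(t \mid x) = h_0(t)\exp(\beta x)$, where
$h_0(t)$ is an arbitrary baseline hazard function.
Since
$$\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\{\beta (x_1 - x_2)\}$$
does not depend on $t$ for all values $x_1, x_2$ of $x$, the hazard functions for different values of $x$ are proportional to one another, and
$\exp(\beta)$ is the hazard ratio corresponding to a unit increase in the value of $x$.
The Cox model is semi-parametric because $h_0(t)$ is arbitrary, but the effect of $x$ depends on the parameter $\beta$ through
$\exp(\beta x)$. How do we estimate $\beta$ for the Cox model?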
Case 1: No censoring, no ties.
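Use the rank likelihood, i.e., as under the exponential PH model, the log rank likelihood is still given by
$$\log L(\beta) = \sum_{j=1}^{n} \Big[ \beta x_{(j)} - \log \sum_{l=j}^{n} \exp(\beta x_{(l)}) \Big],$$
where $x_{(j)}$ is the covariate value of the subject with the $j$-th smallest failure time.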

Case 2: Right-censored data, only 1 death per timepoint.
The rank likelihood method can be generalized by summing over all possible rank vectors that are compatible with the observed censored data.
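The generalized rank likelihood:
$$L(\beta) = \prod_{j=1}^{k} \frac{\exp(\beta x_{(j)})}{\sum_{l \in R(t_j)} \exp(\beta x_l)},$$
where $t_1 < \dots < t_k$ are the distinct death times, $x_{(j)}$ is the covariate value of the subject who dies at $t_j$, and $R(t_j)$ is the set of subjects still at risk just before $t_j$.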
There is another way (the partial likelihood approach) to derive the same expression with the advantage that it shows explicitly how the unknown baseline hazard function is being eliminated from the likelihood. Again we will illustrate using the hypothetical example:

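The general form of the partial likelihood, if all $d_j = 1$ (where $d_j$ is the number of deaths at the $j$-th distinct death time $t_j$), is
$$L(\beta) = \prod_{j=1}^{k} \frac{\exp(\beta x_{(j)})}{\sum_{l \in R(t_j)} \exp(\beta x_l)},$$
with $x_{(j)}$ the covariate value of the subject who dies at $t_j$ and $R(t_j)$ the risk set at $t_j$.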
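To make the partial likelihood concrete, here is a minimal Python sketch of the log partial likelihood for a single covariate when there is at most one death per time point. The function name and the small data set are hypothetical and chosen only for illustration; they are not the example used in these notes.

```python
import math

def cox_log_partial_likelihood(beta, times, events, x):
    """Log partial likelihood for one covariate, assuming no tied death times.

    times  : observed times (death or censoring)
    events : 1 if the death was observed, 0 if right-censored
    x      : covariate values
    """
    n = len(times)
    loglik = 0.0
    for j in range(n):
        if not events[j]:
            continue  # censored subjects contribute only through the risk sets
        # Risk set R(t_j): everyone still under observation at time t_j
        risk_sum = sum(math.exp(beta * x[l]) for l in range(n) if times[l] >= times[j])
        loglik += beta * x[j] - math.log(risk_sum)
    return loglik

# Hypothetical data (not from the notes): 5 subjects, the third one is censored
times = [2.0, 3.0, 5.0, 7.0, 8.0]
events = [1, 1, 0, 1, 1]
x = [1, 0, 1, 0, 0]

ll0 = cox_log_partial_likelihood(0.0, times, events, x)
ll1 = cox_log_partial_likelihood(0.5, times, events, x)
print(ll0, ll1)
```

Note that the baseline hazard never appears in the computation: only the ordering of the times and the covariates in each risk set matter. At $\beta = 0$ each death contributes minus the log of the size of its risk set.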
Case 3: Data right-censored, $d_j > 1$ for some $j$.
How to handle ties in partial likelihood? There are 3 methods.
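Method 1 (Breslow’s method): Let $D_j$ be the death set at time $t_j$, i.e., the set of $d_j$ persons who die at time $t_j$, and $s_j = \sum_{l \in D_j} x_l$ the sum of $x$ over the death set $D_j$. Breslow’s approximation uses the full risk-set sum for each of the $d_j$ tied deaths:
$$L_B(\beta) = \prod_{j} \frac{\exp(\beta s_j)}{\Big[\sum_{l \in R(t_j)} \exp(\beta x_l)\Big]^{d_j}},$$
where $R(t_j)$ is the risk set at $t_j$.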
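Method 2 (Efron’s method): By thinking of $d_j$ tied failure times as $d_j$ distinct but infinitesimally close failure times, Efron’s approximation differs from Breslow’s method in how it treats the denominator (Efron’s method is the default in R’s `coxph`). Efron’s method is more computationally intensive.
$$L_E(\beta) = \prod_{j} \frac{\exp(\beta s_j)}{\prod_{r=1}^{d_j} \Big[\sum_{l \in R(t_j)} \exp(\beta x_l) - \frac{r-1}{d_j} \sum_{l \in D_j} \exp(\beta x_l)\Big]},$$
where $R(t_j)$ is the risk set at $t_j$,
$D_j$ is the death set at $t_j$ with $d_j$ members,
and $s_j = \sum_{l \in D_j} x_l$.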
Method 3 (Exact method): Method 3 is the most computationally intensive. If the number of ties is not excessive, the 3 methods should give similar results.
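The difference between the Breslow and Efron denominators can be sketched in a few lines of Python. This is a minimal illustration for one covariate, not the implementation used by any package; the function name and the data are hypothetical, and the exact method is not covered.

```python
import math
from collections import defaultdict

def tied_log_partial_likelihood(beta, times, events, x, method="breslow"):
    """Log partial likelihood with tied deaths (Breslow or Efron approximation)."""
    n = len(times)
    # Group the observed deaths by time to form the death sets D_j
    death_sets = defaultdict(list)
    for l in range(n):
        if events[l]:
            death_sets[times[l]].append(l)

    loglik = 0.0
    for t_j, D_j in death_sets.items():
        d_j = len(D_j)
        s_j = sum(x[l] for l in D_j)  # sum of x over the death set
        risk_sum = sum(math.exp(beta * x[l]) for l in range(n) if times[l] >= t_j)
        death_sum = sum(math.exp(beta * x[l]) for l in D_j)
        loglik += beta * s_j
        for r in range(d_j):  # r = 0, ..., d_j - 1 plays the role of (r - 1) in the formula
            if method == "breslow":
                loglik -= math.log(risk_sum)
            elif method == "efron":
                loglik -= math.log(risk_sum - (r / d_j) * death_sum)
            else:
                raise ValueError("method must be 'breslow' or 'efron'")
    return loglik

# Hypothetical data (not from the notes): two deaths at t=1, two at t=3, one censored at t=2
times = [1, 1, 2, 3, 3, 4]
events = [1, 1, 0, 1, 1, 1]
x = [1, 0, 1, 1, 0, 0]

breslow0 = tied_log_partial_likelihood(0.0, times, events, x, "breslow")
efron0 = tied_log_partial_likelihood(0.0, times, events, x, "efron")
print(breslow0, efron0)
```

Efron’s denominator removes a growing fraction of the death set from the risk-set sum, so each tied term is at least as large as under Breslow’s method. When every $d_j = 1$ the two versions coincide.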

Laryngeal cancer example
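Hypothesis testing:
overall test $H_0$:
$\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ (with $\beta_1, \beta_2, \beta_3$ the effects of stage II, III, IV vs I and $\beta_4$ the effect of age),
LRT=18.3 on 4 df (p=0.001)
testing the effect of age only, i.e. $H_0$:
$\beta_4 = 0$; Wald test: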
, LRT=2{-187.7074-(-188.6208)}=1.8268 on 1 df (p=0.1765)
testing the effect of stage of cancer only, i.e. $H_0$:
$\beta_1 = \beta_2 = \beta_3 = 0$,
LRT=2{-187.7074-(-195.5478)}=15.68 on 3 df (p=0.0013)
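The likelihood ratio statistics above can be reproduced from the log partial likelihoods quoted in the formulas. The Python sketch below uses only the standard library; `chi2_sf` is a hypothetical helper name, and the labels attached to the three log-likelihood values (full model, model without age, model without stage) are inferred from how they are used in the LRT formulas.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    # Recurrence: Q(x; df) = Q(x; df - 2) + (x/2)^(df/2 - 1) * exp(-x/2) / Gamma(df/2)
    term = (x / 2.0) ** (df / 2.0 - 1.0) * math.exp(-x / 2.0) / math.gamma(df / 2.0)
    return chi2_sf(x, df - 2) + term

# Log partial likelihoods quoted in the notes
loglik_full = -187.7074      # stage (3 indicators) + age
loglik_no_age = -188.6208    # stage only
loglik_no_stage = -195.5478  # age only

lrt_age = 2 * (loglik_full - loglik_no_age)
lrt_stage = 2 * (loglik_full - loglik_no_stage)
p_age = chi2_sf(lrt_age, 1)
p_stage = chi2_sf(lrt_stage, 3)
p_overall = chi2_sf(18.3, 4)  # overall LRT statistic as reported in the notes

print(round(lrt_age, 4), round(p_age, 4))
print(round(lrt_stage, 2), round(p_stage, 4))
print(round(p_overall, 4))
```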

Interaction between a factor and a continuous variable:
Recall from p.23: we entered stage of cancer (II, III, IV vs I) and age at diagnosis into the Cox model and saw that the effect of age on survival was not statistically significant (p=0.18). K&M added the interaction terms between stage of cancer and age to the model (read Tables 8.3 and 8.4 of K&M). The only significant interaction term is between Z1 (stage II vs I) and age. This means that the hazard ratio, or relative risk of dying, for a stage II patient relative to a stage I patient diagnosed at the same age depends on that age.

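Using the estimates reported in Table 8.4 of K&M, the hazard ratio for a stage II patient relative to a stage I patient, both diagnosed at age $a$, has the form

$$\exp\big(\hat\beta_{Z_1} + \hat\beta_{Z_1 \times \text{age}}\, a\big),$$
where $\hat\beta_{Z_1}$ is the estimated coefficient of Z1 and $\hat\beta_{Z_1 \times \text{age}}$ that of the Z1 by age interaction.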
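To see how an interaction makes the stage II vs stage I hazard ratio vary with age, the sketch below evaluates $\exp(b_{Z_1} + b_{Z_1 \times \text{age}} \cdot \text{age})$ at several ages. The two coefficient values are hypothetical placeholders chosen for illustration only; they are not the Table 8.4 estimates and should be replaced by the values from K&M.

```python
import math

def stage2_vs_stage1_hazard_ratio(age, b_stage2, b_interaction):
    """Hazard ratio of stage II vs stage I at a given age, with a stage II x age interaction."""
    return math.exp(b_stage2 + b_interaction * age)

# HYPOTHETICAL coefficients for illustration only (not the Table 8.4 estimates)
b_stage2 = -7.0
b_interaction = 0.1

hazard_ratios = {age: stage2_vs_stage1_hazard_ratio(age, b_stage2, b_interaction)
                 for age in (50, 60, 70, 80)}
for age, hr in hazard_ratios.items():
    print(age, round(hr, 3))
```

With a positive interaction coefficient the hazard ratio increases with age, and it crosses 1 at the age where the main effect and the interaction term cancel.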