Event history models of all types have a few characteristics that make them unique. First of all, forget that whole symmetry thing around zero.
Here our dependent variable of interest is time to event. We are interested in how long a person lives, remains sober, stays with a given company, or, in a study of my parenting skills, goes without threatening to skin a child alive and tack her hide upside the door as a warning to her sisters.
Regardless of the specific nature of the event, we are interested in TIME, which by definition must be positive. You cannot have negative duration.
Let’s take death as an example of an event. We will define death operationally as the time of death written on the death certificate. As our beginning point, let’s take attack by weasels. Some people might die right after a weasel attack, if, say, attacked by a particularly large weasel, or a whole sneak of weasels. (Yes, the correct term for a collection of weasels is a ‘sneak’. If you don’t believe me, look it up.)
Others might linger for a while and then die, with their bodies unable to combat those severe weasel-bite wounds. Some additional number may die from complications of infections due to weasel bites and so on.
The dependent variable we are interested in is T, where T represents the time from the biting weasel onslaught to death. At each time period, there is a baseline hazard rate. Remember this term, because it is important.
Within a given time period, the baseline hazard rate is a constant, but it need not be the same from one period to the next. Weasel attack survival may be like this. Say 5% of weasel attacks are the sneak variety and the victim dies within 24 hours. However, of those who survive, only 1% die within the next 24 hours, and 0.2% catch some type of nosocomial infection and die within the following 48 hours.
In an exponential model, the baseline hazard rate is a constant – period – because we assume that the rate of an event does not change with time. For other models, the baseline hazard rate is a constant for a given time interval. The Weibull model, for example, allows for a monotonic hazard rate, i.e., it can be increasing or decreasing but only in one direction.
In the weasel example, the baseline hazard rate for the second 24-hour period is .01, so h(2) = .01.
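To see what those period-by-period hazard rates do to a crowd of weasel victims, here is a minimal Python sketch. It assumes each percentage in the example applies only to the people still alive at the start of that period. The text says so outright for the 1% and I am assuming it for the 0.2%. The names `hazards` and `survival_curve` are made up for illustration.

```python
# Discrete-time hazards from the weasel example. The percentages and period
# lengths are the ones quoted in the text; treating every one of them as
# conditional on surviving the earlier periods is an assumption.
hazards = [
    ("first 24 hours", 0.05),
    ("second 24 hours", 0.01),
    ("following 48 hours", 0.002),
]


def survival_curve(period_hazards):
    """Return the proportion still alive at the end of each period."""
    surviving = 1.0
    curve = []
    for label, h in period_hazards:
        # The hazard only applies to people who made it to this period.
        surviving *= (1.0 - h)
        curve.append((label, surviving))
    return curve


for label, proportion in survival_curve(hazards):
    print(f"End of {label}: {proportion:.4f} of victims still alive")
```

The point of the sketch is that hazards are conditional: the .01 in the second period is 1% of the survivors, not 1% of everyone who was bitten.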
However, one can, and usually will, have covariates. I mean a person is more than the sum of his episodes of attack-by-weasel, right? So, while the hazard rate may be .01, it may increase if a person has other pre-existing conditions, such as old age. A 99-year-old weasel attack victim may have a greater hazard rate than a 17-year-old victim. Other factors may have a negative relationship with hazard, for example, having been vaccinated for rabies.
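One common way to let covariates push the hazard up or down is to multiply the baseline by an exponentiated linear term. The sketch below assumes that multiplicative form, which the text above does not spell out. The coefficients, the reference age, and the function name `hazard` are all hypothetical, picked only so that age raises the hazard and vaccination lowers it.

```python
import math

BASELINE_HAZARD = 0.01  # h(2) from the weasel example

# Hypothetical values, invented for illustration only.
REFERENCE_AGE = 40       # age at which the baseline applies unchanged
B_AGE_PER_YEAR = 0.03    # positive: hazard rises with each year of age
B_VACCINATED = -0.7      # negative: vaccination lowers the hazard


def hazard(age, vaccinated):
    """Baseline hazard scaled by exp(linear combination of covariates)."""
    linear = B_AGE_PER_YEAR * (age - REFERENCE_AGE)
    if vaccinated:
        linear += B_VACCINATED
    return BASELINE_HAZARD * math.exp(linear)


for age, vaccinated in [(17, False), (99, False), (99, True)]:
    print(f"age={age}, vaccinated={vaccinated}: hazard={hazard(age, vaccinated):.4f}")
```

Exponentiating the linear term keeps the hazard positive no matter what the coefficients are, which is one reason this form is popular.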
Thus, the Weibull model can be expressed as a log-linear function:
log(T) = b0 + b1X1 + b2X2 + σε
where the last part is a stochastic disturbance term, stochastic disturbance sounding better to say than ‘error’ and less likely to draw the attention of malpractice attorneys and hedge fund investors.
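If you want to see that equation produce actual survival times, here is a minimal simulation sketch using only the Python standard library. The coefficients and covariates (age and vaccination) are hypothetical. The one piece that is not arbitrary is the disturbance term: for T to come out Weibull, ε has to follow the standard minimum extreme value distribution, which is the log of a unit exponential variable.

```python
import math
import random

# Hypothetical coefficients, chosen only so the example has something to run.
B0 = 3.0      # intercept on the log-time scale
B1 = -0.02    # per year of age: older victims get shorter survival times
B2 = 0.5      # vaccinated (1) versus not (0): longer survival times
SIGMA = 0.5   # multiplier on the disturbance term


def draw_epsilon(rng):
    """Draw from the standard minimum extreme value distribution."""
    e = 0.0
    while e == 0.0:  # guard against taking log(0)
        e = rng.expovariate(1.0)
    return math.log(e)


def simulate_time(age, vaccinated, rng):
    """Simulate T from log(T) = b0 + b1*X1 + b2*X2 + sigma*epsilon."""
    log_t = B0 + B1 * age + B2 * vaccinated + SIGMA * draw_epsilon(rng)
    return math.exp(log_t)  # exponentiating guarantees T is positive


rng = random.Random(42)
times = sorted(simulate_time(30, 0, rng) for _ in range(20000))
sample_median = times[len(times) // 2]
# Weibull median: scale * (ln 2) ** sigma, where scale = exp(b0 + b1*X1 + b2*X2)
theory_median = math.exp(B0 + B1 * 30) * math.log(2) ** SIGMA
print(f"sample median {sample_median:.2f}, theoretical median {theory_median:.2f}")
```

Modeling log(T) rather than T is what takes care of the no-negative-duration problem: whatever the right-hand side works out to, exponentiating it gives a positive time.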
What makes the Weibull model different from the exponential model is that it also includes a shape parameter. The covariates alter the scale value but the shape (whether the hazard is increasing, decreasing or flat) remains constant.
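A quick sketch of the scale-versus-shape idea, using the standard Weibull hazard function h(t) = (shape / scale) * (t / scale)^(shape - 1). The particular shape and scale values are arbitrary picks for illustration, and `weibull_hazard` is a made-up name. In terms of the log-linear equation, the shape is 1/σ and the scale is the exponentiated linear predictor.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard at time t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


times = [1.0, 2.0, 4.0, 8.0]

# shape < 1: decreasing hazard; shape == 1: flat (the exponential model);
# shape > 1: increasing hazard.
for shape in (0.5, 1.0, 2.0):
    # Two scale values stand in for two different covariate patterns.
    for scale in (5.0, 10.0):
        row = ", ".join(f"{weibull_hazard(t, shape, scale):.4f}" for t in times)
        print(f"shape={shape}, scale={scale}: {row}")
```

Changing the scale moves the whole hazard curve up or down, but a curve that was rising keeps rising and one that was falling keeps falling. Only the shape parameter decides the direction.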
A Weibull model can work over defined ranges but may not always be the best pick. Think mortality, for example. There is actually a relatively high mortality rate in the first year of life – being born is a risky business – but then mortality drops until age 14, after which your risk of death goes up again until, well, until you die. A hazard that falls and then rises is not monotonic, so a single Weibull curve cannot follow it over the whole lifespan.