# Regression models for the relative risk

$$\newcommand{\pr}{\mathbb{P}}\newcommand{\E}{\mathbb{E}}$$ Relative risks (and risk differences) are collapsible and generally considered easier to interpret than odds-ratios. In a recent publication Richardson et al (JASA, 2017) proposed a new regression model for a binary exposure which solves the computational problems that are associated with using for example binomial regression with a log-link function (or identify link for the risk difference) to obtain such parameter estimates.

Let $$Y$$ be the binary response, $$A$$ binary exposure, and $$V$$ a vector of covariates, then the target parameter is

\begin{align*} &\mathrm{RR}(v) = \frac{\pr(Y=1\mid A=1, V=v)}{\pr(Y=1\mid A=0, V=v)}. \end{align*}

Let $$p_a(V) = \pr(Y \mid A=a, V), a\in\{0,1\}$$, then the idea is to posit a linear model for $\theta(v) = \log \big(RR(v)\big)$ and a nuisance model for the odds-product $\phi(v) = \log\left(\frac{p_{0}(v)p_{1}(v)}{(1-p_{0}(v))(1-p_{1}(v))}\right)$ noting that these two parameters are variation independent which can be from the below L’Abbé plot. Similarly, a model can be constructed for the risk-difference on the following scale $\theta(v) = \mathrm{arctanh} \big(RD(v)\big).$

p0 <- seq(0,1,length.out=100)
p1 <- function(p0,op) 1/(1+(op*(1-p0)/p0)^-1)
plot(0, type="n", xlim=c(0,1), ylim=c(0,1),
xlab="P(Y=1|A=0)", ylab="P(Y=1|A=1)", main="Constant odds product")
for (op in exp(seq(-6,6,by=.25))) lines(p0,p1(p0,op), col="lightblue")
logrr <- 0.25; abline(a=0, b=exp(logrr), col=1, lty=2) # Add line $$\log(\mathrm{RR})=0.25$$


The model can be specified in R with lava using the following syntax

library("lava")
m <- lvm(a ~ x+z,
logrr ~ 1,
nuisance ~ x + z)
m <- binomial.rr(m, y ~ a | logrr | nuisance)
## Alternative syntax:
## binomial.rr(m, response="y", exposure="a", target.model="logrr", nuisance.model="nuisance")
m

Binomial regression (exposure | relative-risk | odds-product) :

y ~ a | logrr | nuisance

Latent Variable Model

a ~ x+z          binomial(logit)
logrr ~ 1        deterministic
nuisance ~ x+z   deterministic

Exogenous variables:
x        gaussian
z        gaussian


We can then simulate from the model

d <- lava::sim(m, n=5e4, seed=1, p=c(logrr=1))

  a          x          z logrr    nuisance y
1 1 -0.6264538  0.5258908     1 -0.10056299 1
2 0  0.1836433 -0.4875444     1 -0.30390103 0
3 0 -0.8356286  1.1382508     1  0.30262217 1
4 1  1.5952808  1.2151344     1  2.81041518 1
5 0  0.3295078 -0.4248307     1 -0.09532293 0
6 0 -0.8204684 -1.4508403     1 -2.27130867 0


A double-robust estimator can be applied using the target library

library(target)
riskreg(y ~ a | 1 | x+z | x+z, data=d, type="rr")

riskreg(formula = y ~ a | 1 | x
z | x
z, data = d, type = "rr")

Estimate Std.Err   2.5% 97.5% P-value
(Intercept)    1.005 0.01327 0.9787 1.031       0

Previous