# Regression models for the relative risk

$\newcommand{\pr}{\mathbb{P}}\newcommand{\E}{\mathbb{E}}$ Relative risks (and risk differences) are collapsible and generally considered easier to interpret than odds-ratios. In a recent publication Richardson et al (JASA, 2017) proposed a new regression model for a binary exposure which solves the computational problems that are associated with using for example binomial regression with a log-link function (or identify link for the risk difference) to obtain such parameter estimates.

Let $$Y$$ be the binary response, $$A$$ binary exposure, and $$V$$ a vector of covariates, then the target parameter is

\begin{align*} &\mathrm{RR}(v) = \frac{\pr(Y=1\mid A=1, V=v)}{\pr(Y=1\mid A=0, V=v)}. \end{align*}

Let $$p_a(V) = \pr(Y \mid A=a, V), a\in\{0,1\}$$, the idea is then to posit a linear model for $\theta(v) = \log \big(RR(v)\big)$ and a nuisance model for the odds-product $\phi(v) = \log\left(\frac{p_{0}(v)p_{1}(v)}{(1-p_{0}(v))(1-p_{1}(v))}\right)$ noting that these two parameters are variation independent which can be from the below L’Abbé plot. Similarly, a model can be constructed for the risk-difference on the following scale $\theta(v) = \mathrm{arctanh} \big(RD(v)\big).$

p0 <- seq(0,1,length.out=100)
p1 <- function(p0,op) 1/(1+(op*(1-p0)/p0)^-1)
plot(0, type="n", xlim=c(0,1), ylim=c(0,1),
xlab="P(Y=1|A=0)", ylab="P(Y=1|A=1)", main="Constant odds product")
for (op in exp(seq(-6,6,by=.25))) lines(p0,p1(p0,op), col="lightblue")
logrr <- 0.25; abline(a=0, b=exp(logrr), col=1, lty=2) # Add line $$\log(\mathrm{RR})=0.25$$


The model can be specified in R with lava using the following syntax

library("lava")
m <- lvm(a ~ x+z,
logrr ~ 1,
nuisance ~ x + z)
m <- binomial.rr(m, y ~ a | logrr | nuisance)
## Alternative syntax:
## binomial.rr(m, response="y", exposure="a", target.model="logrr", nuisance.model="nuisance")
m


Binomial regression (exposure | relative-risk | odds-product) :

y ~ a | logrr | nuisance

Latent Variable Model

a ~ x+z          binomial(logit)
logrr ~ 1        deterministic
nuisance ~ x+z   deterministic

Exogenous variables:
x        gaussian
z        gaussian


We can then simulate from the model

d <- lava::sim(m, n=5e4, seed=1, p=c(logrr=1))


a          x          z logrr    nuisance y
1 1 -0.6264538  0.5258908     1 -0.10056299 1
2 0  0.1836433 -0.4875444     1 -0.30390103 0
3 0 -0.8356286  1.1382508     1  0.30262217 1
4 1  1.5952808  1.2151344     1  2.81041518 1
5 0  0.3295078 -0.4248307     1 -0.09532293 0
6 0 -0.8204684 -1.4508403     1 -2.27130867 0


A double-robust estimator can be constructed by considering both a model for the exposure as a function of the auxillary covariates as well as a model for the nuisance parameter. This estimator (and other double-robust estimators) can be applied using a newly developed library targeted in R (and python)

library(targeted)
riskreg(y ~ a | 1 | x+z | x+z, data=d, type="rr")

riskreg(formula = y ~ a | 1 | x + z | x + z, data = d, type = "rr")

Estimate Std.Err   2.5% 97.5% P-value
(Intercept)    1.005 0.01327 0.9787 1.031       0


It is also possible to examine the interaction effect between the exposure and a covariate:

riskreg(y ~ a | x | x+z | x+z, data=d, type="rr")

riskreg(formula = y ~ a | x | x
z | x
z, data = d, type = "rr")

Estimate Std.Err     2.5%   97.5% P-value
(Intercept) 1.000487 0.01323  0.97456 1.02642  0.0000
x           0.001409 0.01527 -0.02853 0.03134  0.9265


The double-robustness is here illustrated with a misspecified nuissance model (odds-product)

summary(riskreg(y ~ a | 1 | 1 | x+z, data=d, type="rr"))

riskreg(formula = y ~ a | 1 | 1 | x+z, data = d, type = "rr")

Relative risk model
Response:  y
Exposure:  a

Estimate Std.Err     2.5%   97.5%   P-value
log(RR):
(Intercept) 1.005273 0.01439  0.97707 1.03348 0.000e+00
log(OP):
(Intercept) 0.142626 0.02187  0.09976 0.18549 6.978e-11
logit(Pr):
(Intercept) 0.001136 0.01094 -0.02030 0.02257 9.173e-01
x           1.004365 0.01366  0.97758 1.03115 0.000e+00
z           1.004225 0.01372  0.97734 1.03111 0.000e+00


or a misspecified model for the exposure

summary(riskreg(y ~ a | 1 | x+z | 1, data=d, type="rr"))

riskreg(formula = y ~ a | 1 | x+z | 1, data = d, type = "rr")

Relative risk model
Response:  y
Exposure:  a

Estimate  Std.Err     2.5%   97.5% P-value
log(RR):
(Intercept)  1.004493 0.013201  0.97862 1.03037  0.0000
log(OP):
(Intercept)  0.003391 0.021968 -0.03967 0.04645  0.8773
x            1.003925 0.023428  0.95801 1.04984  0.0000
z            1.007712 0.023480  0.96169 1.05373  0.0000
logit(Pr):
(Intercept) -0.002000 0.008944 -0.01953 0.01553  0.8231