A probability game

Assume two positive random numbers \(X\) and \(Y\) are given, with unknown joint probability distribution \(P\), and \(X\neq Y\) a.s.

A player randomly draws one of the numbers and has to guess whether it is smaller or larger than the other, unrevealed, number. That is, let \(U\sim Bernoulli(\tfrac{1}{2})\) independent of \((X, Y)\); the player then sees \(Z_{1} = UX + (1-U)Y,\) while \(Z_{2} = (1-U)X + UY\) remains unseen.

Because of the randomization \(U\), a random guess (coin flip) is correct with probability \(\tfrac{1}{2}\), independently of \(P\). The question is whether we can find a better strategy.
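This baseline is easy to check by simulation. A minimal sketch (the correlated lognormal pair below is just an arbitrary choice of distribution, not part of the setup) confirms that a coin flip is right about half the time:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
# Any joint distribution with X != Y a.s. works; here a dependent lognormal pair.
x = np.exp(rng.normal(size=n))
y = np.exp(0.5 * np.log(x) + rng.normal(size=n))
u = rng.integers(0, 2, size=n)           # which number is revealed
z1 = np.where(u == 1, x, y)              # revealed number
z2 = np.where(u == 1, y, x)              # hidden number
coin = rng.integers(0, 2, size=n) == 1   # random guess: "z1 is the larger"
print(np.mean(coin == (z1 > z2)))        # close to 0.5
```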

Assume we can sample \(\widetilde{Z}\), independently of \((X, Y, U)\), from a distribution whose support covers the supports of \(X\) and \(Y\). The player can then instead guess that \(Z_{1}>Z_{2}\) iff \(Z_{1}>\widetilde{Z}\). With this strategy the player guesses correctly with probability

\begin{align*} &\mathbb{P}(Z_{1}>\widetilde{Z}, Z_{1}>Z_{2}) + \mathbb{P}(Z_{1}<\widetilde{Z}, Z_{2}>Z_{1}) \\
&\qquad= \mathbb{P}(Z_{1}>\widetilde{Z}, Z_{1}>Z_{2}, U=1) + \mathbb{P}(Z_{1}>\widetilde{Z}, Z_{1}>Z_{2}, U=0) \\
&\qquad\quad + \mathbb{P}(Z_{1}<\widetilde{Z}, Z_{2}>Z_{1}, U=1) + \mathbb{P}(Z_{1}<\widetilde{Z}, Z_{2}>Z_{1}, U=0) \\
&\qquad= \tfrac{1}{2}\mathbb{P}(X>\widetilde{Z}, X>Y) + \tfrac{1}{2}\mathbb{P}(Y>\widetilde{Z}, Y>X) + \tfrac{1}{2}\mathbb{P}(X<\widetilde{Z}, X<Y) + \tfrac{1}{2}\mathbb{P}(Y<\widetilde{Z}, Y<X).
\end{align*}

Applying \(\mathbb{P}(A)+\mathbb{P}(B) = \mathbb{P}(A\cup B)+\mathbb{P}(A\cap B)\) to the first and fourth terms (both on the event \(X>Y\), where the union is everything and the intersection is \(X>\widetilde{Z}>Y\)), and similarly to the second and third terms, we have

\begin{align*} &\tfrac{1}{2}\Big[\mathbb{P}(X>Y)+\mathbb{P}(X>Y, X>\widetilde{Z}>Y) \\
&\qquad + \mathbb{P}(X<Y) + \mathbb{P}(X<Y, X<\widetilde{Z}<Y) \Big] \\
&\qquad= \tfrac{1}{2}\left\{1+\mathbb{P}(\widetilde{Z} \text{ between } X \text{ and } Y)\right\},
\end{align*}

which under the given assumptions is strictly larger than \(\tfrac{1}{2}\). Intuitively, the player should choose the distribution of \(\widetilde{Z}\) so that it “separates” \(X\) and \(Y\), i.e., puts as much mass as possible between them.
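The identity can be verified numerically. In the sketch below \(X\), \(Y\), and \(\widetilde{Z}\) are taken i.i.d. standard normal (an assumed example); by symmetry each of the three is the middle one with probability \(\tfrac{1}{3}\), so the win probability should be \(\tfrac{1}{2}(1+\tfrac{1}{3}) = \tfrac{2}{3}\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = rng.normal(size=n)
u = rng.integers(0, 2, size=n)
z1 = np.where(u == 1, x, y)   # revealed number
z2 = np.where(u == 1, y, x)   # hidden number
zt = rng.normal(size=n)       # independent threshold sample
win = (z1 > zt) == (z1 > z2)  # threshold strategy guesses correctly
between = (np.minimum(x, y) < zt) & (zt < np.maximum(x, y))
print(np.mean(win))                  # close to 2/3
print(0.5 * (1 + np.mean(between)))  # matches the closed form
```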

Example in Python

import sys
import numpy as np
import scipy.linalg as linalg
from scipy.stats import norm, bernoulli, uniform

if sys.version_info < (3, 6):
    raise Exception("Must be using Python >= 3.6")

def rexp(n: int):
    """Draw n Exp(1) variates by inverse transform."""
    return -np.log(uniform.rvs(size=n))

class game:
    """Simulation of simple probability game"""
    def __init__(self,
                 n: int,
                 gen=rexp,   # distribution of the threshold
                 rho=0.5):
        # (X, Y) bivariate standard normal with correlation rho
        R = np.array([[1, rho], [rho, 1]])
        L = linalg.cholesky(R)  # upper triangular, R = L.T @ L
        xy = norm.rvs(size=(n, 2)) @ L
        self.u = bernoulli.rvs(0.5, size=n)
        rows = np.arange(n)
        self.w = xy[rows, self.u]      # revealed number
        self.z = xy[rows, 1 - self.u]  # hidden number
        self.x = xy[:, 0]
        self.y = xy[:, 1]
        self.g = gen(n)                # threshold sample
        self.guess = self.w > self.g   # guess: revealed number is the larger
        self.true = self.w > self.z

n = int(1e5)
sim = game(n)
print(np.mean(sim.true == sim.guess))
0.60851
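As a check, the closed form \(\tfrac{1}{2}\{1+\mathbb{P}(\widetilde{Z} \text{ between } X \text{ and } Y)\}\) can be estimated directly for this setup by re-simulating the correlated normal pair and the \(Exp(1)\) threshold (the seed below is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
rho = 0.5
# Correlated standard normal pair, as in the example above
g = rng.normal(size=(n, 2))
x = g[:, 0]
y = rho * x + np.sqrt(1 - rho**2) * g[:, 1]
zt = rng.exponential(size=n)  # Exp(1) threshold
between = (np.minimum(x, y) < zt) & (zt < np.maximum(x, y))
print(0.5 * (1 + np.mean(between)))  # should be close to the win rate above
```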

Example in R

sim <- function(n, delta=0, ...) {
  ## Log-normal pair (X, Y); '...' is passed on to mets::rmvn (e.g. rho, mu)
  xy <- exp(mets::rmvn(n, ...)) + delta
  u <- rbinom(n, 1, 0.5)
  w <- xy[cbind(seq(n), u + 1)]  ## revealed number
  z <- xy[cbind(seq(n), 2 - u)]  ## hidden number
  wmax <- w > z                  ## truth: revealed number is the larger
  cbind(w=w, z=z, wmax=wmax, x=xy[,1], y=xy[,2])
}

n <- 1e5
val <- sim(n=n, rho=0.5)
z0 <- rexp(n, 1)
guess <- val[, "w"]>z0
mean(guess==val[, "wmax"])
[1] 0.6037
val <- sim(n=1e5, rho=0.5, mu=c(-3,3))
z0 <- rexp(n, 1)
guess <- val[, "w"]>z0
mean(guess==val[, "wmax"])
[1] 0.96077
val <- sim(n=1e5, rho=0.5, mu=c(10,10))
z0 <- rexp(n, 1)
guess <- val[, "w"]>z0
mean(guess==val[, "wmax"])
[1] 0.49956

With \(\mu=(-3,3)\) the \(Exp(1)\) threshold separates the two numbers most of the time, and the win rate is close to one. With \(\mu=(10,10)\) both numbers lie far above where the threshold puts its mass, so \(\widetilde{Z}\) almost never falls between them and the strategy degrades to a coin flip.