Probability Problems 05

In this 5th post, I try to put some questions that is more statistics-like, as they are really popular.

Table of Contents

Fair coin

You toss a coin 10,000 times and get 5,050 heads, will you say this is a fair coin?

Gambling

Suppose you have 100 dollars. You have $0.4$ probability to double your bets, or $0.6$ to lose all. You goal is to reach 500 dollars.
You have three strategies. First, all in. Second, invest half of your capital each time. Third, invest only 10 dollars each.
What if the probability of winning and losing flips? How will you change your strategy?
[Reference](https://math.stackexchange.com/questions/4449198/best-strategy-to-reach-500-for-a-gambling-situation-in-a-casino)

Basic idea: if you have higher winning probability, then playing more games with few bets each time can offer less variance. To justify this, regard each play as a binary variable $X_i$ with $p$ probability to get $+1$, or $1-p$ probability to get $-1$.

Then $\sum{X_i}$ after $n$ games have expectation of $n(2 p-1)$, and variance of $4np(1-p)$. If you bet $a$ dollars each time, this will become $na(2p-1)$ and $4na^2p(1-p)$, and there is a trade off here for step size and total number of playing.

If you have winning probability less than $0.5$, then you don’t want to play too much rounds, a large variance strategy gives you a larger chance to meet your target.

Another idea

If you choose to invest only 10 dollars each time, this is is actually a gambler’s ruin problem! You are at 100, you want to maximize your probability that you hit 500 before you hit 0. The step size is the amount you bet each time. Then this becomes an optimization problem with step size as your flexible variable.

Kelly

The question mentions a strategy that bets half of your fortune each time, this is not justify by the Kelly criteria that maximizes your log wealth. According to Kelly, the best proportion here should be $\pi=2p-1=0.2$ only when you have winning prob greater than 0.5.

Absolute value of x - y

Two uniforms from 0 to 1. What is the expectation of their absolute difference?
The easy solution is that. We know for $n$ uniforms. The max and min expectation is $\frac{1}{n+1}$ and $\frac{n}{n+1}$. In this case, their difference should be $\frac{n-1}{n+1}$.

or else, you could use the integral. $\int_0^1 \int_0^1|x-y|dx dy$.

Products of numbers with and without replacement

Given $S=\{1,2, \ldots, 10\}$, you draw two numbers with replacement and compute the product. Also, you draw two numbers without replacement and compute the product.
- Using basic arithmetic/reasoning (don't explicitly compute the expected values), which would you expect to have a larger value?
- Find the expectation of both products.

https://math.stackexchange.com/questions/3865110/expected-product-of-two-numbers-with-and-without-replacement

Raining on weekends

50% of raining on Saturday, 80% of raining on Sunday. What is the range of probability for raining on weekend. And the corrlation between raining on Saturday and Sunday.
## Bayes Probability
Richard runs a blivet making factory. Unfortunately, $20 \%$ of the blivets Richard makes are defective. Defective blivets fail $10 \%$ of the time, but good (non-defective) blivets never fail. How many times does Richard have to test a blivet that comes off his assembly line in order for him to be at least $95 \%$ sure it is good?

This is a slight modification of Bayes, denote $A$ as the event that the chosen blivet is a good one. And the event $B$ as the chosen blivet passed our test. For a defective blivet, the probability that it well pass the test is $0.9^n$, well a good blivet passes with probability $1$. And we want to choose $n$ to make $P(A|B)$ bigger than $0.95$.

The answer at last should be $15$.

Measure two sticks

Suppose you have two sticks A and B to measure. But you only have a ruler that gives the error follows $N(0,\sigma^2)$. What you can do to measure the sticks better?

A very detailed and clear solution is offered here: https://stats.stackexchange.com/questions/491677/measuring-sticks-to-minimize-error

The key to the question is the properties of normal (how the variance can be added up and shrunk by a constant). You should measure the sum and the difference of the sticks, then use the average to lower the errors.

$$ \begin{aligned} S &\sim N\left(\theta_A+\theta_B, \sigma^2\right)\\ D &\sim N\left(\theta_A-\theta_B, \sigma^2\right)\\ X_A&=\frac{S+D}{2} \sim N\left(\theta_A, \frac{\sigma^2}{2}\right) \end{aligned} $$

Sum of 100 and 600 Dice Rolls

Throw a fair dice 100 times, let $X$ be the sum of the results; flip a fair coin 600 times, let Y be the number of heads. What is the probability that $P(X <Y)$?

Use the Central limit theorem(or just a sum of independent variables), we know $X\sim N(350, 100\times\frac{35}{12})$, and $Y\sim N(300,600\times\frac{1}{4})$. $P(X<Y)=P(X-Y<0)=P(N(50,\frac{875}{3}+150)<0)$. Then just express it using the standard normal to derive the probability.

Probability of Even Number of Heads

Denote the probability of even heads as $p_n$ after $n$ tosses. When $n=0$ then obviously, $p_0=1$, and $p_1=0.5$.

Consider the recursive relationship. If as of $n-1$ tosses, the number of heads is even, then only the $n-th$ toss is tail can the probability retain. Conversely, only odd number of heads plus another head can become even number of heads. Thus, we can derive the following,

$$ \begin{aligned} p_{n}&=p\times(1-p_{n-1})+(1-p)\times p_{n-1}\\ p_n&=(1-2p)p_{n-1}+p \end{aligned} $$

When $p=0.5$, then $p_n=0.5$ which is a constant.

Simulate an Unfair Coin in Arbitrary Probability

How to use a fair coin but simulate an unfair coin with any given probability? Say we want to simualte a coin with 0.3 probability of head.

Start from an interval from 0-1. Every time we toss the fair coin, we pick half of the interval. Say we have tossed HTT.

Then we take $0,0.5$, and $0.25, 0.5$, and $0.375, 0.5$ in sequence until the remaining interval is outside totally in side or outside the given probability $p$, which in this case is $0.3$. Since the final interval is wholly outside of $p$, we take this experiment as a T.

Why this works? Because we partition the interval into $\frac{1}{2^n}$ smaller pieces after $n$ tosses, and each small piece has the same probability to be chosen. If after some tosses, the interval is beyond the given $p$, then no matter how granular we partition the interval, it is not possible for the final piece to lie in that part, and we could stop. No matter how small $p$ is, we could always choose a $n$ such that make $\frac{1}{2^n}$ to make that happen. If $n$ is arbitrary large, and the interval is sufficiently small, then of course the length of the interval can be omitted and we are as if choosing uniforms from the unit interval.

Squared distance and absolute distance

You have $n$ real numbers. Find a number $x$ such that the sum of squared distance is the smallest. Find another $x$ such that the sum of absolute value distance is the smallest.

Reference

Let’s consider these as the optimization problem as use the derivative to solve. The first problem is to find $x$ to make $\sum (s_i - x)^2$ smallest. Take the derivative and set it equal to zero, then we have that $x$ is the mean of the numbers.

Similarly, for the second problem, the goal is $\arg \min x \sum{i=1}^N\left|s_i-x\right|$.

Recall that $\frac{\mathrm{d}|x|}{\mathrm{d} x}=\operatorname{sign}(x)$. we have the derivative $\sum_{i=1}^N \operatorname{sign}\left(s_i-x\right)$. Set this to zero means we want to find $x$ making half of the number greater than it and half smaller than it, indicating $x$ has to be the median.

Maximize the Group Products

Given a positive integer say $n$, divide it into several integers sum equal to $n$. Say $n=5$, and divide it into $2+3$, then the max product is $6$. Find a strategy to find this optimal product.

The product is maximized if each value is the same (almost at least). Say this value is $x$, and we have $\frac{n}{x}$ groups. Now let’s try to maximize $x^{\frac{n}{x}}$. Denote $\log y=\frac{n}{x}\log x$, then we have $\frac{dy}{dx}=x^{\frac{n}{x}-2}n(1-\log x)$. Set the derivate to zero then we have $x=e$. That means each integer should only be $2$ or $3$. But we also notice that for each $6$, $2^3<3^2$, so we prefer $3$ over $2$.

Thus, the strategy is to find first as many $3$, and the rest should either be $2$ or $4=2\times2$.

Uniform Points on a Circle

What is the variance of a single component of a point chosen from a ball sphere with radius $r$.

The point has a coordinate $(X,Y,Z)$, and $X^2+Y^2+Z^2=r^2$. Due to symmetry, we only need to drive $Var(X)$. We know that $E(X)=0$ due to symmetry also. Then we only need to find $E(X^2)$. $X,Y,Z$ follows the same distribution, and then $3E(X^2)=r^2$, which gives $E(X^2)=\frac{r^2}{3}$.

Approximation of $0.99^{100}$

This is a very frequently used trick that is take log and use $\log{(1+x)}\approx x$. Denote $x=0.99^{100}$, we have $\log x=100 \log(1-0.01)\approx 100\times(-0.01)=-1$. Thus $x=e^{-1}$.

Paper Roll

You have a paper roll of radius $R$ and $K$ layers, suppose you take out at a constant speed of $v$. Derive a differential how the remaining roll radius $r$ change with time $t$.

Each layer has some thickness, which is $\frac{R}{K}$. Let’s omit the length of the row (since it is irrelevant), but only as a circle. Then the remaining area after time $t$ is $\pi R^2-\pi r^2$, if we denote the remaining radius as $r$.

On the other side, after time t, we have extract $vt\times\frac{R}{K}$ area from the circle. The reduction of the area should equal to the extraction, which yields $\frac{dr}{dt}=\frac{-vR}{2K\pi r}$.

Three Dice Roll Median

Compare the following three expectations. Roll a dice once, the expectation of the number squared. Roll a dice twice, the expectation of the product. Roll a dice three times, the expectation of median squared.

The first problem is simple, just need to calculate the expectation of squared. For the second, since the two tosses are independent, so $E(XY)=E(X)E(Y)=3.5^2=12.25$.

For the third problem, first solve it using the intuition. Ignore the square first, the distribution of numbers are symmetric around $3.5$, so the median must equal to mean.

Well, we are calculating the expectation of median squared, and it should be larger than the square of expected median. This is like $E(f(X))$ and $f(E(X))$. Here $f(x)=x^2$ which is a convex function, so according to Jensen’s inequality, we do have $f(\mathrm{E}[X]) \leq \mathrm{E}[f(X)]$. That is, the median squared expectation is larger than $12.25$.

How to compare the first and third value, name $E(X_1)$ and $E(X_3)$? Intuitively, the median is more centralized in its distribution. Well the $X_1$ has equal distribution on $1^2, 2^2,\cdots,6^2$, but the values grow quadratically fast. Some simulation says $E(X_1)$ is larger actually.

How you could compute mathematically the value of the three dice median? See the reference here.

Probability of Uniform Distances

Suppose X and Y are uniforms on (0,L), derive the P(|X-Y|<a).
Recall the classic problem that two people can meet during an interval, a geometric method should be the easiest way. The probability should be $(L^2-(L-a)^2)/L^2$. How we are able to solve it using the calculus?

Begin by using symmetry and conditional probability. $\mathbb{P}(|X-Y|<a)=2 \mathbb{P}(X-Y<a \mid X>Y)$.

The key is to that we need to discuss values of $X$, that is $\mathbb{P}(X-Y<a \mid X>Y)=\int_0^a \int_0^x \frac{1}{L^2} d y d x+\int_a^L \int_{x-a}^x \frac{1}{L^2} d y d x$. When $0<X<a$ and $X>a$, the boundary of $Y$ changes.

Finally, twice of the integration value will become $\mathbb{P}(|X-Y|<a)=2 \cdot\left(\frac{a}{L}-\frac{a^2}{2 L^2}\right)=\frac{2 a}{L}-\frac{a^2}{L^2}$.

Yiming Zhang
Yiming Zhang
Quantitative Researcher Associate, JP Morgan