Probability Problems 01

Table of Contents

Stick broken in $n$ pieces

We have a stick of length 1. The stick got broken at (n-1) randomly chosen points (lengths of parts can be non-integer or floating point numbers also) so we get n parts. We need to find the probability that these n pieces can form a n sided polygon.

The n pieces can form a n sided polygon if and only if none of the them are greater than $1/2$. Now think about this stick as a circle of perimeter 1. And then we place $n$ points on the circle.

These $n$ points break the circle into $n$ pieces, and we treat the first point as the beginning / ending the original stick. Then this problem is equivalent to the stick problem, with $n-1$ break points and $n$ pieces.

Now, derive the probability that the longest pieces has a length of $1/2$, which is equivalent to another problem that all of the points lie on the same semicircle. which is $\frac{n}{2^{n-1}}$. Why? Since the $n-1$ break points are independent and have $1/2$ probability to lie in the semicircle after the “first point”, and each point also equally likely to be the “first point”, which justifies the probability.

Therefore, our desired event, that none of the pieces is larger than $1/2$ is simply $1-\frac{n}{2^{n-1}}$.

Bit Transmission

An electronic device is being devised for transmitting 1 bit of information at a time. So far, the device is not $100 \%$ reliable. With the probability $p$ the bit is transmitted correctly; or else it's flipped. $n$ such devices are chained output-to-input and the first receives a bit. What is the probability that the bit will be transmitted correctly through the chain of $n$ devices?

It is absolutely necessary to know the binomial theorem.

$$ (a+b)^{n}=\sum_{k=0}^{n}\left(\begin{array}{l} n \\ k \end{array}\right) a^{n-k} b^{k} $$

Only when the number of failures are even can we get the right signal. Using the probability mass function of binomial distribution, multiply a series returning $1,0,1,0\cdots$, and sum the probability. Then we get,

$$ \begin{aligned} P &=\sum_{k=0}^{n}\left(\begin{array}{l} n \\ k \end{array}\right) p^{n-k}(1-p)^{k} \cdot \frac{1+(-1)^{k}}{2} \\ &=\frac{1}{2}\left(\sum_{k=0}^{n}\left(\begin{array}{l} n \\ k \end{array}\right) p^{n-k}(1-p)^{k}\right)+\frac{1}{2}\left(\sum_{k=0}^{n}\left(\begin{array}{l} n \\ k \end{array}\right) p^{n-k}(p-1)^{k}\right) \\ &=\frac{1}{2}\left((p+(1-p))^{n}+(2 p-1)^{n}\right) \\ &=\frac{1}{2}+\frac{1}{2}(2 p-1)^{n} \end{aligned} $$

Average Number of Runs

A coin is tossed $N$ times. What is the expected value (average) of the number of runs? Define a run as contiguous equal outcomes, like"HH", "TTT", "T"

Consider two tosses, where we have $1/2$ probability that the two tosses are different, which creates a run in the middle. So for $N$ tosses, we have $N-1$ such pairs, which means $\frac{N-1}{2}$ new runs. Add the first run, we have $\frac{N+1}{2}$.

Average Number of Runs in a Sequence of Random Numbers

$M$ numbers are randomly selected from the set $\{1, \ldots, N\}$. A run is either a decreasing or an increasing subsequence of maximum stretch, i.e., "Subruns" do not count as runs.
Assuming no number was selected more than once in succession, what is the average number of runs?

Note that numbers can be repeatedly draw, but the same number will not show up in succession.

Define $E_k$ as the probability that the the $k^{th}$ number $x_k$ in the sequence M doesn’t break a run. That means, $x_{k-1} < x_k <x_{k+1}$ or $x_{k-1} > x_k >x_{k+1}$. The three numbers have a total of $N(N-1)^2$ choices, but only $n$ choose $3$ cases have three different numbers, and only two permutations to make them monotonic increasing or decreasing. So overall, $P(E_k)=$

$$ 2\times\frac{\left(\begin{array}{c} N \\ 3 \end{array}\right)}{N(N-1)^{2}} $$

Then, the probability that there is a break (which creates a new run) here equals $1-P(E_k) = p_k$. Expected number of new runs to sum all $p_k$ from $2$ to $M-1$.

$$ \begin{aligned} p_{k} &=1-2 \frac{\left(\begin{array}{l} N \\ 3 \end{array}\right)}{N(N-1)^{2}}=1-2 \frac{N(N-1)(N-2)}{3 ! \cdot N(N-1)^{2}} \\ &=1-\frac{N-2}{3(N-1)} \\ &=\frac{2 N-1}{3(N-1)} \\ \end{aligned} $$

The boundary condition, $p_0=1$ and $p_M=0$, the first number creates a new run for sure, while the last number is unable to create any. So overall the total number of runs are

$$ 1+\sum_{k=2}^{M-1} p_{k}=1+(M-2) \frac{2 N-1}{3(N-1)} $$

Book Index Range

The index of a book lists every page on which certain words appear. To save space these are listed in ranges; for example, if a word occurs on pages $1,2,3,5,8$, and $9$ , then its index contains ranges: $1-3,5,8-9$.

A certain word appears on each page of an $n$-page book $(n>0)$ independently with probability $p$. Find the expected number of entries in its index entry.

Average Visibility of Moviegoers

At a movie theater, moviegoers line up to buy tickets. The ticket seller calls a patron viewable if he/she is taller than all the people in front of him/her in line, otherwise, the patron is hidden. Given that no two patrons are precisely the same height, what is the average number of viewable patrons among all possible permutations in a line of $n$ moviegoers.

The first person is viewable for sure. The second person has $1/2$ to be taller and viewable. The third person has $1/3$ probability to be taller than the first 2 people to be viewable. Thus, for the $i$th people, he has $1/i$ probability to be the tallest. Overall, for $n$ people we have, $$ \sum_{i=1}^{n} \frac{1}{i} \approx \ln n+\gamma $$ where $\gamma=0.5772157 \ldots$, Euler’s constant.

For further reference, please see Wiki

Number of Ways for Ball to Come Back

There are 4 basketball players $A, B, C, D$.Initially the ball is with $A$.The ball is always passed from one person to a different person.In how many ways can the ball come back to $A$ after seven passes?

Let the number of circuits that get back to $A$ after $n$ passes be $f(n)$

We have two cases. Either pass $n-2$ is to $A$ or it is not.

If pass $n-2$ is to $A$, then the pass $n-1$ can be to any other three players, and the final pass is to $A$ again. Thus, the number of such circuits is $3 f(n-2)$.

If pass $n-2$ is not to $A$, then pass $n-1$ has to be one of the other two people other than $A$, so that the final pass is to $A$. The number of such circuits is $2 f(n-1)$.

Therefore, we get the recursion

$$ f(n)=2 f(n-1)+3 f(n-2) $$

with $f(0)=1$ and $f(1)=0$. This linear recurrence has the solution

$$ f(n)=\frac{3^{n}+3(-1)^{n}}{4} $$

Switching 100 Light Bulbs

There are 100 light bulbs lined up in a row in a long room. Each bulb has its own switch and is currently switched off. The room has an entry door and an exit door. There are 100 people lined up outside the entry door. Each bulb is numbered consecutively from 1 to 100. So is each person.

Person No. 1 enters the room, switches on every bulb, and exits. Person No. 2 enters and flips the switch on every second bulb (turning off bulbs 2, 4, 6…). Person No. 3 enters and flips the switch on every third bulb (changing the state on bulbs 3, 6, 9…). This continues until all 100 people have passed through the room.

What is the final state of bulb No. 64? And how many of the light bulbs are illuminated after the 100th person has passed through the room?

For No.64, how many people have switched? 1, 2, 4, 8, 16, 32, 64. Note all these factors can be paired to get 64 (which means the status of light is unchanged), except for 8, because the 64 is a perfect square number. Similarly for all the other numbers. Only the square number will be turned on. Which means 10 lights in these 100.

Concerning Even Number of Heads

Flipping a coin, a heads comes up with the probability of p
  1. What is the probability of having an even number of heads in $n$ flips?

  2. What is the expected number of flips before getting an even number of heads the first time?

Denote $P(n)$ be the probability that after $n$ tosses, the probability of getting even number of heads. Then probability of odd numbers will be $1-P(n)$.

By inspection, we can get the recursive formula here,

$$ P(n+1)=p(1-P(n))+(1-p)P(n) $$

With the initial condition that $P(0)=1$. (0 heads is even)

Assume the form of solution is $P(n)=A+B a^{n}$, we can derive $P(n)=\frac{1}{2}+\frac{1}{2}(1-2 p)^{n}$.

You should see how this is similar to the problem of “Bit Transmissions”. Even without assuming the solution, you should be able to derive in the same fashion.

Second Part

To get the first appearance of even number of heads in $n$ tosses, then

  1. The last toss has to be a head
  2. The total number of heads is 2
  3. The first head is in the first $n-1$ tosses

So the expectation can be calculated as the following. $n$ is the number of tosses. $n-1$ is the number of choices for the first head, and then follows the probability of getting 2 heads and $n-2$ tails.

$$ E(p)=\sum_{n=2}^{\infty} n(n-1)(1-p)^{n-2} p^{2}=\frac{2}{p} $$

The derivation itself is an exercise.

Hint: $n(n-1)(1-p)^{n-2}$ can be twice of the derivatives of $x^n$. After the summation, take differentiation twice, and replace $x$ with $1-p$.

Another quick solution I think should be viewing the two heads as two identical event. That is we spend $1/p$ steps to view the first head. And then another $1/p$ steps for the second.

Expectation of Interval Length on Circle

Two points, $X$ and $Y$ are uniformly distributed on a circle of circumference $1 .$
1. What is the expected arc length of interval $X Y ?$
2. Point $P$ is fixed on the circle. What is the expected arc length of interval $X Y$ that contains point $P$ ?
3. What if there are 3 random points?
4. What if there are $n$ random points?

By symmetry, $XY+YX=1$, the two arc lengths sum to 1. So each average length of either is $0.5$.

Similarly, when we have $n$ points, the expected length of one arc length is $1/n$.

Now cut this circle from fixed point $P$, and consider it as the end point on an interval, so it belongs to the average lengths that is twice than the other intervals. The expected length is $\frac{2}{n+1}$.

Dropping Numbers into a Square

Place the numbers $1,2, \ldots, 9$ at random so that they fill a $3 \times 3$ grid.
1. What is the probability that each of the row sums and each of the column sums is odd?
2. What if both diagonal sums are also required to be odd?
3. What if exactly one of the diagonal sums is also required to be odd?

Note that there are 5 odd numbers in range 1 to 9. To make a column or row sums odd, then it needs to contain 1 or 3 odd numbers. To fit for 6 rows/columns. The only situation is that, all the 5 odd numbers have to be in one row and one column (with an intersection which saves a odd number)

There are $9!$ total choices to place numbers. There are 9 ways to choose a combination of row and column. There are $5!$ number of choices to place odd numbers, and there are $4!$ number of choices to place even numbers.

Thus, overall, we have $\frac{9 \times 5 ! \times 4 !}{9 !}=\frac{1}{14}$.

There is only one combination of row and column to make diagonal sums be odd, which is choose the second column and second row. Thus, $\frac{5 ! \times 4 !}{9 !}=\frac{1}{126}$.

Finally, $\frac{4 \times 5 ! \times 4 !}{9 !}=\frac{4}{126}=\frac{2}{63}$.

Expectation of Pairings

https://www.cut-the-knot.org/Probability/ExpectationOfPairings.shtml

Suppose that 8 boys and 12 girls line up in a row. Let $S$ be the number of places in the row where a boy and a girl are standing next to each other. For example, for the row

GBBGGGBGBGGGBGBGGBGB

we have that $S=13$. Find the average value of $S$ (if all possible orders of these 20 people are considered).

This is almost the same problem as the average number of runs problem of coins.

There are total of $m+n$ people with $m$ boys and $n$ girls. Think of them as pairs, and then there are $m+n-1$ pairs.

For each pair, there is $\frac{m}{m+n}$ probability to choose a boy, and $\frac{n}{m+n-1}$ probability to choose a girl or inverse ($\frac{n}{m+n}$ to choose a girl), so the probability here has to be doubled. (Note, this probability is like taking out with replacement, which is still irrelevant to the remaining number of people)

So the expectation is:

$$ E=(n+m-1) \cdot \frac{2 n m}{(n+m)(n+m-1)}=\frac{2 n m}{n+m} $$
import random

def count_bg_pairs(lineup):
    count = 0
    for i in range(len(lineup) - 1):
        if (lineup[i], lineup[i+1]) in [('B', 'G'), ('G', 'B')]:
            count += 1
    return count

def simulate_average_bg_pairs(num_boys, num_girls, num_trials=100000):
    people = ['B'] * num_boys + ['G'] * num_girls
    total_pairs = 0

    for _ in range(num_trials):
        random.shuffle(people)
        total_pairs += count_bg_pairs(people)

    return total_pairs / num_trials

# Example usage
num_boys = 8
num_girls = 12
average_pairs = simulate_average_bg_pairs(num_boys, num_girls)
print(average_pairs)

Expectation of the Largest Number

https://www.cut-the-knot.org/Probability/ExpectationOfMaximum.shtml

$n$ balls are drawn, without replacement, from an urn containing $N>n$ balls, numbered 1 to $N$.
What is the expected value of the largest number drawn?

Suppose the largest number is $k$, then the remaining $n-1$ numbers have to be taken from $1$ to $k-1$.

$$ \begin{aligned} E &=\frac{\sum_{k=n}^{N} k\left(\begin{array}{l} k-1 \\ n-1 \end{array}\right)}{\left(\begin{array}{l} N \\ n \end{array}\right)}=\frac{\sum_{k=n}^{N} n\left(\begin{array}{l} k \\ n \end{array}\right)}{\left(\begin{array}{l} N \\ n \end{array}\right)} \\ &=\frac{n}{\left(\begin{array}{l} N \\ n \end{array}\right)}\left[\left(\begin{array}{l} n \\ n \end{array}\right)+\left(\begin{array}{c} n+1 \\ n \end{array}\right)+\ldots+\left(\begin{array}{c} N \\ n \end{array}\right)\right] \\ &=\frac{n}{\left(\begin{array}{l} N \\ n \end{array}\right)}\left(\begin{array}{l} N+1 \\ n+1 \end{array}\right)=\frac{n(N+1)}{(n+1)} . \end{aligned} $$

Note1

One of the tricks used above is, refer to [Hockey-stick identity](Hockey-stick identity - Wikipedia)

$$ \sum_{m=k}^n\left(\begin{array}{c} m \\ k \end{array}\right)=\left(\begin{array}{l} n+1 \\ k+1 \end{array}\right) $$

Note2

Note the answer looks really similar to a problem where we draw $n$ balls from a uniform distribution (no chance for the balls to be the same, which is equal to “without replacement”). Then the largest number should be $\frac{n}{n+1}$. Here, we see that in a discrete integers space, a multiplier $N+1$ is added.

Same Number of Heads

A fair coin is tossed $n$ times by two people. What is the probability that they get same number of heads?

Say Alice and Bob.

Alice has probability $p$ to get $k$ heads, while Bob has also probability $p$ to get $k$ heads, or $n-k$ heads due to the fair coin. Thus, the probability that they have same number of heads equal to the probability that they have $n$ heads in total from $2n$ tosses.

Therefore, the probability is $\frac{1}{2^{2 n}}\left(\begin{array}{c} 2 n \ n \end{array}\right)$.

Yiming Zhang
Yiming Zhang
Quantitative Researcher Associate, JP Morgan