Random Points on a Circle Riddler
Last week’s Riddler was about determining the probability that randomly distributed points on a circle all belong to a common same half-circle. It seemed like it should be pretty easy to solve by hand (edit: and it totally was), but for whatever reason I couldn’t piece out a solution. Instead, I settled for simulating the problem and making some pretty explanatory pictures.
Here’s the prompt copied verbatim from FiveThirtyEight for April 19:
If \(N\) points are generated at random places on the perimeter of a circle, what is the probability that you can pick a diameter such that all of those points are on only one side of the newly halved circle?
Determining When Points Belong to the Same Hemisphere
To simulate the solution, I used a simple geometric trick to detect when points lie in the same half-circle. Let \(C\) denote the unit circle in \(\mathbb{R}^2\). Then, \(p_1,\ldots, p_n \in C\) belong to the same hemisphere of \(C\) just in case there exists \(i \in \{1,\ldots,n\}\) for which either \(\min\left(\langle q_i, p_j \rangle\right) \geq 0\) or \(\max\left(\langle q_i, p_j \rangle\right) \leq 0\), where \(q_i\) is a non-zero perpendicular vector to \(p_i\).
To see that this works, note that the points \(p_k\) \(k=1,\ldots,n\) lie in the same half-circle iff we can split the circle \(C\) along a line \(l\) passing through the origin and one of the points \(p_i\), so that all points lie on the same side of \(l\). In turn, all points \(p_i\) lie on the same side of \(l\) iff they all share the same sign when projected onto a normal \(q_i\) to \(l\). Finally, note that \(q_i\) will be normal to \(l\) just in case \(q_i\) is perpendicular to \(p_i\) (since \(l\) passes through \(0\) and \(p_i\)). The following image shows how things work:
The function below uses this observation to determine whether angles \(\theta_i\), corresponding to points \(p_i\), lie in the same half-circle. It returns a tuple with a flag indicating whether a splitting line \(l\) exists (or not), and the index of the point \(p_i\) through which \(l\) passes. In the case that no split exists, it returns NaN
in the second entry of the tuple.
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
def thetas_on_same_half(thetas):
thetas = thetas.flatten()
n_thetas = thetas.size
ps = np.array([np.cos(thetas), np.sin(thetas)]).T
for i in range(n_thetas):
q = np.array([ps[i,1], -ps[i,0]]).reshape((2,1))
# be careful with the dot product p_i * q_i
# this will only be zero to machine precision,
# so it may be slightly positive or negative
# make it actually zero to avoid this
same_side_as_q = np.dot(ps, q)
same_side_as_q[i] = 0.0
proj_min = np.min(same_side_as_q)
proj_max = np.max(same_side_as_q)
if proj_min >= 0 or proj_max <= 0:
return True, i
return False, np.nan
Splitting Visualizations
I wanted to get a sense (visually) of the frequency with which the points belong to the same hemisphere. So, I simulated 100 draws of \(n\) random points for \(n=2,\ldots,5\) points. For each draw, I plotted the points on a circle. If there was a line splitting the circle into a hemisphere containing all of the points and an empty hemisphere, then I drew the splitting line (in green) running through the point \(p_i\) described as above.
def plot_splits(n_points, n_row):
n_runs = n_row*n_row
thetas = 2.0 * np.pi * np.random.rand(n_points, n_runs)
results = np.zeros(n_runs, dtype='bool')
idx = np.zeros(n_runs)
ivals = []
fig, axs = plt.subplots(n_row, n_row, figsize=(6,6))
theta_circle = np.linspace(0, 2*np.pi, 1000)
x_circle = np.cos(theta_circle)
y_circle = np.sin(theta_circle)
for i in range(n_runs):
results[i], idx[i] = thetas_on_same_half(thetas[:, i])
for i in range(n_runs):
ax = axs[i % n_row, int(i / n_row)]
if results[i]:
idx_split = int(idx[i])
theta_split = thetas[idx_split, i]
x1, y1 = np.cos(theta_split), np.sin(theta_split)
ax.plot([x1,-x1],[y1,-y1], c='g')
ax.scatter(np.cos(thetas[:,i]), np.sin(thetas[:,i]), c='C0')
ax.plot(x_circle, y_circle, c='k')
ax.set_xlim(-1.01, 1.01)
ax.set_ylim(-1.01, 1.01)
ax.set_xticks([],[])
ax.set_yticks([],[])
fig.suptitle("Examples for n = {0} Points \n(Splits exist in {1}/{2} cases)".format(n_points,
len(results[results == True]),
n_runs))
plt.show()
np.random.seed(797979)
for n_points in range(2, 5+1):
plot_splits(n_points = n_points, n_row = 10)
Simulating the Solution
Since I didn’t solve the problem analytically, the only thing I had left to do was to estimate the splitting probabilities for a range of different points. I estimated the probabilities for \(n = 2,\ldots,20\) points, using 25,000 draws of \(n\) random points for each \(n\).
def estimate_split_prob(n_points, n_runs):
thetas = 2.0 * np.pi * np.random.rand(n_points, n_runs)
results = np.zeros(n_runs, dtype='bool')
for i in range(n_runs):
results[i], _ = thetas_on_same_half(thetas[:, i])
p = len(results[results]) / n_runs
return p
n_runs = 25000
n_points = np.arange(2, 20+1, dtype='int')
probs = np.zeros(n_points.size)
np.random.seed(797979)
for i, nps in enumerate(n_points):
probs[i] = estimate_split_prob(nps, n_runs)
fig, ax = plt.subplots()
ax.plot(n_points, probs,drawstyle="steps-post")
ax.set_xlim(np.min(n_points), np.max(n_points))
ax.set_ylim(0, 1.05)
ax.set_xlabel("N Points")
ax.set_ylabel("Probability of Split")
plt.show()
print("N_pts\tProbability of Split")
print("-------------")
for i, nps in enumerate(n_points):
print("{0}\t{1}".format(nps, probs[i]))
N_pts Probability of Split
-------------
2 1.0
3 0.753
4 0.50792
5 0.31324
6 0.18852
7 0.11008
8 0.06452
9 0.037
10 0.01952
11 0.01088
12 0.00644
13 0.00312
14 0.00172
15 0.00076
16 0.0004
17 0.00028
18 8e-05
19 4e-05
20 0.0