Probability of dart hitting a point given that it hits a point in a finite set of points


I understand the probability of the dart hitting the center point is 0, and I think this is used as an example of how a probability of 0 doesn't mean something is impossible.

Now imagine we define 3 arbitrary points on the dart board: the center point C, and the two points midway between the center and the top and bottom edges, called T and B respectively. Also let's suppose the player throws the dart in such a way that it's equally likely to hit any point on the board. What's the probability of the dart having landed on C given that we know it landed on one of the 3 points?

Intuitively the answer seems to be 1/3. But working out the math gives an undefined result if I attempt to solve it as P(C|T∪C∪B) = P(C)/P(T∪C∪B) = 0/0. Is there a way to solve this properly? Or is the only way to just assign equal probabilities to all points and ignore the fact that the probability of hitting any one point is zero?


EDIT: I think I made some progress thanks to your comments, but I hope someone smarter than myself can comment on my attempted solution. I think the zero probability comes from the division 1/∞, which as far as I know is by itself not well defined. The full rigorous expression would be

$\lim_{x \to \infty} \frac{1}{x}$

Intuitively $x$ is the number of points on the board, so the probability of the dart landing on any particular point is $1/x$ and the probability of landing on any of the 3 points is $3/x$, so the full probability now becomes:

$\lim_{x \to \infty} \frac{1/x}{3/x}$

Which, if I'm not mistaken, is just 1/3.
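As a sanity check of this discretization argument, here is a minimal sketch in Python, assuming a hypothetical finite board of $x$ equally likely points (the continuous board being the limit $x \to \infty$). The ratio is exactly 1/3 for every finite $x$, so the limit is 1/3 as well:

```python
from fractions import Fraction

# Hypothetical finite model: the board has x equally likely points.
# The continuous board corresponds to the limit x -> infinity.
for x in [10, 1_000, 1_000_000]:
    p_point = Fraction(1, x)   # P(dart hits one particular point)
    p_three = Fraction(3, x)   # P(dart hits one of the 3 marked points)
    ratio = p_point / p_three  # the conditional probability in the finite model
    print(x, ratio)            # always 1/3, independent of x
```

Since the ratio does not depend on $x$ at all, no delicate limit argument is needed in the finite model; the division by zero only appears after passing to the limit in numerator and denominator separately.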


EDIT 2: I get the impression that most comments and answers here suggest that it's impossible to calculate these probabilities unless we come up with some ad hoc definitions. I just want to clarify that the dart board is a metaphor for a random point in a circle which, in my above example, has a uniform distribution.

Since the example is so trivial it provides little motivation to actually solve it, here is another example that is a little less trivial, based on @Vincent's comment.

Imagine a random real generator R that generates a real number from -1 to +1 according to a probability density function D. Also imagine that I wrap R in another function F that returns the absolute value of the number produced by R, i.e. F = ABS(R()).

So, let's say we run F and it outputs $n$. What's the probability that the number generated by R was actually $n$ (as opposed to $-n$), given that we know the density function D?

If I'm not mistaken, the probability is just $\frac{D(n)}{D(-n)+D(n)}$, which I can't prove but intuitively seems right.

Applying the same logic to the original dart problem would again give 1/3 without having to deal with divisions by zero (at least not explicitly) and without the need for any ad hoc definition.

There are two answers below.

Best answer:

There is no general definition for conditioning on a particular event of probability zero. You can make your own definition if you like, as in the comment of Snoop, although its usefulness is limited. Conditional probabilities are useful because they can be summed or integrated, as in $$ P[A]=\sum_{i=1}^{\infty} P[A|B_i]P[B_i] \quad, \quad E[X]=E[E[X|Y]],$$ and single events of probability zero do not affect the sum or integral.

Suppose $Y$ is a continuous random variable with a well defined PDF (in particular, $P[Y=y]=0$ for all $y \in \mathbb{R}$). Suppose $A$ is an event. Then there are many versions of $E[1_A|Y]$. All versions have the form $h(Y)$ for some measurable function $h:\mathbb{R}\rightarrow\mathbb{R}$, and any two versions $h(Y)$ and $\tilde{h}(Y)$ satisfy $P[h(Y)=\tilde{h}(Y)]=1$. You can take any particular version $h(Y)$ and "define" $$P[A|Y=y]=h(y) \quad \forall y \in\mathbb{R} \quad (Eq. *)$$ The understanding is that this definition is only meaningful "in the aggregate." It is useful for the "vast majority" of $y\in \mathbb{R}$, but it need not make any sense for a particular finite or countably infinite set of $y$ values in $\mathbb{R}$. You can change the value of $h(0.4)$ to anything you like and it will not change $\int_{-\infty}^{\infty} h(y)f_Y(y)dy$.
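A minimal numerical illustration of this "versions agree in the aggregate" point, assuming for concreteness that $Y \sim \mathrm{Unif}[0,1]$: redefining $h$ at the single point $0.4$ produces another valid version, and a Monte Carlo estimate of $E[h(Y)]$ cannot tell the two apart, because $Y = 0.4$ exactly has probability zero.

```python
import random

def h(y):
    # One version of P[A|Y=y]: constant 1/2 on [0, 1].
    return 0.5 if 0 <= y <= 1 else 0.0

def h_tilde(y):
    # Another version: identical to h except at the single point y = 0.4.
    return 0.9 if y == 0.4 else h(y)

random.seed(1)
ys = [random.random() for _ in range(100_000)]   # Y ~ Unif[0, 1]

print(sum(map(h, ys)) / len(ys))        # 0.5 exactly
print(sum(map(h_tilde, ys)) / len(ys))  # identical: Y == 0.4 never occurs
```

The two averages agree sample-for-sample, mirroring the fact that changing $h(0.4)$ does not change $\int_{-\infty}^{\infty} h(y)f_Y(y)\,dy$.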


You can see what happens when you cast your problem into $E[X|Y]$ notation. Let $(U,V)$ denote the random location of the dart, assumed to be uniform over a ball of radius 1. Fix points $(a_1, b_1), (a_2,b_2), (a_3,b_3)$ in the ball and define random variables $X$ and $Y$ by indicator functions: \begin{align} X &= 1_{\{(U,V)=(a_1,b_1)\}}\\ Y&=1_{\{(U,V)\in \{(a_1,b_1), (a_2, b_2), (a_3,b_3)\}\}} \end{align} Then $E[X|Y]$ has infinitely many versions. Since $P[Y=1]=0$, a random variable $Z$ is a version of the conditional expectation of $X$ given $Y$ if and only if $Z=h(Y)$ for some function $h:\{0,1\}\rightarrow\mathbb{R}$ that satisfies $h(0)=0$. That means $h(1)$ is allowed to be any number you like. All such functions $h$ satisfy $P[h(Y)=0]=1$.

So if we define $$P[(U,V)=(a_1,b_1)| (U,V)\in\{(a_1, b_1), (a_2,b_2),(a_3,b_3)\}]=h(1)$$ we see that this value $h(1)$ can take any real number (even negative numbers, or numbers larger than 1). It does not affect anything since $P[(U,V)\in\{(a_1, b_1), (a_2,b_2),(a_3,b_3)\}]=0$.


Weird example: If we assume the above random vector $(U,V)$ can take any value in the set $B=\{(u,v):u^2+v^2\leq 1\}$, and is uniformly distributed over $B$, we can define a random vector $(R,S)$ by $$(R,S) = \left\{\begin{array}{cc} (U,V) & \mbox{if $(U,V) \notin \{(a_2, b_2), (a_3,b_3)\}$}\\ (a_1,b_1) &\mbox{if $(U,V) \in \{(a_2,b_2), (a_3, b_3)\}$} \end{array}\right.$$ Then $P[(U,V)=(R,S)]=1$, and so $(R,S)$ is also uniformly distributed over $B$. However, if we are told that $(R,S) \in \{(a_1, b_1), (a_2, b_2), (a_3,b_3)\}$ then we know for sure that $(R,S)=(a_1,b_1)$.
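The weird example can be simulated directly. Below is a sketch with arbitrary example choices of $(a_1,b_1), (a_2,b_2), (a_3,b_3)$ (the specific points are assumptions for illustration): a uniform sample from the disk is pushed through the $(R,S)$ redefinition, and the exceptional branch is (almost surely) never taken, so the two random vectors are indistinguishable by sampling.

```python
import random

# Example choices for the three fixed points (assumed for illustration).
a1, a2, a3 = (0.0, 0.0), (0.0, 0.5), (0.0, -0.5)

def uniform_disk():
    # Rejection sampling: uniform on the unit ball B = {u^2 + v^2 <= 1}.
    while True:
        u, v = random.uniform(-1, 1), random.uniform(-1, 1)
        if u * u + v * v <= 1:
            return (u, v)

random.seed(2)
exceptional = 0
for _ in range(100_000):
    p = uniform_disk()
    rs = a1 if p in (a2, a3) else p   # the (R,S) redefinition on a null set
    exceptional += (rs != p)

print(exceptional)  # 0: the modified branch has probability zero
```

This is the simulation-level view of $P[(U,V)=(R,S)]=1$: the modification lives entirely on a probability-zero set, yet conditioning on landing in $\{(a_1,b_1),(a_2,b_2),(a_3,b_3)\}$ would give completely different answers for the two vectors.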


Towards your new example, suppose for simplicity that $R \sim \mathrm{Unif}[-1,1]$ and define $F=|R|$. Since you are now conditioning on a continuous random variable $F$, there is more justification in saying that $P[R\geq 0|F=f]=1/2$ "for almost all $f \in [0,1]$" because we can use the above equation (*) in the aggregate.

Here is how: Define $A=\{R\geq 0\}$. Then $1_A$ is a 0/1 valued random variable that is 1 if and only if $R\geq 0$. Then $E[1_A|F]$ exists and has infinitely many versions, each version has the form $h(F)$ for some function $h:\mathbb{R}\rightarrow\mathbb{R}$. The most basic version is: $$h(f) = \left\{\begin{array}{cc} 1/2 & \mbox{if $f \in [0,1]$} \\ 0 & \mbox{else} \end{array}\right.$$ Then, as in (*), we can interpret $$ P[R\geq 0|F=f] = h(f) \quad \forall f \in [0,1]$$ and so, using this particular $h$ function, we can define $$ P[R\geq 0|F=f]=1/2 \quad \forall f \in [0,1]$$ However, we can define $\tilde{h}:\mathbb{R}\rightarrow\mathbb{R}$ by changing the value $h(0.3)$ to any value we like: $$ \tilde{h}(f) = \left\{\begin{array}{cc} 1/2 & \mbox{if $f \in [0,1]$ and $f\neq 0.3$} \\ 0.9 & \mbox{if $f=0.3$} \\ 0 & \mbox{else} \end{array}\right.$$ and $\tilde{h}(F)$ is also a valid version of $E[1_A|F]$. You may actually prefer to change $h(0)$ to the value 1, but the point is it does not really matter if you change it at particular points. It turns out that any other valid version must correspond to a function, call it $h_{other}(f)$, that satisfies $h_{other}(f)=1/2$ for almost all $f \in [0,1]$. So "in the aggregate" it makes sense to say the answer is really $1/2$.
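The "in the aggregate" answer of 1/2 can also be checked by conditioning on a small window around a particular $f$, which is a sketch under the stated $R \sim \mathrm{Unif}[-1,1]$ assumption (the window width and the choice $f=0.3$ are illustrative):

```python
import random

random.seed(3)
f, eps = 0.3, 0.01          # condition on F = |R| within eps of f
hits, nonneg = 0, 0
for _ in range(1_000_000):
    r = random.uniform(-1, 1)   # R ~ Unif[-1, 1]
    if abs(abs(r) - f) < eps:
        hits += 1
        nonneg += (r >= 0)       # event {R >= 0}

print(nonneg / hits)  # ~ 0.5, for (almost) any f in (0, 1)
```

The estimate is insensitive to which $f$ is chosen, which matches the statement that any valid version must equal $1/2$ for almost all $f \in [0,1]$, even though a version may be redefined at isolated points like $0.3$.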

Second answer:

Prompted by one of the tags under this question, I would point out the following. Let's assume for simplicity that we are dealing with a 1-dimensional situation, such as the interval $[0,1]$ (instead of a 2-dimensional situation, where a similar analysis can be performed).

It turns out that the probability of hitting a point does not have to be zero. Bernstein and Wattenberg in their article

Bernstein, Allen R.; Wattenberg, Frank. Nonstandard measure theory. Applications of Model Theory to Algebra, Analysis, and Probability (Internat. Sympos., Pasadena, Calif., 1967), pp. 171–185, Holt, Rinehart and Winston, New York-Montreal, Que.-London, 1969

developed the following approach. One includes the real points of the interval $[0,1]$ in a hyperfinite set $S$ (using a suitable embedding $\mathbb R \to {}^\ast\mathbb R$) of internal cardinality $H$, where $H$ is a nonstandard integer. Then one assigns a probability of $\frac1H$ to each point in $S$. In particular, each real number is assigned a nonzero probability. Then the calculation you mentioned with Bayes' formula goes through (since the denominator is nonzero), giving the expected answer $\frac13$.
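In this hyperfinite model the original computation from the question becomes, assuming the three marked points C, T, B all lie in $S$:

$$P\big[C \,\big|\, \{C, T, B\}\big] \;=\; \frac{P[C]}{P[\{C,T,B\}]} \;=\; \frac{1/H}{3/H} \;=\; \frac13,$$

where the division is now legitimate because $3/H$, though infinitesimal, is nonzero.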