Making sense of the notation for probability distributions under an integral


I am trying to make sense of the following notation in section 2 of the paper Stochastic Submodular Maximization, Asadpour et al., 2008:

$$ g_S(ds) = \int_{x \in ds} \prod_{i \in S} g_i(dx_i) $$

I have never seen the differential symbol used as part of a function argument before, and I couldn't find any other instances of this notation online.
There are other ambiguous expressions, e.g. in section 4:

$$ \int_{s} \beta_s = 1 $$ $$ \forall i, dx_i: \int_{s,\, s_i \in dx_i} \beta_s \, ds = y_i g_i(x_i) \, dx_i $$

where $\beta_s$ is the "probability density function for the outcome s" and $g_i$ is the same $g_i$ as in the previous example. These two expressions just don't seem coherent to me.

Could anyone shed some light on what a differential symbol inside a function argument means, and also on an integral without a differential such as $dx$ at the end?

I'm a bit confused why you are complaining about the "ds" in $$g_S(ds) = \int_{x \in ds} \prod_{i \in S} g_i(dx_i).$$ It is playing the role of a dummy variable here. Also, although I do agree that the notation they use is horrendous, the words they use to describe what the notation represents are pretty clear, in my opinion at least. Anyway, as the authors say, $g_S$ is a measure. Let's define it clearly, i.e., explain what its value is on Borel sets (I assume the sigma-algebra on $[0,1]^n$ is the Borel one). By basic measure theory, it suffices to specify its value on rectangles $[a_1,b_1]\times\dots\times[a_n,b_n]$, since these generate the Borel sigma-algebra. Letting $S = \{s_1 < \dots < s_k\}$, the definition is $$g_S([a_1,b_1]\times\dots\times[a_n,b_n]) = \int_{a_{s_1}}^{b_{s_1}}\dots\int_{a_{s_k}}^{b_{s_k}} \prod_{i \in S} g_i(x_i) \, dx_{s_1}\dots dx_{s_k}$$ if $0 \in [a_i,b_i]$ for each $i \not \in S$, and $g_S([a_1,b_1]\times\dots\times[a_n,b_n]) = 0$ otherwise.
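To make the rectangle definition concrete, here is a small sketch (my own illustration, not from the paper) with $n = 3$, $S = \{0, 1\}$ in 0-indexed form, and an assumed density $g_i(x) = 2x$ on $[0,1]$ for the revealed coordinates:

```python
# Hypothetical worked example (my own, not from the paper): n = 3,
# S = {0, 1} (0-indexed), and each revealed coordinate has the assumed
# density g_i(x) = 2x on [0, 1], whose CDF is x^2.

def g_i_cdf(x):
    # CDF of the assumed density g_i(x) = 2x on [0, 1]
    return max(0.0, min(1.0, x)) ** 2

def g_S(rect, S, n):
    """Value of the measure g_S on the rectangle prod_i [a_i, b_i].

    rect: list of (a_i, b_i) pairs, one per coordinate i = 0..n-1.
    Coordinates outside S are deterministically 0, so if some such
    interval misses 0 the whole rectangle has measure 0.
    """
    for i in range(n):
        a, b = rect[i]
        if i not in S and not (a <= 0.0 <= b):
            return 0.0
    # By independence, the measure factors into 1-D integrals over i in S.
    prob = 1.0
    for i in S:
        a, b = rect[i]
        prob *= g_i_cdf(b) - g_i_cdf(a)
    return prob

# P(X_0 in [0, 0.5], X_1 in [0.5, 1], X_2 "missing", i.e. 0):
p = g_S([(0.0, 0.5), (0.5, 1.0), (-0.1, 0.1)], S={0, 1}, n=3)
print(p)  # 0.25 * 0.75 = 0.1875
```

Note how the two cases of the definition show up directly: the loop over $i \not\in S$ implements the "measure zero unless the interval contains $0$" clause, and the product over $i \in S$ implements the iterated integral.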

That's (hopefully) a completely clear definition. Let me now explain (1) how that coincides with their (confusing) notation and (2) how that definition makes sense in the context of their paper.

I'll do (2) first. We choose a set $S \subseteq [n]$. What we are told are the values of $X_s$, for $s \in S$; more precisely, we are given a vector $\langle x_1,\dots, x_n \rangle$ with $x_s$ being the value of $X_s$ for $s \in S$, and $x_s$ being $0$ for $s \not \in S$. The measure $g_S$ represents the probability of seeing a given vector. Clearly $g_S([a_1,b_1]\times\dots\times[a_n,b_n])$, i.e., the probability of the vector being in $[a_1,b_1]\times\dots\times[a_n,b_n]$, is $0$ if there is some $i \not \in S$ with $0 \not \in [a_i,b_i]$, since we'll obviously see $x_i = 0$. Conversely, provided that $0 \in [a_i,b_i]$ for each $i \not \in S$, the probability of our vector $\langle x_1,\dots,x_n\rangle$ lying in $[a_1,b_1]\times\dots\times[a_n,b_n]$ is simply the probability that $x_i$ is in $[a_i,b_i]$ for each $i \in S$, which, due to independence, is $(\int_{a_{s_1}}^{b_{s_1}} g_{s_1}(x_{s_1})dx_{s_1})\dots(\int_{a_{s_k}}^{b_{s_k}} g_{s_k}(x_{s_k})dx_{s_k})$, which is the same as $\int_{a_{s_1}}^{b_{s_1}}\dots\int_{a_{s_k}}^{b_{s_k}} \prod_{i \in S} g_i(x_i) dx_{s_1}\dots dx_{s_k}$.
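The independence step above, that the joint integral factors into a product of 1-D integrals, can also be checked numerically; the densities $g_1(x) = 2x$ and $g_2(x) = 3x^2$ below are assumptions chosen only for this demonstration:

```python
# Numerical sanity check (an illustration, not from the paper) that, by
# independence, the joint integral of g1(x1) * g2(x2) over a rectangle
# equals the product of the two 1-D integrals.

def midpoint_1d(g, a, b, n=2000):
    # composite midpoint rule on [a, b]
    h = (b - a) / n
    return sum(g(a + (j + 0.5) * h) for j in range(n)) * h

def midpoint_2d(g1, g2, a1, b1, a2, b2, n=400):
    # composite midpoint rule on the rectangle [a1, b1] x [a2, b2]
    h1, h2 = (b1 - a1) / n, (b2 - a2) / n
    total = 0.0
    for j in range(n):
        x1 = a1 + (j + 0.5) * h1
        for k in range(n):
            x2 = a2 + (k + 0.5) * h2
            total += g1(x1) * g2(x2)
    return total * h1 * h2

g1 = lambda x: 2 * x      # assumed density; integral over [0.2, 0.7] is 0.45
g2 = lambda x: 3 * x * x  # assumed density; integral over [0.1, 0.9] is 0.728

product = midpoint_1d(g1, 0.2, 0.7) * midpoint_1d(g2, 0.1, 0.9)
joint = midpoint_2d(g1, g2, 0.2, 0.7, 0.1, 0.9)
print(product, joint)  # both are close to 0.45 * 0.728 = 0.3276
```

Because the integrand separates as $g_1(x_1)g_2(x_2)$, the double sum factors into two single sums, which is exactly the independence identity used in the answer.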

Now (1). The reason for $\int_{x \in ds}$ is as follows. We could, for example, write $\int_0^1 x^2\,dx$, or we could write $\int_{x \in (0,1)} x^2$. The authors are using the analogue of the latter. The reason for naming the dummy variable $ds$ is that they want to explain, nonrigorously but intuitively, how $g_S$ is defined on an infinitesimal piece of $[0,1]^n$.
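As a tiny check that the two notations denote the same number, here $\int_{x \in (0,1)} x^2 = \int_0^1 x^2\,dx = 1/3$ (the `integrate` helper below is a generic midpoint-rule sketch, nothing paper-specific):

```python
# Generic numerical check that int_{x in (0,1)} x^2 and int_0^1 x^2 dx
# are the same integral, with value 1/3.

def integrate(f, a, b, n=100_000):
    # composite midpoint rule on [a, b]
    h = (b - a) / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h

val = integrate(lambda x: x * x, 0.0, 1.0)
print(round(val, 6))  # 0.333333
```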

I'm sure once you read and reflect on this answer, you'll be able to figure out the other ambiguous/confusing expressions in the paper.