Mathematical notation for the unique values in a vector

Can anyone tell me how I can mathematically describe the following problem?

I have two vectors: (i) vector $\boldsymbol{V} = [v_1,...,v_n]^T$, containing values, some of which are repeated; and (ii) vector $\boldsymbol{P} = [p_1,...,p_n]^T$, containing the probability of occurrence of each corresponding value in $\boldsymbol{V}$.

I would like to get a new vector $\boldsymbol{V}^* = [v_1^*,...,v_m^*]^T$, containing only the unique values from vector $\boldsymbol{V}$ (i.e. $m < n$), and the corresponding probability vector $\boldsymbol{P}^* = [p_1^*,...,p_m^*]^T$, in which the probabilities from vector $\boldsymbol{P}$ that have the same value in vector $\boldsymbol{V}$ are summed up.

What would be the best way to denote this using only math symbols?

Thanks in advance!

Best answer

I'd say this question belongs more to the domain of computer science. I would typically address the problem like this:

Let $\textbf{V} \in F^n$ be an $n$-dimensional vector over some field $F$. $\textbf{V}$ can additionally be represented as an enumerated list (AKA array), such that each component of $\textbf{V}$ is indexed by an integer from $0$ to $n-1$.

Notation 0: For every vector $\textbf{V}$, let $\mathrm{dim}(\textbf{V})$ be its dimension, that is, the number of its elements.

Notation 1: Let $\textbf{V}[i]$ be the $i$-th element of $\textbf{V}$, that is, the element at index $i$, such that $\textbf{V}[i] \in F$ for $i \in \mathbb{Z}$, $i \in [0, n)$.

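Notation 2: An index vector can be defined as $\textbf{IND}_n \in \mathbb{Z}^m$, such that $\textbf{IND}_n[i] \in [0, n) \; \; \forall i \in [0, m)$. The subscript $n$ is the size of the vector being indexed, while $m = \mathrm{dim}(\textbf{IND}_n)$ is the number of indices and need not equal $n$.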

Notation 3: We can also use index vectors as indices for other vectors. Let $\textbf{W} = \textbf{V}[\textbf{IND}_n]$ be another vector, such that $\textbf{W} \in F^m$, where $m = \mathrm{dim}(\textbf{IND}_n)$. In particular, $\textbf{W}[i] = \textbf{V}[\textbf{IND}_n[i]] \; \; \forall i \in [0, m)$.
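To make Notation 3 concrete, here is a minimal sketch in pure Python. The helper name `take` is hypothetical (it is not part of the answer), and plain lists stand in for vectors.

```python
def take(v, ind):
    """Return W with W[i] = v[ind[i]] for every position i of ind."""
    n = len(v)
    for k in ind:
        # Each index must lie in [0, n), as required by Notation 2.
        if not 0 <= k < n:
            raise IndexError(f"index {k} is outside [0, {n})")
    return [v[k] for k in ind]


V = [10, 20, 30, 40]   # n = 4
IND = [3, 0, 0]        # m = 3, indices may repeat
W = take(V, IND)       # W == [40, 10, 10]
```

Note that $m$ can be smaller or larger than $n$; only the range of the indices is constrained.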

Finally, we need to define a function that finds unique elements. I have never seen a standard notation for it. In MATLAB, for example, a similar function is simply called "unique". Usually people just define a function and explain what it does. We will define a function UniqueIndex$(\textbf{V})$, which returns a vector of indices into $\textbf{V}$, one for each distinct value in $\textbf{V}$. We must also specify their order. For example, we will require that for each distinct value only the index of its first occurrence is kept, all later repetitions are skipped, and the order of the first occurrences is preserved.
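One possible reading of UniqueIndex, sketched in pure Python. The name `unique_index` is an illustrative choice, and the sketch assumes the elements are hashable (numbers, strings, tuples) so that a set can track what has already been seen.

```python
def unique_index(v):
    """Indices of the first occurrence of each distinct value, in order."""
    seen = set()
    ind = []
    for i, x in enumerate(v):
        if x not in seen:      # first time this value appears
            seen.add(x)
            ind.append(i)
    return ind


V = [5, 3, 5, 7, 3]
IND = unique_index(V)          # [0, 1, 3]
V_star = [V[i] for i in IND]   # [5, 3, 7]
```

With floating-point values, "same value" here means exact equality; values that differ only by rounding error would be treated as distinct.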

Then we can find our indices using $\textbf{IND}_n = \mathrm{UniqueIndex}(\textbf{V})$.

The vector of unique elements is then $\textbf{V}^* = \textbf{V}[\textbf{IND}_n]$.

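If we have another vector $\textbf{P}$ whose elements correspond to the elements of $\textbf{V}$ in the same order, then $\textbf{P}[\textbf{IND}_n]$ picks out only the probability attached to the first occurrence of each unique value. Since the question asks for the probabilities of repeated values to be summed, the reduced vector is instead defined elementwise as $\textbf{P}^*[j] = \sum_{i \in [0, n) \,:\, \textbf{V}[i] = \textbf{V}^*[j]} \textbf{P}[i] \; \; \forall j \in [0, m)$, where $m = \mathrm{dim}(\textbf{V}^*)$.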

This is more or less what you would do in Python to actually perform this operation.
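As a rough sketch of such a Python version (not the answerer's own code): the function name `merge_duplicates` is hypothetical, the values are assumed hashable, and the probabilities of repeated values are summed, as the question requests, rather than selected by index. The example probabilities are made up for illustration.

```python
def merge_duplicates(v, p):
    """Return (v_star, p_star): the distinct values of v in order of first
    occurrence, and the summed probabilities of each distinct value."""
    if len(v) != len(p):
        raise ValueError("v and p must have the same length")
    totals = {}  # dicts preserve insertion order in Python 3.7+
    for value, prob in zip(v, p):
        totals[value] = totals.get(value, 0.0) + prob
    return list(totals.keys()), list(totals.values())


V = [5, 3, 5, 7, 3]
P = [0.25, 0.125, 0.25, 0.25, 0.125]
V_star, P_star = merge_duplicates(V, P)
# V_star == [5, 3, 7]
# P_star == [0.5, 0.25, 0.25]
```

The total probability is unchanged by the merge, so `sum(P_star)` equals `sum(P)` up to floating-point rounding.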