Distribution of proportions of each row cell


I'm trying to make sense of some data I have.

Below is a simplified version of how the data is structured. For context, the table shows the distribution of an investor's investments across different industries. Each row represents a year, and each cell holds the number of investments made in that industry in that year.

$$ \begin{array}{c|c|c|c|c} \text{Year} & \text{Industry-1} & \text{Industry-2} & \text{Industry-3} & \text{Etc}\\\hline 2000 & 1 & 0 & 1 & \\\hline 2001 & 1 & 0 & 1 & \\\hline 2002 & 0 & 1 & 0 & \\\hline 2003 & 3 & 3 & 3 & \\ \end{array} $$

For each row, I'd like to find a value explaining to what degree an investor diversifies (equal number of investments across every industry) his/her investments. My math/statistics knowledge is limited, so I'd appreciate any push in the right direction!

Best answer

"Did or did not" suggests a yes/no answer, so just check if all the investments are equal. This is probably not what you want.

If you have a fixed list of industries, you can normalize the investment in each by dividing by the sum for that year. That gives you the proportion invested in each industry, and the proportions sum to $1$. Now you can measure the standard deviation of those proportions (the population form, dividing by $n$ rather than $n-1$). If the investments are split perfectly evenly, it will be $0$. If they are completely concentrated in one industry, it will be $\sqrt{\frac 1n -\frac 1{n^2}}$ for $n$ industries. You could then divide by this number to get a value that ranges from $0$ (fully diversified) to $1$ (fully concentrated).
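As a rough illustration of the recipe above, here is a minimal Python sketch applied to the four example rows from the question. The function name `concentration_index` is made up for this example, only the three industries shown in the table are used, and returning `None` for a year with no investments is an assumption, since the proportions are undefined in that case.

```python
import math

def concentration_index(counts):
    """Hypothetical helper: 0 means a perfectly even split,
    1 means everything is in a single industry.

    Returns None when the row has no investments or fewer than two
    industries (an assumption; the proportions are undefined there).
    """
    n = len(counts)
    total = sum(counts)
    if n < 2 or total == 0:
        return None

    # Normalize: proportion of the year's investments in each industry.
    proportions = [c / total for c in counts]

    # Population standard deviation; the mean of the proportions is always 1/n.
    mean = 1 / n
    variance = sum((p - mean) ** 2 for p in proportions) / n
    sd = math.sqrt(variance)

    # Largest possible value, reached when one industry holds everything.
    max_sd = math.sqrt(1 / n - 1 / n ** 2)
    return sd / max_sd

# Rows copied from the simplified table in the question.
rows = {
    2000: [1, 0, 1],
    2001: [1, 0, 1],
    2002: [0, 1, 0],
    2003: [3, 3, 3],
}

for year, counts in rows.items():
    print(year, concentration_index(counts))
```

With these rows the index comes out at about 0.5 for 2000 and 2001, about 1 for 2002, and 0 for 2003. Note that it measures concentration, so if you want a "diversification" score you could report one minus this value instead.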