My main question:
Let's imagine a country with a population of $n$ people. Each person has a certain amount of income in a certain year. When we calculate the income distribution of this country, we found that 80% of the total income in this country is only contributed by 1% of the population. If we assume that the government has the ability to add, but not deduct any person's income, how much is the minimum amount of incentive for each person that the government has to offer? Also, assume that $n$ doesn't change for the whole year
When we add each person's current income with each person's incentive, these two rules must be fulfilled at the end of the year:
- 20% of the population contributes 80% of the total income
- The government must achieve the total income target
I will try to formulate this problem mathematically:
Let $x$ be a vector of income for each person in the country
$$
x = \begin{pmatrix}
x_{1} \\
x_{2} \\
... \\
x_{n}
\end{pmatrix}
$$
and $X$ be a vector of $x$ that is sorted in descending order. So i guess,
$X = P \cdot x$
where $P$ is an $nxn$ permutation matrix (?)
The total income in that country is
$$\sum_{i=1}^{n} X_{i}$$
(And at the end of the year, the result of this sum above must achieve a certain level of total income)
And there's $X^c$ , which is a vector of cumulative income, whereby
$$
X^c_i = \sum_{i=1}^{n} X_{n-i+1}
$$
And lastly we have $I$, a vector of income contribution percentage for the top $i$ highest earning person. The vector is defined as:
$$
I = X^c \cdot \frac{100}{\sum_{i=1}^{n} X_{i}}
$$
So basically, $I_i$% of the total income in this country is contributed by $100(\frac{i}{n})$% of the population.
How do i come up with a vector $v$ (minimum incentive), so that when i add $X$ and $v$:
- 80% of the total new income (after incentive) must be contributed by 20% of the population $$ 0.8\sum_{i=1}^{n} (X_{i} + v_{i}) = \sum_{i=1}^{j} (X_{i} + v_{i}) $$ , where $$ 0.2 \le \frac{j}{n} \le 1 $$
- The country must achieve its yearly total income target $T$ $$\sum_{i=1}^{n} (X_{i} + v_{i})=T$$
Context: I'm a data scientist and I think this problem can be effectively solved using computer (like linear programming or machine learning?). I want to know what method should i use, but before that, I also want to know the mathematical thinking behind it.
Thank you