TL;DR
I have a universe $U$ of items $u_i$ and a family $F$ of subsets of $U$ (call them $P_j \subseteq U$).
Given this family of subsets, I would like to find the sub-family $C \subseteq F$ whose members produce a new target subset $P_{new}$ of items $u_i$ when added together (or subtracted), using as few subsets $P_j$ as possible.
That's the best I can do in words. Hopefully an example makes it clearer:
Example
For instance, if we start with the following family of subsets:
$$ \begin{align} F = \{&P_1 = \{a\},\ P_2 = \{b\},\ \dots,\ P_{16} = \{p\}, \\ &P_{17} = \{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p\}, \\ &P_{18} = \{a, b, c, d, e\}, \\ &P_{19} = \{g, h, i\}\,\} \end{align} $$
When requested to compute $\{a, b, c, d, e, f, g, h, i\}$, the simplest thing to do is calculate:
$$\{a, b, c, d, e, f, g, h, i\} = \{a\} + \{b\} + \{c\} + \dots + \{h\} + \{i\}$$
This isn't optimal, though: it requires 8 additions. I know I could produce the set more quickly (using only 2 additions) by taking advantage of my previously computed sets:
$$ \begin{align} P_{new} &= \{a, b, c, d, e, f, g, h, i\}\\ &= \{a, b, c, d, e\} + \{f\} + \{g, h, i\} \\ &= P_{18} + P_{6} + P_{19} \\ \therefore\ C &= \{ P_{6}, P_{18}, P_{19} \} \subseteq F \end{align} $$
Example 2
To make things even more complex, I would also like to know (if possible) when involving subtraction might be optimal:
$$\{e, f, g, h, i\} = \{e\} + \{f\} + \{g, h, i\}$$
This is the best solution using only addition (2 operations), but I could have gotten there even faster with a single subtraction:
$$\{e, f, g, h, i\} = \{a, b, c, d, e, f, g, h, i\} - \{a, b, c, d\}$$
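For a small family, the cheapest signed combination can be found by brute force over coefficient assignments. A minimal Python sketch of that search (the family below and the `max_terms` bound are made up for illustration, loosely mirroring the examples above):

```python
from itertools import product

# Hypothetical mini-universe and family, loosely mirroring the examples
# above; the names and the max_terms bound are made up for illustration.
universe = set("abcdefghi")
family = {
    "P6":     frozenset("f"),
    "P18":    frozenset("abcde"),
    "P19":    frozenset("ghi"),
    "P17":    frozenset("abcdefghi"),
    "P_abcd": frozenset("abcd"),
}

def cheapest_combination(target, family, max_terms=3):
    """Brute force: try every coefficient assignment in {-1, 0, +1} and keep
    the assignment reproducing `target` with the fewest non-zero terms."""
    names = list(family)
    best = None
    for coeffs in product((-1, 0, 1), repeat=len(names)):
        used = [(c, n) for c, n in zip(coeffs, names) if c != 0]
        if not used or len(used) > max_terms:
            continue
        # Tally the signed membership count of each item.
        counts = {u: 0 for u in universe}
        for c, n in used:
            for item in family[n]:
                counts[item] += c
        # Valid iff every target item is counted exactly once, others zero.
        if all(counts[u] == (u in target) for u in universe):
            if best is None or len(used) < len(best):
                best = used
    return best

# {e,f,g,h,i} is cheapest as a single subtraction: P17 - P_abcd.
print(cheapest_combination(frozenset("efghi"), family))
```

The search is exponential in the family size, which already hints at the hardness discussed further down; it is only viable for tiny families.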
Why I need this
Each subset $P_j$ has a value $p_j = f(P_j)$ that can be computed. The function $f$ is additive over disjoint sets: $f(P_1 \cup P_2) = f(P_1) + f(P_2)$ when $P_1 \cap P_2 = \emptyset$, so $p_{\{1,2\}} = p_{\{1\}} + p_{\{2\}}$.
When I start my application, I begin by calculating only the value $p_i$ of each item $u_i$ on its own. For example:
$$ \begin{align} P_1 = \{a\} ,&\ \ p_1 = f(P_1) = 5 \\ P_2 = \{b\} ,&\ \ p_2 = f(P_2) = 20 \\ P_3 = \{c\} ,&\ \ p_3 = f(P_3) = 8 \\ ...\ & \end{align} $$
I then have to start servicing requests for different subsets. For example:
$$ \begin{align} P_{18} &= \{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p\}\ &f(P_{18}) &= 400 \\ P_{19} &= \{b, c, d\}\ &f(P_{19}) &= 43\\ P_{20} &= \{g, h, i\}\ &f(P_{20}) &= 30 \\ ...& \end{align} $$
My goal is to process these requests as fast as possible. For early requests, I unavoidably have to spend a lot of time adding up $p_j$ values. But since these values are additive, there should be faster ways to process later requests by taking advantage of sets for which I've already computed $p_j$.
If $P_{21} = \{b, c, d, g, h, i\}$ is requested, I don't want to waste precious resources retrieving the six singleton values $p_2, p_3, p_4, p_7, p_8, p_9$ and adding them in 5 lengthy operations, when I could have just done a single operation $p_{21} = p_{19} + p_{20}$.
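A sketch of the kind of caching this suggests, assuming requests arrive as plain item sets; the $p$ values below are made up, and only the one-addition shortcut is attempted before falling back to singletons:

```python
# A sketch of value caching, keyed by frozenset; the p values are made up.
cache = {
    frozenset("bcd"): 43,   # p_19
    frozenset("ghi"): 30,   # p_20
}

def value(target, singletons, cache):
    """Serve a request: first try a single addition of two already-cached
    disjoint pieces, otherwise fall back to summing singleton values."""
    target = frozenset(target)
    if target in cache:
        return cache[target]
    for known, p in cache.items():
        if known < target and (target - known) in cache:
            cache[target] = p + cache[target - known]   # one addition
            return cache[target]
    cache[target] = sum(singletons[u] for u in target)  # slow path
    return cache[target]

singletons = dict(zip("abcdefghijklmnop", range(1, 17)))  # made-up p_i
print(value("bcdghi", singletons, cache))  # one addition: 43 + 30 = 73
```

This greedy lookup only ever finds two-piece splits; the question below is precisely how to find the best multi-piece (and subtractive) decomposition in general.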
Not set-theory?
This might actually be a better fit for linear algebra, if formulated as follows:
If I have the following known equations and values:
$$ \begin{align} P_{1} &= a,\ P_{2} = b,\ \dots,\ P_{7} = g &f(P_{1}) &= p_{1},\ \dots\\ P_{9} &= a + b + c + d &f(P_{9}) &= p_{9} \\ P_{10} &= d + e + f + g &f(P_{10}) &= p_{10} \\ \end{align} $$
And I wish to calculate
$$ \begin{align} P_{11} &= a + b + c + d + e + f + g &f(P_{11}) &=\ ? \\ \end{align} $$
I want to be able to discover that the fastest solution comes from
$$ \begin{align} P_{11} &= P_{9} + P_{10} - P_{4} \\ P_{11} &= (a + b + c + d) + (d + e + f + g) - (d) \\ &= (a + b + c + 2d + e + f + g) - (d) \\ &= a + b + c + d + e + f + g\ \checkmark\\ \mathord{\therefore}\ p_{11} &= p_{9} + p_{10} - p_{4} \\ \end{align} $$
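This check is easy to mechanize by encoding each set as an indicator vector; a small NumPy sketch of the identity above:

```python
import numpy as np

# Indicator vectors over (a, b, c, d, e, f, g) for the sets above.
v = lambda items: np.array([int(x in items) for x in "abcdefg"])

P4, P9, P10, P11 = v("d"), v("abcd"), v("defg"), v("abcdefg")

# P9 + P10 double-counts d; subtracting P4 restores the target exactly.
print(np.array_equal(P9 + P10 - P4, P11))  # True
```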
It's starting to look suspiciously like an NP-hard problem to me :( If no one can come up with an elegant way of solving it, I would also accept a more elegant way of wording the problem (perhaps in terms of an existing, well-known problem), or a bound on its complexity.
OK, first let's work on a formal representation of your problem.
Let $n = |U|$, let $m = |F|$, and represent each subset $P \subseteq U$ by its indicator vector $v(P) \in \{0,1\}^n$. Let $A$ be the $n \times m$ matrix whose $i$-th column is $v(P_i)$, and let $c(q_i)$ be the cost of using the $i$-th subset with weight $q_i$ (zero when $q_i = 0$, one operation otherwise).
The first thing to note is that since $f$ is linear, if for a new input set $P^*$ we can find a vector $q \in \mathbb{R}^m$ such that $Aq = v(P^*)$, then $f(P^*) = \sum_{i=1}^m q_i \cdot f(P_i)$. Furthermore, note that the cost of computing $f(P^*)$ using the formula above is $C(q) = \sum_{i=1}^m c(q_i)$.
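As a concrete sanity check, here is that formula in NumPy for the $P_4$, $P_9$, $P_{10}$ example above; the $f$ values are made up for illustration:

```python
import numpy as np

# Columns of A are indicator vectors of P_4 = {d}, P_9 = {a,b,c,d},
# P_10 = {d,e,f,g} over the universe (a, ..., g); f values are made up.
v = lambda items: np.array([float(x in items) for x in "abcdefg"])
A = np.column_stack([v("d"), v("abcd"), v("defg")])
f_vals = np.array([4.0, 10.0, 22.0])    # f(P_4), f(P_9), f(P_10)

q = np.array([-1.0, 1.0, 1.0])          # the combination P_9 + P_10 - P_4
target = v("abcdefg")                   # v(P*)

assert np.allclose(A @ q, target)       # q satisfies Aq = v(P*)
print(f_vals @ q)                       # f(P*) = -4 + 10 + 22 = 28.0
```

Here $C(q) = 3$, since all three weights are non-zero.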
With all of that, we have our optimization problem: $$ \DeclareMathOperator*{\argmin}{argmin} \begin{align} \\ q^*(P^*)&=\argmin_q C(q) \text{ such that }Aq=v(P^*) \end{align} $$
$C$ is a piecewise constant function, meaning it has discontinuities both in value and in derivative. Furthermore, if any of the existing subsets is not used in the optimal calculation, $C$ has a discontinuous derivative at the optimal point. The input vector $q$ can take any value in $\mathbb{R}^m$, and the constraint is a linear equality constraint.
So at first blush, this is a nonsmooth optimization problem with linear equality constraints.
There's also another way to think about the objective function. Suppose we have a guess about which variables are non-zero in the optimal solution, represented by a vector $b$ with $m$ entries, each either one or zero. Can we determine whether this choice of subsets is feasible, and if so, what the appropriate weights are? We can: simply remove the columns of $A$ and the entries of $q$ corresponding to the zero entries of $b$, and solve the resulting problem in a least-squares sense. If the choice is feasible, the solution will have zero residual; otherwise it will not. If we use $b$ as our input variable instead of $q$, we end up with a much simpler objective function, at the price of a more restricted input domain (each entry of $b$ must be 0 or 1):
$$ \begin{align} \\ D(b) &= 1^Tb \end{align} $$
Now we just need to specify a set of constraints that make that objective function work. First, we specify the least squares problem we need to solve inside the constraint function: $$ \begin{align} \\ \bar{A}\bar{q} &= v(P^*) \end{align} $$
The least squares solution is : $$ \begin{align} \\ \hat{q} &= \argmin_{\bar{q}} (\bar{A}\bar{q} - v(P^*))^T (\bar{A}\bar{q} - v(P^*)) \\ \hat{q} &= \left(\bar{A}^T \bar{A} \right)^{-1} \bar{A}^T v(P^*) \\ \end{align} $$
And our constraint is $ (\bar{A}\hat{q} - v(P^*))^T(\bar{A}\hat{q} - v(P^*)) = 0 $. In this, it's important to note that $\bar{A}$ and $\hat{q}$ are functions of the input variable $b$.
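The feasibility test for a support guess $b$ can be sketched with a least-squares solve; this NumPy snippet reuses the small $P_4$, $P_9$, $P_{10}$ example, and the tolerance is an arbitrary choice:

```python
import numpy as np

def feasible_weights(A, b, target, tol=1e-9):
    """Solve the reduced least-squares problem for a 0/1 support guess `b`
    and report the weights plus whether the residual is (numerically) zero."""
    cols = np.flatnonzero(b)                 # columns kept by the guess
    A_bar = A[:, cols]
    q_hat, *_ = np.linalg.lstsq(A_bar, target, rcond=None)
    feasible = bool(np.allclose(A_bar @ q_hat, target, atol=tol))
    return q_hat, feasible

# Columns: P_4 = {d}, P_9 = {a,b,c,d}, P_10 = {d,e,f,g} over (a, ..., g).
v = lambda items: np.array([float(x in items) for x in "abcdefg"])
A = np.column_stack([v("d"), v("abcd"), v("defg")])
target = v("abcdefg")

q_hat, ok = feasible_weights(A, np.array([1, 1, 1]), target)
print(ok, np.round(q_hat, 6))   # feasible, with weights (-1, 1, 1)
```

Dropping $P_4$ from the guess (i.e. $b = (0, 1, 1)$) leaves $d$ double-counted with no way to cancel it, so the residual is non-zero and the guess is rejected.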
So, under this formulation of the problem, we have a binary linear programming problem with a nonlinear equality constraint.
Both of these types of problems are NP-complete. There are, however, algorithms that can help solve them, or at least get somewhat close to an optimal solution, in a reasonable amount of computation time. For the first formulation, you can look at genetic/evolutionary global optimizers, or direct/pattern-search algorithms; these are fairly common and included, for instance, in MATLAB's optimization toolboxes. The second formulation is a pretty exotic problem and probably shouldn't be the focus of your search for an optimizer, although there are some academic papers that might help solve it.