Find the most optimal plan to sell the products (Python)

66 Views Asked by At

I'm working on a dynamic programming problem and actually, I'm not quite sure whether it is dynamic programming since moving average $M$ is based on previous $M$. No need to consider the efficiency. The problem requires selling a product over $T$ time periods and maximizing the total actual sale amount. The total number of products is $N$ and I plan to sell some products over different periods $n_0, n_1, \cdots, n_{T-1}$ and $\sum_{i=0}^{T-1} n_i = N$.

In conclusion, this question wants to find the most optimal schedule for $n_0, n_1, \cdots, n_{T-1}$ such that $\sum_{i=0}^{T-1} n_i = N$, which maximizes the $\sum_{i=0}^{T-1} S_i$.

And the actual sale amount $S_i$ are based on current moving average $M$ and current $n_i$.
Assume that $\alpha = 0.001$ and $\pi = 0.5$

  1. Initialize $M = 0$. Then for $i = 0, 1, \ldots, T-1$
  2. Compute new $M_i = \lceil 0.5*(M_{i-1} + n_i)\rceil$
  3. At time $i$ we sell $S_i \ = \ \lceil(1 - \alpha M_i^\pi)n_i \rceil$ products

Continue this process until the last step. For example, assume we already know $n_i$ for all periods, the trading will be below

    M = 0
    T = 4
    N = 10000
    alpha = 1e-3
    pi = 0.5
    S = np.zeros(T,dtype='i')
    n  = np.array([5000,1000,2000,2000])
    print(n)
    total = 0
    for i in range(T):
        M = math.ceil(0.5*(M + n[i]))
        S[i] = math.ceil((1 - alpha*M**pi)*n[i])
        total += S[i]
        print('at time %d, M = %d and we sell %d products' %(i,M,S[i]))
    print('total sold =', total)

My idea is to keep track of the state based on time period, products left, and moving average as index and store the actual sale in a high dimension matrix. I think the upper bound for moving average is just $[0,N]$ I'm still confusing how to program it. Could someone provide ideas about dynamic programming? Thank you very much. The below is some of my crude codes but the output is a little strange.

def DPtry(N,T,alpha,pi,S):
    schedule = np.zeros(T)
    M = 0
    
    for n in range(0,N+1):
        for m in range(0,n+1):
            S[T-1,n,m] = math.ceil((1 - alpha*m**pi)*n)
    
    for k in range(1,T):
        t = T - k - 1
        print("t = ",t)
        for n in range(0,N+1):
            for m in range(0,n+1):
                best = -1
                
                for plan in range(0,n+1):
                    salenow = math.ceil((1 - alpha*m**pi)*plan)
                    M = math.ceil(0.5*(m + plan))
                    salelater = S[t+1,n-plan,M]
                    candidate = salenow + salelater
                    if candidate > best:
                        best = candidate
                S[t,n,m] = best
                
    print(S[0,N,0])
N = 100
T = 5
pi = .5
alpha = 1e-3

S = np.zeros((T,N+1,N+1))
DPtry(N,T,alpha,pi,S)
```