Let $Y$ be a Poisson random variable with $E[Y] = \lambda$. For any $C \geq 1$, let $X = \min(Y, C)$ denote the truncation of $Y$ at $C$.
How can I estimate the mean of $Y$ when I have only $X$?
Some background should make my question clearer:
Let $Y$ be the real demand for a specific product. The seller can only sell what she has in her warehouse, which is $C$. She wants to know the mean of the true demand for her product, but she only has sales data $X$, which has historically been censored by the stock level $C$. So $X = \min(Y, C)$ should be clearer now.
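To make the setup concrete, here is a small simulation sketch in R (the values $\lambda = 4$, $C = 5$ and the sample size are arbitrary assumptions for illustration):

```r
# Censored Poisson demand: true demand Y ~ Poisson(lambda),
# observed sales X = min(Y, C), where C is the stock level.
set.seed(1)
lambda_true <- 4    # assumed true mean demand
C <- 5              # assumed stock level
Y <- rpois(10000, lambda_true)
X <- pmin(Y, C)     # the only data the seller observes

mean(Y)  # near 4
mean(X)  # systematically below mean(Y): censoring biases the sales mean down
```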
Edit:
As @BruceET's answer suggests an edit, I will answer his questions and also post my own approach in this edit. (Thank you for your answer, @AlgebraicsAnonymous.)
I am aware of the difference between truncation and censoring, but the paper I am trying to apply to my data does not make this distinction (see section 5.1.1 in Amjad and Shah (2017), originally published at SIGMETRICS).
The paper suggests two steps to estimate $Y$. In the first step, they use the "Universal Singular Value Thresholding" (USVT) algorithm to estimate $X$, because of missing values in the data (Ref: Chatterjee (2015)). Here is my code (I use soft thresholding):
usvd <- function(X, C, eta) {
  # step 1: replace missing entries by zero --------------------------------
  Y <- X
  Y[is.na(Y)] <- 0
  # step 2: singular value decomposition -----------------------------------
  A <- svd(Y)
  # step 3: fraction of observed entries -----------------------------------
  P <- 1 - sum(is.na(X)) / length(X)
  # step 4: soft-thresholding of the singular values -----------------------
  threshold <- function(N, eta) {
    sign(N) * pmax(abs(N) - eta, 0)
  }
  # step 5: low-rank reconstruction, rescaled by 1/P -----------------------
  W <- (1 / P) * A$u %*% diag(threshold(A$d, eta)) %*% t(A$v)
  # step 6: clip the reconstruction into [0, C], elementwise ---------------
  M <- pmin(pmax(W, 0), C)
  return(M)
}
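As a quick sanity check of the steps above (inlined here so the snippet runs on its own): with no missing values, $\eta = 0$, and a cap far above the data, the reconstruction should return the input matrix up to floating-point error. The toy matrix is an arbitrary assumption:

```r
# Sanity check of the USVT steps on a toy matrix: with eta = 0, no missing
# values, and a large cap, the reconstruction should equal the input.
set.seed(1)
X <- matrix(rpois(20, 4), nrow = 4)
A <- svd(X)                                       # SVD of the (fully observed) matrix
P <- 1 - sum(is.na(X)) / length(X)                # fraction observed (= 1 here)
eta <- 0
d_shrunk <- sign(A$d) * pmax(abs(A$d) - eta, 0)   # soft-thresholding (no-op here)
W <- (1 / P) * A$u %*% diag(d_shrunk) %*% t(A$v)  # reconstruction
M <- pmin(pmax(W, 0), 100)                        # clip into [0, C] with C = 100
max(abs(M - X))  # ~ 0 up to floating-point error
```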
In the second step, they use $M = [m_{ij}]_{i \le N, j \le T}$, where $m_{ij} = E[X_{ij}]$, together with $C$ to get an estimate of $Y$ via the bisection method. That is where I need the mean of $Y$; see section 5.1.1 in Amjad and Shah (2017).
Here is my code:
lambda <- colMeans(M)
real_Y <- function(M, C, lambda) {
  func <- function(C) {
    qk <- function(C) C - ppois(C, lambda = lambda)
    lambda * qk(C - 1) / qk(C) - M
  }
  bisection <- function(f, a, b, n = 1000, tol = 1e-7) {
    # If f(a) and f(b) have the same sign, no root is guaranteed in [a, b],
    # so stop the function and return an error message.
    if (f(a) * f(b) > 0) {
      stop('f(a) and f(b) must have opposite signs')
    }
    for (i in 1:n) {
      c <- (a + b) / 2  # midpoint
      # If the function equals 0 at the midpoint, or the half-interval is
      # below the desired tolerance, stop and return the root.
      if ((f(c) == 0) || ((b - a) / 2) < tol) {
        return(c)
      }
      # Otherwise keep the half-interval on which the sign change occurs.
      if (sign(f(c)) == sign(f(a))) {
        a <- c
      } else {
        b <- c
      }
    }
    # If the max number of iterations is reached and no root has been found,
    # return a message and end the function.
    print('Too many iterations')
  }
# print the result --------------------------------------------------------
print(bisection(func,M,C))
}
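For the one-dimensional case, base R's `uniroot` can invert the censored mean directly, avoiding a hand-written bisection. The identity $E[\min(Y,C)] = \lambda\,P(Y \le C-2) + C\,P(Y \ge C)$ for $Y \sim \mathrm{Poisson}(\lambda)$ follows by splitting the expectation at $C$; the sketch below is my own illustration of this inversion, not the paper's exact procedure:

```r
# Recover lambda from the mean of censored sales X = min(Y, C), Y ~ Poisson(lambda),
# using the identity E[min(Y, C)] = lambda * P(Y <= C - 2) + C * P(Y >= C).
censored_mean <- function(lambda, C) {
  lambda * ppois(C - 2, lambda) + C * (1 - ppois(C - 1, lambda))
}

estimate_lambda <- function(xbar, C, upper = 100) {
  # Find the lambda whose censored mean matches the observed sales mean xbar;
  # censored_mean is increasing in lambda, so a sign change is guaranteed.
  uniroot(function(l) censored_mean(l, C) - xbar, c(1e-8, upper))$root
}

set.seed(1)
C <- 5
Y <- rpois(1e5, 4)
X <- pmin(Y, C)
estimate_lambda(mean(X), C)  # close to the true lambda = 4
```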
Ref: for bisection method
About lost sales: I think the lost sales are caused by the stock level. The data have been polluted by low stock the whole time, so we do not want to change the level of $C$ before having an estimate of the true demand $Y$.
Note that if $Y$ is a one-dimensional random variable (i.e. treating the one-dimensional case first), then for $k \in \mathbb N$ we have $$ P(\min\{Y, C\} = k) = \begin{cases} 0 & k > C \\ \sum_{j \ge C} e^{-\lambda} \frac{\lambda^j}{j!} & k = C \\ e^{-\lambda} \frac{\lambda^k}{k!} & k < C \end{cases}; $$ for simplification, we assume $C \in \mathbb Z$ here. Thus, in order to get the distribution (and hence the expectation) of $Y$, it suffices that each entry of the matrix $C$ is at least two. For you may then read off the coefficient $\lambda_{i,j}$ (i.e. the parameter of the Poisson distribution in the $(i,j)$-th entry) from the equation $$ \alpha_{i,j} = e^{-\lambda_{i,j}} \lambda_{i,j}, $$ where $\alpha_{i,j} = P(\min\{Y_{i,j}, C_{i,j}\} = 1)$; for this you do need the Lambert W-function.
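To illustrate this last point numerically: since $P(X = 1) = \lambda e^{-\lambda}$ whenever $C \ge 2$, one can match the empirical frequency of ones. One caveat: $\alpha = \lambda e^{-\lambda}$ has two roots (the two real branches of Lambert W), one below $1$ and one above $1$, so the sketch below (my own illustration, assuming the demand mean exceeds $1$) searches on $[1, 50]$:

```r
# Recover lambda from the frequency of X == 1, valid when C >= 2, since
# P(min(Y, C) = 1) = P(Y = 1) = lambda * exp(-lambda).
set.seed(1)
lambda_true <- 4
C <- 5
X <- pmin(rpois(1e5, lambda_true), C)

alpha_hat <- mean(X == 1)  # empirical estimate of P(X = 1)

# alpha = lambda * exp(-lambda) means lambda = -W(-alpha); solving numerically
# on [1, 50] selects the W_{-1} branch (i.e. assumes lambda > 1).
lambda_hat <- uniroot(function(l) l * exp(-l) - alpha_hat, c(1, 50))$root
lambda_hat  # close to 4
```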