I have a dataset $\{(\mathbf x_1, y_1), \ldots, (\mathbf x_N, y_N)\}$, where $\mathbf x_i \in \mathbb R^M$ is a feature vector and $y_i\in \mathbb R$ is an output with the following distribution: 30% of the values equal $0$, while the other 70% are approximately log-normally distributed. My idea was to fit a neural network $N(\mathbf x) = (\hat \pi, \hat\mu, \hat\sigma ^2)$, where $\pi = p(y = 0)$ and $y \mid y > 0\sim \ln \mathcal N(\mu, \sigma ^2)$. The problem is that the log-likelihood takes the form $$\sum _i \ln \left[\pi_i\, \delta _0(y_i) + (1 - \pi_i)\,\ln \mathcal N(y_i\mid \mu_i, \sigma_i ^2)\right],$$ where $(\pi_i, \mu_i, \sigma_i^2)$ are the network outputs for $\mathbf x_i$, and I don't know how to deal with the delta function here. I tried the approximation $\delta _0(y)\approx \mathcal N(y \mid 0, \varepsilon)$ with $\varepsilon=0.0001$, but this feels like a hacky approach.
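To make the approximation described above concrete, here is a minimal NumPy sketch of the resulting negative log-likelihood. The function name `approx_mixture_nll` and its argument names are hypothetical, $\varepsilon$ is assumed to be a variance (matching the $\mathcal N(\mu, \sigma^2)$ convention), and the arrays `p_zero`, `mu`, `sigma2` stand in for the network outputs rather than coming from any particular framework.

```python
import numpy as np

def approx_mixture_nll(y, p_zero, mu, sigma2, eps=1e-4):
    """Negative log-likelihood with delta_0(y) replaced by N(y | 0, eps).

    Assumption: eps is a variance. p_zero, mu, sigma2 play the role of the
    network outputs (pi, mu, sigma^2) and may be scalars or arrays.
    """
    y = np.asarray(y, dtype=float)
    p_zero = np.asarray(p_zero, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)

    # Log-density of the narrow Gaussian standing in for the point mass at 0.
    log_spike = -0.5 * np.log(2.0 * np.pi * eps) - y**2 / (2.0 * eps)

    # Log-normal log-density; it is only defined for y > 0, so use -inf elsewhere.
    pos = y > 0
    safe_y = np.where(pos, y, 1.0)  # placeholder value avoids log(0) warnings
    log_lognormal = np.where(
        pos,
        -np.log(safe_y)
        - 0.5 * np.log(2.0 * np.pi * sigma2)
        - (np.log(safe_y) - mu) ** 2 / (2.0 * sigma2),
        -np.inf,
    )

    # ln[pi * spike + (1 - pi) * lognormal], computed stably in log space.
    log_mix = np.logaddexp(np.log(p_zero) + log_spike,
                           np.log1p(-p_zero) + log_lognormal)
    return -np.sum(log_mix)
```

For example, `approx_mixture_nll([0.0, 1.0, 2.5], 0.3, 0.0, 1.0)` evaluates the loss for three observations with constant parameters. The mixture is combined with `np.logaddexp` because the spike term can be extremely negative for any $y$ away from zero.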
Thanks in advance for any help!