I am trying to implement the gradient descent algorithm. The dataset I am working with contains points that, I guess, come from a partial function.
For example, here is a subset of the dataset:
$(1, 10)$
$(2, 15)$
$(3, 35)$
$(3, 40)$
$(4, 34)$
$(5, 21)$
When $x$ is 3, there are two different $y$ values. How do I handle this? Thanks
Gradient descent in ML often computes its updates over batches of examples. If this is a regression task, learning some $f_\theta$, then for a point $x$ that maps to two different $y$ values, the algorithm will likely learn to output an $f_\theta(x)$ near the average of the two $y$ values (under a squared-error loss, the mean is the value that minimizes the error at that point). In some cases this is the best one can do, assuming the data-generating mechanism has some inherent, unavoidable stochasticity. In other words, one option is to do nothing.
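To make the "do nothing" option concrete, here is a minimal sketch in plain Python. It assumes a squared-error loss and a deliberately simple, hypothetical model with one free parameter per distinct $x$ (the name `fit_table`, the learning rate, and the step count are illustrative choices, not something from the question). The duplicate $x = 3$ is left in the data untouched.

```python
# Data from the question; x = 3 appears twice with different targets.
data = [(1, 10), (2, 15), (3, 35), (3, 40), (4, 34), (5, 21)]


def fit_table(data, lr=0.1, steps=2000):
    """Full-batch gradient descent on mean squared error.

    Hypothetical model for illustration: one free parameter theta[x]
    per distinct x, so f_theta(x) = theta[x].
    """
    theta = {x: 0.0 for x, _ in data}
    n = len(data)
    for _ in range(steps):
        # Accumulate the gradient over the whole batch, duplicates included.
        grad = {x: 0.0 for x in theta}
        for x, y in data:
            # Derivative of (1/n) * (theta[x] - y)**2 with respect to theta[x].
            grad[x] += 2.0 * (theta[x] - y) / n
        # Gradient descent update.
        for x in theta:
            theta[x] -= lr * grad[x]
    return theta


theta = fit_table(data)
print(theta[3])  # should settle close to the mean of 35 and 40
```

Both examples with $x = 3$ pull on the same parameter in opposite directions, so the update balances out near their mean. A less flexible model (for example, a straight line) would not hit the mean exactly, but the same compromise happens inside the loss.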