Perhaps the most primitive (and yet very effective) way of estimating a density is by using a histogram. In a histogram, we can manage the 'precision' of the estimate by adjusting the number of bins and the number of samples our histogram will have. This is very intuitive, because by increasing the number of bins and the number of samples, we can get a 'smoother' estimate, if, after all, the aim is to fit a continuous function to discrete data.
Now here comes the 'bias-variance trade-off'. The paragraph above is very intuitive and it makes good sense to me. But the author of the lecture I am reading suddenly mentioned this trade-off. What does this mean?
Is this 'bias' related to the 'biased estimator'?
Any insights are helpful.
I suggest you google 'histogram binning algorithm', start with the Wikipedia article, and then look at some examples of data and their histograms elsewhere.
Software packages use various algorithms to choose the number and/or width of histogram bars--usually with some attention to putting centers or cutpoints at 'round' numbers. Of the methods listed in Wikipedia, I believe Sturges, square root, and Freedman-Diaconis may be the most widely used. The last of these is explicitly aimed at trying to retrieve something like the density curve from a histogram of observations.
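To make the three rules concrete, here is a minimal sketch of how each one could be computed by hand. The formulas are the textbook versions as I remember them from the Wikipedia article; the function names and the simulated normal sample are my own illustration, and real packages may round or adjust the results differently.

```python
import math
import random
import statistics

def sturges_bins(n):
    # Sturges: number of bins = ceil(log2(n)) + 1
    return math.ceil(math.log2(n)) + 1

def sqrt_bins(n):
    # Square-root rule: number of bins = ceil(sqrt(n))
    return math.ceil(math.sqrt(n))

def freedman_diaconis_width(data):
    # Freedman-Diaconis: bin width = 2 * IQR / n^(1/3)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return 2 * (q3 - q1) / len(data) ** (1 / 3)

random.seed(0)
# Illustrative data only: 500 draws from a standard normal.
sample = [random.gauss(0, 1) for _ in range(500)]

fd_width = freedman_diaconis_width(sample)
fd_bins = math.ceil((max(sample) - min(sample)) / fd_width)

print("Sturges bins:          ", sturges_bins(len(sample)))
print("Square-root bins:      ", sqrt_bins(len(sample)))
print("Freedman-Diaconis bins:", fd_bins, "with width", round(fd_width, 3))
```

Note that the first two rules depend only on the sample size, while Freedman-Diaconis also looks at the spread of the data through the interquartile range.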
With unlimited data one could do a nice job of density estimation, but as the sample size goes down one has to use fewer and wider bars to keep from having distracting empty bins and so missing bars. You won't have to browse many of the google links to encounter the ubiquitous Old Faithful eruption data, which, properly plotted, is distinctly bimodal. (But too few bins can obscure bimodality.)
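A rough sketch of both failure modes, using a small simulated two-cluster sample (this is NOT the Old Faithful data; the cluster centers, the sample size of 60, and the helper names are made up for illustration). With very many bins most bars are empty, and with only two bins the histogram cannot show two peaks separated by a dip.

```python
import random

def histogram_counts(data, n_bins):
    # Equal-width bins from min to max; the maximum goes in the last bin.
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        idx = min(int((x - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

def count_peaks(counts):
    # A peak is a bar strictly taller than both neighbours
    # (the outside of the histogram counts as height 0).
    padded = [0] + counts + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )

random.seed(1)
# Hypothetical bimodal sample: two normal clusters of 30 points each.
sample = [random.gauss(2.0, 0.3) for _ in range(30)] + [
    random.gauss(4.5, 0.4) for _ in range(30)
]

summary = {}
for n_bins in (2, 8, 200):
    counts = histogram_counts(sample, n_bins)
    summary[n_bins] = {"empty": counts.count(0), "peaks": count_peaks(counts)}
    print(n_bins, "bins:", summary[n_bins])
```

Rerunning with a different seed changes the jagged many-bin picture a lot and the two-bin picture very little, which may be the kind of tension the lecture has in mind, though that reading is my guess.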
Other methods of density estimation are to fit pieces of rounded curves together to get an approximation to what the density function of the population may have looked like. (The pieces may be from polynomials, normal curves, etc.) Bernard Silverman has written widely on the subject and some of his papers have a uniquely motivational flavor so you can understand the purpose of the mathematics. (google 'density estimation silverman')
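As one example of the 'normal curves' variety, here is a minimal sketch of a Gaussian kernel density estimate: a small normal bump is centered on every observation and the bumps are averaged. The bandwidth uses the common rule of thumb 0.9 * min(sd, IQR/1.34) * n^(-1/5) usually attributed to Silverman; that choice, the simulated data, and the function names are my assumptions, not something taken from a particular paper or package.

```python
import math
import random
import statistics

def silverman_bandwidth(data):
    # Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(statistics.stdev(data), (q3 - q1) / 1.34)
    return 0.9 * spread * len(data) ** (-1 / 5)

def gaussian_kde(data, bandwidth):
    # Returns a function x -> average of normal densities centered on the data.
    norm = 1 / (len(data) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

random.seed(2)
# Hypothetical bimodal sample, for illustration only.
sample = [random.gauss(2.0, 0.3) for _ in range(50)] + [
    random.gauss(4.5, 0.4) for _ in range(50)
]

h = silverman_bandwidth(sample)
density = gaussian_kde(sample, h)

# Evaluate on a grid reaching 4 bandwidths past the data on each side.
lo, hi = min(sample) - 4 * h, max(sample) + 4 * h
steps = 400
dx = (hi - lo) / steps
grid = [lo + i * dx for i in range(steps + 1)]
values = [density(x) for x in grid]

# Sanity check: a density should integrate to about 1 (simple Riemann sum).
area = sum(values) * dx
print("bandwidth:", round(h, 3), "approximate area:", round(area, 3))
```

The bandwidth plays the same role here that bin width plays for a histogram: a smaller value gives a wigglier curve and a larger one gives a smoother curve that may blur the two humps together.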
Finally, I confess I'm not sure about the exact meaning of 'bias' in the lecture you are reading.