Suppose I want to predict the number of people infected by Covid-19 one week and one month from now using the least squares method. Assume the function to be approximated, F(t), gives the number of people infected on day t, with t >= 0.
Should I use all the data I have to make the predictions? By all the data I mean the number of people infected at the beginning of the pandemic, the number infected on the day after, and so on until today. Or would it be better to use only recent data? If so, how can I determine which data to exclude?
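To make the question concrete, here is a rough sketch of the comparison I have in mind. It is only an illustration under my own assumptions: the data is synthetic (made up, not real case counts), the model is a straight line fitted with NumPy's `np.polyfit`, and the helper names `holdout_error` and `fit_and_forecast` as well as the candidate window lengths are hypothetical choices of mine. The idea is to hold back the last 7 days, fit on the `window` days before them, and see which window predicts the held-back days best.

```python
import numpy as np

def holdout_error(t, y, window, holdout=7, degree=1):
    """Fit on the `window` days preceding the last `holdout` days,
    then return the RMSE of the fit on those held-out days."""
    t_train, y_train = t[:-holdout], y[:-holdout]
    coeffs = np.polyfit(t_train[-window:], y_train[-window:], degree)
    pred = np.polyval(coeffs, t[-holdout:])
    return float(np.sqrt(np.mean((pred - y[-holdout:]) ** 2)))

def fit_and_forecast(t, y, window, horizon, degree=1):
    """Least-squares fit on the last `window` days, evaluated
    `horizon` days after the last observed day."""
    coeffs = np.polyfit(t[-window:], y[-window:], degree)
    return float(np.polyval(coeffs, t[-1] + horizon))

# Synthetic, made-up data: the growth rate changes on day 30.
t = np.arange(60, dtype=float)
y = np.where(t < 30, 10.0 * t, 300.0 + 40.0 * (t - 30))

# Candidate windows (assumed): two recent windows and "all training data" (53 days).
candidate_windows = [14, 21, 53]
errors = {w: holdout_error(t, y, w) for w in candidate_windows}
best_window = min(errors, key=errors.get)

week_ahead = fit_and_forecast(t, y, best_window, horizon=7)
month_ahead = fit_and_forecast(t, y, best_window, horizon=30)
print(errors, best_window, week_ahead, month_ahead)
```

In this toy case the fit on all the data does worse on the held-out week than the fits on recent data, because the early days follow a different trend. I do not know whether choosing the window this way is sound for real data, which is part of what I am asking.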
Thank you for your attention.