Most of the literature I can find in the field of machine learning is extremely practical, listing many techniques you can use like neural networks, SVMs, random forests, and so on. There are lots of suggestions on implementations and what approaches are good for different problems. My question is, is there a single unifying formalism or theoretical framework in which one can think about problems of inferring patterns and conclusions from data?
In such a framework I'd be interested in exploring these kinds of questions:
What is the correct number of free parameters to use for a given data set? How do we know when we've over-fit? How do we express the trade-off between the accuracy of low-resolution fitting and the precision of high-resolution fitting?
Can we define machine learning techniques in general terms (instead of specific ones like neurons or forests or deep learning or graphical models or...)? How can we objectively compare these techniques? Are any of them actually identical?
What kinds of signals are possible to extract from data? What kinds of noise could prevent successful learning?
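To make the over-fitting question above concrete, here is a minimal sketch (my own toy example, not from any particular framework): fit polynomials of increasing degree to noisy samples of a known function, and compare error on the training set with error on a held-out set. Training error keeps falling as we add free parameters, while held-out error eventually rises, which is one operational way to "know when we've over-fit".

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of sin(2*pi*x) on [0, 1] (hypothetical toy data)
x = rng.uniform(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 40)

# Simple train / held-out split
train_x, test_x = x[:30], x[30:]
train_y, test_y = y[:30], y[30:]

def mse(deg):
    """Fit a degree-`deg` polynomial on the training set and return
    (training MSE, held-out MSE)."""
    coeffs = np.polyfit(train_x, train_y, deg)
    pred_train = np.polyval(coeffs, train_x)
    pred_test = np.polyval(coeffs, test_x)
    return (np.mean((pred_train - train_y) ** 2),
            np.mean((pred_test - test_y) ** 2))

for deg in (1, 3, 15):
    tr, te = mse(deg)
    print(f"degree {deg:2d}: train MSE={tr:.3f}  held-out MSE={te:.3f}")
```

Training error is (essentially) monotone in the number of free parameters; the held-out error is the quantity worth minimizing when choosing model complexity.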
I agree with the other respondent: the focus in machine learning is more geared towards ready-to-implement algorithms. However, there are some simple theoretical results, like the bias-variance tradeoff, and other discussions of "limiting" performance. From what I've seen, your question might be most practically addressed by the area of model selection (i.e., how to pick one of the algorithms you mentioned to solve a given problem). In some cases there is no single best algorithm, and the best option is to use both (or several) algorithms in an ensemble method, e.g. boosting, which exploits the fact that different learning algorithms are best at classifying different types of examples.
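The ensemble point can be sketched in a few lines (a toy illustration I made up, not a full boosting implementation): three weak "stump" classifiers, each only partially correct, are combined by majority vote on a toy linearly separable problem.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy data: label is 1 when x0 + x1 > 0
X = rng.uniform(-1, 1, (200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def stump(feature, threshold):
    """A weak learner: an axis-aligned threshold rule on one feature."""
    return lambda X: (X[:, feature] > threshold).astype(int)

# Each weak learner is right on a different part of the input space
learners = [stump(0, 0.0), stump(1, 0.0), stump(0, -0.5)]

def accuracy(pred):
    return np.mean(pred == y)

votes = np.stack([h(X) for h in learners])        # shape (3, n_samples)
majority = (votes.sum(axis=0) >= 2).astype(int)   # simple majority vote

for i, h in enumerate(learners):
    print(f"learner {i}: accuracy {accuracy(h(X)):.2f}")
print(f"ensemble : accuracy {accuracy(majority):.2f}")
```

Boosting goes further than this unweighted vote: it reweights the training examples so each new learner concentrates on the cases the previous ones got wrong, but the core idea, combining learners with different error profiles, is the same.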