To those of you who have
- any experience with machine learning in general and neural networks in particular
- logic as your field of expertise or interest:
to what extent, and how, has the first broadened your understanding of the second?
Why I'm asking:
My background is far removed from both logic and machine learning. Trying to bridge the gap between the field I study and logic, which I enjoy, I got myself a job involving XML, XSL, XPath and some programming. One thing led to another: a deep learning project came along, and now I want to keep working on it. Still, I keep wishing to apply for a master's program in logic, which ideally would also let me keep my student job on the project.
Obviously, doing logic for its own sake is good enough as a reason. For my application, however, I want to arm myself with arguments beyond this one.
So far, I could only think of some ways logic has helped me teach myself and practice developing neural networks:
- describing the problem to choose the right model components and parameters
(In matters of classification, which I am working on: how many possible classes can there be in one dataset / in any dataset to be analyzed? How do the features differ from dataset to dataset? Are there any exceptions to be considered?)
- extracting feature data by XPath queries
(Anecdotal argument, but hey - it involves considering the many ways relevant data can be nested in the structure of the documents my networks are developed for. Every query involves formulating precise conditions covering all of those cases.)
- practical and methodical aspects of debugging
(as in: What needs to be changed in the code to allow the right conclusions about where the mistake is?)
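
To illustrate the XPath point, here is a minimal sketch of what "covering all the cases" can look like. The document and its element names (corpus, record, meta, title) are made up for this example and are not from my actual project. It uses Python's standard xml.etree.ElementTree, which only supports a subset of XPath, so the cases are handled by separate queries rather than one combined expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are invented for illustration.
DOC = """
<corpus>
  <record id="a"><title>First</title></record>
  <record id="b"><meta><title>Second</title></meta></record>
  <record id="c"><meta lang="en"/></record>
</corpus>
"""

def extract_titles(xml_text):
    """Return {record id: title text or None}, covering three nesting cases."""
    root = ET.fromstring(xml_text)
    titles = {}
    for record in root.findall(".//record"):
        # Case 1: the title is a direct child of the record.
        node = record.find("title")
        # Case 2: the title is nested one level deeper, under <meta>.
        if node is None:
            node = record.find("meta/title")
        # Case 3: there is no title at all, so the feature is missing.
        titles[record.get("id")] = node.text if node is not None else None
    return titles

titles = extract_titles(DOC)
print(titles)
```

Each branch corresponds to one explicit condition on the document structure, which is where the habit of enumerating cases exhaustively comes in.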
Also, neural networks have influenced my philosophical notion of logic:
- Machine learning can mean working with crisp and fuzzy truth values at the same time. (When a network is trained for classification, ground-truth labels coded as binary vectors are compared with output vectors containing probabilities. The latter could just as well be understood as fuzzy truth values.)
- Those truth values can just as well serve as features of the instances whose classifications bear them.
(In the early stages of building a network, you can try feeding it the same one-hot vectors as input, the "features" being 0s and 1s, that are later compared with the output.)
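
A minimal sketch of the comparison I mean, in plain Python with no ML library. The logits are arbitrary numbers chosen for illustration, and softmax plus cross-entropy is just one common setup, assumed here; reading the output as fuzzy truth values is my interpretation, not standard terminology.

```python
import math

def softmax(logits):
    """Turn raw scores into values in (0, 1) that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(one_hot, probs):
    """Loss between a crisp one-hot label and a graded output vector."""
    # Only positions where the label is 1 contribute to the sum.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

label = [0, 1, 0]                   # crisp: "the instance belongs to class 1"
probs = softmax([0.5, 2.0, -1.0])   # graded: a degree for every class
loss = cross_entropy(label, probs)  # how far the graded answer is from the crisp one

# Collapsing the graded vector back to a crisp one by taking the largest entry.
best = probs.index(max(probs))
crisp = [1 if i == best else 0 for i in range(len(probs))]

print(probs, loss, crisp)
```

The loss is zero only when the output puts its full weight on the labelled class, so training pushes the graded values toward the crisp ones.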
Not very solid arguments, I have to admit. But, as I know that there are pretty serious logicians among you people, I'd be glad to get to know your perspective.