I am not very good with math. I was given a CSV file with 4 columns and 17,000 rows:
- credit score
- default % (0.00 to 1.00)
- email address entered/provided (0 OR 1)
- credit card paid (0 OR 1)
We are trying to determine whether there is a correlation between clients paying their card and having entered their email address, while factoring in their credit score and default %. I'm not sure if "weighting" is the correct term here. We want to establish, as accurately as possible, how credit score, default %, and credit card paid relate to whether an email address was entered or not. I am trying to use Matlab to plot this.
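To make the goal concrete, here is a minimal sketch of the kind of comparison we have in mind. It is written in plain Python only for illustration, and the same steps could presumably be done in Matlab. The rows are made-up placeholders (not our real data), and the function name, column order, and the 100-point score band width are all assumptions. It compares the paid rate for clients with and without an email inside each credit score band, and it leaves default % out for brevity.

```python
from collections import defaultdict

BAND_WIDTH = 100  # assumed width of a credit score band

# Hypothetical placeholder rows, NOT the real data:
# (credit_score, default_pct, email_provided, card_paid)
rows = [
    (620, 0.30, 0, 0),
    (640, 0.25, 1, 1),
    (655, 0.28, 0, 1),
    (690, 0.20, 1, 1),
    (710, 0.10, 0, 1),
    (720, 0.12, 1, 1),
    (735, 0.08, 0, 0),
    (780, 0.05, 1, 1),
]

def paid_rate_by_email(rows, band_width=BAND_WIDTH):
    """Return {(band_start, email_flag): fraction of clients who paid}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [number paid, total]
    for score, _default_pct, email, paid in rows:
        band = (score // band_width) * band_width  # e.g. 655 -> 600
        counts[(band, email)][0] += paid
        counts[(band, email)][1] += 1
    return {key: paid / total for key, (paid, total) in counts.items()}

rates = paid_rate_by_email(rows)
for (band, email), rate in sorted(rates.items()):
    print(f"score {band}-{band + BAND_WIDTH - 1}, email={email}: paid rate {rate:.2f}")
```

If the paid rate is consistently higher for email=1 than for email=0 within the same score band, that would be the sort of relationship we are hoping to measure.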
We tried several of the Machine Learning functions (Tree, Ensemble, etc.) and found that Gaussian Process Regression and Stepwise Linear Regression seem to be telling us something. However, we are unsure how to interpret the charts they produce. Both charts show a strong yellow line, but since nothing is labeled except the x and y axes, we don't know exactly what the line is telling us.
I tried to find information on how to interpret these results, but I couldn't find anything useful. Most of it was way over my head.
Is there another way we should approach this?