I have a data set that I created by aggregating at the top hierarchy level. The data is based on users' statistics from using a website over about a year. Some of the dimensions I have captured are:
- Number of cookies used
- Number of different locations logged in from
- Number of different browsers used
- Number of different operating systems used
- User's product utilization rate (in percent)
- Number of times logged in
and so on.
Using all these dimensions, I want to create a weighted index and use it to identify customers who are at risk of churning. I don't want to simply add up all the dimensions and use the sum as the index.
Is there a way (using statistics, machine learning, or anything else) to assign a weight to each column so that the resulting index reflects the risk of losing a customer, based on the statistics in the columns used to build it?
I am very confused about how to use all these columns to determine which customers will churn, so I think building an index and defining a threshold below which a customer is at risk of being lost would be the best option. But how do I build such an index? Please guide.
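To make the idea concrete, here is a rough sketch of the kind of index I have in mind. Everything in it is an assumption for illustration: the column names, the sample values, the weights, and the threshold are all made up. The weights in particular are placeholders, since how to choose them is exactly what I am asking. The columns are standardized first so that differences in scale (for example, percent versus login counts) do not dominate the sum.

```python
from statistics import mean, pstdev

# Hypothetical per-user aggregates; names and values are invented for illustration.
users = {
    "u1": {"n_locations": 1, "n_browsers": 1, "utilization_pct": 80.0, "n_logins": 120},
    "u2": {"n_locations": 3, "n_browsers": 2, "utilization_pct": 35.0, "n_logins": 40},
    "u3": {"n_locations": 2, "n_browsers": 1, "utilization_pct": 10.0, "n_logins": 5},
}

# Placeholder weights (assumed, not derived from data).
weights = {"n_locations": 0.1, "n_browsers": 0.1, "utilization_pct": 0.5, "n_logins": 0.3}


def standardize(rows, columns):
    """Return z-scores per column so every dimension is on a comparable scale."""
    z = {uid: {} for uid in rows}
    for col in columns:
        values = [row[col] for row in rows.values()]
        mu, sigma = mean(values), pstdev(values)
        for uid, row in rows.items():
            # A constant column carries no information, so it contributes 0.
            z[uid][col] = (row[col] - mu) / sigma if sigma > 0 else 0.0
    return z


def weighted_index(rows, col_weights):
    """Weighted sum of standardized columns, one score per user."""
    z = standardize(rows, col_weights.keys())
    return {uid: sum(col_weights[c] * z[uid][c] for c in col_weights) for uid in rows}


THRESHOLD = 0.0  # placeholder cut-off; users scoring below it are flagged

index = weighted_index(users, weights)
at_risk = sorted(uid for uid, score in index.items() if score < THRESHOLD)
print(index)
print(at_risk)
```

With a simple sum in place of the weights, every column would count equally after scaling, which is what I want to avoid. What I am missing is a principled way to fill in `weights` and `THRESHOLD`.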