Let's say I have a set of variables (vectors, all of the same length N): X1, X2, X3, X4, X5, X6, ..., Xn, and a time series Y (also of length N) that depends on some of the variables X.
I need an algorithm to determine which of the variables X are most correlated with Y, i.e. I need to discard the least meaningful variables and keep the variables that are MOST influential on Y.
Example:
Let's say we want to determine what influences the web traffic of a specific IT site, and we have 5 keywords: keyword1, keyword2, keyword3, keyword4, and keyword5.
Let's say we have each keyword's search volume on Google (keyword1 = X1, keyword2 = X2, keyword3 = X3, keyword4 = X4, keyword5 = X5) and the total web traffic Y. I want to determine which keywords from the set above (X1, X2, X3, X4, or X5) are most meaningful to the total web traffic of that website: which variables I can discard and which ones move the most traffic. (Let's say all these vectors and the time series are normalized and standardized.)
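
To make the setup concrete, here is a minimal sketch of one naive baseline: rank each X by the absolute value of its Pearson correlation with Y. The data below is synthetic and made up purely for illustration (Y is built from X1 and X3 plus noise), and the function name rank_by_correlation is hypothetical. Note that this only looks at each variable on its own at lag zero, so it ignores delayed effects and any correlation among the X variables themselves.

```python
import numpy as np

def rank_by_correlation(X, y, names):
    """Return (name, r) pairs sorted by |Pearson r| with y, strongest first."""
    scores = []
    for j, name in enumerate(names):
        # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r(X_j, y)
        r = np.corrcoef(X[:, j], y)[0, 1]
        scores.append((name, float(r)))
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)

# Synthetic stand-in data (an assumption, not real search or traffic data):
# Y depends strongly on X1, weakly on X3, and not at all on the others.
rng = np.random.default_rng(0)
N = 500
X = rng.standard_normal((N, 5))
y = 2.0 * X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(N)

names = ["X1", "X2", "X3", "X4", "X5"]
ranking = rank_by_correlation(X, y, names)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Variables at the bottom of such a ranking would be the first candidates to discard, although a low pairwise correlation alone does not prove that a variable has no influence.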