I have a big data set where each entry consists of 5 inputs and 1 output value. Say inputs $x_1$, $x_2$, ... $x_5$, and output $y$.
I know there is a linear correlation between the inputs and output, and I am trying to estimate 5 coefficients $c_i$ so that:
$\sum_{i=1}^5 c_i x_i \approx y$
holds as closely as possible for my data set. I don't care too much about the definition of "as closely as possible"; it can be least squared error or least median difference or something similar.
I'm kinda rusty in this area. Is this multiple regression analysis? What is a good approach for this?
Indeed, as Henry said in the comments, OLS without a constant term may be a good starting point. That is, instead of $y \approx \sum c_i x_i$, write $$ y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_5 x_{5i} + \epsilon_i, \quad i=1,\dots,n. $$ You are assuming that each observation comes from this linear combination plus a noise term $\epsilon_i$, where $\mathbb{E}[\epsilon_i|X]=0$, $\operatorname{Var}[\epsilon_i|X]=\sigma^2$ and $\operatorname{cov}(\epsilon_i, \epsilon_j|X)=0$ for $i \neq j$.

If the true relationship has a nonzero constant term, leaving it out biases the estimated coefficients, so you may consider adding an intercept term $\beta_0$ (if that is consistent with your assumptions), i.e., $$ y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_5 x_{5i} + \epsilon_i, \quad i=1,\dots,n. $$

For estimation you can use any of the many available software packages: R, MATLAB, Excel, EViews, etc.
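As a rough illustration, here is a minimal Python sketch of the least-squares fit using NumPy's `np.linalg.lstsq`, both without a constant term and with an intercept (added as a column of ones). The helper name `fit_ols`, the synthetic data, and the "true" coefficients are made up for this example; with real data you would replace `X` and `y` with your own arrays.

```python
import numpy as np

def fit_ols(X, y, intercept=False):
    """Return least-squares coefficients for y ~ X.

    If intercept=True, a column of ones is prepended, so the first
    returned entry is the constant term and the rest are the slopes.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if intercept:
        X = np.column_stack([np.ones(len(X)), X])
    # lstsq minimizes the sum of squared residuals ||X @ beta - y||^2
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic data (an assumption for illustration): 5 inputs, 1 output
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))
true_beta = np.array([1.5, -2.0, 0.5, 3.0, -1.0])
y = X @ true_beta + rng.normal(scale=0.1, size=n)

beta_no_const = fit_ols(X, y)                    # 5 coefficients
beta_with_const = fit_ols(X, y, intercept=True)  # intercept + 5 coefficients

print("without intercept:", beta_no_const)
print("with intercept:   ", beta_with_const)
```

Because the synthetic data were generated without a constant term, the fitted intercept should come out close to zero and both fits should recover roughly the same five coefficients. On real data, comparing the two fits is one informal way to see whether the intercept matters.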