Regression: Service time based on number of copiers

Question

211 Views Asked by Bumbble Comm At 25 Feb 2026 - 6:13

For a random sample of 10 service calls, both the number of copiers and the total service time were recorded.

Number of Copiers (x) : 4, 2, 5, 7, 1, 3, 4, 5, 2, 6

Service Time (y): 90, 60, 170, 190, 40, 80, 100, 130, 70, 150

 Σy = 1080, Σx=39, Σy^2=139,000, Σx^2=185, Σxy=5030,

and the data set yields the least-squares regression line $\hat Y = 11.0334+24.8632x.$

a) Compute and interpret the coefficient of determination.

b) If the number of copiers increases by 1, estimate the average increase in service time with 95% confidence.

c) One service call requires 4 copiers to be serviced. Predict the total service time with 95% confidence.

d) Compute and interpret a 95% confidence interval for the mean service time when servicing 4 copiers.

MY WORK:

I've found SSTO=SSyy =22,360 , SSxx=33 , SSR=20277 , SSE = 2083 , MSE = 260.38 , and SSxy = 818.

a) I did SSR/SSTO=20,277/22,360=0.9068, which I believe is correct.

For b) and c) I thought of possibly substituting the values of 1 for b) and 4 for c) into the equation for the least-squares regression line, but then I got confused about the 95% confidence component to the question.

d) I am lost!

Any help is greatly appreciated, thank you.

**Bumbble Comm** · Accepted Answer

Here is a guide to the answers in terms of the process of finding the regression line. I believe you should take an overview of the regression material in your text to try to understand the purpose of the formulas and what is meant by the various notations. Then focus specifically on using the formulas to get numerical values. I hope some of the following helps in this process.

(a) Coefficient of determination is $r^2,$ where $r$ is the correlation. In your text, you should have a formula for $r$ in terms of $x$s and $Y$s.

(b) When you get the regression line $\hat Y = b_0 + b_1 x,$ the slope $b_1$ is the answer to this question. Your book may write the y-intercept of the regression equation as $\hat \beta_0$ instead of $b_0,$ and the slope as $\hat \beta_1.$

(c) In the regression line, plug in $x = 4$ and find the corresponding $\hat Y.$

(d) Most texts have formulas for two "intervals" connected with regression: 'confidence' and 'prediction'. The formula for the former may be written as follows:

$$\hat Y_{n+1} \pm t^*s_{Y|x}\sqrt{\frac{1}{n}+\frac{(x_{n+1} - \bar x)^2}{SS_{xx}}}.$$

This is a confidence interval for $E(Y_{n+1})$, where $(x_{n+1}, Y_{n+1})$ are the coordinates of an observation $n+1$ in addition to the $n$ observations used to get the regression line, and $\hat Y_{n+1}$ is the value the regression line gives at $x_{n+1}.$ The number $t^*$ cuts 2.5% from the upper tail of Student's t distribution with $n - 2 = 10 - 2 = 8$ degrees of freedom. If you define the residuals as $d_i = \hat Y_i - Y_i,$ then $s_{Y|x} = \sqrt{\sum d_i^2/(n-2)}$ is their standard deviation (computed with $n-2$ rather than $n-1$ in the denominator). Also, $SS_{xx} = (n-1)S_x^2,$ where $S_x = 1.912$ (for your data) is the sample standard deviation of the $x$s.
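For comparison, the companion 'prediction' interval that most texts give for a single new observation $Y_{n+1}$ (rather than for its mean) has the same form with an extra $1$ under the square root. This is the standard textbook formula written in the notation used here, so check it against the version in your own text:

$$\hat Y_{n+1} \pm t^*s_{Y|x}\sqrt{1 + \frac{1}{n}+\frac{(x_{n+1} - \bar x)^2}{SS_{xx}}}.$$

The extra term accounts for the scatter of an individual $Y$ about the line, so the prediction interval is always wider than the confidence interval at the same $x_{n+1}.$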

Below is Minitab output for the regression procedure using your data. I have annotated it to show a few correspondences with your computations, and you should be able to find other connections.

 ## Data entry
 MTB > name c1 'x'
 MTB > set 'x'
 DATA> 4,2,5,7,1,3,4,5,2,6
 DATA> end
 MTB > name c2 'y'
 MTB > set 'y'
 DATA> 90,60,170,190,40,80,100,130,70,150
 DATA> end

 # Data Description
 MTB > desc 'x' 'y'

 Descriptive Statistics: x, y 

 Variable   N  N*   Mean  SE Mean  StDev  Minimum     Q1  Median     Q3  Maximum
 x         10   0  3.900    0.605  1.912    1.000  2.000   4.000  5.250    7.000
 y         10   0  108.0     15.8   49.8     40.0   67.5    95.0  155.0    190.0

 MTB > corr 'x' 'y'

 Correlations: x, y 

 Pearson correlation of x and y = 0.954
 P-Value = 0.000                          # sig diff from 0

Now we're ready for the regression procedure.

 ## Commands generated by MENU for regression
 MTB > Name c3 "PFIT1" c4 "CLIM1" c5 "CLIM2"
 MTB > Regress 'y' 1 'x';    # The '1' is a quirk of Minitab syntax; ignore.
 SUBC>   Constant;
 SUBC>   Predict 4;
 SUBC>     PFits 'PFIT1';
 SUBC>     CLimits 'CLIM1'-'CLIM2';
 SUBC>   Brief 2.

 Regression Analysis: y versus x 

 The regression equation is
 y = 11.0 + 24.9 x


 Predictor    Coef  SE Coef     T      P
 Constant    11.03    11.92  0.93  0.382    # y-int not signif diff from 0
 x          24.863    2.772  8.97  0.000    # slope signif diff from 0


 S = 15.8977   R-Sq = 91.0%   R-Sq(adj) = 89.8%

The y-intercept of the regression line is $b_0 = \hat \beta_0 = 11.0$ and its slope is $b_1 = \hat \beta_1 = 24.9.$ The number S in the Minitab printout is $s_{Y|x}$ in the formula mentioned earlier. The number R-Sq = 91.0% indicates that $r^2 = (0.954)^2 = 0.910.$ [For a single predictor variable $x$, it is OK to ignore R-Sq(adj).] It is not surprising that the data are consistent with a y-intercept of $0$; a 'phantom' service call to repair $x = 0$ copiers would conceivably require $Y = 0$ service time.
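If you want to check the coefficient table by hand, here is a minimal Python sketch (not part of the Minitab session) that recomputes the slope, intercept, S, and $r^2$ from the raw data, and then forms a 95% interval for the slope as $b_1 \pm t^* \cdot SE(b_1)$. The value `T_STAR = 2.306` is an assumption hard-coded from a t table for 8 degrees of freedom, and all variable names are illustrative.

```python
import math

# Data from the question.
x = [4, 2, 5, 7, 1, 3, 4, 5, 2, 6]
y = [90, 60, 170, 190, 40, 80, 100, 130, 70, 150]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross products, with no intermediate rounding.
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_yy = sum((yi - y_bar) ** 2 for yi in y)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = ss_xy / ss_xx            # slope
b0 = y_bar - b1 * x_bar       # y-intercept
sse = ss_yy - b1 * ss_xy      # residual (error) sum of squares
s = math.sqrt(sse / (n - 2))  # the "S" line in the Minitab output
r_sq = 1 - sse / ss_yy        # coefficient of determination

# Assumption: upper 2.5% point of Student's t with 8 df, read from a table.
T_STAR = 2.306
se_b1 = s / math.sqrt(ss_xx)  # the "SE Coef" entry for x
slope_ci = (b1 - T_STAR * se_b1, b1 + T_STAR * se_b1)

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"S = {s:.4f}, r^2 = {r_sq:.4f}, SE(b1) = {se_b1:.3f}")
print(f"95% CI for slope: ({slope_ci[0]:.2f}, {slope_ci[1]:.2f})")
```

The interval for the slope is one way to attach "95% confidence" to the estimated increase in service time per additional copier.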

 Analysis of Variance

 Source          DF     SS     MS      F      P
 Regression       1  20338  20338  80.47  0.000
 Residual Error   8   2022    253
 Total            9  22360

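The number MS(Resid. Err.) = 253 should correspond to your value $MSE = 260.38.$ The discrepancy is most likely due to roundoff in your computations: $SS_{xx} = 185 - 39^2/10 = 32.9,$ and rounding it to $33$ changes $SSR = SS_{xy}^2/SS_{xx}$ from $20338$ to about $20276$ (close to your $20277$), which in turn shifts $SSE$ and $MSE.$ (In a computation of this sort, do not round off anything before you get to the final answer.)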
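To see how sensitive the ANOVA entries are to early rounding, here is a small Python sketch that computes SSR, SSE, and MSE twice from the sums given in the question: once with $SS_{xx} = 32.9$ and once with $SS_{xx}$ rounded to $33.$ The helper name `anova` is made up for this illustration, and the assumption that rounding $SS_{xx}$ is the source of the discrepancy is a guess based on the values posted.

```python
# Summary quantities from the question.
n = 10
ss_yy = 22360.0   # SSTO
ss_xy = 818.0

def anova(ss_xx):
    """Return (SSR, SSE, MSE) for simple linear regression given SS_xx."""
    ssr = ss_xy ** 2 / ss_xx
    sse = ss_yy - ssr
    return ssr, sse, sse / (n - 2)

exact = anova(185 - 39 ** 2 / n)   # SS_xx = 32.9, unrounded
rounded = anova(33)                # SS_xx rounded to 33

print("SS_xx = 32.9:", [round(v, 2) for v in exact])
print("SS_xx = 33  :", [round(v, 2) for v in rounded])
```

The first line should agree with the Minitab table, and the second should land near the values in the question.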

 Unusual Observations

 Obs     x       y     Fit  SE Fit  Residual  St Resid
   3  5.00  170.00  135.35    5.88     34.65      2.35R

 R denotes an observation with a large standardized residual.


 Predicted Values for New Observations

 New Obs     Fit  SE Fit       95% CI           95% PI
       1  110.49    5.03  (98.88, 122.10)  (72.03, 148.94)


 Values of Predictors for New Observations

 New Obs     x
       1  4.00

 MTB > print c3 c4 c5

 Data Display 

 Row    PFIT1    CLIM1    CLIM2    # More decimal places than above
   1  110.486  98.8758  122.097

The predicted value $\hat Y_{n+1} = 110.49$ is obtained by plugging $x = 4$ into the regression equation. The confidence interval requested in the last question is $(98.88, 122.10).$ You should read in your text how this is different from the prediction interval.
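As a rough check on the Minitab intervals, the following Python sketch evaluates both interval formulas at $x = 4$ using the summary values already shown in the output. The constant `T_STAR = 2.306` is an assumption taken from a t table for 8 degrees of freedom, and the function name `intervals` is only illustrative.

```python
import math

# Summary values taken from the question and the Minitab output.
n = 10
x_bar = 3.9
ss_xx = 32.9
b0, b1 = 11.0334, 24.8632
s = 15.8977

# Assumption: upper 2.5% point of Student's t with 8 df, read from a table.
T_STAR = 2.306

def intervals(x_new):
    """Return (fit, SE of fit, 95% CI for the mean, 95% PI for one new Y)."""
    fit = b0 + b1 * x_new
    leverage = 1 / n + (x_new - x_bar) ** 2 / ss_xx
    se_fit = s * math.sqrt(leverage)       # standard error of the mean response
    se_pred = s * math.sqrt(1 + leverage)  # standard error for a single new Y
    ci = (fit - T_STAR * se_fit, fit + T_STAR * se_fit)
    pi = (fit - T_STAR * se_pred, fit + T_STAR * se_pred)
    return fit, se_fit, ci, pi

fit, se_fit, ci, pi = intervals(4)
print(f"fit = {fit:.2f}, SE fit = {se_fit:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI: ({pi[0]:.2f}, {pi[1]:.2f})")
```

The confidence interval is for the mean service time over all calls with 4 copiers; the prediction interval is for the service time of one particular call with 4 copiers.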

Here is a graph of the regression line (least squares line) drawn through a scatterplot of your data. Curved lines indicate the confidence intervals at each value of $x$; focus on where a vertical line at $x = 4$ crosses these curves, and compare with the CI in the output.

Notice that the regression line passes through $(\bar x, \bar Y) = (3.9, 108),$ the 'center of gravity' of the data cloud. Also, notice how far the point at $(5, 170)$ falls from the regression line. This was called out in the output from the regression procedure as an Unusual Observation.

Finally, there is an important distinction between correlation and regression: Correlation is *symmetrical*; the correlation between x and Y is the same as the correlation between Y and x. Regression is *not* symmetrical. Here we are doing 'regression of Y on x' (that is, seeking to predict Y-values from x-values). Regression of x on Y would give entirely different results.
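A short Python sketch, using the sums of squares from the question, illustrates the asymmetry: the slope for regressing x on Y is not simply the reciprocal of the slope for Y on x, although the product of the two slopes does equal $r^2.$ Variable names here are illustrative only.

```python
# Corrected sums of squares and cross products for the data in the question.
ss_xx, ss_yy, ss_xy = 32.9, 22360.0, 818.0
x_bar, y_bar = 3.9, 108.0

# Regression of Y on x (the line used throughout this answer).
slope_y_on_x = ss_xy / ss_xx
intercept_y_on_x = y_bar - slope_y_on_x * x_bar

# Regression of x on Y (predicting number of copiers from service time).
slope_x_on_y = ss_xy / ss_yy
intercept_x_on_y = x_bar - slope_x_on_y * y_bar

# Slope you would get by merely solving the Y-on-x line for x.
inverted_slope = 1 / slope_y_on_x

# The product of the two regression slopes is the coefficient of determination.
r_sq = slope_y_on_x * slope_x_on_y

print(f"Y on x: Y = {intercept_y_on_x:.4f} + {slope_y_on_x:.4f} x")
print(f"x on Y: x = {intercept_x_on_y:.4f} + {slope_x_on_y:.5f} Y")
print(f"1 / (slope of Y on x) = {inverted_slope:.5f}")
print(f"product of slopes = {r_sq:.4f}")
```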
