Square sum and Arithmetic mean

I came across this coding question where one is given as input a sequence of n numbers, say  a_{1}, a_{2},...., a_{n}   and one has to output the number (need not be one of the input values) which would minimize the difference from that number squared and summed over all numbers in the input sequence i.e. if that number is K, then this sum

\displaystyle \sum_{i=1}^{n} (a_{i} - X)^{2} calculated over \forall X \in R is minimum when X=K.

My first guess was that number would be either mean or median. After taking a few small examples, I could see that median wouldn’t be the desired answer. So, I submitted my solution by calculating the mean of the numbers and it was accepted by the system.

But, I had to convince myself that this squared sum of differences of a sequence is indeed the minimum only about the arithmetic mean. Hence, I did the following elementary mathematical proof to convince myself.

Let’s calculate the square sums across, a number X, which is  \delta away from M. So, X = M + \delta  where \delta \neq 0 . Let’s see that difference of square sum calculated about mean, S_{M}, is always lesser than square sum calculated, S_{X} about any other number X.

Arithmetic Mean, M = \frac {\displaystyle \sum_{i=1}^{n} a_{i}}{n} 
S_{M} = \displaystyle \sum_{i=1}^{n} (a_{i} - M)^{2} 
S_{X} = \displaystyle \sum_{i=1}^{n} (a_{i} - X)^{2}

S_{X} - S_{M}  
= \displaystyle \sum_{i=1}^{n} (a_{i} - X)^{2} - \sum_{i=1}^{n} (a_{i} - M)^{2}
= \displaystyle \sum_{i=1}^{n} [(a_{i} - X)^{2} - (a_{i} - M)^{2}]
= \displaystyle \sum_{i=1}^{n} [X^{2} - M^{2} - 2 a_{i} (X - M)]
= \displaystyle \sum_{i=1}^{n} [(M + \delta)^{2} - M^{2} - 2 a_{i} (M + \delta - M)] [Since, X = M + \delta]
= \displaystyle \sum_{i=1}^{n} [\delta^{2} + 2 M \delta - 2 a_{i} (\delta)]
= n (\delta^{2} + 2 M \delta)  - 2 \delta \displaystyle \sum_{i=1}^{n}  a_{i}  
= n \delta^{2} + 2 n M \delta - 2 \delta (n M)  [From definition of mean]
= n \delta^{2}
> 0 [Since \delta \neq 0 ]

Hence proved, S_{X} - S_{M} > 0 .

Hence proved that arithmetic mean is the right choice for minimizing the square difference sums of a given sequence of numbers.

2 thoughts on “Square sum and Arithmetic mean”

  1. Observe that S_X varies continuously as X, and hence we can use high-school calculus for the minimization problem. S'_X = 0 only when X is the mean. The function is continuous, and unbounded, so any extremum points must be minima. The proof that this X minimizes S_X is complete without examining the second derivative.

Leave a Reply

Your email address will not be published. Required fields are marked *

Notify me of followup comments via e-mail. You can also subscribe without commenting.