Tag Archives: mean

Square sum and Arithmetic mean

I came across this coding question where one is given as input a sequence of n numbers, say  a_{1}, a_{2},...., a_{n}   and one has to output the number (need not be one of the input values) which would minimize the difference from that number squared and summed over all numbers in the input sequence i.e. if that number is K, then this sum

\displaystyle \sum_{i=1}^{n} (a_{i} - X)^{2} calculated over \forall X \in R is minimum when X=K.

My first guess was that number would be either mean or median. After taking a few small examples, I could see that median wouldn't be the desired answer. So, I submitted my solution by calculating the mean of the numbers and it was accepted by the system.

But, I had to convince myself that this squared sum of differences of a sequence is indeed the minimum only about the arithmetic mean. Hence, I did the following elementary mathematical proof to convince myself.

Let's calculate the square sums across, a number X, which is  \delta away from M. So, X = M + \delta  where \delta \neq 0 . Let's see that difference of square sum calculated about mean, S_{M}, is always lesser than square sum calculated, S_{X} about any other number X.

Arithmetic Mean, M = \frac {\displaystyle \sum_{i=1}^{n} a_{i}}{n} 
S_{M} = \displaystyle \sum_{i=1}^{n} (a_{i} - M)^{2} 
S_{X} = \displaystyle \sum_{i=1}^{n} (a_{i} - X)^{2}

S_{X} - S_{M}  
= \displaystyle \sum_{i=1}^{n} (a_{i} - X)^{2} - \sum_{i=1}^{n} (a_{i} - M)^{2}
= \displaystyle \sum_{i=1}^{n} [(a_{i} - X)^{2} - (a_{i} - M)^{2}]
= \displaystyle \sum_{i=1}^{n} [X^{2} - M^{2} - 2 a_{i} (X - M)]
= \displaystyle \sum_{i=1}^{n} [(M + \delta)^{2} - M^{2} - 2 a_{i} (M + \delta - M)] [Since, X = M + \delta]
= \displaystyle \sum_{i=1}^{n} [\delta^{2} + 2 M \delta - 2 a_{i} (\delta)]
= n (\delta^{2} + 2 M \delta)  - 2 \delta \displaystyle \sum_{i=1}^{n}  a_{i}  
= n \delta^{2} + 2 n M \delta - 2 \delta (n M)  [From definition of mean]
= n \delta^{2}
> 0 [Since \delta \neq 0 ]

Hence proved, S_{X} - S_{M} > 0 .

Hence proved that arithmetic mean is the right choice for minimizing the square difference sums of a given sequence of numbers.