pouët.net

A math problem (I'm puzzled with this)

category: general [glöplog]
 
Suppose we have an array of floats. I want the one float m that minimizes the sum of absolute differences with the array. In code:

float array[NUMBER_OF_ELEMENTS];
float m; /* the value to find */
float sum = 0.0f;

for (int i = 0; i < NUMBER_OF_ELEMENTS; i++)
{
    sum += fabsf(array[i] - m); /* abs() is for ints, fabsf() for floats */
}

So, how do I calc the float m that makes sum as small as possible?

I'm puzzled because I always intuitively thought m should be the average of the array... but it is not. One counter example is the array {100, 0, 0, 0}. In this case, the best m is 0, and not 25 that would be the average.

So... how to calc m? I've tried searching statistics theory websites, but maybe because of my lack of knowledge of the terms I've not been able to find the solution to the problem. This is why I'm asking here.

At first sight, it doesn't even look like there is a function with a continuous first derivative for this...

Then, what about the same problem, but with the sum of squared differences instead of the sum of absolute differences?

Thanks in advance
added on the 2009-09-26 17:43:42 by texel texel
Because abs(x) is not differentiable, problems like this are usually expressed as:

for (int i = 0; i < NUMBER_OF_ELEMENTS; i++)
{
    sum += (array[i] - m) * (array[i] - m);
}

This is called "the method of least squares".

http://en.wikipedia.org/wiki/Least_squares
added on the 2009-09-26 17:52:25 by TGGC TGGC
pwned
added on the 2009-09-26 17:59:41 by ferris ferris
TGGC, he doesn't need least sum of square distances, he needs least sum of absolute differences.
added on the 2009-09-26 18:10:10 by imbusy imbusy
absolute difference = distance. Isn't there a sqrt missing somewhere?
added on the 2009-09-26 18:13:32 by xTr1m xTr1m
Anyway, a quick search reveals this:
http://en.wikipedia.org/wiki/Least_absolute_deviations
More than this thread could offer, I guess. Should have tried google first.
added on the 2009-09-26 18:16:58 by imbusy imbusy
bruteforce is the solution
added on the 2009-09-26 18:19:18 by rmeht rmeht
imbusy: See the last sentence of the original question.

rmeht: At least do it iteratively.
added on the 2009-09-26 18:30:17 by TGGC TGGC
From wikipedia:
The median is also the central point which minimizes the average of the absolute deviations
For squared deviations I'm positive that the mean is optimal, though I didn't find a reference.
added on the 2009-09-26 18:37:47 by chock chock
TGGC:

Yes, I know, but I want the m1 that minimizes the sum of squares, and on the other hand, the m2 that minimizes the sum of absolute differences.
added on the 2009-09-26 19:14:14 by texel texel
So, from http://en.wikipedia.org/wiki/Least_absolute_deviations:

Quote:
Though the idea of least absolute deviations regression is just as straightforward as that of least squares regression, the least absolute deviations line is not as simple to compute efficiently. Unlike least squares regression, least absolute deviations regression does not have an analytical solving method. Therefore, an iterative approach is required.
added on the 2009-09-26 19:15:33 by texel texel
Quote:
From wikipedia:
The median is also the central point which minimizes the average of the absolute deviations


Thanks
added on the 2009-09-26 19:17:37 by texel texel
There may be more than one solution (median) to your problem. With {0;1} all m in [0;1] will be ok. I think that's why we use least squares instead of least absolute deviation in general.
added on the 2009-09-28 10:44:43 by baah baah
Quote:
Pouet is also the central point which maximizes the average of the absolute deviations
added on the 2009-09-28 10:51:39 by gloom gloom
4
added on the 2009-09-28 11:25:39 by Wade Wade
Quote:
There may be more than one solution (median) to your problem. With {0;1} all m in [0;1] will be ok. I think that's why we use least squares instead of least absolute deviation in general.

The reason that everyone loves least squares is that it boils down to solving a linear system of equations. Also, the 2-norm is invariant under orthogonal transformations (rotations and reflections).

In short, it's enormously convenient. Penalizing larger deviations more makes sense sometimes, yes, but people use least squares even when the actual minimization problem is a 1-norm (sum of absolute differences) or infinity-norm (minimize maximum difference) problem, because it's so much simpler to deal with.
added on the 2009-09-28 11:39:55 by ryg ryg
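In the one-float case from the original question, the "linear system" ryg mentions collapses to a single linear equation: set the derivative of the squared sum to zero and the mean pops out.

```latex
\frac{d}{dm}\sum_{i=1}^{N}(a_i - m)^2 \;=\; -2\sum_{i=1}^{N}(a_i - m) \;=\; 0
\quad\Longrightarrow\quad
m \;=\; \frac{1}{N}\sum_{i=1}^{N} a_i
```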
Quote:
One counter example is the array {100, 0, 0, 0}. In this case, the best m is 0, and not 25 that would be the average.


Hmm.. (100-25) + (0-25) + (0-25) + (0-25) = 0. Well yeah my few cents into the conversation.
added on the 2009-09-28 12:04:27 by harism harism
you're missing some abses there harism
added on the 2009-09-28 12:23:16 by jix jix
haha. how do i remove my comment :D
added on the 2009-09-28 12:42:23 by harism harism
at pouet nobody deletes ANYTHING, it'll haunt you forever
added on the 2009-09-28 17:28:04 by superplek superplek
You can see the problem recursively.
1) For any value v between min and max, abs(min-v) + abs(max-v) = (v-min) + (max-v) = max-min, so the choice of v doesn't affect that part of the sum.
2) So we can safely drop those two values (the current min and max) from the array.
3) goto 1
The loop stops when 1 or 2 items remain in the array.
4) if 2 items are left, then any value between them is good
5) if 1 item is left, then it is the only possible value

So if the original array is sorted, the solution is to take the item in the middle (index rounded up or down), in other words the median.

Hope this helps
added on the 2009-09-28 21:05:27 by riben riben
Quote:
haha. how do i remove my comment :D
I think it might take a roboknight to defeat a boboknight. And I think your comment is etched in stone here on pouet sadly.
