Geometric Median is an important concept in the intersection of Geometry, Data Analysis and Algorithms. This article explores the concept.
Given a set of numbers on the real line, say \( a_1, a_2, a_3, ..., a_n\), the median of these numbers is the middle number when you arrange these numbers in ascending/descending order. To understand more about the median for a set of numbers, go through this.
Let's ask a different question altogether.
Given a set of numbers on the real line, say \( a_1, a_2, a_3, ..., a_n\), what is the number \( x\) such that \( |x-a_1| + |x-a_2| + ... |x-a_n|\) is minimized?
In other words, what is the point on the real number line, such that the sum of the distances of that point from the given set of points is minimum? In some sense, this point gives an idea about the central behavior of the set of points.
In fact, this turns out to be the median of the set of numbers { \( a_1, a_2, a_3, ..., a_n\)}. To understand the proof, go through this.
But, I suggest you try this yourself, in inductive steps. Start with two points, any point in between them is the median. Now, add a point in between those two points, Observe that the new median is the new point added. Continue similarly in this inductive process.
My contentment is to ask questions and seek. Therefore, a lot of questions inpoured.
Well, the intelligent way is to answer the easiest of all and then generalize the idea - the first question.
Given a set of numbers in the 2D space, say \( a_1, a_2, a_3, ..., a_n\), what is the point \( x\) such that \( |x-a_1| + |x-a_2| + ... |x-a_n|\) is minimized?
Let's take the easiest distance concept of all - The Euclidean Distance \(d(x,y) = |x-y|= \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \).
This concept is known as Geometric Median in the literature.
Now, the problem seems difficult even in the case of three points.
Let's explore it for a quadrilateral ( \( n = 4 \) ).
Step 1: Does there exist a better point like that?
Yes, there is! Let's explore.
Observe that (AI + IF) + (BI + IE) = (AG + GF) + (BE) < (AG + GF) + (BG + GE). We are just using the triangle inequality.
Hence, G is a better point than I.
Step 2: Does there exist a better point than I?
Albeit, there is!
(AI + IF) + (BI + IE) = (AI + IF) + (BE) = (AI + IF) + (BJ + JE) > (AG + GF) + (BG + GE). We are just using the triangle inequality.
Thus, this algorithm results in the converges to the intersection of the diagonals as a special point. This is easy.
But what about, \( n = 3\). It may be easy. No, it's not.
The case for triangles is not at all easy. In the case of triangles, the point is called Fermat's Point.
I will not repeat the solution again. The following pic gives a cute solution. Using the same idea, that the distance between two points is minimum along a straight line.
Now, the rest of the questions remain too difficult to solve. Even there hardly exists a good algorithm to solve it.
Now, this left to your imagination. Stay tuned!