Let X and Y be two arrays, each containing n numbers already in sorted order. Give an O(lgn)-time algorithm to find the median of all 2n elements in arrays X and Y .
It's comparatively easy to write down a progressive algoritm by "divide and conquer".
We compare the medians of respect array and adjust to shrink the search scope by a half until there is at least one empty array.
def medianOf2SortedArray(x:[], y:[]):
countX = len(x)
countY = len(y)
if countX == 0:
return y[0]
elif countY == 0:
return x[0]
medianIndexX = (countX-1) // 2
medianIndexY = (countY-1) // 2
medianX = x[medianIndexX]
medianY = y[medianIndexY]
less = larger = []
if medianX > medianY:
less = y[medianIndexY+1:]
larger = x[:medianIndexX]
elif medianX < medianY:
less = x[medianIndexX+1:]
larger = y[:medianIndexY]
if len(less)==0 and len(larger)==0:
return min(medianX, medianY)
return medianOf2SortedArray(less, larger)
But the fact is that this can be an undetermined algorithm by which we can barely fetch an inaccurate "relative" median since the algorithm doesn't check the specified number of elements before the median.
So here is an refined determined algorithm:
def findMedian(x:[], y:[], n:int, startIndex:int, endIndex:int) -> int:
if startIndex > endIndex:
return -1
medianIndexX = (startIndex+endIndex) // 2
medianX = x[medianIndexX]
complementIndex = n-medianIndexX-2
if medianIndexX == n-1 and medianX <= y[0]:
return medianX
elif medianIndexX < n-1 and medianX >= y[complementIndex] and \
medianX <= y[complementIndex+1]:
return medianX
elif medianX > y[complementIndex+1]:
return findMedian(x, y, n, startIndex, medianIndexX-1)
else:
return findMedian(x, y, n, medianIndexX+1, endIndex)
def twoSortedArrayMedian(x:[], y:[]) -> int:
median = findMedian(x, y, len(x), 0, len(x)-1)
if median < 0:
return findMedian(y, x, len(y), 0, len(y)-1)
return median
Let's build a testcase to be on the safe side.
def mergeOf2SortedArrays(x:[], y:[]) -> []:
array = []
while len(x) and len(y):
if x[len(x)-1] <= y[len(y)-1]:
array.insert(0, y.pop())
else:
array.insert(0, x.pop())
return x + y + array
def test_twoSortedArrayMedian(self):
for i in range(0, 100):
n = randint(20, 30)
k = randint(10, 50)
x = list(range(k, k+n))
k = randint(10, 50)
y = list(range(k, k+n))
median = twoSortedArrayMedian(x, y)
merge = mergeOf2SortedArrays(x, y)
self.assertEqual(median, merge[(len(merge) - 1) // 2])
We merge the two experimented arrays and get the exact median by Θ(n) in the test case.
Every time we experiment, the median hits the specified index without concerning about the identical elements.
The model would be a half-divided binary tree with each evaluation taking one side of the alternatives, which resembles a binary search.
The base case is down to Θ(1), so the depth of the tree is lgn. Every level of the tree spends Θ(1) time, so the total time is amounted to Θ(lgn).