The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value, and the median is the mean of the two middle values.
arr = [2,3,4], the median is 3.arr = [2,3], the median is (2 + 3) / 2 = 2.5.Implement the MedianFinder class:
MedianFinder() initializes the MedianFinder object.void addNum(int num) adds the integer num from the data stream to the data structure.double findMedian() returns the median of all elements so far. Answers within 10-5 of the actual answer will be accepted.-105 <= num <= 105findMedian.5 * 104 calls will be made to addNum and findMedian.Why do it this way? The definition of median requires the array to be sorted. The most straightforward approach is to just keep appending numbers to a dynamic array. Whenever we need to find the median, we sort the entire array and pick the middle element (or the average of the two middle elements). It's very simple, but incredibly slow if findMedian() is called often, since sorting takes O(N log N) time.
Imagine every time a new student joins your class, the teacher asks you to find the exact middle height of the entire class. Instead of doing anything clever, you make everyone line up by height from scratch every single time a new student walks in the door.
class MedianFinder:
def __init__(self):
self.data = []
def addNum(self, num):
self.data.append(num)
def findMedian(self):
self.data.sort()
n = len(self.data)
if n % 2 == 1:
return self.data[n // 2]
else:
return (self.data[n // 2] + self.data[(n // 2) - 1]) / 2Why do it this way? We don't actually care about the relative order of elements on the extreme ends. We only care about what's in the exact middle. So, we can divide the data into two halves: the "smaller half" and the "larger half". If we put the smaller half in a Max-Heap, its largest element (the median candidate) will bubble to the top. If we put the larger half in a Min-Heap, its smallest element (the other median candidate) bubbles to the top! By keeping these two heaps balanced in size, finding the median is just looking at the top of the heaps.
Imagine sorting rocks by weight. Instead of lining up every single rock in a perfect row, you just throw all the lighter rocks in the Left bucket and the heavier rocks in the Right bucket. As long as you keep the number of rocks in both buckets equal, the heaviest rock in the Left bucket and the lightest rock in the Right bucket will be your median rocks.
class MedianFinder:
def __init__(self):
self.small = [] # Max Heap
self.large = [] # Min Heap
def addNum(self, num):
# Always push to large (Min Heap) first to filter
if self.large and num > self.large[0]:
self.large.push(num)
else:
self.small.push(num)
# Balance the scales!
if len(self.small) > len(self.large) + 1:
val = self.small.pop_max()
self.large.push(val)
if len(self.large) > len(self.small) + 1:
val = self.large.pop_min()
self.small.push(val)
def findMedian(self):
if len(self.small) > len(self.large):
return self.small[0]
elif len(self.large) > len(self.small):
return self.large[0]
else:
return (self.small[0] + self.large[0]) / 2
Discussion
…