LeetCode 1093: Statistics from a Large Sample

A clear explanation of computing statistical measures (minimum, maximum, mean, median, mode) from a frequency count array.

Problem Restatement

We are given a sample of integers. The sample is represented as a count array count where count[k] is the number of times integer k appears.

We need to compute:

  • minimum: the minimum value in the sample.
  • maximum: the maximum value in the sample.
  • mean: the average value, as a floating-point number.
  • median: the middle value of the sorted sample, or the average of the two middle values when the sample size is even.
  • mode: the most frequent value (smallest if tie).

Return an array [minimum, maximum, mean, median, mode].

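The official constraints state that count.length == 256, that 0 <= count[i] <= 10^9, and that the total count is between 1 and 10^9. The values themselves therefore lie in the range 0 to 255, but the sample can be far too large to expand into an explicit list.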

Input and Output

| Item | Meaning |
| --- | --- |
| Input | Count array count (length 256) |
| Output | [minimum, maximum, mean, median, mode] |

Function shape:

def sampleStats(count: list[int]) -> list[float]:
    ...

Examples

Example 1:

count = [0,1,3,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]

Values: 1 appears once, 2 appears 3 times, 3 appears 4 times.

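  • Minimum = 1 and maximum = 3.
  • Mean = (1*1 + 2*3 + 3*4) / 8 = 19 / 8 = 2.375.
  • Mode = 3, the most frequent value.
  • Median: the sorted sample is [1, 2, 2, 2, 3, 3, 3, 3], so the two middle values are 2 and 3 and the median is (2 + 3) / 2 = 2.5.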

Answer:

[1.0, 3.0, 2.375, 2.5, 3.0]
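
The example above can be cross-checked by expanding the counts into the explicit sample and using Python's standard statistics module. This is only a sketch for small inputs: the helper name brute_force_stats is hypothetical, and expanding the sample is not how the actual solution should work.

```python
import statistics


def brute_force_stats(count):
    # Hypothetical helper: expand the histogram into the sorted sample.
    # Only viable when the total count is small.
    sample = [value for value, c in enumerate(count) for _ in range(c)]
    return [
        float(min(sample)),
        float(max(sample)),
        float(statistics.mean(sample)),
        float(statistics.median(sample)),
        float(statistics.mode(sample)),
    ]


count = [0] * 256
count[1], count[2], count[3] = 1, 3, 4
print(brute_force_stats(count))  # [1.0, 3.0, 2.375, 2.5, 3.0]
```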

Edge Cases

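  • Minimum input size: the total count can be 1, in which case all five statistics equal the single value present.
  • Ties and even totals: when the total count is even, the two middle ranks may fall in different buckets, so both must be located before averaging. If several values share the highest count, the smallest of them is returned as the mode.
  • There is no failure case to handle: the total count is at least 1, so every statistic is defined.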
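
The edge cases above can be exercised with a few small checks. The block below is a minimal sketch: sample_stats is a hypothetical standalone function written for this illustration, and the specific test inputs are assumptions chosen to hit each case.

```python
def sample_stats(count):
    # Hypothetical standalone version, used here only to check edge cases.
    n = sum(count)
    present = [v for v, c in enumerate(count) if c > 0]
    minimum, maximum = present[0], present[-1]
    mean = sum(v * c for v, c in enumerate(count)) / n
    # max() keeps the first maximal index, so ties go to the smallest value.
    mode = max(range(len(count)), key=lambda v: count[v])

    # 1-indexed ranks of the middle elements (equal when n is odd).
    lo, hi = (n + 1) // 2, (n + 2) // 2
    med1 = med2 = None
    running = 0  # number of sample values <= v
    for v, c in enumerate(count):
        running += c
        if med1 is None and running >= lo:
            med1 = v
        if running >= hi:
            med2 = v
            break
    return [float(minimum), float(maximum), mean, (med1 + med2) / 2, float(mode)]


# Single-element sample: every statistic is that one value.
single = [0] * 256
single[7] = 1
assert sample_stats(single) == [7.0] * 5

# Even total with the two middle ranks in different buckets, plus a mode tie.
split = [0] * 256
split[0] = split[255] = 2
assert sample_stats(split) == [0.0, 255.0, 127.5, 127.5, 0.0]

# A bucket far too large to expand into a list is handled without expansion.
huge = [0] * 256
huge[5] = 10**9
assert sample_stats(huge) == [5.0] * 5
```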

Common Pitfalls

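  • Do not expand the count array into the full sample. Every statistic can be computed directly from the counts, and the running total used for the median keeps the maintained condition explicit: it is the number of sample values less than or equal to the current index.
  • Prefer descriptive names such as running, lo, and hi over one-letter variables once the logic carries state across loop iterations.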

Implementation

class Solution:
    def sampleStats(self, count: list[int]) -> list[float]:
        # Total number of values in the sample.
        n = sum(count)

        # Smallest and largest values that actually occur.
        minimum = next(i for i, c in enumerate(count) if c > 0)
        maximum = next(i for i in range(len(count) - 1, -1, -1) if count[i] > 0)

        # Mean: weighted sum of the values divided by the sample size.
        total = sum(i * c for i, c in enumerate(count))
        mean = total / n

        # max() returns the first maximal index, so ties go to the smallest value.
        mode = max(range(len(count)), key=lambda i: count[i])

        # 1-indexed ranks of the two middle elements; lo == hi when n is odd.
        lo, hi = (n + 1) // 2, (n + 2) // 2
        med1 = med2 = None
        running = 0  # number of sample values <= i
        for i, c in enumerate(count):
            running += c
            if med1 is None and running >= lo:
                med1 = i
            if med2 is None and running >= hi:
                med2 = i
                break
        median = (med1 + med2) / 2.0

        return [float(minimum), float(maximum), mean, median, float(mode)]
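
The median step can also be written with prefix sums and binary search instead of an explicit loop. This is an alternative sketch, not the approach used in the solution above; the function name median_from_counts is hypothetical. It relies on the prefix sums being non-decreasing, so bisect_left finds the first index whose cumulative count reaches a given rank.

```python
from bisect import bisect_left
from itertools import accumulate


def median_from_counts(count):
    # prefix[i] = number of sample values <= i (non-decreasing).
    prefix = list(accumulate(count))
    n = prefix[-1]
    # 1-indexed ranks of the two middle elements (equal when n is odd).
    lo, hi = (n + 1) // 2, (n + 2) // 2
    # First index whose cumulative count reaches each rank.
    return (bisect_left(prefix, lo) + bisect_left(prefix, hi)) / 2


count = [0] * 256
count[1], count[2], count[3] = 1, 3, 4
print(median_from_counts(count))  # 2.5
```

With only 256 buckets the binary search brings no practical speedup; the point is that both middle ranks are looked up the same way against one cumulative array.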

Testing

def run_tests():
    s = Solution()

    count = [0] * 256
    count[1] = 1; count[2] = 3; count[3] = 4
    result = s.sampleStats(count)
    assert result == [1.0, 3.0, 2.375, 2.5, 3.0]

    print("all tests passed")

run_tests()

| Test | Expected | Why |
| --- | --- | --- |
| count[1]=1, count[2]=3, count[3]=4 | [1, 3, 2.375, 2.5, 3] | Standard statistics computation |