A clear explanation of finding the longest substring that appears at least twice using binary search on length with rolling hash.
Problem Restatement
We are given a string s.
Return the length of the longest repeating substring (a substring that appears at least twice).
If no repeating substring exists, return 0.
The official constraints state that 1 <= s.length <= 2000 and s consists only of lowercase English letters.
Input and Output
| Item | Meaning |
|---|---|
| Input | String s |
| Output | Length of the longest repeating substring |
Function shape:
def longestRepeatingSubstring(s: str) -> int:
...Examples
Example 1:
s = "abcd"No repeating substring. Answer: 0.
Example 2:
s = "abbaba""ab" appears at indices 0 and 3. Length 2.
Answer:
2Key Insight
Binary search on the length L. For each L, use a set of hashed substrings of length L to detect duplicates.
The problem is monotone: if a repeating substring of length L exists, one of length L-1 also exists.
Algorithm
- Binary search on
Lfrom0ton-1. - For each
L, collect all substrings of lengthLinto a set. If any appear twice, the search succeeds. - Return the largest valid
L.
Edge Cases
- Check the smallest and largest valid search values; off-by-one errors usually appear at the boundaries.
- Decide whether the predicate is looking for the first true value, the last true value, or an exact match.
- When the target is absent, the loop should still terminate and return the problem-specific failure value.
Complexity
| Metric | Value | Why |
|---|---|---|
| Time | O(n^2 log n) | Binary search × substring hashing |
| Space | O(n) | Set of substrings |
Common Pitfalls
- Do not mix inclusive and half-open bounds inside the same loop.
- Make sure each branch strictly reduces the search interval.
Implementation
class Solution:
def longestRepeatingSubstring(self, s: str) -> int:
n = len(s)
def has_repeat(L):
seen = set()
for i in range(n - L + 1):
sub = s[i:i+L]
if sub in seen:
return True
seen.add(sub)
return False
lo, hi = 0, n - 1
while lo < hi:
mid = (lo + hi + 1) // 2
if has_repeat(mid):
lo = mid
else:
hi = mid - 1
return loTesting
def run_tests():
s = Solution()
assert s.longestRepeatingSubstring("abcd") == 0
assert s.longestRepeatingSubstring("abbaba") == 2
assert s.longestRepeatingSubstring("aabcaabdaab") == 3
print("all tests passed")
run_tests()| Test | Expected | Why |
|---|---|---|
| No repeats | 0 | All chars distinct |
"abbaba" | 2 | "ab" repeats |
| Longer repeats | 3 | "aab" repeats |