Given two integer arrays A
and B
, return the maximum length of an subarray that appears in both arrays.
Example 1:
Input: A: [1,2,3,2,1] B: [3,2,1,4,7] Output: 3 Explanation: The repeated subarray with maximum length is [3, 2, 1].
Note:
Intuition and Algorithm
\nIn a typical brute force, for all starting indices i
of A
and j
of B
, we will check for the longest matching subarray A[i:i+k] == B[j:j+k]
of length k
. This would look roughly like the following psuedocode:
ans = 0\nfor i in [0 .. A.length - 1]:\n for j in [0 .. B.length - 1]:\n k = 0\n while (A[i+k] == B[j+k]): k += 1 #and i+k < A.length etc.\n ans = max(ans, k)\n
Our insight is that in typical cases, most of the time A[i] != B[j]
. We could instead keep a hashmap Bstarts[A[i]] = all j such that B[j] == A[i]
, and only loop through those in our j
loop.
Python
\nclass Solution(object):\n def findLength(self, A, B):\n ans = 0\n Bstarts = collections.defaultdict(list)\n for j, y in enumerate(B):\n Bstarts[y].append(j)\n\n for i, x in enumerate(A):\n for j in Bstarts[x]:\n k = 0\n while i+k < len(A) and j+k < len(B) and A[i+k] == B[j+k]:\n k += 1\n ans = max(ans, k)\n return ans\n
Java
\nclass Solution {\n public int findLength(int[] A, int[] B) {\n int ans = 0;\n Map<Integer, ArrayList<Integer>> Bstarts = new HashMap();\n for (int j = 0; j < B.length; j++) {\n Bstarts.computeIfAbsent(B[j], x -> new ArrayList()).add(j);\n }\n\n for (int i = 0; i < A.length; i++) if (Bstarts.containsKey(A[i])) {\n for (int j: Bstarts.get(A[i])) {\n int k = 0;\n while (i+k < A.length && j+k < B.length && A[i+k] == B[j+k]) {\n k++\n }\n ans = Math.max(ans, k);\n }\n }\n return ans;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B
. The worst case is when all the elements are equal.
Space Complexity: , the space used by Bstarts
. (Of course, we could amend our algorithm to make this .)
Intuition
\nIf there is a length k
subarray common to A
and B
, then there is a length j <= k
subarray as well.
Let check(length)
be the answer to the question "Is there a subarray with length
length, common to A
and B
?" This is a function with range that must take the form [True, True, ..., True, False, False, ..., False]
with at least one True
. We can binary search on this function.
Algorithm
\nFocusing on the binary search, our invariant is that check(hi)
will always be False
. We\'ll start with hi = min(len(A), len(B)) + 1
; clearly check(hi) is False
.
Now we perform our check in the midpoint mi
of lo
and hi
. When it is possible, then lo = mi + 1
, and when it isn\'t, hi = mi
. This maintains the invariant. At the end of our binary search, hi == lo
and lo
is the lowest value such that check(lo) is False
, so we want lo - 1
.
As for the check itself, we can naively check whether any A[i:i+k] == B[j:j+k]
using set structures.
Python
\nclass Solution(object):\n def findLength(self, A, B):\n def check(length):\n seen = set(tuple(A[i:i+length])\n for i in xrange(len(A) - length + 1))\n return any(tuple(B[j:j+length]) in seen\n for j in xrange(len(B) - length + 1))\n\n lo, hi = 0, min(len(A), len(B)) + 1\n while lo < hi:\n mi = (lo + hi) / 2\n if check(mi):\n lo = mi + 1\n else:\n hi = mi\n return lo - 1\n
Java
\nclass Solution {\n public boolean check(int length, int[] A, int[] B) {\n Set<String> seen = new HashSet();\n for (int i = 0; i + length <= A.length; ++i) {\n seen.add(Arrays.toString(Arrays.copyOfRange(A, i, i+length)));\n }\n for (int j = 0; j + length <= B.length; ++j) {\n if (seen.contains(Arrays.toString(Arrays.copyOfRange(B, j, j+length)))) {\n return true;\n }\n }\n return false;\n }\n\n public int findLength(int[] A, int[] B) {\n int lo = 0, hi = Math.min(A.length, B.length) + 1;\n while (lo < hi) {\n int mi = (lo + hi) / 2;\n if (check(mi, A, B)) lo = mi + 1;\n else hi = mi;\n }\n return lo - 1;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B
. The log factor comes from the binary search. The complexity of our naive check of a given is , as we will create the seen
strings with complexity , then search for them with complexity , and our total complexity when performing our check
is the addition of these two.
Space Complexity: , the space used by seen
.
Intuition and Algorithm
\nSince a common subarray of A
and B
must start at some A[i]
and B[j]
, let dp[i][j]
be the longest common prefix of A[i:]
and B[j:]
. Whenever A[i] == B[j]
, we know dp[i][j] = dp[i+1][j+1] + 1
. Also, the answer is max(dp[i][j])
over all i, j
.
We can perform bottom-up dynamic programming to find the answer based on this recurrence. Our loop invariant is that the answer is already calculated correctly and stored in dp
for any larger i, j
.
Python
\nclass Solution(object):\n def findLength(self, A, B):\n memo = [[0] * (len(B) + 1) for _ in range(len(A) + 1)]\n for i in range(len(A) - 1, -1, -1):\n for j in range(len(B) - 1, -1, -1):\n if A[i] == B[j]:\n memo[i][j] = memo[i+1][j+1]+1\n return max(max(row) for row in memo)\n
Java
\nclass Solution {\n public int findLength(int[] A, int[] B) {\n int ans = 0;\n int[][] memo = new int[A.length + 1][B.length + 1];\n for (int i = A.length - 1; i >= 0; --i) {\n for (int j = B.length - 1; j >= 0; --j) {\n if (A[i] == B[j]) {\n memo[i][j] = memo[i+1][j+1] + 1;\n if (ans < memo[i][j]) ans = memo[i][j];\n }\n }\n }\n return ans;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B
.
Space Complexity: , the space used by dp
.
Intuition
\nAs in Approach #2, we will binary search for the answer. However, we will use a rolling hash (Rabin-Karp algorithm) to store hashes in our set structure.
\nAlgorithm
\nFor some prime , consider the following function modulo some prime modulus :
\n\n\n
\nNotably, . This shows we can get the hash of all in linear time. We will also use the fact that .
\nFor every i >= length - 1
, we will want to record the hash of A[i-length+1], A[i-length+2], ..., A[i]
. After, we will truncate the first element by h = (h - A[i - (length - 1)]) * Pinv % MOD
to get ready to add the next element.
To make our algorithm air tight, we also make a naive check when our work with rolling hashes says that we have found a match.
\nclass Solution(object):\n def findLength(self, A, B):\n P, MOD = 113, 10**9 + 7\n Pinv = pow(P, MOD-2, MOD)\n def check(guess):\n def rolling(A, length):\n if length == 0:\n yield 0, 0\n return\n\n h, power = 0, 1\n for i, x in enumerate(A):\n h = (h + x * power) % MOD\n if i < length - 1:\n power = (power * P) % MOD\n else:\n yield h, i - (length - 1)\n h = (h - A[i - (length - 1)]) * Pinv % MOD\n\n hashes = collections.defaultdict(list)\n for ha, start in rolling(A, guess):\n hashes[ha].append(start)\n for ha, start in rolling(B, guess):\n iarr = hashes.get(ha, [])\n if any(A[i:i+guess] == B[start:start+guess] for i in iarr):\n return True\n return False\n\n lo, hi = 0, min(len(A), len(B)) + 1\n while lo < hi:\n mi = (lo + hi) / 2\n if check(mi):\n lo = mi + 1\n else:\n hi = mi\n return lo - 1\n
Java
\nimport java.math.BigInteger;\n\nclass Solution {\n int P = 113;\n int MOD = 1_000_000_007;\n int Pinv = BigInteger.valueOf(P).modInverse(BigInteger.valueOf(MOD)).intValue();\n\n private int[] rolling(int[] source, int length) {\n int[] ans = new int[source.length - length + 1];\n long h = 0, power = 1;\n if (length == 0) return ans;\n for (int i = 0; i < source.length; ++i) {\n h = (h + source[i] * power) % MOD;\n if (i < length - 1) {\n power = (power * P) % MOD;\n } else {\n ans[i - (length - 1)] = (int) h;\n h = (h - source[i - (length - 1)]) * Pinv % MOD;\n if (h < 0) h += MOD;\n }\n }\n return ans;\n }\n\n private boolean check(int guess, int[] A, int[] B) {\n Map<Integer, List<Integer>> hashes = new HashMap();\n int k = 0;\n for (int x: rolling(A, guess)) {\n hashes.computeIfAbsent(x, z -> new ArrayList()).add(k++);\n }\n int j = 0;\n for (int x: rolling(B, guess)) {\n for (int i: hashes.getOrDefault(x, new ArrayList<Integer>()))\n if (Arrays.equals(Arrays.copyOfRange(A, i, i+guess),\n Arrays.copyOfRange(B, j, j+guess))) {\n return true;\n }\n j++;\n }\n return false;\n }\n\n public int findLength(int[] A, int[] B) {\n int lo = 0, hi = Math.min(A.length, B.length) + 1;\n while (lo < hi) {\n int mi = (lo + hi) / 2;\n if (check(mi, A, B)) lo = mi + 1;\n else hi = mi;\n }\n return lo - 1;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B
. The log factor is contributed by the binary search, while creating the rolling hashes is . The checks for duplicate hashes are . If we perform a naive check to make sure our answer is correct, it adds a factor of to our cost of check
, which keeps the complexity the same.
Space Complexity: , the space used to store hashes
and the subarrays in our final naive check.
Analysis written by: @awice. Idea for Solution #2 by @StefanPochmann.
\n