Given two integer arrays A and B, return the maximum length of an subarray that appears in both arrays.
Example 1:
Input: A: [1,2,3,2,1] B: [3,2,1,4,7] Output: 3 Explanation: The repeated subarray with maximum length is [3, 2, 1].
Note:
Intuition and Algorithm
\nIn a typical brute force, for all starting indices i of A and j of B, we will check for the longest matching subarray A[i:i+k] == B[j:j+k] of length k. This would look roughly like the following psuedocode:
ans = 0\nfor i in [0 .. A.length - 1]:\n for j in [0 .. B.length - 1]:\n k = 0\n while (A[i+k] == B[j+k]): k += 1 #and i+k < A.length etc.\n ans = max(ans, k)\n
Our insight is that in typical cases, most of the time A[i] != B[j]. We could instead keep a hashmap Bstarts[A[i]] = all j such that B[j] == A[i], and only loop through those in our j loop.
Python
\nclass Solution(object):\n def findLength(self, A, B):\n ans = 0\n Bstarts = collections.defaultdict(list)\n for j, y in enumerate(B):\n Bstarts[y].append(j)\n\n for i, x in enumerate(A):\n for j in Bstarts[x]:\n k = 0\n while i+k < len(A) and j+k < len(B) and A[i+k] == B[j+k]:\n k += 1\n ans = max(ans, k)\n return ans\n
Java
\nclass Solution {\n public int findLength(int[] A, int[] B) {\n int ans = 0;\n Map<Integer, ArrayList<Integer>> Bstarts = new HashMap();\n for (int j = 0; j < B.length; j++) {\n Bstarts.computeIfAbsent(B[j], x -> new ArrayList()).add(j);\n }\n\n for (int i = 0; i < A.length; i++) if (Bstarts.containsKey(A[i])) {\n for (int j: Bstarts.get(A[i])) {\n int k = 0;\n while (i+k < A.length && j+k < B.length && A[i+k] == B[j+k]) {\n k++\n }\n ans = Math.max(ans, k);\n }\n }\n return ans;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B. The worst case is when all the elements are equal.
Space Complexity: , the space used by Bstarts. (Of course, we could amend our algorithm to make this .)
Intuition
\nIf there is a length k subarray common to A and B, then there is a length j <= k subarray as well.
Let check(length) be the answer to the question "Is there a subarray with length length, common to A and B?" This is a function with range that must take the form [True, True, ..., True, False, False, ..., False] with at least one True. We can binary search on this function.
Algorithm
\nFocusing on the binary search, our invariant is that check(hi) will always be False. We\'ll start with hi = min(len(A), len(B)) + 1; clearly check(hi) is False.
Now we perform our check in the midpoint mi of lo and hi. When it is possible, then lo = mi + 1, and when it isn\'t, hi = mi. This maintains the invariant. At the end of our binary search, hi == lo and lo is the lowest value such that check(lo) is False, so we want lo - 1.
As for the check itself, we can naively check whether any A[i:i+k] == B[j:j+k] using set structures.
Python
\nclass Solution(object):\n def findLength(self, A, B):\n def check(length):\n seen = set(tuple(A[i:i+length])\n for i in xrange(len(A) - length + 1))\n return any(tuple(B[j:j+length]) in seen\n for j in xrange(len(B) - length + 1))\n\n lo, hi = 0, min(len(A), len(B)) + 1\n while lo < hi:\n mi = (lo + hi) / 2\n if check(mi):\n lo = mi + 1\n else:\n hi = mi\n return lo - 1\n
Java
\nclass Solution {\n public boolean check(int length, int[] A, int[] B) {\n Set<String> seen = new HashSet();\n for (int i = 0; i + length <= A.length; ++i) {\n seen.add(Arrays.toString(Arrays.copyOfRange(A, i, i+length)));\n }\n for (int j = 0; j + length <= B.length; ++j) {\n if (seen.contains(Arrays.toString(Arrays.copyOfRange(B, j, j+length)))) {\n return true;\n }\n }\n return false;\n }\n\n public int findLength(int[] A, int[] B) {\n int lo = 0, hi = Math.min(A.length, B.length) + 1;\n while (lo < hi) {\n int mi = (lo + hi) / 2;\n if (check(mi, A, B)) lo = mi + 1;\n else hi = mi;\n }\n return lo - 1;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B. The log factor comes from the binary search. The complexity of our naive check of a given is , as we will create the seen strings with complexity , then search for them with complexity , and our total complexity when performing our check is the addition of these two.
Space Complexity: , the space used by seen.
Intuition and Algorithm
\nSince a common subarray of A and B must start at some A[i] and B[j], let dp[i][j] be the longest common prefix of A[i:] and B[j:]. Whenever A[i] == B[j], we know dp[i][j] = dp[i+1][j+1] + 1. Also, the answer is max(dp[i][j]) over all i, j.
We can perform bottom-up dynamic programming to find the answer based on this recurrence. Our loop invariant is that the answer is already calculated correctly and stored in dp for any larger i, j.
Python
\nclass Solution(object):\n def findLength(self, A, B):\n memo = [[0] * (len(B) + 1) for _ in range(len(A) + 1)]\n for i in range(len(A) - 1, -1, -1):\n for j in range(len(B) - 1, -1, -1):\n if A[i] == B[j]:\n memo[i][j] = memo[i+1][j+1]+1\n return max(max(row) for row in memo)\n
Java
\nclass Solution {\n public int findLength(int[] A, int[] B) {\n int ans = 0;\n int[][] memo = new int[A.length + 1][B.length + 1];\n for (int i = A.length - 1; i >= 0; --i) {\n for (int j = B.length - 1; j >= 0; --j) {\n if (A[i] == B[j]) {\n memo[i][j] = memo[i+1][j+1] + 1;\n if (ans < memo[i][j]) ans = memo[i][j];\n }\n }\n }\n return ans;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B.
Space Complexity: , the space used by dp.
Intuition
\nAs in Approach #2, we will binary search for the answer. However, we will use a rolling hash (Rabin-Karp algorithm) to store hashes in our set structure.
\nAlgorithm
\nFor some prime , consider the following function modulo some prime modulus :
\n\n\n
\nNotably, . This shows we can get the hash of all in linear time. We will also use the fact that .
\nFor every i >= length - 1, we will want to record the hash of A[i-length+1], A[i-length+2], ..., A[i]. After, we will truncate the first element by h = (h - A[i - (length - 1)]) * Pinv % MOD to get ready to add the next element.
To make our algorithm air tight, we also make a naive check when our work with rolling hashes says that we have found a match.
\nclass Solution(object):\n def findLength(self, A, B):\n P, MOD = 113, 10**9 + 7\n Pinv = pow(P, MOD-2, MOD)\n def check(guess):\n def rolling(A, length):\n if length == 0:\n yield 0, 0\n return\n\n h, power = 0, 1\n for i, x in enumerate(A):\n h = (h + x * power) % MOD\n if i < length - 1:\n power = (power * P) % MOD\n else:\n yield h, i - (length - 1)\n h = (h - A[i - (length - 1)]) * Pinv % MOD\n\n hashes = collections.defaultdict(list)\n for ha, start in rolling(A, guess):\n hashes[ha].append(start)\n for ha, start in rolling(B, guess):\n iarr = hashes.get(ha, [])\n if any(A[i:i+guess] == B[start:start+guess] for i in iarr):\n return True\n return False\n\n lo, hi = 0, min(len(A), len(B)) + 1\n while lo < hi:\n mi = (lo + hi) / 2\n if check(mi):\n lo = mi + 1\n else:\n hi = mi\n return lo - 1\n
Java
\nimport java.math.BigInteger;\n\nclass Solution {\n int P = 113;\n int MOD = 1_000_000_007;\n int Pinv = BigInteger.valueOf(P).modInverse(BigInteger.valueOf(MOD)).intValue();\n\n private int[] rolling(int[] source, int length) {\n int[] ans = new int[source.length - length + 1];\n long h = 0, power = 1;\n if (length == 0) return ans;\n for (int i = 0; i < source.length; ++i) {\n h = (h + source[i] * power) % MOD;\n if (i < length - 1) {\n power = (power * P) % MOD;\n } else {\n ans[i - (length - 1)] = (int) h;\n h = (h - source[i - (length - 1)]) * Pinv % MOD;\n if (h < 0) h += MOD;\n }\n }\n return ans;\n }\n\n private boolean check(int guess, int[] A, int[] B) {\n Map<Integer, List<Integer>> hashes = new HashMap();\n int k = 0;\n for (int x: rolling(A, guess)) {\n hashes.computeIfAbsent(x, z -> new ArrayList()).add(k++);\n }\n int j = 0;\n for (int x: rolling(B, guess)) {\n for (int i: hashes.getOrDefault(x, new ArrayList<Integer>()))\n if (Arrays.equals(Arrays.copyOfRange(A, i, i+guess),\n Arrays.copyOfRange(B, j, j+guess))) {\n return true;\n }\n j++;\n }\n return false;\n }\n\n public int findLength(int[] A, int[] B) {\n int lo = 0, hi = Math.min(A.length, B.length) + 1;\n while (lo < hi) {\n int mi = (lo + hi) / 2;\n if (check(mi, A, B)) lo = mi + 1;\n else hi = mi;\n }\n return lo - 1;\n }\n}\n
Complexity Analysis
\nTime Complexity: , where are the lengths of A, B. The log factor is contributed by the binary search, while creating the rolling hashes is . The checks for duplicate hashes are . If we perform a naive check to make sure our answer is correct, it adds a factor of to our cost of check, which keeps the complexity the same.
Space Complexity: , the space used to store hashes and the subarrays in our final naive check.
Analysis written by: @awice. Idea for Solution #2 by @StefanPochmann.
\n