Given an array nums containing n + 1 integers where each integer is between 1 and n (inclusive), prove that at least one duplicate number must exist. Assume that there is only one duplicate number, find the duplicate one.
Note:
O(n2)
.Credits:
Special thanks to @jianchao.li.fighter for adding this problem and creating all test cases.
The first two approaches mentioned do not satisfy the constraints given in\nthe prompt, but they are solutions that you might be likely to come up with\nduring a technical interview. As an interviewer, I personally would not\nexpect someone to come up with the cycle detection solution unless they have\nheard it before.
\nProving that at least one duplicate must exist in nums
is simple\napplication of the\npigeonhole principle.\nHere, each number in nums
is a "pigeon" and each distinct number that can\nappear in nums
is a "pigeonhole". Because there are numbers are\n distinct possible numbers, the pigeonhole principle implies that at\nleast one of the numbers is duplicated.
Intuition
\nIf the numbers are sorted, then any duplicate numbers will be adjacent in the\nsorted array.
\nAlgorithm
\nGiven the intuition, the algorithm follows fairly simply. First, we sort the\narray, and then we compare each element to the previous element. Because\nthere is exactly one duplicated element in the array, we know that the array\nis of at least length 2, and we can return the duplicate element as soon as\nwe find it.
\n\nComplexity Analysis
\nTime complexity : \n
\nThe sort
invocation costs time in Python and Java, so it\ndominates the subsequent linear scan.
Space complexity : (or )
\nHere, we sort nums
in place, so the memory footprint is constant. If we\ncannot modify the input array, then we must allocate linear space for a\ncopy of nums
and sort that instead.
Intuition
\nIf we store each element as we iterate over the array, we can simply check\neach element as we iterate over the array.
\nAlgorithm
\nIn order to achieve linear time complexity, we need to be able to insert\nelements into a data structure (and look them up) in constant time. A Set
\nsatisfies these constraints nicely, so we iterate over the array and insert\neach element into seen
. Before inserting it, we check whether it is already\nthere. If it is, then we found our duplicate, so we return it.
Complexity Analysis
\nTime complexity : \n
\nSet
in both Python and Java rely on underlying hash tables, so\ninsertion and lookup have amortized constant time complexities. The\nalgorithm is therefore linear, as it consists of a for
loop that\nperforms constant work times.
Space complexity : \n
\nIn the worst case, the duplicate element appears twice, with one of its\nappearances at array index . In this case, seen
will contain\n distinct values, and will therefore occupy space.
Intuition
\nIf we interpret nums
such that for each pair of index and value\n, the "next" value is at index , we can reduce this\nproblem to cycle detection. See the solution to\nLinked List Cycle II\nfor more details.
Algorithm
\nFirst off, we can easily show that the constraints of the problem imply that\na cycle must exist. Because each number in nums
is between and\n, it will necessarily point to an index that exists. Therefore, the list\ncan be traversed infinitely, which implies that there is a cycle.\nAdditionally, because cannot appear as a value in nums
, nums[0]
\ncannot be part of the cycle. Therefore, traversing the array in this manner\nfrom nums[0]
is equivalent to traversing a cyclic linked list. Given this,\nthe problem can be solved just like\nLinked List Cycle II.
To see the algorithm in action, check out the animation below:
\n!?!../Documents/287_Find_the_Duplicate_Number.json:1280,720!?!
\n\nComplexity Analysis
\nTime complexity : \n
\nFor detailed analysis, refer to \nLinked List Cycle II.
\nSpace complexity : \n
\nFor detailed analysis, refer to \nLinked List Cycle II.
\nAnalysis and solutions written by: @emptyset
\n