Given a non-empty integer array, find the minimum number of moves required to make all array elements equal, where a move is incrementing a selected element by 1 or decrementing a selected element by 1.
You may assume the array's length is at most 10,000.
Example:
Input: [1,2,3]
Output: 2
Explanation: Only two moves are needed (remember each move increments or decrements one element): [1,2,3] => [2,2,3] => [2,2,2]
In the brute-force approach, we consider every possible number to which all the array elements could be equalized so as to minimize the number of moves required. One point is obvious: the number to which all the elements are equalized at the end must lie between the minimum and the maximum elements present in the array. Thus, we first find the minimum and the maximum element in the array. Suppose $x$ is the number to which all the elements are equalized. Then, we iterate over the range between the minimum and the maximum values, find the number of moves required for each $x$, and simultaneously keep track of the minimum number of moves, which will be the end result.
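A minimal Java sketch of this brute-force idea follows; the `Solution.minMoves2` class and method names follow the usual LeetCode convention and are assumptions, not given in the text above.

```java
class Solution {
    public int minMoves2(int[] nums) {
        // Find the range [min, max] that must contain the final value.
        int min = Integer.MAX_VALUE, max = Integer.MIN_VALUE;
        for (int num : nums) {
            min = Math.min(min, num);
            max = Math.max(max, num);
        }
        long best = Long.MAX_VALUE;
        // Try every candidate value x in [min, max] and count the moves needed for it.
        for (long x = min; x <= max; x++) {
            long moves = 0;
            for (int num : nums) {
                moves += Math.abs(num - x);
            }
            best = Math.min(best, moves);
        }
        return (int) best;
    }
}
```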
Complexity Analysis

Time complexity : $O\big(n \cdot (max - min)\big)$, where $max$ and $min$ are the maximum and minimum elements of the array. For every candidate value in the range $[min, max]$, we traverse the complete array of length $n$.

Space complexity : $O(1)$. No extra space is required.

Algorithm
In this approach, rather than choosing every possible $x$ between the minimum and the maximum values in the array, we can simply consider every element of the array as $x$. To understand why we need not iterate over the complete range, but only over the elements of the array, consider the following example.
Say the sorted array is $a \le b \le c \le d \le e$. Now, if we try to equalize all the elements to $c$ (which, by the way, may or may not be the final number to be settled down to), the total number of moves for doing this is given by:

$$t = (c - a) + (c - b) + (d - c) + (e - c)$$

Suppose, now, instead of $c$, we try to equalize all the elements to a number $c'$ which is not present in the given array but is slightly larger than $c$ (while still smaller than $d$), say $c' = c + k$, where $k$ is a positive integer. The total number of moves required now will be given by:

$$t' = (c + k - a) + (c + k - b) + k + (d - c - k) + (e - c - k)$$

$$\;\; = (c - a) + (c - b) + (d - c) + (e - c) + 3k - 2k$$

$$\;\; = t + k \qquad \text{...using } t \text{ from above}$$

More generally, between two consecutive array elements the total cost changes linearly with $k$, so over any such interval the minimum is attained at one of the two endpoints, both of which are elements of the array. From this, it is clear that the number of moves required to settle down to some number not present in the array is never smaller than the number of moves required to settle down to the better of its two neighbouring array elements. Hence it suffices to consider only the elements of the array as candidates. This completes the proof.
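A minimal Java sketch of this improved brute force (same assumed `minMoves2` signature as before), trying every array element as the settling value:

```java
class Solution {
    public int minMoves2(int[] nums) {
        long best = Long.MAX_VALUE;
        // Only elements of the array need to be considered as the final value.
        for (int x : nums) {
            long moves = 0;
            for (int num : nums) {
                moves += Math.abs(num - x);
            }
            best = Math.min(best, moves);
        }
        return (int) best;
    }
}
```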
Complexity Analysis

Time complexity : $O(n^2)$. Two nested loops are required.

Space complexity : $O(1)$. No extra space is required.
Algorithm

In the previous approach, we needed to find the number of moves required for every $x$ chosen from the array by iterating over the whole array. We can optimize this approach to some extent by sorting the array and observing the following fact. The number of moves required to raise the elements smaller than $x$ so as to equalize them to $x$ is given by $n_{<} \cdot x - S_{<}$ (the meanings of these symbols are given below the formula). Similarly, the number of moves required to decrement the elements larger than $x$ so as to equalize them to $x$ is given by $S_{>} - n_{>} \cdot x$. The total number of moves required will thus be the sum of these two parts. Hence, for a particular $x$ chosen, the total number of moves required is given by:
$$m = \big(n_{<} \cdot x - S_{<}\big) + \big(S_{>} - n_{>} \cdot x\big)$$

where:

$x$ = the number to which all the elements are equalized at the end.

$n_{<}$ = the number of elements which are smaller than $x$.

$S_{<}$ = the sum of the elements which are smaller than $x$.

$n_{>}$ = the number of elements which are larger than $x$.

$S_{>}$ = the sum of the elements which are larger than $x$.

$m$ = the total number of moves required to equalize all the elements of the array to $x$.
Let's say the index of the chosen element $x$ in the sorted array is $i$. Instead of iterating over the array to calculate $S_{<}$ and $S_{>}$ for every $x$, we can keep updating them while traversing the array, since the array is sorted. We calculate the total sum of the given array once, given by $sum$. We start with $x = nums[0]$, for which $S_{<} = 0$ and $S_{>} = sum - nums[0]$. To calculate the new $S_{<}$ when moving to the next candidate $nums[i]$, we just add the element $nums[i-1]$ to the previous $S_{<}$. To calculate the new $S_{>}$, we subtract the element $nums[i]$ from the previous $S_{>}$.
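A possible Java sketch of this running-sum optimization; the variable names mirror the symbols defined above, and the `minMoves2` signature is again assumed.

```java
import java.util.Arrays;

class Solution {
    public int minMoves2(int[] nums) {
        Arrays.sort(nums);
        long sum = 0;
        for (int num : nums) {
            sum += num;
        }
        long best = Long.MAX_VALUE;
        long smallerSum = 0;                         // S_< : sum of elements before index i
        for (int i = 0; i < nums.length; i++) {
            long x = nums[i];
            long largerSum = sum - smallerSum - x;   // S_> : sum of elements after index i
            long smallerCount = i;                   // n_<
            long largerCount = nums.length - 1 - i;  // n_>
            long moves = (smallerCount * x - smallerSum) + (largerSum - largerCount * x);
            best = Math.min(best, moves);
            smallerSum += x;                         // maintain S_< for the next candidate
        }
        return (int) best;
    }
}
```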
Complexity Analysis

Time complexity : $O(n \log n)$. Sorting takes $O(n \log n)$ time; the traversal afterwards is linear.

Space complexity : $O(1)$. No extra space is required.
Algorithm

The problem of finding the number to which all the other numbers eventually settle can also be viewed as follows: given a set of points in 1-D, find a point such that the cumulative sum of the distances between this point and the rest of the points is minimum. This is a very common mathematical problem whose answer is known: the required point is the median of the given points. The reason behind choosing the median is given after the algorithm.

We can simply sort the given array and find the median as the element in the middle of the sorted array. Thus, the total number of moves required to equalize all the array elements is given by the sum of the differences of all the elements from the median. In mathematical terms, the solution is given by:

$$moves = \sum_{i=0}^{n-1} \big|\, a[i] - median \,\big|$$

where $n$ is the size of the given array $a$.
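A minimal Java sketch of this sort-and-take-median approach (the `Solution.minMoves2` signature is again an assumed LeetCode-style convention):

```java
import java.util.Arrays;

class Solution {
    public int minMoves2(int[] nums) {
        Arrays.sort(nums);
        int median = nums[nums.length / 2];  // middle element of the sorted array
        long moves = 0;
        for (int num : nums) {
            moves += Math.abs(num - median); // distance of each element from the median
        }
        return (int) moves;
    }
}
```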
Now, we'll look at the mathematical reasoning behind choosing the median as the number at which we'll settle. As discussed in the previous approach, the total number of moves required is given by:

$$m = \big(n_{<} \cdot x - S_{<}\big) + \big(S_{>} - n_{>} \cdot x\big)$$

where all the variables have the same definitions as before.

Now, in order to minimize this term with respect to changes in $x$, we can take the derivative of the above term with respect to $x$. Thus, we proceed as:

$$\frac{dm}{dx} = \frac{d}{dx}\Big[\big(n_{<} \cdot x - S_{<}\big) + \big(S_{>} - n_{>} \cdot x\big)\Big]$$

$$\;\;\; = n_{<} - n_{>}$$

Setting the derivative equal to $0$, we get:

$$n_{<} - n_{>} = 0, \quad \text{or} \quad n_{<} = n_{>}$$

This property, of having equally many elements on either side, is satisfied by the median, which completes the proof.
Complexity Analysis

Time complexity : $O(n \log n)$. Sorting takes $O(n \log n)$ time.

Space complexity : $O(1)$. Only a single extra variable is used.
Algorithm

In the previous approach, we found the median after sorting and then calculated the number of moves required. But, if we observe carefully, we'll find that if the array is sorted, we can do the same task without actually finding the median or the number to which we need to settle at the end. To see this, look at the maximum ($max$) and the minimum ($min$) numbers in the array, which currently lie at its extreme positions. We know that, at the end, both these numbers must be equalized to the same final number, say $x$, with $min \le x \le max$. For the number $max$, the number of moves required is $max - x$. Similarly, for the number $min$, the number of moves is $x - min$. Thus, the total number of moves for $max$ and $min$ together is $(max - x) + (x - min) = max - min$, which is independent of the number $x$. Thus, we can continue in the same way with the next maximum and the next minimum number in the array, until the complete array is exhausted.
Therefore, the equation becomes:

$$moves = \sum_{i=0}^{n/2 - 1} \big(a[n-1-i] - a[i]\big)$$

where $n$ is the number of elements in the sorted array $a$.
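A minimal Java sketch of this pairing idea: after sorting, two pointers repeatedly pick the current minimum and maximum, and each pair contributes their difference (the `minMoves2` signature is assumed as before).

```java
import java.util.Arrays;

class Solution {
    public int minMoves2(int[] nums) {
        Arrays.sort(nums);
        long moves = 0;
        int i = 0, j = nums.length - 1;
        // Each (minimum, maximum) pair costs (maximum - minimum) moves,
        // independently of the final value everything settles to.
        while (i < j) {
            moves += nums[j--] - nums[i++];
        }
        return (int) moves;
    }
}
```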
Complexity Analysis

Time complexity : $O(n \log n)$. Sorting takes $O(n \log n)$ time.

Space complexity : $O(1)$. No extra space is required.
Algorithm

In order to find the median, we need not necessarily sort the given array. Instead, we can find the median directly using the Quick-Select method, which doesn't require fully sorting the array.

The Quick-Select method is similar to Quick-Sort. In a single iteration, we choose a pivot and bring it to its correct position in the array. If the correct position happens to be the central position (corresponding to the median), we can return the median directly from there. Now, let's look at the implementation of Quick-Select.
Quick-Select makes use of two functions, select and partition. The select function takes the leftmost and the rightmost indices of the current subarray, as well as the central (median) index. If the element brought to its correct position in the current call happens to be the median (i.e. it reaches the central position), we return that element, since it is the median. The partition function takes the leftmost and the rightmost indices of the subarray and returns the correct position of the current pivot, which is chosen as the rightmost element of the subarray. This function makes use of two pointers, $i$ and $j$. Both pointers initially point to the leftmost element of the subarray.

At every step, we compare the element at index $j$ ($nums[j]$) with the pivot element. If $nums[j] \le pivot$, we swap the elements $nums[i]$ and $nums[j]$ and increment both $i$ and $j$. Otherwise, only $j$ is incremented. When $j$ reaches the end of the subarray, we swap $nums[i]$ with the pivot. In this way, all the elements smaller than the pivot now lie to the left of index $i$, and all the elements larger than the pivot lie to its right; thus, the pivot reaches its correct position in the array. If this position isn't the central index of the array, we again call the select function, passing it the left or the right subarray relative to the pivot's index (whichever contains the central index).
For more clarification, look at the animation below for this example: [3, 8, 2, 5, 1, 4, 7, 6]

(Animation of the Quick-Select partitioning steps on this example.)
After finding the median, we can find the sum of the absolute differences of all the elements from the median to determine the number of moves required. Mathematically, we use:

$$moves = \sum_{i=0}^{n-1} \big|\, a[i] - median \,\big|$$

where $n$ is the size of the given array $a$.
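A Java sketch of the Quick-Select based approach; the helper names (`quickSelect`, `partition`, `swap`) are illustrative choices, and the partition follows the two-pointer scheme described above with the rightmost element as the pivot.

```java
class Solution {
    public int minMoves2(int[] nums) {
        // The element that would sit at index n/2 in sorted order is the median used here.
        int median = quickSelect(nums, 0, nums.length - 1, nums.length / 2);
        long moves = 0;
        for (int num : nums) {
            moves += Math.abs(num - median);
        }
        return (int) moves;
    }

    // Returns the element that would be at index k if nums were sorted.
    private int quickSelect(int[] nums, int left, int right, int k) {
        while (left < right) {
            int pos = partition(nums, left, right);
            if (pos == k) {
                return nums[pos];
            } else if (pos < k) {
                left = pos + 1;   // the k-th element lies to the right of pos
            } else {
                right = pos - 1;  // the k-th element lies to the left of pos
            }
        }
        return nums[left];
    }

    // Places the pivot (the rightmost element) at its correct sorted position and returns it.
    private int partition(int[] nums, int left, int right) {
        int pivot = nums[right];
        int i = left;
        for (int j = left; j < right; j++) {
            if (nums[j] <= pivot) {
                swap(nums, i, j);
                i++;
            }
        }
        swap(nums, i, right);
        return i;
    }

    private void swap(int[] nums, int a, int b) {
        int temp = nums[a];
        nums[a] = nums[b];
        nums[b] = temp;
    }
}
```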
Complexity Analysis

Time complexity :
Average case: $O(n)$. Quick-Select has an average-case time complexity of $O(n)$.
Worst case: $O(n^2)$. In the worst case, Quick-Select can go up to $O(n^2)$.

Space complexity : $O(1)$. No extra space is required.
Algorithm

It isn't hard to see that, if we naively choose the pivot element in Quick-Select, the algorithm has a worst-case performance of $O(n^2)$. To guarantee a linear running time for finding the median, however, we need a strategy for choosing the pivot element that guarantees that we partition the list into two sublists of relatively comparable size. Obviously, the median of the values in the list would be the optimal choice, but if we could find the median in linear time, we would already have a solution to our problem.
The median-of-medians algorithm chooses its pivot in the following clever way (a code sketch is given after this description):

1. Divide the array into $\lceil n/5 \rceil$ groups, where the size of each group is 5 elements, except possibly the last group, which may have fewer than 5 elements.

2. Sort each of the groups created above and find the median of each group. Create an auxiliary array $median[]$ and store the medians of all $\lceil n/5 \rceil$ groups in this array. Then recursively call this same method to find the median of $median[0 \dots \lceil n/5 \rceil - 1]$:

3. $medOfMed = kthSmallest\big(median[0 \dots \lceil n/5 \rceil - 1],\ \lceil n/10 \rceil\big)$

4. Partition the array around $medOfMed$ and obtain its position $pos$ (i.e. use $medOfMed$ as the pivot element).

5. If $pos$ is the required (median) position, return $medOfMed$.

6. If $pos$ is larger than the required position, recur for the left subarray $a[l \dots pos-1]$.

7. Otherwise, recur for the right subarray $a[pos+1 \dots r]$, with the required rank adjusted accordingly.
Using the above method ensures that, in the worst case, at most about 70% of the elements are larger (or smaller) than the chosen pivot. The proof of this, as well as the reason behind choosing a group size of 5, is given in the explanation of the time complexity.
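A Java sketch of the overall procedure (selection by median-of-medians, followed by summing distances from the median); the helper names (`kthSmallest`, `partitionAround`, `swap`) and the exact index bookkeeping are illustrative choices, not taken from the text above.

```java
import java.util.Arrays;

class Solution {
    public int minMoves2(int[] nums) {
        int n = nums.length;
        int median = kthSmallest(nums, 0, n - 1, n / 2); // element at sorted index n/2
        long moves = 0;
        for (int num : nums) {
            moves += Math.abs(num - median);
        }
        return (int) moves;
    }

    // Returns the element that would sit at (absolute) index k if nums were sorted,
    // choosing the pivot with the median-of-medians rule.
    private int kthSmallest(int[] nums, int left, int right, int k) {
        while (left < right) {
            // Steps 1-2: split nums[left..right] into groups of 5 and collect each group's median.
            int numGroups = (right - left) / 5 + 1;
            int[] medians = new int[numGroups];
            for (int g = 0; g < numGroups; g++) {
                int gLeft = left + 5 * g;
                int gRight = Math.min(gLeft + 4, right);
                Arrays.sort(nums, gLeft, gRight + 1);        // sorting at most 5 elements is O(1)
                medians[g] = nums[gLeft + (gRight - gLeft) / 2];
            }
            // Step 3: recursively find the median of the medians.
            int medOfMed = kthSmallest(medians, 0, numGroups - 1, numGroups / 2);
            // Step 4: partition around medOfMed and get its final position.
            int pos = partitionAround(nums, left, right, medOfMed);
            // Steps 5-7: either done, or continue in exactly one of the two sides.
            if (pos == k) {
                return nums[pos];
            } else if (pos < k) {
                left = pos + 1;
            } else {
                right = pos - 1;
            }
        }
        return nums[left];
    }

    // Partitions nums[left..right] around the given pivot value (known to occur in the range)
    // and returns the pivot's final position.
    private int partitionAround(int[] nums, int left, int right, int pivot) {
        // Move one occurrence of the pivot value to the rightmost position first.
        for (int i = left; i <= right; i++) {
            if (nums[i] == pivot) {
                swap(nums, i, right);
                break;
            }
        }
        int i = left;
        for (int j = left; j < right; j++) {
            if (nums[j] < pivot) {
                swap(nums, i, j);
                i++;
            }
        }
        swap(nums, i, right);
        return i;
    }

    private void swap(int[] nums, int a, int b) {
        int temp = nums[a];
        nums[a] = nums[b];
        nums[b] = temp;
    }
}
```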
Complexity Analysis

Time complexity : $O(n)$. The worst-case time complexity is $O(n)$.
Space complexity : $O(n)$. The auxiliary arrays of group medians take extra space proportional to $n$ (about $n/5 + n/25 + \dots$ across the recursion).
Proof: Time Complexity : $O(n)$

The worst-case time complexity of the above algorithm is $O(n)$. Let us analyze all the steps.

Steps 1 and 2 take $O(n)$ time, as finding the median of an array of size 5 takes $O(1)$ time and there are $\lceil n/5 \rceil$ such groups. Step 3 takes $T(n/5)$ time (where $T(n)$ denotes the running time of the whole algorithm on $n$ elements). Step 4 is a standard partition and takes $O(n)$ time. The interesting steps are 6 and 7: at most one of them is executed, and these are the recursive steps. What is the worst-case size of these recursive calls? The answer is the maximum number of elements greater than $medOfMed$ (obtained in step 3), or the maximum number of elements smaller than $medOfMed$.
How many elements are greater than $medOfMed$ and how many are smaller?
Let's assume that the list of medians obtained from step 2, in sorted order, is $m_1, m_2, \dots, m_{\lceil n/5 \rceil}$, where the middle one is the median chosen as the pivot. To find an upper bound on the number of elements in the given array smaller than our pivot, first consider the half of the medians from step 2 which are smaller than the pivot. It is possible for all five of the elements in the sublists corresponding to these medians to be smaller than the pivot, which leads to an upper bound of $5\lceil n/10 \rceil$ such elements. Now consider the half of the medians from step 2 which are larger than the pivot. It is only possible for two of the elements (those smaller than the respective medians) in the sublists corresponding to these medians to be smaller than the pivot, which leads to an upper bound of $2\lceil n/10 \rceil$ such elements. In addition, the sublist containing the pivot itself contributes exactly two elements smaller than the pivot. In total, we may have at most:

$$5\left\lceil \frac{n}{10} \right\rceil + 2\left\lceil \frac{n}{10} \right\rceil + 2 \approx \frac{7n}{10}$$

elements smaller than the pivot, or approximately 70% of the list. The same upper bound applies to the number of elements in the list larger than the pivot. It is this guarantee, that the partitions cannot be too lopsided, that leads to the linear running time.
Thus, the minimum number of elements which are smaller (or larger) than the chosen pivot $medOfMed$ is given by roughly $n - 7n/10 = 3n/10$, or nearly 30% of the elements.

In the worst case, the function therefore recurs on at most roughly $7n/10$ elements; a more careful count that keeps track of the ceilings gives the standard bound of $7n/10 + 6$ elements.
Note that $7n/10 + 6 < n$ whenever $n > 20$, and that any input of 80 or fewer elements can be handled in $O(1)$ time. We can therefore obtain the recurrence:

$$T(n) \le \begin{cases} O(1) & \text{if } n \le 80,\\[4pt] T\!\left(\lceil n/5 \rceil\right) + T\!\left(7n/10 + 6\right) + O(n) & \text{if } n > 80. \end{cases}$$
We show that the running time is linear by substitution. Assume that $T(m) \le cm$ for some constant $c$ and all $m < n$ (the inductive hypothesis). Substituting this inductive hypothesis into the right-hand side of the recurrence yields:

$$T(n) \le c\lceil n/5 \rceil + c\left(\tfrac{7n}{10} + 6\right) + an$$
$$\;\; \le \tfrac{cn}{5} + c + \tfrac{7cn}{10} + 6c + an$$
$$\;\; = \tfrac{9cn}{10} + 7c + an$$
$$\;\; = cn + \left(-\tfrac{cn}{10} + 7c + an\right)$$
$$\;\; \le cn,$$

since we can pick $c$ large enough so that $cn/10 - 7c$ is larger than the function described by the $O(n)$ term (written here as $an$ for some constant $a$) for all $n > 80$. The worst-case running time of median-of-medians selection is therefore linear.
Choosing a group size of 3 leads to at least half of the $n/3$ blocks having at least 2 elements smaller (or larger) than the pivot; hence this only guarantees an $n/3$ split, leaving up to $2n/3$ elements in the recursive call in the worst case.

This gives $T(n) = T(n/3) + T(2n/3) + O(n)$, which reduces to $O(n \log n)$ in the worst case.
There is no reason why you should not use a group size greater than five; for example, with a group size of seven the corresponding recurrence and inequality would be

$$T(n) \le T\!\left(\frac{n}{7}\right) + T\!\left(\frac{5n}{7}\right) + O(n)$$

$$\frac{n}{7} + \frac{5n}{7} = \frac{6n}{7} < n$$

which also works, but five is the smallest odd group size (odd sizes are convenient for taking medians) that works.
Analysis written by: @vinod23