I was toying with the apparently innocent problem of rotating an array (shifting and wrapping around) through a given parameter. Coming up with an efficient algorithm is fairly elusive. On our journey, we will unravel some interesting connections to fields that will surprise you.
To shift an array a single cell to the right, we start at the last element, swap with the element before it, repeat at the previous element. The last element that we started with will continuously be moved, while each swap places another element into its correct place. To finish rotating an array of n elements, the algorithm needs n-1 swaps. This algorithm works for both directions (shifting left or right) since it always places one of the elements into its correct position (for positive and negative directions).
A naive approach to rotating an array through a distance entails repeating the corresponding "directional single-step rotate" k times. However, we're talking about shifting a million elements a thousand times. Hardly does a billion (which is just above the number of swaps needed) sound efficient (it's not).
Let's take the following array and shift its elements to the right. We got to try to fit the preceding single step solution to it:
0 1 2 3 4 5
We notice that 5 must swap with 1 first.
0 5 2 3 4 1
Then, 1 must swap with 3.
0 5 2 1 4 3
Starting again at 4, it must swap with 0.
4 5 2 1 0 3
Then, 2 must swap with 0.
4 5 0 1 2 3
Magically we have the array we have been seeking.
Let's look at it in another way:
Swapping at k-elements apart grouped the array into two sub arrays:
0 2 4 and 1 3 5
Doing a single rotate on each of these arrays:
4 0 2 and 5 1 3
Merging back these arrays into their respective places,
4 5 0 1 2 3,
gets us the resulting array that we desire.
The sub array subdivision should only be conceptual, not an actual memory operation. The resulting algorithm needs to swap only as many elements as there are in the initial array.
The remaining question is:
How do we count the sub arrays that result from this subdivision by the parameter k.
The simple solution would be to start at 0, move k elements at a time, and count how many times you need to move to return to 0. This count is the number of elements in each sub array. Divide the total number of elements in the array by that, and you have the number of groups. This solution however lacks performance. A million elements with a single step will require a million moves.
What we know is (given the size of array is n, and the step size is k)
- (rule1) if k is 1, the whole array is a single group (which reverts the rotation into the first single step version)
- (rule2) if k is either 0 or n, then there are n groups (the importance of this is not in 0 rotations but rather to resolve the following point after recursion)
- (rule3) if neither above rule apply, the problem is equivalent to finding the number of subarrays in an array of size k, and a step size that is k less the remainder of the division of n by k = (k - (n%k)).
The reason for that is that after a full movement sequence around the array, the cursor will be at (k-(n%k)). After another full rotation starting from that index (rather than 0) we expect to stop at the next increment of this value. Disregarding the values outside (0..k], we have another rotation grouping problem embedded in the first.
Let's take an example. n = 14, and k = 4.
Let's go through the movement sequence first. [0], 4, 8, 12, [2], 6, 10, [0]. I highlighted my sub problem points. The movement distance is 2, and the sub problem array size is 4. 4 from the previous step size, and 4-(14%4) = 4-2 = 2.
Applying the algorithm on (4, 2) we end with (2, 2). We know from (rule2) that the number of groups is 2. This makes sense as we need to apply the single shift algorithm, 4 steps at a time, once for the even numbers, and once for the odd numbers.
Let's try it with n=10 and k=3. 0, 3, 6, 9, 2, 5, 8, 1, 4, 7, 0. Apparently, all numbers are listed.
Let's apply the subgroup counting algorithm: (n, k) -> (10, 3) -> (3, 2) -> (2, 1) -> (1, 1). We have a single group. 3 steps don't divide 10 elements into unrelated groups.
Note: I also found out that the subgroup counting algorithm is equivalent to the GCD algorithm. Apparently, (n,k)->(k, n%k) is equivalent to the above formulation. The explanation within the above framework would be that, just before we finish a run around our array, our index is n-n%k which is equivalent to -n%k modula n. The sign does not matter for grouping by movement, so we can drop the minus to get our new array size and step size (n, k) -> (k, n%k).