Jul 11, 2024

Sorting Algorithms Collection

O(n^2) sorting Algorithms

Bubble Sort

It is easy to understand this kind of sorting algorithm from the name - bubble sort, as it’s like the bubble in the water. Every time we choose the unsorted largest/smallest value as a bubble, and move it to the certain index.

    public static void bubbleSort(int[] nums){
        int length = nums.length;
        for(int i = 0; i < length; i++){
            for(int j = 1; j < length - i; j++){
                if(nums[j] < nums[j - 1]){
                    swapNumbers(nums, j, j - 1);
                }
            }
        }
    }

    private static void swapNumbers(int[] nums, int indexA, int indexB){
        nums[indexA] ^= nums[indexB];
        nums[indexB] ^= nums[indexA];
        nums[indexA] ^= nums[indexB];
    }

Select sort

Also easy to understand, in each loop, you just need to locate the index of largest/smallest value, and move it to certain position

    public static void selectSort(int[] nums){
        int length = nums.length;
        int maxIndex = 0;
        for(int i = 0; i < length; i++){
            for(int j = 1; j < length - i; j++){
                if(nums[j] > nums[maxIndex]){
                    maxIndex = j;
                }
            }
            swapNumbers(nums, maxIndex, length - 1 - i);
            maxIndex = 0;
        }
    }

    private static void swapNumbers(int[] nums, int indexA, int indexB){
        //can not use this to swap same indexes as it will turn out to 0
        if(indexA == indexB){
            return;
        }
        nums[indexA] ^= nums[indexB];
        nums[indexB] ^= nums[indexA];
        nums[indexA] ^= nums[indexB];
    }

O(nlogn) sorting Algorithms

Heap sort

Let me give you a story as begining: Jarvis doesn’t like to wash socks, so every day he just take off the old socks and throws them to a heap every day after work. The heap is sorted by the smell of the socks, the least smelly sock will be on top of the heap, then jarvis will pop the top two socks and wear them before work, throws them back to heap and re-sort after work. That is heap sort.

To understand how heap sorts values, we need this structure - Complete Binary Tree. The root node of the tree will be treated as the heap top, and make sure each value of child are smaller/larger than the value of the root node, then we will have a sorted heap.

Now, treate the array as a CBT, we will have some conclusions:

for every index, we treat them as a node, so for index i(starting from 0), its two children node will be index 2 * i + 1 and 2 * i + 2
the last node with children will be index length of the array / 2 + 1

So how could we do heap sort for an array: there are two steps in general:

build a heap initially to do the heap sort
after building a heap, we will have a result - the largest/smallest number will be on the top(root node) of the CBT, then we move this number to the last node which is not sorted
repeat the building heap step but exclude the sorted nodes/indexes Let’s get to the codes:

    public static void heapSort(int[] nums){
        buildHeap(nums);
        int length = nums.length;
        for(int index = 0; index < length; index++){
            swap(nums, 0, length - 1 - index);
            maxHeapify(nums, 0, length - 2 - index);
        }
    }

    public static void buildHeap(int[] nums){
        int length = nums.length;
        for(int index = length / 2 + 1; index >= 0; index--){
            maxHeapify(nums, index, length - 1);
        }
    }

    public static void maxHeapify(int[] nums, int index, int heapSize){
        int leftChild = index * 2 + 1;
        int rightChild = index * 2 + 2;
        int largestIndex = index;
        if(leftChild > heapSize){
            return;
        }
        largestIndex = nums[leftChild] > nums[largestIndex] ? leftChild : largestIndex;
        if(rightChild <= heapSize){
            largestIndex = nums[rightChild] > nums[largestIndex] ? rightChild : largestIndex;
        }

        if(largestIndex != index){
            swap(nums, index, largestIndex);
            maxHeapify(nums, largestIndex, heapSize);
        }
    }

    public static void swap(int[] nums, int indexA, int indexB){
        int temp = nums[indexA];
        nums[indexA] = nums[indexB];
        nums[indexB] = temp;
    }

Quick sort

The quick sort is an implementation of “Divide-and-conquer” algorithm, the main logic of quick sort are four steps:

Choose a value at the first index as pivot, then use two pointers point to the head index and tail index. Move these two pointers to the middle.
When the value of head index are larger than pivot and the value of tail are smaller than pivot, exchange them. Continue to move this two pointers until they meet.
When they meet, exchange the value of header pointer and the pivot value. That means now the values on the left of pivot are smaller than pivot, the values on the right of pivot are lager than pivot
Use the left range and right range as the sorting target area , repeat the 3 steps above until there are just one value in the area.

    public static void quickSort(int[] nums, int startIndex, int endIndex){
        if(startIndex >= endIndex){
            return;
        }
        int partitionIndex = partition(nums, startIndex, endIndex);
        quickSort(nums, startIndex, partitionIndex - 1);
        quickSort(nums, partitionIndex + 1, endIndex);
    }

    public static int partition(int[] nums, int startIndex, int endIndex){
        int left = startIndex;
        int right = endIndex;
        while(left < right){
            while(left < right && nums[right] >= nums[startIndex]){
                right--;
            }
            while(left < right && nums[left] < nums[startIndex]){
                left++;
            }
            swap(nums, left, right);
        }
        swap(nums, startIndex, left);
        return left;
    }

    public static void swap(int[] nums, int indexA, int indexB){
        if(indexA == indexB){
            return;
        }
        nums[indexA] ^= nums[indexB];
        nums[indexB] ^= nums[indexA];
        nums[indexA] ^= nums[indexB];
    }

But for original quick sort, there might be some issues: When the array are already sorted by order, then one of the range will be large, we can just sort one value each loop. So it is important to have a useful pivot value - we can choose three values in the range and use the middle one as the pivot:

    public static void quickSort(int[] nums, int startIndex, int endIndex){
        if(startIndex >= endIndex){
            return;
        }
        int partitionIndex = partition(nums, startIndex, endIndex);
        quickSort(nums, startIndex, partitionIndex - 1);
        quickSort(nums, partitionIndex + 1, endIndex);
    }

    public static int partition(int[] nums, int startIndex, int endIndex){
        int pivot = choosePivot(nums, startIndex, endIndex);
        swap(nums, startIndex, pivot);
        int left = startIndex;
        int right = endIndex;
        while(left < right){
            while(left < right && nums[right] >= nums[startIndex]){
                right--;
            }
            while(left < right && nums[left] < nums[startIndex]){
                left++;
            }
            swap(nums, left, right);
        }
        swap(nums, startIndex, left);
        return left;
    }

    public static int choosePivot(int[] nums, int startIndex, int endIndex){
        int middleIndex = startIndex + (endIndex - startIndex) / 2;
        if(nums[startIndex] > nums[endIndex]){
            if(nums[endIndex] > nums[middleIndex]){
                return middleIndex;
            }
            return endIndex;
        }else{
            if(nums[startIndex] > nums[middleIndex]){
                return startIndex;
            }
            return middleIndex;
        }
    }

    public static void swap(int[] nums, int indexA, int indexB){
        if(indexA == indexB){
            return;
        }
        nums[indexA] ^= nums[indexB];
        nums[indexB] ^= nums[indexA];
        nums[indexA] ^= nums[indexB];
    }

Also, when the array are already sorted, the stack used will be O(n) as each loop only choose only one single value. So we can just choose the shorter part of the two divided areas to add to stack.

    public static void quickSort(int[] nums, int startIndex, int endIndex){
        if(startIndex >= endIndex){
            return;
        }
        while(startIndex < endIndex){
            int partitionIndex = partition(nums, startIndex, endIndex);
            if(partitionIndex - startIndex < endIndex - partitionIndex){
                quickSort(nums, startIndex, partitionIndex - 1);
                startIndex = partitionIndex + 1;
            }else{
                quickSort(nums, partitionIndex + 1, endIndex);
                endIndex = partitionIndex - 1;
            }
        }
    }

    public static int partition(int[] nums, int startIndex, int endIndex){
        int pivot = choosePivot(nums, startIndex, endIndex);
        swap(nums, startIndex, pivot);
        int left = startIndex;
        int right = endIndex;
        while(left < right){
            while(left < right && nums[right] >= nums[startIndex]){
                right--;
            }
            while(left < right && nums[left] < nums[startIndex]){
                left++;
            }
            swap(nums, left, right);
        }
        swap(nums, startIndex, left);
        return left;
    }

    public static int choosePivot(int[] nums, int startIndex, int endIndex){
        int middleIndex = startIndex + (endIndex - startIndex) / 2;
        if(nums[startIndex] > nums[endIndex]){
            if(nums[endIndex] > nums[middleIndex]){
                return middleIndex;
            }
            return endIndex;
        }else{
            if(nums[startIndex] > nums[middleIndex]){
                return startIndex;
            }
            return middleIndex;
        }
    }

    public static void swap(int[] nums, int indexA, int indexB){
        if(indexA == indexB){
            return;
        }
        nums[indexA] ^= nums[indexB];
        nums[indexB] ^= nums[indexA];
        nums[indexA] ^= nums[indexB];
    }

Merge Sort

Merge sort is kind of like Quick sort as they both use the “Divide&Conquer” algorithm, so for Merge Sort, there are two main steps:

divide the array to two arrays from the middle, keep dividing the array until there are just single value in the target array
after dividing all the values, merge them back - treate them as merge two sorted array - after merging them all, then we will get a sorted array

    public static void mergeSort(int[] nums, int startIndex, int endIndex) {
        if (startIndex >= endIndex) {
            return;
        }
        int divideIndex = startIndex + (endIndex - startIndex) / 2;
        mergeSort(nums, startIndex, divideIndex);
        mergeSort(nums, divideIndex + 1, endIndex);
        merge(nums, startIndex, endIndex, divideIndex);
    }


    public static void merge(int[] nums, int startIndex, int endIndex, int boundIndex) {
        //basically we are tring to sort two sorted array and merge them to one here, just using a single array, nothing special"
        int itIndex = 0;
        int indexLeft = startIndex;
        int indexRight = boundIndex + 1;
        int[] tempArray = new int[endIndex - startIndex + 1];
        while (indexLeft <= boundIndex || indexRight <= endIndex) {
            if (indexLeft > boundIndex) {
                tempArray[itIndex++] = nums[indexRight++];
                continue;
            }
            if (indexRight > endIndex) {
                tempArray[itIndex++] = nums[indexLeft++];
                continue;
            }
            if (nums[indexRight] <= nums[indexLeft]) {
                tempArray[itIndex++] = nums[indexRight++];
            } else {
                tempArray[itIndex++] = nums[indexLeft++];
            }
        }
        for (int i = startIndex; i <= endIndex; i++) {
            nums[i] = tempArray[i - startIndex];
        }
    }

O(n) sorting Algorithms

We have introduced some kinds of sorting algorithms, but all of them are based on comparison, which means during the sorting process we always compare one value with another - which kind of algorithm can not better than O(nlogn). Now we will introduce some sorting algorithms and there main thought are not base on comparison

Bucket Sort

Another example of “Divide & Conquer” algorithm - the main thought of bucket sort is we divide the unsorting nums to some buckets(with orders), sort them inside the bucket and merge them back together. Kind of like the merge sort right, but when we create buckets, the buckets are already in good order. So lets talk about the complexity, imagine we have k buckets, and in each bucket we use quick sort, so in each bucket, the complexity are O(n/klog(n/k)), and the whole complexity for all buckets are O(nlog(n/k)) - you already noticed right? When the k is large enough, log(n/k) is a constant, so the time complexity for bucket sort could be O(n) Let’s have an example - let’s say we have a float array and the values are in this range - (0,1).

    public static void bucketSort(float[] nums){
        //try use array length / 2 as amount of buckets
        int k = nums.length / 2;
        List<List<Float>> buckets = new ArrayList();

        for (int i = 0; i < k; i++) {
            buckets.add(new ArrayList<>());
        }

        for(var num: nums){
            //choose a bucket
            int bucketIndex = (int)(num * k);
            buckets.get(bucketIndex).add(num);
        }

        for(var list: buckets){
            //just use java built-in sorting algorithms here - also we could use the quick sort or merge sort
            Collections.sort(buckets);
        }

        int i = 0;
        for (List<Float> bucket : buckets) {
            for (float num : bucket) {
                nums[i++] = num;
            }
        }
    }

But the main point for bucket sort is how to distribute the numbers into buckets evenly, if the distribution is not even, then bucket sort might be slow as fuck - talk later //todo

Counting sort

Now let us talk about the counting sort, counting sort is a special kind of bucket sort - when a non-negative integer array is given, we could find the largest value of the array, and create another array - sorting this non-negative array by the index of the new array:

    public static void countingSort(int[] nums){
        int max = -1;
        for(var num: nums){
            if(num > max){
                max = num;
            }
        }
        if(max < 0){
            return;
        }

        int[] temp = new int[max + 1];
        for(var num: nums){
            temp[num]++;
        }
        int index = 0;
        for(int i = 0; i < temp.length; i++){
            if(temp[i] > 0){
                nums[index++] = i;
                temp[i]--;
                i--;
            }
        }
    }

So as we can see, the time complexity is O(mn) - m is the largest value in the array. If the m is super large(maybe large than n), then the complexity will be larger than O(n^2) and of course larger than O(nlogn). But as you can see, we are sorting values now, but what if we have bunch of objects to sort? If we want to sort the person objects by age order, using current counting sort could just sort the age values but not the objects. So we can introduce a way to sort the objects - using the prefix sum to find out the position for different object. Now we have a sorted buckets array - buskets:[0,0,1,2,3,2,0], so for buckets[num], buckets[num] - 1 is the index of the last num will be, then we can sort the objects not just values:

    public static void countingSortPerson(Person[] persons){
        if(persons.length <= 0){
            return;
        }
        var temp = countingSort(persons);
        Person[] res = new Person[persons.length];
        int prefixSum = persons.length;
        for(int i = persons.length - 1; i >= 0; i--){
            int index = temp[person[i]]--;
            res[index] = person[i];
        }

        for(int i = 0; i < persons.length; i++){
            persons[i] = res[i];
        }
    }

    public static int[] countingSortWithPrefixSum(Person[] persons){
        int maxAge = -1;
        for(Person person : persons){
            if(person.age > maxAge){
                maxAge = person.age;
            }
        }
        int[] temp = new int[maxAge + 1];
        for(Person person : persons){
            temp[person.age]++;
        }
        int prefixSum = 0;
        for(int i = 0; i < temp.length; i++){
            if(temp[i] > 0){
                temp[i] += prefixSum;
                prefixSum = temp[i];
            }
        }
        return temp;
    }

Radix sort

Radix sort is a kind of Counting sort. Let’s say there are an array with 10^6 values, each value is less than 10^8, so if we use counting sort, means we need a temp array (length is 10^8) to store the amount, the complexity will more than O(n^2). That we could use Radix sort here, from the last digit, we perform a counting sort on each digit of the number. Then the complexity will be less than 8 * O(n) - which means the complexity of radis sort is O(n).

    public static void radixSort(int[] nums){
        int maxNumber = Integer.MIN_VALUE;
        for (int num : nums)
            if (num > maxNumberm)
                maxNumber = num;
        for (int exp = 1; exp <= maxNumber; exp *= 10) {
            countingSortDigit(nums, exp);
        }
    }

    public static void countingSort(int nums, int divideNumber){
        int[] temp = new int[10];
        for(var num: nums){
            temp[(num/divideNumber) % 10]++;
        }
        for (int i = 1; i < 10; i++) {
            temp[i] += temp[i - 1];
        }
        int[] res = new int[nums.length];
        for(int i = nums.length - 1; i >= 0; i--){
            int d = (nums[i] / divideNumber) % 10;
            int index = temp[d] - 1;
            res[index] = nums[i];
            temp[d]--;
        }
        for(int i = 0; i < nums.length; i++){
            nums[i] = temp[i];
        }
    }