📒📝📓 record seven sort algorithms.
they are: bubble sort 、selection sort、insertion sort、quick sort 、heap sort、merge sort、counting sort
Introduction
Some algorithms(selection, bubble, heap) work by moving elements to their final position, one at a time. You sort an array of size N, put 1 item in place and continue sorting and array of size N -1,
Some algorithms(insertion, quicksort, counting)put items to temporary place, closer to their final position, moving them closer to final position ,in each iteration.
Some algorithms(merge) start with a 'sorted list ' of one element, and merge unsorted items into it, one at a time
note
- assume we are sorting a list or array of N elements
- once sorted, smaller items are on the left, and larger items are on the right
Bubble Sort
best: O(N), worst: O(N^2)
starting on the left, compare adjacent items and keep 'bubbling' the the larger one to the right(it's in its final place), bubble sort the remaining N -1 items.
in each iteration, it will find the largest one in the rest of an array, and place it in the right.
pseudocode in C language:
void bubble_sort(int *a, int len){
int tmp;
for(int i = 0; i < len; i ++){
for(int j = 0; j < len - i - 1; i ++){
if(a[j] > a[j + 1]){
tmp = a[j];
a[j] = a[j + 1];
a[j + 1] = tmp;
}
}
}
}
Selection Sort
best/worst: O(N^2)
scan all items and find the smallest. Swap it into the position as the first item. Repeat the selection sort on the remaining N - 1 items
it just like the bubble sort, the outer iteration from 0 to N - 1, the inner iteration is to find the minimun item, and always swap it with i.
void selection_sort(int *a, int len){
int min, index, tmp;
for(int i = 0; i < len; i ++){
min= a[i];
for(int j = i; j < len; j ++){
if(a[j] < min){
min = a[j];
index = j;
}
tmp = a[i];
a[i] = a[index];
a[index] = tmp;
}
}
}
Insertion Sort
best: O(N), worst: O(N^2)
start with a sorted list of 1 element on the left, and N - 1 unsorted items on the right. Taking the first unsorted item (element #2) and insert it into the sorted list, moving elements as neccesary. We now have a sorted list of size 2, and N -2 unsorted items, repeat for all items.
void insertion_sort(int *a, int len){
int min;
for(int i = 0; i < len; i ++){
int j = i;
min = a[i];
while(j > 0 && a[j - 1] > min){
a[j] = a[j - 1];
j = j - 1;
}
a[j] = min;
}
}
Quick Sort
best: O(N * lgN), avg: O(N * lgN), worst: O(N^2)
there are many versions of QuickSort, which is one of the most popular sorting methods due to its speed O(N * lgN) average, but O(N^2) worst cae, here's a few:
- pick a 'pivot' item
- partition the other items by adding them to a 'less than pivot' sublist, and 'larger than pivot' sublist
- the pivot goes between the two sublists
- repeat the quicksort on the two sublists until you get to a sublist of size 1(which is sorted)
the most important in QuickSort is partition function
int partition(int *a, int low, int high){
int pivot = a[i];
int i = low;
int j = high;
while(i < j){
while(j > i && a[j] >= pivot)
j --;
if(j > i)
a[i] = a[j];
while(i < j && a[i] <=pivot)
i ++;
if(i < j)
a[j] = a[i];
}
a[i] = pivot;
return i;
}
int quick_sort(int *a, int i, int j){
if(i < j){
int pivot = partition(a, i, j);
quick_sort(a, i, pivot - 1);
quick_sort(a, pivot + 1, j);
however, the algorithms in the partition above has a defects, the time complexity can be O(N^2), we will elaborate how it happens now. If a bad pivot is chosen, you can imagine that the 'less than pivot' sublist is always empty, that means we are only creating a sublist of one item smaller each time, which gives us O(N^2)behavior in the worst case. If you choose the first item, it may be the smallest item in the sorted list and give worst-case behavior. You can choose a random item, or median-of three(front,middle,rear).
an improved partition function in QuickSort :
void compare(int *a, int *b){
if(*a > *b){
int tmp;
tmp = *a;
*a = *b;
*b = tmp;
}
}
void swap(int *a, int *b){
int tmp = *a;
*a = *b;
*b = tmp;
}
void partition(int *a, int low, int high){
int i = low;
int j = high;
int middle = (i + j) / 2;
compare(&a[i], &a[middle]);
compare(&a[i], &a[j]);
compare(&a[middle], &a[j]);
swap(&a[i], &a[middle]);
int picot = a[i];
while(i < j){
while(j > i && a[j] >= picot)
j --;
if(j > i)
a[i] = a[j];
while(i < j && a[i] <=pivot)
i ++;
if(i < j)
a[j] = a[i];
}
a[i] = pivot;
return i;
}
Merge Sort
best/avg/worst: O(N * lgN)
Merge sort is a divid-and-conquer algorithm based on the idea of breaking down a list into several sublists until each sublists consists of a single element and merging those sublists to a sorted list.
- divide the unsorted lists into N sublists, each containing 1 element.
- take adjacent pair of two singleton lists and merge them to from a list of 2 elements, N will now convert into N / 2 lists of size 2.
- repeat the process till a single sorted lists of obtained.
void merge(int *a, int low, int middle, int high){
int left_low = low;
int left_high = middle;
int right_low = middle + 1;
int right_high = high;
int k = 0;
int *tmp = (int *)malloc(sizeof(int) * (high - low + 1));
for(k < high; left_low < left_high && right_low < right_high; k ++){
if(a[left_low] < a[right_low])
tmp[k] = a[left_low ++];
else
tmp[k] = a[right_low ++];
}
if(left_low < left_high){
for(int i = left_low; i < left_high; i ++)
tmp[k ++] = a[i];
if(right_low < right_high){
for(int i = right_low; i < right_low; i ++)
tmp[k ++] = a[i];
for(int i = 0; i < high - low + 1; i ++)
a[low + i] = tmp[i];
free(tmp);
}
void merge_sort(int *a, int low, int high){
if(low < high){
int middle = (low + high) / 2;
merge_sort(a, low, middle);
merge_sort(a, middle + 1, high);
merge(a, low, middle, high);
}
}
Heap Sort
best/avg/worst: O(N * lgN)
Heaps can be used in sorting an array, in max-heaps, maximum element will always be at the root, Heap sort uses this property of heap to sort the array. Adding all items into a heap, Pop the largest item from the heap and insert it at the end(final position), and repeat for all items.
time complexity analysis:
Creating the heap needs O(N * lgN).Poping items from heap is O(1), and adjust the heap after pop needs O(lgN), there are N pops, so there is another O(N * lgN) factor, which is O(N * lgN) overall.
Heap sort has O(N * lgN) behavior, even in the worst case, making it good for real-time applications.
void adjust_heap(int *a, int root, int len){
int left_child = 2 * root + 1;
int right_child = 2 * root + 2;
int max_child;
if(left_child >= size)
return;
else{
if(righ_child >= size)
max_child = left_child;
else
max_child = a[left_child] > a[right_child]? left_child, right_child;
if(a[root] < a[max_child]){
int tmp = a[root];
a[root] = a[max_child];
a[max_child] = tmp;
adjust_heap(a, max_child, len);
}
}
}
void build_heap(int *a, int len){
for(int i = len / 2 - 1; i >=0; i --)
adjust_heap(a, i, len);
}
void heap_sort(int *a, int len){
build_heap(a, len);
for(int i = len - 1; i >=0; i ++){
int tmp = a[0];
a[0] = a[i];
a[i] = tmp;
adjust_heap(a, 0, i);
}
}
Counting Sort
best/avg/worst: O(N)
Assuming the data are integers, in a range of 0-K. Creating an array of size K to keep track of how many items appears(3 items with value 0, 4 items with value 1, etc.). Given this count, you can tell the position of an item —— all the 1's must come after the 0's, of which these are 3. Therefore, the 1's start at item #4. Thus, we can scan the items and insert them into their proper position. In Counting Sort, the frequencies of distincts of the array to be sorted is counted and stored in an auxiliary array, by mapping its as value of the auxiliary array.
- Creating the auxiliary array is O(N)
- Inserting items into their proper position needs O(N)
void counting_sort(int *a, int len){
// find the maximum value
int i;
int max = 0;
for(i = 0; i < len; i ++){
if(a[i] > max)
max = a[i];
// initialize auxiliary array
int auxiliary[max];
for(i=0; i < max; i ++)
auxiliary[i] = 0;
for(i = 0; i < len; i ++)
auxiliary[a[i]] ++;
int sorted_a[len];
int j = 0;
for(i = 0; i <= max; i ++){
counts = auxiliary[i];
while(count --){
sorted_a[j] = i;
j ++;
}
}
for(i = 0; i < len; i ++){
a[i] = auxiliary[i];
}