Sort Alogorithm
Notes: 大部分动画图片来自于 https://github.com/MisterBooo/Article
Insertion Sort (插入排序)
- 原理
- 算法
- 将整个数组分为已排序和未排序的部分(初始时将数组第一个元素当做已排序的部分);
- 每次选择未排序部分的第一个元素,将它插入到已排序部分的“合适位置”;
- 不断重复步骤2,直到整个数组有序。
- 代码
void insertionSort(int array[], int length, int gap, bool isAscend)
{
int temp, j;
// i always points to the first position of unsorted subarray.
for (int i = gap; i < length; i++)
{
temp = array[i];
// j's initial value is always the end of sorted subarray, and its direction of movement is from back to front.
for (j = i - gap; j >= 0 && (isAscend ? temp < array[j] : temp > array[j]); j -= gap)
{
array[j + gap] = array[j]; // right shift.
}
array[j + gap] = temp; // insert the element i into the proper position.
}
}
void simpleInsertionSort(int array[], int length, bool isAscend)
{
// Simple inserion sort, the gap is 1.
insertionSort(array, length, 1, isAscend);
}
算法稳定,平均时间复杂度:O(n²)
Shell Sort(希尔排序)
- 原理
算法
希尔排序是特殊的插入排序,通过不断调整步长(gap)来分成不同的组,每一组都进行简单的插入排序,最后达到整个数组有序。代码
void shellSort(int array[], int length, bool isAscend)
{
// The initial value of gap is length / 2, then it will divide by 2 each cycle.
for (int gap = length >> 1; gap > 0; gap >>= 1)
insertionSort(array, length, gap, isAscend);
}
算法不稳定,时间复杂度:与gap有关,最好情况为O(n*log(n))
Bubble Sort(冒泡排序)
- 原理
- 算法
- 每次都与后继的元素比大小,若是不符合排序顺序,则进行交换;
- 每一轮交换后,缩减冒泡排序的区间,因为冒泡排序的有序子数组是从后往前增长的,后面的有序子数组无需进行比较。
- 若是有一轮所有元素都未发生交换,说明数组已经有序,可以结束循环。
- 代码
void bubbleSort(int array[], int length, bool isAscend)
{
bool isSortedFlag = false;
for (int i = length - 1; i >= 0; i--)
{
isSortedFlag = false;
for (int j = 0; j < i; j++)
{
if (isAscend ? array[j] > array[j + 1] : array[j] < array[j + 1])
{
// exchange the value between array[j] and array[j + 1]
array[j] = array[j] ^ array[j + 1];
array[j + 1] = array[j] ^ array[j + 1];
array[j] = array[j] ^ array[j + 1];
isSortedFlag = true;
}
}
if (!isSortedFlag)
{
break;
}
}
}
算法稳定,平均时间复杂度:O(n²)
Selection Sort(选择排序)
- 原理
- 算法
- 确定一个待交换的元素位置;
- 从该元素后序的数组中选择一个最大或最小的元素与之交换;
- 待交换元素的位置+1,重复步骤2,有序数组从前往后增长到整个数组长度length - 1 为止。
- 代码
void selectionSort(int array[], int length, bool isAscend)
{
int index = 0; // Store the index of the value when it's max or min.
// i points to the first element of unsorted subarray.
for (int i = 0; i < length - 1; i++)
{
index = i;
// j is used to traverse the unsorted subarray.
for (int j = i + 1; j < length; j++)
{
if (isAscend ? array[index] > array[j] : array[index] < array[j])
{
index = j;
}
}
// Find the max or min value and exchange it with i.
if (i != index)
{
array[i] = array[i] ^ array[index];
array[index] = array[i] ^ array[index];
array[i] = array[i] ^ array[index];
}
}
}
算法不稳定,平均时间复杂度:O(n²)
Merge Sort(二路归并排序)
- 原理
- 算法
- 递归地将数组不断分为左右两个部分,直到每个部分只有一个元素;
- 分别从两个子数组中按排序顺序将元素取出合并到一个大数组中,此时,大数组里的元素已经有序;
- 将大数组当做步骤2中的子数组,继续执行步骤2,整个递归执行完后,排序完毕。
- 代码
void merge(int array[], int start, int mid, int end, bool isAscend)
{
int* temp = (int *)calloc(end - start + 1, sizeof(int)); // temp array is for the merge operation from two devided parts
int index1 = start, index2 = mid + 1, tempIndex = 0;
// Merge the two parts into temp array in some order.
while (index1 <= mid && index2 <= end)
{
if (array[index1] <= array[index2])
temp[tempIndex++] = isAscend ? array[index1++] : array[index2++];
else
temp[tempIndex++] = isAscend ? array[index2++] : array[index1++];
}
// If right part isn't completed, merge the rest into the temp array.
while(index1 <= mid)
temp[tempIndex++] = array[index1++];
// If left part isn't completed, merge the rest into the temp array as well.
while (index2 <= end)
temp[tempIndex++] = array[index2++];
// Copy the sorted temp array into the actual array.
for (index1 = start, tempIndex = 0; tempIndex < end - start + 1;)
array[index1++] = temp[tempIndex++];
free(temp);
}
void division(int array[], int start, int end, bool isAscend)
{
if (end > start)
{
int mid = (start + end) >> 1;
// Divide left part.
division(array, start, mid, isAscend);
// Divide right part.
division(array, mid + 1, end, isAscend);
// Though it has no regularity when dividing into two parts, there is a rule that when merging two parts, the beginning of two parts both are the min(max) element, and the end of two parts are the max(min) element in their respective parts. If the end of right part is less than the beginning of the left, means these two parts have been sorted, don't need to do merging operation, or else carry out the merge function.
if (isAscend ? array[mid] > array[mid + 1] : array[mid] < array[mid + 1])
merge(array, start, mid, end, isAscend);
}
}
void mergeSortRecursion(int array[], int length, bool isAscend)
{
if (length > 0)
division(array, 0, length - 1, isAscend);
}
算法稳定,平均时间复杂度:O(n*log(n))
Quick Sort(快速排序)
- 原理
- 算法
- 首先确定pivot,如果pivot不是第一个元素,则让pivot与第一个元素交换(方便后续编程);
- 定义左指针left,右指针right,初始时分别指向第一个元素与最后一个元素;
- right指针首先与pivot比较并且向左移动,移动到合适位置(如果是从小到大排序,则是遇到比pivot小的元素所在的位置)时,与pivot进行交换,然后换成left指针与pivot进行比较并向右移动,同样移动到合适位置(如果是从小到大排序,则是遇到比pivot大的元素所在的位置)时,再次与pivot进行交换,然后继续让right指针移动……直到两指针相遇,相遇位置则为此趟排序的pivot的所在位置;
- 此时pivot将数组分为了左右两部分,左边部分的元素都比pivot小,右边元素都比pivot大,对左右两部分重复步骤1~3,最后可得排序完毕的序列。
注意:pivot的选取会影响子数组的划分是否均匀(即 pivot的左右部分是否大致相等),会直接影响快速排序的效率,因此本算法采取的办法是:选取开始,中间以及结尾的三个元素,取三者的中位数来作为 pivot,从概率上减少因 pivot选取不当导致左右两部分差异巨大进而造成效率低下的问题。
- 代码
// Find the median
int findMidVal(int a, int b, int c)
{
int max = a > b ? a : b;
int min = a < b ? a : b;
max = max > c ? max : c;
min = min < c ? min : c;
return a + b + c - max - min;
}
int partition(int array[], int start, int end, bool isAscend)
{
// Find the proper pivot.
int mid = (start + end) >> 1;
int pivotValue = findMidVal(array[start], array[mid], array[end]);
int pivotIndex = (pivotValue == array[start]) ? start : (pivotValue == array[mid] ? mid : end);
int left = start, right = end;
// Swap the pivot with the first element because it's convenient for the code implementation if the pivot is the first element. According to the following code, the first element will be overwritten by the proper right element, at that time, the right element will appear twice, and then it will also be overwritten by other left element...all elements have chance to appear twice except for the first one, on account of that, if the pivot is the first one, the first element can appear not only in array[0], but also in pivot, and don't need to worry about the overwritten problem. Eventually, the pivot can overwrite the element where the left and the right index point simultaneously.
if (pivotIndex != start)
{
array[start] = array[start] ^ array[pivotIndex];
array[pivotIndex] = array[start] ^ array[pivotIndex];
array[start] = array[start] ^ array[pivotIndex];
pivotIndex = start;
}
while (left < right)
{
while (left < right && (isAscend ? array[right] >= pivotValue : array[right] <= pivotValue))
right--;
if (left < right)
array[left++] = array[right];
while (left < right && (isAscend ? array[left] <= pivotValue : array[left] >= pivotValue))
left++;
if (left < right)
array[right--] = array[left];
}
array[left] = pivotValue;
return left;
}
void quickSort(int array[], int start, int end, bool isAscend)
{
if (end > start)
{
int pivotIndex = partition(array, start, end, isAscend);
if (pivotIndex > start)
quickSort(array, start, pivotIndex - 1, isAscend);
if (pivotIndex < end)
quickSort(array, pivotIndex + 1, end, isAscend);
}
}
算法不稳定,平均时间复杂度:O(n*log(n))
Heap Sort(堆排序)
- 原理
“堆”实际上是一棵完全二叉树,最大堆就是所有父结点都大于子节点,且根结点是最大结点的完全二叉树,如果我们用数组来存储这棵二叉树,则该数组将会满足如下关系:
- 若某元素为父结点root,则该元素的左孩子结点下标lChild = 2 * root + 1,右孩子结点下标rChild = 2 * root + 2;
- 该数组的最后两个元素是这棵二叉树的最后两个叶子结点;
- 由上述性质可知,该二叉树最后一个“父结点”的坐标为:(数组长度length / 2) - 1。
- 算法
- 从该二叉树最后一个父结点开始递归地建“堆”(即堆的建立是从下往上不断通过堆的调整而建立的,而堆的调整是从当前结点往下递归调整);
- 每建完一次“堆”,数组的第一个元素(二叉树的根节点)就是这些元素的最值,因此可将第一个元素与最后一个元素交换,并将需要建堆的数组长度缩小,此时堆因为根节点的变换,破坏了堆的性质,需要重新建堆;
- 反复进行步骤2,则有序数组将从后往前逐渐形成。
- 代码
// Adjust the heap to maximum heap or minimum heap.
void heapify(int array[], int length, int rootIndex, bool isAscend)
{
int extremum = rootIndex;
int lChild = 2 * rootIndex + 1;
int rChild = 2 * rootIndex + 2;
if (lChild < length && (isAscend ? array[lChild] > array[extremum] : array[lChild] < array[extremum]))
extremum = lChild;
if (rChild < length && (isAscend ? array[rChild] > array[extremum] : array[rChild] < array[extremum]))
extremum = rChild;
// if the maximum or minimum is not the root index.
if (extremum != rootIndex)
{
// Swap the value of root with the smallest(largest) one.
array[rootIndex] = array[rootIndex] ^ array[extremum];
array[extremum] = array[rootIndex] ^ array[extremum];
array[rootIndex] = array[rootIndex] ^ array[extremum];
// Recursively adjusting the heap
heapify(array, length, extremum, isAscend);
}
}
void heapSort(int array[], int length, bool isAscend)
{
// Build the heap(complete binary tree), the index of the first non-leaf node is "length/2 - 1", we choose it as the first node (which has child nodes) to recursively build heap.
for (int adjIndex = (length / 2) - 1; adjIndex >= 0; adjIndex--)
heapify(array, length, adjIndex, isAscend);
// Owing to the first element is the largest or smallest one in the heap(i.e. the root node in complete binary tree), after swapping the first element with the last element and adjusting heap each cycle, you will get the sorted array at last.
for (int i = length - 1; i > 0; i--)
{
array[0] = array[0] ^ array[i];
array[i] = array[0] ^ array[i];
array[0] = array[0] ^ array[i];
heapify(array, i, 0, isAscend);
}
}
算法不稳定,平均时间复杂度:O(n*log(n))
堆的插入与删除操作:堆的插入操作是将新结点放在堆的末端,从下往上重新调整堆;堆的删除则是对堆顶元素替换为末尾元素,并将原末尾元素删除后,从上往下进行堆的调整。
各类排序算法的总结
Name | Best | Average | Worst | Stable | Method |
---|---|---|---|---|---|
Insertion sort | n | n² | n² | Yes | Insertion |
Shell sort | n*log(n) | Depends on gap sequence | Depends on gap sequence | No | Insertion |
Bubble sort | n | n² | n² | Yes | Exchanging |
Selection sort | n² | n² | n² | No | Selection |
Merge sort | n*log(n) | n*log(n) | n*log(n) | Yes | Merging |
Quicksort | n*log(n) | n*log(n) | n² | No | Partitioning |
Heapsort | n, If all keys are distinct, n*log(n) | n*log(n) | n*log(n) | No | Selection |