3.1 MergeSort
Mergesort: the algorithm behind Java's sort for objects
1. Merge sort (recursive, top-down)
-
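Idea:
- Divide the array into two halves
- Recursively sort each half
- Merge the two halves
- Copy the items into an auxiliary array aux[]
- This gives two sorted subarrays: aux[lo] ~ aux[mid] and aux[mid+1] ~ aux[hi]
- Keep one index into each (i and j), compare aux[i] with aux[j], and copy the smaller one back to a[]; if they are equal, copy aux[i] back to a[]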
-
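Performance (size N):
- worst case: at most N lg N compares and 6 N lg N array accesses
- best case (input array is already sorted): ~ 1/2 N lg N compares
- optimized version (one extra line in sort(), i.e. optimization 2, which skips merge() when a[mid] <= a[mid+1]): N - 1 compares on already-sorted input (when no insertion-sort cutoff is used)
- Complexity:
- in terms of the number of compares: mergesort is optimal (any compare-based sort needs ~ N lg N compares in the worst case)
- in terms of space usage: mergesort is not optimal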
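The compare counts above can be checked empirically. The following is a minimal sketch rather than the implementation from these notes: it sorts int[] instead of Comparable[], and the class name MergeCompareCount, the compares counter, and the skipIfOrdered flag are names invented for this illustration.

```java
import java.util.Random;

final class MergeCompareCount {
    static long compares; // number of calls to less() in the current run

    static boolean less(int v, int w) {
        compares++;
        return v < w;
    }

    // Standard merge of a[lo..mid] and a[mid+1..hi] through aux[].
    static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        System.arraycopy(a, lo, aux, lo, hi - lo + 1);
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) a[k] = aux[j++];
            else if (j > hi) a[k] = aux[i++];
            else if (less(aux[j], aux[i])) a[k] = aux[j++];
            else a[k] = aux[i++];
        }
    }

    static void sort(int[] a, int[] aux, int lo, int hi, boolean skipIfOrdered) {
        if (hi <= lo) return;
        int mid = lo + (hi - lo) / 2;
        sort(a, aux, lo, mid, skipIfOrdered);
        sort(a, aux, mid + 1, hi, skipIfOrdered);
        // Optimization 2: the halves are already in order, so no merge is needed.
        if (skipIfOrdered && !less(a[mid + 1], a[mid])) return;
        merge(a, aux, lo, mid, hi);
    }

    // Sorts a[] in place and returns how many compares were used.
    static long countCompares(int[] a, boolean skipIfOrdered) {
        compares = 0;
        sort(a, new int[a.length], 0, a.length - 1, skipIfOrdered);
        return compares;
    }

    public static void main(String[] args) {
        int n = 1024; // lg n = 10
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) sorted[i] = i;
        int[] random = new Random(42).ints(n, 0, 1_000_000).toArray();
        System.out.println("sorted, no check:   " + countCompares(sorted.clone(), false));
        System.out.println("sorted, with check: " + countCompares(sorted.clone(), true));
        System.out.println("random, no check:   " + countCompares(random, false));
    }
}
```

For N = 1024, sorted input takes (N/2) lg N = 5120 compares without the check and N - 1 = 1023 with it, while random input stays at or below N lg N = 10240.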
-
Memory (size N):
- extra memory: proportional to N (the aux[] array)
- by contrast, an in-place sorting algorithm may use only about c log N extra memory, e.g. insertion sort, selection sort, shellsort
-
Stability:
- sort(): stable
- merge(): stable, because on equal keys it takes the item from the left subarray
-
Java Implementation
public class Merge {

    // Merges the sorted halves a[lo..mid] and a[mid+1..hi] into a[lo..hi].
    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        // assert <boolean expression>: if the expression is true, execution continues;
        // if it is false, an AssertionError is thrown (assertions are enabled with java -ea).
        assert isSorted(a, lo, mid);     // precondition: a[lo..mid] sorted
        assert isSorted(a, mid + 1, hi); // precondition: a[mid+1..hi] sorted

        // copy
        for (int k = lo; k <= hi; k++) {
            aux[k] = a[k];
        }

        // merge
        int i = lo;
        int j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) {
                // i is past mid: the left subarray is exhausted,
                // so just copy the rest of the right subarray back to a[]
                a[k] = aux[j++]; // same as two statements: a[k] = aux[j]; j++;
            } else if (j > hi) {
                // j is past hi: same reasoning for the right subarray
                a[k] = aux[i++];
            } else if (less(aux[j], aux[i])) {
                // aux[j] < aux[i]: copy aux[j] back to a[]
                a[k] = aux[j++];
            } else {
                // aux[j] >= aux[i]: copy aux[i] back to a[]
                a[k] = aux[i++];
            }
        }

        assert isSorted(a, lo, hi); // postcondition: a[lo..hi] sorted
    }

    private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi) {
        // base case of the recursion
        if (hi <= lo) {
            return;
        }

        // Optimization 1: mergesort has too much overhead for small subarrays
        // (cutoff = 7), so use insertion sort for them instead.
        int cutoff = 7;
        if (hi <= lo + cutoff - 1) {
            // Insertion.java sits in the same directory as Merge.java; this call assumes
            // it provides sort(Comparable[] a, int lo, int hi), which sorts a[lo..hi].
            Insertion.sort(a, lo, hi);
            return;
        }
        // end of optimization 1

        int mid = lo + (hi - lo) / 2; // same midpoint computation as binary search
        sort(a, aux, lo, mid);
        sort(a, aux, mid + 1, hi);

        // Optimization 2: if the largest item in the first half is <= the smallest
        // item in the second half, the merge is unnecessary.
        if (!less(a[mid + 1], a[mid])) {
            return;
        }
        // end of optimization 2

        merge(a, aux, lo, mid, hi);
    }

    public static void sort(Comparable[] a) {
        Comparable[] aux = new Comparable[a.length];
        sort(a, aux, 0, a.length - 1);
    }

    private static boolean less(Comparable v, Comparable w) {
        return v.compareTo(w) < 0;
    }

    private static void exch(Comparable[] a, int i, int j) {
        Comparable swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }

    private static boolean isSorted(Comparable[] a) {
        return isSorted(a, 0, a.length - 1);
    }

    // Is a[lo..hi] sorted? Needed by the assertions in merge().
    private static boolean isSorted(Comparable[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            if (less(a[i], a[i - 1])) {
                return false;
            }
        }
        return true;
    }
}
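Optimization 3: eliminate the copy into aux[] by switching the roles of the two arrays at each level of the recursion. In merge(), swap aux[] and a[] inside the loop (read from one array, write to the other); in the recursive sort(), swap the aux and a arguments in the calls to sort() and merge(). This saves time (but not space).
2. Merge sort (non-recursive, bottom-up)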
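Optimization 3 can be sketched as follows. This is an illustrative version under stated assumptions, not the code from these notes: the class name MergeNoCopy and the parameter names src and dst are invented here, generics replace the raw Comparable type, and the cutoff and skip-merge optimizations are left out to keep the role swap visible.

```java
import java.util.Arrays;

final class MergeNoCopy {
    // Merges sorted src[lo..mid] and src[mid+1..hi] into dst[lo..hi]; no copy step.
    private static <T extends Comparable<? super T>> void merge(
            T[] src, T[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) dst[k] = src[j++];
            else if (j > hi) dst[k] = src[i++];
            else if (src[j].compareTo(src[i]) < 0) dst[k] = src[j++];
            else dst[k] = src[i++]; // ties come from the left half, so the sort stays stable
        }
    }

    // Leaves dst[lo..hi] sorted. Both arrays must hold the same items in [lo..hi] on entry.
    private static <T extends Comparable<? super T>> void sort(
            T[] src, T[] dst, int lo, int hi) {
        if (hi <= lo) return;      // dst[lo] already holds the single item
        int mid = lo + (hi - lo) / 2;
        sort(dst, src, lo, mid);   // roles swapped: sorted halves end up in src
        sort(dst, src, mid + 1, hi);
        merge(src, dst, lo, mid, hi);
    }

    static <T extends Comparable<? super T>> void sort(T[] a) {
        T[] aux = a.clone();       // still N extra space, but only one copy overall
        sort(aux, a, 0, a.length - 1);
    }

    public static void main(String[] args) {
        Integer[] a = {5, 2, 9, 1, 5, 6};
        sort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

The single clone() at the top replaces the per-merge copy loop, which is where the time saving comes from; the auxiliary array is still needed, so the space cost is unchanged.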
-
Idea:
- Pass through the whole array, merging subarrays of size 1
- Start again from the beginning and repeat for size = 2, 4, 8, 16, ...
-
Java implementation
public class MergeBU {

    // Merges the sorted halves a[lo..mid] and a[mid+1..hi] into a[lo..hi].
    private static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi) {
        // assert <boolean expression>: if the expression is true, execution continues;
        // if it is false, an AssertionError is thrown (assertions are enabled with java -ea).
        assert isSorted(a, lo, mid);     // precondition: a[lo..mid] sorted
        assert isSorted(a, mid + 1, hi); // precondition: a[mid+1..hi] sorted

        // copy
        for (int k = lo; k <= hi; k++) {
            aux[k] = a[k];
        }

        // merge
        int i = lo;
        int j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if (i > mid) {
                // i is past mid: the left subarray is exhausted,
                // so just copy the rest of the right subarray back to a[]
                a[k] = aux[j++]; // same as two statements: a[k] = aux[j]; j++;
            } else if (j > hi) {
                // j is past hi: same reasoning for the right subarray
                a[k] = aux[i++];
            } else if (less(aux[j], aux[i])) {
                // aux[j] < aux[i]: copy aux[j] back to a[]
                a[k] = aux[j++];
            } else {
                // aux[j] >= aux[i]: copy aux[i] back to a[]
                a[k] = aux[i++];
            }
        }

        assert isSorted(a, lo, hi); // postcondition: a[lo..hi] sorted
    }

    public static void sort(Comparable[] a) {
        int n = a.length;
        Comparable[] aux = new Comparable[n];
        // sz is the size of the subarrays being merged: 1, 2, 4, 8, ...
        for (int sz = 1; sz < n; sz = sz + sz) {
            // merge a[lo..lo+sz-1] with a[lo+sz..lo+2*sz-1]; the last subarray may be shorter
            for (int lo = 0; lo < n - sz; lo += sz + sz) {
                merge(a, aux, lo, lo + sz - 1, Math.min(lo + sz + sz - 1, n - 1));
            }
        }
    }

    private static boolean less(Comparable v, Comparable w) {
        return v.compareTo(w) < 0;
    }

    private static void exch(Comparable[] a, int i, int j) {
        Comparable swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }

    private static boolean isSorted(Comparable[] a) {
        return isSorted(a, 0, a.length - 1);
    }

    // Is a[lo..hi] sorted? Needed by the assertions in merge().
    private static boolean isSorted(Comparable[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++) {
            if (less(a[i], a[i - 1])) {
                return false;
            }
        }
        return true;
    }
}
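To see which subarrays the bottom-up passes merge, the loop structure can be run on its own. The sketch below only records the (lo, mid, hi) triples that would be passed to merge(); the class name BottomUpTrace and the method mergeRanges are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BottomUpTrace {
    // Returns the {lo, mid, hi} arguments of every merge() call for an array of length n.
    static List<int[]> mergeRanges(int n) {
        List<int[]> ranges = new ArrayList<>();
        for (int sz = 1; sz < n; sz += sz) {
            for (int lo = 0; lo < n - sz; lo += sz + sz) {
                int mid = lo + sz - 1;
                int hi = Math.min(lo + sz + sz - 1, n - 1); // last subarray may be short
                ranges.add(new int[] {lo, mid, hi});
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (int[] r : mergeRanges(5)) {
            System.out.println("merge" + Arrays.toString(r));
        }
    }
}
```

For n = 5 the calls are (0, 0, 1) and (2, 2, 3) in the size-1 pass, (0, 1, 3) in the size-2 pass, and (0, 3, 4) in the size-4 pass. The leftover item at index 4 is skipped by the first two passes because lo < n - sz fails, and it is merged in only by the last one.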
3. Comparator interface
-
Advantages
- Compared with Comparable, a Comparator supports multiple orderings for a given data type
-
Usage:
Create a Comparator object
-
Pass it as the second argument to Arrays.sort to sort by a custom order
String[] a;
...
// use the natural order
Arrays.sort(a);
// use the order defined by a Comparator<String> object
Arrays.sort(a, String.CASE_INSENSITIVE_ORDER);
Arrays.sort(a, new BritishPhoneBookOrder());
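A runnable version of the calls above might look like the following sketch. BritishPhoneBookOrder is not defined in these notes, so a hypothetical ByLength comparator stands in for the custom order; the class name ComparatorUsage and the sample names are also invented for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ComparatorUsage {
    // Hypothetical custom order: shorter strings first.
    static class ByLength implements Comparator<String> {
        public int compare(String v, String w) {
            return Integer.compare(v.length(), w.length());
        }
    }

    // Returns a sorted copy so the same input can be reused with several orders.
    static String[] sortedCopy(String[] a, Comparator<String> order) {
        String[] copy = a.clone();
        if (order == null) Arrays.sort(copy);      // natural order (compareTo)
        else Arrays.sort(copy, order);             // order defined by the Comparator
        return copy;
    }

    public static void main(String[] args) {
        String[] names = {"bob", "Alice", "carol", "Dave"};
        System.out.println(Arrays.toString(sortedCopy(names, null)));
        System.out.println(Arrays.toString(sortedCopy(names, String.CASE_INSENSITIVE_ORDER)));
        System.out.println(Arrays.toString(sortedCopy(names, new ByLength())));
    }
}
```

The natural order puts uppercase before lowercase (Alice, Dave, bob, carol), the case-insensitive order gives Alice, bob, carol, Dave, and the length order gives bob, Dave, Alice, carol. Alice stays ahead of carol in the last case because Arrays.sort on objects is stable.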
-
Application (example: insertion sort)
import java.util.Comparator;

public class Insertion {

    public static void sort(Object[] a, Comparator comparator) {
        int n = a.length;
        // move the pointer i to the right
        for (int i = 0; i < n; i++) {
            // j moves from right to left; a[j] is exchanged with each larger item to its left
            for (int j = i; j > 0; j--) {
                if (less(comparator, a[j], a[j - 1])) {
                    exch(a, j, j - 1);
                } else {
                    break;
                }
            }
        }
    }

    // is item v less than item w under comparator c?
    private static boolean less(Comparator c, Object v, Object w) {
        return c.compare(v, w) < 0;
    }

    // exchange a[i] and a[j]
    private static void exch(Object[] a, int i, int j) {
        Object swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }

    // check whether the array is sorted
    private static boolean isSorted(Object[] a, Comparator comparator) {
        for (int i = 1; i < a.length; i++) {
            if (less(comparator, a[i], a[i - 1])) {
                return false;
            }
        }
        return true;
    }
}
-
Comparator interface: implementing
- Idea: define a nested class that implements the Comparator interface, and write a compare() method in that class
import java.util.Comparator;

public class Student {
    public static final Comparator<Student> BY_NAME = new ByName();
    public static final Comparator<Student> BY_SECTION = new BySection();
    private final String name;
    private final int section;
    ...

    // static here, and on the fields above, means there is a single comparator
    // of each kind shared by the whole class
    private static class ByName implements Comparator<Student> {
        public int compare(Student v, Student w) {
            return v.name.compareTo(w.name);
        }
    }

    private static class BySection implements Comparator<Student> {
        public int compare(Student v, Student w) {
            // no risk of overflow here, as long as section numbers are small non-negative ints
            return v.section - w.section;
        }
    }
}
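The two comparators can be exercised as in the sketch below. The constructor, the accessors, and the sample roster are assumptions added for this illustration (the notes elide that part of the class), and BySection uses Integer.compare, an alternative to subtraction that stays correct for any int values.

```java
import java.util.Arrays;
import java.util.Comparator;

final class Student {
    static final Comparator<Student> BY_NAME = new ByName();
    static final Comparator<Student> BY_SECTION = new BySection();

    private final String name;
    private final int section;

    // Hypothetical constructor and accessors; not part of the original notes.
    Student(String name, int section) {
        this.name = name;
        this.section = section;
    }

    String name() { return name; }
    int section() { return section; }

    private static class ByName implements Comparator<Student> {
        public int compare(Student v, Student w) {
            return v.name.compareTo(w.name);
        }
    }

    private static class BySection implements Comparator<Student> {
        public int compare(Student v, Student w) {
            return Integer.compare(v.section, w.section); // overflow-safe
        }
    }

    @Override
    public String toString() { return name + "(" + section + ")"; }

    public static void main(String[] args) {
        Student[] roster = {
            new Student("Chen", 3), new Student("Andrews", 2),
            new Student("Battle", 3), new Student("Furia", 1)
        };
        Arrays.sort(roster, Student.BY_NAME);
        System.out.println(Arrays.toString(roster));
        Arrays.sort(roster, Student.BY_SECTION);
        System.out.println(Arrays.toString(roster));
    }
}
```

Because the comparators are public constants of the class, client code chooses an ordering by name (Student.BY_NAME, Student.BY_SECTION) without seeing the nested classes.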
4. Stability
- Why it matters: suppose the data is first sorted by order A and then sorted again by order B. If the sorting algorithm is not stable, the second sort may scramble the A order; if it is stable, items that are equal under order B still keep their original A order after the second sort
- Stable: insertion sort, mergesort
- Equal items never move past each other
- Not stable: selection sort, shellsort
- A long-distance exchange might move an item past some equal item
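The difference can be seen on a three-item input. In this minimal sketch the class name StabilityDemo, the sample strings, and the first-letter comparator are all invented for illustration: items are compared by their first character only, so "B1" and "B2" count as equal and the digit records their original order.

```java
import java.util.Arrays;
import java.util.Comparator;

final class StabilityDemo {
    // Compares strings by first character only, so "B1" and "B2" are equal keys.
    static final Comparator<String> BY_FIRST_CHAR =
        (v, w) -> Character.compare(v.charAt(0), w.charAt(0));

    private static void exch(String[] a, int i, int j) {
        String swap = a[i];
        a[i] = a[j];
        a[j] = swap;
    }

    // Stable: an item only moves past neighbours that are strictly larger.
    static void insertionSort(String[] a, Comparator<String> c) {
        for (int i = 1; i < a.length; i++) {
            for (int j = i; j > 0 && c.compare(a[j], a[j - 1]) < 0; j--) {
                exch(a, j, j - 1);
            }
        }
    }

    // Not stable: the long-distance exchange can jump an item over an equal one.
    static void selectionSort(String[] a, Comparator<String> c) {
        for (int i = 0; i < a.length; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (c.compare(a[j], a[min]) < 0) min = j;
            }
            exch(a, i, min);
        }
    }

    public static void main(String[] args) {
        String[] x = {"B1", "B2", "A1"};
        String[] y = x.clone();
        insertionSort(x, BY_FIRST_CHAR);
        selectionSort(y, BY_FIRST_CHAR);
        System.out.println("insertion: " + Arrays.toString(x));
        System.out.println("selection: " + Arrays.toString(y));
    }
}
```

Insertion sort yields A1, B1, B2, keeping B1 ahead of B2. Selection sort yields A1, B2, B1, because its first exchange swaps B1 at index 0 with A1 at index 2 and so moves B1 past the equal item B2.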