Princeton Alogorithm COS226 Week9 Radix Sort

Alphabets

image.png

review: summary of the performance of sorting algorithms

image.png

we can do better if we don't depend on key compares.
assume that keys are integers between 0 and R-1


image.png

image.png

LSD Radix sort

consider character from right to left, stably sort using d th character as the key (using key-indexed counting)


image.png

image.png

image.png

How to sort one million 32 bit integers?

  1. if we use quick sort, we will cost 1.39 * log(1 million) * n = 28 N
  2. if we use LSD, we will cost 2 * 10 * n = 20N
    but the space cost quick sort wins; for stability, LSD sort wins.

MSD Radix sort

partition array into R pieces according to first character
recursively sort all strings that start with each character


image.png

treat strings as if they had an extra char at end (this char should smaller than any char)


image.png

improvement : cutoff to insertion sort


image.png

MSD vs quick sort

msd need extra space for aux[] and count[], inner loop has a lot of instructions, access memory randomly
linearithmic number of string compares
has to rescan many character in keys with long prefix matches

combine MSD and quick sort

do 3-way partitioning on the dth character
less overhead than R-way partitioning in MSD string sort
does not re-examine character equal to the partitioning char


image.png

image.png

image.png

image.png

image.png

suffix array

image.png
image.png
image.png
image.png

Manber-Myers algorithm implementation is added into extra credit

summary

we can develop linear-time sorts. key compares not necessary for string keys.

we can develop sublinear-time sorts, input size is amount of data in keys, not all of the data has to be examined
input size is Radix, not number of keys.

3-way string quicksort is asymptotically optimal.
・1.39 N lg N chars for random data.

QUIZ

Q1 2-sum

image.png

just sort with LSD, then use two pointer

Q2 American flag sort

image.png

counting sort

Q3 Cyclic rotations.

image.png

foreach string, we could build suffix string sorting array, and use the first string in array as the finger print, check the finger print have same, we could use hashset to achieve it.
if we use Ukkonen’s algorithm, we could achieve it in O(L), then the total time complexity is O(nL)
in my github, i use manber myer which time complexity is O(L * log L * n)

course quiz and project : https://github.com/yixuaz/algorithm4-princeton-cos226

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
【社区内容提示】社区部分内容疑似由AI辅助生成,浏览时请结合常识与多方信息审慎甄别。
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

友情链接更多精彩内容