動機
記錄用
TODO: 這能幹嘛
定義
就是一個array是map一個排名到以該數字開始的suffix
排名就是把suffix sort過的順序
像是[2,3,0,4,1]
就是
- 以2開始的suffix是第一名
- 以3開始的suffix是第二名
流程
concept(用到分塊與倍增):
- counting sort
- 分class
init:
- 做counting sort (對每個字做)
- 分class (sort過,所以只要前後字不一樣就是不同classes)
2, 4, 8 …:
- 先取前半string的位置
- 以前半的class做counting sort
- 分class (前半要在同一個class,後半也要在同一個class)
def count_sort(arr, kinds, f):
ret = [0]*len(arr)
cnts = [0]*kinds
for n in arr:
cnts[n] += 1
for i in range(1,kinds):
cnts[i] += cnts[i-1]
for i in reversed(range(len(arr))): # reversed!!
cnts[arr[i]] -= 1
ret[cnts[arr[i]]] = f[i]
return ret
def classify(arr, p, isDiffLevel):
l = [0]*len(arr)
level = 0
for i in range(1,len(arr)):
if isDiffLevel(i,i-1):
level += 1
l[p[i]] = level
return l, level+1
s = [0,0,1,0]
p = count_sort(s, len(s), list(range(len(s))))
l, levels = classify(s, p, lambda a,b: s[p[a]] != s[p[b]])
for h in filter(lambda h: (1 << h) < len(s), range(len(s))):
def isDiff(a,b):
rhs = (l[p[a]], l[(l[a]+(1<<h))% len(s)] )
lhs = (l[p[b]], l[(l[b]+(1<<h))% len(s)] )
return rhs != lhs
p_lhs = [p[i] - (1 << h) + (len(s) if p[i] < (1 << h) else 0) for i in range(len(s))]
p = count_sort([l[x] for x in p_lhs], levels, p_lhs)
l, levels = classify(s, p, isDiff)
print(p)