最简单的排序有三种:插入排序,选择排序和冒泡排序。它们的平均时间复杂度均为O(n^2),在这里对原理就不加赘述了。
贴出源代码:
插入排序:
1 def insertion_sort(sort_list): 2 iter_len = len(sort_list) 3 if iter_len < 2: 4 return sort_list 5 for i in range(1, iter_len): 6 key = sort_list[i] 7 j = i - 1 8 while j>=0 and sort_list[j]>key: 9 sort_list[j+1] = sort_list[j] 10 j =j - 1 11 sort_list[j+1] = key 12 return sort_list
冒泡排序:
1 def bubble_sort(sort_list): 2 iter_len = len(sort_list) 3 if iter_len < 2: 4 return sort_list 5 for i in range(iter_len-1): 6 for j in range(iter_len-i-1): 7 if sort_list[j] > sort_list[j+1]: 8 sort_list[j], sort_list[j+1] = sort_list[j+1], sort_list[j] 9 return sort_list
选择排序:
1 def selection_sort(sort_list): 2 iter_len = len(sort_list) 3 if iter_len < 2: 4 return sort_list 5 for i in range(iter_len-1): 6 smallest = sort_list[i] 7 location = i 8 for j in range(i, iter_len): 9 if sort_list[j] < smallest: 10 smallest = sort_list[j] 11 location = j 12 if i != location: 13 sort_list[i], sort_list[location] = sort_list[location], sort_list[i] 14 return sort_list
其中:
sort_list[i], sort_list[location] = sort_list[location], sort_list[i]
是不是觉得很奇怪?没错,这是交换两个数的做法,通常在其他语言中如果要交换a与b的值,常常需要一个中间变量temp,首先把a赋给temp,然后把b赋给a,最后再把temp赋给b。但是在python中你就可以这么写:a, b = b, a,其实这是因为赋值符号的左右两边都是元组(这里需要强调的是,在python中,元组其实是由逗号“,”来界定的,而不是括号)。
平均时间复杂度为O(nlogn)的算法有:归并排序,堆排序和快速排序。
归并排序。对于一个子序列,分成两份,比较两份的第一个元素,小者弹出,然后重复这个过程。对于待排序列,以中间值分成左右两个序列,然后对于各子序列再递归调用。源代码如下,由于有工具函数,所以写成了callable的类:
1 class merge_sort(object): 2 def _merge(self, alist, p, q, r): 3 left = alist[p:q+1] 4 right = alist[q+1:r+1] 5 for i in range(p, r+1): 6 if len(left)>0 and len(right)>0: 7 if left[0]<=right[0]: 8 alist[i] = left.pop(0) 9 else: 10 alist[i] = right.pop(0) 11 elif len(right)==0: 12 alist[i] = left.pop(0) 13 elif len(left)==0: 14 alist[i] = right.pop(0) 15 16 def _merge_sort(self, alist, p, r): 17 if p<r: 18 q = int((p+r)/2) 19 self._merge_sort(alist, p, q) 20 self._merge_sort(alist, q+1, r) 21 self._merge(alist, p, q, r) 22 23 def __call__(self, sort_list): 24 self._merge_sort(sort_list, 0, len(sort_list)-1) 25 return sort_list
1 class heap_sort(object): 2 def _left(self, i): 3 return 2*i+1 4 def _right(self, i): 5 return 2*i+2 6 def _parent(self, i): 7 if i%2==1: 8 return int(i/2) 9 else: 10 return i/2-1 11 12 def _max_heapify(self, alist, i, heap_size=None): 13 length = len(alist) 14 15 if heap_size is None: 16 heap_size = length 17 18 l = self._left(i) 19 r = self._right(i) 20 21 if lalist[i]: 22 largest = l 23 else: 24 largest = i 25 if ralist[largest]: 26 largest = r 27 28 if largest!=i: 29 alist[i], alist[largest] = alist[largest], alist[i] 30 self._max_heapify(alist, largest, heap_size) 31 32 def _build_max_heap(self, alist): 33 roop_end = int(len(alist)/2) 34 for i in range(0, roop_end)[::-1]: 35 self._max_heapify(alist, i) 36 37 def __call__(self, sort_list): 38 self._build_max_heap(sort_list) 39 heap_size = len(sort_list) 40 for i in range(1, len(sort_list))[::-1]: 41 sort_list[0], sort_list[i] = sort_list[i], sort_list[0] 42 heap_size -= 1 43 self._max_heapify(sort_list, 0, heap_size) 44 45 return sort_list
1 class quick_sort(object): 2 def _partition(self, alist, p, r): 3 i = p-1 4 x = alist[r] 5 for j in range(p, r): 6 if alist[j]<=x: 7 i += 1 8 alist[i], alist[j] = alist[j], alist[i] 9 alist[i+1], alist[r] = alist[r], alist[i+1] 10 return i+1 11 12 def _quicksort(self, alist, p, r): 13 if p<r: 14 q = self._partition(alist, p, r) 15 self._quicksort(alist, p, q-1) 16 self._quicksort(alist, q+1, r) 17 18 def __call__(self, sort_list): 19 self._quicksort(sort_list, 0, len(sort_list)-1) 20 return sort_list
1 import sys 2 sys.setrecursionlimit(99999)
1 def _randomized_partition(self, alist, p, r): 2 i = random.randint(p, r) 3 alist[i], alist[r] = alist[r], alist[i] 4 return self._partition(alist, p, r)
完整的randomize_quick_sort的代码如下(这里我直接继承之前的quick_sort类):
1 import random 2 class randomized_quick_sort(quick_sort): 3 def _randomized_partition(self, alist, p, r): 4 i = random.randint(p, r) 5 alist[i], alist[r] = alist[r], alist[i] 6 return self._partition(alist, p, r) 7 8 def _quicksort(self, alist, p, r): 9 if p<r: 10 q = self._randomized_partition(alist, p, r) 11 self._quicksort(alist, p, q-1) 12 self._quicksort(alist, q+1, r)
关于快速排序的讨论还没有结束。我们都知道,Python是一门很优雅的语言,而Python写出来的代码是相当简洁而可读性极强的。这里就介绍快排的另一种写法,只需要三行就能够搞定,但是又不失阅读性。(当然,要看懂是需要一定的Python基础的)代码如下:
1 def quick_sort_2(sort_list): 2 if len(sort_list)<=1: 3 return sort_list 4 return quick_sort_2([lt for lt in sort_list[1:] if lt<sort_list[0]]) + 5 sort_list[0:1] + 6 quick_sort_2([ge for ge in sort_list[1:] if ge>=sort_list[0]])
1 class counting_sort(object): 2 def _counting_sort(self, alist, k): 3 alist3 = [0 for i in range(k)] 4 alist2 = [0 for i in range(len(alist))] 5 for j in alist: 6 alist3[j] += 1 7 for i in range(1, k): 8 alist3[i] = alist3[i-1] + alist3[i] 9 for l in alist[::-1]: 10 alist2[alist3[l]-1] = l 11 alist3[l] -= 1 12 return alist2 13 14 def __call__(self, sort_list, k=None): 15 if k is None: 16 import heapq 17 k = heapq.nlargest(1, sort_list)[0] + 1 18 return self._counting_sort(sort_list, k)
1 def normal_find_same(alist): 2 length = len(alist) 3 for i in range(length): 4 for j in range(i+1, length): 5 if alist[i] == alist[j]: 6 return True 7 return False
这种方法的代价是非常大的(平均时间复杂度是O(n^2),当列表中没有重复元素的时候会达到最坏情况),由之前的经验,我们可以想到,利用内置sort方法极快的经验,我们可以这么做:首先将列表排序,然后遍历一遍,看是否有重复元素。包括完整的测试代码如下:
1 import time 2 import random 3 4 def record_time(func, alist): 5 start = time.time() 6 func(alist) 7 end = time.time() 8 9 return end - start 10 11 def quick_find_same(alist): 12 alist.sort() 13 length = len(alist) 14 for i in range(length-1): 15 if alist[i] == alist[i+1]: 16 return True 17 return False 18 19 if __name__ == "__main__": 20 methods = (normal_find_same, quick_find_same) 21 alist = range(5000) 22 random.shuffle(alist) 23 24 for m in methods: 25 print ‘The method %s spends %s‘ % (m.__name__, record_time(m, alist))
运行以后我的数据是,对于5000长度,没有重复元素的列表,普通方法需要花费大约1.205秒,而快速查找法花费只有0.003秒。这就是排序在实际应用中的一个例子。
文章来源:http://www.cnblogs.com/chineking/archive/2011/05/24/implement-sort-algorithm-with-python.html