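In day-to-day data processing work, I often run into requests or data operations that have to be executed many times, and this gets slow once the data volume grows. So I took a focused look at concurrency approaches. This post covers only multithreading for now; multiprocessing and coroutines will be added in later updates.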
Single-threaded vs. multi-threaded approach
import blog_spider  # module that provides `urls` and `craw(url)`; not shown in this post
import threading
import time


def single_thread():
    # Crawl every URL one after another in the main thread.
    for url in blog_spider.urls:
        blog_spider.craw(url)


def multi_thread():
    print("multi_thread begin")
    # Create one thread per URL.
    threads = []
    for url in blog_spider.urls:
        threads.append(
            threading.Thread(target=blog_spider.craw, args=(url,))
        )
    # Start all threads, then wait for every one of them to finish.
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    print("multi_thread end")


if __name__ == '__main__':
    # Time the single-threaded run.
    start = time.time()
    single_thread()
    end = time.time()
    print("single_thread cost:", end - start, "s")

    # Time the multi-threaded run.
    start = time.time()
    multi_thread()
    end = time.time()
    print("multi_thread cost:", end - start, "s")
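Since `blog_spider` itself is not shown in this post, here is a minimal sketch for trying the same comparison without it. Everything in it is an assumption for illustration: `URLS` is a made-up list that is never actually fetched, and `fake_craw` just sleeps to mimic network latency instead of downloading anything. It is not the original crawler, only a stand-in with the same shape.

```python
import threading
import time

# Hypothetical URL list; these addresses are never actually requested.
URLS = [f"https://example.com/page/{i}" for i in range(1, 11)]


def fake_craw(url, results, delay=0.05):
    """Stand-in for a real crawl: sleep to mimic waiting on the network."""
    time.sleep(delay)
    # Each thread writes to its own key, so no lock is needed here.
    results[url] = len(url)


def run_single(urls):
    """Process the URLs one after another in the calling thread."""
    results = {}
    for url in urls:
        fake_craw(url, results)
    return results


def run_multi(urls):
    """Process the URLs with one thread per URL."""
    results = {}
    threads = [
        threading.Thread(target=fake_craw, args=(url, results))
        for url in urls
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def timed(func, urls):
    """Run func(urls) and return (results, elapsed_seconds)."""
    start = time.perf_counter()
    results = func(urls)
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, single_cost = timed(run_single, URLS)
    _, multi_cost = timed(run_multi, URLS)
    print("single_thread cost:", single_cost, "s")
    print("multi_thread cost:", multi_cost, "s")
```

Because the simulated work is pure waiting, the threads overlap their sleeps and the multi-threaded run should finish in roughly the time of a single call. CPU-bound work would not speed up the same way under CPython's global interpreter lock.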
Credit where it is due: the code is taken from a video found online.