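Problem: ValueError: too many file descriptors in select()
Cause: asyncio uses select internally, and select can only watch a limited number of open file descriptors at once. This is an operating-system limit: by default the maximum is 1024 on Linux and about 509 on Windows. Once a program goes over that number, it starts raising this error.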
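To see whether a given machine is likely to hit this limit, it can help to check which selector the standard library picks there. The sketch below is only a diagnostic aid, not part of the fix itself; the helper names `default_selector_name` and `falls_back_to_select` are made up for illustration. asyncio's selector-based event loop uses `selectors.DefaultSelector` unless another selector is passed in, so the error is mainly expected where a selector-based loop is running and that default resolves to `SelectSelector`.

```python
import selectors


def default_selector_name() -> str:
    """Name of the selector class that `selectors` treats as the platform default."""
    return selectors.DefaultSelector.__name__


def falls_back_to_select() -> bool:
    """True when the platform default is the select()-based selector."""
    return selectors.DefaultSelector is selectors.SelectSelector


if __name__ == "__main__":
    print("default selector:", default_selector_name())
    print("select()-based:", falls_back_to_select())
```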
Solution 1: limit concurrency
1. First, define a run coroutine function:
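async def run():
    semaphore = asyncio.Semaphore(500)  # limit concurrency to 500
    to_get = [get_data(url, semaphore) for url in urls]
    # gather accepts coroutines directly; asyncio.wait rejects bare coroutines on Python 3.11+
    await asyncio.gather(*to_get)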
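Here get_data(url, semaphore) is also a coroutine function. It is the one whose concurrency is being limited, and url is the address to request.
2. Inside the coroutine function whose concurrency is being limited, add the following statement: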
async with semaphore:
As shown below, change
async def get_data(url, semaphore):
    # headers is assumed to be defined elsewhere in the script
    async with aiohttp.ClientSession() as session:
        async with session.get(url=url, headers=headers) as response:
            return await response.read()
to
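async def get_data(url, semaphore):
    # headers is assumed to be defined elsewhere in the script
    async with semaphore:  # wait here until one of the 500 slots is free
        async with aiohttp.ClientSession() as session:
            async with session.get(url=url, headers=headers) as response:
                return await response.read()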
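The effect of the semaphore can be checked without touching the network. The following is a minimal sketch that swaps the HTTP request for `asyncio.sleep` and records how many workers are inside the `async with semaphore:` block at the same time; `worker`, `measure_peak`, and the `state` dictionary are hypothetical names introduced only for this illustration.

```python
import asyncio


async def worker(semaphore: asyncio.Semaphore, state: dict) -> None:
    """Simulated request: hold the semaphore while 'working'."""
    async with semaphore:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stands in for the network round trip
        state["active"] -= 1


async def measure_peak(total: int, limit: int) -> int:
    """Run `total` workers under a semaphore of size `limit`; return the peak concurrency."""
    semaphore = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(worker(semaphore, state) for _ in range(total)))
    return state["peak"]


if __name__ == "__main__":
    print("peak concurrency:", asyncio.run(measure_peak(total=50, limit=5)))
```

However many workers are started, the recorded peak never goes above the semaphore size, which is what keeps the number of open sockets under the select limit in the real script.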
3. In the script that runs the coroutines, add the following statement:
asyncio.run(run())  # creates the event loop, runs run() to completion, then closes the loop
Reference example
#coding:utf-8
import asyncio
import aiohttp

url = 'https://www.baidu.com/'

async def hello(url, semaphore):
    # at most 500 coroutines hold the semaphore at once, so at most 500 requests are in flight
    async with semaphore:
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                return await response.read()

async def run():
    semaphore = asyncio.Semaphore(500)  # limit concurrency to 500
    to_get = [hello(url, semaphore) for _ in range(1000)]  # 1000 tasks in total
    # gather accepts coroutines directly; asyncio.wait rejects bare coroutines on Python 3.11+
    await asyncio.gather(*to_get)

if __name__ == '__main__':
    asyncio.run(run())  # creates the event loop, runs run(), then closes the loop
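The reference example discards whatever hello returns. If the response bodies are needed, one option is to keep the list that `asyncio.gather` returns, which is in the same order as the coroutines passed in. The sketch below replaces the real HTTP request with a simulated one so that it runs without network access; `fake_fetch` and `fetch_all` are hypothetical names, not part of aiohttp or of the script above, and the simulated failure is there only to show how errors come back.

```python
import asyncio


async def fake_fetch(index: int, semaphore: asyncio.Semaphore) -> str:
    """Stand-in for hello(): pretend to download a page and return its body."""
    async with semaphore:
        await asyncio.sleep(0.001)  # simulated network delay
        if index % 10 == 7:
            raise RuntimeError(f"simulated failure for task {index}")
        return f"body-{index}"


async def fetch_all(total: int, limit: int) -> list:
    """Run `total` simulated requests, at most `limit` at a time, and collect the results."""
    semaphore = asyncio.Semaphore(limit)
    tasks = [fake_fetch(i, semaphore) for i in range(total)]
    # return_exceptions=True puts failures into the result list as exception
    # objects instead of raising the first one out of gather.
    return await asyncio.gather(*tasks, return_exceptions=True)


if __name__ == "__main__":
    results = asyncio.run(fetch_all(total=20, limit=5))
    failed = [r for r in results if isinstance(r, Exception)]
    print(f"{len(results) - len(failed)} succeeded, {len(failed)} failed")
```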