Custom Scrapy commands

Custom commands

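  • Create a directory with any name, e.g. commands, at the same level as spiders (add an empty __init__.py to it so that Python treats it as a package)
  • Create a crawlall.py file inside it (the file name becomes the name of the custom command)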
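  • Put the following code in crawlall.py:

        from scrapy.commands import ScrapyCommand


        class Command(ScrapyCommand):

            # The command is only available inside a Scrapy project.
            requires_project = True

            def syntax(self):
                return '[options]'

            def short_desc(self):
                return 'Runs all of the spiders'

            def run(self, args, opts):
                # Recent Scrapy versions expose the spider names through spider_loader;
                # older releases used self.crawler_process.spiders.list() instead.
                spider_list = self.crawler_process.spider_loader.list()
                for name in spider_list:
                    # Schedule each spider; the parsed options are passed on as spider arguments.
                    self.crawler_process.crawl(name, **opts.__dict__)
                # Start crawling and block until every scheduled spider has finished.
                self.crawler_process.start()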
  • Add the setting COMMANDS_MODULE = 'project_name.directory_name' to settings.py
  • Run the command from the project directory: scrapy crawlall
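
The relationship between the directory, the file name, and the setting can be illustrated with a small standard-library sketch. It is only a simplified illustration, not Scrapy's actual discovery code, and the names myproject and commands are assumptions chosen for the example. It builds a throwaway directory tree, derives the COMMANDS_MODULE value, and lists the module names that would become command names.

```python
import pkgutil
import tempfile
from pathlib import Path
from typing import List, Tuple


def commands_module_setting(project_name: str, commands_dir: str) -> str:
    """Build the dotted path that goes into COMMANDS_MODULE."""
    return f"{project_name}.{commands_dir}"


def command_names(commands_path: Path) -> List[str]:
    """List the plain modules in a directory; each module name is a command name."""
    return sorted(
        info.name
        for info in pkgutil.iter_modules([str(commands_path)])
        if not info.ispkg
    )


def demo() -> Tuple[str, List[str]]:
    # "myproject" and "commands" are hypothetical names used for illustration.
    with tempfile.TemporaryDirectory() as tmp:
        commands_path = Path(tmp) / "myproject" / "commands"
        commands_path.mkdir(parents=True)
        (commands_path / "__init__.py").write_text("")
        (commands_path / "crawlall.py").write_text("# Command class goes here\n")
        return (
            commands_module_setting("myproject", "commands"),
            command_names(commands_path),
        )
```

Calling demo() returns the setting string together with the discovered names, so renaming crawlall.py to something else would change the command name accordingly.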

 Running a single spider:

from scrapy.cmdline import execute

if __name__ == '__main__':
    # Equivalent to running "scrapy crawl chouti --nolog" on the command line.
    execute(["scrapy", "crawl", "chouti", "--nolog"])
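
If the launcher script needs to start different spiders or pass spider arguments, the argument list can be built by a small helper. The following is a minimal sketch under assumptions: build_crawl_argv and run_spider are hypothetical helper names, and tag is a made-up spider argument. It relies only on the crawl command's -a NAME=VALUE and --nolog options.

```python
from typing import Dict, List, Optional


def build_crawl_argv(
    spider_name: str,
    spider_args: Optional[Dict[str, str]] = None,
    nolog: bool = False,
) -> List[str]:
    """Build the argv list for "scrapy crawl <spider_name>"."""
    argv = ["scrapy", "crawl", spider_name]
    # Each spider argument is passed as "-a NAME=VALUE".
    for key, value in (spider_args or {}).items():
        argv += ["-a", f"{key}={value}"]
    if nolog:
        argv.append("--nolog")
    return argv


def run_spider(spider_name: str, **kwargs) -> None:
    """Run one spider through Scrapy's command-line entry point."""
    # Imported lazily so that build_crawl_argv works without Scrapy installed.
    from scrapy.cmdline import execute

    execute(build_crawl_argv(spider_name, **kwargs))
```

For example, build_crawl_argv("chouti", nolog=True) produces the same list that is passed to execute in the script above, and run_spider("chouti", nolog=True) would launch it from within a project directory.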

 
