Tools and libraries to install
Development tools
python https://www.python.org/
pycharm https://www.jetbrains.com/pycharm/
Both can be downloaded and installed directly from their official sites.
Built-in standard libraries
urllib re
>>> from urllib.request import urlopen
>>> response = urlopen("http://www.baidu.com")
>>> response
<http.client.HTTPResponse object at 0x1032edb38>
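re is listed above without an example, so here is a minimal sketch of how urllib and re might be combined: build a request with a browser-like User-Agent, then pull the page title out of the HTML with a regular expression. The helper names build_request, extract_title and fetch_title, the User-Agent string, and the sample HTML are illustrative assumptions, not part of the original notes.

```python
import re
from urllib.request import Request, urlopen

# Hypothetical User-Agent value; some sites reject urllib's default one.
USER_AGENT = "Mozilla/5.0 (compatible; demo-crawler)"

# Non-greedy match of the <title> contents, ignoring case and newlines.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def build_request(url):
    """Return a Request carrying a custom User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


def extract_title(html):
    """Return the stripped text of the first <title> tag, or None."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def fetch_title(url, timeout=10):
    """Download a page and return its title (needs network access)."""
    with urlopen(build_request(url), timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return extract_title(html)


# Offline demonstration on a small hand-written page.
sample = "<html><head><title> Demo page </title></head><body></body></html>"
print(extract_title(sample))
```

Calling fetch_title("http://www.baidu.com") would exercise the network path. A regular expression is fine for a single well-known tag like this; for anything more structured, one of the parsing libraries listed further down is usually the safer choice.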
HTTP request library
requests http://cn.python-requests.org/zh_CN/latest/
>>> import requests
>>> response = requests.get("http://www.baidu.com")
>>> response
<Response [200]>
Browser tools
selenium https://www.seleniumhq.org/
chromedriver
Official Google site: https://sites.google.com/a/chromium.org/chromedriver/downloads
Taobao mirror: https://npm.taobao.org/mirrors/chromedriver/
>>> from selenium import webdriver
>>> driver = webdriver.Chrome()
>>> driver.get("http://www.baidu.com")
>>> driver.get("https://www.python.org")
>>> html = driver.page_source
phantomjs http://phantomjs.org/
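>>> from selenium import webdriver
>>> driver = webdriver.PhantomJS()
>>> driver.get("http://www.baidu.com")
>>> html = driver.page_source
Note: webdriver.PhantomJS was removed in Selenium 4, so this example requires Selenium 3 or earlier.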
HTML parsing libraries
lxml http://lxml.de/
beautifulsoup4 https://www.crummy.com/software/BeautifulSoup/bs4/doc.zh/
>>> from bs4 import BeautifulSoup as BS
>>> html = "<html><h1></h1></html>"
>>> soup = BS(html, "lxml")
>>> soup.h1
<h1></h1>
pyquery https://pythonhosted.org/pyquery/
>>> from pyquery import PyQuery as pq
>>> html = "<html><h1>title</h1></html>"
>>> doc = pq(html)
>>> doc("html").text()
'title'
>>> doc("h1").text()
'title'
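lxml is listed above only as a parser backend, but it can also be used on its own. The following is a minimal sketch of querying a document with XPath through lxml.etree; the sample HTML and the variable names are made up for illustration.

```python
from lxml import etree

# Hypothetical sample document with one heading and two links.
html = "<html><body><h1>title</h1><a href='/a'>A</a><a href='/b'>B</a></body></html>"

# etree.HTML parses HTML (including malformed markup) and returns the root element.
root = etree.HTML(html)

# XPath text() and @attribute selections return lists of strings.
headings = root.xpath("//h1/text()")
links = root.xpath("//a/@href")

print(headings)
print(links)
```

XPath tends to suit fixed, deeply nested page layouts, while the CSS-style selectors of pyquery and the tag navigation of BeautifulSoup are often quicker to write for simple lookups.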
Databases
mysql https://dev.mysql.com/downloads/mysql/
redis https://redis.io/
mongodb https://www.mongodb.com/
On macOS, these can be installed with brew https://docs.brew.sh/
Database packages:
pymysql https://pypi.org/project/PyMySQL/
>>> import pymysql
>>> conn = pymysql.connect(host="localhost", user="root", password="123456", port=3306, db="demo")
>>> cursor = conn.cursor()
>>> sql = "select * from mytable"
>>> cursor.execute(sql)
3
>>> cursor.fetchone()
(1, datetime.date(2018, 4, 14))
>>> cursor.close()
>>> conn.close()
pymongo http://api.mongodb.com/python/current/index.html
>>> import pymongo
>>> client = pymongo.MongoClient("localhost")
>>> db = client["newtestdb"]
>>> db["table"].insert_one({"name": "Tom"}).inserted_id
ObjectId('5adcb250d7696c839a251658')
>>> db["table"].find_one({"name": "Tom"})
{'_id': ObjectId('5adcb250d7696c839a251658'), 'name': 'Tom'}
redis
>>> import redis
>>> r = redis.Redis("localhost", 6379)
>>> r.set("name", "Tom")
True
>>> r.get("name")
b'Tom'
Web framework packages:
flask http://docs.jinkan.org/docs/flask/
django https://www.djangoproject.com/
jupyter http://jupyter.org/
Run: jupyter notebook
Shortcut to insert a new cell below (in command mode): b
Install all of the libraries above with one command
pip install requests selenium lxml beautifulsoup4 pyquery pymysql pymongo redis flask django jupyter
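After installing, it can be useful to confirm that each library is actually importable, since some pip names differ from the module name used in import statements (beautifulsoup4 is imported as bs4). The script below is a minimal sketch of such a check using only the standard library; the PACKAGES mapping and the check_installed helper are illustrative names, and jupyter_core is used as a stand-in module for the jupyter metapackage.

```python
from importlib.util import find_spec

# pip package name -> top-level module name used in `import`.
PACKAGES = {
    "requests": "requests",
    "selenium": "selenium",
    "lxml": "lxml",
    "beautifulsoup4": "bs4",
    "pyquery": "pyquery",
    "pymysql": "pymysql",
    "pymongo": "pymongo",
    "redis": "redis",
    "flask": "flask",
    "django": "django",
    # jupyter is a metapackage; jupyter_core is one module it pulls in.
    "jupyter": "jupyter_core",
}


def check_installed(packages):
    """Map each pip name to True if its module can be located, else False."""
    return {
        pip_name: find_spec(module) is not None
        for pip_name, module in packages.items()
    }


if __name__ == "__main__":
    for pip_name, ok in check_installed(PACKAGES).items():
        print(f"{pip_name:15} {'ok' if ok else 'MISSING'}")
```

find_spec only locates the module without importing it, so the check stays fast and does not trigger side effects from the libraries themselves.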