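Source of these notes: Cui Qingcai (崔庆才), video course 《Python3爬虫入门到精通》 (Python 3 Web Scraping, from Beginner to Mastery)
Python installation
Anaconda
Mirror in China: Tsinghua Open Source Mirror (清华大学开源软件镜像站), Index of /anaconda/archive/
Run conda list to see every installed package; almost nothing else needs to be installed separately
Additional packages can be installed with either pip or conda
Official installer
Download the executable installer (64-bit) and add Python to the PATH environment variable during installation (the install path can be customized)
IDE
PyCharm
Ubuntu installation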
sudo apt-get install python3-dev build-essential libssl-dev libffi-dev libxml2 libxml2-dev libxslt1-dev zlib1g-dev
sudo apt-get install python3
sudo apt-get install python3-pip
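Run pip3 to confirm that pip is available (without arguments it prints its usage help)
macOS
Homebrew
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
(Homebrew has since retired this Ruby installer; its site now documents /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)")
brew install python3
Run python3 to enter the interactive interpreter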
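Whichever platform was used, the result can also be checked from inside Python. A minimal sketch using only the standard library; the function name describe_python is hypothetical and not part of the course material:

```python
import shutil
import sys


def describe_python():
    """Collect basic facts about the running interpreter and pip."""
    return {
        "version": "%d.%d.%d" % sys.version_info[:3],
        "executable": sys.executable,
        # shutil.which returns None when the command is not on PATH
        "pip3": shutil.which("pip3"),
        "is_python3": sys.version_info[0] == 3,
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print(key, "=", value)
```

If pip3 prints as None, pip is either not installed or not on PATH.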
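MongoDB installation
Windows installation
On Windows, download the Windows Server 2008 build
Under the Server/3.4 directory create a data folder, and inside it a db folder
In the bin directory, Shift + right-click and choose "Open command window here", then run mongod --dbpath C:\MongoDB\Server\3.4\data\db to start MongoDB
Visit localhost:27017 in a browser to confirm that the server responds
In the bin directory, open a command window the same way and run mongo to enter the mongo client's interactive shell; typing db returns test, the current database
db.test.insert({'a':'b'}) inserts one document
Under the data folder create a logs folder, and inside it a file named mongo.log
Run cmd.exe as administrator, cd C:\MongoDB\Server\3.4\bin, then run mongod --bind_ip 0.0.0.0 --logpath C:\MongoDB\Server\3.4\data\logs\mongo.log --logappend --dbpath C:\MongoDB\Server\3.4\data\db --port 27017 --serviceName "MongoDB" --serviceDisplayName "MongoDB" --install to register MongoDB as a Windows service
Open the Windows Services panel, find the MongoDB service, then right-click it and choose Start
Robomongo is a GUI client for browsing MongoDB data
Download it from https://robomongo.org/download
Ubuntu installation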
sudo apt-get install mongodb
mongod (starts the server; the db folder is created automatically)
mongo (enters the interactive shell)
show dbs
use local
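db.test.insert({'a':'b'}) inserts a document
macOS
Confirm that Homebrew is installed
brew install mongodb
On my own machine (Mac OS X 10.11) the brew download failed, so I downloaded the Mac build from the MongoDB website instead
Create a folder named java in the home directory, then extract the download and copy it there
open -e .bash_profile to edit the file and add the MongoDB entries to it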
guoliangs-MacBook-Pro-15-inch:~ guoliang$ cat .bash_profile
# mongodb
MONGODB_HOME=/Users/guoliang/java/mongodb-osx-x86_64-3.4.19
# maven
export M2_HOME=/usr/local/apache-maven-3.2.2
export PATH=$PATH:$M2_HOME/bin:$MONGODB_HOME/bin
# java
# export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_162.jdk/Contents/Home
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# tomcat
export PATH=$PATH:/Library/tomcat/bin
# MySQL
export PATH=${PATH}:/usr/local/mysql/bin
# Scala
SCALA_HOME="/Library/scala-2.12.5/"
export PATH=$PATH:$SCALA_HOME/bin
# added by Anaconda3 5.1.0 installer
export PATH="/anaconda3/bin:$PATH"
# added by Anaconda3 4.4.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
# added by Anaconda3 4.2.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
# added by Anaconda3 4.2.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
# added by Anaconda3 4.4.0 installer
export PATH="/Users/guoliang/anaconda/bin:$PATH"
guoliangs-MacBook-Pro-15-inch:~ guoliang$
source .bash_profile to apply the configuration
mongod --version to check the version
In a terminal, change to the /Users/guoliang/java/mongodb-osx-x86_64-3.4.19 directory
mkdir data
mkdir log
mongod --dbpath data --logpath log/mongod.log --logappend --fork
mongo enters the interactive shell, same as on Linux
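On any of the three platforms, a quick way to see whether mongod is actually listening is to try a TCP connection to its port. A minimal sketch, assuming the port 27017 used in these notes; port_is_open is a hypothetical helper name, and a successful connection only shows that something is listening, not that it is MongoDB:

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    print("mongod reachable:", port_is_open("localhost", 27017))
```

The same helper works for the other services in these notes by changing the port, for example 6379 for Redis or 3306 for MySQL.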
Redis installation
Windows
Download Redis-x64-3.2.100.msi from https://github.com/MSOpenTech/redis/releases
Download redis-desktop-manager-0.8.8.384.exe from https://github.com/uglide/RedisDesktopManager/releases
Open Redis Desktop Manager, click "Connect to Redis Server", and set Host to localhost
Ubuntu
sudo apt-get install redis-server -y
redis-cli (enters the Redis command-line client)
set 'a' 'b'
get 'a'
sudo vi /etc/redis/redis.conf
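Comment out bind 127.0.0.1 so that Redis accepts remote connections
Uncomment requirepass foobared to require a password for Redis connections; foobared is the sample value in the config file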
sudo service redis restart
redis-cli
get 'a' now reports a permission error because no password was supplied
redis-cli -a foobared
get 'a' now returns the stored value
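The two redis.conf edits above can also be expressed as a text transformation, which makes it clear exactly which lines change. A minimal sketch operating on the file contents as a string; the function name enable_remote_access is hypothetical, and it assumes the two lines appear in the file exactly as quoted in these notes:

```python
def enable_remote_access(conf_text):
    """Comment out 'bind 127.0.0.1' and uncomment 'requirepass foobared'."""
    out = []
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("bind 127.0.0.1"):
            # Disable the loopback-only binding
            out.append("# " + stripped)
        elif stripped.startswith("#") and stripped.lstrip("# ") == "requirepass foobared":
            # Activate the sample password line
            out.append("requirepass foobared")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "port 6379\nbind 127.0.0.1\n# requirepass foobared\n"
    print(enable_remote_access(sample))
```

All other lines pass through unchanged. On a real server the sample password foobared would normally be replaced with a private one.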
macOS
brew install redis
redis-cli
set 'name' 'Mike'
The configuration file is /usr/local/etc/redis.conf; edit it the same way as on Linux
brew services list
brew services restart redis
redis-cli
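MySQL installation
Windows
Search Baidu for mysql; the Baidu Software Center offers mysql-5.7.17.msi for download
Search Baidu for mysql-front and download it; Host is localhost, and the password is the 123456 set while installing MySQL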
Ubuntu
sudo apt-get install mysql-server mysql-client
Set the password to 123456
mysql -uroot -p
show databases;
use mysql;
select * from db;
vi /etc/mysql/mysql.conf.d/mysqld.cnf
Comment out bind-address so that MySQL accepts remote connections
sudo service mysql restart
macOS
brew install mysql
mysql -uroot -p
The password is root
show databases;
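Configuring multiple Python versions side by side
About environment variables
Windows
where python shows the path of every python found on PATH
By default, python and pip resolve to whichever installation appears first in the PATH environment variable
Rename python.exe in the python36 directory to python3.exe, python.exe under anaconda to python-conda.exe, and python.exe in the python27 directory to python2.exe
Rename the pip executables the same way; note that pip-conda.exe -V fails until pip-script.py is copied to pip-conda-script.py
Ubuntu, macOS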
echo $PATH
whereis python2 and whereis python3; on macOS use which instead
ln -s /usr/bin/python3.5 /usr/bin/python3
ln -s /usr/bin/python2.7 /usr/bin/python2
PyCharm settings
Select the interpreter you want manually
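The PATH precedence rule in this section (the first matching directory wins) can be demonstrated with two temporary directories that both contain a command of the same name. A minimal sketch, assuming a POSIX system; the command name python-demo and both helper functions are hypothetical:

```python
import os
import shutil
import stat
import tempfile


def make_fake_command(directory, name):
    """Create a small executable file called `name` in `directory`."""
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write("#!/bin/sh\n")
    # Add the owner-executable bit so the file counts as a command
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def first_match(name, directories):
    """Resolve `name` against an ordered list of directories, as PATH lookup does."""
    return shutil.which(name, path=os.pathsep.join(directories))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        make_fake_command(a, "python-demo")
        make_fake_command(b, "python-demo")
        print(first_match("python-demo", [a, b]))  # resolves inside `a`
        print(first_match("python-demo", [b, a]))  # resolves inside `b`
```

Swapping the directory order changes which copy is found, which is why renaming the executables or adding symlinks gives each version an unambiguous name.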
Installing commonly used scraping libraries
Windows
urllib, re
Built in; no installation needed
requests
pip install requests
selenium
pip install selenium
chromedriver
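Download chromedriver_win32.zip from https://chromedriver.storage.googleapis.com/index.html?path=2.22/
Put the extracted chromedriver.exe in a directory that is on PATH, for example under the Python installation at c:/python36/Scripts
Chrome on the Windows machine must be a version this ChromeDriver supports (53.0 in these notes)
Start python
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('http://www.baidu.com')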
driver.page_source
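driver.get expects a complete URL including the scheme; a bare host name such as www.baidu.com is typically rejected by the browser driver. A minimal sketch of a helper that adds a default scheme; ensure_scheme is a hypothetical name and http as the default is an assumption:

```python
def ensure_scheme(url, default="http"):
    """Prefix `url` with `default://` when it carries no scheme."""
    if "://" in url:
        return url
    return default + "://" + url


if __name__ == "__main__":
    print(ensure_scheme("www.baidu.com"))
    print(ensure_scheme("https://www.baidu.com"))
```

The result can then be passed to the driver, as in driver.get(ensure_scheme('www.baidu.com')).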
phantomjs
Download phantomjs-2.1.1-windows.zip from phantomjs.org/download.html
Extract it and add its bin directory to the PATH environment variable
Start python
from selenium import webdriver
driver = webdriver.PhantomJS()
driver.get('http://www.baidu.com')
driver.page_source
lxml
pip install lxml, or download the wheel from the lxml page on PyPI and install it offline with pip install x:/xx/lxml-3.7.3-cp36-cp36m-win_amd64.whl. Offline installation requires pip install wheel first.
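When downloading a wheel by hand, the file name says which interpreter it was built for: in lxml-3.7.3-cp36-cp36m-win_amd64.whl, cp36 means CPython 3.6 and win_amd64 means 64-bit Windows. A minimal sketch that splits a wheel name following the PEP 427 naming scheme; the function names are hypothetical, and the interpreter check only compares the Python tag, not the ABI or platform:

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    # 5 parts normally, 6 when an optional build tag follows the version
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel name: %r" % filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def matches_running_cpython(python_tag):
    """True if the tag names the CPython version running this script."""
    return python_tag == "cp%d%d" % sys.version_info[:2]


if __name__ == "__main__":
    info = parse_wheel_name("lxml-3.7.3-cp36-cp36m-win_amd64.whl")
    print(info)
    print("matches this interpreter:", matches_running_cpython(info["python_tag"]))
```

A wheel whose tags do not match the local interpreter and platform is refused by pip, so this is worth checking before downloading.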
beautifulsoup
pip install beautifulsoup4
Start python
from bs4 import BeautifulSoup
soup = BeautifulSoup('<html></html>', 'lxml')
pyquery
pip install pyquery
Start python
from pyquery import PyQuery as pq
doc = pq('<html>Hello</html>')
result = doc('html').text()
pymysql
pip install pymysql
Start python
import pymysql
conn = pymysql.connect(host='localhost', user='root', password='123456', port=3306, db='mysql')
cursor = conn.cursor()
cursor.execute('select * from db')
cursor.fetchone()
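The connect, cursor, execute, fetchone sequence above is the standard Python DB-API pattern, so it can be rehearsed without a MySQL server. A minimal sketch that swaps in the standard-library sqlite3 module as a stand-in for pymysql; the table layout and values are made up for illustration, and note that sqlite3 uses ? placeholders where pymysql uses %s:

```python
import sqlite3


def fetch_first_row():
    """Run the same cursor workflow against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        cursor = conn.cursor()
        # Hypothetical table standing in for the mysql.db table in the notes
        cursor.execute("CREATE TABLE db (host TEXT, name TEXT)")
        cursor.execute("INSERT INTO db VALUES (?, ?)", ("localhost", "test"))
        conn.commit()
        cursor.execute("SELECT * FROM db")
        return cursor.fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(fetch_first_row())
```

fetchone returns a single row as a tuple, or None when the result set is empty.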
pymongo
pip install pymongo
Start python
import pymongo
client = pymongo.MongoClient('localhost')
db = client['newtestdb']
db['table'].insert_one({'name': 'Bob'})
db['table'].find_one({'name':'Bob'})
redis
pip install redis
Start python
import redis
r = redis.Redis('localhost', 6379)
r.set('name', 'Bob')
r.get('name')
flask
pip install flask
django
pip install django
jupyter
pip install jupyter
Linux, macOS
pip install requests selenium beautifulsoup4 pyquery pymysql pymongo redis flask django jupyter
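After installing, one script can confirm that every library is importable instead of testing each one by hand. A minimal sketch using the standard-library importlib; the module list is my reading of the libraries above (beautifulsoup4 is imported as bs4), and missing_modules is a hypothetical helper name:

```python
import importlib.util

# Import names of the libraries installed in this section
SCRAPER_MODULES = [
    "requests", "selenium", "lxml", "bs4", "pyquery",
    "pymysql", "pymongo", "redis", "flask", "django",
]


def missing_modules(names):
    """Return the names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(SCRAPER_MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All scraping libraries are importable")
```

This only checks that each package can be located; it does not verify external pieces such as chromedriver or a running database server.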