Batch-download web pages by reading a url_list file
url_list
http://www.tianyancha.com/company/2412078287
http://www.4399.com/special/1.htm
http://www.we7.cc/
http://kongzhong.tmall.com/
http://dianying.2345.com/
http://www.takefoto.cn/viewnews-1521788.html
http://www.x4jdm.com/bf/429-1-1.html
http://www.douyu.com/546715
http://www.zjedu.gov.cn/default.html
http://dl.xunlei.com/
download.sh
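#!/bin/bash
# $1 is the URL list file passed on the command line
for line in $(cat "$1")
do
    # getid and gethtmlfile are separate tools; the id is offset by 10000
    id=$(echo "$line" | getid | awk '{print 10000 + $1}')
    echo "$line" | gethtmlfile "$id" > "./result/${id}.html"
done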
Run:
[spider@zhangsuosheng]$ chmod +x ./download.sh
[spider@zhangsuosheng]$ ./download.sh url_list
1. Shell script file format
http://www.runoob.com/linux/linux-shell.html
2. Reading a file line by line in bash, and reading command-line arguments
Test file: url_list_zss
[spider@zhangsuosheng]$ cat url_list_zss
cccccc
ddddddddd
aaaaaa
Correct version:
#!/bin/bash
# $1 is the file name given on the command line
for line in $(cat "$1")
do
    echo "$line"
done
[spider@zhangsuosheng]$ chmod +x ./download.sh
[spider@zhangsuosheng]$ ./download.sh url_list_zss
cccccc
ddddddddd
aaaaaa
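Incorrect version (single quotes make 'cat $1' a literal string, so nothing is executed and the loop runs once over the string itself):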
#!/bin/bash
for line in 'cat $1'
do
echo $line
done
[spider@zhangsuosheng]$ chmod +x ./download_testhtml.sh
[spider@zhangsuosheng]$ ./download_testhtml.sh url_list_zss
cat $1
https://www.jb51.net/article/122918.htm
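One caveat with looping over $(cat file): the shell splits the output on any whitespace, so a line that contains a space becomes several iterations. A while/read loop is a common alternative that keeps each line whole. The sketch below is only an illustration; read_lines is a hypothetical helper name, and the temporary file stands in for url_list_zss.

```shell
#!/bin/bash
# read_lines FILE: print each line of FILE unchanged.
# (read_lines is a hypothetical helper, not part of the original script.)
read_lines() {
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    # The "|| [ -n "$line" ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]
    do
        printf '%s\n' "$line"
    done < "$1"
}

# Demo with a temporary file standing in for url_list_zss
tmp=$(mktemp)
printf 'cccccc\nddddddddd\naaa aaa\n' > "$tmp"
read_lines "$tmp"
rm -f "$tmp"
```

The third demo line contains a space and is still printed as a single line, which the for/cat form would split in two.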
3. Reading command-line arguments
https://blog.csdn.net/qq_30145093/article/details/78191941
https://blog.csdn.net/ruidongliu/article/details/9717905
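As a small illustration of the positional parameters, the sketch below prints the argument count, the first argument, and every argument in turn. It uses a function (show_args, a hypothetical name) because $1, $#, and "$@" behave the same way inside a function as they do at the top level of a script.

```shell
#!/bin/bash
# show_args is a hypothetical helper used only for this demonstration.
show_args() {
    echo "count: $#"                # number of arguments
    echo "first: ${1:-<none>}"      # first argument, or <none> if missing
    for arg in "$@"                 # "$@" keeps each argument intact
    do
        echo "arg: $arg"
    done
}

show_args url_list "second arg"
```

Quoting "$@" matters here: without the quotes, "second arg" would be split into two separate words.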
4. Addition
This is done with awk.
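A minimal sketch of the addition step, shown two ways: with awk, and with the shell's own arithmetic expansion, which avoids starting an extra process for integers. The function names add_offset_awk and add_offset_sh are hypothetical.

```shell
#!/bin/bash
# Both helpers add 10000 to the integer given as the first argument.
add_offset_awk() {
    # In awk, $1 is the first input field; ordinary awk variables take no "$".
    echo "$1" | awk '{ print 10000 + $1 }'
}

add_offset_sh() {
    # $(( ... )) is shell arithmetic expansion (integers only).
    echo $(( 10000 + $1 ))
}

add_offset_awk 42
add_offset_sh 42
```

Both calls print 10042. awk remains the better fit when the input is not a plain integer, for example when it has decimals.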
5. Reading standard input from a pipe: read it directly, or use xargs
https://www.cnblogs.com/wangqiguo/p/6464234.html
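The two approaches can be sketched as follows. A command that reads standard input (here tr, wrapped in a hypothetical function to_upper) receives the piped data directly, while xargs converts the piped data into command-line arguments for a command that does not read standard input.

```shell
#!/bin/bash
# to_upper is a hypothetical filter: tr reads the piped data from stdin.
to_upper() {
    tr 'a-z' 'A-Z'
}

# Reading the pipe directly
echo "hello" | to_upper

# xargs appends the piped words as arguments to the given command
printf 'a\nb\nc\n' | xargs echo "args:"
```

The first command prints HELLO; the second prints "args: a b c", because xargs collects the three input lines into a single argument list.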
6. Variable assignment
https://blog.csdn.net/lemontree1945/article/details/79126819
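A short sketch of the common assignment forms; the variable names are made up for the example. The main rule to remember is that no spaces are allowed around the equals sign.

```shell
#!/bin/bash
# Plain assignment: no spaces around "="
name="url_list"

# Command substitution: capture a command's output
upper=$(echo "$name" | tr 'a-z' 'A-Z')

# Arithmetic expansion: store the result of an integer expression
id=$(( 10000 + 5 ))

echo "$name $upper $id"
```

Writing name = "url_list" with spaces would make the shell try to run a command called name instead of assigning a variable.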
7. String concatenation
https://www.jb51.net/article/44207.htm
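String concatenation in the shell is just placing values next to each other. The sketch below builds an output path of the same shape as the one in download.sh; the values of dir and id are example values chosen for the demonstration.

```shell
#!/bin/bash
dir="./result"
id=10042

# Braces mark where each variable name ends, so ".html" is not read as part of it
path="${dir}/${id}.html"
echo "$path"

# Appending to an existing variable
backup="${path}.bak"
echo "$backup"
```

Keeping the whole expression inside one pair of double quotes is usually simpler than mixing quoted and unquoted pieces, and it stays safe if a value contains spaces.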