Scrapy and Django study notes

scrapy startproject <projectname>
scrapy genspider -t crawl sohu2 sohu.com
scrapy crawl sis001
scrapy crawl sis001bot -o xxx.json -t json


Debugging statement
from scrapy.shell import inspect_response

inspect_response(response, self)  # call inside a spider callback; opens a shell on this response


Logging
import logging
self.log('No item received for %s' % response.url, level=logging.WARNING)


Testing XPath in the browser console
$x('XPath expression')


Points to address
1. Crawl only 10 pages
2. Filter URLs further
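
For the first point, one option is Scrapy's built-in CloseSpider extension, configured in the project's settings.py. This is only a minimal sketch, and it assumes the limit is meant as a count of downloaded responses: the extension counts every crawled response, not just listing pages, so the number may need adjusting.

```python
# settings.py fragment (assumption: the default CloseSpider extension is enabled)
# Stop the spider once 10 responses have been crawled.
CLOSESPIDER_PAGECOUNT = 10
```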


Django
django-admin startproject blog
python manage.py startapp sblog   (or: django-admin startapp sblog)
Edit the three files settings.py, urls.py, and views.py in a text editor




http://38.103.161.185/forum/thread-4437321-1-7.html
thread-\d{5,10}-1-\d{1,2}
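
The pattern above can be checked against the sample URL with Python's re module before it goes into a spider. This is a rough sketch: the `\.html$` suffix, the helper name `is_thread_url`, and the second URL are assumptions added for illustration, not part of the original notes.

```python
import re

# Thread pages: a 5-10 digit thread id, a fixed "-1-", then a 1-2 digit page number.
# The trailing "\.html$" is an added assumption to anchor the match at the end of the URL.
THREAD_RE = re.compile(r"thread-\d{5,10}-1-\d{1,2}\.html$")


def is_thread_url(url):
    """Return True if the URL ends with a thread page path (hypothetical helper)."""
    return THREAD_RE.search(url) is not None


urls = [
    "http://38.103.161.185/forum/thread-4437321-1-7.html",
    "http://38.103.161.185/forum/forum-25-1.html",  # hypothetical non-thread URL
]

# Keep only the URLs that look like thread pages.
matching = [u for u in urls if is_thread_url(u)]
```

In a CrawlSpider, the same pattern string could probably be passed to the `allow` argument of a LinkExtractor so that only thread links are followed.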

Reposted from blog.csdn.net/s98/article/details/45479509