Python Web Scraping: Fetching Meituan Food Data

Other 2018-12-23 02:56:10

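Over the past couple of days I have been trying out web scraping in Python. Based on a few blog posts I found online, I wrote the code below to scrape food listings from the Meituan site, and I am recording it here.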


# from bs4 import BeautifulSoup  # library for parsing HTML/XML files (not used here)
import urllib.request
import csv
import re
import json


# newline="" lets the csv module control line endings (avoids blank rows on Windows)
csv_file = open("rent.csv", "w", encoding="utf-8", newline="")
csv_writer = csv.writer(csv_file, delimiter=",")


class Spider:
    def loadPage(self, page):
        url = "http://gz.meituan.com/meishi/pn" + str(page) + "/"

        # User-Agent header
        user_agent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
        headers = {"User-Agent": user_agent}
        req = urllib.request.Request(url, headers=headers)
        response = urllib.request.urlopen(req)
        html = str(response.read(), "utf-8")

        # Each merchant record appears in the page as {"poiId":xxx}.
        # Without re.S, "." does not match a newline, so a match cannot span lines.
        # With re.S, "." matches any character, so the page is searched as one string.
        # The non-greedy ".*?" stops at the first "}", which assumes each record is a flat object.
        pattern = re.compile(r'\{"poiId":.*?\}', re.S)
        item_list = pattern.findall(html)  # extract the matching records

        # dictinfo = json.loads(item_list[0])  # convert a JSON string to a dict

        for data in item_list:
            dictinfo = json.loads(data)
            csv_writer.writerow([dictinfo["title"], dictinfo["address"], dictinfo["avgScore"], dictinfo["avgPrice"]])


if __name__ == "__main__":
    mySpider = Spider()

    for i in range(1, 33):
        print("fetch:Page" + str(i))
        mySpider.loadPage(i)

    csv_file.close()
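
The effect of `re.S` and of the non-greedy record pattern can be checked offline, without requesting any page. The following is only a minimal sketch under stated assumptions: `SAMPLE_HTML` and its field values are made up for illustration (the real Meituan markup may differ), and `extract_items` and `to_csv_text` are hypothetical helpers that are not part of the script above. The first sample record deliberately contains a newline, so it is found only when `re.S` is set.

```python
import csv
import io
import json
import re

# Made-up page fragment for illustration; the real Meituan markup may differ.
SAMPLE_HTML = (
    '<script>{"poiInfos":['
    '{"poiId":1,"title":"Shop A","address":"Road 1",\n'
    '"avgScore":4.5,"avgPrice":60},'
    '{"poiId":2,"title":"Shop B","address":"Road 2","avgScore":3.9,"avgPrice":35}'
    ']}</script>'
)

# Non-greedy: stops at the first "}", so each record must be a flat JSON object.
RECORD = r'\{"poiId":.*?\}'


def extract_items(html, flags=re.S):
    """Hypothetical helper: find flat {"poiId": ...} records and parse them as JSON."""
    return [json.loads(text) for text in re.findall(RECORD, html, flags)]


def to_csv_text(items):
    """Write the same four columns as the script, but into an in-memory buffer."""
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    for item in items:
        writer.writerow([item["title"], item["address"], item["avgScore"], item["avgPrice"]])
    return buf.getvalue()


# Without re.S the first record is missed because it spans two lines.
without_s = extract_items(SAMPLE_HTML, flags=0)
with_s = extract_items(SAMPLE_HTML)
print(len(without_s), len(with_s))
print(to_csv_text(with_s))
```

If a record ever contains a nested object, the pattern would cut it off at the inner `}` and `json.loads` would raise an error, so this approach only suits flat records like the ones assumed here.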

Reposted from blog.csdn.net/weixin_41018377/article/details/82562280