import requests
import bs4
import time
import random
import pandas as pd
import os
# Accumulates one single-row DataFrame per listing; concatenated at the end.
house_info = []
headers = {
    # Pretend to be a desktop Chrome browser so the site serves the normal page.
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36"
}
# for i in range(1,50):
for i in range(1, 5):
    url = "https://cs.fang.anjuke.com/loupan/all/p" + str(i) + "/#filtersort"
    print("开始爬取安居客平台长沙新房第%s页信息....." % (str(i)))
    # timeout so a stalled connection cannot hang the whole scrape
    response = requests.get(url=url, headers=headers, timeout=30)
    # Save the raw page for debugging / offline re-parsing. Mode 'w' (not 'a+')
    # so a re-run overwrites instead of appending a duplicate copy of the HTML.
    os.makedirs('anjukecs/', exist_ok=True)
    with open('anjukecs/page{}.html'.format(i), 'w', encoding='utf-8') as f:
        f.write(response.text)
    # Build the bs4 parse tree; each listing card lives in a div.infos element.
    bsoup = bs4.BeautifulSoup(response.text, 'lxml')
    house_list = bsoup.find_all('div', class_="infos")
    for house in house_list:
        # Any of these sub-elements may be missing on a card, in which case
        # house.find(...) returns None and the .text access raises
        # AttributeError — catch only that, not every exception.
        title = house.find('a').text.strip()
        try:
            house_type = house.find('a', class_='huxing').text.replace('\t', '').replace('\n', '').strip()
        except AttributeError:
            house_type = ''
        try:
            area = house.find('span', class_='building-area').text
        except AttributeError:
            area = ''
        try:
            address = house.find('a', class_='address').span.text.replace(" ", "").strip()
        except AttributeError:
            address = ''
        pd1 = pd.DataFrame({'title': title, 'house_type': house_type,
                            'area': area, 'address': address}, index=[0])
        house_info.append(pd1)
    # Random polite delay between page requests to avoid being rate-limited.
    second = random.randrange(3, 5)
    time.sleep(second)
# pd.concat raises ValueError on an empty list, so only write if we got data.
if house_info:
    house_info2 = pd.concat(house_info)
    house_info2.to_excel('cs_house_info.xlsx', index=False)
# 爬取安居客长沙新房的位置、户型、面积等信息。
# (Scrapes location, floor plan, area, etc. of Changsha new homes from Anjuke.)