Airbnb

1. Data Analysis Background

Inside Airbnb is an independent, non-commercial set of tools and data that allows you to explore how Airbnb is really being used in cities around the world.

By analyzing publicly available information about a city's Airbnb listings, Inside Airbnb provides filters and key metrics so you can see how Airbnb is being used to compete with the residential housing market.

With Inside Airbnb, you can ask fundamental questions about Airbnb in any neighbourhood, or across the city as a whole. Questions such as:

  • "How many listings are in my neighbourhood and where are they?"
  • "How many houses and apartments are being rented out frequently to tourists and not to long-term residents?"
  • "How much are hosts making from renting to tourists (compare that to long-term rentals)?"
  • "Which hosts are running a business with multiple listings and where are they?"

The tools are presented simply, and can also be used to answer more complicated questions, such as:

  • "Show me all the highly available listings in Bedford-Stuyvesant in Brooklyn, New York City, which are for the 'entire home or apartment' that have a review in the last 6 months AND booked frequently AND where the host has other listings."

These questions (and the answers) get to the core of the debate for many cities around the world, with Airbnb claiming that their hosts only occasionally rent the homes in which they live.

In addition, much of the city or state legislation, and many of the ordinances, that address residential housing, short-term or vacation rentals, and zoning refer to allowed use, including:

  • how many nights a dwelling is rented per year
  • minimum nights stay
  • whether the host is present
  • how many rooms are being rented in a building
  • the number of occupants allowed in a rental
  • whether the listing is licensed

The Inside Airbnb tool or data can be used to answer some of these questions.

The Occupancy Model

One of the biggest issues with Airbnb is whether hosts are renting out residential properties permanently as hotels, as opposed to sharing the primary residence in which they live "occasionally".

Airbnb could easily answer this question, say, by applying their Data Science and Analytics team to the task, but instead it's up to their Public Policy team to make us feel embarrassed to be questioning the wisdom of Silicon Valley's ability to shape our communities and solve our urgent need to house tourists. Where do I sign up as a host?

Enter the occupancy model, which can be used to estimate how often an Airbnb listing is being rented out, and also approximate a listing's income. (Large numbers produced by the data for the estimated occupancy rate and income show both a behaviour AND incentive to turn residential properties into full-time hotels with Airbnb).

Inside Airbnb uses an occupancy model which we've christened the "San Francisco Model" in honor of the public policy and urban planners working for that fair city who created occupancy models to quantify the impact of Airbnb on housing.

The two models created by, respectively, Alex Marqusee for the San Francisco Planning Department; and the Budget and Legislative Analyst's Office, are detailed here:

Inside Airbnb's "San Francisco Model" uses a modified methodology as follows:

  • Review Rate of 50% is used to convert reviews to estimated bookings.
    • Alex Marqusee uses a review rate of 72%; however, this is attributed to an unreliable source: Airbnb's CEO and co-founder Brian Chesky.
    • The Budget and Legislative Analyst's Office (page 49) also uses a value of 72% for its review rate and, in addition, introduces a higher-impact model using a review rate of 30.5%, based on comparing public review data to the New York Attorney General's report on Airbnb released in October 2014.
    • Inside Airbnb's analysis found that a review rate of 30.5% is more fact-based, but probably not conservative enough, given that the Budget and Legislative Analyst's Office did not take into account reviews that are missing because of deleted listings. A review rate of 72% is unverifiable, so 50% was chosen as it sits almost exactly between 72% and 30.5%.
  • An average length of stay is configured for each city, and this, multiplied by the estimated bookings for each listing over a period, gives the occupancy rate.
    • Where statements have been made about the average length of stay of Airbnb guests for a city, this was used.
    • For example, Airbnb reported 5.5 nights as the average length of stay for guests using Airbnb in San Francisco.
    • Where no public statements were made about average stays, a value of 3 nights per booking was used.
    • If a listing has a higher minimum nights value than the average length of stay, the minimum nights value was used instead.
  • The occupancy rate was capped at 70% - a relatively high, but reasonable number for a highly occupied "hotel".
    • This controls for situations where an Airbnb host might change their minimum nights during the high season, without the review data having a chance to catch up; or for a listing with a very high review rate.
    • It also ensures that the occupancy model remains conservative.
  • The number of nights booked or available per year used for the high availability and frequently rented metrics and filters was generally aligned with a city's short-term rental laws designed to protect residential housing.
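
The steps of the "San Francisco Model" listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Inside Airbnb's actual code: the `Listing` class, the function name and the field names are hypothetical. Dividing the estimated nights by the number of nights in the period, and estimating income as nights booked times nightly price, are one reading of the description rather than something the text spells out.

```python
from dataclasses import dataclass

REVIEW_RATE = 0.50          # share of bookings assumed to leave a review (from the text)
DEFAULT_STAY_NIGHTS = 3.0   # used when no public average stay exists for a city
OCCUPANCY_CAP = 0.70        # maximum occupancy rate allowed by the model


@dataclass
class Listing:
    # Hypothetical field names, chosen for this sketch only.
    reviews_in_period: int   # reviews received during the period
    minimum_nights: int      # the listing's minimum-nights setting
    price_per_night: float   # nightly price, used for the income estimate


def estimate_occupancy(listing, period_nights=365,
                       avg_stay_nights=DEFAULT_STAY_NIGHTS):
    """Return (occupancy_rate, nights_booked, income) for one listing."""
    # Convert reviews to estimated bookings using the review rate.
    bookings = listing.reviews_in_period / REVIEW_RATE
    # Use the minimum-nights value when it exceeds the city's average stay.
    stay = max(avg_stay_nights, listing.minimum_nights)
    # Assumption: occupancy = estimated nights / nights in the period, capped at 70%.
    occupancy = min(bookings * stay / period_nights, OCCUPANCY_CAP)
    nights = occupancy * period_nights
    # Assumption: income is approximated as nights booked times nightly price.
    return occupancy, nights, nights * listing.price_per_night
```

For example, a listing with 12 reviews in a year and San Francisco's reported average stay of 5.5 nights gives 24 estimated bookings and 132 estimated nights, an occupancy rate of roughly 36%. A listing with 60 reviews and the default 3-night stay would exceed the limit and is capped at 70%.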

2. Data Source

3. Analysis Goals

4. Understanding the Data

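The listings data holds the basic information for each short-term rental listing, including the listing itself, the host, location, type, price, number of reviews, and availability. The detailed version contains more listing-related fields.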

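The calendar data holds the schedule for each short-term rental listing, including the listing, the date, whether it is available, the price, and the number of available days.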
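
As a rough sketch of how the calendar data might be used, the snippet below counts the available days per listing with the Python standard library. The column names (`listing_id`, `date`, `available`, `price`), the `t`/`f` encoding of availability, and the sample rows are assumptions made for illustration; check them against the actual file before relying on this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; column names and the 't'/'f' flags are assumptions.
SAMPLE_CALENDAR = """listing_id,date,available,price
1001,2019-12-01,t,$300.00
1001,2019-12-02,f,$300.00
1001,2019-12-03,t,$320.00
2002,2019-12-01,f,$150.00
2002,2019-12-02,f,$150.00
"""


def available_days(csv_text):
    """Count the days marked as available for each listing."""
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Add 0 for unavailable days so every listing still appears in the result.
        counts[row["listing_id"]] += 1 if row["available"] == "t" else 0
    return dict(counts)


days = available_days(SAMPLE_CALENDAR)
```

With a real file, the same function could be fed the contents of the calendar CSV instead of the sample string.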


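The reviews_detail data holds the reviews of short-term rental listings. The summary version includes only the listing_id and the review date, and is intended for time-series analysis and data visualization. The detailed version also includes the review content and author information.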

listing_id       # listing ID
id               # review ID
date             # review date
reviewer_id      # reviewer ID
reviewer_name    # reviewer name
comments         # review text
listing_url      # URL the listing was scraped from
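
To illustrate the time-series use of the summary version, here is a minimal sketch that counts reviews per month from `listing_id` and `date` columns. The sample rows are made up, and the `YYYY-MM-DD` date format is an assumption about how the file is written.

```python
import csv
import io
from collections import Counter

# Hypothetical sample of the summary file: only listing_id and date.
SAMPLE_REVIEWS = """listing_id,date
1001,2019-10-03
1001,2019-10-21
2002,2019-10-30
1001,2019-11-02
"""


def reviews_per_month(csv_text):
    """Return a sorted list of (YYYY-MM, review_count) pairs."""
    months = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumes ISO dates, so the first seven characters are 'YYYY-MM'.
        months[row["date"][:7]] += 1
    return sorted(months.items())


monthly = reviews_per_month(SAMPLE_REVIEWS)
```

The resulting monthly counts can be plotted directly, or grouped by `listing_id` first to get the per-listing review counts that an occupancy estimate needs.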

The neighbourhoods data holds the administrative districts of Beijing.



Reposted from www.cnblogs.com/Iceredtea/p/11965636.html