How Airbnb Uses Data Science to Improve Their Product and Marketing

As well-known as it is today, Airbnb had quaint beginnings. Unable to afford the rent for their San Francisco loft, founders Brian Chesky and Joe Gebbia turned their living room into a mini bed and breakfast, hosting three guests from a local sold-out trade show where lodging was scarce. The original design for AirBed and Breakfast provided temporary living quarters, breakfast and business networking opportunities for those who were unable to book a place to stay for local events due to demand.

Having grown quickly from a niche site providing accommodations for high profile events, Airbnb turned the hospitality and travel industry on its head and generated a great deal of press and brand recognition in the process. Since its humble beginnings, Airbnb has made no secret of its heavy use of data science to build new product offerings, improve its service and capitalize on new marketing initiatives. Here’s how they do it – and what you can learn from them:

Data is the Voice of the Customer, Data Science is the Interpretation of that Voice

Riley Newman, former head of data science at Airbnb, explains that the company looks at data as the voice of the customer, and data science as the interpretation of that voice. What’s more, Airbnb data scientists are not sitting around, holed up in their cubicles poring over spreadsheets. Instead, they’re actively engaged and organized to partner directly with engineers, designers, product managers and others on various teams.

Data Tackles Diversity

Airbnb uses data not only to improve its service and search, but also its hiring practices and customer groups. The company has actively looked to hire female data scientists and has taken great strides to ensure that there is no unconscious bias in its hiring practices. Much in the way one would approach conversion optimization, the team looked at the top of its hiring funnel and found that, historically, about 30% of applicants were women. That left plenty of room to build a more diverse workforce.

But simply “hiring more women” wasn’t enough. Traditionally, the job of “data scientist” isn’t something young girls dream of. There are plenty of opportunities sprouting for girls who code, as well as women engineers, but very little in the way of data science. So Airbnb created those opportunities through a series of community events and talks. Women from all types of data science backgrounds were invited to speak, collaborate and mingle. The events sold out.

However, they still weren’t done. They continued to scrutinize their interviewing process to ensure that applicants weren’t just a match analytically and communicatively, but culturally as well. The process consisted of a set of one-on-one conversations, a presentation and a take-home challenge. Results in the one-on-one conversations were strong, but they looked very different for both the presentation and the take-home challenge.

Much as a company would analyze their customer journey to improve conversions, Airbnb took it upon itself to look at every point in the hiring journey, from candidates displaying poor or “junior” communication in front of a panel of all-male data scientists scrutinizing their approach and making them nervous, to potential bias in grading the take-home challenge because of subjective views of “success”.
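Treating hiring like a conversion funnel can be sketched in a few lines. The example below is only an illustration: the stage names, the group labels and the input format are assumptions made for this sketch, not Airbnb’s actual pipeline. It computes, for each group, the share of candidates who move from one stage to the next, which is where a gap between groups would show up.

```python
from collections import Counter

# Hypothetical stage names, ordered from the top of the funnel to the bottom.
STAGES = ["applied", "conversation", "presentation", "take_home", "offer"]


def stage_pass_rates(candidates):
    """Return {group: {stage: pass rate from the previous stage}}.

    `candidates` is an iterable of (group, furthest_stage_reached) pairs.
    """
    reached = {}
    for group, furthest in candidates:
        counts = reached.setdefault(group, Counter())
        # Someone who got as far as stage i also passed through stages 0..i.
        for stage in STAGES[: STAGES.index(furthest) + 1]:
            counts[stage] += 1

    rates = {}
    for group, counts in reached.items():
        rates[group] = {
            later: counts[later] / counts[earlier]
            for earlier, later in zip(STAGES, STAGES[1:])
            if counts[earlier] > 0
        }
    return rates
```

Comparing the per-stage rates across groups points to the step worth investigating, much like finding the leakiest page in a checkout flow.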

[Image: Airbnb credits diverse hiring as a key motivator in its product growth]

The end result was that they were not only able to add more female data scientists to their rosters, but the quality and experience of applicants was also greatly improved.

Of course, it’s one thing to apply conversion optimization practices to people, and another thing entirely to apply them to products.

Or is it really so different?

Improving Search Using Data

At the heart of the Airbnb site is its search. Carefully tuned, its search has been designed to inspire, amaze and delight customers at every step. But it wasn’t always such a walk in the park. Originally, Airbnb didn’t know what kind of data to give customers, so it settled on a model which returned the highest quality listings within a certain radius based on the user’s search.

As more users came to the site and Airbnb acquired more data, it was able to replace its basic search with a more user-data driven one. Newman explains:

“…[W]e decided to let our community solve the problem for us. Using a rich dataset comprised of guest and host interactions, we built a model that estimated a conditional probability of booking in a location, given where the person searched. A search for ‘San Francisco’ would thus skew toward neighborhoods where people who also search for San Francisco typically wind up booking, for example the Mission District or Lower Haight.”
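The idea of a conditional probability of booking, given the search, can be illustrated with simple counting. This is a minimal sketch, not Airbnb’s model: it assumes a log of (search query, booked neighborhood) pairs, and both the function name and the input format are made up for the example.

```python
from collections import Counter, defaultdict


def booking_probabilities(search_bookings):
    """Estimate P(booked neighborhood | search query) from observed pairs.

    `search_bookings` is an iterable of (query, neighborhood) tuples, one per
    search that ended in a booking.
    """
    counts = defaultdict(Counter)
    for query, neighborhood in search_bookings:
        counts[query][neighborhood] += 1

    return {
        query: {hood: n / sum(hoods.values()) for hood, n in hoods.items()}
        for query, hoods in counts.items()
    }
```

Ranking the neighborhoods for a query by these estimates is what makes results “skew toward” the places where similar searchers actually end up booking.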

Airbnb describes how its search models have evolved on its own blog.

Airbnb also used data to tailor the search experience demographically. It noticed back in 2014 that users from certain Asian countries typically had a high bounce rate when visiting the homepage. Analyzing the data further, they discovered that users would click the “Neighborhood” link, start browsing photos and then never come back to book a place.

The data scientist who discovered the problem showed it to the engineering team, who created a redesigned version for users from those countries, replacing the Neighborhood links with the top traveling destinations in China, Japan, Korea and Singapore. As a result, they saw a 10% lift in conversions from users from those countries.

Using Data to Determine Host Preferences

The premise behind Airbnb is simple enough: match people who are looking for accommodations with those wanting to rent out their place. Bar Ifrach, one of Airbnb’s data scientists, discovered the site through a friend. The friend offered his nice apartment to guests while he was traveling during a grad school break, and he wanted to fit in as many bookings as possible during the 1-2 weeks he was away. He would accept or reject requests based on how much they helped him maximize his occupancy.

Ifrach remembered this particular scenario and used it to create a miniature research project of his own, asking the question:

What Affects Hosts’ Acceptance Decisions?

Obviously, not every host will want to take the same approach as Ifrach’s friend, but those that do will try to avoid check-in and checkout gaps as noted below:

[Image: check-in gaps]

In applying what Ifrach learned from his friend to the host base as a whole, Ifrach found that hosts were more accepting of requests that fit into their calendar while minimizing those gaps:

[Image: host gaps]
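To make the notion of a calendar gap concrete, here is a small sketch that measures the idle nights a request would leave on either side of it. The function name and the way neighboring bookings are passed in are assumptions for illustration only.

```python
from datetime import date


def gap_days(request_start, request_end, prev_checkout=None, next_checkin=None):
    """Return (idle nights before, idle nights after) a requested stay.

    `prev_checkout` is the checkout date of the booking before the request and
    `next_checkin` is the check-in date of the booking after it. A side with no
    neighboring booking is reported as None.
    """
    before = (request_start - prev_checkout).days if prev_checkout else None
    after = (next_checkin - request_end).days if next_checkin else None
    return before, after
```

A request that returns `(0, 0)` slots in perfectly between two existing bookings, while larger numbers mean nights that may be hard to fill later.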

But did this kind of information apply to every market? Or did big and small markets differ? The results were quite surprising:

[Image: preference differences]

Throw in hosts’ specific (personal) preferences for last-minute versus plenty-of-notice requests, and what started as a small research project turned into a full-blown machine learning algorithm. Ifrach partnered with an Airbnb engineer to create an application that would essentially personalize results based on both host and guest preferences to ensure a more accurate fit.

Here, Airbnb data scientists looked at everything from hosts’ prior acceptance and decline decisions to the particulars of the trip itself. Rather than clutter the algorithm with too much noise, they created their own set of filters and applied them using a flow chart like the one below:

[Image: preference flow]
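One simple way to turn a host’s prior accept and decline decisions into a usable number is a smoothed acceptance rate. To be clear, this is a swapped-in textbook technique (additive smoothing toward a prior), not the algorithm Airbnb built, and the default `prior_rate` and `prior_weight` values are assumptions.

```python
def smoothed_acceptance_rate(accepted, declined, prior_rate=0.5, prior_weight=10):
    """Blend a host's own history with a prior acceptance rate.

    With little history the estimate stays near `prior_rate`; as decisions
    accumulate, the host's own behavior dominates. Both defaults are
    illustrative assumptions.
    """
    total = accepted + declined
    return (accepted + prior_rate * prior_weight) / (total + prior_weight)
```

The smoothing keeps a brand-new host with one decline from looking like someone who rejects everything.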

To test how it worked, they conducted an experiment that combined the predicted probability of acceptance with a ranking algorithm that took other preferences into consideration. The main goal was to increase the likelihood that a guest requesting accommodation would get a booking. As a result of applying these new filters and preferences, they saw a nearly 4% lift in booking conversion as well as a significant increase in the number of successful matches between guests and hosts, a win-win for everyone.
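Here is a rough sketch of how an acceptance probability might feed into ranking. The scoring rule (relevance multiplied by predicted acceptance) and the field names `relevance` and `p_accept` are assumptions chosen for clarity, not a description of Airbnb’s ranking function.

```python
def rerank(listings):
    """Sort listings by relevance weighted by predicted host acceptance.

    Each listing is a dict with hypothetical keys:
      "relevance": how well the listing matches the guest's search (0..1)
      "p_accept":  estimated probability the host accepts this request (0..1)
    """
    return sorted(
        listings,
        key=lambda listing: listing["relevance"] * listing["p_accept"],
        reverse=True,
    )
```

Under this rule, a slightly less relevant listing whose host is very likely to say yes can outrank a perfect match whose host rarely accepts.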

Creating the “Airbnb Experience”

The real “meat and potatoes” of data science comes together in the “Airbnb Experience”: when guests are traveling to their host, being welcomed, settling in and exploring. These are the things that can make or break a user’s experience with the site, and they are incredibly valuable to Airbnb itself in terms of understanding the quality of the trip.

They measure this experience through the use of a Net Promoter Score or NPS, a customer loyalty metric introduced back in 2003. The Net Promoter Score asks, in essence, “How likely are you to recommend Airbnb?”
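For reference, the standard Net Promoter Score calculation is short enough to write out. Respondents answer on a 0-10 scale; those answering 9 or 10 count as promoters, 0 through 6 as detractors, and the score is the percentage of promoters minus the percentage of detractors. The function name below is just for illustration.

```python
def net_promoter_score(ratings):
    """Compute NPS (from -100 to 100) from 0-10 'likelihood to recommend' answers."""
    ratings = list(ratings)
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)   # answers of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # answers of 0 through 6
    return 100 * (promoters - detractors) / len(ratings)
```

Answers of 7 and 8 (“passives”) count toward the total but toward neither group, which is why a pile of lukewarm reviews drags the score toward zero.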

Because Airbnb wants to know whether “likelihood to recommend” makes accurate predictions, it controls for other parameters too, including:

  • Overall review score and responses to review subcategories on a scale from 1-5.
  • Guest acquisition channel (organic or marketing campaigns)
  • Trip destination
  • Guest origin
  • Previous bookings from the guest on Airbnb
  • Trip length
  • Number of guests
  • Price per night
  • Month of checkout (to account for seasonality)
  • Room type (entire home, private room, shared room)
  • Other listings owned by the host

Airbnb acknowledges that other kinds of loyalty may be in play (like word of mouth referrals) that they cannot account for. Because reviews themselves are so important to the overall Airbnb experience, the company wanted to determine whether their Net Promoter Score (“likelihood to recommend”) predicted rebooking better than reviews did.

In this case, the likelihood to recommend and the review subcategories were tested to see how much each added to predictive accuracy.


As a result of this study, Airbnb found that post-trip reviews (including the likelihood to recommend) only marginally improved their ability to predict when users would rebook. Reviews do much more than potentially predict rebooking numbers, and there are other uses of the Net Promoter Score not mentioned here, but as a predictor of rebooking the gain was slight at best.
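A comparison like this boils down to scoring two sets of predictions against what guests actually did. The sketch below assumes you already have predicted rebooking probabilities from a baseline model and from a model that also sees the extra signal; the function names and the 0.5 threshold are assumptions, and accuracy is only one of several metrics that could be used.

```python
def accuracy(predicted_probs, rebooked, threshold=0.5):
    """Share of guests whose rebooking outcome is predicted correctly."""
    hits = sum(
        (p >= threshold) == bool(outcome)
        for p, outcome in zip(predicted_probs, rebooked)
    )
    return hits / len(rebooked)


def accuracy_gain(baseline_probs, extended_probs, rebooked):
    """How much accuracy the extra signal adds over the baseline model."""
    return accuracy(extended_probs, rebooked) - accuracy(baseline_probs, rebooked)
```

A gain close to zero on held-out trips is the kind of result that says a signal is not pulling its weight as a predictor.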

In this particular case, had data scientists and other team members not dug into how accurately reviews and the Net Promoter Score forecast future bookings, Airbnb would never have known whether those signals could improve the guest experience and, in turn, revenue. It’s yet another example of data science saving time and money, even when things don’t ultimately work out as intended.

Split Testing to Tweak the Process

Like all smart, cutting-edge companies, Airbnb also makes liberal use of split tests. They call these “experiments” and conduct them regularly at every stage of development from conceptualization to completion and beyond. In many cases, however, it’s difficult to tell just how much of an impact a particular product or product change had.

[Image: It’s difficult to tell what effect this product launch had]

Airbnb has its own internal A/B testing framework rather than using an out of the box solution, since there are some aspects of their business model and customer experience that make it more involved than simply changing the color of a button and measuring what happens.

For example, users can browse Airbnb whether they’re logged in or not. This can make it a challenge to tie actions to a particular user. It’s also possible for them to browse on their mobile device, then come home and complete the booking process on their home computer.

Furthermore, a successful booking depends on the guest’s request (and inventory) and how responsive the host is – things that are beyond Airbnb’s control.
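The logged-in versus logged-out problem is easier to see with a sketch of how experiment assignment often works. A common approach, assumed here for illustration and not confirmed as Airbnb’s method, is to hash an identifier so the same subject always lands in the same group.

```python
import hashlib


def assign_variant(subject_id, experiment, variants=("control", "treatment")):
    """Deterministically map a subject to a variant of a named experiment.

    `subject_id` could be a user ID or a visitor cookie; the choice of
    identifier is exactly what makes cross-device browsing tricky.
    """
    key = f"{experiment}:{subject_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Because a cookie on a phone and a user ID on a desktop hash to unrelated values, the same person can end up in both groups unless the two identifiers are linked.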

And although they’ve simplified the process quite a bit, their booking process is still quite complex. Airbnb mainly looks at the conversion rate between searching and finally booking – even though there are several steps in between:

[Image: Airbnb booking process]

Much of what constitutes a “conversion” in this case is a guest looking for a place to stay in a specific area, a host setting a price, and the two coming together to agree and take care of the necessary formalities. There are a lot of little road bumps inherent in a process like this, which is why experiments are so crucial.

In another example, Airbnb (which provides professional photo services to hosts) felt that users would have a better experience if listings were made available as beautiful, full-color photos in the search results:

[Image: Airbnb location listing, before and after]

In testing this new design, they discovered that it broke a crucial click-through action in some older versions of Internet Explorer (to the surprise of absolutely no one). In fixing that issue, they were able to continue with the testing as well as uncover many more important lessons on how change impacts various user groups.
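A browser-specific bug like this typically surfaces when results are broken down by segment rather than read as a single number. The following sketch assumes each event records a segment (such as a browser name), the variant shown and whether the visit converted; the names and record format are hypothetical.

```python
from collections import defaultdict


def conversion_by_segment(events):
    """Return {segment: {variant: conversion rate}}.

    `events` is an iterable of (segment, variant, converted) tuples.
    """
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [visits, conversions]
    for segment, variant, converted in events:
        cell = totals[segment][variant]
        cell[0] += 1
        cell[1] += int(bool(converted))

    return {
        segment: {variant: conv / visits for variant, (visits, conv) in by_variant.items()}
        for segment, by_variant in totals.items()
    }
```

A segment where the new design converts at or near zero while the old one looks normal is a strong hint that something is broken rather than merely unpopular.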

In cases like this, they must also take great care not to infer changes where there are none, or let biased readings of the results negatively influence their decisions. Airbnb has published a full blog post on the design implementation.
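One standard guard against inferring a change where there is none is a significance test on the two conversion rates. Below is a minimal two-proportion z-test using only the standard library. It is a textbook method offered as an illustration; the source does not say which statistics Airbnb’s framework uses.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a is the control group's conversions and visitors;
    conv_b / n_b is the treatment group's.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value
```

A large p-value means the observed difference is well within what chance alone would produce, so the honest conclusion is “no detectable change” rather than a win or a loss.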

Looking Toward the Future

Because Airbnb is using data to constantly improve itself, it is also forging into new frontiers where laws and regulations have yet to catch up. Case in point: the launch of its Price Tips feature a little over a year ago. With Price Tips, a host can look at the calendar to see which dates are likely to be booked at their current price, as well as which aren’t, and get suggestions.

Seems simple enough, right? But Price Tips pulls information from five BILLION training points and leverages machine learning and personal inputs to generate its suggestions. A lot of trends are easy to recognize, such as big events like SXSW which can raise prices citywide. Other factors like amenities and even specific neighborhoods can also influence demand.
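To give a feel for what “likely to be booked at the current price” could mean, here is a deliberately simple toy. Everything in it is an assumption for illustration: the logistic curve, the `sensitivity` value and the idea of a per-date reference price are not taken from Price Tips, which the article describes as a far richer machine learning system.

```python
import math


def booking_probability(price, reference_price, sensitivity=4.0):
    """Toy demand curve: 0.5 at the reference price, lower as price rises above it."""
    return 1 / (1 + math.exp(sensitivity * (price / reference_price - 1)))


def flag_dates(calendar_prices, reference_prices, threshold=0.5):
    """Label each date 'likely' or 'unlikely' to book at its current price.

    Both arguments are dicts keyed by date; the reference price stands in for
    whatever a real model would infer from demand on that date.
    """
    return {
        day: "likely"
        if booking_probability(price, reference_prices[day]) >= threshold
        else "unlikely"
        for day, price in calendar_prices.items()
    }
```

In this toy, a big event would show up as a higher reference price for those dates, which is why the same nightly rate can be flagged as too high on a quiet Monday and fine during a festival.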

To help stay abreast of these changes and potentially earn hosts (and Airbnb itself) more money, the company developed Aerosolve, an open source machine-learning system that detects patterns and attempts to use these to see why certain listings command higher prices.

In one example shared on Forbes, the Aerosolve model highlighted that listings at a specific location were commanding good prices and were also using the word “sabbia”. The location in particular was Playa del Carmen, a resort town in Mexico, and “sabbia” is the Italian word for “sand” – something that Airbnb can recommend to hosts in the form of a price tip.
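The kind of pattern described here can be approximated, very crudely, by comparing prices of listings that mention a word with those that do not. This sketch does not use Aerosolve or its API; the function name and the (description, price) input format are assumptions.

```python
from statistics import mean


def price_gap_for_word(listings, word):
    """Mean price of listings mentioning `word` minus mean price of the rest.

    `listings` is an iterable of (description, nightly_price) pairs. Returns
    None when one of the two groups is empty.
    """
    word = word.lower()
    with_word, without_word = [], []
    for description, price in listings:
        bucket = with_word if word in description.lower().split() else without_word
        bucket.append(price)

    if not with_word or not without_word:
        return None
    return mean(with_word) - mean(without_word)
```

A positive gap only shows an association within one market; it does not prove that adding the word to a description would raise the price.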

Of course, it’s one thing to recommend prices and another entirely to control them. Certain “on demand” marketplaces, like Uber and Lyft, set pricing for their hosts (or drivers in this case), a blurry legal move that Airbnb has shied away from up to this point. But any savvy business owner would be remiss if they didn’t see the strategy in Airbnb wanting to squeeze every drop of potential from its listings.

What You Can Learn from Airbnb’s Data Science

All of these examples where Airbnb proved, disproved or otherwise discovered trends or surprising results aren’t meant to overwhelm you. Instead, they’re meant to illustrate the importance of having this raw information to learn from. When used correctly, and in partnership with many other departments in your company, data science can be used as a springboard to create new hypotheses, test new ideas and improve on existing ones.

Embracing the science behind the data means not being afraid to dig deeper or even forge your own questions when faced with a challenge that regular tests can’t solve. But above all, it is meant to help inspire you and remind you that a successful company is never content to rest on its laurels – it’s always learning, adapting and growing – fueled by data and science.

About the Author: Sherice Jacob helps business owners improve website design and increase conversion rates through compelling copywriting, user-friendly design and smart analytics analysis.