SOLD OUT – Thursday 21st May – The Trust Game: How Data Scientists at Airbnb are Cracking the Trust Code

I am giving a talk about Trust at Airbnb on May 21st at Imperial College London’s new Data Science Institute. You can see all the details here and also reserve tickets. Please join!


Imperial College London interviewed me

Last summer Imperial College London sent a delegation led by Professor David Gann, Vice President and Head of Development & Innovation, to San Francisco and Silicon Valley.


During their trip I invited them to the offices at Airbnb to see how we think about innovation – especially from the point of view of Data Science – and also for a catch-up. Although I studied at Imperial College for only one year, between 2005 and 2006, I have a great affinity for the university.

As part of the visit to the office, I was invited to contribute an interview for their alumni pages, in the hope of getting more of their current students thinking about a life in San Francisco and the tech startup scene. You can read the full transcript here.

Join me at the 2015 SF Sentiment Analysis Innovation 29-30th April

Innovation Enterprise is hosting a conference on Sentiment Analysis Innovation on April 29-30.


I will be joining a panel discussion on behalf of Airbnb on the topic of ‘Extracting Actionable Insights Using Sentiment Analysis’. It’s sure to be a great event bringing together all the big players in the field.

Check out the full schedule and register to join!

Check out my new Machine Learning blog post on Airbnb


While almost all members of the Airbnb community interact in good faith, there is an ever-shrinking group of bad actors that seek to take advantage of the platform for profit. This problem is not unique to Airbnb: social networks battle with attempts to spam or phish users for their details; ecommerce sites try to prevent the use of stolen credit cards. The Trust and Safety team at Airbnb works tirelessly to remove bad actors from the Airbnb community and to help make the platform a safer and more trustworthy place to experience belonging.

Missing Values In A Random Forest

We can train machine learning models to identify new bad actors (for more details see the previous blog post Architecting a Machine Learning System for Risk). One particular family of models we use is Random Forest Classifiers (RFCs). An RFC is a collection of trees, each independently grown using labeled and complete input training data. By complete we explicitly mean that there are no missing values, i.e. no NULL or NaN values. In practice, however, the data often has (many) missing values. In particular, very predictive features do not always have values available, so they must be imputed before a random forest can be trained.
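To make "imputed" concrete, here is a minimal sketch of one common baseline: replacing each missing entry with the median of the observed values in its column. This is an illustration only and not necessarily the approach described in the full post; the helper names `is_missing` and `impute_median` and the toy `features` matrix are hypothetical.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None (NULL) and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute_median(rows):
    """Return a copy of rows with each missing entry replaced by its column median.

    The median is computed over the observed (non-missing) values of that column.
    """
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not is_missing(row[j])]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        fills.append(median(observed))
    return [
        [fills[j] if is_missing(v) else v for j, v in enumerate(row)]
        for row in rows
    ]


# Hypothetical toy feature matrix: two numeric features, some entries missing.
features = [
    [1.0, 10.0],
    [None, 20.0],
    [3.0, float("nan")],
    [5.0, 40.0],
]

# Every row is now complete and could be passed to a random forest trainer.
complete = impute_median(features)
```

Median imputation is simple and robust to outliers, but it discards the information that a value was missing in the first place, which may itself be predictive.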

Read more…

The sharing economy helps the bottom half more than the top half

A paper written by Samuel Fraiberger and Arun Sundararajan of New York University about sharing economy companies such as Uber and Airbnb claims to show that

“…below-median income consumers will enjoy a disproportionate fraction of eventual welfare gains from this kind of ‘sharing economy’ through broader inclusion, higher quality rental-based consumption, and new ownership facilitated by rental supply revenues…”.

Commentators on the paper, such as Mashable, write that, while emphasis has historically been placed on the benefits to the consumer of increased access to higher quality products, this study looks at the other side of the equation and identifies short-term and long-term benefits to suppliers as well.

It remains to be seen whether these benefits really hold in the longer term and can be sustained as momentum in the sharing economy gathers. Moreover, if these companies move towards IPO, it is unclear how pressure from shareholders will sit with these positive effects.