Along with a team of Stanford University Sociologists led by Karen Cook and Paolo Parigi, I am conducting a study on behalf of Airbnb to understand the social consequences of sharing goods and services with strangers.
Karen has published multiple books on the formation of Trust in modern societies and more recently on the role of Trust in the online world. Paolo is also interested in social networks and has conducted previous studies of Trust in the sharing economy.
Together we will be surveying Airbnb members to better understand Trust inside and outside of the sharing economy, as well as what drives changes in Trust. Stay tuned for more!
My latest Machine Learning blog post, "Confidence Splitting Criterions Can Improve Precision And Recall in Random Forest Classifiers", is out on the Airbnb Data blog:
The Trust and Safety Team maintains a number of models for predicting and detecting fraudulent online and offline behaviour. A common challenge we face is attaining high confidence in the identification of fraudulent actions, both in terms of classifying a fraudulent action as fraudulent (recall) and in terms of not classifying a good action as fraudulent (precision).
A classification model we often use is a Random Forest Classifier (RFC). However, by adjusting the logic of this algorithm slightly, so that we look for high confidence regions of classification, we can significantly improve the recall and precision of the classifier’s predictions. To do this we introduce a new splitting criterion (explained below) and show experimentally that it can enable more accurate fraud detection.
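To make the precision and recall terms above concrete, here is a minimal sketch in plain Python. It does not reproduce the splitting criterion from the post; it only shows how the two metrics move when we act on high-confidence predictions alone. The labels, scores, thresholds and the helper names `precision_recall` and `classify` are all hypothetical and chosen purely for illustration.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 = fraudulent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud flagged as fraud
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # good action flagged as fraud
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud that was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def classify(scores, threshold):
    """Flag an action as fraudulent when its score reaches the threshold."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical toy data: true labels and model scores (not real Airbnb data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.95, 0.80, 0.40, 0.60, 0.10, 0.30, 0.05, 0.90]

for threshold in (0.5, 0.85):
    precision, recall = precision_recall(y_true, classify(scores, threshold))
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, raising the threshold from 0.5 to 0.85 lifts precision from 0.75 to 1.00 while recall drops from 0.75 to 0.50, which is the usual trade-off a plain score threshold gives you.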
Have a read and let me know what you think!
I spoke at yesterday’s Social Data Revolution class on Trust & Identity at UC Berkeley on behalf of Airbnb. The class also had speakers from Uber and Reddit. You can see the live recording on YouTube.
I am hoping to give a talk with Eric Levine on behalf of Airbnb at next year’s SXSW Interactive conference in Austin. Please vote for our submission and leave some comments too!
Yesterday I was invited by David Webster to talk to the team at innovative design company IDEO. IDEO is a cutting-edge digital and physical design studio in Palo Alto that has been leading creativity for over 30 years. I was lucky enough to be given a tour by David through their workshop, engineering office, and toy lab.
After the tour we had a joint Q&A with the whole team about how big data is used at Airbnb and how it might be used more in the design process at IDEO. Some key thoughts emerged:
- The world is moving towards more wearable sensory technology, e.g. Google Glass, Apple Watch, Fitbit. With this comes a wealth of feedback data on the user in the offline world. The internet of things (IoT) will make, for example, A/B testing in the offline (physical) world possible.
- For designers to be more data empowered, we first need the analytics and prediction tools to catch up. Currently it is easy to log data and cheap to store it, and there are standardised tools to query it. However, no leader has emerged for extracting insights from data. This democratisation of insights needs to happen before data can permeate design.
- Data science works best with design when they collaborate early. At the start of a project it is easier to scope which data is necessary and easy to collect, so that decisions can be informed and iterations can be faster.
The future for Data Science in Design is exciting and, as the two start to overlap more, we will see changes in the world around us accelerate even faster.
Michael Li of the Data Incubator has written a timely article in VentureBeat on what a Data Scientist is not. In short, a Data Scientist is:
- Not just a Business Analyst working on more data,
- Not just a rebranded Software Engineer,
- Not just a Machine Learning expert with no business knowledge.
A Data Scientist needs to be able to extract insights from datasets that are orders of magnitude larger than they were 5 years ago. And they need to extract this insight carefully, with statistical significance and integrity. Moreover, the insight is only as useful as the business need it solves.
As a regular interviewer at Airbnb for junior and senior Data Scientists, I find that attention to data cleaning and diligence in statistical analysis are fundamental for successful candidates. Moreover, we look for people who understand the ‘why’ of a problem and the business impact of a solution. This is what differentiates a really smart candidate from a hired candidate.
Read Riley Newman, head of Data Science at Airbnb, describe his experience during the past 5 years of Airbnb’s hypergrowth in today’s VentureBeat article. Learn about how he scaled the team and brought data to the top of mind in every corner of the company. There’s also a picture of my team featured!