Do you really need Data Scientists?

This and many more common questions about Data Science are tackled by Instacart VP Data Science Jeremy Stanley and former LinkedIn data leader Daniel Tunkelang. The term Data Science was only coined a decade or so ago but has gathered so much momentum that most business leaders now feel they should have a Data Science team – even if they don’t know what they would do with it.

Jeremy and Daniel take us through some common misconceptions and recommend ways of thinking about how to get real impact from Data Science. Some of my favourite lines from the article:

The above may sound a lot like data analytics, and indeed the difference between analytics and decision science isn’t always clear. Still, decision science should do more than produce reports and dashboards.

But collecting data isn’t enough. Data science only matters if data drives action.

Similarly, data-driven decision making requires a top-down commitment. From the CEO down, the organization has to commit to making decisions using data, rather than based on the highest paid person’s opinion (or HiPPO).

Many people equate big data to data science, but size isn’t everything. Data science is about separating the signal in data from the noise.

Don’t hire a head of data or build a team until you have work for them to do. At the same time, ensure you’re collecting key data early on so that team can have an impact once you’re ready.

Build a company culture early that makes it a great place to practice data science, and you’ll reap dividends when they matter most.

Over time, the impact that a data science team has will be far higher if you build a diverse team with extremely different backgrounds, skill-sets, and world views.

Finally, focus early on hiring data scientists who reflect your company ideals. To be effective, data scientists must be trusted by their teams, the users of their products, and the decision makers they influence.

DJ Patil wants Silicon Valley to work on real problems

I was fortunate enough to attend a Q&A with United States Chief Data Scientist DJ Patil at the Commonwealth Club last week. DJ was keen to stress the challenges facing the US government and the Big Data available to help solve them, but he noted that the talent and progress we see in technology are not being applied to ‘real’ problems.

DJ gave examples from Law Enforcement and Health Care, among others, as areas that are ripe for disruption by data and technology. He also stressed that much public data is readily available online, both at the local and national level – and he invited the Data Scientists in the audience to start hacking for social solutions!

Joined the Advisory Board for Imperial’s MSc in Business Analytics

Delighted to have officially joined the Advisory Board for Imperial College’s MSc in Business Analytics at the Business School. This is the first year the MSc has been running and the progress so far has been phenomenal. The course is heavily subscribed by candidates from around the globe and the current class has a wonderful diversity of students and experience.

I attended the most recent Board meeting in March and was impressed with the content and ambition of the course. We are working hard to improve the course for the next class and strongly believe the course will be a world leader in delivering data science and business analytics over the years to come. Excited to be a part of this!

We came, we saw, we hacked!

Last weekend I spent Saturday and Sunday hacking on government data at BayesImpact’s hackathon this year – Bayes Hack 2016! Located at OpenDNS HQ, the event invited teams of Data Scientists, Engineers, Designers, and anyone who is interested in data to hack for 24 hours.

For those unfamiliar with ‘hacking’: the premise is basically to build something in a very short amount of time. We call it ‘hacking’ because you have to cut corners and write some ugly code to get a product out quickly. It’s different to your normal job but very liberating!

I teamed up with four other Data Scientists from Airbnb and an Engineer, and we decided to look at the Department of Labour’s database on jobs and their associated skills, knowledge, and education requirements. Our prompt was the following:

Economic landscapes change dramatically, often outpacing a workforce lagging in its adaptation to new opportunities and industries. How can data scientists leverage predictive modeling to close the gap?

What did we build? We broke into two teams. One team built a recommendation engine that lets users enter their skills and abilities and get back job suggestions. The second team, which I worked on, built an interactive visualisation of these recommendations so that users can explore related jobs.

For each of the 954 jobs in the database, we computed the coordinates of the job in the 35-dimensional space of skills. These skills include Reading Comprehension, Active Listening, Writing, etc. For each pair of jobs in this skills vector space, we computed the distance between them using the Kullback-Leibler Divergence to give us a value between 0 and 1. The smaller the distance (divergence), the more similar two jobs are in terms of the skills required to be competent in them. The visualisation was made in Gephi and exported to SigmaJs.
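To make the distance step concrete, here is a rough sketch in plain Python of how a pairwise divergence between jobs could be computed. It is not our hackathon code. The raw Kullback-Leibler divergence is neither symmetric nor bounded, so two choices here are assumptions of this sketch: the divergence is averaged over both directions, and the result is divided by the largest value so that everything lands between 0 and 1. The function names, the smoothing constant `eps`, and the three-skill toy jobs in the usage note are all hypothetical.

```python
import math


def to_distribution(scores, eps=1e-9):
    """Turn non-negative skill scores into a probability distribution.

    A small eps (an assumption of this sketch) avoids log(0) for skills
    scored at zero.
    """
    smoothed = [s + eps for s in scores]
    total = sum(smoothed)
    return [s / total for s in smoothed]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for two distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p), so the order of jobs does not matter."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def pairwise_distances(job_skills):
    """Return {(job_a, job_b): distance} with distances scaled into [0, 1].

    job_skills maps a job name to its list of skill scores. Scaling by the
    largest divergence is an assumption made for illustration.
    """
    names = sorted(job_skills)
    dists = {name: to_distribution(job_skills[name]) for name in names}
    raw = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            raw[(a, b)] = symmetric_kl(dists[a], dists[b])
    largest = max(raw.values()) if raw else 0.0
    if largest == 0.0:
        return {pair: 0.0 for pair in raw}
    return {pair: d / largest for pair, d in raw.items()}
```

Calling `pairwise_distances({"Editor": [5, 4, 5], "Journalist": [5, 5, 4], "Welder": [1, 2, 1]})` returns one distance per pair of jobs. Pairs with a small value are the ones you would draw close together in a graph tool such as Gephi.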

We were one of the 8 finalists on the day but eventually lost out to the fantastic Go Bot Chat team working on the Department of the Interior’s database. Their project provides park and recreation recommendations to people using a chatbot service built on top of Facebook’s Messenger.

The weekend was super fun, and it was inspiring to see how much can be done so quickly with so much openly available data. You can see our full source code on Github, along with all the other projects from the competition.

 

Artificial Intelligence set to dominate Financial Services

A recent article by a Foreign-Exchange Journalist suggests the ‘Skynet’ of Finance is not too far away.

The article points to the huge improvements in Artificial Intelligence and to how strongly Financial Services firms expect technology to take over their industry. In particular, Transfer/Payments businesses expect to lose 28% of their business to FinTech in the next 5 years, and Banks expect to lose 24% of their business.

The silver lining to this takeover, the article points out, could be a greater emphasis on the ‘human touch’ in key customer-facing areas. For example, a human hand at the wheel to prevent another ‘flash crash’, or a human interpreter for lending and investment decisions made by an Artificial Intelligence.

Whatever happens, we are likely to see more automation, lower costs for the customer, and smarter decision making – at least in the near term.

Hedge funds are luring away Tech’s AI superstars

An arms race has resumed amongst the world’s biggest hedge funds. Seeing the potential of the technologies produced by some of the most prolific Machine Learning groups at big tech companies such as Google and Facebook, hedge funds are poaching lead Data Scientists to work on building better alpha strategies, a recent article notes.

In the past, algorithmic trading prided itself on hiring highly skilled statisticians to sculpt informative signals and combine them in a state-of-the-art model to predict movements in prices. With the success of deep learning software, such as IBM’s Watson, hedge funds now see potential in throwing their financial big data at these artificial intelligence black boxes to predict alpha.

Bridgewater hired David Ferrucci, who led the development of Watson at IBM; Renaissance Technologies is run by Bob Mercer and Peter Brown, former language recognition leads at IBM; and recently BlackRock hired Bill MacCartney, a former Google scientist.

For these robotics rockstars moving from Tech to Finance, one downside is that their work becomes a lot more secretive. The nature of algorithmic trading is very hush-hush, with all hedge funds in direct competition with each other. Compared to publishing research papers at IBM or Google, the researchers at these funds will have to keep their advances to themselves – which is a loss for the rest of the scientific community.