Deep Learning: A Sport of Kings?

The big news in the machine learning/deep learning world this week is Google’s release of TensorFlow, their deep learning toolkit. This has prompted some to ask: why would they give away the “crown jewels” of such a strategic technology? The question is best answered with a machine learning joke (paraphrased): “the winners usually have the most data, not the best algorithms”.

Neural networks have been around for a while, but it’s only been within the past ten years that researchers have figured out how to train networks with many, many layers (the “deep” in “deep learning”). That research has been greatly accelerated by using GPUs as very high-performance, general-purpose vector processors. If a researcher can turn around an algorithm experiment in a day (versus three months), a lot more research gets done.

But as the joke suggests, it’s all about the data: you need lots and lots and LOTS of data to train a high-performance deep learning network. And Google has more data than anyone else, so they don’t worry so much about giving away algorithms.

(Also, Google, Baidu, Twitter, Facebook, etc. are investing in GPU compute clusters that can only be described as the new “mainframe supercomputers”. Sure, you can rent GPU instances on Amazon, but there’s nothing like having the latest Nvidia board with lots of RAM and a very high-performance interconnect.)

What does this all mean for early stage startups? The situation creates several tough hurdles: first, freely available code and technology from Google (and Facebook) enables competitors and devalues whatever the startup might develop. Second, few startups have access to a proprietary data source large enough to compete at scale. And third, GPU compute clusters require real capital.

What’s left for startups? I see at least two interesting patterns:

  • Using deep learning as a key feature to enhance another app.  Use freely available technology to add magic.  Google Photos is a great example of this, and I think every photo and video app will soon be able to recognize stuff, people, items, etc. to enhance the functionality.
  • “Man-teaches-machine”.  Start out with a lot of humans doing some task and capture their work to train a network.  Over time, have the network handle the common cases, with the exceptions / ambiguous cases routed to humans for resolution.  Build a large, proprietary training set, enjoy compound interest, and profit.
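
The “man-teaches-machine” loop above can be sketched as a confidence-threshold router: the model handles what it’s sure about, and everything else goes to a human whose answer is captured into the proprietary training set. This is a minimal toy sketch, not a real system — `model_predict`, `human_label`, and the threshold value are all hypothetical stand-ins.

```python
# Sketch of the "man-teaches-machine" routing loop. All names and the
# toy model are hypothetical; a real system would use a trained network.

CONFIDENCE_THRESHOLD = 0.9   # below this, defer to a human (assumed value)

training_set = []  # the proprietary asset: (example, label) pairs

def model_predict(example):
    """Stand-in for a trained network; returns (label, confidence)."""
    # A toy rule so the sketch runs end to end.
    if "invoice" in example:
        return ("invoice", 0.95)
    return ("unknown", 0.40)

def human_label(example):
    """Stand-in for routing the example to a human annotator."""
    return "receipt"

def handle(example):
    label, confidence = model_predict(example)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label                        # common case: machine handles it
    label = human_label(example)            # exception: a human resolves it
    training_set.append((example, label))   # ...and the training set grows
    return label
```

Over time, the captured `(example, label)` pairs are used to retrain the network, raising the fraction of cases it can handle on its own — the “compound interest” of the pattern.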