Strata+Hadoop World 2016 in New York - Another Perspective

I was lucky enough to attend the Strata + Hadoop World conference alongside Alexis Seigneurin, who wrote about his experiences here. This is my take on the conference.

Keynotes

Over the course of two days there were 20 keynotes covering a wide variety of topics. Each keynote focused on either a technology or the impact data has on a particular field. The fascinating ones ranged from using data to improve Lasik eye care by determining the best candidates, to public policy that makes sure the right data can get to the right people, to venture capital data used to examine the trends that lead to the best investments. There were also keynotes on security and how we can use data to keep our data more secure.

One of the keynotes that resonated most with me was Pagan Kennedy's “The art and science of serendipity”. It resonated because, as I progress through my career, coming up with the latest and greatest ideas and insights always seems so daunting. She discussed the true meaning of the word “serendipity”, which is “fortunate happenstance” or “pleasant surprise”. It was a wonderful discussion of how we can discover what we are not looking for, since the majority of the insights that are most beneficial to the community are found within the “unknown unknown”.

Technical Talks

As Alexis mentioned in his blog post, all the talks were recorded. These are the talks that I got the most out of, and I highly suggest listening to them once the recordings are published.

The state of Spark and what’s next after Spark 2.0 by Ram Sriharsha (Databricks)

As a current user of Apache Spark 1.6 for Spark Streaming, I was very interested in attending this talk. It was a great overview of what was added in Spark 2.0 and what will be added in subsequent releases. Structured Streaming is the part most relevant to me because it could directly impact what I am working on.

How the Washington Post uses machine learning to predict article popularity by Eui-Hong Han and Shuguang Wang (The Washington Post)

This talk covered the Washington Post’s journey in using data science to predict the future popularity of an article. Not only was it a fascinating use case for machine learning, but the speakers also did a great job of walking through their missteps along the way and how they arrived at the model that let them predict article popularity correctly.

Machine Intelligence at Google Scale by Kazunori Sato (Google)

This was my favorite talk at Strata New York. It was jam-packed with information, covering TensorFlow (Google’s machine learning library), Google’s own use cases for machine learning, and Google’s open APIs such as the Cloud Vision API, Cloud Speech API, and Cloud Natural Language API.