My Experience at a $200 Machine Learning Conference

When I saw that the head of research at Google, an artificial intelligence researcher from Facebook, the lead machine learning engineer at Meetup, the CTO of Kaggle, and an eBay data scientist were among the talented people speaking at Machine Learning Conference 2017 (MLC), I signed up immediately. I was so afraid of missing out that I bought my ticket first at $217.27, and only thought to ask about special student pricing while looking at my receipt. My final total came out to $147.77.

When I got there, I could see why I had paid so much money: the whole event was held at 230 Fifth Ave. Rooftop, with breakfast and lunch included. One piece of advice: if it’s your first time going to a conference like this, I highly recommend getting a lot of sleep and not going with the flu like I did, because not everything will interest you. The day was packed with very dense information, and if you are new to data science like I am, it will require a lot of mental stamina.

I spoke to quite a few of the businesses sponsoring the event, and several were really eager to speak with me. My first stop was SigOpt, where I spoke with CEO Scott Clark. He described their product as “an optimization service that uses machine learning parameters and hyper-parameters to try to force it to grid solve”. With more research, I understood that they offer a feedback loop for tuning: you train a model, get a result, adjust the model’s settings, and retrain until you have maximized your output. Essentially, “it allows data scientists to build better models with less trial and error”.
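To make that feedback loop concrete, here is a minimal sketch in Python. It uses plain random search, which is my own stand-in and not SigOpt’s actual method (the conversation did not cover how their service picks the next settings). The scoring function, the parameter names, and the ranges are all hypothetical; in real work the score would come from training a model and checking it against held-out data.

```python
import random

def validation_score(learning_rate, depth):
    """Hypothetical stand-in for 'train a model and measure how good it is'.

    A made-up smooth function that peaks at learning_rate=0.1, depth=6.
    """
    return 1.0 - (learning_rate - 0.1) ** 2 - 0.01 * (depth - 6) ** 2

def tune(n_trials=200, seed=0):
    """Feedback loop: propose settings, score them, keep the best so far."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Propose a new set of hyper-parameters (here: purely at random).
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),
            "depth": rng.randint(1, 12),
        }
        # "Retrain" and look at the result.
        score = validation_score(**params)
        # Keep the settings only if they beat everything tried so far.
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

best_params, best_score = tune()
print(best_params, best_score)
```

An optimization service would replace the random proposal step with something smarter, so that it needs fewer rounds of trial and error to find good settings.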

Another interesting conversation that I had was with a recruiter from eBay. eBay, another big name in the data science world because of the massive amounts of information it works with, is hiring research engineers. Mar said, “eBay is looking for people who are smart, innovative, and driven.” When I asked him what level of experience hired applicants had, he offered, “most come with a lot of experience”. While his answer was fairly vague, it is worth noting that the qualifications for being a data scientist in general are fuzzy. If you can combine your skill set into a spectacular portfolio of case studies, then you will probably get a job. At that point, I felt that it was time to hear some of the panelists.

I had two favorite talks. The first was from Ross Goodwin, who focused on “narrated reality”. He passionately described how he used machine learning to create art. He showed us words and had us guess whether they were real or computer-generated gibberish, and proved that it was pretty hard to tell the difference. He used this to create a Twitter bot: people could tweet a word at it and receive a response with a gibberish meaning. Eventually, Twitter removed his bot. “To me, it felt magical and it felt like Twitter killed my pet,” Goodwin said.

My second favorite talk was by Claudia Perlich. When talking about human predictability, she said, “in two weeks in 2012, across hundreds of different models, we saw a doubling of the median lift and, no, it does not mean the end of free will.” The median lift she is talking about is a measure of how predictable humans are: how well data scientists can guess your traits based on your decisions or, in this case, your clicks. Her conclusion was that bots got smart. These bots browse as if they were real-life users to help the websites that pay for their services achieve a certain ranking. Prediction models were working not only with human behavior but also with bot behavior, and this cluttered our browsing data. She wrapped up by saying that the people creating algorithms have an ethical responsibility to consider what happens when predictability fails, because these models are so ingrained in our lives. Some examples she gave were predicting job success from your LinkedIn profile, where policing resources get allocated, and what ads would be most likely to show up next to your name.
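For readers new to the term, lift is commonly defined as how much more often the people a model picks out respond compared with the population as a whole. The sketch below uses made-up click data; the function name, the three “models”, and every number are hypothetical and do not come from Perlich’s talk.

```python
from statistics import median

def lift(clicked, targeted):
    """Response rate among targeted users divided by the overall response rate."""
    overall_rate = sum(clicked) / len(clicked)
    targeted_clicks = [c for c, t in zip(clicked, targeted) if t]
    targeted_rate = sum(targeted_clicks) / len(targeted_clicks)
    return targeted_rate / overall_rate

# Made-up data: 1 means the user clicked (or was targeted), 0 means not.
clicked = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 3 of 10 users clicked

# Each hypothetical model chooses which users to target.
models = {
    "model_a": [1, 1, 0, 1, 0, 0, 0, 1, 0, 0],  # targets 4 users, 3 clicked
    "model_b": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],  # targets 5 users, 1 clicked
    "model_c": [0, 0, 0, 1, 0, 0, 0, 1, 0, 1],  # targets 3 users, 2 clicked
}

lifts = {name: lift(clicked, targeted) for name, targeted in models.items()}
median_lift = median(lifts.values())
print(lifts, median_lift)
```

A lift above 1 means the model finds clickers better than picking users at random would, and a lift below 1 means it does worse. If bots start behaving in very predictable ways, lift can jump without humans having become any more predictable.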

Unlike other MLC headliners, Goodwin and Perlich were amazing not because they were trying to sell a product or convince us that their product was the best, but because they introduced ideas that I felt challenged the audience to think of data in a different way.

Don’t get me wrong, though: many of the other panelists had very exciting and important things to say, and they were often fairly good presenters, but the event felt too sponsor-focused. The sponsors bought the coffee, spoke about their companies, promoted themselves, and tried to sell us their products.

As someone who is not so involved in the industry yet, I would reconsider spending so much money on one event.

 

The Push: What I Learned About Data in Startups

Many new businesses are relying on the wealth of data available to them to find new markets and develop targeted products. As data scientists, it is important to understand how startups are innovating ways to utilize data and how we can add to their current systems to improve them. Entrepreneur.com calls the current wave of new businesses the “golden age of startups”. In other words, they’re going to need a lot of processing power from data scientists to succeed.

The Push: Predicting the Future of Artificial Intelligence

When people think about the future of artificial intelligence (AI), they generally fall under two schools of thought. “The first one,” says guest speaker Dr. James Fan, “is that robots will start taking over the world and kill people and the other generally says that AI is awesome. The reality is that AI researchers don’t think any of those two ways. There is a big disconnect.”

Education and Data Science: Link Roundup

  • Math has long been overlooked by disciplines focusing on soft skills, but this article summarizes a report on why “taking a cross-disciplinary approach to higher education will help prepare graduates to be not just workers, but productive overall citizens”.
  • Students at Georgia Tech created a student-run organization that helps other students explore the world of data science.
  • Are you trying to learn Python for data science? This article outlines a plan for how to learn data science in a month if you have had prior exposure to programming (taken your intro to programming course in college or equivalent).
  • The Huffington Post published an article about a new desktop application that is helping make data analysis accessible to the masses.
  • Are you thinking about getting a Master’s in data science? Forbes just created a list of their top 6 programs they’d recommend.
  • How students are using their own data collection devices to prevent the censorship of public city data.
  • Hackers gather to deliver public data to communities so their residents can be more connected.

When Data Scientists Go Hiking

Until now, all my experiences leading up to Hike with a Data Scientist have meant going to some organized indoor event where one speaker or a small panel speaks to a larger group about a topic relating to data science. Pitched as a way to “have fun, make friends, talk about data, and help your hiking buddy with his or her data questions,” Hike with a Data Scientist seemed like a welcome deviation from the norm.

3 Actionable Steps Forward to Protect Public Data

As you may have seen in my post about Open Data Week 2017, Trump announced that his administration would slash funding to the Environmental Protection Agency by 31%. In an article by the Gothamist, the impacts of these changes are outlined relating specifically to New York. Two notable changes from this article that intersect with data science include:

A 50% reduction in the EPA’s Office of Research and Development.

 

The total elimination of funding for the Clean Power Plan, international climate change programs, climate change research and partnership programs, and “related efforts,” totaling $100 million.

-Gothamist


Saving Our Data: Open Data Week 2017

The weekend of March 4th, 2017, was Open Data Weekend, when people all over the world gathered to discuss the latest news and innovations of the Open Data movement.

Locally, I attended the New York City School of Data event hosted by BetaNYC, where data enthusiasts, data scientists, students, activists, and more came together to exchange news on projects and current events. The event was organized by Noel Hidalgo, the founder and executive director of BetaNYC, a nonprofit organization dedicated to using technology and data to improve the lives of New Yorkers.

Alisha Austin, a speaker at the New York City School of Data event who works at the Mayor’s Office of Operations, described the importance of data science in politics: “when I think of data science, I think of looking at patterns”. She sees data science as “really important to bring improvement to systems”. Although she admits there’s a lot about data that she’s still learning, she believes “when people have ownership of their data they can improve their situation.”

To better understand the importance of Open Data Weekend, it helps to understand a little history of the relationship between government and data.

Data in the News

In case you missed it…

  1. Open Data Week overview in New York City [progrss.com]

The first week of March was Open Data Week, an internationally celebrated event where people and organizations get together to discuss the importance of keeping data accessible to the public. Check in to see what events happened near you and how you can get involved.

  2. White House budget cuts for Environmental Protection Agency put data collection and historical records in danger [Washington Post]

The Republican spending plan was released this month, which ultimately increases military spending and slashes funding for public services. Activists opposed the federal spending cuts to the EPA by organizing data rescues that make digitized backups of public records.

  3. Librarians and ‘hacktivists’ work to protect and secure federal data [Houston Chronicle]

The Houston Chronicle documented the community heroes working to protect our data from deletion.

  4. Google will buy Kaggle, the online community that runs competitions for data scientists [TechCrunch]

Google, the search engine company, establishes its presence in data science, another area related to the gathering and understanding of information.

  5. Yale will join other large universities in the competition for providing a post-grad data science program [Yale News]

New York University, Harvard, and Columbia already have Master’s programs in data science. The question is whether they will remain successful under competitive pressure from bootcamps.

  6. Back to Basics: The Wall Street Journal defines the role of a data scientist [The Wall Street Journal]

Always good to know what the public thinks the role of data scientists is.