DataKind's Founder & Executive Director, Jake Porway, and Wes McKinney recently caught up in NYC for a chat. Here's some of what they spoke about...
Wes: Early in my career, I was struck (scarred, perhaps?) by how tedious the daily work of a data scientist can be. I started building better data wrangling tools in Python to make myself and others more productive. Building this software and making it freely available for everyone to use has been really satisfying, since people can use the time they save on mundane data manipulation to solve bigger and more ambitious problems. I like hearing stories about how the work of the open source community has empowered individuals to do good with data.
Wes: It’s a mixed bag. One of the best things I’ve read recently on this is the blog post “Artificial Intelligence -- The Revolution Hasn’t Happened Yet” by UC Berkeley professor Michael I. Jordan. The upside of the AI hype cycle is that it’s spurring major investments in computational infrastructure, systems, and new hardware for machine learning and general data processing. The new systems being developed are general purpose enough that even if “deep learning” becomes less popular, we’ll be able to reap the benefits in other areas of statistical computing for decades to come.
Wes: As a wannabe linguist, I like to think that machine translation and natural language processing are helping people better understand each other and make the world feel “smaller”. We don’t quite have a Universal Translator yet, but we are making progress.
Wes: I’m excited to see more in-depth open source collaborations happening between different programming language ecosystems like Python and R. The Apache Arrow open source project was created at the beginning of 2016 to help with this. I’ve founded Ursa Labs in partnership with Hadley Wickham and RStudio to help boost Python-R cross-pollination and to raise funding to build out the Arrow ecosystem.