“When you’ve got amazing volunteers that work with Crisis Text Line to better understand teens in distress, with DonorsChoose.org to analyze teachers’ needs across the U.S., or with Amnesty International to predict and prevent human rights abuses, it is all too easy to forget what they do for a day job,” said Jake Porway, DataKind’s founder and executive director. “But let me tell you, it’s no less impressive.”
At the recent DataKind NYC Meetup, we got a glimpse of a day in the life of four volunteers and their fascinating careers, learning how data science is being used for everything from curing cancer to fighting poverty in Tanzania. We also got a sneak peek at Civic Hall, DataKind’s beautiful new home here in New York City.
What physics and internet problems have in common.
Meet Jon Roberts, vice president of data science at about.com. Founded in 1996, about.com is a veteran company and yet, before Jon came on board, no one had looked at about.com’s data. A trained physicist, he went to work: “Physics and internet problems are really very similar.” Who would have thought? “It’s all about extracting meaning from messy time series data.”
And about.com’s data can give us some deep insight into what motivates us. For example, searches for “weight loss” on the site apparently spiked after 9/11. Jon offered one hypothesis: “If reminded of their own mortality, people want to get fit.” The effect does not seem to last, though. “It only lasts for about a month.”
In his spare time, you can find Jon mapping NYC open data or winning global prizes at NASA’s International Space Apps Challenge for making data more accessible by tagging datasets at NASA and across the federal government.
Anyone who’s anyone is into sexy random forests.
Meet Ana Areias, a data scientist who works on poverty prediction in Tanzania, a project in collaboration with the World Bank. It’s critical for the World Bank to have up-to-date poverty measures in order to direct resources effectively, but its current methodology, which relies on lengthy surveys, is expensive and time-consuming. By the time the surveys have been sent out, filled out, and sent back, the data is already out of date.
Ana is working to predict poverty in (almost) real time, using statistical models that range from the economist’s workhorse, backward stepwise regression, to the sexy random forest. “Everyone is all over them these days,” said Ana. We can’t wait to hear more once she cracks the code!
Guacamole can help cure cancer. Wait, what?
Meet Arun Ahuja. When he is not volunteering his skills for the greater social good at DataKind, he helps cure cancer at Mount Sinai. As a biomedical software developer at Mount Sinai Hospital, he develops software and tools for genomic analysis. Much of his work has focused on building a variant caller on Spark to discover cancer mutations and on analyzing the effect of those mutations on therapeutic response.
The guacamole he was talking about is actually a framework for identifying DNA mutations from sequencing data. Since DNA mutations can result in cancer, understanding the mutation profile of tumor DNA can help inform effective cancer treatments and save lives.
A method to the madness.
And last but not at all least, meet Tim Rich, director of data science at Publicis North America. A natural performer, Tim jogged back and forth across the stage, covering a lot of ground both physically and philosophically!
As a trained sociologist, Tim shared his love for methodology. “Data science needs more of it,” he told the audience, referencing Marx, Weber, and Durkheim (names I did not expect to hear at a data science Meetup). “We need to study ideal types.” What? “Ideal types are constructs that help us put social reality in order. They are abstractions arrived at by concentrating on the most central characteristics of social phenomena.” Pretty “out there” stuff. Why do we need it? Why do we need methodology? Because data science is about people embedded in social reality. “Data is generated by and about people. To understand data, we need methods to think about people in their embeddedness.”
It’s clear DataKind’s volunteers are embedded in tons of fascinating projects when they’re not busy saving the world on their pro bono projects.
To get a peek into one of these projects, check out a recent one that Jon, Ana and Tim worked on together to help GlobalGiving maximize donations on its site in order to fund worthy humanitarian organizations.
Finally, sign up for the DataKind NYC Meetup so you don’t miss out on the next event!