Kush Varshney

IBM Researcher

Yorktown Heights, NY

Kush teamed up with Brian Abelson to help DataCorps partner GiveDirectly transform the way they serve their communities and then presented their project at KDD 2014. 

What’s your day job?

I'm a researcher in the Mathematical Sciences and Analytics Department at the IBM Thomas J. Watson Research Center in Yorktown Heights, NY. I apply data science and predictive analytics to human capital management, salesforce management, healthcare, and public affairs. I also conduct academic research on the theory and methods of statistical signal processing and learning.

Tell us about your work with DataKind.

We helped GiveDirectly, an organization that efficiently delivers unconditional cash transfers to the extremely poor, improve its operations. The first step in giving donations to the extremely poor is locating them. This step is not so straightforward in rural regions of countries like Kenya and Uganda because data on the poverty level of individual villages does not exist. GiveDirectly was previously sending staff members out to regions of interest to conduct manual censuses, which was very time-intensive.

We developed a remote sensing approach leveraging the fact that among rural households, the extremely poor tend to live in households with thatched roofs instead of metal roofs. We estimated the proportion of homes with thatched roofs in a village by applying image processing and machine learning techniques to satellite images. GiveDirectly then prioritized villages with the highest proportions of thatched roofs for cash transfers. 
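As a rough illustration of the final step only (not the project's actual pipeline), the Python sketch below assumes some upstream image classifier has already labeled each detected roof as "thatched" or "metal". The village names, label strings, and function names are hypothetical. It computes the thatched-roof proportion per village and ranks villages from highest to lowest.

```python
from collections import Counter


def thatched_proportion(roof_labels):
    """Return the fraction of roofs labeled 'thatched', or None if no roofs were detected."""
    counts = Counter(roof_labels)
    total = counts["thatched"] + counts["metal"]
    if total == 0:
        return None  # no usable detections for this village
    return counts["thatched"] / total


def rank_villages(labels_by_village):
    """Sort villages by thatched-roof proportion, highest first, skipping villages with no detections."""
    scored = {v: thatched_proportion(labels) for v, labels in labels_by_village.items()}
    scored = {v: p for v, p in scored.items() if p is not None}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-roof labels, standing in for the output of an image classifier.
example = {
    "village_a": ["thatched", "metal", "thatched", "thatched"],
    "village_b": ["metal", "metal", "thatched", "metal"],
    "village_c": [],
}

ranking = rank_villages(example)
for village, proportion in ranking:
    print(f"{village}: {proportion:.2f}")
```

In this sketch a village with no detected roofs is left out of the ranking instead of being scored as zero, since a missing detection is not evidence of metal roofs.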

What inspires you to use your data skills for good in your spare time?

I feel that all people should be afforded a chance to reach their potential, no matter their background or circumstance. If an expertise analytics algorithm I have developed can highlight a pearl of an employee within a company of 425,000 individuals, affording him or her a chance for career success, or if a remote sensing algorithm I have developed facilitates an unconditional cash transfer, giving a family a chance to escape a poverty trap, I feel that I have done machine learning that matters.

What is one of the most surprising things you've learned or seen in working with data?

In a lot of data work that I've contributed to, the insight hasn't been overly surprising: strategic outsourcing helps companies financially for a while before the effect wears off, employees underpaid compared to peers voluntarily resign more than others, new hires from college have a longer ramp-up time to full productivity than experienced new hires, serving first in a tennis match is an advantage, prescription drug claims help predict medical procedure claims, salespeople on commission-based compensation plans microblog less and with less helpful content than salespeople not on commission, and a history of bill cosponsorship is a good predictor for the votes of Congresspeople. The value of a data-driven approach has been in quantifying how long or how much, which is often just as transformative as a surprising or paradigm-changing insight.

What’s the most interesting or visually striking data project you’ve seen recently?

My GiveDirectly project partner Brian Abelson's company recently put up a nice visualization of daily temperature anomalies across the country over time.

What does someone getting started with data science need to learn?

Less is more. Simple models usually generalize better than more complicated ones. Simple visualizations are usually easier to understand than more cluttered ones.

Fancy stuff is only rarely required, and because of this, it is most important to understand the core problems and ask the right questions.

Acquiring, cleaning, joining, and preparing data is most of the battle. Dirty data can sometimes even reverse conclusions, as we found with the flavor pairing hypothesis in Medieval European cuisine.

Who are your top 3 favorite people you follow on Twitter?

@sadatshami for social business, analytics, and success guides

@lrvarshney for cognitive cooking, connectomics, information theory, and mathematical models of society

@erikbryn for understanding work, progress, and prosperity in a time of brilliant technologies

Lav, by the way, is my twin brother. 

What did you eat for breakfast?

Honey Nut Cheerios with warm milk, just like most weekdays in the last 25 years.