DataKind DC DataDive Projects
March 11, 2013

We’re buzzing with excitement over at DataKind about the 6 excellent projects we’re scoping out for next weekend’s DataDive with the World Bank, UNDP, QCRI, and others! Each project focuses on using creative data techniques either to alleviate poverty or to combat fraud and corruption in international development projects. Granted, we’re not looking to fully solve these problems in one weekend, but we’ve identified 6 projects that could make a huge difference.

We’ve listed each project’s goal, the data available, and the skills each could use. Anyone with any skill set will be useful on any project (seriously!), so the skills section merely lists specific abilities that would be especially useful for each project.

Please note: these are rough project descriptions, and the DataDive is an experimental pilot. Project descriptions do not necessarily correspond to Bank priorities or reflect the current work underway in the relevant themes.

Measuring Socioeconomic Indicators in Arabic Tweets (Poverty)

Goal:
Determine whether socioeconomic indicators can be identified by observing conversations in Arabic on Twitter.  Examples include listening for poverty terms or human development phrases such as “no medicine”, “bankrupt”, or “bad education”.

Datasets Available:
• 10GB of Arabic tweets from February 2012 onward
• An English to Arabic translation of key socioeconomic terms

Skills:
• Arabic fluency (translators may be available)
• Natural language processing
• Time series analysis
• Data processing skills (up to 10GB)
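
As a rough illustration of the goal above, a first pass might simply count, per day, how many tweets mention each term of interest. The sketch below is only a starting point under stated assumptions: the `(date, text)` input format, the `daily_term_counts` helper, and the English sample terms are all hypothetical, and the real data would use the Arabic translations of the key terms. Plain substring matching is also a simplification; Arabic text would likely need normalization (optional diacritics, letter variants) before matching.

```python
from collections import Counter
from datetime import date

# Hypothetical term list for illustration; the real list would come from
# the English-to-Arabic translation of key socioeconomic terms.
TERMS = ["no medicine", "bankrupt", "bad education"]


def daily_term_counts(tweets, terms):
    """Count, per (day, term), how many tweets contain that term.

    `tweets` is an iterable of (date, text) pairs (an assumed format).
    Matching is a case-insensitive substring test, a deliberate simplification.
    """
    counts = Counter()
    lowered = [term.lower() for term in terms]
    for day, text in tweets:
        body = text.lower()
        for term in lowered:
            if term in body:
                counts[(day, term)] += 1
    return counts


# Made-up sample tweets, in English for readability.
sample = [
    (date(2012, 2, 1), "There is no medicine at the clinic"),
    (date(2012, 2, 1), "Shop went bankrupt, and no medicine either"),
    (date(2012, 2, 2), "Bad education is holding us back"),
]

counts = daily_term_counts(sample, TERMS)
for (day, term), n in sorted(counts.items()):
    print(day.isoformat(), term, n)
```

The resulting per-day counts form a simple time series for each term, which could then be compared against known socioeconomic indicators.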

Combining and Analyzing the World Bank’s Project Data for 'Signals' (Fraud and Corruption)

Goal:
To combine all open information about a single World Bank project into one source to identify signals in the data. Questions include - ADD SPIEL ABOUT CLUSTERING CONTRACTORS ETC., MAY WANT TO CITE THIS EXAMPLE http://europeandcis.undp.org/blog/2013/01/31/big-data-and-development-organizations-what-happens-when-you-move-from-theory-to-practice/

Datasets Available:
• The open data on data.worldbank.org.
• An initial combination of some of the data provided by our Data Ambassador Taimur, who worked on the project during Open Data Day.

Skills:
• Data wrangling
• Exploratory analysis skills

Analyzing World Bank Supplier Profiles (Fraud and Corruption)

Goal:
To analyze detailed profiles of World Bank suppliers to better understand their relationships and identify potential for fraud in contracts. Automated methods could be developed to, for example, identify companies whose phone numbers map to uninhabited regions, that share a phone number or address with entities known to be high risk, or that bid together on multiple projects.
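
Two of the checks mentioned above, shared phone numbers and repeated co-bidding, can be sketched in a few lines. This is a minimal illustration, not a description of how the Bank's data is actually structured: the supplier names, phone numbers, record layouts, and the helper functions `shares_phone_with_high_risk` and `co_bidding_pairs` are all made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def normalize_phone(raw):
    """Keep digits only so differently formatted numbers compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())


def shares_phone_with_high_risk(suppliers, high_risk):
    """Return suppliers that share a phone number with a high-risk entity.

    `suppliers` is an iterable of (name, phone) pairs (an assumed format);
    `high_risk` is a set of names already known to be high risk.
    """
    by_phone = defaultdict(set)
    for name, phone in suppliers:
        key = normalize_phone(phone)
        if key:  # skip records with no usable phone number
            by_phone[key].add(name)
    flagged = set()
    for names in by_phone.values():
        if names & high_risk:
            flagged |= names - high_risk
    return flagged


def co_bidding_pairs(bids, min_projects=2):
    """Return supplier pairs that bid together on at least `min_projects`.

    `bids` maps a project id to the set of suppliers that bid on it.
    """
    together = defaultdict(int)
    for bidders in bids.values():
        for pair in combinations(sorted(bidders), 2):
            together[pair] += 1
    return {pair: n for pair, n in together.items() if n >= min_projects}


# Entirely fictional sample data.
suppliers = [
    ("Acme Ltd", "+1 (202) 555-0100"),
    ("Bolt Co", "1-202-555-0100"),
    ("Cedar Inc", "+1 202 555 0199"),
]
high_risk = {"Acme Ltd"}
bids = {
    "P1": {"Acme Ltd", "Bolt Co", "Cedar Inc"},
    "P2": {"Acme Ltd", "Bolt Co"},
    "P3": {"Cedar Inc"},
}

flagged = shares_phone_with_high_risk(suppliers, high_risk)
pairs = co_bidding_pairs(bids)
print(flagged)
print(pairs)
```

Flags like these are only signals for a human to review; a shared phone number or frequent co-bidding can have perfectly legitimate explanations.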
