DataDive Recap: Benetech

December 18, 2011

Benetech empowers people around the world by letting them upload human rights data to publicly available servers through a tool called Martus. Activists and organizers worldwide can use this data to battle injustice and strengthen human rights internationally. Data kept in Martus can be shared (public data) or kept private (even from Benetech). Benetech came to us with a simple question: based on the public data, who is using Martus, and what do we have in our databases?

One of the most important tools in data analysis is a lens into what the data looks like. Our DWB + Benetech team put their heads together and developed a set of scripts that automatically generate reports for Benetech about the types of public human rights data uploaded to Martus, and by whom. Using a combination of R scripts and Sweave markup, the team built the reports so that they can be re-run against new data as it flows into the Martus servers. These reports would let Benetech see where and when sudden bursts of activity were happening around the world.

The figure above shows the volume of Martus bulletins, megabytes, new accounts, and attachments posted over time. We can see a few spikes in activity that Benetech may want to explore further. These automatic reports are critical for understanding usage of the tool over time and can be used to alert Benetech to sudden changes in activity.
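To make the idea of alerting on sudden changes concrete, here is a minimal sketch of one way such a check could work. It is written in Python purely for illustration; the team's reports were built with R and Sweave, and this is not their code. The function names, the "three times the median" threshold, and the sample dates are all assumptions made up for this example.

```python
from collections import Counter
from datetime import date
from statistics import median


def monthly_counts(dates):
    """Count bulletins per (year, month) from a list of upload dates."""
    return Counter((d.year, d.month) for d in dates)


def flag_spikes(counts, factor=3.0):
    """Return the months whose count exceeds `factor` times the median month.

    The threshold is a hypothetical rule of thumb, not Benetech's method.
    """
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(month for month, n in counts.items() if n > factor * baseline)


# Made-up upload dates: three quiet months, then a burst in April.
dates = [date(2011, 1, 5), date(2011, 2, 9), date(2011, 3, 1)]
dates += [date(2011, 4, day) for day in range(1, 11)]

counts = monthly_counts(dates)
print(flag_spikes(counts))  # [(2011, 4)]
```

The same counting step could be repeated for megabytes, new accounts, or attachments by swapping in a different measure per record.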

Our team also felt it was important to see which regions were posting the most public information over time. The figure above shows a map of where bulletins were submitted from; darker red circles indicate that more bulletins were posted. We can see hotspots in South America and Asia around conflict regions. If this map were fed live data, we could easily imagine Benetech identifying high-activity regions in near real time.
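The shading on a map like this comes down to counting bulletins per region and scaling the counts. The sketch below shows that step only, in Python for illustration rather than the R the team used. The region labels are placeholders, and the linear scaling to a 0-1 intensity is an assumption; the original map may have used a different scale.

```python
from collections import Counter


def region_intensity(regions):
    """Map each region to a 0-1 shade, where 1.0 is the busiest region.

    `regions` holds one (hypothetical) region label per public bulletin.
    """
    counts = Counter(regions)
    if not counts:
        return {}
    busiest = max(counts.values())
    return {region: n / busiest for region, n in counts.items()}


# Placeholder labels, not real Martus data.
bulletins = ["region_a"] * 4 + ["region_b"] * 2 + ["region_c"]
shades = region_intensity(bulletins)
print(shades)  # {'region_a': 1.0, 'region_b': 0.5, 'region_c': 0.25}
```

A plotting library would then draw one circle per region, using the shade value to set how dark the red is.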

Even though the data from Martus is (intentionally) quite opaque to protect the rights of the users involved, the Benetech + DWB team was able to pull out a huge amount of public information. By creating these automatically generated reports and maps, Data Without Borders has enabled Benetech to understand how its database is changing over time and where there may be early indicators of impending strife. This project is also a great example of the breadth of work that social organizations need. Benetech benefited most not from a handy p-value or a quick chart, but from reusable scripts for data summarization and visualization. These tools will help Benetech see into its data in ways it could not before.