Keep Calm and Collect Your Data

November 1, 2013

DataKind Volunteer Katya Vasilaky interviewed Sara-Jayne Terp after her talk at the October DataKind Meetup


Sara-Jayne designs and creates custom systems for Ushahidi, teaches at Columbia University’s SIPA, and is working with Columbia’s Journalism School to help bring sensors into the journalism curriculum. At the same time, Sara manages to lead and improve crisis data management and dissemination for various crisis mapping groups, including OpenCrisis. Crisis maps use online data and can inform agencies and organizations of critical events or needs that they might have missed. This can save lives and speed up recovery from a crisis. I talked with Sara to get an idea of the incredible work that she does and what it takes to make a crisis map happen.

 

What is crisis mapping? What are your first steps to getting a map started when it’s requested?

Crisis mapping is the intersection of crowdsourcing, data science and geographical mapping. It is as much about the people involved as it is about the map: about teaching people who manage disaster information how to build their own maps. When a crisis happens and data is requested, the first things we look for are digital geographic maps; if these don’t exist, mappers begin to develop them online from paper maps and satellite images. Organizations like UNOCHA and FEMA, who coordinate disaster responders, and ACAPS, who provide data analysis and summaries, are primary sources of information in a disaster zone.

Volunteer groups supporting them include Map Action, a British-based volunteer team that builds maps from inside the disaster area; Standby Task Force, a group creating situation maps from online data; Info4Disasters, a group that curates datasets for disaster-prone areas; and the Digital Humanitarian Network, a group that links crisis organisations with crisis data volunteers. In addition to working with these groups, I also help OpenCrisis maintain chat rooms for contributors outside these groups who are creating their own maps and who need individual advice and support. Issues that new mappers need to be aware of include how to talk to agencies, and individual issues like PTSD, which can be triggered by things like entering data about lives that no longer exist; the chat rooms are partly there for us to check on each other and call in volunteer counselors if needed.

 

So once you have a working map up, where does the human data come from, and how is it then mapped?

The next step is to collect field evidence of the crisis from a variety of sources: SMS, tweets, Facebook, online data, and direct reports from affected individuals, diaspora, volunteers, and response agencies. In each country, we deal with different technologies and data sources: in the Philippines, Twitter is a primary source of information, while in Kenya, SMS is a major source. Preparation is important here: in countries where cell phone usage is more prevalent than web use, mappers work to establish a connection to cell phone providers before a known event (e.g. elections) takes place, and countries that have experienced a disaster previously are usually better equipped to assist crisis management. Most recently, India’s disaster agencies and volunteer teams were well-equipped when the latest cyclone hit its coast, because they’d recently been dealing with large-scale floods.

The field evidence is then categorized, analyzed and mapped. Geolocation is often a major effort here: if no latitude and longitude are given for a location, a mapper will begin looking for the location based on the description or name given, which could have a multitude of spellings, often depending on who translated the name and when. The UN has lists of alternative spellings that can be useful here, but sometimes it’s just down to detective work and persistence.
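To give a rough sense of what the first pass of that spelling detective work can look like in code, here is a minimal Python sketch using the standard library's difflib. It is an illustration only, not a tool any of the groups named above are known to use: the `GAZETTEER` dictionary, its approximate coordinates, and the `normalize` and `match_place` helpers are all hypothetical names made up for this example.

```python
import difflib
import unicodedata

# Hypothetical mini-gazetteer: canonical name -> (latitude, longitude).
# Coordinates are rough, illustrative values only.
GAZETTEER = {
    "Benghazi": (32.12, 20.07),
    "Misrata": (32.38, 15.09),
    "Tripoli": (32.89, 13.19),
}


def normalize(name):
    """Lower-case a name and strip accents so spelling variants compare fairly."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold().strip()


def match_place(reported, gazetteer=GAZETTEER, cutoff=0.7):
    """Return (canonical_name, (lat, lon)) for the closest spelling, or None.

    The cutoff is an assumed similarity threshold; a real deployment would
    tune it and still send uncertain matches to a human mapper.
    """
    lookup = {normalize(n): n for n in gazetteer}
    hits = difflib.get_close_matches(
        normalize(reported), list(lookup), n=1, cutoff=cutoff
    )
    if not hits:
        return None
    canonical = lookup[hits[0]]
    return canonical, gazetteer[canonical]


if __name__ == "__main__":
    for reported in ["Misurata", "Bengazi", "Xyzzy"]:
        print(reported, "->", match_place(reported))
```

A report that mentions "Misurata" or "Bengazi" would be matched to the canonical entry, while a name with no close neighbour returns None and is left for a person to track down.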

 

What’s been your fastest map deployment?

For Libya, I had a map up within half an hour. Of course, that’s just the start: the map is dynamic, its contents are enhanced over time, and the information coming in is updated continuously.

 

Do you use your maps and data for prediction? E.g. to predict crises?

We are not at that stage yet. So much work still needs to be done in collecting and standardizing the data. But this is why I teach my course at SIPA—to create a safe space for individuals to learn what data is out there and to learn the technologies they may need to analyze or build upon data.  It seems to be working: two of our students recently won first and second place in hackathons, having just learned about technical projects, which is wonderful.

 

Are you concerned with any biases in your data?

In terms of bias, we can only map the information we receive, but we do rigorously check data, verifying inputs against each other. We cannot know for sure what is missing in a map, but sometimes we can work out where we might be missing data if we know, for example, how many cell towers remain functioning in an area.  We also have to be careful with the data we map in terms of spammers. There are individuals who will post false information to draw attention to themselves, or falsely claim charity donations on behalf of people affected by a crisis.

Right now, all the data processing and verification is manual, although some algorithms are starting to be applied to input feeds. More work needs to be done to automate the detection of both spammers and useful, actionable content within large volumes of data, and many of the techniques needed already exist within data science.

 

Do you have time to test your map products to ensure that they’re working properly and displaying the data that should be displayed?

Where we can, we cross-check our mapped products with data from on-the-ground users before sending them back to the organization that first requested the map. We ask people on the ground to tell us what they see. 

In terms of sensitive information, I call human development the last unregulated sector. Mappers have guidelines on how to handle personal information and information that might be used to identify and target individuals, but not everyone creating a map follows these (which is why OpenCrisis works with mappers outside the main groups).  What data can be shared online is also dictated by each country, and we have to be aware of these protocols on a country-by-country basis.

 

We heard that your cat Emily has taken a break from blogging about crisis mapping. Can I ask what she is up to these days?

For the last few months, Emily has conquered much of the outdoor wildlife near our home, bringing home both live and dead squirrels and chipmunks. This has led to the enactment of the “No Wildlife” policy in the Terp household. When Emily is not making friends with wildlife, she’s likely slumbering on my computer on the letter Q. As a result, I have scaled back on the use of the letter Q.


Katya is a postdoctoral fellow at Columbia University's Earth Institute, where she works both on the use of games, experiments, and social network analysis to encourage the take-up of new agricultural technologies in developing countries and on Bayesian weather simulations used in micro-insurance products for smallholder farmers.