The Importance of Installing Checkpoints Throughout a Project to Maintain Data Ethics Standards

By Afua Bruce, Chief Program Officer, DataKind

This is the second blog post in our ethics blog series. To learn how DataKind defines data ethics and embeds ethical checks throughout the project process, read our introductory blog on ethical and responsible data science practices at DataKind. This post includes many references to the DataKind Playbook, so be sure to create an account in order to view it.

Often, data scientists set out to make a positive impact on their communities, generating insights and analyses to better inform decisions. Despite the best intentions, however, ethical issues sometimes arise that threaten the viability of an entire project. Even at DataKind, where we’re committed to using data science and AI for good, we’re not infallible when it comes to data. That’s why we’ve built regular ethics checks into our DataKind project process, not just at the beginning of the project but at each step along the way. This is essential to prevent or minimize misuse of technology, which could lead to individual and societal harm.

Recently, a team of our pro bono data scientists was working on a project with a nonprofit partner who believed it had clear permission to use a data set; only later in the process did the volunteers discover that the data trail was incomplete. They didn’t have enough information about how the partner organization had obtained the data. The volunteers correctly identified this as a data ethics issue, since improper data access violates data protections for individuals. Rightfully so, the team stopped developing the algorithm to focus on open and mindful communication about next steps. Internal discussions and deliberations focused on analyzing the mistakes made thus far and determining where the project partners and DataKind went wrong.

At DataKind, we know that being “well-meaning” is not sufficient. Rather, intentional ethical review is needed to minimize harm, even if a project is being done “for good.” Therefore, the volunteers reached out to the partner organization to request additional information about the data source. The insufficient responses they received ultimately led DataKind to stop work on the project.

For those involved in the project, the lack of clarity around how the data was initially collected meant they couldn’t confirm people had consented to their data being used by a secondary or tertiary party. Understanding the importance of data privacy and responsible use of data, the volunteers determined the risk of moving forward was too great. And while the discovery of the broken data trail was disappointing, the experience reinforced several valuable lessons, including:

  1. Admit to mistakes – It’s better to walk away from a data science solution or project when ethical issues arise than to see it through despite red flags, otherwise risking damage to your reputation, loss of trust, and technical harm.
  2. Pay attention to the data trail – Before embarking on generating a machine learning model, think through the data being used to train that model. Understand what your data sources are and whether you should have access in the first place. If obtaining the data from a collaborating organization, ensure that they too abide by the same data collection standards.
  3. Conduct ongoing ethics checks – Identify ways to question, challenge, and raise tensions when it comes to data ethics, security, and protecting personal data at regular intervals throughout the lifespan of a project.
  4. Communicate values – Working with partners can be tricky. It’s important to share your values upfront and flag any conflicts that may arise while working together. A willingness to discontinue a partnership for the sake of data protection is yet another way to reinforce your values with the broader community and maintain respect for the communities the algorithms serve.
  5. Do good in an ethical manner – While completing a project might support our mission to do good in the world, and not completing the project carries its own risks, ultimately the cost of harming communities and violating trust outweighs the potential benefits.

While a goal of every project and partnership is to see a positive impact in the way organizations work or communities live, the real impact lies in the quality of work and the commitment to guiding values each step of the way. In learning from this project, we’ve refined our ethical data practices and checkpoints throughout the DataKind project process to do our best to ensure something like this never happens again.

Though some mistakes are inevitable in any project, what ultimately matters most is how they’re addressed so that they can be corrected and avoided altogether in the future. An uncompromising commitment to data privacy and ethical data use ensures DataKind remains a values-based organization and that its solutions have a positive impact in the social sector.

Join us as we continue the conversation with O’Reilly Media about a Case Study: How DataKind Walked Back a Project When Ethical Issues Arose on Tuesday, January 25, at 12pm ET.


As a member of the Executive Team, Afua leads all DataKind’s programming, including its global portfolio of projects, its Impact Practices which address sector-wide challenges, and the Center of Excellence which ensures all projects meet the highest standards. She also oversees DataKind’s global Chapter network and volunteer community.

Header image courtesy of iStock/nadia_bormotova.

Join us in advancing the use of data science and AI to support causes that can help make the world a better place. 

As always, thanks for your support of this critical work!
