Doing Data for Good Right

As an organization focused on using data science to support humanitarian issues, DataKind knows that ethics are of the utmost importance. In this blog, DataKind UK outlines the ethical principles their community has created. Read on and see what you think. Is this a code you’d adopt? What would you add? Join the conversation by leaving comments below and stay tuned for more on this topic.

Ethical principles for pro bono data scientists

Guest blog: Christine Henry and the DataKind UK team

“Ethics is knowing the difference between what you have a right to do and what is right to do.” – Potter Stewart

As data scientists volunteering to help nonprofits, we hope that our work will have a positive impact on those around us. However, in the new frontiers of data science and artificial intelligence, it is sometimes difficult to know what right and wrong look like or what the impact of our work will be. We can all agree that we don’t want to discriminate against people, but we also recognize that, in data science, labelling and categorising types of people and types of behaviour is at the heart of what we do. At DataKind, ethics around data and technology take on an even more critical and serious note when you consider that the projects we work on are often about, and for, the most vulnerable populations in our society.

DataKind’s projects often lead to a nonprofit partner reallocating scarce resources (money, food, or even advocacy and attention), and this may mean that some groups will go unsupported. The “do no harm” principle is not as simple to apply in our work. Our job is often about minimising harm and maximising positive impacts, rather than avoiding harm altogether. And sometimes doing nothing is not necessarily better: a charity’s mission can be furthered by data analysis even if the analytical project is imperfect.

At DataKind UK, we’ve been thinking about some of these tough ethical questions. How do we ensure that the predictive models we build don’t have unintended consequences – and can we ever be sure of that? How can we assess the benefits of implementing an algorithm versus the possible risks? How do we ensure that we don’t allow these ethical challenges to prevent us from taking action when the status quo is worse?

We believe the best way to act ethically as an organisation is to directly confront these hard ethical questions and to support open, frank discussions. With input from our pro bono data scientists, we put together a set of principles to guide these discussions. The principles will help us think about risks within our community and share these concerns with our nonprofit partners. In creating the principles, we focused on understanding potential harms, looking carefully at data context and biases, and being transparent about analysis limits and the reasons for analysis choices.

See the principles we outlined below and learn more about how we crowdsourced these from our community.

How did we do it?

On an evening this past October, we brought together 20 members of our volunteer community, plus a couple of DataKind friends and ethics experts. The brilliant Alix Dunn, Founder and Executive Director of the Engine Room, and conveniently the partner of DataKind UK’s Executive Director, adeptly facilitated the event. For Alix’s reflections on the discussion see here and check out the Responsible Data Forum here.

Rather than start from a blank page, we decided to “seed” the workshop with samples of other related documents that participants could take ideas from or react to. We selected half a dozen sets of principles from different fields and professions (e.g. government, corporations and academia).

Working in small groups, people pulled useful principles out of the sample documents, or hacked their own variants. We also supplied short (anonymised) case studies from past DataKind projects that included possible ethical issues, to help groups think about the real-world application of the principles they were discussing. Lastly, we pulled together everyone’s principles into one shared document which, after some heavy editing, turned into the five principles outlined below.

What’s next?

We will begin rolling out the principles to volunteers starting new projects, and will track the ethical issues raised and how they are resolved. Our volunteer-run Programmes Committee will also look to start building any required processes – for example, a tracking document for issues, identifying someone for volunteers to contact with ethical issues on a project or general level, and updating our existing scoping process to identify ethical issues at an early stage.

This is intended to be a living document for the DataKind UK community and anyone interested in ethical data science. The principles will be updated and adapted as necessary in response to future changes in data science practice; development of ethical standards in the broader data community; and the needs of charities, stakeholders or our community. We hope that the principles we outlined can be an example or starting point for other organisations and data science practitioners as well.

The Principles

As a DataKind UK volunteer, I will strive to adhere to the following principles:

I will actively seek to consider harms and benefits of my work with DataKind UK.

1. I know that data often represents people and misusing it can do harm. In light of DataKind UK’s mission to do data science for social good, I will consider the impact of my work on vulnerable people and groups in particular.
2. I understand that data can be a tool for inclusion and exclusion, and that these effects may be non-obvious and indirect. The output from data analysis can be used in decisions that have disparate impacts on people, including the allocation of scarce resources (e.g. money, food, or even advocacy and attention). I will openly discuss with my team and the charity partner the different potential impacts of this project, including any indirect consequences.
3. In thinking about the impact of my work, I will weigh up the costs of the status quo. What is the cost of doing nothing?
4. I will advocate for fair and accurate representation of my work in public and within charities and their partners.

I will actively seek to understand the context of the data and tools I use.

1. I will look for and interrogate biases in data and collection methods.
2. I will consider built-in assumptions, defaults and affordances of tools, and consider how these may impact my work.
3. I will think about the history of the data and tools we’re using, and I understand that all datasets and tools carry a history of human decision-making. This history also includes choices about the data and people not included.
4. I understand that privacy is not binary, and that context matters for consent and for the expectation of people whose data are available to me.
5. I understand the limits of the stories I can tell with the available data.

I will enable others to understand the data and analysis choices I have made, now and in the future.

1. I will be open and transparent about my choice of data and sources.
2. I will be open and transparent about analysis choices and tools, and the choices made in assessing model and result quality.
3. I will work so that users – including people without data expertise – can use the analyses and tools I work on, effectively and appropriately.
4. I will think about the configurability, sustainability, transparency, auditability, and understandability of my work. I will make stakeholders aware of the limits to these.
5. I will be aware of the time window in which my analysis may be valid, and will share this with stakeholders.

I will actively seek to understand my own limits and the limits of the organisations involved.

1. I will be aware of my own limits and realistic about what I can offer, and what DataKind UK can offer within its different programme formats.
2. I will be aware of the limits of new technology, and I will respect human expertise and incorporate technology into existing human decision-making.
3. I will be alert to possible legal issues and seek out advice and expertise where necessary.

I will debate and discuss ethical choices.

1. I will debate ethics openly and acknowledge that the choices we make are uncertain.
2. I will raise any ethical concerns within DataKind UK, and listen to those of other volunteers. I will acknowledge that other people may make other ethical decisions based on the same information.
3. Where appropriate, I will seek to carry these principles outside of DataKind UK.
