By Nick Sorros, DataKind UK Ethics Committee
Ethics is an increasingly important topic in data science. From Amazon’s sexist hiring algorithm to Google’s racist photo tagging system and Northpointe’s biased recidivism scores, there have been plenty of recent examples of data-driven technologies causing harm. And so, this year DataKind UK created an Ethics Committee, adding to the four other committees that make our work possible.
At DataKind UK, we want to ensure that our projects with social change organisations make responsible use of data, and that our volunteer community of data scientists keeps ethics front and centre in its work. The Ethics Committee’s goal has been to increase awareness of the sometimes complex ethical considerations involved in doing data science projects.
Our first book club session in Edinburgh, talking about automation and self-driving cars.
One of the very first things we decided to do was to set up a data ethics book club. The book club would offer a timely opportunity to explore topics related to data ethics in depth, through books, research papers, newspaper articles, and sometimes videos. We covered topics such as face recognition, fairness, financial inclusion, and gender among others. The book club has grown from 5 people to 50, operating in London, Edinburgh and online. Read about our previous book club discussions and find out about the next one!
The data scientists’ view
While setting up the book club, we also ran a community survey on views relating to data ethics, and found a wide range of opinions. For example, respondents were almost evenly split between those who believed that data and AI will have a positive impact on society and those who thought they won’t. Opinion was also divided on whether AI is moving too fast, with just over half agreeing that it is. Those who work in the social or public sector lean towards the view that AI is moving at the right pace, whereas those in the private sector lean towards the view that it’s moving too fast.
The next thing we did was to run ethics training for core DataKind UK volunteers. We wanted to provide guidance on how to recognise potential ethical issues early on, using EthicalOS as a basis for our discussion. We took a case-study-based approach, drawing on examples from industry, charities, government, and our own past projects.
The EthicalOS.org risk zones
While embarking on this journey of diving deeper into ethics, we realised that there is no shortage of resources to learn from. As we did our own reading, we shared what we came across, and we continuously update this reading list so that it remains a good starting point for someone relatively new to the topic.
We’ve also had the opportunity to speak at a number of events — about our approach on the subject, as well as the dangers to be avoided — including Digital Catapult, the BBC Machine Learning Fireside Chat on AI for Good, Strata, London’s City Hall, Beyond Tech 2019, the IBM/Prezi Crunch Conference, TicTec and Dataiku.
We’ve also begun to tackle some big challenges, for example the challenge of working with free text data. With this type of data, it can be hard (if not impossible) to guarantee that personal identifiers are stripped out. Although it’s fairly straightforward to remove the bulk of names, email addresses, phone numbers, postcodes, and so on, sensitive information may remain. And even if we are able to anonymise the data, the resulting text can still contain highly sensitive information about difficult topics. The Ethics Committee agreed that using free text data shouldn’t be a hard no; the decision should very much depend on the context. The nature of the data, the subject matter, the potential benefit of analysis, and the ability to remove personal data are some of the considerations that need to feed into such a decision.
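To make the “straightforward but incomplete” nature of identifier stripping concrete, here is a minimal sketch of regex-based redaction in Python. The pattern names and the `redact` helper are illustrative assumptions, not part of any DataKind UK tooling, and real anonymisation needs far more than a few regexes:

```python
import re

# Illustrative patterns only: they catch common formats (emails, UK-style
# phone numbers and postcodes) but will both miss and over-match in practice.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d[\d\s-]{8,}\d\b"),
    "POSTCODE": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s*\d[A-Z]{2}\b"),
}

def redact(text: str) -> str:
    """Replace each match with a [TYPE] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

sample = "Contact Jo on email@example.com or 020 7946 0958, SW1A 1AA."
print(redact(sample))
# The structured identifiers are replaced, but the name "Jo" survives:
# spotting free-form names needs named-entity recognition, and even then
# the remaining text can still be sensitive.
```

The name slipping through is exactly the point made above: removing the bulk of identifiers is easy, guaranteeing none remain is not.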
Another longstanding challenge in the area of ethics is the skewed demographic attracted to ethical discussions, and who is not in the room: even though data practitioners are mostly male, attendees at ethics activities are predominantly women. Going forward, we'd like to explore ways to attract more diverse and representative audiences to these discussions.
In 2020, the Ethics Committee will aim to better embed ethics in all our processes, from our light-touch support for social change organisations to selecting which projects we take on and how we run them. We’ll also continue to run the book club, to engage the data community in ethics discussions and learn from each other.
Finally, we're partnering with the Ada Lovelace Institute to combine their research expertise with our hands-on experience of doing data for good, in order to share best practice on using data science in the social sector.
If you’re a social change organisation that needs support, or a data expert who wants to join the discussion, get in touch on firstname.lastname@example.org.