Data Science for Good Project Scoping: Identifying Your Data-Scienceable Problem

Illustration above by iStock/drogatnev

By Rachel Wells, Senior Manager, Center of Excellence, DataKind

We recently shared news of the launch of the Center of Excellence (COE). As part of our COE blog series on project processes and best practices at DataKind, we’re excited to share this introduction to scoping.

At DataKind, we feel “a problem well-stated is half-solved” (Charles Kettering), which is why our teams spend a significant amount of time working with high-impact organizations to understand their pain points, co-design an ideal end state, and identify the resources needed and available to create a data-driven solution. We find that scoping a project well is the most important element of a “data science for good” project. 

Our scoping process is split into two primary stages: Discovery and Design. We go through each stage with a project partner in order to ensure that each element of the project is co-created. Scoping a project well means that our partners and volunteer teams have a clear understanding of the problem and the pathway to solving it. 


We’re eager to use data science to help advance the social sector, and each potential project starts with the Discovery Stage – where DataKind and a prospective partner work together to identify potential ways we might collaborate. We understand data science might not be the only – or best – solution to a prospective partner’s needs, so this stage is all about mutual fact-finding, brainstorming, and mission alignment. 

At DataKind, we believe there are six components of a successful data science for good project and we look for all six as part of the Discovery Stage. 


The six components of a successful data science for good project 

What Do We Look for in a Partner?

While not everything that makes a great partner can be put into a rubric, we look for clear signs that a partnership would be set up for mutual success. During discovery with a prospective partner, we ask ourselves a number of questions, including:

  • Is there a mutual desire to work together and is there buy-in for a partnership from both organizations? 
  • Is the organization equipped to implement the results of a data science solution that has the potential to create meaningful impact? 
  • Does the prospective partner have mission and values alignment with DataKind’s goal of serving humanity? 
  • Is the organization data mature enough to sustain a project output? 
  • Does the prospective partner have staff with bandwidth to support the project? 

Reflecting on these questions, teams sometimes find that it isn’t the right time to partner with DataKind, and that’s okay. We will happily make an introduction to other data science capacity providers if we think they’re a better fit. 

After a Discovery Conversation, Can We Complete an Impact Map?

One of our favorite tools for discovering a possible project’s mission alignment is what we call an Impact Map – a logic model in which we map potential project ideas to the organization’s processes and theory of change. These can be summarized in Problem Statements that look something like this:

We want to ______(Analysis)______
Using ______(Data)______
So that ______(Behavior Change)______
So that ______(Impact)______

In the Discovery Stage, we often draft multiple Problem Statements based on possible data science projects. The project scope will be refined and defined in the Design Stage.
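For teams that want to keep several draft Problem Statements side by side, the four-part template can be captured as a small record. The sketch below is only an illustration: the `ProblemStatement` class, its field names, and the food pantry example are hypothetical, not part of DataKind's tooling or drawn from a real project.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ProblemStatement:
    """One draft statement following the four-part template (hypothetical helper)."""
    analysis: str          # "We want to ..."
    data: str              # "Using ..."
    behavior_change: str   # first "So that ..."
    impact: str            # second "So that ..."

    def render(self) -> str:
        """Fill the template with this draft's four parts."""
        return (
            f"We want to {self.analysis}\n"
            f"Using {self.data}\n"
            f"So that {self.behavior_change}\n"
            f"So that {self.impact}"
        )

    def missing_parts(self) -> List[str]:
        """Names of template slots that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Invented example content, for illustration only.
draft = ProblemStatement(
    analysis="forecast weekly demand at each food pantry site",
    data="three years of visit logs",
    behavior_change="staff can schedule deliveries ahead of busy weeks",
    impact="",  # not yet agreed with the partner
)
print(draft.render())
print("Still to define:", draft.missing_parts())
```

Keeping the blank slots visible makes it easy to see which drafts still need a conversation with the partner before the Design Stage.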

Prove Us Wrong – Critiquing Our Own Ideas!

Before we fully endorse moving to the Design Stage of a project, which requires several weeks to deep dive into the data and fully scope the project, we make sure to engage in significant internal review and reflection to ensure we’re setting our projects up to be ethical, useful, and intelligent. In this step of the Discovery Stage, we:

  • Consider the possible ethical implications of the project, asking questions about risks and the worst possible project outcomes
  • Research the landscape around each potential project to confirm that no alternative or transferable solution already exists
  • Ensure the partner has data available now or knows of public data that will meet their requirements

If DataKind and our potential partner decide to move forward to a deeper shared understanding of a potential project, we move to the Design Stage of scoping. 


Our Design Stage often involves sharing data, ideas, programs, and outcomes. To ensure our partners are protected during this stage, and comfortable sharing materials with us, the Design Stage kicks off with a mutual non-disclosure agreement and a data-sharing agreement. Once we’ve signed contracts and shared data, the deep design starts. In this stage, DataKind dives into the data and incorporates human-centered design principles to pull together a coherent and detailed project design.

What’s in the Data? Completing a Detailed Data Audit

The Data Audit is how we determine what solution is possible. With data security requirements and a data management plan in place, we dive into the data to explore all the different end-state possibilities. With a deeper understanding of what’s realistic with the data we have, DataKind is able to:

  • Determine if the proposed project ideas are feasible 
  • Refine the Problem Statement
  • Define technical methodology
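As a rough picture of what a first pass over shared data might check, here is a minimal sketch that counts missing and distinct values per column. It is an assumption about what an early audit step could look like, not DataKind's audit procedure; the `audit_columns` function and the sample rows are hypothetical.

```python
def audit_columns(rows):
    """Summarize each column across a list of dict records (hypothetical helper).

    Treats None and empty strings as missing; a key absent from a record
    also counts as missing for that record.
    """
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        summary[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return summary


# Invented sample records, for illustration only.
sample = [
    {"id": 1, "zip": "10001"},
    {"id": 2, "zip": ""},
    {"id": 3},
]
for column, stats in audit_columns(sample).items():
    print(column, stats)
```

A summary like this can help show whether a proposed analysis is feasible, for example when a field the Problem Statement depends on turns out to be mostly empty.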

Ensuring Ethical and Responsible Project Design

There are many essential ethical elements to incorporate into our project design. We’re still learning and improving our processes around ethical data science for good project design, but some of the steps we always take to ensure ethical design include:

  • Conduct project risk and ethical assessments for each project idea
  • Evaluate possible data inclusion risks and create an associated mitigation strategy
  • Define accountability to the communities affected by the work 
  • Establish project-specific responsible data science ethical standards for analysis
  • Create a pathway for evaluating the ethical implications of any end products
  • Identify who should weigh in on project ideas as advisors and subject matter experts and incorporate their feedback

(Please note this list is just a starting point and far from exhaustive. The Center of Excellence will be back on the blog to share more about our ethical data science practices and processes in 2021.) 

Identifying a Clear Pathway to Success

To move forward, we go back to our Impact Map and the updated Problem Statement. We work with our partner to develop success metrics for the project and agree on a plan for how we will gather these metrics and measure impact. We draft a pathway to adoption and a sustainability plan for the organization post-project, checking that the project is realistically sustainable. We agree on any software and tools needed for the project to be successful and a path for implementation.

With these defined, we’re ready for all parties to agree on clearly defined deliverables and decide whether to pursue the project. A final services agreement confirms that both parties are on board with the action plan and ready to execute on the project!

What Do You Think?

All this said, we’re always refining and improving our processes, and this is what DataKind’s Center of Excellence is all about. This introduction to scoping summarizes the process that’s outlined in detail in the coming DataKind Playbook, which we’ll share in Fall 2021. Share your feedback and ideas for how we can improve our scoping process by reaching out to Rachel Wells at the Center of Excellence at   



Rachel Wells is the Senior Manager of DataKind’s Center of Excellence, where she’s responsible for overseeing project and program quality, ethics, learning, impact, and measurement. Rachel is a former data scientist and was a longtime DataKind volunteer before joining the team full time.
