By DataKind DC
The Coronavirus Aid, Relief, and Economic Security (CARES) Act and related pandemic spending represent the biggest bailouts in U.S. history: Congress approved nearly $4 trillion in pandemic relief spending in 2020. But tracking where CARES Act and other federal stimulus funds have gone isn’t as easy as one would think.
The U.S. Small Business Administration (SBA) released Paycheck Protection Program (PPP) data only on loans of more than $150,000, and news organizations have filed lawsuits to get access to all details of how the taxpayer funds were spent. Additionally, the data that the SBA did release in 2020 was incomplete and contained inconsistencies. Without demographic data, it’s difficult to know whether minority-owned businesses or low-income households, which have been deeply impacted by the pandemic, received their fair share.
To track federal COVID-19 stimulus spending, journalists need access to data scientists to support their investigative reporting. In July 2020, the National Press Foundation (NPF), a nonprofit organization that trains journalists to use the latest reporting tools and techniques, joined forces with DataKind DC to host an investigative data journalism training course for a select group of journalists. The goal was to enable these journalists to better track spending from the CARES Act.
“It’s more important than ever that journalism be powered by data analysis, but investigative data journalism is time-consuming, expensive, and hard,” said NPF President Sonni Efron. “In today’s disastrous financial climate, many newsrooms are slashing staff and budgets. Only the very largest news organizations have the resources to hire a data professional to help them with their reporting. But the accountability stories being reported by local and regional reporters are critical to maintaining our democracy. That’s why the partnership between DataKind and the National Press Foundation to train and empower journalists is so important.”
Read on to learn more about the partnership between NPF and DataKind DC and what they have in the works for 2021.
NPF approached DataKind DC to match these journalists with DataKind’s data science volunteers. DataKind DC Chapter leaders Rich Carder and Dan Kheloussi led this unique partnership, working with volunteers to clean, enhance, and synthesize relevant data sets, while also pairing volunteers with individual journalists to provide in-depth analysis for the specific leads the journalists were pursuing. In particular, volunteer experts produced a set of databases designed to make it easier for journalists to track the SBA’s PPP data, which was missing critical information and had numerous inconsistencies.
One of the lead data science volunteers, John McCambridge, worked alongside other volunteer data experts to create a clean and complete data file to act as the foundation for the work of other volunteers and journalists. This data file also integrated industry classification and Census data, improving the depth and breadth of insights beyond what was possible within the original data alone.
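As a rough illustration of what integrating industry classification into loan records can look like, here is a minimal Python sketch. It is not the volunteers' actual code: the field names (`business`, `naics_code`), the sample records, and the small sector lookup are all assumptions made for the example, and it maps only the first two digits of a NAICS code to a sector name.

```python
from typing import Dict, List

# Hypothetical lookup from 2-digit NAICS sector code to sector name.
# Only a few sectors are listed here for illustration.
NAICS_SECTORS: Dict[str, str] = {
    "61": "Educational Services",
    "62": "Health Care and Social Assistance",
    "72": "Accommodation and Food Services",
}


def enrich_loans(loans: List[dict], sector_names: Dict[str, str]) -> List[dict]:
    """Return copies of the loan records with a 'sector' field added."""
    enriched = []
    for loan in loans:
        # Missing or blank codes are common in messy data; treat them as unknown.
        code = (loan.get("naics_code") or "").strip()
        sector = sector_names.get(code[:2], "Unknown")
        enriched.append({**loan, "sector": sector})
    return enriched


# Made-up sample records, not real PPP data.
loans = [
    {"business": "Example Diner LLC", "naics_code": "722511"},
    {"business": "Example Academy", "naics_code": "611110"},
    {"business": "No Code Inc", "naics_code": ""},
]

enriched = enrich_loans(loans, NAICS_SECTORS)
for row in enriched:
    print(row["business"], "|", row["sector"])
```

Keeping records with missing codes and labeling them "Unknown", rather than dropping them, leaves the gaps visible to a reporter who may want to ask why the information is absent.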
John and the team also developed simple tools to accelerate the discovery of potentially interesting records (e.g., lists of loans issued to different companies sharing identical addresses, and maps showing clusters of loans within specific areas), and developed materials and tools to ensure the data and its challenges could be quickly and fully understood by journalists, allowing them to focus on their investigative work. "It was a deeply rewarding experience to work alongside such a talented, diverse, and motivated team of volunteers," he shared. The volunteers also hosted a training session, led by John, to explain the potential of the data and tools. One of the tools, a dashboard for exploring the loan amounts by congressional district, which includes additional demographics and election data, can be accessed here.
Source: DataKind DC
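One of the checks mentioned above, listing loans issued to different companies that share an identical address, could be sketched roughly as follows. This is an illustrative Python example under stated assumptions, not the tool the volunteers built: the field names (`business`, `address`), the sample records, and the simple normalization rule (uppercase, strip punctuation, collapse whitespace) are all hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List


def normalize_address(address: str) -> str:
    """Uppercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", address.upper())
    return " ".join(cleaned.split())


def shared_addresses(loans: List[dict]) -> Dict[str, List[str]]:
    """Map each normalized address to its business names, keeping only
    addresses used by more than one distinct business."""
    names_by_address = defaultdict(set)
    for loan in loans:
        key = normalize_address(loan["address"])
        names_by_address[key].add(loan["business"].strip().upper())
    return {
        address: sorted(names)
        for address, names in names_by_address.items()
        if len(names) > 1
    }


# Made-up sample records, not real PPP data.
sample = [
    {"business": "Alpha Holdings LLC", "address": "100 Main St., Suite 2"},
    {"business": "Beta Ventures Inc", "address": "100 MAIN ST SUITE 2"},
    {"business": "Gamma Cafe", "address": "55 Oak Ave"},
]

flagged = shared_addresses(sample)
for address, names in flagged.items():
    print(address, "->", names)
```

A shared address is only a lead, not evidence of wrongdoing: office parks, co-working spaces, and registered-agent addresses legitimately host many unrelated businesses, so each flagged record would still need reporting to confirm.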
Partnering with Reporters
With the power of the tools DataKind volunteers created, the journalists then partnered with volunteer data scientists to produce several important investigative articles tracking loans for fraud and abuse. John worked with Jay Cridlin from the Tampa Bay Times to publish a piece about a businessman who received 10 different PPP loans. “I know that it wouldn’t have happened had I not been a part of the National Press Foundation’s workshop and been paired with a great team at DataKind,” shared Jay Cridlin.
Another DataKind volunteer, Nandini Nadkarni, partnered with Craig Harris of the Arizona Republic for a story on charter schools accepting PPP money. Craig said of the partnership: “Through the PPP seminar, I have been paired with an incredible data scientist.” For Nandini, the collaboration built a number of valuable skills: “Collaborating with Craig was a great experience! I learned that active listening and asking for feedback were as important as data analysis. This project was more than data manipulation and synthesis; it provided me with the opportunity to interpret and find answers to my journalist partner’s questions.”
To read additional articles, take a look below.
- For Business Insider, Rhea Mahbubani was paired with Dave Winkler, producing a story on some of the worst-rated nursing homes in the country getting PPP loans.
- For the Kentucky Center for Investigative Reporting, Jared Bennett was paired with data scientist Mike Shumpert for a story on elevated COVID-19 death rates in the poorest counties in the state.
- For Spectrum Magazine, Alva James-Johnson was paired with data scientist M.D. Shuey for a story on Adventist schools and their decisions on whether to accept CARES Act funds.
“NPF fellows told us that they would never have been able to land the investigative stories they did without the expertise of their data scientist partners,” said NPF President Sonni Efron.
Following this successful partnership, NPF and DataKind DC agreed to continue working together to match journalists with volunteer data scientists to provide deeper analysis for their stories through four different programs set for 2021:
- Paul Miller Washington Reporting Fellowship: Brings together 22 early-career reporters either based in Washington, DC for regional news organizations or working for national news outlets. (The 2021 fellows are here.)
- Covering the Statehouse in a Time of Crisis: NPF will select 25 reporters based in state capitals around the country. (The application is here.)
- Holding Power Accountable: Two training programs that will help 50 journalists from around the country understand critical financial policy issues that will emerge in the new Congress and state legislatures, as well as ongoing pandemic relief spending.
If you’d like to volunteer to work with a journalist (even if you don’t live in Washington, DC!), or if you just want to follow this ongoing partnership, stay connected with DataKind DC (on Slack). To get involved in current and upcoming projects with DataKind DC, please check our Meetup page (join us for our upcoming virtual DataJam!) and follow us on Twitter.
- Tracking COVID Cash Tips and Resources
- A New Tool for Tracking COVID Cash
- Dashboard: Exploring PPP Loans by Congressional District
- National Press Foundation Project Presentation at the 2020 DataKind x Teradata Community Event (starts at 20:39)