With one successful DataDive under our belt, we were excited to host our second DataDive this November with Community Technology Alliance (CTA), San Francisco Child Abuse Prevention Center (SFCAPC), and Skoll Global Threats.
Thanks to all of you who attended, we're two for two, with another successful DataDive in the books!
What exactly makes a DataDive successful? For us, it's an event that does three key things...
Community Technology Alliance (CTA) empowers communities to develop data-driven solutions to ending poverty and homelessness. Over the weekend, volunteers worked with data from Monterey County’s homelessness assistance program, called Transitional Housing. This is a standard US Department of Housing and Urban Development (HUD) program that gives qualified individuals or families a home for up to 24 months to help them transition out of homelessness to self-supported living.
CTA came to DataKind San Francisco for help finding insights that would lead to more successful transitions or "exits" from the HUD program. For example, how long does it take the program to reach a "successful" outcome for a participant, such as moving to more stable housing? Is there a point after which successful outcomes become unlikely? If so, CTA could better allocate resources by frontloading services, or by connecting people to other services once it became clear they needed more support than the program could provide.
In the words of one of our volunteers: “this is a hard problem.”
Led by Data Ambassadors Jeremy Sterns, Abhishek Kapatkar, and Wade Fuller, this group of fearless volunteers dove in nevertheless. They looked at exit timing, measured as the number of months a person spends in transitional housing, and compared it to the positive or negative outcomes on exit for different kinds of homeless situations.
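To make the exit-timing question concrete, here is a minimal sketch of one way to compare months in the program against exit outcomes. It is not the team's actual analysis: the records, the six-month bucket size, and the function name `success_rate_by_bucket` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical records: (months_in_program, exited_successfully).
# These values are invented for illustration and are NOT CTA data.
records = [
    (3, True), (5, True), (8, False), (11, True),
    (14, False), (20, False), (23, False), (2, True),
]

def success_rate_by_bucket(records, bucket_months=6):
    """Return {bucket_start_month: share of exits that were successful}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for months, success in records:
        # Group stays into fixed-width buckets: 0-5 months, 6-11 months, ...
        bucket = (months // bucket_months) * bucket_months
        totals[bucket] += 1
        if success:
            successes[bucket] += 1
    return {b: successes[b] / totals[b] for b in sorted(totals)}

rates = success_rate_by_bucket(records)
for start, rate in rates.items():
    print(f"months {start}-{start + 5}: {rate:.0%} successful exits")
```

A table like this is only a first look. With real data, the bucket width and the definition of a "successful" exit would need to come from CTA.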
However, they realized that analyzing the data was only part of the problem. A lot of effort went into building a “golden pipeline” that would let CTA not only take away the findings from this weekend, but also start to integrate more data from Monterey County or other locations using the same feature set, or what they called the “golden table.” The golden table would feed into their existing visualization system, Tableau, and enable CTA to create interactive charts about program outcomes and causal relationships.
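As a rough illustration of the “golden table” idea, the sketch below maps rows from differently named source exports onto one shared set of columns. The column names, the source labels, and the helper `to_golden_rows` are assumptions made for this example. The post does not describe the real schema or pipeline.

```python
# Hypothetical shared feature set; the real golden table columns are not given.
GOLDEN_COLUMNS = ["client_id", "months_in_program", "exit_outcome"]

# Hypothetical mapping from each source's local column names to golden names.
COLUMN_MAPS = {
    "monterey": {
        "ClientID": "client_id",
        "MonthsEnrolled": "months_in_program",
        "ExitResult": "exit_outcome",
    },
    "other_county": {
        "id": "client_id",
        "stay_months": "months_in_program",
        "outcome": "exit_outcome",
    },
}

def to_golden_rows(source, rows):
    """Rename a source's columns to the shared schema and tag each row."""
    mapping = COLUMN_MAPS[source]
    golden = []
    for row in rows:
        # Keep only mapped columns; extra source-specific fields are dropped.
        renamed = {mapping[k]: v for k, v in row.items() if k in mapping}
        missing = [c for c in GOLDEN_COLUMNS if c not in renamed]
        if missing:
            raise ValueError(f"{source} row is missing columns: {missing}")
        renamed["source"] = source
        golden.append(renamed)
    return golden

golden_table = (
    to_golden_rows("monterey", [
        {"ClientID": "a1", "MonthsEnrolled": 7, "ExitResult": "positive"},
    ])
    + to_golden_rows("other_county", [
        {"id": "b2", "stay_months": 15, "outcome": "negative", "notes": "x"},
    ])
)
```

Keeping the mapping in one place means a new location can be added by describing its columns rather than rewriting the analysis.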
While it was indeed a hard problem, Data Ambassador Jeremy Sterns said it best when he described the weekend as “a starting point for changing the world.”
Social problems are hard, and so is getting insights out of data. San Francisco Child Abuse Prevention Center (SFCAPC, pronounced “SF cap-see”) aims to prevent child abuse in San Francisco and reduce its impact. During the DataDive, volunteers on the SFCAPC project used counselor and client survey data to try to better understand the effectiveness of different services on subgroups of clients.
Saturday morning at a DataDive is all about the context. Working with survey results and HIPAA-anonymized data, Data Ambassador Qian Li led a team through collaborative data exploration to build a shared understanding of the data, which got everyone involved seeing the world through a slightly different lens. Throughout the rest of the weekend, the data was sliced and diced, with volunteers applying a variety of techniques including plotting and regression, correlation matrices, random forests, and cluster analysis. SFCAPC left the weekend with a number of factors to monitor more closely, some suggestions for areas to gather more data, and a new-found appreciation for the art of insight generation.
Early detection and early response are key to preventing the spread of any disease. Created by epidemiologists at HealthMap of Boston Children’s Hospital and The Skoll Global Threats Fund, Flu Near You asks people to take a few seconds each week to report if they or their family members have been healthy or sick.
Following on the heels of an “EpiHack” workshop organized to brainstorm how to effectively leverage “Flu Near You” data, the Pandemics group at Skoll Global Threats wanted to use the DataDive weekend to build out a proof-of-concept data product: the Flusion Dashboard. This R-Shiny interactive data exploration tool combines crowd-sourced data from Flu Near You, CDC laboratory and symptom reports, and Athena Health claims.
The goal is to provide public health decision makers (Health Officers, Hospital Administrators, and Epidemiologists) with a real-time and comprehensive picture of influenza-like illness (ILI) activity. By showing the current rates of symptoms and trends reported via multiple data sources (Flu Near You, CDC, Athena Health, Twitter), the hope is to identify and mitigate serious ILI outbreaks, leading to fewer hospitalizations and deaths.
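One simple way to picture combining several ILI sources is to align them by week and report how far they disagree. The sketch below is written in Python rather than the dashboard's R and is not the Flusion Dashboard's method. The weekly rates, the week labels, and the function `combine_by_week` are invented for illustration.

```python
# Hypothetical weekly ILI rates (percent of reports showing ILI symptoms).
# All numbers and week labels are made up; they are not real surveillance data.
sources = {
    "flu_near_you": {"week-45": 2.0, "week-46": 3.0},
    "cdc": {"week-45": 1.5, "week-46": 2.5},
    "athena_health": {"week-46": 4.0},  # this source has no week-45 value
}

def combine_by_week(sources):
    """Align sources by week; report each week's values, mean, and spread."""
    weeks = sorted({w for series in sources.values() for w in series})
    table = {}
    for week in weeks:
        # Only include sources that actually reported for this week.
        values = {
            name: series[week]
            for name, series in sources.items()
            if week in series
        }
        table[week] = {
            "values": values,
            "mean": sum(values.values()) / len(values),
            # A large spread flags weeks where the sources contradict each other.
            "spread": max(values.values()) - min(values.values()),
        }
    return table

combined = combine_by_week(sources)
```

A plain mean treats every source as equally reliable, which is a strong assumption. A real dashboard would likely weight or display the sources separately.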
Data Ambassadors Eric Williams and Brian Spiering worked closely with Adam Crawley from the Pandemics group at Skoll Global Threats to ensure the value of the data product. First, volunteers came together to figure out the best way to combine and visualize the different, and at times contradictory, data sources. Building a shared language between the data and the project vision led to a rapid prototype. One sub-team focused on creating production-level infrastructure for the Flusion Dashboard, which will ensure automatic updating and easy maintenance in the future. Additionally, some volunteers worked on advanced inference, such as migration patterns of symptoms. Understanding how to communicate those findings through a web tool is crucial to making them impactful.
A huge thank you to our Data Ambassadors and volunteer teams who gave their weekend to help three incredible partner organizations use data to improve the world.
If these three things make you just as excited about unleashing data for good as we are, we'd love to see you at our next event! Register on our Meetup to stay in the know and check out our Facebook page for more photos. Share your memories from the weekend below!