Meet Kelson Shilling-Scrivo! He’s a volunteer with DataKind DC and recently partnered with the Red Cross to create a national fire risk score. As the Data Ambassador, he worked alongside the rest of the volunteer team to predict the locations in greatest need of intervention so that the Red Cross can target services to maximize their impact and save lives. By day, Kelson’s a computational neuroscientist working on his PhD in neuroscience at the University of Maryland (he graduates in the winter!). Learn more about him and his DataKind journey below!
Tell us a little bit about yourself and your background.
I'm a trained systems/computational neuroscientist. Hearing well in noisy environments is one thing we’re usually pretty good at. As we all get older, we start to lose our hearing and places such as crowded bars become more difficult to hear in. My research tries to understand what circuits in the brain are responsible for hearing in noise and how they break down with age. To do this, I record from thousands of neurons in the auditory cortex of mice. As scientists, we’re trained to solve important problems with messy data, so data science was a natural fit.
Can you briefly describe the project that you worked on?
The Red Cross has a national fire prevention campaign where they hand out smoke detectors to high-risk areas in order to prevent death and injury from fires. According to their research, about 20% of homes in the U.S. have no working smoke detector and there are no working smoke detectors in over half of all fire-related deaths. Putting a working smoke detector in every home in America would be a wonderful solution, but the Red Cross doesn’t have the manpower or funds to do it.
So, the Red Cross tasked DataKind DC to create a national fire risk score. We believe that the most at-risk places in the U.S. are places that (1) have a lot of fires, (2) have severe fires, and (3) have no working smoke detectors. I built models to predict fire propensity (1), severity (2), and smoke detector coverage (3) for the U.S. at the census block level. Areas that rank high on all three models can then be tracked by the Red Cross for resource allocation. These models will also be put into an online interactive visualization so that local firefighters and other agencies can also benefit from this knowledge.
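To make the idea of "ranking high on all three models" concrete, here is a minimal sketch in Python. It is not the project's actual scoring code: the interview doesn't say how the three model outputs are combined, so the percentile-rank approach, the 0.75 cutoff, and every name below (AreaScores, percentile_ranks, high_on_all_three) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AreaScores:
    """Hypothetical container for one area's three model outputs."""
    geoid: str          # area identifier (illustrative)
    propensity: float   # model (1): predicted fire frequency
    severity: float     # model (2): predicted fire severity
    no_detector: float  # model (3): predicted share of homes without a working detector


def percentile_ranks(values):
    """Scale each value's rank to [0, 1]; tied values share their average rank."""
    n = len(values)
    if n == 1:
        return [1.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_position = (i + j) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = average_position / (n - 1)
        i = j + 1
    return ranks


def high_on_all_three(areas, cutoff=0.75):
    """Return the geoids whose percentile rank meets the cutoff on every model."""
    p = percentile_ranks([a.propensity for a in areas])
    s = percentile_ranks([a.severity for a in areas])
    d = percentile_ranks([a.no_detector for a in areas])
    return [a.geoid for a, x, y, z in zip(areas, p, s, d) if min(x, y, z) >= cutoff]
```

Taking the minimum of the three ranks means an area is only flagged when it is high on all three models at once. Averaging the ranks instead would let one very high score offset a low one, which may or may not be what a partner wants.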
What surprised you most about this project?
It’s surprising to think my biggest contribution to the project might not be my code, but a phone call. In an earlier iteration of the project, a previous team had used the American Housing Survey (AHS) to determine the percentage of smoke alarms for different parts of the country as a baseline for their models. Unfortunately, the AHS had dropped the question about smoke alarms a few years ago. Through a mutual connection, I was able to find out that the question was dropped partially because it was largely a phone survey, and people are notoriously bad at knowing if their smoke detector works or not. I also got to learn how they take the thousands of phone calls and use statistics to extrapolate out to the entire country.
That’s when I realized that when the Red Cross does a home visit, they keep a record of it as well as the number of smoke detectors checked and installed. We didn’t need to find thousands of phone surveys; they had a million home visits! Their own data was already the most comprehensive survey of smoke detectors in homes across the U.S. ever compiled. We made the first new model in a day by simply switching out datasets in a couple lines of code, then the brand new model a few weeks later.
What data science skills have been most useful for this project?
The ability to work with and think about real data. All of the datasets that we’re collecting take some level of data munging to get into usable form. Additionally, we’re doing this project at multiple geographies, so there’s also a lot of work that goes into munging the data into the right geography. Data visualization is also an important skill. We’re creating risk scores for the entire U.S., and we’re able to quickly see how model changes affect performance. DataKind DC always needs more people who can create good choropleth maps of data. Even more so for interactive maps.
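As one small illustration of munging data into the right geography, the sketch below rolls per-block counts up to coarser census areas. It relies on the standard layout of a 15-digit census block GEOID (2-digit state, 3-digit county, 6-digit tract, 4-digit block, with the first block digit identifying the block group). The function name, the sample GEOIDs, and the counts are made up for the example, and the interview doesn't say which geographies or tools the team actually used.

```python
from collections import defaultdict

# Number of leading characters of a 15-digit block GEOID that identify each level.
GEOID_PREFIX = {"state": 2, "county": 5, "tract": 11, "block_group": 12, "block": 15}


def roll_up(block_counts, level):
    """Sum per-block counts (for example, fire incidents) up to a coarser geography."""
    width = GEOID_PREFIX[level]
    totals = defaultdict(int)
    for geoid, count in block_counts.items():
        # GEOIDs must stay strings: integer parsing would drop leading zeros.
        if len(geoid) != 15 or not geoid.isdigit():
            raise ValueError(f"not a 15-digit block GEOID: {geoid!r}")
        totals[geoid[:width]] += count
    return dict(totals)


# Made-up counts keyed by made-up block GEOIDs, for illustration only.
fires_by_block = {
    "240338001001000": 3,
    "240338001001001": 2,
    "240338002001000": 1,
    "110010001001000": 4,
}
fires_by_tract = roll_up(fires_by_block, "tract")
fires_by_county = roll_up(fires_by_block, "county")
```

Summing like this only makes sense for additive quantities such as counts. Rates or percentages would need to be recomputed from their numerators and denominators at the new level rather than summed.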
What professional skills (non-data science) have been most useful for this project?
Team/project management is so important as a Data Ambassador. Everyone is a volunteer, so if the project is too hard to get into, or the project starts to hit a roadblock or progress stalls out, people will leave. I reworked the project intro a dozen times until it was simple for volunteers of any skill level to download our code and data and get started right away. I think if the volunteers feel that you’re investing your time in them, they’ll invest their time in your project.
What tips do you have about communicating data science findings to nonprofits most effectively?
Speak their language. With the Red Cross, I was very fortunate that the subject matter expert from the Red Cross, Jake Janecek, was incredibly involved in the project from the beginning. He told me how the Red Cross traditionally reported fire data in the field, so I was able to present data to him that was in the style he was used to, which helped with buy-in and validation. Talking about changes to the project in terms of how it would affect the metrics the Red Cross cared about made the design process easier and also kept the team from going down some blind alleys that we thought were interesting, but the Red Cross wasn't interested in.
What advice would you like to share with volunteers who are new to DataKind or the Data for Good movement?
Working with DataKind has really demystified the process of data science for me. Going in, I think everyone has this feeling that there’s some secret knowledge that all ‘real’ data scientists have, and you’ll be ‘discovered’ at some point for not having it. And while there’s always more to learn, it's more important to figure out what skills and unique perspectives you do have and what you can bring to the conversation.
The best advice I can give is to work with real data. Collecting, munging, and validating data is 80% of the work, so if you’re just working with data sets that have been cleaned for you, you won’t learn the critical skills you’ll need when working on a data science project in the real world. (DataKind is a great way to learn those skills!)
What did you discover about yourself while working on the project?
I learned that I love working on collaborative projects in a team setting. In academia, you rarely have the ability to truly work on something as a large group, so leading this project allowed me to realize my passion for managing a large team rather than working on a solo project.
If you could be any animal, which would you be?
What’s the last book you read?
Weapons of Math Destruction by Cathy O’Neil. Ethical AI is a hot topic right now, and so it’s more important than ever to think about the implications of the tools we create.
What’s one piece of advice you’d give to your younger self?
Buy bitcoin at $1,000; it's not too expensive. But seriously, I would say to just get started. Too many times in school, when faced with big assignments, I would try to wait for the perfect opportunity to get started or wait until I’d analyzed the problem from every possible angle. Perfectionism is a form of procrastination. In the time it took to try to write it once perfectly, you could write it once terribly and rewrite it five more times.
About Volunteer Spotlights
Our volunteers are the lifeblood of our mission. They’ve inspired people to use their skills in ways they never dreamed of. They’ve slayed misconceptions. They’ve shown organizations trying to make the world a more humane place how data science and AI can change the game. We’re honored (and thrilled) to feature their stories in DataKind’s Volunteer Spotlight series. Follow this series to learn about their impeccable skill sets, their work with our brilliant project partners, and what inspires them to give their time, resources, and energy to causes that matter.