Meet Matt Harris! He’s a superstar who's held many volunteer roles here at DataKind.
Most recently, he supported Jacaranda Health, one of our long-term partners based in Kenya, under our Frontline Health Systems Impact Practice. In this project, he worked on an AI-driven software application that allows help desk agents to quickly triage medically urgent messages and respond to users in a more timely and efficient way, helping new mothers and mothers-to-be access critical advice and support. In addition to this project, Matt’s been involved in multiple initiatives: analyzing official eHealth policies from 39 African countries, developing a web app that explores health text using Natural Language Processing (NLP) approaches, serving as an active member of the Scoping Squad, and more.
Matt’s day job is with a FinTech company in New York where he leads the data science and application development team. He loves to roll up his sleeves and explore datasets to see if he can find new insights and solutions to problems that help people. Check out more about Matt and his story below!
My background is in Astrophysics, studying atmospheres of Earth and other planets, so my interest in data goes back a long way and far out into the galaxy. I currently lead data science and application development teams for a FinTech company in New York, where we try to find ways to save time and money so that people have more tools and space to do great things. I’m also lucky enough to be a volunteer for DataKind, which brings my worlds together - data and tackling social issues to try and make our planet a better place for everybody. When I'm not nerding out, I can be found stomping around the hills of Vermont, looking after foster doggies, and playing hideously bad guitar.
I'm currently working with Jacaranda Health looking at ways to help pregnant mothers get the help they need. Jacaranda Health already has great software in place that automatically helps to understand the questions thousands of mothers in Kenya have about their pregnancies, medical advice, nutrition, and more while simultaneously triaging their questions to flag potential emergencies. In this project, I’m looking at the ways in which the technology might be extended to automatically extract entities from interactions between mothers and the AI-driven software application. More specifically, I’m looking at the different types of foods mentioned in mothers’ questions about nutrition and cases where Jacaranda needs to identify the names of medical facilities for reporting and monitoring. It's an interesting challenge because chats can be in English, Swahili, or the Sheng dialect spoken by many in Kenya.
A lot! But I did observe that there are quite a few questions about avocados. Kenya is one of the world’s top 10 avocado producers.
It’s been inspiring to see the fantastic support triage automation that Jacaranda Health has developed.
The project originally aimed to investigate how the performance of the AI-driven model might be improved. Coincidentally, the model classes (i.e., chatbot intents) were in the process of being updated and a new training set was being developed. The new model is now running in production, but not enough telemetry has been captured yet to analyze how it might be improved. Because of this, we pivoted to focus on entity extraction.
To analyze the chats, I needed to use a number of Natural Language Processing (NLP) techniques, such as LDA topic analysis, spaCy POS tagging, and evaluation of chat classification. For exploring model performance, I tried several approaches, such as fastText and BERT. Named entity recognition (NER) training data is generated using fuzzy matching and clustering. Lastly, the integration with Jacaranda Health’s processes required me to learn a bit about Google AutoML, as well as using Azure Cognitive Services LUIS to prototype workflows for ongoing maintenance and retraining of models.
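To make the fuzzy-matching idea a little more concrete, here is a minimal sketch of how misspelled or inflected food names in a chat message could be turned into labeled spans for NER training. It is an illustration only, not the code used on the project: the `FOODS` list, the `FOOD` label, the `label_foods` helper, and the 0.8 threshold are all assumptions, Python's standard-library `difflib` stands in for whichever fuzzy-matching tool was actually used, and the clustering step is not shown.

```python
import re
from difflib import SequenceMatcher

# Hypothetical gazetteer of food names (an assumption for this sketch).
FOODS = ["avocado", "mango", "spinach", "ugali"]


def similarity(a, b):
    """Similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def label_foods(text, gazetteer=FOODS, threshold=0.8):
    """Return (text, {"entities": [(start, end, "FOOD"), ...]}).

    Each alphabetic token is compared with every gazetteer term; a token
    is labeled when its best similarity reaches the (assumed) threshold.
    Spans are character offsets into the original text.
    """
    entities = []
    for match in re.finditer(r"[A-Za-z]+", text):
        token = match.group().lower()
        best_score = max(similarity(token, term) for term in gazetteer)
        if best_score >= threshold:
            entities.append((match.start(), match.end(), "FOOD"))
    return text, {"entities": entities}


example = label_foods("Can I eat avacado and mangoes while pregnant?")
labeled = [example[0][start:end] for start, end, _ in example[1]["entities"]]
print(labeled)
```

In this example the misspelled "avacado" and the plural "mangoes" both clear the threshold, while ordinary words such as "and" or "pregnant" do not. A real pipeline would need a threshold tuned on actual chats, and handling for multi-word names and for messages in Swahili or Sheng.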
I’ve been able to apply some of the concepts I use in my day job, such as process design and triage automation. It's been great to exchange ideas on that, for example on how best to generate accurate training data as issues progress through the Jacaranda support team.
Data science can get very technical and an obvious insight to a data scientist can sometimes look like gobbledygook to non-technical, social sector experts, so it's important to always translate clearly into real-world context and goals in the simplest way possible. Stating the obvious, I know, and not always easy! But this is such a key step towards the “stickiness” of any technical solution proposed to a human being, no matter how amazing that solution may be. It has to be distilled into clear understandable terms.
Always great, a nice group of people. Good positive energy, caring and engaged folks, and I'm always learning on these projects.
An obvious challenge is the lack of resources and skills a nonprofit might have around data science. That said, I also think there’s a more subtle challenge - there’s so much hype around data science and AI being able to magically solve problems that I think expectations can be a little high sometimes. I think clear engagement with problem and goal definitions is key, as well as precursor analysis to answer the initial question - do we have sufficient data to actually do this?
Listen closely to your teammates; you’ll always learn something new.
I met Benjamin Kinsella, DataKind’s technical project manager, as well as Mitali Ayyangar, DataKind’s portfolio manager. Both are enthusiastic, positive, and super nice - I’ve learned a lot from them and really enjoy working with them and others at DataKind.
That I'm interested in eHealth policy!
To try and do things that help improve the world, no matter how small.
Former President Barack Obama (a class act), Isaac Newton (even though he was a bit grumpy!), and Freddie King.
One of those cool octopuses that can change color to match the background.
AC/DC at Giants Stadium.
The Warmth of Other Suns: The Epic Story of America's Great Migration by Isabel Wilkerson.
Very little, so that I wouldn't introduce any time travel paradoxes.
Learn to hold the guitar pick correctly!
Our volunteers are the lifeblood of our mission. They’ve inspired people to use their skills in ways they never dreamed of. They’ve slayed misconceptions. They’ve shown organizations trying to make the world a more humane place how data science and AI can change the game. We’re honored (and thrilled) to feature their stories in DataKind’s Volunteer Spotlight series. Follow this series to learn about their impeccable skill sets, their work with our brilliant project partners, and what inspires them to give their time, resources, and energy to causes that matter.