DataDiving with DataKind Singapore

June 7, 2018

DataKind Singapore hosted another DataDive this April, gathering around 70 volunteers for a weekend of analyzing data to help three phenomenal organizations advance their missions. Learn more about the work achieved in support of the Community Justice Centre, Law Society Pro Bono Services and Effective Altruism.

Community Justice Centre & Law Society Pro Bono Services

“We would like to express gratitude for the opportunity for my staff to understand more about data analytics and the complexity in achieving good data, and we thank you for your invaluable contribution towards the drive towards the greater access to justice.”
-Leonard Lee, Executive Director, Community Justice Centre

"Thanks for having us, you guys are truly awesome in how you’re helping nonprofits and we’re truly very grateful for your support. Looking forward to working with you guys more!"
-Claudine Tan, Assistant Director, Law Society Pro Bono Services


The Community Justice Centre (CJC) is a community partnership between the public sector, the philanthropic sector, and the legal profession rendering assistance to Litigants-in-Person (LiPs) in need.

Law Society Pro Bono Services (LSPBS) was established in 2007 with the mission to help bring free legal assistance to those in need in our community to ensure access to justice for all. LSPBS runs a wide range of programs and activities, including public education initiatives, free legal clinics, representation for qualifying applicants and assistance for nonprofits and social enterprises.

DataDive volunteers split up into three teams and collaborated with subject matter experts, such as volunteer lawyers and law trainees, as well as representatives from CJC and LSPBS. Together they looked at three challenges the organizations had in common: understanding trends in beneficiary demographic data and legal cases, automating the labeling of cases, and better serving beneficiaries by identifying their common questions.

Understanding legal case trends and demographic profiles


Volunteers listening to a presentation by the Trends Analysis team

The Problem
Both CJC and LSPBS wanted to gain a greater understanding of the beneficiaries they serve and also to understand any trends in their recent legal cases.

What We Did
The Trends Analysis team analyzed case information by LiP demographics, graphing case type trends and the languages spoken by LiPs against demographic attributes such as postal district, legal clinic location, nationality and employment status.
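As a rough illustration of this kind of breakdown, the sketch below counts case types per demographic group. The records and the field names (case_type, employment_status) are made up for the example; the actual CJC and LSPBS data and the team's tooling are not described in this post.

```python
from collections import Counter, defaultdict

# Hypothetical records: the real case data and its field names are not public.
cases = [
    {"case_type": "family", "employment_status": "employed"},
    {"case_type": "family", "employment_status": "unemployed"},
    {"case_type": "civil", "employment_status": "employed"},
    {"case_type": "family", "employment_status": "employed"},
]


def crosstab(records, row_key, col_key):
    """Count records for every (row_key, col_key) combination."""
    table = defaultdict(Counter)
    for record in records:
        table[record[row_key]][record[col_key]] += 1
    return table


counts = crosstab(cases, "employment_status", "case_type")
for status, by_type in sorted(counts.items()):
    print(status, dict(by_type))
```

The same function works for any pair of fields, for example legal clinic location against language spoken, and the resulting counts can be fed into whichever charting tool is at hand.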

What’s Next?
With these insights, the organizations will be better informed in their decision-making and able to respond quickly to trending needs through resource management and public education. The organizations are also looking to enhance their current reporting processes.

 

The challenge of manually labeling past cases

The Problem
With years of case data each, both CJC and LSPBS have been categorizing past cases manually with the help of volunteers. This is becoming a major resource challenge, and the organizations wanted to explore how they might automate this process. In addition, CJC wanted to look at how to classify certain case subtypes and to reduce human error in case type labeling by seeing whether case details could be used to predict the labels.

What We Did
After working with CJC and LSPBS representatives to prioritize the common case types (for example, family issues and civil cases), the Case Classification team created a logistic regression model and a random forest model for the CJC and LSPBS cases respectively. The team also created classification models for case subtypes like Bankruptcy and Divorce for CJC. Finally, the team produced a case type classification toolkit that takes in case details to predict case type labels for common case types.  
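To give a feel for how case details can be used to predict a label, here is a minimal sketch of a binary bag-of-words logistic regression written from scratch. It is not the team's toolkit: the training texts, the two-class family/civil setup, and all function names are assumptions made for illustration, and a real model would need far more data and proper evaluation.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def sigmoid(score):
    # Clamp the score so math.exp cannot overflow on extreme values.
    score = max(-30.0, min(30.0, score))
    return 1.0 / (1.0 + math.exp(-score))


def train(texts, labels, epochs=200, learning_rate=0.5):
    """Fit word weights and a bias with stochastic gradient descent on log loss."""
    weights = {token: 0.0 for text in texts for token in tokenize(text)}
    bias = 0.0
    for _ in range(epochs):
        for text, label in zip(texts, labels):
            counts = Counter(tokenize(text))
            score = bias + sum(weights[t] * c for t, c in counts.items())
            error = sigmoid(score) - label  # gradient of log loss w.r.t. score
            bias -= learning_rate * error
            for token, count in counts.items():
                weights[token] -= learning_rate * error * count
    return weights, bias


def predict_probability(model, text):
    """Probability that the text belongs to class 1; unseen words are ignored."""
    weights, bias = model
    counts = Counter(t for t in tokenize(text) if t in weights)
    return sigmoid(bias + sum(weights[t] * c for t, c in counts.items()))


# Made-up case details; label 1 = family, label 0 = civil.
texts = [
    "divorce and custody of children",
    "maintenance for spouse after divorce",
    "unpaid debt owed by contractor",
    "tenancy deposit dispute with landlord",
]
labels = [1, 1, 0, 0]

model = train(texts, labels)
print(predict_probability(model, "divorce custody"))
print(predict_probability(model, "unpaid debt"))
```

A probability above 0.5 is read as "family" and below 0.5 as "civil". Handling several case types at once would need a multi-class extension, or one such model per case type.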

What’s Next?
Following the DataDive, the organizations are looking to fill additional gaps in case data, particularly less common case sub-types. The Case Classification team has also highlighted challenges and limitations faced when classifying criminal cases, where further improvements can potentially be explored.

 

What are the frequently asked questions from our beneficiaries?

The Problem
As part of CJC’s vision to be a one-stop hub that delivers a seamless set of services for court users in need, CJC intends to launch a chatbot and wanted to uncover their beneficiaries’ frequently asked questions to inform and enhance their chatbot service.

LSPBS also hoped to compile a list of frequently asked questions (FAQs) from the cases that they routinely handle to provide accessible legal information to all.   

What We Did
The FAQ team collaborated with volunteer lawyers and explored methods like phrase models and topic models. While some outputs from topic modeling of case types and sub-types, as shown in the visualization below, were insightful, the team nevertheless faced difficulty in curating the frequently asked questions.

 

 

Anonymized image of the topic model visualization

Hence the team sought to create a visual aid toolkit for volunteer lawyers, providing them with an easy way to explore open-ended case descriptions and curate FAQs based on their professional experience.
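For readers curious about what exploring open-ended case descriptions can involve, the sketch below counts word pairs that recur across descriptions. This is a simplified stand-in for a phrase model, not the method the FAQ team used; the descriptions, the stopword list and the function name are all invented for the example.

```python
from collections import Counter

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "my", "to", "for", "i", "how", "do", "can", "is", "of"}


def frequent_bigrams(descriptions, min_count=2):
    """Return (phrase, count) pairs for word pairs seen at least min_count times.

    Stopwords are dropped before pairing, so a pair may join two words that
    were separated by a stopword in the original text.
    """
    counts = Counter()
    for description in descriptions:
        tokens = [t.strip("?.,!").lower() for t in description.split()]
        tokens = [t for t in tokens if t and t not in STOPWORDS]
        counts.update(zip(tokens, tokens[1:]))
    return [(" ".join(pair), n) for pair, n in counts.most_common() if n >= min_count]


# Made-up case descriptions.
descriptions = [
    "How do I apply for a personal protection order?",
    "Can my tenant apply for a personal protection order?",
    "How to file a small claims case?",
]

phrases = frequent_bigrams(descriptions)
print(phrases)
```

Recurring phrases such as these only hint at candidate topics. As the team found, turning them into well-formed FAQs still relies on the judgment of volunteer lawyers.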

What’s Next?
CJC intends to use the insights from the topic model visualizations to enhance their chatbot service, and LSPBS intends to assemble a small team of volunteer lawyers to use the visual aid toolkit and prepare a list of FAQs and corresponding legal information. They may also explore further text analysis with regard to FAQs and improvements for the visual aid toolkit.

 

Effective Altruism

“EA SG is still in the early stages of attempting to identify where the greatest unmet needs are and which charities might be very effective in targeting those areas in SE Asia. Datakind SG helped us take the first crucial step of finding regional charities and we're going to use this data to further reach out to the organizations - to understand their work and evaluate for effective charities to share with the public.

I'm really grateful for the dedicated, kind and efficient support from the core leads to the volunteers at all the meetups and the DataDive. Thank you very much!”

-Zeng Wanyi, Singapore Co-ordinator, Effective Altruism Singapore

The Problem
Effective Altruism (EA) is a global community of people who care deeply about the world and are focused on using evidence and reason to best benefit others. One approach they employ is to optimize cause prioritization and direct resources and efforts towards high-impact areas of work. Through the course of their work in Singapore, they’ve found that many people would like to give and volunteer regionally in Southeast Asia (SEA), but have no resources to guide them in this process. They’ve conceptualized a platform to offer recommendations on where one can give effectively in SEA, and approached DataKind SG for help creating a database of Non-Governmental Organisations (NGOs) in SEA and visualizing the data on a map overlaid with country metrics.

What We Did
These goals were split into the different tracks below, and the team wasted no time in digging in!

 

By the end of the DataDive, together with accumulated efforts from previous DataJams, a total of 34 websites containing details of over 15,000 organizations in SEA had been scraped. While the team explored a variety of methods to geocode these organizations, they were unable to finish because of the query limits imposed by the Google Maps API.
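One common way to work within a query limit is to cache results and spend a fixed budget of lookups per run, carrying the remainder over to the next run. The sketch below illustrates that idea only; it is not the team's script. The lookup argument stands in for whatever geocoding request is used, no real Google Maps API call is shown, and the addresses and coordinates are made up.

```python
def geocode_with_budget(addresses, lookup, cache, max_queries):
    """Geocode new addresses via `lookup`, making at most `max_queries` calls.

    Results are stored in `cache`; addresses left over once the budget is
    spent are returned so that a later run can pick them up.
    """
    queries_used = 0
    pending = []
    for address in dict.fromkeys(addresses):  # de-duplicate, keep order
        if address in cache:
            continue  # already geocoded in an earlier run
        if queries_used >= max_queries:
            pending.append(address)
            continue
        cache[address] = lookup(address)
        queries_used += 1
    return pending


def fake_lookup(address):
    """Hypothetical stand-in for a geocoding request; returns made-up coordinates."""
    return (float(len(address)), 0.0)


cache = {}
addresses = ["NGO A, Jakarta", "NGO B, Manila", "NGO A, Jakarta", "NGO C, Hanoi"]
first_pending = geocode_with_budget(addresses, fake_lookup, cache, max_queries=2)
second_pending = geocode_with_budget(addresses, fake_lookup, cache, max_queries=2)
print(first_pending, second_pending)
```

If the cache is saved to disk between runs, for example as a JSON file, each run only spends queries on organizations that have not been geocoded yet.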

The team also worked on tagging these NGOs with their associated causes, using keywords provided by EA. Finally, a visualization of the data was built in Tableau, thanks to the expertise of some of the volunteers on the team. As the geocoding is not fully done, the visualization below is only a rough draft of how it will eventually look.

Next Steps

 

While our work with EA is clearly not done yet, it was great to see the volunteers achieve such progress during the DataDive! The scripts the volunteers created were fully reproducible, which should help support any future work, including the refinement of the cause tagging method and the map visualization. EA will also be using this data to contact notable NGOs to learn more about their impact.
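As a starting point for refining the cause tagging, keyword matching of the kind described above can be sketched as follows. The cause names, keyword lists and descriptions here are invented for illustration; the keywords EA provided and the team's actual scripts are not reproduced in this post.

```python
import re

# Hypothetical cause-to-keyword mapping, not the list supplied by EA.
CAUSE_KEYWORDS = {
    "education": ["school", "literacy", "scholarship"],
    "health": ["clinic", "health", "nutrition"],
    "environment": ["forest", "conservation", "wildlife"],
}


def tag_causes(description, cause_keywords=CAUSE_KEYWORDS):
    """Return the sorted causes whose keywords appear as whole words in the text."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(
        cause
        for cause, keywords in cause_keywords.items()
        if words & set(keywords)
    )


print(tag_causes("Runs rural health clinics and literacy classes"))
print(tag_causes("Community arts festival"))
```

Exact whole-word matching is deliberately strict: "clinics" does not match the keyword "clinic" here, which is the sort of gap that stemming or a longer keyword list would address in a refined version.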

 The DataDive volunteers at the end of a long but exciting weekend of using their data skills for good.

Get Involved with DataKind Singapore

Thank you to all the volunteers who came out to help support these organizations and the tremendous work they do. Special thanks to Expedia for hosting us! If you’re local, we’d love to see you at the next DataDive or Meetup. Sign up to get involved!