Exploring Open Data to Support Primary Education in India

Guest post from DataKind Bangalore


While open data has the potential to be a valuable resource for those working to address tough social issues, it can be difficult to find and often needs to be transformed before it can be useful in analysis. In December 2017, DataKind Bangalore tried a new event format: instead of working directly on a specific organization’s pain points or questions, volunteers would explore and glean insights from open data related to a specific sector.


In this case, the volunteers were focused on the primary education sector in India, exploring what open data was available and seeing what insights might be generated from it. The event kicked off with Menaka Khare of Karnataka Learning Partnership, a nonprofit partner of DataKind Bangalore, who spoke about the state of primary education in India and how KLP has been trying to tackle challenges in this sector.



Menaka Khare from KLP talking about KLP’s efforts in the primary education sector


After the introductory talk and some helpful pointers, the volunteers got to work. They first focused on data discovery and data extraction, scouring the Internet for potential open datasets related to the primary education sector in India. By the end of the day, they had come up with close to fifty new relevant sources of open data, including the National Achievement Survey, reports related to the midday meal program, the All India School Education Survey and others.



Our teams of volunteers who made the whole event possible.


Day two focused on data extraction and data analysis. In many cases, the data was locked in tables inside PDF files, so the teams used Tabula to extract it for analysis. Using Tabula, Excel, pandas, and other software libraries, teams came up with interesting questions and explored the data to gain insight into the state of primary education across the country.
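Tables pulled out of PDFs usually need some cleanup before they can be analyzed. As a rough illustration of that step, the sketch below parses a small CSV export of the kind a tool like Tabula can produce, using only Python's standard library. The column names and numbers are made-up placeholders, not data from the event, and the helper names `to_int` and `clean_rows` are hypothetical.

```python
import csv
import io

# Hypothetical CSV text standing in for a table exported from a PDF.
# Column names and numbers are illustrative placeholders only.
RAW = """District,Boys Enrolled,Girls Enrolled
District A,"1,200","1,150"
District B, 980 ,n/a
,,
District C,1500,1425
"""


def to_int(cell):
    """Convert a cell such as ' 1,200 ' to 1200; return None if it is not a number."""
    cleaned = (cell or "").strip().replace(",", "")
    return int(cleaned) if cleaned.isdigit() else None


def clean_rows(raw_text):
    """Yield one dict per named row, with the numeric columns parsed."""
    reader = csv.DictReader(io.StringIO(raw_text))
    for row in reader:
        name = (row["District"] or "").strip()
        if not name:
            # Skip blank rows left over from the PDF layout.
            continue
        yield {
            "district": name,
            "boys": to_int(row["Boys Enrolled"]),
            "girls": to_int(row["Girls Enrolled"]),
        }


rows = list(clean_rows(RAW))
for row in rows:
    print(row)
```

Missing values are kept as `None` rather than guessed, so later analysis can decide how to treat them.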


The data analysis session led to some interesting insights, problem formulations, and suggestions. These included an insight into the ratio of boys to girls enrolled in the state of Karnataka, a suggestion to use Anganwadi data along with NASER data to develop a thematic map of achievement across all the districts of Karnataka, and a look at how the availability of drinking water and toilets correlates with the dropout rate among students. One of the most valuable things to come out of the event was the formulation of a database schema that could potentially be used to combine and analyze multiple data sources.
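For readers curious about what checking such a correlation involves, here is a minimal sketch of a Pearson correlation in plain Python. The two lists hold made-up placeholder values for five hypothetical districts; they are not findings from the event and only show the mechanics. The function name `pearson` is our own.

```python
from math import sqrt

# Made-up placeholder values for five hypothetical districts.
# These are NOT results from the event.
toilet_coverage = [0.95, 0.80, 0.65, 0.50, 0.40]  # share of schools with toilets
dropout_rate = [0.02, 0.04, 0.05, 0.08, 0.09]     # share of students dropping out


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


r = pearson(toilet_coverage, dropout_rate)
print(f"Pearson r = {r:.2f}")
```

A value near -1 would suggest that districts with better facilities tend to have lower dropout rates. It would not show on its own that one causes the other.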



Database schema to incorporate multiple sources of data in analysis
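The schema from the event appears only as an image, so its tables are not reproduced here. As a rough sketch of the general idea (linking several sources through a shared key), the example below uses Python's built-in sqlite3 module. Every table name, column name, and value is a hypothetical placeholder, not the schema the volunteers designed.

```python
import sqlite3

# In-memory database; all tables, columns, and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE district (
    district_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE enrollment (
    district_id INTEGER NOT NULL REFERENCES district(district_id),
    year        INTEGER NOT NULL,
    boys        INTEGER,
    girls       INTEGER,
    PRIMARY KEY (district_id, year)
);
CREATE TABLE facilities (
    district_id              INTEGER NOT NULL REFERENCES district(district_id),
    year                     INTEGER NOT NULL,
    schools_with_toilets_pct REAL,
    PRIMARY KEY (district_id, year)
);
""")

# Placeholder rows, one per source, sharing the same district and year.
conn.execute("INSERT INTO district VALUES (1, 'District A')")
conn.execute("INSERT INTO enrollment VALUES (1, 2016, 1200, 1150)")
conn.execute("INSERT INTO facilities VALUES (1, 2016, 85.0)")

# Combine the two sources through the shared district and year keys.
combined = conn.execute("""
    SELECT d.name, e.year, e.boys, e.girls, f.schools_with_toilets_pct
    FROM district AS d
    JOIN enrollment AS e ON e.district_id = d.district_id
    JOIN facilities AS f ON f.district_id = e.district_id AND f.year = e.year
""").fetchall()
print(combined)
```

Keying every source on district and year is one simple way to line up datasets published by different agencies; a real schema would likely need finer-grained keys, such as block or school identifiers.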


During the share-back session, it was evident that what had begun as a two-day event with broad goals had morphed into a useful, data-driven discussion on the state of the primary education sector in India. There were many takeaways from the event, such as the discovery of multiple open data sources, the creation of a potentially useful database schema for organizations working in the education sector, and several ideas for how to use these data sources to address challenges plaguing the primary education sector in India.


A big thank you to the wonderful volunteers for their active and enthusiastic participation, to Menaka Khare from our NGO partner Karnataka Learning Partnership, and to our venue sponsor, Sahaj Software Solutions.


Join Us

As the event drew to a close, many volunteers were already wondering when the next one would be held! We would love to see even more of you at DataKind Bangalore’s upcoming events. Join our Meetup group, or follow us on Facebook or Twitter for more updates and announcements.
