Using Machine Learning to Understand What Drives Student Success in Dallas County

January 29, 2019

Image above courtesy of The Commit Partnership

Founded in 2012, The Commit Partnership (Commit) is a coalition of more than 200 partners (public and private schools, colleges and universities, foundations, businesses, and nonprofits) that work to solve Dallas County’s biggest systemic education challenges, including improving early childhood education, preparing and retaining effective educators, and increasing college completion rates.

Commit’s vision is to work together, as a community, to ensure that all students receive an excellent and equitable education. To drive student achievement from cradle to career, Commit leverages data and community expertise. It also uses key indicators to measure student achievement, identifies practices that create an environment of success, and works to align resources and share what works.

Dallas County educates over 500,000 students in Grades K-12, 10% of all students in Texas, the nation’s second largest state. By the fall of 2027, Texas is projected to have the largest total public school enrollment of 6.1 million students.(1) This trend in enrollment shows the critical need for communities to develop actionable plans to solve systemic challenges in education.

Public education data in Texas is difficult to access and even more challenging to comprehend. This combination too often results in decisions that are informed by perceptions or assumptions which, despite positive intentions, are ultimately harmful for students. Commit recently received access to sixteen million deidentified academic assessment records from the Texas Education Agency, matched longitudinally, for students who were in grades three to eight between 2012 and 2016. While the team had previously performed analyses on aggregate school and district level data, it had never before had access to deidentified student-level records at this scale, or the analytical capacity needed to work with them.

With the support of the Microsoft Cities Team and StriveTogether, DataKind collaborated with Commit’s Analytics team to leverage these individual records to understand predictors of academic assessment performance and student success over time.

To start, the team worked closely to structure a framework for the data in the Microsoft Azure cloud. They developed reusable scripts for data extract, transform, load (ETL), database management, and descriptive visualizations to allow for the data to be efficiently and reproducibly stored, accessed, analyzed, and modelled. By using descriptive techniques to analyze trends and patterns, Commit would be able to identify students at the greatest risk of low performance.

The team identified the following predictors of future growth:

  • Early success in reading is more predictive of later-grade success across subjects than early success in any other subject
  • Economically disadvantaged(2) students tend to improve less than affluent ones
  • Students who move tend to improve less than similar students who do not move

These insights aligned with education research and supported ongoing strategic initiatives at Commit. The team then built regression models to further their analyses and better understand the factors that were most predictive of future assessment growth. The models confirmed that previous assessment scores were the strongest predictor of later assessment scores, more so than demographic or school-based characteristics.
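As a rough illustration of this kind of comparison, the Python sketch below measures how much of the variation in a later score is explained by a prior score versus by a single demographic flag. It is a simplified stand-in, not the team's actual models: the records are synthetic, and the score scale, effect sizes, and function names (`simulate_students`, `r_squared`) are hypothetical assumptions. The simulation deliberately builds in a strong link between prior and later scores, so the output demonstrates the method rather than reproducing the findings.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(x, y):
    """R^2 of a one-predictor least-squares fit (equal to Pearson r squared)."""
    return pearson_r(x, y) ** 2


def simulate_students(n=2000, seed=0):
    """Generate SYNTHETIC records for illustration only (not real student data)."""
    rng = random.Random(seed)
    prior, disadvantaged, later = [], [], []
    for _ in range(n):
        flag = 1.0 if rng.random() < 0.5 else 0.0
        # Hypothetical scale scores; every coefficient here is an assumption.
        p = rng.gauss(1500 - 40 * flag, 100)
        score = 0.8 * p + 300 - 15 * flag + rng.gauss(0, 60)
        prior.append(p)
        disadvantaged.append(flag)
        later.append(score)
    return prior, disadvantaged, later


prior, disadvantaged, later = simulate_students()
r2_prior = r_squared(prior, later)
r2_flag = r_squared(disadvantaged, later)
print(f"R^2 using prior score only:       {r2_prior:.2f}")
print(f"R^2 using demographic flag only:  {r2_flag:.2f}")
```

A fuller analysis would fit all predictors together in one regression and check the results on held-out students; comparing one predictor at a time is only the simplest way to see which variable carries the most signal.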

The plot above shows 2012 and 2016 reading score distributions. Among a subset of students who had similar scores in 2012, those who moved in 2014 had slightly lower scores two years later than those who did not move.
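One simple way to make a comparison like this is to group students into bands of similar baseline scores and compare movers with non-movers inside each band. The Python sketch below shows that idea on a handful of made-up records. The function name `matched_gap`, the band width of 50 points, and the choice to weight each band by its number of movers are all illustrative assumptions, not a description of how the team built its plot.

```python
from collections import defaultdict
from statistics import mean


def matched_gap(records, band_width=50):
    """Average later-score gap (movers minus non-movers) within baseline bands.

    records: iterable of (baseline_score, moved, later_score) tuples.
    Bands that lack either movers or non-movers are skipped.
    Returns None if no band contains both groups.
    """
    bands = defaultdict(lambda: {True: [], False: []})
    for baseline, moved, later in records:
        bands[int(baseline // band_width)][bool(moved)].append(later)

    weighted_sum, total_weight = 0.0, 0
    for groups in bands.values():
        if groups[True] and groups[False]:
            gap = mean(groups[True]) - mean(groups[False])
            weight = len(groups[True])  # weight each band by its mover count
            weighted_sum += gap * weight
            total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


# Made-up records for illustration only: (baseline score, moved?, later score)
toy_records = [
    (1410, False, 1500), (1420, True, 1480),
    (1430, False, 1510), (1440, True, 1490),
    (1510, False, 1600), (1520, True, 1590),
    (1700, False, 1800),  # no mover in this band, so it is skipped
]
gap = matched_gap(toy_records)
print(round(gap, 1))  # -16.7 for the toy records above
```

A negative value means movers scored lower later on than non-movers who started from a similar baseline. Banding only controls for the baseline score, so any gap it reports is descriptive and does not by itself show that moving caused the difference.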

Commit will be able to reproduce findings and build upon existing models to leverage their data to support educational strategies and policies that could bolster outcomes for students. “We were working through a large, relatively complex data set; it could have been easy to lose track of the message and overall impact that we wanted to communicate,” said Ashwina Kirpalani, Commit’s Managing Director of Analytics. “The DataKind team, however, had a great way of isolating the most relevant piece of the analysis and visualizing that insight in a clear, intuitive way.”

The Commit team has already used DataKind’s findings on the predictive power of students’ early scores to inform discussions about interventions with its early childhood education partners. “DataKind produced an analysis around 3rd grade reading proficiency and how it plays a crucial role in a student’s success in future grades,” said Chris Hudgens, Director of Regional Analytics at Commit. “This deeply insightful analysis has helped us have more productive conversations with school district partners around what strategies they can employ to support early grade literacy and strengthened our case as we encourage potential donors to support early literacy strategies that work.”

  1. National Center for Education Statistics, The Condition of Education 2018, https://nces.ed.gov/pubs2018/2018144.pdf.
  2. The Texas Education Agency defines an economically disadvantaged student as one who is eligible for free or reduced-price meals under the National School Lunch and Child Nutrition Program, which is based on income eligibility guidelines.