By Caitlin Augustin, PhD, Vice President, Product & Programs, DataKind
Insights from a Complete College America NEXT panel on how common data models – and shared data infrastructure – can unlock better outcomes for students. At DataKind, we see every day how much possibility is trapped behind fragmented systems. This conversation explored not just the technical challenge, but the opportunity: building the data foundation needed for more students to thrive.
A public-good common data model can turn fragmented student records into a holistic foundation – making GenAI both useful and safe in higher education.
On any given day, higher education staff might hop between dozens of systems just to understand how students are doing. A learner’s story lives in a Student Information System (SIS), a Learning Management System (LMS), a CRM platform used for outreach, financial aid platforms, degree audit tools, and a growing list of edtech apps. Staff do heroic work to stitch those fragments into something actionable – but too often they’re asked to intervene with only the “tip of the iceberg” visible.
It’s a question we hear often in conversations with colleges – and one we explored deeply in our recent session at Complete College America’s NEXT Conference: what becomes possible if our data systems truly talk to each other?
Panel Speakers
Our panel brought together leaders working across the ecosystem:
- Sonia Jindal, Program Officer, Gates Foundation
- John Harnisher, PhD, Head of Education Research, DataKind
- Nick Walsh, Principal & Technical Lead, DataKind
- And me, representing DataKind’s work building public-good data infrastructure for higher education

Photo of panel speakers: John Harnisher, Nick Walsh, Caitlin Augustin, Sonia Jindal
The interoperability problem isn’t new. The urgency is.
For more than two decades, education has developed standardized data models – Common Education Data Standards (CEDS), IMS Global standards (IMS), Ed-Fi (widely used in K-12), and others – to support interoperability across systems and institutions. The vision is right. Adoption is uneven.
From our discussion, we heard three major reasons why:
- Local variation even with standards. Custom fields accumulate, definitions drift, and the “common” model becomes many dialects.
- Standards without usable tooling. A model in a PDF doesn’t help campuses wrestling with Extract-Transform-Load (ETL) pipelines and reporting deadlines. Adoption becomes a heavyweight lift for already-stretched data teams.
- A public-goods challenge. Everyone benefits from shared infrastructure, but no single actor is naturally incentivized to steward it long-term.
So why does this moment demand action? Because the costs of fragmentation are growing – and the opportunity for shared progress has never been greater. Every new digital tool adds another stream – and another silo unless we integrate effectively. At the same time, GenAI creates a rare window to make interoperability less manual, more scalable, and more flexible than ever.
Based on what we’ve learned at DataKind, a shared model doesn’t just tidy up data pipelines – it expands what people on the ground can actually do. If a student stops out, re-enrolls, switches modalities, or stacks credentials, we should be able to see that in one place. Today, that’s rarely the case.
What a common data model can unlock
A common data model provides a consistent way to represent and interpret student data across systems and institutions. When it works, it changes what’s practical.
- Stronger insights. Instead of piecemeal retention dashboards built on partial records, institutions can generate deeper, more consistent insights about who students are, how they’re engaging, and what support they need.
- Visibility into the whole learner. Interoperable data surfaces transfer pathways, stop-outs, credential stacking, and re-entry across the P-20 continuum – exactly where students’ journeys often get lost today.
- Faster basics. Integrated Postsecondary Education Data System (IPEDS) reporting, Postsecondary Data Partnership (PDP) reporting, compliance workflows, and the metrics institutions rely on to measure progress still drain huge capacity because each requires major data wrangling, campus by campus. Standardization helps this work move faster and with fewer errors.
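To make "visibility into the whole learner" concrete, here is a minimal Python sketch of what a shared representation could enable. Everything in it is hypothetical: the `LearnerEvent` shape, the event names, and the sample records are illustrations, not part of any published model, and the sketch assumes student identifiers have already been reconciled across systems.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LearnerEvent:
    """One event in a learner's journey (hypothetical shape, for illustration)."""
    student_id: str  # assumed to be reconciled across systems already
    when: date
    source: str      # e.g. "SIS" or "LMS"
    kind: str        # e.g. "enrolled", "stopped_out", "course_login"


def build_timelines(*feeds):
    """Merge per-system event feeds into one date-ordered timeline per student."""
    timelines = defaultdict(list)
    for feed in feeds:
        for event in feed:
            timelines[event.student_id].append(event)
    for events in timelines.values():
        events.sort(key=lambda e: e.when)
    return dict(timelines)


# Made-up sample data: the SIS knows about enrollment, the LMS about engagement.
sis_feed = [
    LearnerEvent("s-001", date(2023, 8, 21), "SIS", "enrolled"),
    LearnerEvent("s-001", date(2024, 1, 10), "SIS", "stopped_out"),
    LearnerEvent("s-001", date(2024, 8, 19), "SIS", "re_enrolled"),
]
lms_feed = [
    LearnerEvent("s-001", date(2023, 9, 5), "LMS", "course_login"),
    LearnerEvent("s-001", date(2024, 9, 2), "LMS", "course_login"),
]

timelines = build_timelines(sis_feed, lms_feed)
for event in timelines["s-001"]:
    print(event.when, event.source, event.kind)
```

The merge itself is trivial once every system emits events in the same shape. The hard part, and the part a common model is meant to address, is agreeing on that shape and on what each event name means.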
Why “public good” matters
For this work, the model must serve the sector – not any one company.
Every system already has a data model. The stakes are in who controls it and who can build on it. If a model is locked inside a proprietary platform, its benefits stop at that vendor’s boundary. A public-good model keeps ownership with institutions, while giving the entire ecosystem a shared foundation – enabling open innovation, cross-platform tool development, and resilience across funding cycles.
Other sectors show what it takes to make interoperability stick. Healthcare’s FHIR (Fast Healthcare Interoperability Resources) standard succeeded because it’s modular, extensible, and backed by strong governance. In tech, OAuth and W3C standards became defaults because they combined clarity with excellent developer experience. Stripe’s rise is a reminder that documentation and tooling are adoption strategies, not afterthoughts.
Higher education can learn the same lesson: aligned incentives, strong community stewardship, and tools that meet people where they work. And amid all the excitement about AI, we can’t lose sight of the basics: students benefit most when the people supporting them have trustworthy data at their fingertips.
GenAI as an accelerant – with guardrails
GenAI won’t fix messy data by magic – but it can reduce some of the hardest friction points in getting to a common model:
- Mapping fields across systems: Large language models (LLMs) can infer relationships between fields across disparate systems, accelerating early-stage ETL.
- Validation and enrichment: AI can flag anomalous or missing values, helping teams trust what they standardize.
- Conversational data use: Once data is standardized, institutions can safely explore chat-based analytics and advising experiences – broadening access to insights beyond a small institutional research (IR) bottleneck.
But AI must be earned, not assumed. LLMs can be confidently wrong, and there is no perfect “box” to confine them in. Human-in-the-loop review, privacy protections, and clear governance are non-negotiable.
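One way to picture field mapping with a human in the loop is the sketch below. It is not how any particular tool works: plain string similarity from Python's standard `difflib` module stands in for the LLM, the field names are invented, and the 0.85 threshold is an arbitrary illustration. The point is the routing: a proposal is never silently applied unless it clears a confidence bar, and everything else goes to a reviewer.

```python
from difflib import SequenceMatcher

# Hypothetical field names: a common model and one campus SIS export.
COMMON_FIELDS = ["student_id", "enrollment_status", "term_code", "credits_attempted"]
LOCAL_FIELDS = ["StudentID", "enrl_status", "TermCd", "cr_att", "parking_permit"]

HIGH_CONFIDENCE = 0.85  # illustrative threshold, not a recommendation


def normalize(name):
    """Lowercase and drop underscores so naming conventions don't dominate the score."""
    return name.lower().replace("_", "")


def propose_mappings(local_fields, common_fields, threshold=HIGH_CONFIDENCE):
    """Suggest a common-model field for each local field.

    String similarity is a stand-in here for an LLM's suggestion.
    Returns (high_confidence, needs_review), each mapping a local field
    to a (suggested_common_field, score) pair.
    """
    high_confidence, needs_review = {}, {}
    for local in local_fields:
        scored = [
            (SequenceMatcher(None, normalize(local), normalize(common)).ratio(), common)
            for common in common_fields
        ]
        score, best = max(scored)
        bucket = high_confidence if score >= threshold else needs_review
        bucket[local] = (best, round(score, 2))
    return high_confidence, needs_review


high_confidence, needs_review = propose_mappings(LOCAL_FIELDS, COMMON_FIELDS)
print("High confidence:", high_confidence)
print("Queue for human review:", needs_review)
```

Note that a field like `parking_permit` has no counterpart in the common model at all, so a reviewer needs the option to reject a suggestion outright rather than only choose among candidates.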
As one panelist summed it up: AI readiness starts with data readiness. Without clean, standardized, well-understood data, AI tools can’t reliably support students – and may introduce new risks.
Standardization without erasing campus reality
A concern we heard repeatedly was: “We don’t call it that here,” or “Our categories are different.” Institutions have valid local requirements, and the solution isn’t a brittle one-size-fits-all schema. It’s a modular model with disciplined extensions – common where it matters, flexible where it must be.
Think of it this way: agree on what an “apple” is, then allow campuses to note whether theirs is Golden Delicious or McIntosh. Interoperability depends on shared cores and carefully governed variation. And getting there requires co-design all the way through implementation, not standards handed down from afar.
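The "shared core, governed variation" idea can be sketched in a few lines of Python. The record shape, the status vocabulary, and the `examplecollege` namespace below are all hypothetical, chosen only to illustrate the pattern: the core field must come from an agreed vocabulary, while campus-specific detail lives in namespaced extensions that other institutions can safely ignore.

```python
from dataclasses import dataclass, field

# Hypothetical shared vocabulary for the core field (the agreed "apple").
CORE_ENROLLMENT_STATUSES = {"enrolled", "stopped_out", "withdrawn", "completed"}


@dataclass
class EnrollmentRecord:
    student_id: str
    status: str  # core: must come from the shared vocabulary
    # local: campus-specific detail, keyed as "<namespace>.<name>"
    extensions: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.status not in CORE_ENROLLMENT_STATUSES:
            raise ValueError(f"unknown core status: {self.status!r}")
        for key in self.extensions:
            if "." not in key:
                raise ValueError(f"extension key must be namespaced: {key!r}")


# The campus keeps its own label (the "Golden Delicious") alongside the core value.
record = EnrollmentRecord(
    student_id="s-001",
    status="stopped_out",
    extensions={"examplecollege.local_status": "Leave of Absence - Medical"},
)
print(record.status, record.extensions)
```

Any tool that understands the core can read `status` without knowing anything about one campus's terminology, and the campus loses none of its local meaning.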
What success looks like
For institutions, it means time back. For students, it means support that actually reaches them. For the ecosystem, it means durable, scalable solutions – not one-off pilots.
Success isn’t just cleaner pipelines – it’s a new operating reality:
- Institutions can adopt new student success or AI-enabled tools faster because they connect to a shared model.
- Student records aren’t trapped in any one platform; schools own their data and can move with confidence.
- The ecosystem has a portable, trustworthy language for success metrics across contexts.
- AI agents and dashboards are grounded in data people actually trust – not patchwork approximations.
Join us in building the foundation
At DataKind, we’re building the first version of a public-good common data model for student success, starting with three core systems: SIS, LMS, and degree audit. Our goal is a first release in summer 2026, shaped by the institutions, researchers, and vendors who will use it.
This is exactly the kind of shared infrastructure DataKind is committed to building: practical, ethical, public-good tools that help institutions serve more learners well.
If you want to help test assumptions, stress-test definitions, or co-design what “standard” should mean in practice, we’d love to work with you through light-lift quarterly sessions and targeted feedback cycles.
Get involved
If you’re interested in shaping the future of ethical, effective AI in higher education, we’d love to connect.
- Join our Software Advisory Group to help guide the development of our public-good common data model
- Explore our latest products, including DataKind Edvise
- Subscribe to our newsletter for updates on research, releases, and opportunities to collaborate
- Follow us on social (LinkedIn | X | Facebook | YouTube)
- Get in touch at education@datakind.org to partner with our team
Interoperability is infrastructure. And infrastructure is how we scale student success – together.