Data, Databases, Standards (Data Ecosystem)

Data, databases, and standards constitute a key component of the learn-and-work ecosystem (a data ecosystem). Data are collected to meet accountability requirements set by entities such as federal and state governments and higher education institutions’ boards of regents. These requirements are typically established by policymakers and higher education leaders to assure return on investment (ROI) in credentialing (education) and in workforce development (training).

Data also are collected to help answer important questions about the efficacy of the many components of the learn-and-work ecosystem. Data are used to answer questions such as:

  • What is working (or not) in the ecosystem?
  • Is the needle moving, and for whom? Are there clear outcomes? For example, are some populations left out or poorly served, such as groups defined by race-ethnicity, gender, or age; adults with some college but no credential; adults with no postsecondary education; veterans; incarcerated people; low-income people; or people with disabilities?
  • What changes and fine-tuning are called for in the various components of the ecosystem?  
  • Is there interoperability among databases in the ecosystem to enable us to follow learners as they move between education and work? Are learner and worker privacy rights observed?
  • What are the trends and projections, and how may the future of work and learning affect the ecosystem?

These questions cannot be answered without information, and that requires collecting and storing information, enabling interoperability, and protecting users’ privacy.

In the complex, highly decentralized U.S. learn-and-work ecosystem, significant challenges complicate efforts to develop and maintain adequate data systems. Key challenges include the lack of a coherent national infrastructure for data systems; the large number of complex technology efforts working within the ecosystem (hardware and software); significant siloing (common in decentralized systems); and lack of resources to update, coordinate, and connect disparate systems. For example:

  • There are nearly 4,000 postsecondary institutions in the U.S. They all collect data (which typically then constitute longitudinal databases) on their student enrollments and completions, the number of faculty and staff they employ, the size of their libraries, etc. Most institutions develop independent systems, though they often work with commercial vendors that market common products. In either case, many institutions customize their systems to their own circumstances, which hampers or even precludes interoperability. As a culture, many higher education institutions also are hard-pressed to share data with other institutions, except for data required of Title IV institutions by the U.S. Department of Education’s National Center for Education Statistics (NCES).
  • Numerous commercial providers sell hardware and software that assist credential providers in their work. One key product is a student information system (SIS), a management information system that institutions use to manage student data; register students in courses; document grading, student transcripts, and assessment scores; form student schedules; track student attendance; generate reports; and manage other data needs. Information security is a continuous concern since higher education institutions house an array of sensitive personal information, making them attractive targets for security breaches. Software services in other areas also are available from many commercial providers, including: Academic/Education; Asynchronous Learning; Corporate/Business; Course Authoring; Course Management; eCommerce Management; eLearning Companies.
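
The interoperability problem described above, in which each institution customizes its own system, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the two export formats, the field names, the term codes, and the shared EnrollmentRecord format are assumptions made for illustration and do not represent any real SIS product or published standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnrollmentRecord:
    """Hypothetical common format that both institutions agree to share."""
    student_id: str
    term: str       # normalized as "YYYY-SEASON", e.g. "2024-FALL"
    credits: float


# Hypothetical local exports: same facts, different field names and codes.
college_a_rows = [{"SID": "A-001", "Term": "Fall 2024", "CreditHrs": "12"}]
college_b_rows = [{"student": "B-77", "semester": "2024FA", "units": 9.0}]

# College B's (assumed) two-letter season codes.
B_SEASONS = {"FA": "FALL", "SP": "SPRING", "SU": "SUMMER"}


def from_college_a(row: dict) -> EnrollmentRecord:
    # College A writes terms as "Fall 2024" and stores credits as text.
    season, year = row["Term"].split()
    return EnrollmentRecord(row["SID"], f"{year}-{season.upper()}", float(row["CreditHrs"]))


def from_college_b(row: dict) -> EnrollmentRecord:
    # College B writes terms as "2024FA" and stores credits as a number.
    code = row["semester"]
    return EnrollmentRecord(row["student"], f"{code[:4]}-{B_SEASONS[code[4:]]}", float(row["units"]))


# Once both exports are in the common format, they can be analyzed together.
combined = [from_college_a(r) for r in college_a_rows]
combined += [from_college_b(r) for r in college_b_rows]
total_credits = sum(r.credits for r in combined if r.term == "2024-FALL")
```

Without an agreed common format, every pair of institutions that wants to exchange data needs its own translation. With one, each institution writes a single adapter.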

The learn-and-work ecosystem features databases that organize data on myriad subjects (e.g., enrollment and completion, learner outcomes, resumés, job ads, supply and demand, wages and employability, and even specialty databases that develop from research studies on a variety of topics). This variety complicates interoperability efforts. For example:

  • There are private data companies such as Lightcast (formerly Emsi Burning Glass), which gathers and integrates economic, labor market, demographic, education, profile, and job posting data from dozens of government and private-sector sources. This creates a comprehensive and current dataset that includes both published data and detailed estimates with full U.S. coverage. Industry, occupation, education, demographic, job postings, and profiles data are available at national, state, metropolitan area, and county levels. ZIP code estimates are available for employment, earnings, job change, and demographics data. Industry data is the backbone of the company’s core Labor Market Information (LMI) data and focuses on businesses, categorized by type: hospitals, oil refineries, grocery stores, etc. The Bureau of Labor Statistics’ Quarterly Census of Employment and Wages (QCEW) dataset provides detailed employment counts and earnings information for 95% of the employed workforce in the U.S., broken out by industry. The company treats the QCEW employment counts as the gold standard for employment counts throughout its data.
  • The Census Bureau’s Population Estimates Program, which publishes estimates down to the county level, is the most significant source of population data in the U.S.
  • A growing number of platforms draw on data acquired from multiple sources. The platforms typically generate tailor-made matches for users of this data; e.g., for individuals to match their interests or skills with job ads. One such platform is among the world’s largest employment websites and job search engines; it is used primarily to help job seekers find openings that match their skills and location. There are a growing number of alternative sites for a variety of platforms (including online/web-based, Android, iPhone, iPad, and SaaS) such as LinkedIn, Indeed, Glassdoor, Xing, and Polywork. The alternatives are mainly job search services but may also be social networks or classified ad services. They all rely on extensive databases.

Issues of critical importance in the area of data, databases, and standards include interoperability; the use of common standards to permit the exchange of data among nations; and the use of a common language to enable transparency in credentialing. Rapid advances in technology have enabled a digital transformation in our data systems. It’s now possible for us to move from inadequate, siloed data systems to interoperable systems that enable sharing among the many components of the learn-and-work ecosystem. This would offer data users far more tailor-made services.

However, a lack of interoperability, particularly within the campus ecosystem, continues to challenge higher education. As a result, “IT teams are often bogged down with navigating the quirks and idiosyncrasies of specific brands of technology used by colleges and universities. Systems don't work together the way they should, or worse, inferior technologies win out because of their compatibility with legacy systems.”

There is growing recognition, however, of the benefits of data interoperability: “IT teams will spend less time learning the arcane details of different technologies and more time working toward achieving strategic objectives. With access to a greater range of solutions from different providers, colleges and universities will be better equipped to move quickly and confidently when facing unforeseen circumstances or when new challenges arise. Faculty and staff will be able to freely share data across the institution while maintaining a single source of truth for analytics, enabling data-informed decision-making in support of student and institutional success.”

Many organizations are working to address these technological and cultural challenges, often through partnerships and common visions. These include:

  • The National Student Clearinghouse, a nonprofit, nongovernmental organization, works to relieve the administrative burdens and costs related to student data reporting and exchange. The Clearinghouse is the leading provider of educational reporting, data exchange, verification, and research services in the United States. Its participants include 3,600 colleges and universities, which enroll 97% of students in public and private institutions, and 20,000 high schools, which enroll 70% of secondary students. Work is performed in a trusted, secure, and private environment. Its research arm, the National Student Clearinghouse® Research Center™, works to better inform practitioners and policymakers about student educational pathways and enable informed decision making. Clearinghouse services are designed to facilitate compliance with the Family Educational Rights and Privacy Act, the Higher Education Act, and other applicable laws.
  • The National Center for Education Statistics (NCES) is the primary federal entity for collecting and analyzing data related to education in the U.S. and other nations. NCES is located within the U.S. Department of Education and the Institute of Education Sciences. NCES fulfills a Congressional mandate to collect, collate, analyze, and report complete statistics on the condition of American education; conduct and publish reports; and review and report on education activities internationally.
  • The Integrated Postsecondary Education Data System (IPEDS) is a system of interrelated surveys conducted annually by the NCES. IPEDS consists of related survey components collected over three collection periods (fall, winter, and spring) each year as described in the Data Collection and Dissemination Cycle. The completion of all IPEDS surveys is mandatory for all institutions that participate in, or are applicants for participation in, any federal financial assistance program authorized by Title IV of the Higher Education Act of 1965, as amended. IPEDS collects data on postsecondary education in the U.S. in the following areas: institutional characteristics, institutional prices, admissions, enrollment, student financial aid, degrees and certificates conferred, student persistence and success (retention rates, graduation rates, and outcome measures), institutional human resources, fiscal resources, and academic libraries. 
  • The American National Standards Institute (ANSI) oversees standards and conformity assessment activities in the U.S. A private, nonprofit organization founded in 1918, the Institute works in close collaboration with stakeholders from industry and government to identify and develop standards- and conformance-based solutions to national and global priorities. ANSI is not itself a standards-developing organization. Rather, it provides a framework for fair standards development and quality conformity assessment systems, works to safeguard their integrity, and serves as a neutral venue for coordination of standards-based solutions.
  • The Postsecondary Electronic Standards Council (PESC) is an open standards-development and open standards-setting body governed by a voluntary, consensus-based model. Working with many partners, PESC has developed and issued approved standards for admissions, financial aid, and registrar’s offices. PESC has led the establishment and adoption of trusted, free, and open data standards across all sectors of education. Its mission is to enable cost-effective connectivity between data systems, networks, and applications in order to accelerate performance and service; simplify data access and research; and improve data quality along the learner lifecycle.
  • 1EdTech calls for campus ecosystems to be integrated, flexible, and extensible. They should be filled with solutions that work together but be loosely coupled to achieve the most promising vision for information technology in higher education. This type of interoperability can be achieved in the campus ecosystem with the widespread adoption of open technology standards.
  • EDUCAUSE is a nonprofit association with a mission to advance higher education through the use of information technology. It features a large collection of information about higher education technology. The EDUCAUSE Taxonomy, an expert-created listing of over 200 terms, helps users find information among a wealth of online resources. The EDUCAUSE Library is a key clearinghouse for information about timely topics and research supporting the use and management of technology in higher education. It aggregates over 22,000 resources.
  • The Groningen Declaration Network is an international, nonprofit, and voluntary network that supports academic and professional digital credential mobility. It seeks to enable citizens worldwide to consult and share their authentic educational data autonomously, with the expectation of fair recognition. It does this by bringing together stakeholders from across the global Digital Student Data Ecosystem. The Network consists of participants and signatories from over 29 countries worldwide.
  • HR Open Standards is dedicated to the development and promotion of common specifications that simplify the exchange of data related to human resources. By championing collaboration and innovation, HR Open Standards leads standards-development projects. This saves HR professionals time and money by providing employers, government, software and service providers with free, flexible, and comprehensive global HR interoperability standards.
  • The T3 Innovation Network, led by the U.S. Chamber of Commerce Foundation, explores emerging technologies and standards in the talent marketplace to create more equitable and effective learning and career pathways. 
  • In North Carolina, Finish First NC (FFNC) is a data tool developed and originally used by Wake Technical Community College to identify students within one semester of completing a degree, diploma, or certificate, and to identify students who have already fulfilled requirements for credentials but were not aware of them. The tool is now used in 54 community colleges. In a single academic year, Finish First NC helped Wake Tech identify more than 31,500 credentials of students close to completion and more than 6,500 credentials eligible to be awarded to students. From spring 2017 to spring 2020, FFNC helped Wake Tech identify and award more than 11,000 certificates. In fall 2020, Wake Tech participated with the UNC System and the company InsideTrack to reach out to the students identified as “stop-outs” by the Finish First NC data tool and encourage them to re-enroll.
  • JEDX (Jobs and Employment Data Exchange), implemented by the Colorado Department of Higher Education with the Colorado Workforce Development Council, is a public-private approach for organizing, collecting, and using standards-based data on jobs and employment for the benefit of learner and employer communities. It is envisioned as a national public-private data trust to (1) improve government reporting (e.g., UI wage records); (2) support workforce analytics; and (3) make possible verifiable employment records for worker use (e.g., benefit eligibility).  
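
The kind of check attributed above to Finish First NC (finding students who have already met a credential's requirements, or who are close to doing so) can be sketched in a few lines of Python. This is not the actual tool: the course codes, credential names, student records, and the rule that "close" means at most two remaining courses are all hypothetical assumptions chosen for illustration.

```python
# Hypothetical requirements: credential name -> set of required course codes.
REQUIREMENTS = {
    "Welding Certificate": {"WLD110", "WLD115", "WLD121"},
    "IT Support Diploma": {"CIS110", "NET125", "NOS110", "SEC110", "CTS120"},
}

# Hypothetical transcripts: student ID -> set of completed course codes.
COMPLETED = {
    "s1": {"WLD110", "WLD115", "WLD121"},
    "s2": {"CIS110", "NET125", "NOS110"},
    "s3": {"WLD110"},
}

# Assumption: a student is "close" if at most this many courses remain.
MAX_REMAINING = 2


def find_candidates(completed, requirements, max_remaining):
    """Return (eligible, near) lists for outreach.

    eligible: (student, credential) pairs with every requirement met.
    near:     (student, credential, missing courses) with few courses left.
    """
    eligible, near = [], []
    for student, courses in sorted(completed.items()):
        for credential, required in sorted(requirements.items()):
            missing = required - courses  # set difference: what is still needed
            if not missing:
                eligible.append((student, credential))
            elif len(missing) <= max_remaining:
                near.append((student, credential, sorted(missing)))
    return eligible, near


eligible, near = find_candidates(COMPLETED, REQUIREMENTS, MAX_REMAINING)
for student, credential in eligible:
    print(f"{student} has met all requirements for: {credential}")
for student, credential, missing in near:
    print(f"{student} needs {missing} to finish: {credential}")
```

A real degree audit would also handle electives, grade minimums, and course substitutions. The sketch shows only the core comparison between completed and required courses.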

Key sources of data to inform the learn-and-work ecosystem, as noted by Lightcast, include: 

  • Traditional Labor Market Information (LMI): From government sources (Bureau of Labor Statistics, U.S. Census Bureau, etc.), this is industry, occupation, education, and demographic data that is useful for understanding the structure of an economy and the major trends in jobs and wages. Traditional LMI lacks a certain level of detail and isn’t collected very often, so it is strongest when used in concert with job posting analytics and profile data.
  • Job Posting Analytics (JPA): From online job postings, this is data collected from hundreds of millions of job postings created by employers. JPA can help measure the demand for talent in a given region. It is more granular than traditional LMI, providing details about the labor market (e.g., specific skills requested by employers). JPA also has no time lag, since job postings are live. However, the number of postings may be either higher or lower than the number of actual hires. Postings might outnumber hires when a company is trying hard to find talent, or postings may be significantly fewer than hires because certain types of jobs are not typically advertised online.
  • Profile Data: From online profiles and resumés, this is public, self-reported information about individuals’ city/state/nation of residence, job history, education history, and skills. Profile data can help measure the supply of talent in a region. Profile data complements LMI because it includes levels of detail impossible for LMI to provide.
  • Compensation Data: From government sources and postings data, these are wage estimates based on government wage data and on self-reported wage data from job postings. Compensation data can help estimate how much a position should be paid based on a worker’s actual experience and skills, rather than a job title. Compensation data contains estimates, not hard numbers; nevertheless, these estimates are reliable because the methodology reduces the sample bias that is rampant in self-reported compensation.
  • Global Data: From government and profile data sources around the world, this is industry, occupation, and profile data collected from various countries. Global data helps users compare markets across borders, determine where a company might expand, and understand the skills landscape. Countries use different names and definitions when they report their talent supply; this disparity can be addressed by using a taxonomy that unifies the occupation categories between countries.
  • Skills: From job postings, resumés, and online profiles, this is data on skills possessed by real people in the real world. Emsi Skills creates a common language for students and jobseekers, colleges and universities, employers, and communities. Students and jobseekers can use Emsi Skills to write better resumés and discover the skills most in demand. Colleges and universities can use it to design and market work-relevant programs that help students communicate their skills to potential employers. Communities can use skills to describe the talent that businesses need, the abilities that local people have, and any gaps between the two. Skills are the basic units that define the economy. They often describe a person’s job much more accurately than the raw job title. This is why skills make such an effective common language.
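
The taxonomy idea mentioned under Global Data above can be illustrated with a small Python sketch. The crosswalk entries, occupation labels, unified category names, and counts are invented for illustration; they are not drawn from Lightcast or from any official classification.

```python
from collections import Counter

# Hypothetical crosswalk: (country, local occupation label) -> unified category.
CROSSWALK = {
    ("US", "Registered Nurses"): "nursing professionals",
    ("UK", "Nurses"): "nursing professionals",
    ("US", "Software Developers"): "software developers",
    ("UK", "Programmers and software development professionals"): "software developers",
}

# Hypothetical reported counts: (country, local label, number of workers).
reported = [
    ("US", "Registered Nurses", 300),
    ("UK", "Nurses", 120),
    ("US", "Software Developers", 200),
    ("UK", "Programmers and software development professionals", 80),
    ("UK", "Care workers", 50),  # deliberately missing from the crosswalk
]


def unify(rows, crosswalk):
    """Re-key counts by unified category; collect labels with no mapping."""
    totals = Counter()
    unmapped = []
    for country, label, count in rows:
        category = crosswalk.get((country, label))
        if category is None:
            unmapped.append((country, label))  # flag for manual review
        else:
            totals[(country, category)] += count
    return totals, unmapped


totals, unmapped = unify(reported, CROSSWALK)
# With shared category names, the two countries can be compared directly.
nurse_gap = totals[("US", "nursing professionals")] - totals[("UK", "nursing professionals")]
```

Keeping the unmapped labels visible, rather than silently dropping them, shows where the taxonomy still needs work before cross-border comparisons can be trusted.
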

Alternative Terms

  • Credential Management Systems
  • Database Management Systems
  • Student Information Systems
  • Learning Management Systems
  • Information Systems
  • Relational Databases

Relationship to Ecosystem

Data, databases, and standards are critical to the learn-and-work ecosystem because they enable checks and balances for the system. They generate information to meet accountability requirements of policymakers at the federal, state, and local levels. They also help answer questions about return on investment posed by those who provide funds (e.g., governments, foundations) and set the legislative direction for the ecosystem. This is especially the case for credentials and providers, and employers and workforce.


Learn & Work Ecosystem Library. Glossary: Credential Management Systems (CMS)

Wulff, Mike. (October 18, 2021). Interoperability: How to Turn a Blind Spot into a Strength. EDUCAUSE
