Pillar 03 · Research Program

STAGE: Performing Arts Analytics

A standardized data warehouse enabling cross-institutional research on opera audience behavior, demographics, and the effectiveness of engagement strategies — at a scale never before possible.

Overview

A Data Infrastructure for the Performing Arts

STAGE (Standardized Theater and Arts Granular Evidence) is a performing arts data warehouse developed in partnership with major opera institutions in North America and Europe. It addresses a fundamental gap: opera houses collect rich data about their audiences, but this data is siloed, inconsistently structured, and rarely used for rigorous research.

The STAGE schema defines 184 standardized fields for performing arts event and audience data, enabling cross-institutional analysis at a scale previously impossible. Partner institutions contribute anonymized audience, ticket, and engagement records under formal data sharing agreements, allowing researchers to study audience behavior as a population phenomenon.

The analogy to biomedical research is direct: just as clinical data platforms transformed our understanding of disease through large-scale, multi-site data integration, STAGE creates the infrastructure for evidence-based performing arts research. Dr. Rubin's career building NIH-funded biomedical data platforms provides the methodological foundation.

STAGE is designed with interoperability as a first principle, supporting integration with Tessitura — the industry-standard ticketing and CRM platform used by most major opera houses and performing arts organizations worldwide. This enables longitudinal studies tracking individual audience members' engagement trajectories over years.
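
Longitudinal tracking across anonymized records generally requires a stable pseudonymous key for each audience member. The sketch below shows one common way to derive such a key, assuming a keyed hash (HMAC-SHA256) over a source-system patron ID. The helper name `pseudonymize_patron_id`, the secret, and the ID format are hypothetical illustrations; they are not taken from the STAGE specification or from Tessitura.

```python
import hashlib
import hmac


def pseudonymize_patron_id(patron_id: str, site_secret: bytes) -> str:
    """Return a stable pseudonymous key for a patron ID (hypothetical helper).

    The same ID and secret always give the same key, so records from
    different seasons can be linked without exposing the original ID.
    """
    digest = hmac.new(site_secret, patron_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical secret, assumed to be held only by the contributing site.
secret = b"example-secret-held-by-the-contributing-site"

# The same patron seen in two different seasons maps to the same key.
key_2021 = pseudonymize_patron_id("patron-000123", secret)
key_2024 = pseudonymize_patron_id("patron-000123", secret)
other_key = pseudonymize_patron_id("patron-000124", secret)
```

A keyed hash is used here rather than a plain hash so that someone without the secret cannot recompute keys from guessed patron IDs.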

A bilingual (English/Spanish) data request framework has been developed to facilitate data sharing with European partner institutions, including those operating in Spanish-speaking regions. Controlled terminology spanning 21 data domains ensures consistency across contributing sites.

The platform is developed in collaboration with Opera Verace Foundation, which manages institutional partnerships and data sharing agreements. Research outputs will be published in peer-reviewed venues and shared as open methodologies with the broader performing arts research community.

Data Architecture

184-Field Schema

The STAGE schema covers all aspects of performing arts audience engagement, from initial ticket discovery through post-performance longitudinal follow-up. Key data domains include:

Audience demographics
Ticket purchase behavior
Event programming data
Artist engagements
Geographic reach
Digital engagement
Subscription patterns
Donation history
Pre-visit intent surveys
Post-visit satisfaction
Docent interaction logs
Return visit behavior
Referral pathways
Content exposure history
Social media signals
Controlled terminology (21 domains)
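
As an illustration of how controlled terminology can keep contributed records consistent, the sketch below validates one field against a fixed vocabulary. The record type `TicketRecord`, its field names, and the terms in `TICKET_CHANNEL_TERMS` are all assumptions made for this example; the actual STAGE fields and vocabularies are not listed here.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary for a single domain (assumed terms).
TICKET_CHANNEL_TERMS = frozenset({"online", "box_office", "phone", "subscription"})


@dataclass(frozen=True)
class TicketRecord:
    """A minimal, hypothetical ticket record with three illustrative fields."""
    patron_key: str
    event_id: str
    channel: str


def validate_channel(record: TicketRecord) -> list[str]:
    """Return validation errors for the record (empty list if it conforms)."""
    errors = []
    if record.channel not in TICKET_CHANNEL_TERMS:
        errors.append(f"channel: unknown term {record.channel!r}")
    return errors


ok = validate_channel(TicketRecord("k1", "evt-01", "online"))
bad = validate_channel(TicketRecord("k2", "evt-01", "web"))
```

Rejecting unknown terms at contribution time, rather than mapping them silently, keeps each site's local coding choices visible before data is pooled.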

Research Applications

What STAGE Makes Possible

Application 01

Audience Segmentation & Prediction

Machine learning models trained on STAGE data to identify audience segments most likely to convert from first-time to repeat attendance, enabling targeted engagement strategies.

Application 02

Demographic Trend Analysis

Cross-institutional, longitudinal study of demographic shifts in opera audiences — quantifying the decline in younger audiences and identifying interventions that measurably reverse the trend.

Application 03

Docent Impact Measurement

Linking AI docent interaction logs to downstream ticket purchase and attendance behavior, providing causal evidence for the effectiveness of AI-driven audience development.
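
The linkage step can be sketched as follows, assuming interaction logs and ticket purchases are both available as (patron key, date) pairs. The function name `purchased_within`, the 30-day window, and the sample data are hypothetical choices for illustration only.

```python
from datetime import date, timedelta


def purchased_within(interactions, purchases, window_days=30):
    """Map each patron key to whether a purchase followed their first
    docent interaction within `window_days` days (inclusive)."""
    window = timedelta(days=window_days)

    # Earliest docent interaction per patron.
    first_seen = {}
    for patron_key, day in interactions:
        if patron_key not in first_seen or day < first_seen[patron_key]:
            first_seen[patron_key] = day

    # Flag patrons with at least one purchase inside the window.
    outcome = {key: False for key in first_seen}
    for patron_key, day in purchases:
        start = first_seen.get(patron_key)
        if start is not None and start <= day <= start + window:
            outcome[patron_key] = True
    return outcome


# Made-up sample data: (pseudonymous patron key, date).
interactions = [("a", date(2024, 1, 5)), ("b", date(2024, 1, 10))]
purchases = [
    ("a", date(2024, 1, 20)),  # 15 days after interaction: inside window
    ("b", date(2024, 3, 15)),  # more than 30 days later: outside window
    ("c", date(2024, 1, 6)),   # no docent interaction: ignored
]
linked = purchased_within(interactions, purchases)
```

On its own, a linkage like this shows association between docent use and later purchases; supporting a causal reading would additionally need a comparison group of patrons without docent interactions.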

Application 04

Programming Optimization

Evidence-based recommendations for opera programming decisions — which repertoire, formats, and pricing structures most effectively attract and retain new audience segments.