Pillar 03 · Research Program
A standardized data warehouse enabling cross-institutional research on opera audience behavior, demographics, and the effectiveness of engagement strategies — at a scale never before possible.
STAGE (Standardized Theater and Arts Granular Evidence) is a performing arts data warehouse developed in partnership with major opera institutions in North America and Europe. It addresses a fundamental gap: opera houses collect rich data about their audiences, but this data is siloed, inconsistently structured, and rarely used for rigorous research.
The STAGE schema defines 184 standardized fields for performing arts event and audience data, so that records from different institutions can be analyzed together. Partner institutions contribute anonymized audience, ticket, and engagement records under formal data sharing agreements, allowing researchers to study audience behavior as a population phenomenon.
The analogy to biomedical research is direct: just as clinical data platforms transformed our understanding of disease through large-scale, multi-site data integration, STAGE creates the infrastructure for evidence-based performing arts research. Dr. Rubin's career in building NIH-funded biomedical data platforms provides the methodological foundation.
STAGE is designed with interoperability as a first principle and supports integration with Tessitura, the industry-standard ticketing and CRM platform used by most major opera houses and performing arts organizations worldwide. This integration enables longitudinal studies that track individual audience members' engagement trajectories over years.
A bilingual (English/Spanish) data request framework has been developed to facilitate data sharing with European partner institutions, including those operating in Spanish-speaking regions. Controlled terminology spanning 21 data domains ensures consistency across contributing sites.
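As a rough illustration of how controlled terminology can keep contributing sites consistent, the sketch below checks a submitted record against a small vocabulary. The field names and allowed terms (`ticket_channel`, `seat_category`, and their values) are hypothetical placeholders, not taken from the actual STAGE schema or its 21 data domains.

```python
# Hypothetical controlled vocabulary. Field names and terms are illustrative
# assumptions, not part of the real STAGE schema.
CONTROLLED_TERMS = {
    "ticket_channel": {"web", "phone", "box_office"},
    "seat_category": {"orchestra", "balcony", "box"},
}


def validate_record(record: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, value in record.items():
        allowed = CONTROLLED_TERMS.get(field)
        if allowed is None:
            # The field itself is not part of the vocabulary.
            problems.append(f"unknown field: {field}")
        elif value not in allowed:
            # The field is known, but the value is not a controlled term.
            problems.append(f"{field}: '{value}' is not a controlled term")
    return problems


if __name__ == "__main__":
    print(validate_record({"ticket_channel": "web", "seat_category": "stalls"}))
```

A check like this would typically run at each contributing site before records are shared, so that mismatched local labels are caught before they reach the warehouse.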
The platform is developed in collaboration with Opera Verace Foundation, which manages institutional partnerships and data sharing agreements. Research outputs will be published in peer-reviewed venues and shared as open methodologies with the broader performing arts research community.
The STAGE schema covers all aspects of performing arts audience engagement, from initial ticket discovery through post-performance longitudinal follow-up. Key data domains include:
Machine learning models trained on STAGE data to identify audience segments most likely to convert from first-time to repeat attendance, enabling targeted engagement strategies.
Cross-institutional, longitudinal study of demographic shifts in opera audiences — quantifying the decline in younger audiences and identifying interventions that measurably reverse the trend.
Linking AI docent interaction logs to downstream ticket purchase and attendance behavior, providing causal evidence for the effectiveness of AI-driven audience development.
Evidence-based recommendations for opera programming decisions — which repertoire, formats, and pricing structures most effectively attract and retain new audience segments.
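The linkage between AI docent interaction logs and later ticket purchases described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions: records are `(patron_id, date)` pairs keyed by a pseudonymous identifier, and the 30-day follow-up window is an arbitrary illustrative choice. None of these names or parameters come from STAGE itself.

```python
from datetime import date, timedelta


def purchases_after_interaction(interactions, purchases, window_days=30):
    """Map each patron ID to purchases made within `window_days` after
    that patron's first docent interaction.

    Both inputs are iterables of (patron_id, date) pairs (assumed layout).
    """
    # Find the earliest interaction date for each patron.
    first_seen = {}
    for patron_id, day in interactions:
        if patron_id not in first_seen or day < first_seen[patron_id]:
            first_seen[patron_id] = day

    # Collect purchases that fall inside the follow-up window.
    linked = {patron_id: [] for patron_id in first_seen}
    window = timedelta(days=window_days)
    for patron_id, day in purchases:
        start = first_seen.get(patron_id)
        if start is not None and start <= day <= start + window:
            linked[patron_id].append(day)
    return linked


if __name__ == "__main__":
    interactions = [("p1", date(2024, 1, 1)), ("p2", date(2024, 2, 1))]
    purchases = [("p1", date(2024, 1, 15)), ("p2", date(2024, 1, 20))]
    print(purchases_after_interaction(interactions, purchases))
```

Linkage of this kind only pairs interactions with subsequent purchases. A causal estimate would additionally need a comparison group of patrons who did not interact with the docent, which this sketch does not model.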