Seth E. Spielman

AI Evaluation · Computational Social Science · Measurement

I work at the intersection of measurement, social science, and AI. At Microsoft, I built the evaluation infrastructure for Copilot and co-led the largest empirical study of AI usage ever conducted. Previously, I led data science teams at Apple Maps and held faculty appointments at Brown, Columbia, and the University of Colorado Boulder. My academic work is deeply interdisciplinary, spanning fields as diverse as geography, computer science, history, and statistics, with ~50 peer-reviewed publications and a book. I've held advisory roles with the U.S. Census Bureau, Oak Ridge National Laboratory, the National Academy of Sciences, and early-stage startups on how to measure products, people, places, and the economy. Once upon a time, I was the sole proprietor of an antiquarian bookshop in Manhattan.

Microsoft AI

Copilot Evaluation & the 2025 Copilot Report

Designed Copilot's core evaluation infrastructure and co-led the analysis of 37.5M conversations—the largest empirical study of AI usage ever conducted.

Federal Appointment

U.S. Census Bureau Scientific Advisory Board

Appointed by the Department of Commerce to advise on the design, operation, and modernization of how the U.S. measures its population and economy. Served 2022–2025.

In the Press

Featured in Major Outlets

My work has been covered in Science, The Atlantic, the Associated Press, Axios, ABC News, and international media.

Apple Maps

Building a Map of the World

Served as interim DRI for Apple's effort to build a map from scratch. Developed patented technology using mobile sensor data to detect and correct map errors at scale.

Scientific Impact

~50 Publications & Major Awards

Published in PNAS, Demography, SIGIR, and more. Authored Urban Analytics (Sage, 2018). Recognized with the AAG Distinguished Scholar Award, the ASA's Statistical Partnership Among Academe, Industry, and Government (SPAIG) Award, and the Michael Breheny Prize.

University Leadership

Inaugural Chief Data Officer, CU Boulder

Directed strategy, institutional research, and digital transformation for a 35,000+ student institution. Led a team of 40+ staff and contractors.

Current Research

Active lines of inquiry at the frontier of AI measurement, user understanding, and evaluation theory.

01

Evaluating AI at Scale with LLM Judges

How do you know if an AI system is actually helping people?

Building complex agentic evaluation systems that use LLM judges to assess product experiences in real time. This work is deployed as Copilot’s primary KPI, the Session Success Rate, which Microsoft AI CEO Mustafa Suleyman describes in the press as the metric he optimizes above all others. These systems go beyond simple accuracy to evaluate, across millions of interactions, whether conversations are genuinely useful, safe, and aligned with what users need.

02

Synthetic Users & Simulation-Based Intelligence

Can you build synthetic agents realistic enough to pass a Turing test—and then scale them?

Creating synthetic user agents, grounded in AI-mediated interviews with real people, that are realistic enough to stand in for actual users. The goal is to scale these agents into populations that enable deep product insights, competitive intelligence, and “soft flights”—running A/B experiments against synthetic users rather than launching to the public and waiting weeks for results. This compresses the experimentation cycle from weeks to hours while surfacing blind spots that traditional research misses.

03

Validity of Metrics in Large-Scale AI Systems

When you measure a billion interactions, what are you actually measuring?

Every metric carries a validity gap—the distance between what you think you’re measuring and what your metric actually captures. In large-scale AI and search systems, this gap can have profound consequences for products and organizations, because subtle choices in measurement design compound into consequential differences in what gets optimized. This research develops theoretical foundations for understanding and closing the validity gap in metrics that shape products used by hundreds of millions of people.

04

The Collective Consequences of AI-Mediated Decisions

What happens when millions of personal decisions are shaped by AI?

There is a broad literature—in the press, in academia, and from major labs—about how AI is poised to reshape the future of work: jobs displaced, productivity gained. But recent research from Microsoft and OpenAI has shown that people discuss personal matters with AI more often than professional ones. These aren’t trivial queries—they’re about what to buy, what to study, how to maintain health and well-being, what to do on a Sunday afternoon. AI now supports decisions ranging from the minor and routine to the major and life-altering. What are the collective consequences of these millions of AI-mediated choices? We are working to better understand and build a theoretical framework for how personal decisions made with AI aggregate into broader social and economic impacts.

The throughline: For twenty years, my work has focused on measuring complex social phenomena—building metrics that faithfully capture the complexity of human behavior in products, on maps, and in the economy, then mobilizing those metrics to drive decisions, strategy, and insight. I’ve applied this practice across domains ranging from official government statistics to global consumer mapping to social indicators of vulnerability and well-being to the evaluation of AI systems. The craft is the same in every case: build measurement systems that are highly attuned to the problem at hand, that are transparent about what they capture and what they miss, and that let you zoom from the scale of an entire system down to the texture of individual interactions. Measurement is always reductionist—the art is in having an opinionated take, rooted in theory, experience, and direct observation, and translating it into numbers that drive action.

Experience

Positions spanning industry, academia, and public service.

2021 – Present

Member of Technical Staff, Director

Microsoft AI
  • Designed and built Copilot's core evaluation infrastructure, which produces the organization's primary quality metric tracking user experience, quality, policy compliance, and safety at scale
  • Pioneered contextual evaluation methods that dynamically adapt definitions of success, risk, and failure to specific usage scenarios
  • Built LLM-judge-based systems supporting A/B experimentation, rapid model iteration, and feature development
  • Co-led the 2025 Copilot Report analyzing 37.5M conversations
  • Direct competitive intelligence for Copilot, providing a broad analytical view of the AI marketplace and informing product strategy
  • Developed novel methods for measuring document relevance using AI, contributing to the “New Bing”
2019 – 2021

Assistant Vice Chancellor for Strategy & Institutional Research

University of Colorado Boulder
  • Directed strategy, institutional research, and digital transformation for a 35,000+ student institution
  • Led team of 40+ staff; established enterprise data governance program
2015 – 2019

Senior Manager, Apple Maps

Apple, Inc.
  • Served as interim DRI for Apple's effort to build a map of the world from scratch
  • Led team of 12 ML/data scientists developing novel methods to map the built environment using distributed mobile sensors
  • Developed patented technology using aggregated mobile device data to detect and correct map errors at scale
  • Contributed to Apple Card launch through location intelligence features
2010 – 2021

Associate Professor / Assistant Professor of Geography

University of Colorado Boulder
  • Tenured faculty with courtesy appointment in Information Science
  • Published ~50 peer-reviewed papers; authored Urban Analytics (Sage, 2018)
  • Secured $3M+ in competitive research funding as PI or Co-PI
  • Elected by faculty to chair oversight of ~$2B annual operating budget
2021 – Present

Visiting Professor of Geography (Honorary)

University of Liverpool
  • Active research collaborations on geodemographics and urban analytics
Earlier

Earlier Positions

Brown University · Columbia University · U.S. Forest Service
  • Associate Director, Spatial Structures in Social Sciences Research Institute & Lecturer (Brown)
  • Adjunct Associate Research Scientist & Adjunct Professor (Columbia)
  • Cartographer (U.S. Forest Service)

Selected Advisory & Leadership

U.S. Census Bureau, Dept. of Commerce — Scientific Advisory Board (2022–2025)
Oak Ridge National Laboratory — Scientific Advisory Board, Geospatial Science & Human Security (2021–2025)
National Academy of Sciences — Working Group on Data Privacy in the 2020 Census (2020–2021)
U.S. Census Bureau — Data Products Redesign Group (2015–2020)
U.S. Department of the Interior — Strategic Sciences Group, Hurricane Sandy (2013–2014)
Early-Stage Startups — Technical advisor on data strategy, measurement, and AI product development

Selected Awards & Honors

Distinguished Scholar Award, American Association of Geographers
Statistical Partnership Among Academe, Industry, and Government (SPAIG) Award, American Statistical Association
Michael Breheny Prize for Urban Analytics and City Science
Fellow, Institute of Behavioral Science, University of Colorado
Founder’s Award for Best Paper, Social Science History Association
Best Paper Award, IEEE Conference on Social Networking and Computing

Publications

Selected peer-reviewed articles, books, patents, and book chapters.

Patent
  1. McErlain, J., Guggemos, N., Spielman, S.E. (2021) Detecting Changes in Map Data based on Changes in Device Data. United States Patent US20200011684A1.
Books & Edited Volumes
  1. Singleton, A., Spielman, S.E., and Folch, D. (2018) Urban Analytics. Sage.
  2. Spielman, S.E., Xiao, N., Cockings, S., and Tanton, R. (2017) Spatial Analysis with Demographic Data: Emerging Issues and Innovative Approaches. Special Issue of Computers, Environment and Urban Systems.
Peer-Reviewed Journal & Conference Articles
  1. Singleton, A. and Spielman, S.E. (2026) Geodemographics and residential differentiation: A methodological review and future directions for learned representations of the social landscape. Computers, Environment and Urban Systems, 125: 102396.
  2. Costa-Gomes, B., Chen, S., Hsueh, C., Morgan, D., Schoenegger, P., Shah, Y., Way, S., Zhu, Y., Adeline, T., Bhaskar, M., ... Spielman, S.E., et al. (2025) It’s About Time: The Temporal and Modal Dynamics of Copilot Usage. arXiv preprint arXiv:2512.11879.
  3. Frazier, A.E., Nelson, T.A., Kedron, P., Shook, E., Dodge, S., Murray, A., Goodchild, M., Battersby, S., Blanford, J.I., Claramunt, C., ... Spielman, S.E., et al. (2025) Rethinking GIScience education in an age of disruptions. Transactions in GIS, 29(2): e70048.
  4. Nelson, T.A., Frazier, A.E., Kedron, P., Spielman, S.E., et al. (2025) A research agenda for GIScience in a time of disruptions. International Journal of Geographical Information Science, 39(1): 1–24.
  5. Thomas, P., Kazai, G., Craswell, N., and Spielman, S.E. (2024) What Matters in a Measure? A perspective from large-scale search evaluation. ACM SIGIR 2024. Best Paper Nominee
  6. Thomas, P., Spielman, S.E., Craswell, N., Mitra, B. (2024) Large Language Models Accurately Predict Searcher Preferences. ACM SIGIR 2024.
  7. Singleton, A. and Spielman, S.E. (2024) Segmentation using Large Language Models: A new typology of American neighborhoods. EPJ Data Science, 13(1): 34.
  8. Folch, D.C., Spielman, S.E., and Graber, M. (2023) The Impact of Covariance on the American Community Survey Margins of Error. Population Research and Policy Review, 42(4): 55.
  9. Spielman, S.E., Singleton, A. (2022) Generalized Activity Spaces: A new model for socio-spatial context. Annals of the American Association of Geographers, 112(8): 2212–2229.
  10. Tuccillo, J., Spielman, S.E. (2022) A method for measuring coupled individual and social vulnerability to environmental hazards. Annals of the American Association of Geographers, 112(6): 1702–1725.
  11. Zhang, Y., Spielman, S.E., Liu, Q., Shen, S., Zhang, J.S., Lv, Q. (2020) Exploring the Usage of Online Food Delivery Data for Intra-Urban Job and Housing Mobility Detection. IEEE SocialCom-2020. Best Paper Award
  12. Spielman, S.E., Tuccillo, J., Wood, N., Tate, E. (2020) Evaluating social vulnerability indicators: criteria and their application to the Social Vulnerability Index. Natural Hazards, 100(1): 417–436.
  13. Fowler, C., Frey, N., Folch, D.C., Nagle, N., and Spielman, S.E. (2020) Who are the people in my neighborhood?: The “Contextual Fallacy” of Measuring Individual Context with Census Geographies. Geographical Analysis, 52(2): 155–168.
  14. Weinberg, D., Abowd, J., Belli, R., Cressie, N., Folch, D.C., Holan, S., Levenstein, M., Olson, K., Reiter, J., Shapiro, M., Smyth, J., Soh, L-K., Spencer, B., Spielman, S.E., Vilhuber, L., Wikle, C. (2019) Effects of a Government-Academic Partnership: Has the NSF-Census Bureau Research Network Helped Improve the U.S. Statistical System? Journal of Survey Statistics and Methodology, 7(4): 589–619.
  15. Jurjevich, J., Griffin, A.L., Spielman, S.E., Folch, D.C., Merrick, M., Nagle, N. (2018) Navigating Uncertainty: How Urban and Regional Planners Use The American Community Survey. Journal of the American Planning Association, 84(2): 112–126.
  16. Folch, D.C., Spielman, S.E., Manduca, R. (2018) Fast Food Data: Where User Generated Content Works, and Where it Doesn’t. Geographical Analysis, 50(2): 125–140.
  17. Bellman, B., Spielman, S.E., and Franklin, R.S. (2018) Local Population Change and Variations in Racial Integration in the United States, 2000–2010. International Regional Science Review, 41(2): 233–255.
  18. Spielman, S.E., Xiao, N., Cockings, S., and Tanton, R. (2017) Statistical Systems and Census Data in the Spatial Sciences. Computers, Environment and Urban Systems, 63: 1–2.
  19. Gutmann, M., Brown, D., Cunningham, A., Dykes, J., Leonard, S., Little, J., Mikecz, J., Rhode, P., Spielman, S.E., and Sylvester, K. (2016) Migration in the 1930s: Beyond the Dust Bowl. Social Science History, 40(4): 707–740.
  20. Folch, D.C., Arribas-Bel, D., Koschinsky, J., and Spielman, S.E. (2016) Uncertain Uncertainty: Spatial Variation in the Quality of the American Community Survey. Demography, 53(5): 1535–1554.
  21. Singleton, A., Brunsdon, C., and Spielman, S.E. (2016) Establishing a Framework for Open Geographic Information Science. International Journal of Geographical Information Science, 30(8): 1507–1521. IJGIS Most Downloaded 2016
  22. Spielman, S.E. and Singleton, A. (2015) Studying Neighborhoods Using Uncertain Data From the American Community Survey. Annals of the Association of American Geographers, 105(5): 1003–1025.
  23. Wood, N., Jones, J., Spielman, S.E., Schmidtlein, M. (2015) Community clusters of tsunami vulnerability in the U.S. Pacific Northwest. Proceedings of the National Academy of Sciences, 112(17): 5354–5359.
  24. Spielman, S.E. and Folch, D.C. (2015) Reducing the Margins of Error in the American Community Survey Through Data-Driven Regionalization. PLoS ONE, 10(2): 1–21.
  25. Singleton, A. and Spielman, S.E. (2014) The Past Present and Future of Geodemographics in the US and UK. The Professional Geographer, 66(4): 558–567.
  26. Spielman, S.E., Folch, D., Nagle, N. (2014) Causes and Patterns of Uncertainty in the American Community Survey. Applied Geography, 46: 147–157.
  27. Folch, D. and Spielman, S.E. (2014) Identifying Regions based on Flexible User Defined Constraints. International Journal of Geographical Information Science, 28(1): 164–184.
  28. Spielman, S.E. (2014) Spatial Collective Intelligence? Accuracy, Credibility and Volunteered Geographic Information. Cartography and Geographic Information Science, 41(2): 115–124.
  29. Nagle, N., Buttenfield, B., Leyk, S., Spielman, S.E. (2014) Dasymetric Modeling and Uncertainty. Annals of the Association of American Geographers, 104(1): 80–95.
  30. Spielman, S.E., Harrison, P. (2014) The Co-Evolution of Residential Segregation and the Built Environment at the Turn of the 20th Century: a Schelling Model. Transactions in GIS, 18(1): 25–45.
  31. Spielman, S.E. and Logan, J. (2013) Using High Resolution Population Data to Identify Neighborhoods and Determine their Boundaries. Annals of the Association of American Geographers, 103(1): 67–84.
  32. Spielman, S.E., Yoo, E.H., Linkletter, C. (2013) Neighborhood context, health, and behavior: the role of scale and residential sorting in statistical inference. Environment and Planning B, 40(3): 489–506. Breheny Prize
  33. Logan, J., Spielman, S.E., Xu, H., Klein, P. (2011) Identifying and Bounding Ethnic Neighborhoods. Urban Geography, 32(3): 334–359.
  34. Spielman, S.E. and Yoo, E.-H. (2009) The Spatial Dimensions of Neighborhood Effects. Social Science and Medicine, 68(6): 1098–1105.
  35. Spielman, S.E. and Thill, J.-C. (2008) Social Area Analysis, Data Mining, and GIS. Computers, Environment and Urban Systems, 32(2): 110–122. CEUS Top 25 Most Cited
  36. Erdemir, E.T., Batta, R., Spielman, S.E., Rogerson, P. (2008) Optimization of aeromedical base locations in New Mexico. Accident Analysis and Prevention, 40(3): 1105–1114.
  37. Erdemir, E.T., Batta, R., Spielman, S.E., Rogerson, P., Blatt, A., and Flanigan, M. (2008) Location Coverage Models with Demand Originating from Nodes and Paths. European Journal of Operational Research, 190(3): 610–632.
  38. Borrell, L., Northridge, M.E., Miller, D., Golembeski, C.A., Spielman, S.E., Sclar, E., Lamster, I. (2006) Oral Health and Health Care for Older Adults: A Spatial Approach for Addressing Disparities and Planning Services. Special Care in Dentistry, 26(6): 252–256.
  39. Spielman, S.E. (2006) Appropriate use of the K-function in Urban Environments. American Journal of Public Health, 96(2): 205.
  40. Spielman, S.E., Golembeski, C.A., Northridge, M.E., et al. (2006) Interdisciplinary Planning for Healthier Communities: Findings from the Harlem Children’s Zone Asthma Initiative. Journal of the American Planning Association, 72(1): 100–108.
Book Chapters
  1. Singleton, A., Spielman, S.E. (2021) Urban Governance. In Kwan, M.-P., Batty, M., Goodchild, M., Shi, W. (eds) Urban Informatics. Springer Nature.
  2. Spielman, S.E. (2017) The Potential for Big Data to Improve Neighborhood-Level Census Data. In Thakuriah, V. (ed) Seeing Cities Through Big Data. Springer.
  3. Spielman, S.E. (2016) Point Pattern Analysis. International Encyclopedia of Geography.
  4. Spielman, S.E., Folch, D.C. (2015) Social Area Analysis with Self-Organizing Maps. In Brunsdon, C. and Singleton, A. (eds) Geocomputation. Sage.
Selected Reports & Working Papers
  1. National Academies of Sciences, Engineering, and Medicine. (2020) 2020 Census Data Products: Data Needs and Privacy Considerations: Proceedings of a Workshop. Washington, DC: The National Academies Press.
  2. Department of the Interior Strategic Sciences Group (2014) Operation Group Sandy Technical Report. Department of the Interior, Reston, VA.

Selected Invited Presentations

2025 Keynote, AI and the Future of Consumer Data Research — University of Oxford, Saïd Business School
2024 Panelist, Directions for Social and Behavioral Sciences — National Academy of Sciences
2022 Keynote, Urban Analytics 2.0 — Alan Turing Institute, London
2021 Expert Panel, COVID Vaccine Distribution Equity — CDC / NCVHS
2019 Differential Privacy: Spatial Considerations — National Academy of Sciences
2018 Mayer Lecture, “How to Fix a $400 Billion Map” — University of Wisconsin
2017 Keynote, Spatial Data Science Conference — Carto.com, New York

Education

Ph.D.
Geographic Information Science
State University of New York at Buffalo
2008
M.S.
Urban Planning
Columbia University
2001
B.A.
Geography
Macalester College
1997

I’m always interested in conversations about measurement, AI, and how to make better decisions with better data. If you’re working on something where these things matter, I’d enjoy hearing from you.

Get in Touch