Questioning the Survey Responses of Large Language Models

2024

Conference Paper

As large language models increase in capability, researchers have started to conduct surveys of all kinds on these models in order to investigate the population represented by their responses. In this work, we critically examine language models' survey responses on the basis of the well-established American Community Survey by the U.S. Census Bureau and investigate whether they elicit a faithful representation of any human population. Using a de facto standard multiple-choice prompting technique and evaluating 39 different language models in systematic experiments, we establish two dominant patterns. First, models' responses are governed by ordering and labeling biases, leading to variations across models that do not persist after adjusting for systematic biases. Second, models' responses do not contain the entropy variations and statistical signals typically found in human populations. As a result, a binary classifier can almost perfectly differentiate model-generated data from the responses of the U.S. census. At the same time, models' relative alignment with different demographic subgroups can be predicted from the subgroups' entropy, irrespective of the model's training data or training strategy. Taken together, our findings suggest caution in treating models' survey responses as equivalent to those of human populations.
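The two quantities the abstract refers to, response entropy and responses adjusted for ordering bias, can be illustrated with a small sketch. This is not the authors' code: averaging over all answer orderings is one common adjustment and is assumed here, and `position_biased_model` is a hypothetical toy stand-in for a language model that attends only to answer position.

```python
import math
from itertools import permutations


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def adjust_for_ordering(model_probs, choices):
    """Average the probability given to each choice over all answer orderings.

    `model_probs` is a hypothetical callable (an assumption of this sketch):
    it takes a tuple of choices in presentation order and returns one
    probability per position.
    """
    totals = {choice: 0.0 for choice in choices}
    orderings = list(permutations(choices))
    for ordering in orderings:
        probs = model_probs(ordering)
        for choice, p in zip(ordering, probs):
            totals[choice] += p
    return {choice: totals[choice] / len(orderings) for choice in choices}


def position_biased_model(ordering):
    """Toy model: ignores answer content and favors the first position."""
    return [0.7, 0.2, 0.1][: len(ordering)]


choices = ["Yes", "No", "Unsure"]
raw = position_biased_model(tuple(choices))
adjusted = adjust_for_ordering(position_biased_model, choices)

raw_entropy = entropy(raw)
adjusted_entropy = entropy(adjusted.values())
```

For this toy model the apparent preference for "Yes" is purely positional, so the adjusted distribution is uniform and its entropy reaches the maximum of log2(3) bits. Real models and real survey questions would of course behave less cleanly.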

Author(s):	Ricardo Dominguez-Olmedo and Moritz Hardt and Celestine Mendler-Dünner
Book Title:	arXiv preprint arXiv:2306.07951
Year:	2024
Month:	September

Department(s):	Social Foundations of Computation
Bibtex Type:	Conference Paper (conference)

State:	Published
URL:	https://openreview.net/pdf?id=Oo7dlLgqQX

Links:	ArXiv

BibTex
@conference{dominguez2024questioning,
  title = {Questioning the Survey Responses of Large Language Models},
  author = {Dominguez-Olmedo, Ricardo and Hardt, Moritz and Mendler-D{\"u}nner, Celestine},
  booktitle = {arXiv preprint arXiv:2306.07951},
  month = sep,
  year = {2024},
  doi = {},
  url = {https://openreview.net/pdf?id=Oo7dlLgqQX},
  month_numeric = {9}
}

People

Ricardo Dominguez-Olmedo

Ph.D. Student

Moritz Hardt

Managing Director

Celestine Mendler-Dünner

Research Group Leader
